Using linear regression coefficients from a table to compute values

I have a file with many variable names and coefficients. The task is to use those variable names and coefficients to create a linear regression formula and apply it to data. Here’s a small example:

library(tidyverse)  # provides tibble(), mutate(), and %>%

coefs <- tibble(varname = c("(Intercept)", "dxaids", "abnormal_bun"),
                coef = c(-3.1, 0.1, 0.2))

data <- tibble(dxaids = c(0,0,1), abnormal_bun = c(1,0,0))

The goal is to add a new column, equivalent to:

data %>% mutate(y = -3.1 + 0.1 * dxaids + 0.2 * abnormal_bun)

Of course I can write an ugly loop for this, shown below, but is there any cleaner way with tidyverse tools? Perhaps this can be accomplished with a single matrix-vector multiply, but dplyr doesn’t seem amenable to matrix operations.

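# Start from the intercept, then add each coefficient times its column.
y <- coefs$coef[coefs$varname == "(Intercept)"]

for (i in seq_len(nrow(coefs))) {
  varname <- coefs$varname[i]
  coef <- coefs$coef[i]
  if (varname != "(Intercept)") {
    # [[ ]] returns a plain numeric vector; data[, varname] would return a one-column tibble.
    y <- y + coef * data[[varname]]
  }
}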
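
For comparison, the matrix-vector idea mentioned above could be sketched roughly as follows. This is only a sketch: it assumes every non-intercept varname in coefs matches a column name in data exactly, and the names intercept, slopes, X, and result are introduced here purely for illustration.

```r
library(tibble)
library(dplyr)

coefs <- tibble(varname = c("(Intercept)", "dxaids", "abnormal_bun"),
                coef = c(-3.1, 0.1, 0.2))
data <- tibble(dxaids = c(0, 0, 1), abnormal_bun = c(1, 0, 0))

# Split the coefficient table into the intercept and the slope terms.
intercept <- coefs$coef[coefs$varname == "(Intercept)"]
slopes <- filter(coefs, varname != "(Intercept)")

# Select the predictor columns in the same order as the coefficients,
# convert them to a numeric matrix, and do one matrix-vector product.
X <- as.matrix(data[, slopes$varname])
result <- data %>%
  mutate(y = intercept + as.vector(X %*% slopes$coef))
```

Selecting the columns by slopes$varname keeps the matrix columns aligned with the coefficient order, so the order of columns in data itself does not matter. The product happens outside dplyr and only the resulting vector is handed to mutate().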
