Compare PCs to data with lsfit()

  1. First, pc_model$x holds the coordinates of the observations in the new space defined by the principal axes (PC1, PC2, PC3, ...), so it has as many rows as there are observations, i.e. 2000 rows for 2000 observations.
  2. lsfit(X, Y) fits the model Y = Xb + e, where Y and e are (N,M) matrices, X is an (N,K) matrix and b is a (K,M) matrix. K is the number of coefficients to estimate: the number of columns in the original X matrix, plus 1 if an intercept is included (the default). The regression also needs N >= K to be computable.
    • Running fit2 <- lsfit(df, pcs) gives correct output because both conditions hold: df and pcs have the same number of rows, and N = 2000 >= K = 601.
    • The error "Error in lsfit(df_trans, pcs2) : only 600 cases, but 2001 variables" occurs because df_trans has 2000 columns (2001 variables once the intercept is added) while pcs2 has only 600 rows. Keeping only the first 599 columns, so that K = 600 <= N = 600, circumvents the error: lsfit(df_trans[,1:599], pcs2).
    • The error "not all arguments have the same length" comes from the complete.cases() call inside lsfit. Because df and pcs2 have different numbers of rows, this error is thrown before lsfit reaches its own check for mismatched row counts.
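
The three cases above can be reproduced with a short sketch. The original data are not shown, so everything about the setup here is an assumption: df is simulated as 2000 observations of 600 variables, df_trans is taken to be t(df), and pcs and pcs2 are taken to be the first three score columns of each PCA. The helper err_msg is hypothetical and only captures the error text. The exact wording of the messages may differ in a non-English locale.

```r
set.seed(1)  # assumed seed; the data are simulated, not the original data

# Assumed setup: 2000 observations (rows) of 600 variables (columns).
df <- matrix(rnorm(2000 * 600), nrow = 2000, ncol = 600)
pc_model <- prcomp(df)
pcs <- pc_model$x[, 1:3]          # scores on PC1..PC3, one row per observation

# Assumed transposed case: 600 rows, 2000 columns.
df_trans <- t(df)
pcs2 <- prcomp(df_trans)$x[, 1:3]

# Hypothetical helper: returns the error message, or NA if the call succeeds.
err_msg <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

# Works: same number of rows, and N = 2000 >= K = 601 (600 columns + intercept).
fit2 <- lsfit(df, pcs)
coef_dim <- dim(fit2$coefficients)   # K rows, M columns

# Fails: N = 600 rows but K = 2001 (2000 columns + intercept).
msg_wide <- err_msg(lsfit(df_trans, pcs2))

# Works: 599 columns + intercept gives K = 600 <= N = 600.
msg_ok <- err_msg(lsfit(df_trans[, 1:599], pcs2))

# Fails inside complete.cases(): 2000 rows in df versus 600 rows in pcs2.
msg_len <- err_msg(lsfit(df, pcs2))

print(dim(pc_model$x))
print(coef_dim)
print(c(wide = msg_wide, ok = msg_ok, length = msg_len))
```

Printing the dimensions first is a quick way to check N and K before calling lsfit, since both errors come down to a mismatch between nrow() of the two arguments or between nrow() and ncol() + 1.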
