`pc_model$x` is just the coordinates of the observations in the new space defined by the axes (PC1, PC2, PC3), so it has as many rows as there are observations, i.e. 2000 rows for 2000 observations.
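To make the row count concrete, here is a minimal sketch on made-up data. The sizes (200 observations, 6 variables) and the random values are assumptions chosen to keep the example small. They are not the data from the question.

```r
set.seed(1)

# Hypothetical stand-in for the question's data: 200 observations, 6 variables
df <- as.data.frame(matrix(rnorm(200 * 6), nrow = 200, ncol = 6))

pc_model <- prcomp(df, center = TRUE, scale. = TRUE)

# One row per observation, one column per principal component
dim(pc_model$x)

# Keeping the first three components does not change the number of rows
pcs <- pc_model$x[, 1:3]
dim(pcs)
```

The number of rows of `pc_model$x` always matches `nrow(df)`, whatever the number of components you keep.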
`lsfit(X, Y)` tries to fit the model `Y = Xb + e`, where Y and e are (N, M) matrices, X is an (N, K) matrix and b is a (K, M) matrix. K is the number of coefficients to estimate: the number of columns in the original X matrix, plus 1 if you also want the intercept, which is the default. You also need N >= K for this regression to be computable.
`fit2 <- lsfit(df, pcs)` gives correct output because both conditions hold: `df` and `pcs` have the same number of rows, and N = 2000 >= K = 601.
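The dimensions can be checked on a small example. This is only a sketch: `N`, `P` and `M` are illustrative values, and `X` is random data rather than the data frame from the question.

```r
set.seed(1)

N <- 200  # observations (assumed size, smaller than the question's 2000)
P <- 6    # columns of the original X matrix
M <- 3    # number of response columns (principal components kept)

X <- matrix(rnorm(N * P), nrow = N)
Y <- prcomp(X)$x[, 1:M]

# intercept = TRUE is the default, so K = P + 1 coefficients per response
fit <- lsfit(X, Y)

# Rows = K = P + 1, columns = M
dim(fit$coefficients)

# Residuals have the same shape as Y, i.e. (N, M)
dim(fit$residuals)
```

Here N = 200 >= K = 7, so the fit goes through, just as N = 2000 >= K = 601 does for the original data.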
- The error `Error in lsfit(df_trans, pcs2) : only 600 cases, but 2001 variables` is caused by `df_trans` having 2000 columns (2001 variables once the intercept is added) while `pcs2` has only 600 rows. Selecting the first 599 columns circumvents the error, since that gives K = 600 <= N = 600.
- The error `not all arguments have the same length` is raised by the `complete.cases` call inside `lsfit`, because `df` and `pcs2` have different numbers of rows. This error is thrown before `lsfit` reaches its own check for mismatched row counts.
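Both errors can be reproduced at a smaller scale. The sketch below assumes a 200 x 6 matrix in place of the 2000 x 600 data, and `err_msg` is a hypothetical helper introduced here only to capture the error message instead of stopping the script.

```r
set.seed(1)

df <- matrix(rnorm(200 * 6), nrow = 200)  # 200 observations, 6 variables (assumed sizes)
df_trans <- t(df)                         # 6 rows, 200 columns
pcs2 <- prcomp(df_trans)$x[, 1:3]         # 6 rows, one per row of df_trans

# Hypothetical helper: return the error message, or NA if no error occurs
err_msg <- function(expr) {
  tryCatch({ expr; NA_character_ }, error = function(e) conditionMessage(e))
}

# N = 6 rows but K = 200 + 1 coefficients: "only ... cases, but ... variables"
msg_wide <- err_msg(lsfit(df_trans, pcs2))

# 200 rows in df vs 6 rows in pcs2: complete.cases fails first
msg_rows <- err_msg(lsfit(df, pcs2))

# Keeping 5 columns gives K = 5 + 1 = 6 <= N = 6, so the call succeeds
ok <- lsfit(df_trans[, 1:5], pcs2)

msg_wide
msg_rows
```

Note that the last call only avoids the error. With K equal to N the model has as many coefficients as observations, so the fit is saturated.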