
Help: PLSR

2 messages · Shengzhe Wu, Bjørn-Helge Mevik

#
Hello,

I have a data set with 15 variables (the first one is the response) and
1200 observations. I am using the pls package to fit a PLSR model as below.

trainSet = as.data.frame(scale(trainSet, center = TRUE, scale = TRUE))
trainSet.plsr = mvr(formula, ncomp = 14, data = trainSet, method = "kernelpls",
                    model = TRUE, x = TRUE, y = TRUE)

From the model, I wish to know the values of Xvar (the amount of
X-variance explained by each component) and Xtotvar (the total
variance in X).

Because trainSet was scaled before training, I expected Xtotvar
to be equal to 14, but unexpectedly Xtotvar = 16562, and the
values of Xvar are also very large, summing to 16562. Why does
this happen? Is it because of the kernel algorithm?

Thank you,
Shengzhe
#
Shengzhe Wu writes:
[...]
Because the Xtotvar is the "total X variation", measured by sum(X^2)
(where X has been centered).  With 14 variables, scaled to sd == 1,
and 1200 observations, you should get Xtotvar == 14*(1200-1) ==
16786.  (Maybe you have 1184 observations: 14*1183 == 16562.)
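
The arithmetic above can be checked in base R without fitting a model. This is only a sketch: the matrix X below is simulated stand-in data (not the poster's trainSet), and the names total_variation and total_variance are made up for illustration.

```r
# Simulated stand-in for the 14 predictors (assumption: random normal data,
# 1200 observations; the real trainSet is not available here).
set.seed(1)
n <- 1200
p <- 14
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Centre each column and scale it to sd == 1, as in the original post.
Xs <- scale(X, center = TRUE, scale = TRUE)

# "Total X variation" as described in the reply: sum of squared centred values.
total_variation <- sum(Xs^2)

# Each scaled column has variance 1, so its sum of squares is (n - 1),
# and the total over p columns is p * (n - 1).
expected <- p * (n - 1)

# Dividing by (n - 1) recovers the sum of the column variances, i.e. p.
total_variance <- total_variation / (n - 1)

print(c(total_variation = total_variation,
        expected = expected,
        total_variance = total_variance))
```

So the value of 14 that was expected corresponds to Xtotvar / (n - 1). The proportion of X-variance explained by each component does not depend on that factor, since it is Xvar / Xtotvar.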