prcomp - arbitrary direction of the returned principal components
This reminds me of a situation in 1975 where a large computer service bureau had contracted to migrate scientific software from a Univac 1108 to an IBM System 360. They spent 3 weeks trying to get the IBM to give the same eigenvectors on a problem as the Univac. There were at least 2 eigenvalues that were equal. They were trying to fix something that was not broken.

Their desperation was enough to offer me a very large fee to "fix" things. However, I had a nice job, so told them to go away and read a couple of books on the real-symmetric eigenvalue problem or singular value decomposition, though the latter was just becoming known outside of numerical linear algebra.

I suspect the OP should go back to basics with principal components and not try to fiddle with the output. It is likely that the "loadings" (I'm never sure of the nomenclature -- I use the matrix setup) can be rotated, but you can't just rotate one vector of a set on its own.

Amazing how these old issues linger for decades. Or maybe linear algebra is not on the curriculum.

John Nash
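As an illustration of why the output is "not broken", here is a minimal sketch, on made-up data, showing that negating a component's loadings together with its scores gives an equally valid decomposition. The variable names and the simulated data are assumptions for the example only.

```r
# Toy data: two correlated variables (values are made up for illustration).
set.seed(1)
var1 <- rnorm(50)
var2 <- var1 + rnorm(50, sd = 0.5)
dat  <- cbind(var1, var2)

p <- prcomp(dat)

# Flip the first component: negate the loading column AND the score column.
rot_flipped <- p$rotation
x_flipped   <- p$x
rot_flipped[, 1] <- -rot_flipped[, 1]
x_flipped[, 1]   <- -x_flipped[, 1]

# The flipped version is just as valid: the loadings are still orthonormal,
# the scores are still the centred data times the loadings, and the
# component standard deviations are unchanged.
centred <- scale(dat, center = TRUE, scale = FALSE)
ok_orthonormal <- all.equal(unname(crossprod(rot_flipped)), diag(2))
ok_scores      <- all.equal(unname(centred %*% rot_flipped), unname(x_flipped))
ok_sdev        <- all.equal(unname(apply(x_flipped, 2, sd)), p$sdev)
c(ok_orthonormal, ok_scores, ok_sdev)
```

Since both sign choices satisfy every defining property, neither one is "the" correct answer; any preference has to come from outside the decomposition.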
On Thu, 2022-10-13 at 19:35 +0530, Ashim Kapoor wrote:
Dear All,

Many thanks for your replies.

My PC1 loading turns out to be: 1/sqrt(2), -1/sqrt(2)

In simple words: I had 2 variables and I ran prcomp on them. I got my PC1 as:

.7071068 var1 - .7071068 var2

PC2 turned out to be the same as PC1 with a PLUS replacing the minus, i.e.

.7071068 var1 + .7071068 var2

But forget PC2 for the time being.

Now my question is: I am not able to use the rule "choose the variable with the bigger magnitude of loading and multiply PC1 by -1 if that loading is negative" (flipping PC1 is allowed, since any vector x and its flipped version -x are the same vector but with opposite direction), because here both loadings have the same magnitude.

I have an alternative measure of stress which is trending UP and has 2 peaks during 2 recessions, and I can see that PC1 is trending DOWN and has 2 TROUGHS during the same recessions. That's why I wish to FLIP PC1 with a negative sign.

The data is not mine and I am not at liberty to share it. I can construct an artificial example but I would need time to do that.

That's what's happening.

Best Regards and Many thanks.
Ashim

On Thu, Oct 13, 2022 at 5:38 PM Ebert,Timothy Aaron <tebert at ufl.edu> wrote:
I still do not understand. However, the general approach would be to identify a
specific value to test. If the test is TRUE then do "this" otherwise do nothing. Once
the test condition is properly identified, the coding easily follows.
abs() is the same as
if x<0 then x = -x   (non-R code, just idea)
The R code might look something more like
for (number in 1:ncol(x)) {
  if (x[3, 2] < 0) {
    x[number, number] = -x[number, number]  # only change the diagonal
  }
}
Depending on what values need to be changed you may need a nested for loop to go
through all values of x[number1, number2].
Your words: " I can forcefully use a NEGATIVE sign to FLIP the index when it is LOW."
Where it appeared that "low" was defined as values that are negative. You still will
have low values (close to zero) and high values (far from zero).
You could make the condition some other value:
if x< -4 then x = -x
If you just want to rotate about zero then
x = -x
In this case the positive values will become negative and the negative values
positive.
Add an if test to selectively rotate based on the value of a single test element in x
(as in x[3,2]).
In debugging or troubleshooting, setting a seed is useful. For actual data analysis you
should not set a seed, or possibly better yet use set.seed(NULL).
Tim
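One way to make the "identify a specific value to test" idea concrete for PCA is to test the sign of the correlation between the component scores and an external reference series, such as the alternative stress measure mentioned elsewhere in this thread. The sketch below is only an illustration: `orient_to_reference` is a hypothetical helper (not part of R), and the data are simulated.

```r
# Hypothetical helper: flip component `k` when its scores are negatively
# correlated with a reference series of the same length.
orient_to_reference <- function(p, reference, k = 1) {
  if (cor(p$x[, k], reference) < 0) {
    p$x[, k]        <- -p$x[, k]
    p$rotation[, k] <- -p$rotation[, k]
  }
  p
}

# Made-up data: a latent "stress" series drives var1 up and var2 down,
# so the PC1 loadings have opposite signs and similar magnitudes.
set.seed(42)
stress <- cumsum(rnorm(100))
var1 <-  stress + rnorm(100, sd = 0.3)
var2 <- -stress + rnorm(100, sd = 0.3)

p <- orient_to_reference(prcomp(cbind(var1, var2)), reference = stress)

# After orientation the index rises when the reference rises,
# whichever sign prcomp happened to return.
cor(p$x[, 1], stress) > 0
```

The test depends only on the data and the reference, not on the R build, so the orientation should come out the same on every machine. It does assume a trustworthy reference series is available.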
-----Original Message-----
From: Ashim Kapoor <ashimkapoor at gmail.com>
Sent: Thursday, October 13, 2022 12:28 AM
To: Ebert,Timothy Aaron <tebert at ufl.edu>
Cc: R Help <r-help at r-project.org>
Subject: Re: [R] prcomp - arbitrary direction of the returned principal components
[External Email]
Dear Aaron,
Many thanks for your reply.
Please allow me to illustrate my query a bit.
I take some data, pass it to prcomp and extract the x matrix from the result.
From ?prcomp:
       x: if 'retx' is true the value of the rotated data (the centred
          (and scaled if requested) data multiplied by the 'rotation'
          matrix) is returned.  Hence, 'cov(x)' is the diagonal matrix
          'diag(sdev^2)'.  For the formula method, 'napredict()' is
          applied to handle the treatment of values omitted by the
          'na.action'.
I consider x[,1] as my index. This makes sense as x[,1] is the projection of the data
on the FIRST principal component.
Now this x[,1] can be a high +ve number or a low -ve number. I can't ignore the sign.
If I ignore the sign by taking the absolute value, the HIGH / LOW stress values will
be indistinguishable.
Hence I do not think using absolute values of x[,1] is the solution.
Yes it will make the results REPRODUCIBLE but that will be at the cost of losing
information.
Any other idea ?
Many thanks,
Ashim
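The point about losing information can be seen on a handful of made-up numbers standing in for x[,1]. This is only a sketch: `scores` is an invented vector, not real PCA output.

```r
# Made-up scores standing in for x[, 1]: negative and positive values.
scores <- c(-3, -1, 0, 2, 4)

abs(scores)   # 3 1 0 2 4 : -3 now ranks above 2, so high and low are mixed up
-scores       # 3 1 0 -2 -4 : the ordering is exactly reversed, nothing is lost

# Negation can be undone; abs() cannot.
identical(-(-scores), scores)

# A sign flip gives a perfectly (negatively) correlated series.
cor(scores, -scores)
```

So taking absolute values changes the index itself, whereas multiplying by -1 only changes which end of it counts as "high".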
On Wed, Oct 12, 2022 at 5:23 PM Ebert,Timothy Aaron <tebert at ufl.edu> wrote:
Use absolute value
Tim
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Ashim Kapoor
Sent: Wednesday, October 12, 2022 7:48 AM
To: R Help <r-help at r-project.org>
Subject: [R] prcomp - arbitrary direction of the returned principal components
[External Email]
Dear R experts,
From ?prcomp,
---- snip -----
Note:
     The signs of the columns of the rotation matrix are arbitrary, and
     so may differ between different programs for PCA, and even between
     different builds of R.
---- snip ------
My problem is that I am building an index based on Principal Components Analysis.
When the index is high it should indicate stress in the market.
Due to the arbitrary sign, sometimes I get an index which is HIGH when there is
stress and sometimes I get the OPPOSITE - an index which is LOW when there is stress.
This program is shared with other people who may have a different build of R.
I can forcefully use a NEGATIVE sign to FLIP the index when it is LOW. That works.
Now my query is: Just like we do set.seed(1234) and force the pattern of generation
of random numbers and make it REPRODUCIBLE, can I do something like:
set.direction.for.vector.in.pca(1234)
Now each time I do prcomp it should choose the SAME (high or low) direction of the
principal component on ANY computer having ANY version of R installed.
That's what I want. I don't want the returned principal component to be HIGH(LOW)
on my computer and LOW(HIGH) on someone else's computer. That would confuse the
people the code is shared with.
Is this possible? How do people deal with this?
Many thanks,
Ashim
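There is no set.direction.for.vector.in.pca() in R. A common workaround is to apply a fixed sign convention after calling prcomp. The sketch below uses one such convention: make the loading of largest magnitude positive in each component. `fix_signs` is a hypothetical helper written for this example, and the built-in USArrests data set is used only so the code runs as-is.

```r
# Hypothetical helper: apply a deterministic sign convention to a prcomp fit.
fix_signs <- function(p) {
  # For each component, take the sign of the loading with the largest
  # absolute value (which.max picks the first one if there is a tie).
  s <- apply(p$rotation, 2, function(v) sign(v[which.max(abs(v))]))
  p$rotation <- sweep(p$rotation, 2, s, `*`)
  if (!is.null(p$x)) {
    p$x <- sweep(p$x, 2, s, `*`)   # keep the scores consistent with the loadings
  }
  p
}

p <- fix_signs(prcomp(USArrests, scale. = TRUE))

# The dominant loading of every component is now positive.
apply(p$rotation, 2, function(v) v[which.max(abs(v))])
```

This rule becomes fragile when the largest loadings tie or nearly tie in magnitude, as with loadings of 1/sqrt(2) and -1/sqrt(2). In that situation a convention based on an external reference series is probably safer.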
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.