
PCA and ggplot2

10 messages · S Ellison, ashz, John Kane +1 more

#
Hi,

I have been trying to solve this myself and searching for an answer, without
success (a bit strange, since it should be an easy problem), so I would
appreciate your help:

My simple script is:
# Load the data: 5 columns and 100 rows
data1 <- read.csv("C:/?/MyPCA.csv")
pairs(data1[, 1:4])
pca1 <- princomp(data1[, 1:4], scores = TRUE, cor = TRUE)
biplot(pca1)

The biplot presents the data points as numbers. How can I present the data
points in color, depending on their group (column 5)? I was thinking about
doing it with ggplot2 but I have not succeeded. Any idea how to do it?

Thanks 



--
View this message in context: http://r.789695.n4.nabble.com/PCA-and-gglot2-tp4671225.html
Sent from the R help mailing list archive at Nabble.com.
#
It looks like you can, if I understand properly. Try this:
dat1  <-  data.frame(dat1$scores)
dat1$items  <-  rownames(data1)
ggplot(dat1, aes(Comp.1, Comp.2, colour = items)) + geom_point() +
   theme(legend.position="none")


John Kane
Kingston ON Canada
#
Perhaps the post at
http://www.codesofmylife.com/2012/06/07/plotting-pca-results-with-ggplot2/

would help?
 
(as would googling "biplot in ggplot2", which is how I found it...)

Incidentally, if you want base graphics biplots with points and colour coding, you'd need to modify the biplot code a bit or roll your own. 
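
For anyone who wants to try the "roll your own" route in base graphics, here is a minimal sketch. It is not a modification of biplot() itself; it simply plots the first two score columns coloured by group and overlays the loadings as arrows. The built-in iris data set stands in for the poster's CSV (an assumption: four numeric columns plus a grouping column in position 5), and the arrow scaling factor is an arbitrary choice for readability.

```r
# Stand-in data (assumption): iris has 4 numeric columns and a grouping
# factor in column 5, the same layout described for MyPCA.csv.
data1 <- iris

pca1   <- princomp(data1[, 1:4], cor = TRUE, scores = TRUE)
scores <- pca1$scores[, 1:2]               # observations on Comp.1 / Comp.2
loads  <- unclass(pca1$loadings)[, 1:2]    # variable loadings as a plain matrix
grp    <- factor(data1[[5]])               # grouping column

# Points coloured by group instead of row numbers.
plot(scores, col = as.integer(grp), pch = 19,
     xlab = "Comp.1", ylab = "Comp.2")
legend("topright", legend = levels(grp),
       col = seq_along(levels(grp)), pch = 19)

# Loading arrows, rescaled (arbitrarily) so they are visible among the scores.
arrow_scale <- 0.8 * max(abs(scores)) / max(abs(loads))
arrows(0, 0, loads[, 1] * arrow_scale, loads[, 2] * arrow_scale,
       length = 0.1, col = "grey30")
text(loads * arrow_scale * 1.1, labels = rownames(loads), col = "grey30")
```

With your own data, replace the iris line with your read.csv() call; the rest should carry over as long as column 5 holds the group labels.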

S Ellison


#
Hi,

Thanks. Fig 4 in the link you provided is what I am looking for.

I still do not know how to use my data1 and pca1 with the script you
provided, as I think it is only part of a full script.
"
data1 <- read.csv("C:/?/MyPCA.csv")
pca1 <- princomp(data1[, 1:4], scores = TRUE, cor = TRUE)
"

Am I right? How can I plug in my own data frames?

Thanks again



#
Sorry, I made a mistake. I was using some data of my own and didn't make some key changes to the script to match your variables.


dat1 <- data.frame(pca1$scores)    # creates the data.frame
dat1$items <- rownames(data1)      # adds item names
ggplot(dat1, aes(Comp.1, Comp.2, colour = items)) + geom_point() +
   theme(legend.position="none")

A quick look suggests that this is roughly the same plot as Fig 4 in the example, except that there the author uses geom_segment to add the lines. I have not looked at it all that carefully, though.
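
To get one colour per group rather than one per item, and to add the lines with geom_segment, something along these lines might work. This is only a sketch and not the code from the linked post: iris stands in for the CSV (assumed layout: four numeric columns, group in column 5), and the names `group`, `loads`, `variable` and the scaling factor `k` are made up for illustration.

```r
library(ggplot2)

data1 <- iris                                  # stand-in for MyPCA.csv (assumption)
pca1  <- princomp(data1[, 1:4], cor = TRUE, scores = TRUE)

# Scores plus the grouping column (column 5 of the data).
dat1 <- data.frame(pca1$scores)
dat1$group <- factor(data1[[5]])

# Loadings for the first two components, rescaled (arbitrarily) to the score range.
loads <- data.frame(unclass(pca1$loadings)[, 1:2])
loads$variable <- rownames(loads)
k <- 0.8 * max(abs(dat1$Comp.1), abs(dat1$Comp.2)) /
  max(abs(loads$Comp.1), abs(loads$Comp.2))
loads$Comp.1 <- loads$Comp.1 * k
loads$Comp.2 <- loads$Comp.2 * k

p <- ggplot(dat1, aes(Comp.1, Comp.2)) +
  geom_point(aes(colour = group)) +
  geom_segment(data = loads,
               aes(x = 0, y = 0, xend = Comp.1, yend = Comp.2),
               arrow = arrow(length = unit(0.2, "cm")), colour = "grey30") +
  geom_text(data = loads, aes(label = variable),
            colour = "grey30", vjust = -0.5)
print(p)
```

The key change from the script above is `colour = group`: mapping colour to the group column gives one colour per group, so the legend becomes useful and does not need to be hidden.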





John Kane
Kingston ON Canada
#
Dear John,

Thanks for the help.

I made some minor modifications to your script, as I had some problems:
... 
pca = PCA(data[,1:4], scale.unit=T, graph=F)
dat1  <-  data.frame(pca$scores)  # creates the data.frame
dat1$items  <-  rownames(data$group) # adds item names
ggplot(dat1, aes(pca$ind$coord[,1], pca$ind$coord[,2], colour = dat1$item)) +
   geom_point() + theme(legend.position="none")

I still do not get the points colored by group (column 5 of the csv file),
because dat1 is empty (data frame with 0 columns and 0 rows).

Any reason why?

Thanks again.



#
Hi,

Thanks to ssefick for the ggbiplot tip.

It works fine, so I am posting a general script that works, for future users.

library(ggbiplot)
data<-read.csv("C:/?/MyPCA.csv") 
data1<-data[,1:4] 
my.pca <- prcomp(data1, scale. = TRUE)
my.class<- data$Group  
g <- ggbiplot(my.pca, obs.scale = 1, var.scale = 1,groups = my.class,
ellipse = TRUE, circle = TRUE)
g <- g + scale_color_discrete(name = '')
g <- g + theme(legend.direction = 'horizontal', 
               legend.position = 'top')
print(g)

BTW, installation:
library(devtools)
install_github("vqv/ggbiplot")

You will need to install Rtools first
(http://cran.r-project.org/bin/windows/Rtools/)

Thanks a lot for the help.




#
I'm not sure what the problem is. I think I'd need to see your actual data and give it a try. If you want to supply your data or a sample of it, see ?dput for a convenient way to do so.

I see, though, that you've found a dedicated ggplot biplot, so it may not be worth your while.
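
For the record, one plausible explanation can be read off the posted code alone, without the data. The modified script switches from princomp() to PCA(), and, judging by the ggplot call in that same script, the result keeps its coordinates under pca$ind$coord; if it has no `scores` element, pca$scores is NULL and data.frame(NULL) is an empty data frame. Likewise, rownames() of a plain vector or factor is NULL. The sketch below uses a hand-built list as a hypothetical stand-in for the PCA() result so that it runs without extra packages.

```r
# Hypothetical stand-in for the object returned by PCA(): a list that keeps
# coordinates under $ind$coord and has no $scores element (assumption).
pca <- list(ind = list(coord = matrix(rnorm(20), ncol = 2)))
grp <- factor(rep(c("a", "b"), each = 5))   # stand-in for data$group

is.null(pca$scores)           # TRUE: `$` returns NULL for a missing element
dim(data.frame(pca$scores))   # 0 0: data frame with 0 columns and 0 rows
is.null(rownames(grp))        # TRUE: a factor has no row names

# Building the data frame from the parts that do exist:
dat1 <- data.frame(pca$ind$coord)
dat1$group <- grp             # use the group column itself, not its rownames
names(dat1)                   # "X1" "X2" "group"
```

If that is indeed the cause, taking the coordinates from pca$ind$coord, assigning the group column directly, and mapping colour = group inside aes() should give the colour separation.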

John Kane
Kingston ON Canada