Hi,

I have a problem making PCA plots that are readable. I would like to plot different symbols instead of the sample numbers or names that I currently get plotted (xlabs). How is this possible? With points(), I don't seem to get the right data plotted onto the PCA plot, as I do not quite understand where it is taken from. I don't know how to plot the correct columns of the prcomp outcome (p).

I would really appreciate it if someone could help me; I have struggled with this for days now. How can I make a function that gives different symbols for the points, depending on how big the number given to it as xlabs is?

Making the plots:

bout <- read.table(file = "S:\\SEDIM\\TRFLP\\B90-700.txt", sep = "\t", header = TRUE)
bout <- bout[-1]
p <- prcomp(bout)
biplot(p, choices = c(2, 3), scale = 1, pc.biplot = FALSE, var.axes = FALSE, ylabs = NULL,
       xlabs = c("119","175","135","330","51","422","67","409","470","70","67","89",
                 "135","215","330","409","470","51","80","119","175","222","301","422","280",
                 "171","256","243","404","37","157","28","187","70","42","283","261","85","147",
                 "204","235","411","514","77","204","87","366","306","351","371","38","534",
                 "199","407","42","167","480","195","22","35","80","433","43","109","214",
                 "363","292","61","115","178","273","521","72","126","253","288","501","83",
                 "113","250","359","498","19","130","389","324","24","58","124","388","319",
                 "164","101","153","383","345","219","179","161","375","298","450","555","439",
                 "54","54","490","465","411","18","85","503","455","394","179","187","416",
                 "447","219","461","164","366","474","167","236","507","319","509","467","507",
                 "450","359","507","192","453","101","456","512","517"),
       cex = 0.67, main = "90-700bp")
I am not entirely sure after reading your email, but I think you want to do something like this:

### Start of example

### create random data for the example
x <- rnorm(100, 100, 10)   ## create Xs
e <- rnorm(100, 0, 5)      ## create errors
y <- x + e                 ## create Ys

### plot
plot(y ~ x, pch = NA)            ## plot Ys against Xs but suppress all symbols (i.e. plot invisibly)
text(y ~ x, labels = round(x))   ## use the values of X (rounded to integers) as symbols for the X-Y plot

### End of example

So you could just substitute your variable names for x and y in the plot() and text() commands. Let us know whether your problem is solved.

Cheers,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Monna Nygård
Sent: Tuesday, June 17, 2008 5:04 AM
To: r-help at r-project.org
Subject: [R] PCA analysis
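On the question of which columns of the prcomp outcome to plot: prcomp() stores the sample scores in the x component of its result, so the components selected with choices = c(2, 3) in biplot() correspond to p$x[, c(2, 3)]. The following is only a minimal sketch with made-up data (dat is a hypothetical stand-in for the real table). Note that biplot() with scale = 1 rescales the scores, so the axis values here will not match the biplot exactly.

```r
# Made-up data standing in for the real table (assumption, not the poster's data)
set.seed(1)
dat <- data.frame(a = rnorm(30), b = rnorm(30), c = rnorm(30))

p <- prcomp(dat)

# Sample scores on the 2nd and 3rd principal components
scores <- p$x[, c(2, 3)]

# Set up empty axes, then draw the samples as symbols with points()
plot(scores, type = "n", xlab = "PC2", ylab = "PC3")
points(scores, pch = 16)
```

Because points() is given the same score columns that define the axes, the symbols land where the sample labels would otherwise be drawn.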
Monna,

The way I do it is to re-create the biplot for the PCA. I am attaching my code (I am sure this can be done more easily, but this works as well), where I am using the pca() function from labdsv and my data is called veg1.

library(labdsv)
pca.1 <- pca(veg1, cor = TRUE)

# The scores are what are typically plotted in a PCA "ordination", but we will
# scale them between -1 and 1 so you can plot them together with the loadings
scores <- pca.1$scores
nrows <- nrow(scores)
ncols <- ncol(scores)

# re-scaling
for (i in 1:nrows) {
  for (j in 1:ncols) {
    if (pca.1$scores[i, j] < 0)
      scores[i, j] <- (-1) * (pca.1$scores[i, j]) / min(pca.1$scores)
    else
      scores[i, j] <- pca.1$scores[i, j] / max(pca.1$scores)
  }
}

pc1 <- scores[, 1]
pc2 <- scores[, 2]
plot(pc1, pc2, pch = 16, cex = 2, col = "paleturquoise", xlim = c(-1, 1),
     xlab = "PC1", ylab = "PC2", main = "Principal Component Analysis, Region A")
## here you can make cex (point size), or pch (symbol), a function of your values
## so the points will differ, if you wish. The values need to be a vector of
## numbers and not characters, and your numbers are too big, so you should
## rescale them somehow, let's say between 1 and 3
abline(v = 0, lty = 2, col = "green")
abline(h = 0, lty = 2, col = "green")

# add the loadings
loadings <- pca.1$loadings
load1 <- loadings[, 1]
load2 <- loadings[, 2]
c0 <- rep(0, length(load1))
d0 <- rep(0, length(load1))
c1 <- load1
d1 <- load2
segments(c0, d0, c1, d1, col = "grey")
load <- cbind(load1, load2)
points(load, pch = 17, cex = 2, col = "darkblue")
# click on the loading points to label them interactively
identify(load[, 1], load[, 2], dimnames(load)[[1]], col = "deeppink3", font = 2)

I hope this helps,
Monica
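The rescaling of the label values to a range such as 1 to 3, mentioned in the code comment above, could be sketched as follows. This is only an illustration: rescale_range() is a hypothetical helper (not part of labdsv or base R), vals holds the first six xlabs values from the original post, and the plotting coordinates are placeholders rather than real PCA scores.

```r
# First six xlabs values from the original post, converted to numbers
vals <- as.numeric(c("119", "175", "135", "330", "51", "422"))

# Hypothetical helper: map a numeric vector linearly onto [lo, hi]
rescale_range <- function(v, lo = 1, hi = 3) {
  lo + (hi - lo) * (v - min(v)) / (max(v) - min(v))
}

sizes <- rescale_range(vals)

# Placeholder coordinates; in practice these would be two columns of scores
plot(seq_along(vals), vals, pch = 16, cex = sizes,
     xlab = "sample", ylab = "label value")
```

The smallest value gets cex = 1 and the largest cex = 3, so the symbols stay readable whatever the original magnitude of the numbers.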
Hi Monna,

I cannot get it done with the princomp and biplot commands either (maybe somebody can), but there are many roads to Rome. This is how you can do it (below). However, the label = rep(...) line below assumes that your values are in order, i.e. that you really want to plot the first fifty rows with one symbol, the second fifty with another, and so forth. If your values are not ordered, you will either have to order your dataset or create a variable that indicates the condition by which you want to group your data and choose the symbols. Assigning this variable for your desired grouping would then most likely involve a loop or a nested ifelse() statement, unless you already have this variable. You then assign your grouping variable to the "pch" argument (for different symbols), the "col" argument (for different colors), or both.

## create data
z <- sample(401:600)
y <- sample(701:900)
x <- sample(1:200)
df <- data.frame(x, y)
df <- cbind(df, z)

## pc analysis
pc <- prcomp(df)

## inspect results
pc
summary(pc)
pc$rotation

## compute pc values for each observation
## (uncentred; prcomp also stores the centred scores in pc$x)
pc.data <- as.matrix(df) %*% pc$rotation

## check
pc.data

## create point labels
label <- rep(1:4, each = 50)

## plot first PC versus second PC,
## with symbol and color given by the variable label
plot(pc.data[, 1], pc.data[, 2], pch = label, col = label,
     xlab = "First principal component", ylab = "Second principal component")

---------------------------------------------------

Thank you for your reply. pch=NA got rid of the numbers or names of the samples that I'm plotting. The problem of how I can replace these with different symbols still remains. I know I can use points() to add symbols, but I can't get the right values plotted from the outcome of princomp(data). The class of the object is princomp, and I can't specify which columns should be plotted for the points.
For example (my real data frame consists of multiple (hundreds of) columns of data for ca. 200 samples):

z <- sample(401:600)
y <- sample(701:900)
x <- sample(1:200)
df <- data.frame(x, y)
df <- cbind(df, z)
p <- princomp(df)
biplot(p, pch = NA)
row.names(df) <- 1:200

Now I would like, for instance, all the samples that have row.names under 50 to be plotted with one symbol, the ones from 50-100 with another, and so on. Do I need a special function for specifying these different symbols when my samples are not in the correct order?

As you realize, I am quite new to R. Thank you so much for taking the time to help me, I really appreciate it.

Regards,
Monna
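When the samples are not in order, one possible alternative to a loop or a nested ifelse() is cut(), which bins a numeric vector into groups that can be passed straight to pch and col. The sketch below is an illustration under stated assumptions: ids is a hypothetical vector of sample numbers in shuffled order, the break points (1-50, 51-100, and so on) are a guess at the intended grouping, and the data are random. It uses prcomp(), whose scores are in pc$x; with princomp() the scores are in the scores component instead.

```r
set.seed(42)
# Random example data, same shape as in the messages above
df <- data.frame(x = sample(1:200), y = sample(701:900), z = sample(401:600))

# Hypothetical sample numbers, deliberately NOT in order
ids <- sample(1:200)

pc <- prcomp(df)

# Bin the sample numbers: 1-50 -> 1, 51-100 -> 2, 101-150 -> 3, 151-200 -> 4
grp <- cut(ids, breaks = c(0, 50, 100, 150, 200), labels = FALSE)

# One symbol and color per group, plotted on the first two components
plot(pc$x[, 1], pc$x[, 2], pch = grp, col = grp,
     xlab = "First principal component", ylab = "Second principal component")
legend("topright", legend = c("1-50", "51-100", "101-150", "151-200"),
       pch = 1:4, col = 1:4)
```

Because grp is computed from the sample numbers themselves, the row order of the data no longer matters; changing the breaks vector changes the grouping.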