Hi,
I have a problem making PCA plots that are readable.
I would like to plot different symbols instead of the sample numbers or
names that I currently get plotted (xlabs).
How is this possible? With points(), I don't seem to get the right data plotted
onto the PCA plot, because I do not quite understand where the plotted values
are taken from. I don't know how to
plot the correct columns of the prcomp result (p).
I would really appreciate it if someone could help me; I have struggled with this
for days now. How can I make a function that gives different symbols
for the points, depending on how big the number given to it as xlabs is?
Making the plots:
bout <- read.table(file = "S:\\SEDIM\\TRFLP\\B90-700.txt", sep = "\t",
                   header = TRUE)
bout <- bout[-1]   # drop the first column
p <- prcomp(bout)
biplot(p, choices = c(2, 3), scale = 1, pc.biplot = FALSE, var.axes = FALSE,
ylabs = NULL,
xlabs=c("119","175","135","330","51","422","67","409","470","70","67","89","135","215","330","409","470","51","80","119","175","222","301","422","280","171","256","243","404","37","157","28","187","70","42","283","261","85","147","204","235","411","514","77","204","87","366","306","351","371","38","534","199","407","42","167","480","195","22","35","80","433","43","109","214","363","292","61","115","178","273","521","72","126","253","288","501","83","113","250","359","498","19","130","389","324","24","58","124","388","319","164","101","153","383","345","219","179","161","375","298","450","555","439","54","54","490","465","411","18","85","503","455","394","179","187","416","447","219","461","164","366","474","167","236","507","319","509","467","507","450","359","507","192","453","101","456","512","517"),
cex=0.67, main="90-700bp")
I am not entirely sure after reading your email, but I thought you wanted to
do something like this:
### Start of example
### create random data for the example
x <- rnorm(100, 100, 10)   ## create Xs
e <- rnorm(100, 0, 5)      ## create errors
y <- x + e                 ## create Ys
### plot
## plot Ys against Xs but suppress all symbols (i.e. plot invisibly)
plot(y ~ x, pch = NA)
## use the values of X (rounded to integers) as symbols for the X-Y plot
text(y ~ x, labels = round(x))
### End of example
So you could just substitute your variable names for x and y in the plot()
and text() commands. Let us know whether your problem is solved.
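Applied to a PCA, the same idea might look like the sketch below. It assumes the scores come from prcomp(), which stores them in the `x` component (one row per sample, one column per principal component). The data and the labels here are simulated stand-ins, not the real TRFLP table. Note that biplot() rescales the scores depending on its `scale` argument, so the axis values will differ from those of the biplot.

```r
## Simulated stand-in for the real data (assumption: 20 samples x 5 variables)
set.seed(1)
dat  <- matrix(rnorm(100), nrow = 20, ncol = 5)
labs <- sample(18:555, 20)   # hypothetical numeric sample labels

p <- prcomp(dat)
scores <- p$x                # one row per sample, one column per PC

## Empty plot of PC2 against PC3, then write the labels at the score coordinates
plot(scores[, 2], scores[, 3], type = "n", xlab = "PC2", ylab = "PC3")
text(scores[, 2], scores[, 3], labels = labs, cex = 0.67)
```

Columns 2 and 3 of `p$x` correspond to `choices = c(2, 3)` in the biplot() call from the question.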
Cheers,
Daniel
-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Monna Nygård
Sent: Tuesday, June 17, 2008 5:04 AM
To: r-help at r-project.org
Subject: [R] PCA analysis
Monna,
The way I do it is to re-create the biplot for the PCA. I am attaching my
code (I am sure this can be done more easily, but this works as well), where
I am using the pca() function from labdsv and my data is called veg1.
library(labdsv)
pca.1 <- pca(veg1, cor = TRUE)
# The scores are what are typically plotted in a PCA "ordination", but
# we will scale them between -1 and 1 so you can plot them together with the loadings
scores <- pca.1$scores
nrows <- nrow(scores)
ncols <- ncol(scores)
# re-scaling: negative scores are divided by |min|, all others by max
for (i in 1:nrows) {
  for (j in 1:ncols) {
    if (pca.1$scores[i, j] < 0) {
      scores[i, j] <- (-1) * pca.1$scores[i, j] / min(pca.1$scores)
    } else {
      scores[i, j] <- pca.1$scores[i, j] / max(pca.1$scores)
    }
  }
}
pc1 <- scores[, 1]
pc2 <- scores[, 2]
plot(pc1, pc2, pch = 16, cex = 2, col = "paleturquoise", xlim = c(-1, 1),
     xlab = "PC1", ylab = "PC2",
     main = "Principal Component Analysis, Region A")
## Here you can make cex a function of your values so the points will have
## different sizes, if you wish (or pch, for different symbols). The sizes
## need to be a vector of numbers and not characters, and your numbers are
## too big, so you should probably rescale them somehow, say between 1 and 3.
abline(v = 0, lty = 2, col = "green")
abline(h = 0, lty = 2, col = "green")
# add the loadings as grey segments from the origin
loadings <- pca.1$loadings
load1 <- loadings[, 1]
load2 <- loadings[, 2]
c0 <- rep(0, length(load1))
d0 <- rep(0, length(load1))
c1 <- load1
d1 <- load2
segments(c0, d0, c1, d1, col = "grey")
load <- cbind(load1, load2)
points(load, pch = 17, cex = 2, col = "darkblue")
# click on a loading to label it with its variable name
identify(load[, 1], load[, 2], dimnames(load)[[1]], col = "deeppink3",
         font = 2)
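The rescaling of the label numbers mentioned in the comment above could be sketched as follows. The helper name `rescale` is hypothetical (it is not part of labdsv or base R), the values are the first few xlabs from the question, and the coordinates are random stand-ins for pc1 and pc2.

```r
## First few numeric sample labels from the question
vals <- as.numeric(c("119", "175", "135", "330", "51", "422"))

## Hypothetical helper: linear rescaling of v onto the interval [lo, hi]
rescale <- function(v, lo = 1, hi = 3) {
  lo + (hi - lo) * (v - min(v)) / (max(v) - min(v))
}
sizes <- rescale(vals)

## Toy coordinates standing in for pc1 and pc2 (assumption)
set.seed(2)
pc1 <- runif(length(vals), -1, 1)
pc2 <- runif(length(vals), -1, 1)

## Point size now follows the label value
plot(pc1, pc2, pch = 16, cex = sizes, col = "paleturquoise",
     xlim = c(-1, 1), xlab = "PC1", ylab = "PC2")
```

The smallest label is drawn with cex = 1 and the largest with cex = 3; everything else falls in between.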
I hope this helps,
Monica
Hi Monna,
I cannot get it done with the princomp and biplot commands either (maybe
somebody else can), but many roads lead to Rome. This is how you can do it
(below). However, the label=rep... below assumes that your values are in
order, i.e. that you really want to plot the first fifty rows with one
symbol, the second fifty with another, and so forth. If your values are not
ordered, you will either have to order your dataset or create a variable that
indicates the condition by which you want to group your data and choose the
symbols. Assigning this variable for your desired grouping would then most
likely involve a loop or a nested ifelse() statement, unless you already
have this variable. You then assign your grouping variable to the "pch"
argument (for different symbols), the "col" argument (for different colors),
or both.
##create data
z <- sample(401:600)
y <- sample(701:900)
x <- sample(1:200)
df <- data.frame(x, y)
df <- cbind(df, z)
##pc analysis
pc <- prcomp(df)
##inspect results
pc
summary(pc)
pc$rotation
##compute pc values for each observation
##(uncentered projection; pc$x holds the same scores after centering,
##i.e. shifted by a constant along each axis)
pc.data <- as.matrix(df) %*% pc$rotation
##check
pc.data
##create point labels
label <- rep(1:4, each = 50)
##plot first PC versus second PC, with the group indicated by the variable label
plot(pc.data[, 1], pc.data[, 2], pch = label, col = label,
     xlab = "First principal component", ylab = "Second principal component")
---------------------------------------------------
Thank you for your reply. pch=NA got rid of the numbers or names of the
samples that I'm plotting. The problem of how to replace these with
different symbols still remains. I know I can use points() to add
symbols, but I can't get the right values plotted from the result of
princomp(data). The class of the object is princomp, and I can't specify
which columns should be plotted for the points.
Example (my real data frame consists of hundreds of columns of data for ca. 200
samples):
z <- sample(401:600)
y <- sample(701:900)
x <- sample(1:200)
df <- data.frame(x, y)
df <- cbind(df, z)
p <- princomp(df)
biplot(p, pch = NA)
row.names(df) <- 1:200
Now I would like, for instance, all the samples that have row.names under 50
to be plotted with one symbol, the ones from 50-100 with another, and so on.
Do I need a special function for specifying these different symbols when my
samples are not in the correct order? As you realize, I am quite new to R.
Thank you so much for taking the time to help me, I really appreciate it.
Regards,
Monna
> From: daniel at umd.edu
> To: monnire at hotmail.com; r-help at r-project.org
> Subject: AW: [R] PCA analysis
> Date: Tue, 17 Jun 2008 19:40:41 -0400
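For the grouping question in this exchange, one alternative to a loop or nested ifelse() is cut(), which bins a numeric vector and works regardless of row order. The following is a minimal sketch under stated assumptions: the data are simulated as in the example above, the row names are shuffled to mimic unordered samples, and the bin edges (1-50, 51-100, 101-150, 151-200) are one reading of the request. For princomp() the scores are in the `scores` component; for prcomp() they are in `x`.

```r
set.seed(3)
df <- data.frame(x = sample(1:200), y = sample(701:900), z = sample(401:600))
row.names(df) <- sample(1:200)   # sample numbers in arbitrary order (assumption)

p <- princomp(df)
scores <- p$scores               # for prcomp() use p$x instead

## Bin the sample numbers: (0,50], (50,100], (100,150], (150,200]
id  <- as.numeric(row.names(df))
grp <- cut(id, breaks = c(0, 50, 100, 150, 200), labels = FALSE)

## One symbol and one colour per bin
plot(scores[, 1], scores[, 2], pch = grp, col = grp,
     xlab = "Comp.1", ylab = "Comp.2")
legend("topright", legend = c("1-50", "51-100", "101-150", "151-200"),
       pch = 1:4, col = 1:4)
```

With the real labels (which run up to 555 in the question), only the `breaks` vector and the legend text would need to change.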