Olivier Eterradossi
2013-Oct-11 12:08 UTC
[R] behaviour of read.xls (gdata package) when worksheet uses user-defined cell formats
(I'm afraid this post didn't reach the list last Wednesday, so here it is again.)

Hi R-list,

And sorry for my Frenglish! I am running the R "Good Sport" release (i386-w64-mingw32/i386, 32-bit) under Windows 7 Professional, Service Pack 1. My Perl executable is ActivePerl build 817 [257965] (i.e. version 5.8.8.817). Usually it works fine.

Using the gdata::read.xls function, which I am used to, I am now facing a stupid, probably trivial problem that I have never encountered before and cannot fix by myself.

I have received Excel files (saved as Excel 97-2003 files) in which the values (say 11.6185410334347) are displayed in the cells in a user-defined format (here the format is defined as "0.00", giving a displayed value of 0.12) but are fully and correctly displayed in the formula bar. In these files, the cells in the first 4 rows are used for header information (strings).

I read any of the files using:

    foo.data <- read.xls(xls=my.path.to.xls.file, sheet=sheet.number, perl=my.perl.path, as.is=TRUE, pattern="[0123456789]", head=FALSE)

The value I get in the resulting foo.data$the.variable is the displayed value (0.12), not the "true", underlying value of 11.6185..., and it is neither a factor nor a string:

    > is.what(foo.data$the.variable)
    [1] "is.atomic" "is.double" "is.numeric" "is.standard" "is.unsorted" "is.vector"
    > foo.data$the.variable[2]*1e10
    [1] 1.2e+09

Is this normal behaviour? Is it related to how I chose the read.xls arguments? How should I specify the arguments to recover the full, "true" values directly from the files (without changing the format manually to "standard", of course)? Or is it related to my Perl executable? Or am I missing something?

Thanks for helping,
all the best,
Olivier
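One possible explanation, offered as an assumption and not confirmed anywhere in this thread: read.xls works by having a Perl script translate the worksheet into a temporary CSV file, which R then reads. If that translation writes each cell as its formatted text instead of its stored number, the extra digits are already gone before R sees the data, and the result is still numeric. The base-R sketch below only imitates that effect; `full_value` is a hypothetical number chosen so that a "0.00" format shows 0.12, and `sprintf` stands in for the spreadsheet's formatting step.

```r
# Hypothetical stored value (not taken from the poster's file), chosen so
# that a two-decimal format displays it as 0.12.
full_value <- 0.116185410334347

# Stand-in for the "0.00" cell format: the number becomes two-decimal text.
displayed_text <- sprintf("%.2f", full_value)

# Reading that text back (as a CSV import would) yields a plain double
# that carries only the displayed digits.
read_back <- as.numeric(displayed_text)

# The same check as in the post: scaling makes the lost digits obvious.
print(read_back * 1e10)
print(format(full_value * 1e10, scientific = FALSE))

# The round-tripped value is numeric but no longer equals the original.
print(is.numeric(read_back))
print(isTRUE(all.equal(read_back, full_value)))
```

If the imported column behaves like `read_back` here, the truncation happened during conversion and not in how R parsed the result.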
Arnaud Michel
2013-Oct-14 15:27 UTC
[R] add points to an existing ggplot from another data frame
Hello,

I have plotted the results of a PCA (Principal Component Analysis) with ggplot2 (code 1 below). Is it possible to add to this plot the 75% confidence ellipse of each sex, as calculated by dataEllipse (library(car)) (code 2 below)?

My code is:

1) PCA is a data frame with 169 rows and 2 columns

    library(ggplot2)
    p <- ggplot(PCA, aes(x=F1, y=F2))
    p + geom_point(aes(colour = Sexe, shape = Sexe), size=3) +
      geom_hline(yintercept = 0) +
      geom_vline(xintercept = 0) +
      labs(title = "ACP", x = "Facteur 1", y = "Facteur 2", fill = "Sexe") +
      theme(legend.position = c(0.95,0.95), legend.background = element_rect(colour = "black"))

2)

    library(car)
    XH <- dataEllipse(X1[ACP$Sexe=="H"], X2[ACP$Sexe=="H"], levels=0.75, lty=2, add=TRUE, plot.points=FALSE, center.cex=0, col=4)
    XF <- dataEllipse(X1[ACP$Sexe=="F"], X2[ACP$Sexe=="F"], levels=0.75, lty=2, add=TRUE, plot.points=FALSE, center.cex=0, col=2)

XH and XF are two matrices with 2 columns and 52 rows.

Thank you for your help

--
Michel ARNAUD
Chargé de mission auprès du DRH
DGDRD-Drh - TA 174/04
Av Agropolis
34398 Montpellier cedex 5
tel : 04.67.61.75.38
fax : 04.67.61.57.87
port: 06.47.43.55.31
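A minimal sketch of one way this could be done, assuming the ellipse outlines are available as two-column matrices like XH and XF: stack them into one data frame whose column names match the plot's aesthetics, then pass that data frame to `geom_path()` through its `data` argument. Everything below is simulated for illustration; the `PCA` values and the circle-shaped stand-ins for `XH` and `XF` are hypothetical and do not come from dataEllipse.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the PCA scores (169 rows, as in the post).
PCA <- data.frame(
  F1   = rnorm(169),
  F2   = rnorm(169),
  Sexe = sample(c("H", "F"), 169, replace = TRUE)
)

# Hypothetical stand-ins for the 52 x 2 outline matrices that
# dataEllipse returned in the post (here: simple circles).
theta <- seq(0, 2 * pi, length.out = 52)
XH <- cbind(x = cos(theta),       y = sin(theta))
XF <- cbind(x = 0.5 + cos(theta), y = 0.8 * sin(theta))

# Stack both outlines into one data frame that uses the same column
# names as the main plot (F1, F2, Sexe), so the aesthetics are inherited.
ell <- rbind(
  data.frame(F1 = XH[, 1], F2 = XH[, 2], Sexe = "H"),
  data.frame(F1 = XF[, 1], F2 = XF[, 2], Sexe = "F")
)

# The points come from PCA; the outlines come from the second data frame.
p <- ggplot(PCA, aes(x = F1, y = F2)) +
  geom_point(aes(colour = Sexe, shape = Sexe), size = 3) +
  geom_path(data = ell, aes(colour = Sexe, group = Sexe), linetype = 2) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  labs(title = "ACP", x = "Facteur 1", y = "Facteur 2")

print(p)
```

`geom_path()` is used instead of `geom_line()` because it connects the points in row order, which is what a closed outline needs. Giving `ell` the same `Sexe` column lets the ellipses share the colour scale and legend of the points.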