Hi,
I have a comma-separated file with element names in the first column, as
shown below:
Name_1,0
Name_2,0.8878,0
Name_3,0.6777,0.7643,0
Name_4,0.9844,0.1234,0.1414,0
The original data is a 10000x10000 symmetric matrix (600 MB). To reduce the
file size, I have stored only the lower triangle. Is there a (memory-)
efficient way to 1) read the file, 2) compute the first and second principal
components, and 3) plot the first vs. second PCs?
In the past, I could do this with:
# distance.csv is the complete data matrix, so this command worked
b <- read.csv("distance.csv", sep=",", header=FALSE)
my_matrix <- data.matrix(b)
pca2 <- princomp(my_matrix)
plot(pca2$scores[,1], pca2$scores[,2])
text(pca2$scores[,1], pca2$scores[,2], rownames(my_matrix), cex=0.5, pos=1)
This time I don't have the complete file, so I was wondering how to do this.
Any help is much appreciated
TIA
M
--
View this message in context:
http://r.789695.n4.nabble.com/Principal-componet-plot-from-lower-triangular-matrix-file-tp4114840p4114840.html
Sent from the R help mailing list archive at Nabble.com.
R distance objects are triangular, so consider as.dist(). It requires the
square matrix as input, which could be reconstructed (or you may have it
already). I do not know whether there is a biglm()-style alternative to
princomp(), but consider using subsets of your data, because that plot, if
created, is going to be very cluttered.
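
The reconstruction step might look roughly like the sketch below. It assumes the file layout shown in the original post (a name, then i values on row i, with the diagonal last). read_lower_tri() is a hypothetical helper written for illustration, not an existing R function, and the temporary file only stands in for the real data. Note that the full 10000x10000 matrix of doubles still needs about 800 MB in memory.

```r
# Hypothetical helper: rebuild the full symmetric matrix from a
# lower-triangle CSV whose row i is "name,v_1,...,v_i".
read_lower_tri <- function(path) {
  fields <- strsplit(readLines(path), ",", fixed = TRUE)
  n      <- length(fields)
  labels <- vapply(fields, `[`, character(1), 1)   # first field is the name
  m <- matrix(0, n, n, dimnames = list(labels, labels))
  for (i in seq_len(n)) {
    vals <- as.numeric(fields[[i]][-1])            # drop the name column
    stopifnot(length(vals) == i)                   # assumed layout: i values on row i
    m[i, seq_len(i)] <- vals
  }
  m[upper.tri(m)] <- t(m)[upper.tri(m)]            # mirror lower triangle into upper
  m
}

# Small stand-in for the real file, using the four rows from the post
tmp <- tempfile(fileext = ".csv")
writeLines(c("Name_1,0",
             "Name_2,0.8878,0",
             "Name_3,0.6777,0.7643,0",
             "Name_4,0.9844,0.1234,0.1414,0"), tmp)

full <- read_lower_tri(tmp)   # square matrix, usable with princomp()
d    <- as.dist(full)         # compact triangular "dist" object
```

The square matrix can be passed to princomp() as before, while the "dist" object keeps only one triangle if a distance-based method is used instead.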
      HTH
        Ken Hutchison
On Nov 28, 2554 BE, at 5:55 AM, cm <mbnchakravarthy at gmail.com> wrote:
> Hi,
> 
> I have a comma separated file with element names in first column like shown
> below :
> 
> Name_1,0
> Name_2,0.8878,0
> Name_3,0.6777,0.7643,0
> Name_4,0.9844,0.1234,0.1414,0
> 
> Original data is a 10000x10000 symmetric matrix (600 MB). To reduce file
> size, I have minimized matrix to only lower triangle. Is there a (memory)
> efficient way to 1) read file 2) compute first and second principal
> components and 3) and plot first vs second PC's ?
> 
> In the past, I could do this by :
> b <- read.csv("distance.csv", sep=",", head=F)  # distance.csv file is
> complete data matrix, so this command worked !!
> my_matrix <- data.matrix(b)
> pca2 <- princomp(my_matrix)
> plot(pca2$scores[,1],pca2$scores[,2])
> text(pca2$scores[,1],pca2$scores[,2],rownames(nba_matrix), cex=0.5, pos=1)
> 
> This time, I don't have a complete file. So, I was wondering, how to do
> this ?
> 
> Any help is much appreciated
> 
> TIA
> M
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
marella
2011-Nov-28  14:21 UTC
[R] Principal componet plot from lower triangular matrix file
Yes, I agree that the plot is going to be crowded. But the idea is to see whether elements of the same type (distinguished by color code, etc.) group together or not. I need only the first two principal components (three at most). Since princomp calculates all components, it is taking a very long time!
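
One way to avoid computing every component is to iterate on just the leading few directions. The sketch below uses orthogonal (subspace) iteration in base R. It is not what princomp() does internally, and top_pcs() with its arguments is a hypothetical name chosen for illustration. It assumes the leading eigenvalues are well separated, and the random matrix only stands in for the real data. Contributed packages such as irlba offer truncated decompositions that may suit a 10000x10000 matrix better.

```r
# Sketch: leading k principal components by orthogonal (subspace) iteration,
# so only k directions are ever computed.
top_pcs <- function(x, k = 2, iters = 500, tol = 1e-12) {
  xc <- sweep(x, 2, colMeans(x))                  # centre columns (makes a copy)
  v  <- qr.Q(qr(matrix(rnorm(ncol(xc) * k), ncol = k)))  # random orthonormal start
  for (it in seq_len(iters)) {
    w     <- crossprod(xc, xc %*% v)              # (X'X) v without forming X'X
    v_new <- qr.Q(qr(w))                          # re-orthonormalise the k columns
    # converged when each direction matches the previous one up to sign
    done  <- max(abs(abs(diag(crossprod(v_new, v))) - 1)) < tol
    v <- v_new
    if (done) break
  }
  list(loadings = v, scores = xc %*% v)
}

# Stand-in data with two dominant directions
set.seed(42)
x   <- matrix(rnorm(200 * 10), 200, 10) %*% diag(c(10, 6, 3, rep(1, 7)))
fit <- top_pcs(x, k = 2)

# Largest disagreement with prcomp() scores, ignoring sign flips
ref      <- prcomp(x)$x[, 1:2]
max_diff <- max(abs(abs(fit$scores) - abs(ref)))

plot(fit$scores[, 1], fit$scores[, 2], xlab = "PC1", ylab = "PC2")
```

Each iteration costs two matrix products with a k-column matrix, so for k = 2 the work per pass is small compared with a full decomposition. Passing col= to plot() with one color per element type would show whether the types group together.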