Martin Tomko
2010-Aug-12 09:30 UTC
[R] help usin scan on large matrix (caveats to what has been discussed before)
Dear all,

I have a few points that I am unsure about when using scan. I know that it is covered in the intro to R, and has also been discussed here: http://www.mail-archive.com/r-help at r-project.org/msg04869.html but nevertheless, I cannot get it to work.

I have a potentially very large matrix that I need to read in (35 MB). I am about to run it on a server with 16 GB of memory, so I hope it will work. Ultimately I only need to run image() on it to produce a heatmap. read.table crashes on it and is slow, so I would like to read it using scan.

The file where I store it has the following format:

    "V1" "V2" "V3" "V4" "V5"
    "1" 508 424 208 111 66
    "2" 59 101 95 113 81
    "3" 26 30 24 17 18
    "4" 4 0 8 3 9
    "5" 0 0 0 0 0
    "6" 0 0 0 0 0

where the first line holds the column names and the first column the row names. read.table works perfectly on this without any parameters (the file was written using write.table).

I use:

    rows<-length(R)
    cols <- max(unlist(lapply(R,function(x) length(unlist(gregexpr(" ",x,fixed=TRUE,useBytes=TRUE))))))
    c<-scan(file=f,what=list(c("",(rep(integer(0),cols)))), skip=1)
    m<-matrix(c, nrow = rows, ncol=cols,byrow=TRUE);

For some reason I end up with a character matrix, which I don't want. Is this the proper way to skip the first column (this is not documented anywhere - how does one skip the first column in scan?)? Is my way of specifying "integer(0)" correct?

And finally, would a sparse matrix package be more appropriate, and can I use a sparse matrix with the image() function to produce typical heatmaps? I have seen that some sparse matrix packages produce different-looking output, which would not be appropriate.

Thanks
Martin
peter dalgaard
2010-Aug-12 10:24 UTC
[R] help usin scan on large matrix (caveats to what has been discussed before)
On Aug 12, 2010, at 11:30 AM, Martin Tomko wrote:

> c<-scan(file=f,what=list(c("",(rep(integer(0),cols)))), skip=1)
> m<-matrix(c, nrow = rows, ncol=cols,byrow=TRUE);
>
> for some reason I end up with a character matrix, which I don't want. Is this the proper way to skip the first column (this is not documented anywhere - how does one skip the first column in scan???). is my way of specifying "integer(0)" correct?

No. Well, integer(0) is just superfluous where 0L would do, since scan only looks at the types, not the contents. More importantly, what= wants a list of as many elements as there are columns, and you gave it

    > list(c("",(rep(integer(0),5))))
    [[1]]
    [1] ""

I think what you actually meant was

    c(list(NULL),rep(list(0L),5))

--
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
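For readers following along, the suggestion above can be tried end to end with a minimal sketch like the one below. It is not code from the thread: the temporary file, the variable names (f, cols, fields, m) and the way the column count is taken from the header line are all assumptions made for illustration, and only the first three data rows of the sample are used.

```r
# Stand-in for the real file: the header plus three rows of the sample
# from the question (the file name and its contents are illustrative only).
f <- tempfile(fileext = ".txt")
writeLines(c(
  '"V1" "V2" "V3" "V4" "V5"',
  '"1" 508 424 208 111 66',
  '"2" 59 101 95 113 81',
  '"3" 26 30 24 17 18'
), f)

# Assumption: the number of data columns equals the number of header names.
cols <- length(scan(f, what = "", nlines = 1, quiet = TRUE))

# One list component per field: NULL skips the row-name field,
# 0L asks for an integer column (only the type matters, not the value).
fields <- scan(f, what = c(list(NULL), rep(list(0L), cols)),
               skip = 1, quiet = TRUE)

# Drop the skipped component and bind the column vectors into a matrix.
m <- do.call(cbind, Filter(Negate(is.null), fields))
```

Because scan returns one vector per column here, the columns are combined with cbind rather than passed to matrix(..., byrow=TRUE). The resulting integer matrix should then be usable directly as the input to image().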