Hi All,

I am dealing with a large data set that translates into a sparse matrix. The data is spread across approximately 17,000+ files, each defining one row. Each file has a variable number of records, and each record gives a column number and the value stored in that column.

I wish to load this data into memory from these files one by one. Is there any way I can do this in R before I start processing? I am sure this is not the first time the R community has been confronted with this kind of problem, but I could not find documentation on loading data into a sparse matrix. I found quite a few packages for sparse matrices, but they all concentrate on the operations you can perform once the matrix is loaded. I need to load the data first before I can think about analysing it.

Regards,
Atul.

Graduate Student,
Department of Computer Science,
University of Minnesota Duluth,
Duluth, MN, 55812.
--------
www.d.umn.edu/~kulka053
Both SparseM and Matrix have rbind and cbind facilities that let you concatenate pieces of sparse matrices, so you can build each row as its own sparse matrix and then bind the rows together.

On Nov 23, 2008, at 2:19 PM, Atul Kulkarni wrote:
> [...]
> I wish to load this data in memory from these files one by one. Is
> there anyway I can do this in R, before I start processing?
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
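
The rbind suggestion in this reply could look roughly like the sketch below, using the Matrix package. The helper `row_to_sparse`, the column count `n_col`, and the sample values are all made up for illustration; nothing in the thread specifies them.

```r
library(Matrix)

# Hypothetical helper (not part of Matrix): turn one row's (column, value)
# records into a 1 x n_col sparse matrix.
row_to_sparse <- function(cols, vals, n_col) {
  sparseMatrix(i = rep(1L, length(cols)), j = cols, x = vals,
               dims = c(1L, n_col))
}

n_col <- 6L  # assumed total number of columns

# Three made-up rows standing in for the contents of three input files.
rows <- list(
  row_to_sparse(c(2L, 5L), c(1.5, 3), n_col),
  row_to_sparse(4L, 2, n_col),
  row_to_sparse(1L, 7, n_col)
)

# Bind all rows in a single call instead of growing the matrix in a loop.
m <- do.call(rbind, rows)
print(m)
```

With 17,000+ rows it may be worth binding in batches or collecting the (row, column, value) triplets first and calling `sparseMatrix` once, since repeated binding copies data each time.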
What matrix package are you using? I have not used sparse matrices myself, but a quick look at the help files for Matrix shows a file format for reading in a sparse matrix. I would assume that all you need to do is read in your files and write them out in that format. You can do this using 'list.files' to find the files and 'cat' (or any other command that writes an ASCII file) to output the data in the correct format.

On Sun, Nov 23, 2008 at 3:19 PM, Atul Kulkarni <atulskulkarni at gmail.com> wrote:
> [...]
> I wish to load this data in memory from these files one by one. Is there
> anyway I can do this in R, before I start processing?
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
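
One way to read the suggestion in this reply is sketched below: gather the per-row files with `list.files`, write a single Matrix Market coordinate file, and load it with `readMM` from Matrix. The thread does not say which file format was meant, nor how the input files are laid out, so the Matrix Market choice, the "<column> <value>" line layout, the file names, and the rule that sorted file order gives the row number are all assumptions.

```r
library(Matrix)

# Assumed input layout (not stated in the thread): one file per row, each
# line holding "<column> <value>". Two small sample files are created here.
data_dir <- file.path(tempdir(), "sparse_rows")
dir.create(data_dir, showWarnings = FALSE)
writeLines(c("2 1.5", "5 3"), file.path(data_dir, "row_00001.txt"))
writeLines("1 7",             file.path(data_dir, "row_00002.txt"))

# Assumption: the sorted file order defines the row index.
files <- sort(list.files(data_dir, pattern = "\\.txt$", full.names = TRUE))

# Read each file once and tag its records with the row number.
records <- lapply(seq_along(files), function(r) {
  d <- read.table(files[r], col.names = c("col", "val"))
  data.frame(row = r, col = d$col, val = d$val)
})
triplets <- do.call(rbind, records)

# Write the Matrix Market file: header, "nrow ncol nnz", then the triplets.
mm_file <- file.path(tempdir(), "rows.mtx")
con <- file(mm_file, "w")
writeLines("%%MatrixMarket matrix coordinate real general", con)
writeLines(paste(length(files), max(triplets$col), nrow(triplets)), con)
write.table(triplets, con, row.names = FALSE, col.names = FALSE)
close(con)

m <- readMM(mm_file)
print(m)
```

If the intermediate file is not needed, the same `triplets` data frame could be passed straight to `sparseMatrix(i = triplets$row, j = triplets$col, x = triplets$val)`.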