Hi,

I have a very large data file (gigabytes in size) from which I only want to extract one column to draw a histogram. This will be done several times, so I would like to ask whether there is any way to plot this using R from the Linux command line, something like this:

cut -f1 xxx.txt | RplotHist ....

Thanks, and I hope to hear from you.

Best regards,
Hang
Good afternoon Hang,

This is an example of what I've done with a CSV file with a header which is too big to read into memory.

# this is a file with about 50 columns and 28 million records
ap.fnam <- 'p2_all28m_records.csv'

# let's just explore the columns in Addresspoint...
# by reading in the header and the first row
p1 <- read.csv(ap.fnam, nrows=1)

# now which columns do we actually want?
# ok... in this case we only want the NCAT column...
cols.reqd <- grep('NCAT', names(p1))

# so we create a vector containing this/these column(s) as a 'character'
# type and all other columns as 'NULL'...
col.classes <- ifelse(seq(ncol(p1)) %in% cols.reqd, 'character', 'NULL')

# this will likely take a little over a minute!
p9 <- read.csv(ap.fnam, colClasses=col.classes)

Hope this helps

Kind regards,
Sean

______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
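As a possible follow-up to the colClasses approach above, here is a minimal sketch that wraps it in a function and returns the column as a numeric vector ready for hist(). The function name read_one_column is hypothetical, and two things differ from the code in the message: the column is matched by exact name rather than with grep(), and it is read as 'numeric' on the assumption that the values are numbers.

```r
# Hypothetical helper (not from the thread): read a single numeric column
# from a large CSV file by setting every other column's class to "NULL",
# so that read.csv() skips those columns.
read_one_column <- function(path, column) {
  # Read the header plus one data row to learn the column names.
  header <- names(read.csv(path, nrows = 1))
  stopifnot(length(column) == 1, column %in% header)
  # "numeric" for the wanted column, "NULL" (skip) for all the others.
  col.classes <- ifelse(header == column, "numeric", "NULL")
  # The result is a one-column data frame; return that column as a vector.
  read.csv(path, colClasses = col.classes)[[1]]
}

# Usage (file and column names taken from the message above):
# x <- read_one_column("p2_all28m_records.csv", "NCAT")
# pdf("ncat_hist.pdf"); hist(x); dev.off()
```

Note that read.csv() adjusts column names that are not syntactically valid in R (check.names = TRUE by default), so the name passed in has to match the adjusted form.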
On 14 February 2011 at 17:40, Hang PHAN wrote:
| Hi,
| I have a very large data file(GB) from which I only want to extract one
| column to draw histogram. This would be done several times, so I would like
| to ask if there is anyway to plot this using R from the linux command line,
| something look like this
|
| cut -f1 xxx.txt |RplotHist ....

Have a look at littler, which was written with these uses in mind:

http://dirk.eddelbuettel.com/code/littler.html

It includes a few examples which should get you going. Also, in non-interactive mode, your plot device will have to be a file.

Hope this helps, Dirk

-- 
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
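To illustrate the point about writing to a file device in non-interactive mode, here is a minimal sketch of a script that reads one numeric column from standard input and writes a histogram to a PDF. It uses Rscript, which ships with R, as a stand-in rather than littler itself; the script name hist_stdin.R and the function name plot_hist_to_pdf are hypothetical, and the input is assumed to contain only numbers (no header line).

```r
# hist_stdin.R (hypothetical file name): read numbers from a connection
# and write their histogram to a PDF file.
plot_hist_to_pdf <- function(con, out_file) {
  # scan() opens the connection, reads all numeric values, and closes it.
  x <- scan(con, what = numeric(), quiet = TRUE)
  # There is no screen device in non-interactive mode, so plot to a file.
  pdf(out_file)
  on.exit(dev.off())
  hist(x, main = "Histogram of input column", xlab = "value")
  invisible(length(x))
}

# Run only when an output file name is given on the command line.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) >= 1) {
  plot_hist_to_pdf(file("stdin"), args[1])
}
```

Under these assumptions it could be called as: cut -f1 xxx.txt | Rscript hist_stdin.R hist.pdf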
On Mon, Feb 14, 2011 at 05:40:29PM +0000, Hang PHAN wrote:
> Hi,
> I have a very large data file(GB) from which I only want to extract one
> column to draw histogram. This would be done several times, so I would like
> to ask if there is anyway to plot this using R from the linux command line,
> something look like this
>
> cut -f1 xxx.txt |RplotHist ....

Hi Hang:

Can you use something like the following?

x <- as.numeric(system("cut -f1 xxx.txt", intern=TRUE))

According to ?system, long lines will be split; however, no limit on the number of output lines is stated there.

Petr Savicky.
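A possible variation on the system() call above, offered only as a sketch: reading the output of cut through a pipe() connection with scan() converts the values to numeric as they are read, instead of first building the character vector that system(..., intern=TRUE) returns. The function name read_column_via_cut is hypothetical, and the file is assumed to be tab-separated with no header line.

```r
# Hypothetical helper: extract one tab-separated column with the external
# 'cut' command and read it into R as a numeric vector.
read_column_via_cut <- function(path, column = 1L) {
  # shQuote() protects file names containing spaces or shell characters.
  cmd <- sprintf("cut -f%d %s", as.integer(column), shQuote(path))
  # scan() opens the pipe, reads the numeric values, and closes it again.
  scan(pipe(cmd), what = numeric(), quiet = TRUE)
}

# Usage (file name taken from the question):
# x <- read_column_via_cut("xxx.txt", 1)
# pdf("hist.pdf"); hist(x); dev.off()
```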