Hi,

I have a big data file (over 30,000 records) that looks like this:

100, 20, 46, 70
103, 0, 22, 45
117, -1, 34, 65
120, 15, 0, 25
113, 0, -1, 32
142, -1, -1, 55
.....

I want to read only those records that have positive values in all four columns. That is, I don't want to read records 3, 5, and 6 into R. However, when I type:

read.csv("data.csv", sep=",") -> rawdata

it reads the whole file into R, including the records I don't want. Could anyone tell me how to read only the records I want?

Thanks,
Yu-Ling Wu

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Thu, 11 Jan 2001, Yu-Ling Wu wrote:

> I want to read only those records having positive values in all of the
> four columns. That is, I don't want to read record # 3, 5, and 6 into R.
> However, when I type:
>
> read.csv("data.csv", sep=",") -> rawdata

read.csv already uses sep="," by default, and you need header=FALSE.

> it reads the whole thing into R including those records I don't want.
> Could anyone tell me how I can read only those records I want?

You can't! Until you have read a record, you cannot tell whether all of its entries are positive.

Is this really a problem? You only have around 120k numbers, and I just did it very easily.

    rawdata <- read.csv("data.csv", header=FALSE)

Perhaps better is to use a matrix and scan():

    rawdata <- matrix(scan("data.csv", sep=","), ncol=4, byrow=TRUE)
    keep <- (rawdata <= 0) %*% rep(1,4) == 0
    rawdata[keep, ]

This takes a few seconds and a few Mb.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
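The scan() and matrix approach in the reply above can be tried on the six sample records from the question. This is only a sketch: the records are inlined through the `text` argument so that it runs without data.csv, and the names `sample_text`, `keep_product`, `keep_rowsums` and `positive_only` are made up for the illustration. It also shows `rowSums()` as a possible alternative to the matrix product for counting the non-positive entries in each row.

```r
# Sample records from the question, inlined so the sketch runs without data.csv.
sample_text <- "100, 20, 46, 70
103, 0, 22, 45
117, -1, 34, 65
120, 15, 0, 25
113, 0, -1, 32
142, -1, -1, 55"

# Read every value, then shape them into a four-column matrix,
# one record per row.
rawdata <- matrix(scan(text = sample_text, sep = ",", quiet = TRUE),
                  ncol = 4, byrow = TRUE)

# Count the non-positive entries in each row with a matrix product,
# as in the reply; a row is kept when that count is zero.
keep_product <- as.vector((rawdata <= 0) %*% rep(1, 4) == 0)

# rowSums() produces the same per-row count more directly.
keep_rowsums <- rowSums(rawdata <= 0) == 0

# drop = FALSE keeps the result a matrix even if only one row survives.
positive_only <- rawdata[keep_rowsums, , drop = FALSE]
print(positive_only)
```

With the strict `<= 0` test only the first sample record survives, because records 2 and 4 each contain a zero. Testing `rawdata < 0` instead would keep records 1, 2 and 4, which is what the record numbers in the question suggest.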
You can filter the data after reading as follows:

> rawdata <- read.csv("data.csv", sep=",", header=FALSE)
> rawdata <- rawdata[apply(rawdata, 1, function(x) all(x >= 0)), ]

Cheers, Pierre

-- 
Pierre Kleiber                         Email: pkleiber at honlab.nmfs.hawaii.edu
Fishery Biologist                      Tel: 808 983-5399/737-7544
NOAA FISHERIES - Honolulu Laboratory   Fax: 808 983-2902
2570 Dole St., Honolulu, HI 96822-2396

"God could have told Moses about galaxies and mitochondria and all. But behold... It was good enough for government work."
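As a rough check of the data frame filter above, the sketch below runs it on the six sample records, again inlined via the `text` argument of read.csv rather than read from data.csv. The names `sample_text`, `keep_apply`, `keep_vector` and `filtered` are illustrative only. The `rowSums()` line is a vectorised alternative that should select the same rows as the `apply()` call and may be quicker on a file of 30,000 records.

```r
# Sample records from the question, inlined so the sketch runs without data.csv.
sample_text <- "100, 20, 46, 70
103, 0, 22, 45
117, -1, 34, 65
120, 15, 0, 25
113, 0, -1, 32
142, -1, -1, 55"

# header = FALSE because the file has no column names; columns become V1..V4.
rawdata <- read.csv(text = sample_text, header = FALSE)

# Row-by-row test, as in the reply: keep a record when no value is negative.
keep_apply <- apply(rawdata, 1, function(x) all(x >= 0))

# Vectorised alternative: count the negative values in each row.
keep_vector <- rowSums(rawdata < 0) == 0

filtered <- rawdata[keep_vector, ]
print(filtered)
```

Because the test is `>= 0`, rows containing a zero are kept, so records 1, 2 and 4 remain and records 3, 5 and 6 are dropped.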
> I want to read only those records having positive values in all of the
> four columns. That is, I don't want to read record # 3, 5, and 6 into R.

Perhaps someone knows how to do this within R, but if I had to do it right now, I would pre-process the file with grep before reading it into R:

    grep -v "-" mydata > myshorterdata

This drops every line that contains a minus sign. The "grep" tool is available on Unix and Linux. If you have Windows, it would be useful to get the "unix tools for windows", which, unfortunately, I have just been unable to find after 10 minutes of searching www.gnu.org.

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~jbaron
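The same pre-filtering idea can probably be done inside R, without an external grep, by reading the file as plain text lines, dropping the lines that contain a minus sign, and parsing only what is left. The sketch below is one way to do that under some assumptions: a temporary file stands in for the real data file, and the names `path`, `all_lines` and `kept_lines` are invented for the example.

```r
# Write the sample records to a temporary file that stands in for the data file.
path <- tempfile(fileext = ".csv")
writeLines(c("100, 20, 46, 70",
             "103, 0, 22, 45",
             "117, -1, 34, 65",
             "120, 15, 0, 25",
             "113, 0, -1, 32",
             "142, -1, -1, 55"), path)

# Read the raw text, one element per record, without parsing any numbers yet.
all_lines <- readLines(path)

# Same test as grep -v "-": drop every line that contains a minus sign.
# fixed = TRUE treats "-" as a literal character, not a regular expression.
kept_lines <- all_lines[!grepl("-", all_lines, fixed = TRUE)]

# Parse only the surviving lines into a data frame.
rawdata <- read.csv(text = kept_lines, header = FALSE)
print(rawdata)

# Remove the temporary file.
unlink(path)
```

Like the grep command, this removes any line with a "-" anywhere in it, which is fine for a purely numeric file but would also drop lines containing hyphenated text or dates. It still reads every line as text; only the numeric parsing is limited to the kept records.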
On Thu, 11 Jan 2001, Jonathan Baron wrote:

> The "grep" tool is available on Unix and Linux. If you have
> Windows, it would be useful to get the "unix tools for windows",
> which, unfortunately, I have just been unable to find after 10
> minutes of searching www.gnu.org.

You may want to try http://sources.redhat.com/cygwin/

Alexandre Fayolle
-- 
http://www.logilab.com
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).