Hello,

I realize that using:

    x[x > 3 & x < 5]

I can fetch all elements between 3 and 5. However, I read in from a CSV file, and I would like to fetch all columns within a range (842-2411). In the past, I have done this to fetch just a select few columns:

    data <- read.csv(filein, header=TRUE, nrows=320, skip=nskip)
    data_filter <- data[c(2,12,17)]
    write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
    nskip <- nskip+320

This time, however, instead of grabbing columns 2, 12, 17, I would like all columns in the range of 842-2411. I can't seem to do this correctly. Could somebody please provide some insight? Thanks in advance.

-- 
Jason Thibodeau

On Sep 14, 2008, at 12:22 PM, Jason Thibodeau wrote:

> This time, however, instead of grabbing columns 2, 12, 17, I would like
> all columns in the range of 842-2411. I can't seem to do this correctly.
> Could somebody please provide some insight? Thanks in advance.

Have you tried:

    data_filter <- data[seq(842,2411)]
    write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)

When I use that form on a data frame I have lying around, I get the expected results, and in testing I do not find that data frames have any trouble with 5000 columns.

-- 
David Winsemius

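The range selection suggested above can be tried on a small stand-in before running it on the real file. This is only a sketch: the 20-column toy data frame and the names first_col and last_col are made up for illustration and stand in for the roughly 3000-column CSV and the range 842-2411.

```r
# Toy data frame standing in for the wide CSV (20 columns instead of ~3000).
data <- as.data.frame(matrix(seq_len(60), nrow = 3, ncol = 20))

first_col <- 5   # hypothetical, stands in for 842
last_col  <- 12  # hypothetical, stands in for 2411

# Asking for a column that does not exist is what triggers
# "undefined columns selected", so check the width first.
stopifnot(last_col <= ncol(data))

by_seq   <- data[seq(first_col, last_col)]  # seq() form
by_colon <- data[first_col:last_col]        # colon form, same columns

print(dim(by_colon))
print(names(by_colon))
```

Both forms pick the same columns, so the choice between seq() and the colon operator is a matter of taste here.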
Have you tried:

    data_filter <- data[842:2411]

Also, if you have a lot of data to read, I would suggest that you use a connection and, if all the data is numeric, possibly 'scan'. A connection eliminates having to 'skip' each time, which could be time-consuming on a large file. Since it appears that you are not writing out the column names in the output file, you could bypass the header line of the file with readLines after the open. So something like this might work:

    input <- file('yourfile','r')
    invisible(readLines(input, n=1))  # skip the header
    while (TRUE){  # read file
        # catch EOF; the argument is spelled out as nrows because a bare n
        # partially matches more than one read.table argument
        x <- try(read.csv(input, nrows=320, header=FALSE), silent=TRUE)
        if (inherits(x, 'try-error')) break
        write.csv(.......)
    }

On Sun, Sep 14, 2008 at 12:22 PM, Jason Thibodeau <jbloudg20 at gmail.com> wrote:

> This time, however, instead of grabbing columns 2, 12, 17, I would like all
> columns in the range of 842-2411. I can't seem to do this correctly. Could
> somebody please provide some insight? Thanks in advance.

-- 
Jim Holtman
Cincinnati, OH

What is the problem that you are trying to solve?

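The connection-based loop above can be fleshed out into a runnable sketch on a toy file. Several things here are assumptions and not part of the original suggestion: the temporary files, the names chunk_rows and keep_cols, and the use of write.table for output (write.csv ignores append and issues a warning, so it cannot be used to add chunks to an existing file).

```r
# Build a small stand-in input file: a header line plus 7 data rows, 6 columns.
filein  <- tempfile(fileext = ".csv")
fileout <- tempfile(fileext = ".csv")
toy <- as.data.frame(matrix(seq_len(42), nrow = 7, ncol = 6))
write.csv(toy, filein, row.names = FALSE)

chunk_rows <- 3     # hypothetical, stands in for 320
keep_cols  <- 2:4   # hypothetical, stands in for 842:2411

input <- file(filein, "r")
invisible(readLines(input, n = 1))  # consume the header line once
repeat {
  # On an open connection each read continues where the previous one stopped.
  # At end of file read.csv raises an error, which try() turns into a value.
  x <- try(read.csv(input, nrows = chunk_rows, header = FALSE), silent = TRUE)
  if (inherits(x, "try-error")) break
  write.table(x[keep_cols], fileout, append = TRUE, sep = ",",
              row.names = FALSE, col.names = FALSE)
}
close(input)

# Read the output back to check that every row arrived.
result <- read.csv(fileout, header = FALSE)
print(dim(result))
```

Because the connection keeps its position, no chunk is read twice and nothing has to be skipped on each pass. Note that try() treats any read error as end of file, so a malformed line would also end the loop silently.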
Hi Jason,

data is a data frame, remember, so you can specify rows AND columns. data[,c(2,12,17)] is the explicit form of what you did in the first place, and data[,842:2411] is what you want in the second place.

I'm not sure whether the help you needed was with the comma, with the : syntax, or with reading only certain columns during the read.csv call (which I don't think is possible).

--Adam

On Sun, 14 Sep 2008, Jason Thibodeau wrote:

> This time, however, instead of grabbing columns 2, 12, 17, I would like all
> columns in the range of 842-2411. I can't seem to do this correctly. Could
> somebody please provide some insight? Thanks in advance.

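Two points from this message can be checked on a toy data frame. First, the one-index and comma forms appear to select the same columns, since a data frame is a list of columns. Second, read.csv may be able to drop columns while reading, because a colClasses entry of "NULL" skips that column. The sketch below assumes the total column count is known in advance; the file, the name keep and the 10-column layout are invented for illustration.

```r
toy <- as.data.frame(matrix(seq_len(30), nrow = 3, ncol = 10))

# One index treats the data frame as a list of columns; the two-index form
# with an empty row slot selects the same columns.
list_style   <- toy[c(2, 5, 7)]
matrix_style <- toy[, c(2, 5, 7)]
same_result  <- isTRUE(all.equal(list_style, matrix_style))

# Skipping columns at read time: "NULL" drops a column, NA lets read.csv
# guess the type as usual.
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

keep <- 4:8                        # hypothetical, stands in for 842:2411
classes <- rep("NULL", ncol(toy))  # one entry per column in the file
classes[keep] <- NA
subset_on_read <- read.csv(path, colClasses = classes)

print(same_result)
print(names(subset_on_read))
```

Dropping columns at read time keeps the unwanted columns out of memory altogether, which may matter for a file this wide.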
On Sep 14, 2008, at 4:40 PM, Jason Thibodeau wrote:

> I cannot provide (all) the sample data (NDA) but here is the entire
> function:
>
> TEST_filter <- function(filein,fileout)
> {
>     file.remove(fileout)
>     nskip<-0
>     while(1)
>     {
>         data_tmp <- read.csv(filein, header=TRUE, nrows=10, skip=nskip)
>         data_filter <- data_tmp[842,2411]

Looks like you forgot a few syntactically essential items here: data_tmp[842,2411] would only be the 842nd row in the 2411th column. And, since you only have 10 rows, you got an informative error.

>         write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
>         nskip <- nskip+10
>     }
> }

You also say file.remove(fileout), then you try to append to fileout. Does that make sense?

-- 
David Winsemius, MD
Heritage Laboratories

> Thanks for the help.
>
> On Sun, Sep 14, 2008 at 4:24 PM, David Winsemius <dwinsemius@comcast.net> wrote:
>
> > On Sep 14, 2008, at 4:01 PM, Jason Thibodeau wrote:
> >
> > > TEST_filter("line50grab.csv","line50grab_filterout.csv")
> > > Error in `[.data.frame`(data_tmp, seq(842, 2411)) :
> > >   undefined columns selected
> >
> > I am guessing that you wrapped some code into a function but you did
> > not provide the function. You are not really following the posting
> > guidelines here.
> >
> > > I know my file has about 3000 columns.
> > >
> > > This happened when I used:
> > >
> > > data_tmp <- read.csv(filein, header=TRUE, nrows=10, skip=nskip)
> > > data_filter <- data_tmp[seq(842,2411)]
> > > write.table(data_filter, fileout, append = TRUE, sep= ",", row.names= FALSE, col.names = FALSE)
> > >
> > > Also using data_tmp[842:2411] did not yield any output being written
> > > to my file.
> >
> > Not a big surprise. Appears the error preceded the write.table call.
> >
> > > I have another slightly unrelated problem, but I'll propose that
> > > after this one can be solved.
> >
> > If the problem is not with the syntax or semantics of TEST_filter, as
> > I suspect it is, then perhaps you should examine the input file from
> > R's perspective with:
> >
> > ?count.fields
> >
> > Hard to tell without the actual code and sample data.
> >
> > -- 
> > David Winsemius
> >
> > > Thanks a lot.
> > >
> > > On Sun, Sep 14, 2008 at 2:14 PM, Jason Thibodeau <jbloudg20@gmail.com> wrote:
> > >
> > > > Jim, this is a GREAT help. I was trying something similar before,
> > > > but I was unable to detect EOF. Thanks for the help!
> > > >
> > > > Also, David, your suggestion worked perfectly.
> > > >
> > > > Thanks for all the help, everyone!
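
The two diagnostic points raised in this part of the thread, the difference between data_tmp[842,2411] and data_tmp[842:2411] and the count.fields check, can be illustrated on a toy file. This is a sketch under stated assumptions: the 10-column data frame, the temporary file and the name last_col are invented, and the small indices stand in for 842 and 2411.

```r
toy <- as.data.frame(matrix(seq_len(40), nrow = 4, ncol = 10))

single_cell  <- toy[2, 7]  # comma form: row 2 of column 7, a single value
column_range <- toy[2:7]   # colon form: columns 2 through 7, a data frame

# count.fields() reports how many fields R sees on each line of a file,
# which shows whether the requested columns exist on every line.
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)
fields_per_line <- count.fields(path, sep = ",")

last_col <- 7  # hypothetical, stands in for 2411
range_is_available <- all(fields_per_line >= last_col)

print(single_cell)
print(dim(column_range))
print(fields_per_line)
```

If range_is_available comes out FALSE on the real file, at least one line has fewer fields than the requested range needs, which would be one possible source of the "undefined columns selected" error.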