joshgage
2008-Jun-03 00:58 UTC
[R] Partitioning a large data frame and writing output CSVs
Hello,

I have a large dataset [536436,4].

I'd like to partition the dataset into 999 groups of 564 rows and output each group as a CSV file. Obviously I could do this longhand, but I know it is somehow possible to write a loop to do the same thing.

I'd like to group such that the first group is the first 564 rows, the second group is the second 564 rows, and so on, up to the 999th group, which is the 999th block of 564 rows.

In each newly created group, I'd like there to be a new column that identifies the group, i.e. the first group would have a new column in which all 564 observations have a character value of "group1".

Finally, I'd also like to output each one of these groups as a CSV file with a unique name.

Any help with this is very greatly appreciated.

Thanks in advance,
Josh

--
View this message in context: http://www.nabble.com/Partitioning-a-large-data-frame-and-writing-output-CSVs-tp17614022p17614022.html
Sent from the R help mailing list archive at Nabble.com.
jim holtman
2008-Jun-03 01:50 UTC
[R] Partitioning a large data frame and writing output CSVs
This might give you a hint of how to do it. BTW, were the dimensions 563436x4?

test <- matrix(runif(100*4), ncol=4)
# create groups of 10 rows
group <- rep(1:10, each=10)
new.test <- cbind(test, group=group)
# now get indices to write out
indices <- split(seq(nrow(test)), new.test[, 'group'])
# now write out the files
for (i in names(indices)){
    write.csv(new.test[indices[[i]],], file=paste("data.", i, ".csv", sep=""), row.names=FALSE)
}
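The hint above works on a numeric matrix, so the group column is a number. Here is a minimal sketch of the same split-then-write idea on a data frame, with the character label ("group1", "group2", ...) that the question asks for. The toy data, the `rows_per_group` and `out_dir` names, and the use of a temporary directory are assumptions for illustration, not part of the original replies.

```r
# Hypothetical stand-in for the real data: 30 rows, 4 columns.
d <- data.frame(matrix(runif(30 * 4), ncol = 4))

rows_per_group <- 10   # 564 in the original question
n_groups <- nrow(d) / rows_per_group
# Check that the rows divide evenly (for the question: 999 * 564 = 563436).
stopifnot(n_groups == floor(n_groups))

# Character label for every row: "group1" ten times, then "group2", ...
d$group <- paste("group", rep(seq_len(n_groups), each = rows_per_group), sep = "")

# split() on a data frame returns a named list of data frames.
# Fixing the factor levels keeps the pieces in numeric, not alphabetical, order.
pieces <- split(d, factor(d$group, levels = unique(d$group)))

out_dir <- tempdir()   # assumption: write somewhere disposable
for (name in names(pieces)) {
  write.csv(pieces[[name]],
            file = file.path(out_dir, paste(name, ".csv", sep = "")),
            row.names = FALSE)
}
```

With the real data, `rows_per_group` would be 564 and `out_dir` a directory of your choice. The `stopifnot()` line fails early if the row count is not an exact multiple of the group size.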
Moshe Olshansky
2008-Jun-03 05:23 UTC
[R] Partitioning a large data frame and writing output CSVs
Assuming that your data is in a data.frame d, you could do the following:

for (i in 1:999) {
  # rows 564*(i-1)+1 through 564*i form the i-th group
  df <- d[564*(i-1)+(1:564),]
  # label repeated once per row: "group1", "group2", ...
  g <- paste("group",i,sep="")
  group <- rep(g,564)
  newdf <- data.frame(group,df)
  # one file per group: file1.csv, file2.csv, ...
  filename <- paste("file",i,".csv",sep="")
  write.csv(newdf,filename)
}
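The loop above assumes the data has exactly 999 * 564 = 563436 rows. If the row count really is 536436, that is 951 full groups of 564 plus 72 leftover rows, and indexing past the end of the data frame would produce rows of NA. Below is a sketch of the same loop wrapped in a function that allows a short final group. The function name `write_groups`, its arguments, and the toy data are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: write consecutive blocks of `size` rows of `d` to CSV files.
write_groups <- function(d, size, dir = tempdir()) {
  n_groups <- ceiling(nrow(d) / size)
  files <- character(n_groups)
  for (i in seq_len(n_groups)) {
    first <- size * (i - 1) + 1
    last <- min(size * i, nrow(d))   # the final block may be shorter
    block <- d[first:last, , drop = FALSE]
    # The single label is recycled to one value per row of the block.
    block <- data.frame(group = paste("group", i, sep = ""), block,
                        stringsAsFactors = FALSE)
    files[i] <- file.path(dir, paste("file", i, ".csv", sep = ""))
    write.csv(block, files[i], row.names = FALSE)
  }
  files   # return the paths that were written
}

# Toy data: 25 rows in blocks of 10 gives groups of 10, 10 and 5 rows.
toy <- data.frame(x = 1:25, y = letters[1:25])
paths <- write_groups(toy, size = 10)
```

For the data in the question the call would look like `write_groups(d, size = 564, dir = "some/output/dir")`, where the directory is whatever you choose. `row.names = FALSE` is used here so the files do not carry an extra column of row numbers; drop it if you want them.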