thr3ads.net - R help - [R] Multiple subsetting of a dataframe based on many conditions [Apr 2013]

Mansfield, Desmond

2013-Apr-06 23:07 UTC

[R] Multiple subsetting of a dataframe based on many conditions

Hello Everybody,

I'm working with a data frame that has 18 columns. I would like to subset the
data in one of these columns, "present", according to combinations of
values in five of the other columns within the data frame and then save each
subset to a text file. The columns I would like to use to subset "present" are:


* answer (1:4) [answer takes the values 1 to 4]
* p.num (1:18)
* session (1:2)
* count (1:8)
* type (1:3)


So there are a total of 3456 possible subsetting combinations.
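
As a quick check of that figure, the count is just the product of the number of values each listed column can take. A minimal sketch (the name levels.per.column is only illustrative):

```r
# Number of distinct values each subsetting column can take (from the list above)
levels.per.column <- c(answer = 4, p.num = 18, session = 2, count = 8, type = 3)

# Total number of combinations is the product of those counts
n.combos <- prod(levels.per.column)
n.combos
```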


At present, I have been using the following and manually changing the values in
each line and re-running the code.

input.s2g<-subset(input, answer==1)
input.s2g<-subset(input.s2g, p.num == 1)
input.s2g<-subset(input.s2g, session == "S2")
input.s2g<-subset(input.s2g, count==8)
input.s2g<-subset(input.s2g, type==1)


write.table(input.s2g, file = "1_1_S2_8_1", sep = "\t", col.names = FALSE,
            row.names = FALSE)

But this takes me hours and is obviously prone to error. There must be an easier
way?


Thanks for any help!


Adams, Jean

2013-Apr-08 14:45 UTC

[R] Multiple subsetting of a dataframe based on many conditions

# here's an example data frame
n <- 10
mydf <- data.frame(present=rnorm(n),
                   answer=sample(1:4, n, replace=TRUE),
                   p.num=sample(1:18, n, replace=TRUE),
                   session=sample(1:2, n, replace=TRUE),
                   count=sample(1:8, n, replace=TRUE),
                   type=sample(1:3, n, replace=TRUE))

# define a new variable, combo5, that represents the combination of the
# five columns you specified
mydf$combo5 <- with(mydf, interaction(answer, p.num, session, count, type,
                                      drop=TRUE))

# split the data frame according to combo5
# this gives you a list of data frames
mydf.split <- split(mydf, mydf$combo5)

# use lapply() and write.table() to write each data frame in the list to a
# file named after its combo5 value
lapply(mydf.split, function(x) write.table(x,
  file=as.character(x$combo5[1]), sep="\t", col.names=FALSE, row.names=FALSE))
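
If only the "present" values are wanted in each file, and the file names should follow the underscore pattern from the original post (e.g. "1_1_S2_8_1"), a variation along these lines might work. This is only a sketch under some assumptions: session is coded 1:2 as in the example data rather than "S1"/"S2", the names key, present.split and out.dir are made up for illustration, and the files go to a temporary directory.

```r
# Example data, same shape as above (values are random; seed fixed for repeatability)
set.seed(1)
n <- 10
mydf <- data.frame(present = rnorm(n),
                   answer  = sample(1:4, n, replace = TRUE),
                   p.num   = sample(1:18, n, replace = TRUE),
                   session = sample(1:2, n, replace = TRUE),
                   count   = sample(1:8, n, replace = TRUE),
                   type    = sample(1:3, n, replace = TRUE))

# Build an underscore-separated key such as "1_1_2_8_1" for each row
key <- with(mydf, paste(answer, p.num, session, count, type, sep = "_"))

# Split only the "present" column by that key;
# the result is a named list of numeric vectors, one per combination present
present.split <- split(mydf$present, key)

# Write each piece to a file named after its key (temporary directory assumed)
out.dir <- tempdir()
for (k in names(present.split)) {
  write.table(present.split[[k]], file = file.path(out.dir, k),
              sep = "\t", col.names = FALSE, row.names = FALSE)
}
```

Only combinations that actually occur in the data produce a file, so with real data there may be far fewer than 3456 files.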

Jean



