thr3ads.net - R help - [R] Performing a function on columns specified in another dataframe [Jun 2010]

Josh B

2010-Jun-09 19:04 UTC

[R] Performing a function on columns specified in another dataframe

Hello Listserve,

Here is another question to keep you on your toes. Please consider the following
toy dataset:

a <- read.table(textConnection("fred sam joe alex
measure.1 10 4 10 1
measure.2 10 4 2 8
measure.3 3 1 8 3
measure.4 5 1 3 3
measure.5 8 6 8 3
measure.6 9 5 1 0
measure.7 4 6 10 1
measure.8 3 6 8 9
measure.9 8 6 7 7
measure.10 7 8 9 8"), header = TRUE)

And also please consider this toy dataset:
b <- read.table(textConnection("x y
test.1 fred sam
test.2 sam joe
test.3 joe alex"), header = TRUE)

What I want to do is perform some Student's t-tests. The comparisons I want
to make are specified in the dataset called "b" -- I'd like to
test fred versus sam, sam versus joe, and joe versus alex. How could I use the
dataset called "b" to specify the columns to use in the series of
t-tests? Keep in mind that my real dataset is enormous (1000 columns) and will
likely change, so solutions relying on numeric indexing would not work for me.

I'm thinking the code would look something like this:

#create a matrix for the output
results <- matrix(nrow = nrow(b), ncol = 1)
results <- cbind(b, results)

for (i in 1:length(b)){
    # This is where I'm stuck. How do I pull the information I want out of b
    # (i.e., the columns to use) to do the appropriate comparisons?
    results[i,3] <- t.test(???, ???)
}

I'm hoping for a solution that doesn't create any new subsetted matrices
along the way, because this will slow down the run time.

Thanks in advance,
Josh


      

Jorge Ivan Velez

2010-Jun-09 20:05 UTC

[R] Performing a function on columns specified in another dataframe

Hi Josh,

One way would be:

res <- apply(b, 1, function(Names) t.test(a[, Names[1]], a[, Names[2]]))
do.call(rbind, lapply(res, function(l) c(l$statistic, l$parameter,
                                         p = l$p.value)))
#                t       df          p
# test.1  1.775490 17.35589 0.09335398
# test.2 -1.489210 15.82584 0.15608937
# test.3  1.533333 17.99873 0.14258339
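
For readers who prefer the explicit loop sketched in the question, a minimal version might look like the block below. It is not from the thread; the names `results` and `tt` and the choice of output columns are assumptions made for illustration. It loops over `nrow(b)` rather than `length(b)`, because `length()` of a data frame counts columns, and it reads `b` with `stringsAsFactors = FALSE` so the names are plain character strings. `a[[name]]` pulls out a single column by name without building an intermediate subsetted data frame.

```r
# Toy data from the question.
a <- read.table(text = "fred sam joe alex
measure.1 10 4 10 1
measure.2 10 4 2 8
measure.3 3 1 8 3
measure.4 5 1 3 3
measure.5 8 6 8 3
measure.6 9 5 1 0
measure.7 4 6 10 1
measure.8 3 6 8 9
measure.9 8 6 7 7
measure.10 7 8 9 8", header = TRUE)

# stringsAsFactors = FALSE keeps the column names in b as character strings,
# so they can be used directly to index a (factors would index by integer code).
b <- read.table(text = "x y
test.1 fred sam
test.2 sam joe
test.3 joe alex", header = TRUE, stringsAsFactors = FALSE)

# One output row per comparison listed in b (column names are illustrative).
results <- data.frame(b, t = NA_real_, df = NA_real_, p = NA_real_)

for (i in seq_len(nrow(b))) {
  # Look up both columns of a by the names stored in row i of b.
  tt <- t.test(a[[b$x[i]]], a[[b$y[i]]])
  results[i, c("t", "df", "p")] <- c(tt$statistic, tt$parameter, tt$p.value)
}

results
```

With the default arguments this should give the same statistics as the `apply()` version. Note that `t.test()` runs the Welch test by default, which is why the degrees of freedom are not whole numbers; if the classic equal-variance Student's t-test is what is wanted, pass `var.equal = TRUE`.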

HTH,
Jorge


