thr3ads.net - R help - [R] generating multiple dataset and applying function and output multiple output dataset...... [Sep 2011]

John Clark

2011-Sep-04 13:25 UTC

[R] generating multiple dataset and applying function and output multiple output dataset......

Dear R experts:

Here is my problem, just hard for me...

I want to generate multiple datasets, then apply a function to these
datasets and output corresponding output in single or multiple dataset
(whatever possible)...

# my example, although I need to generate a large number of variables and
# datasets

seed <- round(runif(10)*1000000)

datagen <- function(x){
set.seed(x)
var <- rep(1:3, c(rep(3, 3)))
yvar <- rnorm(length(var), 50, 10)
matrix <- matrix(sample(1:10, c(10*length(var)), replace = TRUE), ncol = 10)
mydata <- data.frame(var, yvar, matrix)
}

gdt <- lapply (seed,  datagen)

# resulting list (I believe is correct term) has 10 dataframes: gdt[1]
# .......to gdt[10]

# my function, this will perform anova in every component data frames and
# output probability coefficients...
anovp <- function(x){
          ind <- 3:ncol(x)
          out <- lm(gdt[x]$yvar ~ gdt[x][, ind[ind]])
          pval <- out$coefficients[,4][2]
          pval <- do.call(rbind,pval)
         }

plist <- lapply (gdt,  anovp)

Error in gdt[x] : invalid subscript type 'list'

This is not working, I tried different options. But could not figure
out...finally decided to bother experts, sorry for that...

My questions are:

(1) Is this possible to handle such situation in this way or there are other
alternatives to handle such multiple datasets created?

(2)  If this is right way, how can I do it?


Thank you for attention and I will appreciate your help...


JC


Sarah Goslee

2011-Sep-05 13:40 UTC

[R] generating multiple dataset and applying function and output multiple output dataset......

Hi,

On Sun, Sep 4, 2011 at 9:25 AM, John Clark <rosbreed.pba at gmail.com> wrote:
> Dear R experts:
>
> Here is my problem, just hard for me...
>
> I want to generate multiple datasets, then apply a function to these
> datasets and output corresponding output in single or multiple dataset
> (whatever possible)...
>
> # my example, although I need to generate a large number of variables and
> datasets
>
> seed <- round(runif(10)*1000000)
>
> datagen <- function(x){
> set.seed(x)
> var <- rep(1:3, c(rep(3, 3)))
> yvar <- rnorm(length(var), 50, 10)
> matrix <- matrix(sample(1:10, c(10*length(var)), replace = TRUE), ncol = 10)
> mydata <- data.frame(var, yvar, matrix)
> }
>
> gdt <- lapply (seed,  datagen)
>
> # resulting list (I believe is correct term) has 10 dataframes: gdt[1]
> .......to gdt[10]
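
Yes, that's a list of dataframes, though the correct way to refer to a single
dataframe is gdt[[1]]: gdt[1] returns a list of length one that contains the
dataframe, while gdt[[1]] returns the dataframe itself.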
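
To make the bracket difference concrete, here is a small sketch. The toy list
`lst` is made up for this illustration and is not part of the original post;
it only shows how single and double brackets behave on a list of dataframes.

```r
# Toy list of two dataframes (hypothetical, for illustration only)
lst <- list(data.frame(a = 1:3), data.frame(a = 4:6))

class(lst[1])    # "list": a list of length 1 that holds the dataframe
class(lst[[1]])  # "data.frame": the dataframe itself

lst[[1]]$a       # the column is reachable through [[ ]]
lst[1]$a         # NULL, because the sub-list has no element named "a"
```
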
> # my function, this will perform anova in every component data frames and
> output probability coefficients...
> anovp <- function(x){
>           ind <- 3:ncol(x)
>           out <- lm(gdt[x]$yvar ~ gdt[x][, ind[ind]])
>           pval <- out$coefficients[,4][2]
>           pval <- do.call(rbind,pval)
>          }
>
> plist <- lapply (gdt,  anovp)
>
> Error in gdt[x] : invalid subscript type 'list'
It's not a matter of your use of lapply(), which is fine. It's that your
anovp() function just plain doesn't work.

You need to debug it with ONE dataframe before you try to lapply
it to a whole bunch.
> anovp(gdt[[1]])
Error in gdt[x] : invalid subscript type 'list'

This suggests to me that x should be a matrix rather than a list (a dataframe
is a type of list), so I tried:
> anovp(as.matrix(gdt[[1]]))
Error in gdt[x][, ind[ind]] : incorrect number of dimensions

But as you see there are still problems. You'll need to solve those first:
if anovp() doesn't work for one dataframe, it won't work on a list of them.
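
For what it's worth, here is one way the function might be repaired, as a
sketch only. Inside anovp() the argument is already a single dataframe, so it
can be used directly instead of indexing gdt with it. Also, the coefficients
element of an lm object is a plain vector; the table with p-values comes from
summary(). The rest rests on assumptions that are not in the original post:
that the goal is one simple regression of yvar on each predictor column
(a single joint model with 10 predictors and only 9 rows would leave no
residual degrees of freedom), and that fixed seeds 1:10 are acceptable. The
argument name `d` and the names `seeds`, `m` and `pmat` are my own.

```r
# Data generator from the post; the local matrix is renamed to `m`
# so it no longer masks base::matrix().
datagen <- function(seed) {
  set.seed(seed)
  var  <- rep(1:3, each = 3)
  yvar <- rnorm(length(var), 50, 10)
  m    <- matrix(sample(1:10, 10 * length(var), replace = TRUE), ncol = 10)
  data.frame(var, yvar, m)
}

# Assumed intent: regress yvar on each predictor column (columns 3 onward)
# separately and return the p-value of each slope.
anovp <- function(d) {
  ind <- 3:ncol(d)
  sapply(d[ind], function(xcol) {
    fit <- lm(d$yvar ~ xcol)
    summary(fit)$coefficients[2, 4]  # row 2 = slope, column 4 = Pr(>|t|)
  })
}

seeds <- 1:10                    # fixed seeds, an assumption for reproducibility
gdt   <- lapply(seeds, datagen)

anovp(gdt[[1]])                  # check it on ONE dataframe first
plist <- lapply(gdt, anovp)      # then apply it to the whole list
pmat  <- do.call(rbind, plist)   # one row per dataset, one column per predictor
```

Collecting the results with do.call(rbind, ...) gives a single matrix, which
covers the "single output dataset" case; keeping plist as it is covers the
"multiple output datasets" case.
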
> This is not working, I tried different options. But could not figure
> out...finally decided to bother experts, sorry for that...
>
> My questions are:
>
> (1) Is this possible to handle such situation in this way or there are other
> alternatives to handle such multiple datasets created?
>
> (2)  If this is right way, how can I do it?
>
>
> Thank you for attention and I will appreciate your help...
>
>
> JC
>

-- 
Sarah Goslee
http://www.functionaldiversity.org
