thr3ads.net - R help - [R] re-sampling of large scale data [Jul 2010]

jd6688

2010-Jul-27 21:43 UTC

[R] re-sampling of large scale data

myDF:

d1		d2		d3		d4		d5
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.000925938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.000925938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938
-0.166910351	0.022304377	-0.00825924	0.008330689	-0.168225938


Per the data frame above:

Step 1: for each row, compute the sum of the positive values and the sum of the negative values (the x[-1] in the function below drops the first column):

doit <- function(x) c(sum_positive = sum(x[-1][x[-1] > 0]),
                      sum_negative = sum(x[-1][x[-1] < 0]))

pos_neg_pool <- t(apply(myDF, 1, doit))

If this is not the first run, append the new result to pos_neg_pool.

Step 2: reshuffle the data within each column, then repeat step 1. This step needs to run 10,000 times.

The output will have 23 * 10000 = 230,000 rows.

Can anyone point out how to automate these 10,000 runs in R?

Thanks,


          



-- 
View this message in context:
http://r.789695.n4.nabble.com/re-sampling-of-large-sacle-data-tp2304165p2304165.html
Sent from the R help mailing list archive at Nabble.com.

Gray Calhoun

2010-Jul-27 21:51 UTC


Write a function that incorporates "doit" and the column shuffle.
Let's call it "doitbetter"

replicate(10000, doitbetter())

You'll probably want to read the help for "replicate" to make sure
the defaults are what you want.
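
A minimal sketch of that suggestion follows. The body of "doitbetter" is an assumption (the reply only names the function), and the data are random stand-ins rather than the poster's values. On the defaults: replicate() uses simplify = "array", which here would return a three-dimensional array, so the sketch asks for a list and stacks it with rbind.

```r
set.seed(1)  # only so the sketch is reproducible

# Stand-in data: 23 rows, 5 columns (hypothetical values, not the poster's).
myDF <- as.data.frame(matrix(rnorm(23 * 5), nrow = 23,
                             dimnames = list(NULL, paste0("d", 1:5))))

# The poster's row summary; x[-1] drops the first column of each row.
doit <- function(x) c(sum_positive = sum(x[-1][x[-1] > 0]),
                      sum_negative = sum(x[-1][x[-1] < 0]))

# Hypothetical helper: shuffle each column independently, then summarise by row.
doitbetter <- function(df = myDF) {
  shuffled <- apply(df, 2, sample)   # permutes the values within each column
  t(apply(shuffled, 1, doit))        # nrow(df) x 2 matrix
}

n_runs <- 100  # use 10000 for the full job
runs <- replicate(n_runs, doitbetter(), simplify = FALSE)  # list of matrices
pos_neg_pool <- do.call(rbind, runs)                       # stack the runs
dim(pos_neg_pool)
```

Collecting the runs in a list and binding once at the end avoids calling rbind on an ever-growing object inside a loop.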

--Gray


-- 
Gray Calhoun

Assistant Professor of Economics, Iowa State University
http://www.econ.iastate.edu/~gcalhoun/

jd6688

2010-Jul-27 22:44 UTC


I am trying the following to accomplish the task. Can anybody simplify
the solution?

Thanks,

for (i in 1:10000) {
  d <- apply(s, 2, sample)
  pos_neg_tem <- t(apply(d, 1, doit))
  if (i > 1) {
    pos_neg_pool <- rbind(pos_neg_pool, pos_neg_tem)
  } else {
    pos_neg_pool <- pos_neg_tem
  }
}
-- 
View this message in context:
http://r.789695.n4.nabble.com/re-sampling-of-large-sacle-data-tp2304165p2304221.html
Sent from the R help mailing list archive at Nabble.com.

David Winsemius

2010-Jul-28 11:22 UTC


A bit of efficiency advice: growing an object incrementally (here, by
calling rbind inside the loop) is generally a major source of slowness.
Consider creating pos_neg_pool at its full size before the loop and then
"filling it in" within the loop. That would also let you remove the
"if{}else{}" statement.
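
One way that advice might look in practice is sketched below. It assumes `s` is a numeric matrix or data frame with 23 rows (random stand-in values are used here) and reuses the poster's "doit"; the row-block indexing is an illustration, not something given in the thread.

```r
set.seed(1)  # only so the sketch is reproducible

# Stand-in for the poster's `s` (hypothetical values): 23 rows, 5 columns.
s <- matrix(rnorm(23 * 5), nrow = 23)

doit <- function(x) c(sum_positive = sum(x[-1][x[-1] > 0]),
                      sum_negative = sum(x[-1][x[-1] < 0]))

n_runs <- 100   # use 10000 for the full job
n <- nrow(s)

# Allocate the full result once instead of growing it with rbind().
pos_neg_pool <- matrix(NA_real_, nrow = n * n_runs, ncol = 2,
                       dimnames = list(NULL, c("sum_positive", "sum_negative")))

for (i in seq_len(n_runs)) {
  d <- apply(s, 2, sample)                  # shuffle within each column
  rows <- ((i - 1) * n + 1):(i * n)         # block of rows reserved for run i
  pos_neg_pool[rows, ] <- t(apply(d, 1, doit))
}
```

Because every run writes into its own block of rows, the first iteration needs no special case.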

-- 

David Winsemius, MD
West Hartford, CT

David Winsemius

2010-Jul-28 11:49 UTC


On Jul 28, 2010, at 12:09 AM, jd6688 wrote:
>
>
> d <- apply(s, 2, sample, size = 10000*nrow(s), replace = TRUE)
>
> Why does the code above return the following error?
> Error: cannot allocate vector of size 218.8 Mb
Possibilities:
Your workspace is full of other junk?
Your workspace used to be full of other junk and its memory is too
fragmented to find a contiguous chunk of that size?
Your computer is full of other junk?
You have not read the R-FAQ (or the RW-FAQ) items on the topic
of memory usage on whatever operating system you are working with.
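
For a rough idea of how much memory such a call needs, the size of a numeric matrix can be estimated as rows * columns * 8 bytes. The sketch below assumes 23 rows and 5 columns, as in the posted excerpt; the real dimensions of `s` are not given in the thread, so substitute your own.

```r
# Rough size of a numeric matrix: rows * cols * 8 bytes (doubles), plus a small header.
n_rows <- 10000 * 23   # rows requested in the quoted call, assuming nrow(s) is 23
n_cols <- 5            # assumption: five columns, as in the posted excerpt
est_mb <- n_rows * n_cols * 8 / 2^20   # R reports "Mb" in units of 2^20 bytes
est_mb

# Compare the estimate with an actual object of that shape.
m <- matrix(0, nrow = n_rows, ncol = n_cols)
print(object.size(m), units = "Mb")

# Drop objects that are no longer needed, then run the garbage collector.
rm(m)
invisible(gc())
```

Running ls() and object.size() on what is already in the workspace is a quick way to check the first possibility in the list.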

-- 

David Winsemius, MD
West Hartford, CT
