gildororonar at mail-on.us
2013-Sep-14 16:35 UTC
[R] how to do this trimming/selecting in canonical R?
This is better explained by example:

> A <- data.frame(force = sort(runif(10, 0, 1)), condition =
+   sort(sample(0:100, 10)))
> B <- data.frame(counterforce = sort(runif(15, 0, 1), decreasing=T),
+   condition = sort(sample(0:100, 15)))

So we have:

> A
        force condition
1  0.03515542         1
2  0.13267882        13
3  0.26155689        24
4  0.37453142        38
5  0.39360520        45  <--- trim everything after this
6  0.43924737        48
7  0.47669800        50
8  0.57044795        51
9  0.81177499        61
10 0.98860450        94

> B
   counterforce condition
1   0.965769548         2
2   0.965266255         5
3   0.846941244         7
4   0.818013029        11
5   0.813139978        22
6   0.730599939        34
7   0.715985436        39
8   0.658073895        40
9   0.421264948        42  <--- trim everything after this
10  0.373774505        52
11  0.242191461        62
12  0.090584590        63
13  0.070020635        68
14  0.067366062        83
15  0.001585313        84

I need to trim away the rows after No. 5 from A, and the rows after No. 9 from B.

Because

A[5, condition] > max(B[1:9, condition]) && A[5, force] > B[9+1, counterforce]

In a general way, I am looking for x and y, where:

A[x, condition] > max(B[1:y, condition]) && A[x, force] > B[y+1, counterforce]

and I will select A[1:x,] and B[1:y,], or trim away the rest, because they are irrelevant for the calculation onwards.

This is easy to do in C, and I have actually done it in a C-like R script, by looping through all rows of A and breaking from the loop when finding the matching trim-point in B. But I am learning R, so what is the native way to do it in R?
Gabor Grothendieck
2013-Sep-14 18:04 UTC
[R] how to do this trimming/selecting in canonical R?
On Sat, Sep 14, 2013 at 12:35 PM, <gildororonar at mail-on.us> wrote:
> This is better explained by example:
>
>> A <- data.frame(force = sort(runif(10, 0, 1)), condition =
>> sort(sample(0:100, 10)))
>> B <- data.frame(counterforce = sort(runif(15, 0, 1), decreasing=T),
>> condition = sort(sample(0:100, 15)))
>
> So we have:
>
>> A
>
>         force condition
> 1  0.03515542         1
> 2  0.13267882        13
> 3  0.26155689        24
> 4  0.37453142        38
> 5  0.39360520        45  <--- trim everything after this
> 6  0.43924737        48
> 7  0.47669800        50
> 8  0.57044795        51
> 9  0.81177499        61
> 10 0.98860450        94
>
>> B
>
>    counterforce condition
> 1   0.965769548         2
> 2   0.965266255         5
> 3   0.846941244         7
> 4   0.818013029        11
> 5   0.813139978        22
> 6   0.730599939        34
> 7   0.715985436        39
> 8   0.658073895        40
> 9   0.421264948        42  <--- trim everything after this
> 10  0.373774505        52
> 11  0.242191461        62
> 12  0.090584590        63
> 13  0.070020635        68
> 14  0.067366062        83
> 15  0.001585313        84
>
> I need to trim away the rows after No. 5 from A, and the rows after No. 9
> from B.
>
> Because
>
> A[5, condition] > max(B[1:9, condition]) && A[5, force] > B[9+1,
> counterforce]
>
> In a general way, I am looking for x and y, where:
>
> A[x, condition] > max(B[1:y, condition]) && A[x, force] > B[y+1,
> counterforce]
>
> and I will select A[1:x,] and B[1:y,], or trim away the rest, because they
> are irrelevant for the calculation onwards.

Try this:

library(sqldf)

r <- sqldf("select a.condition, b1.condition
  from A a, B b1, B b2
  where a.condition > b1.condition
    and a.force > b2.counterforce
    and b2.rowid = b1.rowid + 1
  order by a.condition, b1.condition
  limit 1")

Atrim <- fn$sqldf("select * from A where condition <= `r[1]` ")
Btrim <- fn$sqldf("select * from B where condition <= `r[2]` ")

giving:

> Atrim
       force condition
1 0.03515542         1
2 0.13267882        13
3 0.26155689        24
4 0.37453142        38
5 0.39360520        45

> Btrim
  counterforce condition
1    0.9657695         2
2    0.9652663         5
3    0.8469412         7
4    0.8180130        11
5    0.8131400        22
6    0.7305999        34
7    0.7159854        39
8    0.6580739        40
9    0.4212649        42
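For readers without sqldf, the same query can be approximated in base R. The following is only a sketch of the idea, not the method used above: it builds a cross join with merge(), filters it like the WHERE clause, and then mimics ORDER BY ... LIMIT 1. The helper names (y, b_condition, next_counterforce, pairs, keep) are made up for illustration. Like the SQL, it compares against B's condition in row y rather than max(B[1:y, condition]), which is the same thing only because B$condition is sorted in increasing order.

```r
# Data copied from the example in the original post
A <- data.frame(
  force = c(0.03515542, 0.13267882, 0.26155689, 0.37453142, 0.39360520,
            0.43924737, 0.47669800, 0.57044795, 0.81177499, 0.98860450),
  condition = c(1, 13, 24, 38, 45, 48, 50, 51, 61, 94))
B <- data.frame(
  counterforce = c(0.965769548, 0.965266255, 0.846941244, 0.818013029,
                   0.813139978, 0.730599939, 0.715985436, 0.658073895,
                   0.421264948, 0.373774505, 0.242191461, 0.090584590,
                   0.070020635, 0.067366062, 0.001585313),
  condition = c(2, 5, 7, 11, 22, 34, 39, 40, 42, 52, 62, 63, 68, 83, 84))

# Hypothetical helper table: each row y of B (except the last) together
# with the counterforce of row y + 1, standing in for the b1/b2 self-join
b1 <- data.frame(
  y = seq_len(nrow(B) - 1),
  b_condition = B$condition[-nrow(B)],
  next_counterforce = B$counterforce[-1])

# by = NULL makes merge() return the Cartesian product (all row pairs)
pairs <- merge(A, b1, by = NULL)

# Equivalent of the WHERE clause
keep <- pairs[pairs$condition > pairs$b_condition &
              pairs$force > pairs$next_counterforce, ]

# Equivalent of ORDER BY a.condition, b1.condition LIMIT 1
r <- keep[order(keep$condition, keep$b_condition), ][1, ]

Atrim <- A[A$condition <= r$condition, ]
Btrim <- B[B$condition <= r$b_condition, ]
```

With the posted data this should keep the first 5 rows of A and the first 9 rows of B, as in the sqldf output. If no pair satisfies the conditions, r will contain NA values, so real code would need to check for that case.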
Berend Hasselman
2013-Sep-14 20:06 UTC
[R] how to do this trimming/selecting in canonical R?
On 14-09-2013, at 18:35, gildororonar at mail-on.us wrote:
> This is better explained by example:
>
>> A <- data.frame(force = sort(runif(10, 0, 1)), condition = sort(sample(0:100, 10)))
>> B <- data.frame(counterforce = sort(runif(15, 0, 1), decreasing=T), condition = sort(sample(0:100, 15)))
>
> So we have:
>
>> A
>         force condition
> 1  0.03515542         1
> 2  0.13267882        13
> 3  0.26155689        24
> 4  0.37453142        38
> 5  0.39360520        45  <--- trim everything after this
> 6  0.43924737        48
> 7  0.47669800        50
> 8  0.57044795        51
> 9  0.81177499        61
> 10 0.98860450        94
>
>> B
>    counterforce condition
> 1   0.965769548         2
> 2   0.965266255         5
> 3   0.846941244         7
> 4   0.818013029        11
> 5   0.813139978        22
> 6   0.730599939        34
> 7   0.715985436        39
> 8   0.658073895        40
> 9   0.421264948        42  <--- trim everything after this
> 10  0.373774505        52
> 11  0.242191461        62
> 12  0.090584590        63
> 13  0.070020635        68
> 14  0.067366062        83
> 15  0.001585313        84
>
> I need to trim away the rows after No. 5 from A, and the rows after No. 9 from B.
>
> Because
>
> A[5, condition] > max(B[1:9, condition]) && A[5, force] > B[9+1, counterforce]
>
> In a general way, I am looking for x and y, where:
>
> A[x, condition] > max(B[1:y, condition]) && A[x, force] > B[y+1, counterforce]
>
> and I will select A[1:x,] and B[1:y,], or trim away the rest, because they are irrelevant for the calculation onwards.
>
> This is easy to do in C, and I have actually done it in a C-like R script, by looping through all rows of A and breaking from the loop when finding the matching trim-point in B. But I am learning R, so what is the native way to do it in R?

Your trim-point in B is not unique (at least for the data you provided).
Use a loop in R like this:

for (x in seq_len(nrow(A))) {
  for (y in seq_len(nrow(B) - 1)) {
    res1 <- A[x, "condition"] > max(B[1:y, "condition"])
    res2 <- A[x, "force"] > B[y + 1, "counterforce"]
    res <- res1 && res2
    if (res) cat("x=", x, "y=", y, "res=", res, "\n")
  }
}

Result is:

# x= 5 y= 9 res= TRUE
# x= 6 y= 8 res= TRUE
# x= 6 y= 9 res= TRUE
# x= 7 y= 8 res= TRUE
# x= 7 y= 9 res= TRUE
# x= 8 y= 8 res= TRUE
# x= 8 y= 9 res= TRUE
# x= 9 y= 5 res= TRUE
# x= 9 y= 6 res= TRUE
# x= 9 y= 7 res= TRUE
# x= 9 y= 8 res= TRUE
# x= 9 y= 9 res= TRUE
# x= 9 y= 10 res= TRUE
# x= 10 y= 1 res= TRUE
# x= 10 y= 2 res= TRUE
# x= 10 y= 3 res= TRUE
# x= 10 y= 4 res= TRUE
# x= 10 y= 5 res= TRUE
# x= 10 y= 6 res= TRUE
# x= 10 y= 7 res= TRUE
# x= 10 y= 8 res= TRUE
# x= 10 y= 9 res= TRUE
# x= 10 y= 10 res= TRUE
# x= 10 y= 11 res= TRUE
# x= 10 y= 12 res= TRUE
# x= 10 y= 13 res= TRUE
# x= 10 y= 14 res= TRUE

If you want a unique answer you'll need additional restrictions.

Berend
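The double loop can also be written without explicit loops. The sketch below is one possible vectorized version, using cummax() for the running maximum and outer() to test every (x, y) pair at once. The tie-break at the end (smallest x, then smallest y) is an assumption added here as one example of an "additional restriction"; it is not something the original poster specified. The names run_max, next_cf, ok, hits and first are illustrative only.

```r
# Data copied from the example in the original post
A <- data.frame(
  force = c(0.03515542, 0.13267882, 0.26155689, 0.37453142, 0.39360520,
            0.43924737, 0.47669800, 0.57044795, 0.81177499, 0.98860450),
  condition = c(1, 13, 24, 38, 45, 48, 50, 51, 61, 94))
B <- data.frame(
  counterforce = c(0.965769548, 0.965266255, 0.846941244, 0.818013029,
                   0.813139978, 0.730599939, 0.715985436, 0.658073895,
                   0.421264948, 0.373774505, 0.242191461, 0.090584590,
                   0.070020635, 0.067366062, 0.001585313),
  condition = c(2, 5, 7, 11, 22, 34, 39, 40, 42, 52, 62, 63, 68, 83, 84))

ny <- nrow(B) - 1                            # y + 1 must still be a row of B
run_max <- cummax(B$condition)[seq_len(ny)]  # max(B[1:y, "condition"]) for each y
next_cf <- B$counterforce[-1]                # B[y + 1, "counterforce"] for each y

# ok[x, y] is TRUE when both conditions hold for the pair (x, y)
ok <- outer(A$condition, run_max, ">") & outer(A$force, next_cf, ">")

# All matching pairs, one per row: the same pairs the double loop prints
hits <- which(ok, arr.ind = TRUE)
colnames(hits) <- c("x", "y")

# Assumed tie-break (not from the thread): smallest x, then smallest y
first <- hits[order(hits[, "x"], hits[, "y"]), , drop = FALSE][1, ]

Atrim <- A[seq_len(first[["x"]]), ]
Btrim <- B[seq_len(first[["y"]]), ]
```

For the posted data, hits should contain the same 27 pairs that the loop prints, and the assumed tie-break picks x = 5 and y = 9, the trim points the original poster asked for. If no pair matches, hits has zero rows and first will be NA, so that case would need separate handling.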