I spent some time on this simple question and also searched the forum. I eventually hacked my way to an ugly solution for my particular problem, but I would like to improve my coding.

I have data of the form:

df <- expand.grid(group=c('copper', 'zinc', 'aluminum', 'nickel'), condition1=c(1:4))

I would like to add a new data column "condition2", with values equal to the value of condition1 plus a random number from 0-1 (uniform distribution) if the value of condition1 is < 1, or just condition1 if the value of condition1 is > 1. More generally, my interest is in manipulating the values of condition1 if they meet one or more criteria, or keeping the values the same otherwise.

Thanks for any thoughts!

--
View this message in context: http://r.789695.n4.nabble.com/Performing-operations-only-on-selected-data-tp4650646.html
Sent from the R help mailing list archive at Nabble.com.
?ifelse

Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
Sent from my phone. Please excuse my brevity.
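A minimal sketch of how the pointer to `ifelse` might be applied to the data in the question. It assumes the intended cutoff is `condition1 <= 1` (the question leaves the case of exactly 1 open). The names `noise`, `sel`, and `condition3` are hypothetical, and `set.seed` is there only to make the illustration reproducible.

```r
# Data from the original question
df <- expand.grid(group = c("copper", "zinc", "aluminum", "nickel"),
                  condition1 = 1:4)

set.seed(1)  # assumption: fixed seed, for a reproducible illustration only

# One random draw per row; ifelse() then picks, position by position,
# either the shifted value or the unchanged value.
noise <- runif(nrow(df), min = 0, max = 1)
df$condition2 <- ifelse(df$condition1 <= 1,
                        df$condition1 + noise,
                        df$condition1)

# Alternative with logical indexing (condition3 is a hypothetical column):
# copy the column, then modify only the rows that meet the criterion.
# Several criteria can be combined with & and |.
df$condition3 <- df$condition1
sel <- df$condition1 <= 1
df$condition3[sel] <- df$condition3[sel] + runif(sum(sel))
```

The logical-indexing form may be the easier one to extend when the goal is "change the rows that meet some criteria, leave the rest alone", since untouched rows are never recomputed.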
Thank you, this works very well. My only remaining question is about how ifelse is working.

I understand the basic syntax: df$condition2 gets assigned the value *runif(nrow(df1[df1$condition1<=1,]),0,1)* or the value *df$condition1*, depending on whether or not df$condition1 meets the criterion "<=1". As I understand it, "runif(nrow(df1[df1$condition1<=1,]),0,1)" is a vector of random values with length equal to the number of rows meeting "df$condition1<=1", and df$condition1 is just my column of condition1 values. So the command seems to be going down row by row and assigning condition2 values from one of two vectors in an "interleaved" way.

So my question is, how does R keep track of which item in each of the vectors to assign to condition2? For example, if the first 4 entries of condition1 are 1, 3, 4, 1, how does R know to use the *first* entry of vector runif(nrow(df1[df1$condition1<=1,]),0,1), then the *second* and *third* values of vector df$condition1, then the *second* value of vector runif(nrow(df1[df1$condition1<=1,]),0,1)?

--
View this message in context: http://r.789695.n4.nabble.com/Performing-operations-only-on-selected-data-tp4650646p4650803.html
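The indexing question above can be explored with a small sketch. As described in `?ifelse`, the function does not keep a separate counter for each vector: wherever the test is TRUE at position i it takes element i of `yes`, wherever it is FALSE it takes element i of `no`, and a `yes` or `no` that is shorter than the test is first recycled to the test's length. The vectors below are made up for illustration, with fixed numbers standing in for `runif()`.

```r
cond <- c(1, 1, 3, 1)          # made-up values for illustration
test <- cond <= 1              # TRUE TRUE FALSE TRUE

# A "yes" vector with one value per TRUE row (length 3), using fixed
# numbers instead of runif() so the result is easy to follow.
yes_short <- c(0.1, 0.2, 0.3)

# ifelse() recycles yes_short to length(test), i.e. 0.1 0.2 0.3 0.1,
# and then takes element i wherever test[i] is TRUE.
res <- ifelse(test, yes_short, cond)
print(res)   # 0.1 0.2 3.0 0.1

# The same selection written out by hand:
yes_full <- rep(yes_short, length.out = length(test))
manual <- cond
manual[test] <- yes_full[test]
identical(res, manual)   # TRUE
```

In this sketch 0.3 is never used and 0.1 appears twice. With independent random draws that reuse can go unnoticed, which is probably why the shorter vector seemed to work. Passing a `yes` vector as long as the test, for example `runif(nrow(df))`, makes the positional behaviour explicit and avoids repeated values.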