Hi all,

I'm hoping someone might help with a query about conditionally applying formulas to a dataframe.

In essence I have 3 lookup tables (Table A, B & C) and a dataframe with a variable Type.code, which identifies the lookup table to which each record belongs. The lookup tables reference different sensor types, and for each type I need to apply a different formula to the value in Column3 of each row to calculate a value for Column4. I have no problem writing a for loop to handle this, but with over 1M rows in the df it is very slow.

Is there a way to solve this using a function such as sapply, or would I be better off just subsetting and applying each function separately to the appropriate subset before rejoining?

Thanks in advance.

Nick

Table.A
1
3
5
13

Table.B
4
6
10
20

Table.C
5321
3233
4532

Dataframe
Record  Type.code  Column3  Column4
1       1          0.4
2       3          0.25
3       4          100
4       20         150
5       5          0.4
6       4532       NA

I have no problem writing a for loop to do this:

for (i in 1:nrow(dataframe)) {
  if (Type.code[i] %in% Table.A) Reading[i] <- function 1
  else if (Type.code[i] %in% Table.B) Reading[i] <- function 2
  else if (Type.code[i] %in% Table.C) Reading[i] <- function 3
}

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Nick Bond
Research Fellow
Monash University
Victoria, Australia, 38000
Email: Nick.Bond@sci.monash.edu.au
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jim holtman
2010-May-15 12:05 UTC
[R] conditional calculations per row (loop versus apply)
try this:

Reading <- ifelse(Type.code %in% Table.A, function 1,
           ifelse(Type.code %in% Table.B, function 2,
           ifelse(Type.code %in% Table.C, function 3, NA)))

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
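For anyone wanting to run the suggestion above, here is a minimal sketch on the sample data from the question. The thread never says what "function 1", "function 2" and "function 3" actually are, so f1, f2 and f3 below are purely hypothetical stand-ins, as is the data frame name df. The sketch shows the nested ifelse form next to the subset-and-assign alternative the question asks about; both are vectorised and avoid the row-by-row loop.

```r
# Lookup tables and sample data as given in the question
Table.A <- c(1, 3, 5, 13)
Table.B <- c(4, 6, 10, 20)
Table.C <- c(5321, 3233, 4532)

df <- data.frame(
  Record    = 1:6,
  Type.code = c(1, 3, 4, 20, 5, 4532),
  Column3   = c(0.4, 0.25, 100, 150, 0.4, NA)
)

# HYPOTHETICAL formulas: placeholders only, the real ones are not in the thread
f1 <- function(x) x * 100
f2 <- function(x) x / 10
f3 <- function(x) x + 1

# Approach 1: nested ifelse. Each formula is evaluated on every row,
# and ifelse then picks the result that matches the row's lookup table.
df$Column4 <- with(df,
  ifelse(Type.code %in% Table.A, f1(Column3),
  ifelse(Type.code %in% Table.B, f2(Column3),
  ifelse(Type.code %in% Table.C, f3(Column3), NA))))

# Approach 2: logical-index assignment. Each formula is evaluated only
# on the rows belonging to its table; unmatched rows stay NA.
in.A <- df$Type.code %in% Table.A
in.B <- df$Type.code %in% Table.B
in.C <- df$Type.code %in% Table.C

Column4.alt <- rep(NA_real_, nrow(df))
Column4.alt[in.A] <- f1(df$Column3[in.A])
Column4.alt[in.B] <- f2(df$Column3[in.B])
Column4.alt[in.C] <- f3(df$Column3[in.C])

# Both approaches should agree on this data
stopifnot(isTRUE(all.equal(df$Column4, Column4.alt)))
```

The second form may be preferable when a formula is expensive or is not valid for rows of another sensor type, since it never applies a formula outside its own subset. Which one is faster on 1M rows is something to time on the real data rather than assume.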