Hi all,
I'm hoping someone might help with a query about conditionally applying
formulas to a dataframe.
In essence I have 3 lookup tables (Table A, B & C) and a dataframe with a
variable Type.code, which identifies the lookup table to which each record
belongs. The lookup tables reference different sensor types, and for each type
I need to apply a different formula to the value in Column3 of each row to
calculate a value for Column4. I have no problem writing a for loop to handle
this, but with over 1M rows in the df it is very slow.
Is there a way to solve this using a function such as sapply, or would I be
better off just subsetting and applying each function separately to the
appropriate subset before rejoining?
Thanks in advance.
Nick
Table.A
1
3
5
13
Table.B
4
6
10
20
Table.C
5321
3233
4532
Dataframe
Record  Type.code  Column3  Column4
1       1          0.4
2       3          0.25
3       4          100
4       20         150
5       5          0.4
6       4532       NA
I have no problem writing a for loop to do this:
for (i in 1:nrow(dataframe)) {
  if (Type.code[i] %in% Table.A) Reading[i] <- function 1
  else
  if (Type.code[i] %in% Table.B) Reading[i] <- function 2
  else
  if (Type.code[i] %in% Table.C) Reading[i] <- function 3
}
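The subset-and-assign alternative raised in the question could look roughly like the sketch below. It assumes the data frame is called `df`, reads row 6 of the sample as having Column3 = NA, and uses three made-up formulas (`f1`, `f2`, `f3`) as stand-ins for "function 1" to "function 3", which the post does not specify.

```r
# Lookup tables and sample data from the post
Table.A <- c(1, 3, 5, 13)
Table.B <- c(4, 6, 10, 20)
Table.C <- c(5321, 3233, 4532)

df <- data.frame(
  Record    = 1:6,
  Type.code = c(1, 3, 4, 20, 5, 4532),
  Column3   = c(0.4, 0.25, 100, 150, 0.4, NA)  # row 6 assumed to be NA
)

# Hypothetical formulas: placeholders only, not the real sensor calculations
f1 <- function(x) x * 10
f2 <- function(x) x / 2
f3 <- function(x) x + 1

# One logical index per lookup table, computed once over the whole column
inA <- df$Type.code %in% Table.A
inB <- df$Type.code %in% Table.B
inC <- df$Type.code %in% Table.C

# Start with NA everywhere, then fill each subset with its own formula
df$Column4 <- NA_real_
df$Column4[inA] <- f1(df$Column3[inA])
df$Column4[inB] <- f2(df$Column3[inB])
df$Column4[inC] <- f3(df$Column3[inC])
```

Because the rows are filled in place through logical indices, nothing has to be split and rejoined. The sketch relies on the three lookup tables not sharing any codes; if they did, the last assignment would win, whereas the loop gives priority to Table.A.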
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Nick Bond
Research Fellow
Monash University
Victoria, Australia, 38000
Email: Nick.Bond@sci.monash.edu.au
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
jim holtman
2010-May-15 12:05 UTC
[R] conditional calculations per row (loop versus apply)
try this:
Reading <- ifelse(Type.code %in% Table.A, function 1,
           ifelse(Type.code %in% Table.B, function 2,
           ifelse(Type.code %in% Table.C, function 3, NA)))
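For anyone wanting to run the nested ifelse idea, here is a minimal sketch using the sample data from the original post. The formulas `f1`, `f2` and `f3` are invented placeholders for "function 1" to "function 3" (the real ones were not posted), and row 6 is assumed to have Column3 = NA.

```r
# Lookup tables and sample columns from the original post
Table.A <- c(1, 3, 5, 13)
Table.B <- c(4, 6, 10, 20)
Table.C <- c(5321, 3233, 4532)

Type.code <- c(1, 3, 4, 20, 5, 4532)
Column3   <- c(0.4, 0.25, 100, 150, 0.4, NA)  # row 6 assumed to be NA

# Hypothetical formulas: placeholders only, not the real sensor calculations
f1 <- function(x) x * 10
f2 <- function(x) x / 2
f3 <- function(x) x + 1

# Each test is a full-length logical vector, so no explicit loop is needed;
# codes that match none of the tables fall through to NA
Reading <- ifelse(Type.code %in% Table.A, f1(Column3),
           ifelse(Type.code %in% Table.B, f2(Column3),
           ifelse(Type.code %in% Table.C, f3(Column3), NA)))
```

Note that ifelse evaluates every formula on the whole vector and then picks one result per element, so this only suits formulas that are themselves vectorised and cheap enough to compute for all rows.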
On Sat, May 15, 2010 at 5:21 AM, Nick Bond <Nick.Bond@sci.monash.edu.au> wrote:
--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?