thr3ads.net - R help - [R] dividing a dataframe column by different constants [Sep 2009]


Ottorino-Luca Pantani

2009-Sep-03 16:17 UTC

[R] dividing a dataframe column by different constants

Dear R users, today I've got the following problem.
Here is a data frame as an example.
There are some SAMPLEs for which a CONCentration was recorded through TIME.
The period over which the concentration was recorded is not always the same:
10 points for sample A, 7 points for sample B and 11 for sample C.

The initial concentration was also not the same for the three samples.

I would like to express the concentrations as a % of the concentration at
time = 1, so I wrote the following code, which does the job but is
impractical when there are, as in my real case, more than one hundred samples.
It is known that the maximum concentration occurs at the minimum time;
all the other concentrations in the sample should be divided by it.

I'm quite sure that there's a more elegant solution, but I really cannot
even imagine how to write it.

Thanks in advance for your time


(df.mydata <- data.frame(
                         CONC = c(seq( from = 1, to = 0.1, by = -0.1 ),
                           seq( from = 0.8, to = 0.2, by = -0.1 ),
                           seq( from = 0.6, to = 0.1, by = -0.05 )),
                         TIME = c(1:10,
                           2:8,
                           4:14 ),
                         SAMPLE = c( rep( "A", 10 ),
                           rep( "B", 7 ),
                           rep( "C", 11 )
                           )
                         )
 )
MAX <- tapply( df.mydata$CONC, df.mydata$SAMPLE, max )
(df.mydata$PERCENTAGE <-
 ifelse(df.mydata$SAMPLE == "A",  df.mydata$CONC / MAX[1],
        ifelse(df.mydata$SAMPLE == "B",  df.mydata$CONC / MAX[2],
               df.mydata$CONC / MAX[3])))

-- 
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273 
OLPantani at unifi.it  http://www4.unifi.it/dssnp/

David Winsemius

2009-Sep-03 16:43 UTC


Perhaps this:

by(df.mydata, df.mydata$SAMPLE, function(x) x$CONC/x$CONC[1] )

...or, if you wanted to use max(x$CONC) as the standardizing value,
that ought to work as well. With your data it gives identical results.

The equivalent tapply construction would be:

tapply(df.mydata$CONC, df.mydata$SAMPLE, function(x) x/x[1] )
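
Both by() and tapply() return one vector of ratios per sample rather than a single column. A minimal sketch of putting the ratios back into the data frame follows. It swaps in base R's split() and unsplit(), which is not what was proposed above, and it assumes that the first row of each sample is the reference value, as in the example data. The column name PERCENTAGE is borrowed from the original post.

```r
# Example data from the original post
df.mydata <- data.frame(
  CONC = c(seq(from = 1, to = 0.1, by = -0.1),
           seq(from = 0.8, to = 0.2, by = -0.1),
           seq(from = 0.6, to = 0.1, by = -0.05)),
  TIME = c(1:10, 2:8, 4:14),
  SAMPLE = c(rep("A", 10), rep("B", 7), rep("C", 11))
)

# One numeric vector of concentrations per sample
parts <- split(df.mydata$CONC, df.mydata$SAMPLE)

# Divide each sample by its first value
ratios <- lapply(parts, function(x) x / x[1])

# unsplit() returns the values to their original row positions
df.mydata$PERCENTAGE <- unsplit(ratios, df.mydata$SAMPLE)
```

Because unsplit() uses the grouping vector to place the values, this should also work when the rows of different samples are interleaved, although x[1] is then simply the first row of each sample in the data frame.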

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

Jorge Ivan Velez

2009-Sep-03 16:48 UTC


Dear Ottorino-Luca,
Here is a suggestion using ave():

df.mydata$PERCENTAGE <- with(df.mydata,
    ave(CONC, list(SAMPLE), FUN = function(x) x / max(x)))
df.mydata[1:5,]
#   CONC TIME SAMPLE PERCENTAGE
# 1  1.0    1      A        1.0
# 2  0.9    2      A        0.9
# 3  0.8    3      A        0.8
# 4  0.7    4      A        0.7
# 5  0.6    5      A        0.6

See ?ave and ?tapply for more information.
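
Dividing by max() matches the concentration at the earliest time only because the concentrations decrease over time in this data. If that cannot be relied on, a sketch along the following lines picks the reference row by minimum TIME explicitly and scales the result to percent. The names first_idx and PERCENT_OF_FIRST are hypothetical, introduced only for illustration.

```r
# Example data from the original post
df.mydata <- data.frame(
  CONC = c(seq(from = 1, to = 0.1, by = -0.1),
           seq(from = 0.8, to = 0.2, by = -0.1),
           seq(from = 0.6, to = 0.1, by = -0.05)),
  TIME = c(1:10, 2:8, 4:14),
  SAMPLE = c(rep("A", 10), rep("B", 7), rep("C", 11))
)

# For every row, the row index of the earliest TIME within its SAMPLE.
# ave() is applied to the row indices, so FUN receives the indices of one sample.
first_idx <- with(df.mydata,
  ave(seq_along(CONC), SAMPLE, FUN = function(i) i[which.min(TIME[i])]))

# Divide by the reference concentration and scale to percent
df.mydata$PERCENT_OF_FIRST <- 100 * df.mydata$CONC / df.mydata$CONC[first_idx]
```

This does not depend on the rows being sorted by TIME within a sample.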

HTH,
Jorge


