thr3ads.net - R help - [R] a more elegant way to get percentages? [Mar 2008]

Monica Pisica

2008-Mar-13 13:36 UTC

[R] a more elegant way to get percentages?

Hi,

I am trying to get percentages in a more elegant way. I have a data.frame with
locations and values (counts) of species at that location. Each location is
repeated for each species I have values for, and I would like to get the
percentage of each species at that location. I am not sure my explanation is
clear, so I will paste my code below:

#####################
> x
   locat val
1      a   5
2      b   5
3      b  15
4      c   5
5      c  20
6      c   5
7      c  10
8      d   5
9      d  15
10     d  10
> loc1 <- x$locat
> n <- length(loc1)
> locuniq1 <- unique(loc1)
> m <- length(locuniq1)
> counts <- seq(1:m)
> 
> for (i in 1:m) {
+ count <- 0
+ for (j in 1:n) {
+ if (loc1[j]==locuniq1[i]) count <- count+1 
+ counts[i] <- count
+ }
+ }
> 
> percent1 <- rep(0,n)
> j <- 0
> for (i in 1:m) {
+ 
+ b <- x[(j+1):(j+counts[i]),]
+ total <- sum(b$val)
+ percent1[(j+1):(j+counts[i])] <- round(apply(as.matrix(b$val), 1, function(x) {x*100/total}),2)
+ j = j+counts[i]
+ }
> x1 <- cbind(x, percent1)    # this is the result i want 
> x1
   locat val percent1
1      a   5   100.00
2      b   5    25.00
3      b  15    75.00
4      c   5    12.50
5      c  20    50.00
6      c   5    12.50
7      c  10    25.00
8      d   5    16.67
9      d  15    50.00
10     d  10    33.33
> 
################

I am wondering if there is a way to do this more efficiently, especially since
the first loop, which counts how many times each location appears in the
data.frame, is slow for a data.frame larger than these 10 rows.

Thanks for any input and sorry if the email is on the long side,

Monica
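
For reference, the first loop in the post only counts how many rows each location has, which base R's table() can do in one call. The snippet below is a minimal sketch, not part of the original message; it assumes x has the two columns shown in the post and rebuilds it so the code runs on its own. Note that table() orders its counts by sorted location, while the loop follows order of first appearance; the two agree for this data.

```r
# Rebuild the example data from the post (assumed layout: locat, val).
x <- data.frame(locat = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
                val   = c(5, 5, 15, 5, 20, 5, 10, 5, 15, 10))

# table() counts the rows per location; as.vector() drops the names so the
# result has the same shape as the 'counts' vector built by the double loop.
counts <- as.vector(table(x$locat))
counts
```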



Gabor Grothendieck

2008-Mar-13 13:45 UTC


Assuming your x is as follows:

x <- data.frame(locat = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
                val = c(5, 5, 15, 5, 20, 5, 10, 5, 15, 10))

Try this:

x$percent1 <- ave(x$val, x$locat, FUN = function(x) 100*x/sum(x))
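
As a quick check of the ave() suggestion, the sketch below rebuilds the data and adds round(..., 2), which is an addition here to match the two-decimal output in the original question. ave() applies the function within each location and returns the results in the original row order, so no counting or sorting is needed.

```r
# Example data as assumed in the reply above.
x <- data.frame(locat = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
                val   = c(5, 5, 15, 5, 20, 5, 10, 5, 15, 10))

# Percentage of each value within its location, rounded to two decimals.
# The inner argument is named v to avoid shadowing the data frame x.
x$percent1 <- round(ave(x$val, x$locat, FUN = function(v) 100 * v / sum(v)), 2)
x
```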


Dimitris Rizopoulos

2008-Mar-13 13:45 UTC


try the following:

x <- read.table(textConnection("locat val
1      a   5
2      b   5
3      b  15
4      c   5
5      c  20
6      c   5
7      c  10
8      d   5
9      d  15
10     d  10"), header = TRUE)

x$percent1 <- unlist(tapply(x$val, x$locat, function(x){
    round(100 * x / sum(x), 2)
}))
x


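However, check whether the levels of the factor 'x$locat' are
appropriately ordered: unlist(tapply(...)) returns the values grouped by
factor level, so they line up with the rows of x only if x is already
sorted by 'locat' in level order.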
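
To illustrate the ordering caveat, here is a small sketch on a made-up, deliberately unsorted data frame y (hypothetical data, not from the thread). It compares the unlist(tapply(...)) result with ave(), which keeps the original row order.

```r
# Hypothetical unsorted data: location "b" appears before and after "a".
y <- data.frame(locat = c("b", "a", "b"), val = c(5, 10, 15))

pct <- function(v) round(100 * v / sum(v), 2)

# tapply() groups by level (a first, then b), so the flattened result
# no longer matches the row order of y.
by_level <- unname(unlist(tapply(y$val, y$locat, pct)))

# ave() returns one value per row, in the original row order.
by_row <- ave(y$val, y$locat, FUN = pct)

cbind(y, by_level, by_row)
```

Sorting the data by location first, for example with y[order(y$locat), ], is one way to make the tapply() result line up.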

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


Christos Hatzis

2008-Mar-13 13:54 UTC


Monica,

You can try the following:
> x.tot <- aggregate(x$val, by=list(total=x$locat), 'sum')
> x.tot
  total  x
1     a  5
2     b 20
3     c 40
4     d 30
> cbind(x, perc=x$val/rep(x.tot$x, table(x$locat)) * 100)
   locat val      perc
1      a   5 100.00000
2      b   5  25.00000
3      b  15  75.00000
4      c   5  12.50000
5      c  20  50.00000
6      c   5  12.50000
7      c  10  25.00000
8      d   5  16.66667
9      d  15  50.00000
10     d  10  33.33333

-Christos
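
The rep(x.tot$x, table(x$locat)) step above assumes the rows of x are already sorted by location. As a hedged variation on the same aggregate() idea (not part of the original reply), match() can look up each row's total by name, which does not depend on row order. The names idx and perc are illustrative.

```r
# Example data from the thread, rebuilt so the snippet is self-contained.
x <- data.frame(locat = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
                val   = c(5, 5, 15, 5, 20, 5, 10, 5, 15, 10))

# One total per location, as in the reply above.
x.tot <- aggregate(x$val, by = list(total = x$locat), FUN = sum)

# For every row of x, find the position of its location in x.tot.
idx <- match(x$locat, x.tot$total)

# Divide each value by the total of its own location.
perc <- 100 * x$val / x.tot$x[idx]
cbind(x, perc)
```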

(Ted Harding)

2008-Mar-13 14:15 UTC


Now that people have answered Monica's query, can someone help me?!!
See below.

On 13-Mar-08 13:36:03, Monica Pisica wrote:
> 
> Hi,
> 
> I am trying to get percentages in a more elegant way. I have a
> data.frame with locations and values (counts) of species at that
> location. Each location is repeated for each species i have values for
> and i would like to get percentages of each species at that location. I
> am not sure if i am clear in my explanations so i will paste my code
> below:
> 
>#####################
> 
>> x
>    locat val
> 1      a   5
> 2      b   5
> 3      b  15
> 4      c   5
> 5      c  20
> 6      c   5
> 7      c  10
> 8      d   5
> 9      d  15
> 10     d  10

With Monica's dataframe as above, the answer would be 100*x[,2]/z
where we want z to be c(5,20,20,40,40,40,40,30,30,30).

So, intending to give Monica a helpful answer, I tried
> apply(x,1,function(y) sum(x[x[,1]==y,2]))
 1  2  3  4  5  6  7  8  9 10 
 5 15 15 30 30 30 30 15 15 15 

and similarly
> apply(x,1,function(y) sum(x$val[x$locat==y]))
 1  2  3  4  5  6  7  8  9 10 
 5 15 15 30 30 30 30 15 15 15 


So why didn't this work? Where's my blind spot? Indeed, why
did it give the results it did?

With thanks,
Ted.
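
One likely explanation, offered as a sketch rather than a reply from the thread: apply() works on as.matrix(x), so the function receives each whole row as a character vector of length 2 (location and value), not just the location. Comparing the 10 locations against that length-2 vector recycles it, so only odd-numbered rows are compared with the location and even-numbered rows are compared with the value string, which never matches. The names m, wrong and z below are illustrative.

```r
x <- data.frame(locat = c("a", "b", "b", "c", "c", "c", "c", "d", "d", "d"),
                val   = c(5, 5, 15, 5, 20, 5, 10, 5, 15, 10))

# What apply() actually iterates over: a character matrix with two columns.
m <- as.matrix(x)
m[2, ]

# y has length 2, so x$locat == y recycles it across the 10 rows and only
# the odd-numbered rows can match the location.
wrong <- apply(x, 1, function(y) sum(x$val[x$locat == y]))

# Passing only the location gives the intended per-location totals.
z <- sapply(x$locat, function(l) sum(x$val[x$locat == l]), USE.NAMES = FALSE)
100 * x$val / z
```

Here wrong reproduces the 5 15 15 30 30 30 30 15 15 15 reported above, while z is the intended vector of per-location totals.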

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 13-Mar-08                                       Time: 14:15:34
------------------------------ XFMail ------------------------------
