thr3ads.net - R help - [R] How to do the same thing for all levels of a column? [Jul 2012]

Zhao Jin

2012-Jul-23 22:01 UTC

[R] How to do the same thing for all levels of a column?

Dear all,



I am an R beginner, and I am looking for a way to do the same thing for all
levels of a column in a table.



Basically, I have a bunch of protein sequences composed of different amino
acid residues, and each residue is represented by an uppercase letter. I
want to calculate the ratio of different amino acid residues at each
position of the proteins. Here is an example table:

Proteins  Time_zero  1  2  3  4  5  6  7  8
p1        0.0050723  L  E  Y  I  I  P  D  A
p2        0.0002731  T  E  N  L  V  P  G  A
p3        9.757E-05  L  M  Y  Q  I  P  E  C
p4        0.0002077  R  E  Y  L  I  S  E  A


If I name this table as myfile.txt, I have the following scripts to
calculate the ratio of each amino acid residue at position 1:

# showing levels of the 3rd column, which means the types of residues
>myfile[,3]


# calculating the ratio of L
>list=c(which(myfile[,3]=="L"))
>time0total=sum(myfile[,2])
>AA_L=0
>for (i in 1:length(list)){AA_L=sum(myfile[list[[i]],2]+AA_L)}
>ratio_L=AA_L/time0total


So how can I write a script to do the same thing for the other two levels
(T and R) in column 3, and also do this for every column that contains
amino acid residues?
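
The loop above adds up the Time_zero values of the rows whose position-1 residue is L and divides by the column total. A minimal vectorised sketch of that same calculation follows. The small data frame is retyped by hand from the example table, the column name X1 is an assumption (it is what read.table would typically make of a column headed "1"), and ratio_L_vec is a name made up for illustration.

```r
# Example table retyped by hand; only position 1 is needed here.
# The column name X1 is an assumption about how the file was read in.
myfile <- data.frame(
  Proteins  = c("p1", "p2", "p3", "p4"),
  Time_zero = c(0.0050723, 0.0002731, 9.757e-05, 0.0002077),
  X1        = c("L", "T", "L", "R"),
  stringsAsFactors = TRUE
)

# Original approach: loop over the matching rows and accumulate.
idx <- which(myfile[, 3] == "L")
AA_L <- 0
for (i in idx) AA_L <- AA_L + myfile[i, 2]
ratio_L <- AA_L / sum(myfile[, 2])

# Vectorised: logical indexing picks the same rows in one step.
ratio_L_vec <- sum(myfile$Time_zero[myfile$X1 == "L"]) / sum(myfile$Time_zero)

print(c(loop = ratio_L, vectorised = ratio_L_vec))
```

Both numbers should agree; the vectorised form also avoids naming a variable `list`, which masks a base R function.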



Many thanks for any help you could give me on this topic! :)



Regards,

Zhao
-- 
Zhao JIN
Ph.D. Candidate
Ruth Ley Lab
467 Biotech
Field of Microbiology, Cornell University
Lab: 607.255.4954
Cell: 412.889.3675


John Kane

2012-Jul-24 13:51 UTC


The first thing is to supply the data in a usable format.  As it is, it is
essentially unreadable.  All R beginners do this. :)

Have a look at the dput function  (?dput) for a good way to supply sample data
in an email.

If you have a large dataset probably a few dozen lines of data would be fine.

Something like dput(head(mydata)) should be fine.  Just copy and paste the
output into your email.
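
As a rough illustration of that advice, the sketch below builds a toy data frame (the name mydata matches the call above, but its columns and values are invented here), prints the dput() text one would paste into an email, and checks that the text rebuilds an equal object.

```r
# Toy data standing in for a real dataset (values are made up).
mydata <- data.frame(
  id    = 1:8,
  group = factor(c("a", "b", "a", "c", "b", "a", "c", "b")),
  score = c(2.5, 3.1, 4.0, 1.2, 5.5, 2.2, 3.3, 4.4)
)

# dput() writes an R expression that recreates the object;
# head() keeps only the first six rows so the email stays short.
dput(head(mydata))

# A reader would paste that text after `mydata2 <-`. Here the round trip
# is simulated by capturing the printed text and evaluating it.
txt <- paste(capture.output(dput(head(mydata))), collapse = "\n")
mydata2 <- eval(parse(text = txt))
print(all.equal(mydata2, head(mydata)))
```

Unlike a pasted table, the dput() text keeps column types and factor levels, so the recipient gets the same structure the sender had.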

Welcome to R.  I think you will like it.

John Kane
Kingston ON Canada


John Kane

2012-Jul-24 15:18 UTC


I think this does what you want using two packages, plyr and reshape2, that
   you may have to install.  If so, install.packages(c("plyr", "reshape2"))
   should do the trick.
   library(plyr)
   library(reshape2)
   # using the supplied data frame 'myfile' from below
   time0total = sum(myfile[,2])
   mydata  <-  myfile[, 2:10]
   md1  <-  melt(mydata, id = "Time_zero")
   ddply(md1, .(variable, value), summarise, sum = sum(Time_zero)/time0total)
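
For readers who would rather not install the two packages, the same reshape-then-summarise idea can be sketched in base R. This is a swapped-in alternative, not the plyr/reshape2 code above: stack() plays the role of melt() and aggregate() the role of ddply(). The column names position and residue, and the cut-down two-position copy of the sample data, are assumptions made for the example.

```r
# Two positions of the sample data, retyped as character columns
# (stack() only picks up plain vectors, not factors).
myfile <- data.frame(
  Proteins  = c("p1", "p2", "p3", "p4"),
  Time_zero = c(0.0050723, 0.0002731, 9.76e-05, 0.0002077),
  X1 = c("L", "T", "L", "R"),
  X2 = c("E", "E", "M", "E"),
  stringsAsFactors = FALSE
)
time0total <- sum(myfile$Time_zero)

# Long format: one row per (protein, position) pair.
long <- stack(myfile[, c("X1", "X2")])
names(long) <- c("residue", "position")
# stack() lists all X1 rows, then all X2 rows, so Time_zero repeats twice.
long$Time_zero <- rep(myfile$Time_zero, times = 2)

# Sum Time_zero within each position/residue pair, then scale by the total.
res <- aggregate(Time_zero ~ position + residue, data = long, FUN = sum)
res$ratio <- res$Time_zero / time0total
res <- res[order(res$position, res$residue), ]
print(res)
```

Within each position the ratio column should add up to 1, which is a quick sanity check on any of the approaches in this thread.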


   John Kane
   Kingston ON Canada

   -----Original Message-----
   From: zj29 at cornell.edu
   Sent: Tue, 24 Jul 2012 10:25:21 -0400
   To: jrkrideau at inbox.com
   Subject: Re: [R] How to do the same thing for all levels of a column?

   Hi John,
   Thank you for the tips. My apologies about the unreadable sample data...
   So here is the output of the sample data, and hopefully it works this time
   :)
   myfile  <-  structure(list(
     Proteins = structure(1:4, .Label = c("p1", "p2", "p3", "p4"), class = "factor"),
     Time_zero = c(0.0050723, 0.0002731, 9.76e-05, 0.0002077),
     X1 = structure(c(1L, 3L, 1L, 2L), .Label = c("L", "R", "T"), class = "factor"),
     X2 = structure(c(1L, 1L, 2L, 1L), .Label = c("E", "M"), class = "factor"),
     X3 = structure(c(2L, 1L, 2L, 2L), .Label = c("N", "Y"), class = "factor"),
     X4 = structure(c(1L, 2L, 3L, 2L), .Label = c("I", "L", "Q"), class = "factor"),
     X5 = structure(c(1L, 2L, 1L, 1L), .Label = c("I", "V"), class = "factor"),
     X6 = structure(c(1L, 1L, 1L, 2L), .Label = c("P", "S"), class = "factor"),
     X7 = structure(c(1L, 3L, 2L, 2L), .Label = c("D", "E", "G"), class = "factor"),
     X8 = structure(c(1L, 1L, 2L, 1L), .Label = c("A", "C"), class = "factor")),
     .Names = c("Proteins", "Time_zero", "X1", "X2", "X3", "X4", "X5", "X6", "X7", "X8"),
     row.names = c(NA, 4L), class = "data.frame")

   Thanks a lot!

   Regards,

   Zhao

John Kane

2012-Jul-25 13:44 UTC


No, it's actually telling ddply to split by the two variables (variable, value),
   if I understand your question correctly.
   The confusion is my fault. I tend to be lazy when running examples and did
   not rename the melt() output to something meaningful. I sometimes forget
   that it's not just me reading the code.
   If you run:
   md1  <-  melt(mydata, id = "Time_zero",
                 variable.name = "xvars",
                 value.name = "aminos")
   ddply(md1, .(xvars, aminos), summarise, sum = sum(Time_zero)/time0total)
   I think it will show what is happening.
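
To see why both grouping variables matter, note that the same letter can occur at several positions (in the sample data, L appears at positions 1 and 4). The base-R sketch below is an illustration with tapply() rather than ddply(), run on a hand-typed long table that borrows the xvars and aminos names from above and covers only positions X1 and X4. Grouping by aminos alone pools L across positions, while grouping by both keeps the positions apart.

```r
# Long-format rows for positions X1 and X4 only, typed from the sample data.
md1 <- data.frame(
  Time_zero = rep(c(0.0050723, 0.0002731, 9.76e-05, 0.0002077), times = 2),
  xvars     = rep(c("X1", "X4"), each = 4),
  aminos    = c("L", "T", "L", "R",   # position 1
                "I", "L", "Q", "L")   # position 4
)
# Total over the four proteins (each protein appears once per position).
time0total <- sum(md1$Time_zero[md1$xvars == "X1"])

# One grouping variable: every "L" is pooled, whatever its position.
by_amino <- tapply(md1$Time_zero, md1$aminos, sum) / time0total

# Two grouping variables: one cell per (position, residue) combination;
# combinations that never occur come back as NA.
by_both <- tapply(md1$Time_zero, list(md1$xvars, md1$aminos), sum) / time0total

print(by_amino)
print(by_both)
```

The pooled value for L is the sum of its two per-position cells, which is why the split needs the position as well as the residue.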



   John Kane
   Kingston ON Canada

   -----Original Message-----
   From: zj29 at cornell.edu
   Sent: Tue, 24 Jul 2012 15:26:52 -0400
   To: gunter.berton at gene.com
   Subject: Re: [R] How to do the same thing for all levels of a column?

   Hi John and Bert,
   Thank you so much for your replies. Both of your scripts worked well, so now
   I've learnt two ways to do it. :)
   Bert: I was not very clear on what I wanted to do. I just would like to
   calculate the residues shown in the table, not all residues. The apply
   functions are amazing!
   John: as I am still digesting the code, I am not sure if I fully understood
   the argument .(variable, value) in the ddply line. The description of ddply
   says that .variables gives the variables to split the data frame by, as quoted
   variables, a formula or character vector. So does .(variable, value) tell R
   to split the data frame by value, which holds the types of amino acid
   residues?
   Thank you all again.
   Cheers,
   Zhao
   2012/7/24 Bert Gunter <gunter.berton at gene.com>

     ... and I neglected to mention that f = myfile[,2]
     Sigh....  More coffee needed.
     -- Bert

   On Tue, Jul 24, 2012 at 9:43 AM, Bert Gunter <bgunter at gene.com> wrote:
   > Sorry. Typo in my previous. Should be:
   >
   >> sapply(myfile[,-c(1,2)],function(x)prop.table(tapply(f,x,sum)))
   > $X1
   >          L          R          T
   > 0.91491320 0.03675651 0.04833030
   >
   > $X2
   >         E         M
   > 0.9827278 0.0172722
   >
   > $X3
   >         N         Y
   > 0.0483303 0.9516697
   >
   > $X4
   >         I         L         Q
   > 0.8976410 0.0850868 0.0172722
   >
   > $X5
   >         I         V
   > 0.9516697 0.0483303
   >
   > $X6
   >          P          S
   > 0.96324349 0.03675651
   >
   > $X7
   >         D         E         G
   > 0.8976410 0.0540287 0.0483303
   >
   > $X8
   >         A         C
   > 0.9827278 0.0172722
   >
   >
   >
   > On Tue, Jul 24, 2012 at 9:37 AM, Bert Gunter <bgunter at gene.com> wrote:
   >> OK, I admit it: I re-read what you wrote and now I'm confused. Is:
   >>
   >>> sapply(myfile[,-c(1,2)],function(x)prop.table(tapply(f,x)))
   >>
   >>             X1       X2        X3       X4     X5  X6    X7  X8
   >> [1,] 0.1428571 0.2 0.2857143 0.125 0.2 0.2 0.125 0.2
   >> [2,] 0.4285714 0.2 0.1428571 0.250 0.4 0.2 0.375 0.2
   >> [3,] 0.1428571 0.4 0.2857143 0.375 0.2 0.2 0.250 0.4
   >> [4,] 0.2857143 0.2 0.2857143 0.250 0.2 0.4 0.250 0.2
   >>
   >> what you want?
   >>
   >> -- Bert
   >> On Tue, Jul 24, 2012 at 9:17 AM, Bert Gunter <bgunter at gene.com> wrote:
   >>> The OP's request is a bit ambiguous to me: at a given residue, do you
   >>> wish to calculate the proportions for only those amino acids that
   >>> appear at that residue, or do you wish to include the proportions for
   >>> all amino acids, some of which might then be 0?
   >>>
   >>> Assuming the former, then I don't think one needs to go to the lengths
   >>> described by John below.
   >>>
   >>> Using your example (thanks!), the following seems to suffice:
   >>>
   >>>> sapply(myfile[,-c(1,2)],function(x)prop.table(table(x)))
   >>>
   >>> $X1
   >>> x
   >>>    L    R    T
   >>> 0.50 0.25 0.25
   >>>
   >>> $X2
   >>> x
   >>>    E    M
   >>> 0.75 0.25
   >>>
   >>> $X3
   >>> x
   >>>    N    Y
   >>> 0.25 0.75
   >>>
   >>> $X4
   >>> x
   >>>    I    L    Q
   >>> 0.25 0.50 0.25
   >>>
   >>> $X5
   >>> x
   >>>    I    V
   >>> 0.75 0.25
   >>>
   >>> $X6
   >>> x
   >>>    P    S
   >>> 0.75 0.25
   >>>
   >>> $X7
   >>> x
   >>>    D    E    G
   >>> 0.25 0.50 0.25
   >>>
   >>> $X8
   >>> x
   >>>    A    C
   >>> 0.75 0.25
   >>>
   >>>
   >>> This could, of course, then be modified to add zero proportions for
   >>> all non-appearing amino acids.
   >>>
   >>> -- Cheers,
   >>> Bert
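
One way that modification might look, sketched under assumptions: each position is re-coded as a factor over the 20 standard one-letter amino acid codes (the vector aa20 is defined here, not taken from the thread), so table() reports a zero for every residue that does not occur. Only positions 1 and 2 of the sample data are retyped, to keep the example short, and the proportions are plain counts as in the table() version above, not weighted by Time_zero.

```r
# The 20 standard amino acids, one-letter codes (an assumption of this sketch).
aa20 <- c("A", "C", "D", "E", "F", "G", "H", "I", "K", "L",
          "M", "N", "P", "Q", "R", "S", "T", "V", "W", "Y")

# Positions 1 and 2 of the sample data, typed by hand.
positions <- data.frame(
  X1 = c("L", "T", "L", "R"),
  X2 = c("E", "E", "M", "E"),
  stringsAsFactors = FALSE
)

# Fixing the factor levels makes every column report all 20 residues,
# so absent residues show up with proportion 0 instead of being dropped.
props <- sapply(positions,
                function(x) prop.table(table(factor(x, levels = aa20))))

# props is a 20 x 2 matrix: rows are residues, columns are positions.
print(props[c("E", "L", "M", "R", "T", "W"), ])
```

Because every column now has the same length, sapply() returns a matrix rather than a list, which is often easier to compare across positions.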