lalitha viswanath
2006-May-01 02:58 UTC
[R] table of means/medians across bins used for a histogram
Hi,

I am trying to get a table of means of parameter 1 across bins of parameter 2.

I am working in proteomics, and a sample of my data is as follows:

cluster-age   clock-rate (evolutionary rate)   scopclass
0.002         10                               A
0.045         0.1                              B
0.13          15                               A
0.15          34                               D
....

Scop class has only 9 distinct categories (A-I), whereas cluster-age and clock-rate are discrete variables greater than 0.

I am trying to do two things with this kind of data, of which I managed to accomplish one thanks to the documentation and earlier queries on the mailing lists.

1. Plot a histogram of the age distribution with the scop class category superimposed on each bin. I managed to do this with barplot2.
2. Plot a scatter plot of age vs. clock-rate. To reduce possible sampling errors, we want to average the clock-rate within each of the age bins used above, i.e. before plotting, I wish to compute the average clock-rate in each age bin and then make an x-y plot of age vs. clock-rate.

Can anyone point me to appropriate functions for this? I have been trying to work with prop.table, cut, break, etc., but I am not getting anywhere.

Thanks,
Lalitha
Gabor Grothendieck
2006-May-01 03:15 UTC
[R] table of means/medians across bins used for a histogram
My understanding is that you want to replace each rate with its average over the associated bin and then plot age against that. In that case try this:

  > DF # test data
      age rate bin
  1 0.002 10.0   A
  2 0.045  0.1   B
  3 0.130 15.0   A
  4 0.150 34.0   D
  > with(DF, plot(ave(rate, bin), age))

If the columns are instead stored in separate vectors named age, rate and bin, we would have:

  plot(ave(rate, bin), age)
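To make the behaviour of ave() concrete, here is a minimal sketch that rebuilds the four-row test data from this reply and compares ave(), which returns one value per row, with tapply(), which returns one value per class. The names rate_avg and class_means are assumptions introduced only for illustration.

```r
# Rebuild the four-row test data from the reply.
DF <- data.frame(
  age  = c(0.002, 0.045, 0.130, 0.150),
  rate = c(10, 0.1, 15, 34),
  bin  = c("A", "B", "A", "D")
)

# ave() returns a vector as long as its input: each rate is replaced
# by the mean rate of its class (mean is the default FUN).
DF$rate_avg <- ave(DF$rate, DF$bin)   # rate_avg is a hypothetical column name

# tapply() instead returns one mean per class, i.e. a small table of means.
class_means <- tapply(DF$rate, DF$bin, mean)

print(DF)
print(class_means)
```

Rows 1 and 3 share class A, so both receive the same averaged rate, while classes B and D each contain a single row and keep their original values.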
Gabor Grothendieck
2006-May-01 03:35 UTC
[R] table of means/medians across bins used for a histogram
Or perhaps a bit simpler:

  plot(age ~ ave(rate, bin), DF)
lalitha viswanath
2006-May-01 15:56 UTC
[R] table of means/medians across bins used for a histogram
Hi,

I think I phrased my question incorrectly. I want an x-y plot of age vs. rate (the bin is irrelevant for this plot), except that instead of a simple x-y plot, I want a plot of average(rate) for each age interval.

My ages vary from 0 to 0.7 and I want to divide them into groups of 0.02.

So I want a plot of the following:

Age interval   Average rate in that interval
0-0.02         5
0.02-0.04      7
0.04-0.06      1
0.06-0.08      0
0.08-0.1       0.15

The age intervals should appear along the x-axis (as for a histogram), with the rate plotted for each age interval.
Gabor Grothendieck
2006-May-01 16:50 UTC
[R] table of means/medians across bins used for a histogram
I assume you want to discretize one column and then, for each level produced, calculate the mean of another column and plot those means against the levels.

Using the built-in iris data frame, discretize Sepal.Width, producing the factor SWfac, and calculate SLmean, the mean Sepal.Length for each level of that factor. Then plot using a custom x axis:

  SWfac <- cut(iris$Sepal.Width, seq(2, 4.4, .5))
  SLmean <- tapply(iris$Sepal.Length, SWfac, mean)
  plot(SLmean, xaxt = "n")
  axis(1, seq(SLmean), levels(SWfac))
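The same cut()/tapply() approach can be sketched for the age and rate setup described in the thread (ages between 0 and 0.7, bins of width 0.02). The data below are simulated stand-ins, not the poster's values, and the names age_bin, rate_mean, rate_median and binned are assumptions introduced for illustration. Since the subject line also mentions medians, the sketch computes both.

```r
# Simulated stand-in data; the real ages and rates are not available here.
set.seed(1)
age  <- runif(200, min = 0, max = 0.7)
rate <- rexp(200, rate = 0.1)

# Bin edges every 0.02 from 0 to 0.7. include.lowest = TRUE closes the
# first interval on the left so that an age of exactly 0 would be kept.
breaks  <- seq(0, 0.7, by = 0.02)
age_bin <- cut(age, breaks = breaks, include.lowest = TRUE)

# Mean and median rate per age interval (NA for any empty interval).
rate_mean   <- tapply(rate, age_bin, mean)
rate_median <- tapply(rate, age_bin, median)

# Collect the results as a table of intervals, counts and summaries.
binned <- data.frame(
  interval = levels(age_bin),
  n        = as.vector(table(age_bin)),
  mean     = as.vector(rate_mean),
  median   = as.vector(rate_median)
)
print(head(binned))

# Plot the means against the intervals, labelling the x axis by interval.
plot(rate_mean, xaxt = "n", xlab = "age interval", ylab = "mean rate")
axis(1, at = seq_along(rate_mean), labels = names(rate_mean),
     las = 2, cex.axis = 0.6)
```

Note that cut() uses intervals closed on the right by default, so values that fall outside the breaks, or exactly on the lowest break when include.lowest is left at FALSE, become NA and are silently dropped by tapply(). Checking that sum(binned$n) equals the number of observations is one way to catch this.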