Dear All,

Apologies if this is too simple for this list.

Let us assume that you have an instrument measuring particle size distributions. The output is a set of counts {n_i} corresponding to a set of average sizes {d_i}. The {d_i} range from d_i_min to d_i_max and are spaced either linearly or logarithmically. No further information about the distribution of the measured sizes is available, but at least you know enough to plot n(d_i) (number of counts as a function of particle size).

If you can fit the {n_i} to a known distribution (e.g. normal or lognormal), then you can choose a new set of average sizes, {D_i}, and plot the corresponding n_i(D_i). But what if the observed {n_i} do not follow a known distribution and you still want to calculate n(D_i)?

Off the top of my head, I think that whatever I do must conserve the original total number of observations, N=\sum_i{n_i}, but this does not constrain the problem very much.

Any suggestion is welcome.

Many thanks

Lorenzo
Hi Lorenzo,

I think it would be better if you provided a few example datasets/tables. Right now, I can't pin down exactly what your problem is.

When binning data, the cut() function tends to be very useful. To fit common univariate distributions to a given dataset, you should take a look at the fitdistr() function in the MASS package.

If this doesn't answer your question, please try to explain in detail how your problem relates to R.

Best of luck,

Luc
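For readers unfamiliar with the two functions mentioned above, here is a minimal sketch of how they might be used together. It assumes that raw individual sizes are available, which is not the case in the original question (only binned counts are); the simulated `diam` vector, the number of bins, and the lognormal choice are all made up for illustration.

```r
library(MASS)  # provides fitdistr()

set.seed(1)
# Hypothetical raw particle diameters (arbitrary units); not data from the thread
diam <- rlnorm(500, meanlog = 0, sdlog = 0.5)

# Ten logarithmically spaced size classes; the range is padded slightly so
# that the smallest and largest values fall strictly inside the breaks
rng <- range(log(diam))
breaks <- exp(seq(rng[1] - 1e-8, rng[2] + 1e-8, length.out = 11))

# cut() assigns each diameter to a class; table() turns that into counts
bins <- cut(diam, breaks = breaks)
counts <- table(bins)
print(counts)

# fitdistr() fits a lognormal by maximum likelihood to the raw sizes
fit <- fitdistr(diam, "lognormal")
print(fit$estimate)
```

Note that fitdistr() works on the individual observations, not on the binned counts, so it does not apply directly when only {n_i} and {d_i} are known.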
Hi Lorenzo,

You should probably be aware that both the position and the spacing of category boundaries can have a large effect on tests of location parameters carried out on the categorized data. See:

Wainer, H., Gessaroli, M. & Verdi, M. (2006) Finding what is not there through the unfortunate binning of results: The Mendel effect. Chance, 19(1): 49-52.

Lemon, J. On the perils of categorizing responses. Tutorials in Quantitative Methods for Psychology, 5(1): 35-39.

Jim
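As an illustration of the count-conserving constraint raised in the original question, the sketch below moves binned counts onto a new set of bins without fitting any distribution. It is only one possible approach and rests on assumptions not stated in the thread: that bin edges are known (or can be guessed, e.g. as midpoints between consecutive d_i), and that counts are spread uniformly within each original bin. The function name `rebin_counts` and all numbers are hypothetical.

```r
# Redistribute binned counts onto new bin edges by linearly interpolating
# the cumulative count (i.e. assuming counts are uniform within each old bin).
rebin_counts <- function(old_edges, counts, new_edges) {
  stopifnot(length(old_edges) == length(counts) + 1)
  cum <- c(0, cumsum(counts))  # cumulative count at each old edge
  # rule = 2 holds the cumulative count constant outside the old range
  cum_new <- approx(old_edges, cum, xout = new_edges, rule = 2)$y
  diff(cum_new)  # counts falling in each new bin
}

old_edges <- seq(0, 10, by = 1)
n_old <- c(2, 5, 12, 20, 25, 18, 10, 5, 2, 1)  # made-up counts
new_edges <- seq(0, 10, by = 2.5)

n_new <- rebin_counts(old_edges, n_old, new_edges)
print(n_new)
# The total is conserved as long as the new edges span the old range
print(isTRUE(all.equal(sum(n_new), sum(n_old))))
```

The total N is conserved by construction, but, in line with the caveat above, the shape of the rebinned histogram still depends on where the new boundaries are placed and on the within-bin uniformity assumption.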