thr3ads.net - R help - [R] Fitting a distribution to peaks in histogram [Jul 2006]

If this information is useful, please help other people find it:
Share via:

Ulrik Stervbo

2006-Jul-19 09:58 UTC

[R] Fitting a distribution to peaks in histogram

Hello list!

I would like to fit a distribution to each of the peaks in a histogram, such
as this: http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.png
.

The peaks are identified using Petr Pikal peaks function (
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html), but after that
I am quite stuck.

Any idea as to how I can:
Fit a distribution to each peak
Integrate the area between each two peaks, using the means and widths of the
distributions fitted to the two peaks. I will be using the integrate
function

The histogram is based on approximately 15000 events, which makes Mclust and
pam (which both delivers the information I need) less useful.

The whole point of this exercise is to find the percentage of cells in peak
1, 2, 3, and so on, and between peak 1-2, peak 2-3, peak 3-4 and so on.
Having more that 6 peaks does not appears likely.

I am quite new to R and apologise if the solution is fairly basic.

Thank you in advance for any help and suggestions

Sincerely,
Ulrik

	[[alternative HTML version deleted]]

Petr Pikal

2006-Jul-19 12:54 UTC

head link

[R] Fitting a distribution to peaks in histogram

Hi

There are some packages for mass spectra processing (spectrino, 
caMassClass). I did not use them so I do not know how they suit your 
needs.

However you can compute area (integrate) by these functions

# uses information interactively from plot(x,y)
# first it replots data between corners *replot(x,y)*
# then it computes sum between x axis and y values - osum -
# and between "baseline" and y values - cista - based
# on locator positions

integ<-function (x,y)
{
replot(x,y)
meze<-locator(2)
dm<-meze$x[1]
hm<-meze$x[2]
abline(v=c(dm,hm),col=2)
vyber<-x<=hm&x>=dm
f3 <- splinefun(x, y)
osum<-integrate(f3, dm, hm)$value
o1<-(y[x==min(x[vyber])]+y[x==max(x[vyber])])*(max(x[vyber])-
min(x[vyber]))/2
cista<-osum-o1
return(c(osum,cista))
}

# similar as integ but you has to supply upper and lower limits (dm, 
# hm) manually if you do not want to perform "integration" of whole # 
area under the curve.


integ1<-function (x,y,dm=-Inf,hm=+Inf)
{
ifelse(dm==-Inf, dm<-min(x), dm<-dm)
ifelse(hm==+Inf, hm<-max(x), hm<-hm)
vyber<-x<=hm&x>=dm
f3 <- splinefun(x, y)
osum<-integrate(f3, dm, hm)$value
o1<-(y[x==min(x[vyber])]+y[x==max(x[vyber])])*(max(x[vyber])-
min(x[vyber]))/2
cista<-osum-o1
return(c(osum,cista))
}



On 19 Jul 2006 at 11:58, Ulrik Stervbo wrote:

Date sent:      	Wed, 19 Jul 2006 11:58:38 +0200
From:           	"Ulrik Stervbo" <ulriks at ruc.dk>
To:             	r-help at stat.math.ethz.ch
Subject:        	[R] Fitting a distribution to peaks in histogram
> Hello list!
> 
> I would like to fit a distribution to each of the peaks in a
> histogram, such as this:
> http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.pn
> g .
> 
> The peaks are identified using Petr Pikal peaks function (
> http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html), but after
> that I am quite stuck.
> 
> Any idea as to how I can:
> Fit a distribution to each peak
> Integrate the area between each two peaks, using the means and widths
> of the distributions fitted to the two peaks. I will be using the
> integrate function
> 
> The histogram is based on approximately 15000 events, which makes
> Mclust and pam (which both delivers the information I need) less
> useful.
> 
> The whole point of this exercise is to find the percentage of cells in
> peak 1, 2, 3, and so on, and between peak 1-2, peak 2-3, peak 3-4 and
> so on. Having more that 6 peaks does not appears likely.
> 
> I am quite new to R and apologise if the solution is fairly basic.
> 
> Thank you in advance for any help and suggestions
> 
> Sincerely,
> Ulrik
> 
>  [[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide commented,
> minimal, self-contained, reproducible code.
Petr Pikal
petr.pikal at precheza.cz

hadley wickham

2006-Jul-19 16:20 UTC

head link

[R] Fitting a distribution to peaks in histogram

> I would like to fit a distribution to each of the peaks in a histogram,
such
> as this:
http://photos1.blogger.com/blogger/7029/2724/1600/DU145-Bax3-Bcl-xL.png
As a first shot, I'd try fitting a mixture of gamma distributions (say
3), plus a constant term for the highest bin.  You could do this using
ML.  If the number of peaks is truly unknown, this will be a little
trickier but still possible and you could use the LRT to chose between
them.
> Integrate the area between each two peaks, using the means and widths of
the
> distributions fitted to the two peaks. I will be using the integrate
> function
Why do you want to do this?
>
> The histogram is based on approximately 15000 events, which makes Mclust
and
> pam (which both delivers the information I need) less useful.
If you have unbinned data, it would be better (more precise/powerful)
to use that.

Regards,

Hadley

Charles C. Berry

2006-Jul-20 17:03 UTC

head link

[R] Fitting a distribution to peaks in histogram

On Wed, 19 Jul 2006, Ulrik Stervbo wrote:

[much deleted]
>
> I am measureing the amount of DNA in cells, and I need to know the
> percentage of cells in a part of the cell cycle; that the percentage
> of cells in the first peak, in the second peak and so on. I want to
> integrate the area between to two cells, because that apparently is
> how its none (as far as I can tell from the literature)
>
I do not doubt that that is how some do it, but a better approach is takes 
account of errors in the values used to form the histogram.

This is reviewed in

T. Lynn Eudey
Statistical Considerations in DNA Flow Cytometry
Statistical Science 1996, Vol. 11, No. 4, 320{334

A classic approach for this purpose is

John Mendelsohn; John Rice
Deconvolution of Microfluorometric Histograms with B Splines
Journal of the American Statistical Association. Vol. 77, No. 380, 1982.

A FORTRAN program is mentioned there and the authors say that they will 
provide listings on request (but that WAS a long time ago).

I don't know if this was ever implemented in S/R, but an email to 
Professor John Rice (UC Berkeley Dept of Stats) might help.

HTH,

Chuck

Charles C. Berry                        (858) 534-2098
                                          Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	         UC San Diego
http://biostat.ucsd.edu/~cberry/         La Jolla, San Diego 92093-0717

Maybe Matching Threads

Search for more reasonably related threads

R help - Jul 2006 - Fitting a distribution to peaks in histogram

[R] Fitting a distribution to peaks in histogram

[R] Fitting a distribution to peaks in histogram

[R] Fitting a distribution to peaks in histogram

[R] Fitting a distribution to peaks in histogram

Maybe Matching Threads