I've been asked in private,
(and am replying BCC to the asker),

>> I saw your post on the R-help archives page about the possibility of
>> porting a function from S-Plus called peaks() to R. I am looking for
>> some way to locate peaks in a simple x,y data set, and thought that R
>> might be the way to go.

"of course" it is the way to go, don't get lost be going
somewhere else :-)
and try
    install.packages("fortune")
    fortune("go with R")

>> Any ideas would be a great help,

Using
    RSiteSearch("peaks")
gives too many hits, among which those you can get by the more
advanced (regular expression) call
    RSiteSearch("/peaks\\b.*\\bfunction/")
where in the 2nd hit,
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/33097.html
Petr Pikal gives a simple peaks() function, originally by Brian Ripley,
which is using embed() and max.col() smartly.

I wonder if we shouldn't polish that a bit and add to R's
standard 'utils' package.

Martin Maechler, ETH Zurich
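
A minimal sketch of how embed() and max.col() can combine to find local
maxima, assuming a window of width 3. The example vector and the variable
name centre_is_max are illustrative only; ties.method = "first" is set so
that the result is reproducible.

```r
x <- c(1, 4, 1, 1, 6, 1, 5, 1, 1)

# embed() turns the series into a matrix of sliding windows of width 3.
# Each row is one window, stored with the most recent value in column 1.
z <- embed(x, 3)

# max.col() returns, for every row, the column that holds the maximum.
# The centre of a width-3 window is column 2.
centre_is_max <- max.col(z, ties.method = "first") == 2

# Row i is centred on x[i + 1], so shift by one to get positions in x.
which(centre_is_max) + 1   # 2 5 7
```

The shift by one at the end is needed because the first window is centred
on the second element of x.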
> I wonder if we shouldn't polish that a bit and add to R's
> standard 'utils' package.

Hm, I figured out there are (at least) two versions out there, one
being the "original" idea and the other a modification.

=== Petr Pikal in 2001 (based on Brian Ripley's idea) ===
peaks <- function(series, span=3) {
    z <- embed(series, span)
    result <- max.col(z) == 1 + span %/% 2
    result
}

versus

=== Petr Pikal in 2004 ===
peaks2 <- function(series, span=3) {
    z <- embed(series, span)
    s <- span %/% 2
    v <- max.col(z) == 1 + s
    result <- c(rep(FALSE, s), v)
    result <- result[1:(length(result) - s)]
    result
}

Comparison shows

> peaks(c(1,4,1,1,6,1,5,1,1),3)
[1]  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE

which is a logical vector for elements 2:(N-1), and

> peaks2(c(1,4,1,1,6,1,5,1,1),3)
[1] FALSE  TRUE FALSE FALSE  TRUE FALSE  TRUE

which is a logical vector for elements 1:(N-2).

As I would expect to "lose" (span-1)/2 elements on each side
of the vector, to me the 2001 version feels more natural.

Also, both "suffer" from being non-deterministic in the
multiple-maxima case (the two 4s here), because max.col() breaks
ties at random by default:

> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1]  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
> peaks(c(1,4,4,1,6,1,5,1,1),3)
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE

which also persists for span > 3 (without the 6 then, of course):

> peaks(c(1,4,4,1,1,1,5,1,1),5)
[1]  TRUE FALSE FALSE FALSE  TRUE
> peaks(c(1,4,4,1,1,1,5,1,1),5)
[1] FALSE FALSE FALSE FALSE  TRUE
> peaks(c(1,4,4,1,1,1,5,1,1),5)
[1]  TRUE FALSE FALSE FALSE  TRUE

This could (should?) be fixed by modifying the call to max.col():

    result <- max.col(z, "first") == 1 + span %/% 2

Just my two cents,
Marc

-- 
=======================================================
Dipl. Inform. Med.
Marc Kirchner
Interdisciplinary Centre for Scientific Computing (IWR)
Multidimensional Image Processing
INF 368
University of Heidelberg
D-69120 Heidelberg
Tel: ++49-6221-54 87 97
Fax: ++49-6221-54 88 50
marc.kirchner at iwr.uni-heidelberg.de
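
The tie-breaking fix can be sketched as follows. The name peaks_first is
hypothetical; the body is the 2001 peaks() with ties.method = "first"
passed to max.col(), so repeated calls on the same input agree.

```r
# Hypothetical name; same logic as the 2001 peaks(), with ties broken
# deterministically instead of at random.
peaks_first <- function(series, span = 3) {
  z <- embed(series, span)
  max.col(z, ties.method = "first") == 1 + span %/% 2
}

x <- c(1, 4, 4, 1, 6, 1, 5, 1, 1)
r1 <- peaks_first(x, 3)
r2 <- peaks_first(x, 3)
identical(r1, r2)      # TRUE: repeated calls agree

# Element i of the result refers to x[i + span %/% 2].
which(r1) + 3 %/% 2    # 3 5 7
```

Because embed() stores each window in reverse order, "first" resolves the
tie between the two 4s in favour of the later one, x[3]. Whether a plateau
should count as a peak at all is a separate design decision that this
sketch does not address.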
>>>>> "MM" == Martin Maechler <maechler at stat.math.ethz.ch>
>>>>>     on Wed, 23 Nov 2005 14:35:14 +0100 writes:

    MM> I've been asked in private,
    MM> (and am replying BCC to the asker),

    >>> I saw your post on the R-help archives page about the possibility of
    >>> porting a function from S-Plus called peaks() to R. I am looking for
    >>> some way to locate peaks in a simple x,y data set, and thought that R
    >>> might be the way to go.

    MM> "of course" it is the way to go, don't get lost be going
    MM> somewhere else :-)
    MM> and try
    MM>     install.packages("fortune")
    MM>     fortune("go with R")

Ouch! Two mistakes in such a short section... (thanks, Andy!)
Instead, it should have been

    >>> ..... and thought that R might be the way to go.

    "of course" it is the way to go, don't get lost by going
    somewhere else :-)
    and try
        install.packages("fortunes")
        library(fortunes)
        fortune("go with R")

Martin
>> I am looking for some way to locate peaks in a simple x,y data set.

See my 'msc.peaks.find' function in 'caMassClass', it has a simple peak
finding algorithm.

Jarek Tuszynski
Hi Marc

I use this function for finding maxima in some spectral data (e.g. from
X-ray diffraction) and it has satisfied my needs. The function itself was
probably modified for reasons to do with plotting my data, so that it
dropped values from the end rather than from both sides. Peaks in those
cases are different from occasional noise spikes, so I did not notice
this bug.

Thanks for your suggestion.

Best regards.
Petr

On 23 Nov 2005 at 14:33, Marc Kirchner wrote:

> Also, both "suffer" from being non-deterministic in the
> multiple-maxima-case (the two 4s here)
> [...]
> This could (should?) be fixed by modifying the call to max.col()
> result <- max.col(z, "first") == 1 + span %/% 2;

Petr Pikal
petr.pikal at precheza.cz
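
For plotting, it can be convenient to have a result of the same length as
the input. One possible sketch, assuming an odd span, pads both edges with
FALSE instead of dropping values from one end. The name peaks_padded and
the argument checks are hypothetical and are not taken from either
version discussed above.

```r
# Hypothetical helper: flag peaks in a logical vector of the same length
# as `series`, with FALSE at both edges where no full window exists.
peaks_padded <- function(series, span = 3) {
  stopifnot(span %% 2 == 1, length(series) >= span)
  s <- span %/% 2
  z <- embed(series, span)
  v <- max.col(z, ties.method = "first") == 1 + s
  c(rep(FALSE, s), v, rep(FALSE, s))
}

x <- c(1, 4, 1, 1, 6, 1, 5, 1, 1)
p <- peaks_padded(x)
length(p) == length(x)   # TRUE
x[p]                     # 4 6 5
```

With this alignment the result can index the original data directly, for
example points(which(p), x[p]) on an existing plot of x.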
Try,

    # work directly with data from the input files
    directory = system.file("Test", package = "caMassClass")
    X = msc.rawMS.read.csv(directory, "IMAC_normal_.*csv")
    Peaks = msc.peaks.find(X) # Find Peaks
    cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")
    stopifnot( nrow(Peaks)==424 )

on my data, to see that everything works OK. Then I would convert your
"input.dat" to CSV format:

    2.00, 233
    2.04, 220
    ...
    11.60, 540
    12.00, 600   <-- a peak!
    12.04, 450
    ...

On a Windows machine, you can do this by opening your file in Excel and
saving it as CSV, or possibly by using a text editor to replace ' ' with
', '. Then the script

    X = msc.rawMS.read.csv('.', "Input.csv")
    Peaks = msc.peaks.find(X)
    cat(nrow(Peaks), "peaks were found in", Peaks[nrow(Peaks),2], "files.\n")

should work. Another way is to try:

    X = read.table("input.dat", header=TRUE)
    Y = as.matrix(X[,2])   # a matrix, so that row names can be set
    rownames(Y) = signif(X[,1], 6)
    Peaks = msc.peaks.find(Y)

which casts your data into the correct format, described in the
documentation as:
"Spectrum data either in matrix format [nFeatures x nSamples] or in 3D
array format [nFeatures x nSamples x nCopies]. Row names (rownames(X))
store M/Z mass of each row."

I hope one of those solutions works for you. Good luck.

Jarek Tuszynski

-----Original Message-----
From: dylan.beaudette at gmail.com [mailto:dylan.beaudette at gmail.com]
Sent: Wednesday, November 23, 2005 5:47 PM
To: r-help at stat.math.ethz.ch
Cc: Tuszynski, Jaroslaw W.
Subject: Re: [R] finding peaks in a simple dataset with R

On Wednesday 23 November 2005 10:15 am, Tuszynski, Jaroslaw W. wrote:
> >> I am looking for some way to locate peaks in a simple x,y data set.
>
> See my 'msc.peaks.find' function in 'caMassClass', it has a simple
> peak finding algorithm.
>
> Jarek Tuszynski

Jarek,

Thanks for the tip. I was able to install the caMassClass package and all
of its dependencies. In addition, I was able to run the examples on the
manual pages. However, the format of the input data to the
'msc.peaks.find' function is not apparent to me.

In its simplest form, my data looks something like this:

    2.00 233
    2.04 220
    ...
    11.60 540
    12.00 600   <-- a peak!
    12.04 450
    ...

Here is an example R session, trying out the function you suggested:

    #importing my data like this:
    X <- read.table("input.dat", header=TRUE)

    #from the example:
    Peaks = msc.peaks.find(X)

    #errors with:
    Error in sort(x, partial = unique(c(lo, hi))) :
            'x' must be atomic

Also: I have tried one of the functions ('getPeaks') listed on the
'msc.peaks.find' manual page; however, I am still having a problem with
the format of my data vs. what the function is expecting.

    #importing my data like this:
    X <- read.table("input.dat", header=TRUE)

    #setup an output file for peak information
    peakfile <- paste("peakinfo.csv", sep="/")

    #run the analysis:
    getPeaks(X,peakfile)

    #errors with:
    Error in area/max(area) : non-numeric argument to binary operator
    In addition: Warning message:
    no finite arguments to max; returning -Inf

Any ideas would be greatly appreciated!

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
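
The "'x' must be atomic" error suggests that the function received the
whole data frame rather than a numeric vector or matrix. As a base-R
alternative, the simple peaks() function discussed earlier in this thread
could be applied to the y column alone. This is only a sketch: the data
frame below holds made-up values in the same two-column layout, the column
names x and y are assumptions, and ties.method = "first" is added to keep
the result reproducible.

```r
# Made-up values in the same two-column layout as the data in the message.
dat <- data.frame(x = c(2.00, 2.04, 11.60, 12.00, 12.04),
                  y = c(233, 220, 540, 600, 450))

# The 2001 peaks() from this thread, with ties broken deterministically.
peaks <- function(series, span = 3) {
  z <- embed(series, span)
  max.col(z, ties.method = "first") == 1 + span %/% 2
}

# Pass the numeric y column, not the whole data frame.
is_peak <- peaks(dat$y, span = 3)

# Element i of the result refers to row i + 1 of the data (span = 3).
dat[which(is_peak) + 1, ]   # the row with x = 12.00, y = 600
```

With real data, a larger span may be needed so that small fluctuations
from noise are not reported as peaks.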