---------- Forwarded message ---------- From: niharika singhal <niharikasinghal1990 at gmail.com> Date: Sun, Aug 27, 2017 at 11:57 AM Subject: Re: Find maxima of a function To: "David Winsemius [via R]" <ml+s789695n4745009h56 at n4.nabble.com>, "Ismail SEZEN [via R]" <ml+s789695n4744993h60 at n4.nabble.com>, Ulrik Stervbo <ulrik.stervbo at gmail.com> Dear David and Ismail, The actual problem is I am getting the parameters from the Kmeans cluster on the data set obtained from the mclust package. In mclust method I am changing the value of G according to user input, so the number of means, sigma and the coefficien mixture I will get is not fixed, It can be 3 or 4or5 or 10. It will all depend on the user. Then on the result of kmeans, I wanted to find the maxima so that I can use Normalized probability formula further for my logic and in order to do that I used Newton's method and that was implemented in the optimr package. I was getting the wrong result so I used distr package in order to check the problem and I figured out I was getting the wrong maxima. So I can to the conclusion which I have posted as my query. PS: I am able to use distr package without any problem. And I want to use dnorm because I have a Gaussian mixture model, so I did not want to use rnorm. I hope you get what my problem is now. I will try both of your solution, which uses the same function and sees if it helps me. I am struck at this point from almost 4 weeks and it is delaying my work a lot. I really appreciate all your help Thanks & Regards Niharika SInghal On Sat, Aug 26, 2017 at 10:37 PM, David Winsemius [via R] < ml+s789695n4745009h56 at n4.nabble.com> wrote:> > > On Aug 26, 2017, at 12:13 PM, Ismail SEZEN <[hidden email] > <http:///user/SendEmail.jtp?type=node&node=4745009&i=0>> wrote: > > > > > >> On 26 Aug 2017, at 16:39, niharika singhal <[hidden email] > <http:///user/SendEmail.jtp?type=node&node=4745009&i=1>> wrote: > >> > >> Hi, > >> > >> Thanks for your mail, and time > >> > >> It is not working for some arguments, when mean value is like >6. > >> > >> > >> case > >> > >> mc0 <- c(0.08844446,0.1744455,0.1379778,0.1209769,0.1573065,0. > >> 1134463,0.2074027) > >> > >> rv <-UnivarMixingDistribution(Norm(486.4255, 53.24133), > >> > >> Norm(664.0713, 3.674773), > >> > >> Norm(669.0484, 4.101381), > >> > >> Norm(677.1753, 4.869985), > >> > >> Norm(683.2635, 7.288175), > >> > >> Norm(727.6229, 37.64198), > >> > >> Norm(819.2011, 57.06655), > >> > >> mixCoeff=mc0/sum(mc0)) > >> > >> plot(rv, to.draw.arg="d") > >> > >> > >> I am getting 731.1345 from the code you have provide > >> > >> > >> It is part of a code, so it was difficult to write a reproducible code > >> > >> I have tried to use optimr but it gives me the local maxima, now I am > >> struck with the problem of how to get the global maxima > >> > > > > This is basically an optimization problem so it?s nothing to do with > distr package and UnivarMixingDistribution function. Also I?m not able to > install distr package because of an error. > > > str(rv) > Formal class 'AbscontDistribution' [package "distr"] with 12 slots > ..@ gaps : num [1:2, 1:2] 259 1037 295 1063 > ..@ img :Formal class 'Reals' [package "distr"] with 2 slots > .. .. ..@ dimension: num 1 > .. .. ..@ name : chr "Real Space" > ..@ param : NULL > ..@ r :function (n) > ..@ d :function (x, log = FALSE) > ..@ p :function (q, lower.tail = TRUE, log.p = FALSE) > ..@ q :function (p, lower.tail = TRUE, log.p = FALSE) > ..@ .withSim : logi FALSE > ..@ .withArith : logi TRUE > ..@ .logExact : logi FALSE > ..@ .lowerExact: logi FALSE > ..@ Symmetry :Formal class 'NoSymmetry' [package "distr"] with 2 slots > .. .. ..@ type : chr "non-symmetric distribution" > .. .. ..@ SymmCenter: NULL > > str(density(rv)) > Error in density.default(rv) : argument 'x' must be numeric > > str(density(rv at r(1000)) > + ) > List of 7 > $ x : num [1:512] 297 298 300 301 302 ... > $ y : num [1:512] 4.39e-07 6.40e-07 9.15e-07 1.29e-06 1.78e-06 ... > $ bw : num 10.5 > $ n : int 1000 > $ call : language density.default(x = rv at r(1000)) > $ data.name: chr "rv at r(1000)" > $ has.na : logi FALSE > - attr(*, "class")= chr "density" > > > > max( density(rv at r(10000))$y ) > [1] 0.02133529 > > which.max( density(rv at r(10000))$y ) > [1] 259 > > > > my.instance <- rv at r(10000) # need to creat an immutable value since > this is a pseuado-random number series. > > which.max( density(my.instance)$y ) # location of x and y values for > global max > [1] 263 > > density(my.instance)$x[ which.max( density(my.instance)$y )] # x value > that yields global max > [1] 669.8586 > > density(my.instance)$y[ which.max( density(my.instance)$y )] value at > max. > [1] 0.02035133 > > > > > > > an you create an example by rnorm function? or use dput to share x and y > outputs of UnivarMixingDistribution function and I can look for the issue. > > > > Also You might have multiple maximas has same maximum y-values (for > instance, y = sin(x)) and this definition is relevant to defined interval. > For instance again, we can talk about a global extremum for a function like > y = ax^2+bx + c. We know we will have only a single extremum point (maxima > or minima). For higher order functions, we can talk about only local > maximas. So, I assume you want to obtain maximum one of this local maximas, > right? > > > > If yes, why don?t we find the maximum y-value and corresponding x-value > as follows? > > > > y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5) > > x <- 1:length(y) > > > > fun <- splinefun(x = x, y = y, method = "n") > > x2 <- seq(1, max(x), 0.1) > > y2 <- fun(x2) > > plot(x, y, type = "l") > > lines(x2, y2, col = "red") > > > > max.x <- optimize(fun, interval = range(x), maximum = TRUE) > > print(max.x) # x coordinate of global maximum of y by spline and > optimize > > x[which(y == max(y))] # global maximum of dicrete x-y vectors > > > > > > spline function uses cubic spline method to obtain the undefined values > in a discrete series and optimize function calculates EXACT LOCATION of the > extremum. I suspect we have a communication failure :) > > > > ______________________________________________ > > [hidden email] <http:///user/SendEmail.jtp?type=node&node=4745009&i=2> > mailing list -- To UNSUBSCRIBE and more, see > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide http://www.R-project.org/posti > ng-guide.html > > and provide commented, minimal, self-contained, reproducible code. > > David Winsemius > Alameda, CA, USA > > 'Any technology distinguishable from magic is insufficiently advanced.' > -Gehm's Corollary to Clarke's Third Law > > ______________________________________________ > [hidden email] <http:///user/SendEmail.jtp?type=node&node=4745009&i=3> > mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posti > ng-guide.html > and provide commented, minimal, self-contained, reproducible code. > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > http://r.789695.n4.nabble.com/Find-maxima-of-a-function-OR-o > ptimization-of-a-Function-tp4744989p4745009.html > To unsubscribe from Find maxima of a function OR optimization of a > Function, click here > <http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4744989&code=bmloYXJpa2FzaW5naGFsMTk5MEBnbWFpbC5jb218NDc0NDk4OXwtNDI2MzAyOTM1> > . > NAML > <http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >[[alternative HTML version deleted]]
Dear Niharika, As I said before, the problem is basically an optimization issue. You should isolate the problematic part from the rest of your study. Sometimes, more information does not help to solution. All the answers from us (Ulrik, David, me) are more or less are correct to find a maximum point. Newton?s method is also correct. But after answers, you only say, it didn?t find the right maxima. At this point I?m cluesless, because problem might be originated at some other point and we don?t know it. For instance, In my previous answer, my suggested solution was right for both your 2 situations. But suddenly you just said, it didin?t work for a mean greater than 6 and I?m not able to reproduce your new situation. I?m really upset that you lost 4 weeks on a very easy issue (it is very long time). But if you don?t want to loose anymore, please, isolate your question from the rest of the work, create minimal reproduciple example, and let?s focus on the problematic part. I assure you that your problem will be solved faster. I could install the distr package at the end, but I?m getting another error while running your code. But I don?t believe the question is related to this package. Let me explain about extremum points although most of you know about it. Let?s assume we have a y(x) function. An extremum (max/min) point is defined as where dy/dx = 0. If second order derivative of y is less than 0, it?s a maximum, if greater than 0, it?s a minimum point. If zero, it?s undefined. So, at the end, we need x and y vectors to find maxima. y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5) # sample y values x <- 1:length(y) # correspoinding x values # Now we need to convert this discrete x-y vectors to a function. Because we don?t know the values between points. # That?s why, we fit a cubic spline to have the values between discrete points. fun <- splinefun(x = x, y = y, method = "n?) # now, we are able to find what value of y at x = 1.5 by the help of fun function. x2 <- seq(1, max(x), 0.1) y2 <- fun(x2) plot(x, y, type = "l") lines(x2, y2, col = "red?) # see red and black lines are slightly different because we fit a cubic spline to the discrete points. # Now, we will use optimize function to find the maxima in a defined interval. Interval definition is crucial. # optimize function calculates all required steps explained above to find an extremum and gives the result. max.x <- optimize(fun, interval = range(x), maximum = TRUE) max.x2 <- x[which(y == max(y))] # global maximum of discrete x-y vectors print(max.x) # x value of global maximum of y print(max.x2) # x value of global maximum of y ( discrete points) see max.x and max.x2 are very close. max.x is calculated under a cubic spline assumption and max.x2 is calculated by discrete points. As a result, problem is reduced to find maxima of x-y vector pairs and this SHOULD work for all sort of situations (UNIVERSAL FACT). If it does not work one of your situations, MOST PROBABLY, you are doing something wrong and we need to investigate your failed case. At this point, you can use ?dput? function to extract, copy and paste your x-y vectors. So, we can copy the statement, run in our environment without requiring distr or any other package. Best wishes, isezen> On 27 Aug 2017, at 12:59, niharika singhal <niharikasinghal1990 at gmail.com> wrote: > > Dear David and Ismail, > > The actual problem is I am getting the parameters from the Kmeans cluster > on the data set obtained from the mclust package. > > In mclust method I am changing the value of G according to user input, so > the number of means, sigma and the coefficien mixture I will get is not > fixed, It can be 3 or 4or5 or 10. It will all depend on the user. > > Then on the result of kmeans, I wanted to find the maxima so that I can use > Normalized probability formula further for my logic and in order to do that > I used Newton's method and that was implemented in the optimr package. > > I was getting the wrong result so I used distr package in order to check > the problem and I figured out I was getting the wrong maxima. So I can to > the conclusion which I have posted as my query. > > PS: I am able to use distr package without any problem. > > And I want to use dnorm because I have a Gaussian mixture model, so I did > not want to use rnorm. > > I hope you get what my problem is now. > > I will try both of your solution, which uses the same function and sees if > it helps me. > > I am struck at this point from almost 4 weeks and it is delaying my work a > lot. > > I really appreciate all your help > > Thanks & Regards > Niharika SInghal
I have not followed the history of this thread, but I am quite flummoxed as to why the OP is rewriting code to estimate parameters from an univariate Gaussian mixture model when alternatives such as EMCluster (which generally appears to handle initialization better than MClust) exist. Or perhaps there is more to it in which case I apologize. But I thought that I would make the OP aware of the package (of which, in full disclosure, I am co-author). Best wishes, Ranjan On Sun, 27 Aug 2017 16:01:59 +0300 Ismail SEZEN <sezenismail at gmail.com> wrote:> Dear Niharika, > > As I said before, the problem is basically an optimization issue. You should isolate the problematic part from the rest of your study. Sometimes, more information does not help to solution. All the answers from us (Ulrik, David, me) are more or less are correct to find a maximum point. Newton?s method is also correct. But after answers, you only say, it didn?t find the right maxima. At this point I?m cluesless, because problem might be originated at some other point and we don?t know it. For instance, In my previous answer, my suggested solution was right for both your 2 situations. But suddenly you just said, it didin?t work for a mean greater than 6 and I?m not able to reproduce your new situation. > > I?m really upset that you lost 4 weeks on a very easy issue (it is very long time). But if you don?t want to loose anymore, please, isolate your question from the rest of the work, create minimal reproduciple example, and let?s focus on the problematic part. I assure you that your problem will be solved faster. > > I could install the distr package at the end, but I?m getting another error while running your code. But I don?t believe the question is related to this package. > > Let me explain about extremum points although most of you know about it. Let?s assume we have a y(x) function. An extremum (max/min) point is defined as where dy/dx = 0. If second order derivative of y is less than 0, it?s a maximum, if greater than 0, it?s a minimum point. If zero, it?s undefined. So, at the end, we need x and y vectors to find maxima. > > y <- c(1,2,3,4,3,2,3,4,5,6,7,8,9,8,7,6,5,6,7,6,5) # sample y values > x <- 1:length(y) # correspoinding x values > > # Now we need to convert this discrete x-y vectors to a function. Because we don?t know the values between points. > # That?s why, we fit a cubic spline to have the values between discrete points. > > fun <- splinefun(x = x, y = y, method = "n?) > > # now, we are able to find what value of y at x = 1.5 by the help of fun function. > > x2 <- seq(1, max(x), 0.1) > y2 <- fun(x2) > > plot(x, y, type = "l") > lines(x2, y2, col = "red?) > > # see red and black lines are slightly different because we fit a cubic spline to the discrete points. > # Now, we will use optimize function to find the maxima in a defined interval. Interval definition is crucial. > # optimize function calculates all required steps explained above to find an extremum and gives the result. > > max.x <- optimize(fun, interval = range(x), maximum = TRUE) > max.x2 <- x[which(y == max(y))] # global maximum of discrete x-y vectors > > print(max.x) # x value of global maximum of y > print(max.x2) # x value of global maximum of y ( discrete points) > > see max.x and max.x2 are very close. max.x is calculated under a cubic spline assumption and max.x2 is calculated by discrete points. > > As a result, problem is reduced to find maxima of x-y vector pairs and this SHOULD work for all sort of situations (UNIVERSAL FACT). If it does not work one of your situations, MOST PROBABLY, you are doing something wrong and we need to investigate your failed case. > > At this point, you can use ?dput? function to extract, copy and paste your x-y vectors. So, we can copy the statement, run in our environment without requiring distr or any other package. > > Best wishes, > > isezen > > > > On 27 Aug 2017, at 12:59, niharika singhal <niharikasinghal1990 at gmail.com> wrote: > > > > Dear David and Ismail, > > > > The actual problem is I am getting the parameters from the Kmeans cluster > > on the data set obtained from the mclust package. > > > > In mclust method I am changing the value of G according to user input, so > > the number of means, sigma and the coefficien mixture I will get is not > > fixed, It can be 3 or 4or5 or 10. It will all depend on the user. > > > > Then on the result of kmeans, I wanted to find the maxima so that I can use > > Normalized probability formula further for my logic and in order to do that > > I used Newton's method and that was implemented in the optimr package. > > > > I was getting the wrong result so I used distr package in order to check > > the problem and I figured out I was getting the wrong maxima. So I can to > > the conclusion which I have posted as my query. > > > > PS: I am able to use distr package without any problem. > > > > And I want to use dnorm because I have a Gaussian mixture model, so I did > > not want to use rnorm. > > > > I hope you get what my problem is now. > > > > I will try both of your solution, which uses the same function and sees if > > it helps me. > > > > I am struck at this point from almost 4 weeks and it is delaying my work a > > lot. > > > > I really appreciate all your help > > > > Thanks & Regards > > Niharika SInghal > > ______________________________________________ > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Important Notice: This mailbox is ignored: e-mails are set to be deleted on receipt. Please respond to the mailing list if appropriate. For those needing to send personal or professional e-mail, please use appropriate addresses.