PIKAL Petr
2020-Jun-10 10:09 UTC
[R] segmented do not correctly fit data (variable names problem)
Dear all,

To make my problem more on topic, I would like to ask about weird results from a segmented fit, despite Bert's warning.

Here is my data:

temp <- structure(list(V1 = c(0L, 15L, 30L, 45L, 60L, 75L, 90L, 105L, 120L,
135L, 150L, 165L, 180L, 195L, 210L, 225L, 240L, 255L, 270L, 285L, 300L,
315L, 330L, 345L, 360L), V2 = c(98.68666667, 100.8, 103.28, 107.44, 110.06,
114.26, 117.6, 121.04, 123.8533333, 126.66, 129.98, 134.1866667, 139.04,
144.6, 152.08, 161.3, 169.8733333, 176.6133333, 181.92, 186.0266667,
188.7533333, 190.7066667, 192.0533333, 192.9933333, 193.3533333)),
class = "data.frame", row.names = c(NA, -25L))

Here is the fit:

library(segmented)
plot(temp$V1, temp$V2)
fit <- lm(V2 ~ V1, temp)
fit.s <- segmented(fit, seg.Z = ~ V1, npsi = 2)
plot(fit.s, add = TRUE, col = 2)

The resulting fit is wrong.

If I take an example from the web, the result is OK:

set.seed(12)
xx <- 1:100
zz <- runif(100)
yy <- 2 + 1.5*pmax(xx - 35, 0) - 1.5*pmax(xx - 70, 0) + 15*pmax(zz - .5, 0) +
  rnorm(100, 0, 2)
dati <- data.frame(x = xx, y = yy, z = zz)
out.lm <- lm(y ~ x, data = dati)
o <- segmented(out.lm, seg.Z = ~x, psi = list(x = c(30, 60)),
  control = seg.control(display = FALSE))
plot(dati$x, dati$y)
plot(o, add = TRUE, col = 2)

What am I doing wrong? Is there a bug in segmented? BTW, if I change the column names in temp to x and y, segmented finds the correct fit:

names(temp) <- c("x", "y")
plot(temp$x, temp$y)
fit <- lm(y ~ x, temp)
fit.s <- segmented(fit, seg.Z = ~x, npsi = 2)
plot(fit.s, add = TRUE, col = 2)

> sessionInfo()
R Under development (unstable) (2020-03-08 r77917)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=Czech_Czechia.1250  LC_CTYPE=Czech_Czechia.1250
[3] LC_MONETARY=Czech_Czechia.1250 LC_NUMERIC=C
[5] LC_TIME=Czech_Czechia.1250

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] segmented_1.1-0

loaded via a namespace (and not attached):
[1] compiler_4.0.0 tools_4.0.0 splines_4.0.0

Cheers
Petr
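
One way to check whether the column names alone change the result is to fit both versions from the same data and random seed, then compare the estimated breakpoints numerically instead of by eye. The following is only a sketch under a few assumptions: the helper get_psi and the object names fit_v, seg_v, temp_xy, fit_xy and seg_xy are made up for illustration, the seed value is arbitrary, and the outcome may differ between versions of segmented.

```r
library(segmented)

# Same data as in the post, rebuilt so that the block is self-contained.
temp <- data.frame(
  V1 = seq(0L, 360L, by = 15L),
  V2 = c(98.68666667, 100.8, 103.28, 107.44, 110.06, 114.26, 117.6, 121.04,
         123.8533333, 126.66, 129.98, 134.1866667, 139.04, 144.6, 152.08,
         161.3, 169.8733333, 176.6133333, 181.92, 186.0266667, 188.7533333,
         190.7066667, 192.0533333, 192.9933333, 193.3533333)
)

# Hypothetical helper: return the estimated breakpoints, or NA if
# segmented() fell back to a plain lm object (no breakpoint estimated).
get_psi <- function(m) {
  if (inherits(m, "segmented")) m$psi[, "Est."] else NA_real_
}

# Fit with the original column names V1/V2.
set.seed(1)  # arbitrary seed, so both fits start from the same random state
fit_v <- lm(V2 ~ V1, data = temp)
seg_v <- segmented(fit_v, seg.Z = ~V1, npsi = 2)

# Fit the identical data after renaming the columns to x/y.
temp_xy <- setNames(temp, c("x", "y"))
set.seed(1)
fit_xy <- lm(y ~ x, data = temp_xy)
seg_xy <- segmented(fit_xy, seg.Z = ~x, npsi = 2)

# If the names did not matter, these two vectors should agree.
print(get_psi(seg_v))
print(get_psi(seg_xy))
```

If the two printed vectors differ, the discrepancy comes from the names and not from the data, which is the kind of minimal comparison that is useful to send to the package maintainer.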
Bert Gunter
2020-Jun-10 14:29 UTC
[R] segmented do not correctly fit data (variable names problem)
Note: My warning was for "stepwise" regression, which is what *you wrote*, not "segmented".

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Wed, Jun 10, 2020 at 3:09 AM PIKAL Petr <petr.pikal at precheza.cz> wrote:
> Dear all
>
> To make my problem more on topic I would like to ask about weird results
> from segmented fit, despite of Bert's warning.
> [...]