PIKAL Petr
2020-Jun-10 10:09 UTC
[R] segmented do not correctly fit data (variable names problem)
Dear all
To make my problem more on topic, I would like to ask about weird results
from a segmented fit, despite Bert's warning.
Here is my data
temp <- structure(list(V1 = c(0L, 15L, 30L, 45L, 60L, 75L, 90L, 105L, 120L,
135L, 150L, 165L, 180L, 195L, 210L, 225L, 240L, 255L, 270L, 285L, 300L,
315L, 330L, 345L, 360L), V2 = c(98.68666667, 100.8, 103.28, 107.44, 110.06,
114.26, 117.6, 121.04, 123.8533333, 126.66, 129.98, 134.1866667, 139.04,
144.6, 152.08, 161.3, 169.8733333, 176.6133333, 181.92, 186.0266667,
188.7533333, 190.7066667, 192.0533333, 192.9933333, 193.3533333)),
class = "data.frame", row.names = c(NA, -25L))
Here is the fit.
library(segmented)
plot(temp$V1, temp$V2)
fit <- lm(V2~V1, temp)
fit.s <- segmented(fit, seg.Z = ~ V1, npsi=2)
plot(fit.s, add=TRUE, col=2)
The resulting fit is wrong.
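For a cross-check that does not depend on segmented at all, a brute-force grid search over two breakpoints gives a rough idea of what a sensible two-breakpoint fit looks like for these data. This is only a sketch: the helper `rss_at`, the 5-unit candidate grid and the minimum spacing of 30 are my own assumptions, and it is not how segmented estimates breakpoints.

```r
# Same values as `temp` in the post (V1 is 0, 15, ..., 360)
temp <- data.frame(
  V1 = seq(0, 360, by = 15),
  V2 = c(98.68666667, 100.8, 103.28, 107.44, 110.06, 114.26, 117.6, 121.04,
         123.8533333, 126.66, 129.98, 134.1866667, 139.04, 144.6, 152.08,
         161.3, 169.8733333, 176.6133333, 181.92, 186.0266667, 188.7533333,
         190.7066667, 192.0533333, 192.9933333, 193.3533333)
)

# Hypothetical helper: residual sum of squares of a continuous
# piecewise-linear fit with breakpoints fixed at p1 < p2
rss_at <- function(p1, p2, d) {
  m <- lm(V2 ~ V1 + pmax(V1 - p1, 0) + pmax(V1 - p2, 0), data = d)
  sum(resid(m)^2)
}

# Candidate breakpoints: interior values on a 5-unit grid (an assumption)
cand <- seq(30, 330, by = 5)
grid <- expand.grid(p1 = cand, p2 = cand)
grid <- grid[grid$p2 - grid$p1 >= 30, ]  # keep p1 < p2 with data in between
grid$rss <- mapply(rss_at, grid$p1, grid$p2, MoreArgs = list(d = temp))

# Best breakpoint pair on the grid, and the plain straight-line RSS for scale
best <- grid[which.min(grid$rss), ]
rss_line <- sum(resid(lm(V2 ~ V1, data = temp))^2)
print(best)
print(rss_line)
```

If the breakpoints reported by segmented are far from `best$p1` and `best$p2`, that would support the suspicion that the fit under the V1/V2 names went wrong.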
If I take the example from the web, the result is OK.
set.seed(12)
xx <- 1:100
zz <- runif(100)
yy <- 2 + 1.5*pmax(xx - 35, 0) - 1.5*pmax(xx - 70, 0) + 15*pmax(zz - .5, 0) +
  rnorm(100, 0, 2)
dati <- data.frame(x = xx, y = yy, z = zz)
out.lm <- lm(y ~ x, data = dati)
o <- segmented(out.lm, seg.Z = ~x, psi = list(x = c(30, 60)),
               control = seg.control(display = FALSE))
plot(dati$x, dati$y)
plot(o, add=TRUE, col=2)
What am I doing wrong? Is there a bug in segmented? BTW, if I change the column
names in temp to x and y, segmented finds the correct fit.
names(temp) <- c("x", "y")
plot(temp$x, temp$y)
fit <- lm(y~x, temp)
fit.s <- segmented(fit, seg.Z = ~x, npsi=2)
plot(fit.s, add=TRUE, col=2)
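One way to narrow this down might be to keep the original V1/V2 names but pass explicit starting values through `psi`, in the same list form as the web example, instead of `npsi=2`. If that fit is still off while the renamed data fit is fine, the variable names rather than the automatic starting values would be the likelier culprit. The starting values 120 and 240 are my own guess from the plot, and I do not know what segmented 1.1-0 returns here, so the call is wrapped in `try()`.

```r
# Same values as `temp` in the post, with the original column names
temp <- data.frame(
  V1 = seq(0, 360, by = 15),
  V2 = c(98.68666667, 100.8, 103.28, 107.44, 110.06, 114.26, 117.6, 121.04,
         123.8533333, 126.66, 129.98, 134.1866667, 139.04, 144.6, 152.08,
         161.3, 169.8733333, 176.6133333, 181.92, 186.0266667, 188.7533333,
         190.7066667, 192.0533333, 192.9933333, 193.3533333)
)
fit <- lm(V2 ~ V1, data = temp)

# Guarded so the snippet still runs where segmented is not installed
if (requireNamespace("segmented", quietly = TRUE)) {
  library(segmented)
  # psi given explicitly (guessed values), original variable name V1 kept
  fit.s2 <- try(
    segmented(fit, seg.Z = ~V1, psi = list(V1 = c(120, 240))),
    silent = TRUE
  )
  # Print the breakpoint table only if the call succeeded
  if (!inherits(fit.s2, "try-error")) print(fit.s2$psi)
}
```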
> sessionInfo()
R Under development (unstable) (2020-03-08 r77917)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=Czech_Czechia.1250 LC_CTYPE=Czech_Czechia.1250
[3] LC_MONETARY=Czech_Czechia.1250 LC_NUMERIC=C
[5] LC_TIME=Czech_Czechia.1250
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] segmented_1.1-0
loaded via a namespace (and not attached):
[1] compiler_4.0.0 tools_4.0.0 splines_4.0.0
Cheers
Petr
Bert Gunter
2020-Jun-10 14:29 UTC
[R] segmented do not correctly fit data (variable names problem)
Note: My warning was for "stepwise" regression, which is what *you wrote*, not "segmented".
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)