Hayes, Rachel M wrote:
> Hi all,
>
> I have a vector of proportions (post_op_prw) such that
>
> > summary(amb$post_op_prw)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>  0.0000  0.0000  0.0000  0.3985  0.9134  0.9962  1.0000
>
> > summary(cut2(amb$post_op_prw,0.0001))
> [0.0000,0.0001) [0.0001,0.9962]            NA's
>            1904            1672               1
>
> I want to use post_op_prw as a predictor variable in an OLS model. I
> decided to fit it using a restricted cubic spline. But, I'm seeing
> behavior I don't understand. See below:
>
> > rcspline.eval(amb$post_op_prw,nk = 3, knots.only = T)
> [1] 0.0000000 0.6147927 0.9092937 0.9667178
> Warning message:
> In rcspline.eval(amb$post_op_prw, nk = 3, knots.only = T) :
>   could not obtain 3 knots with default algorithm.
>   Used alternate algorithm to obtain 4 knots
> > rcspline.eval(amb$post_op_prw,nk = 4, knots.only = T)
> [1] 0.0000000 0.8476793 0.9783558
> > rcspline.eval(amb$post_op_prw,nk = 5, knots.only = T)
> [1] 0.0000000 0.9012711 0.9783558
>
> Why are the 4 and 5 knot spline requests returning a spline with 3
> knots? I get the best model results using rcs(amb$post_op_prw,3). I'm
> kind of new to using splines. Does the fact that observations are
> clustered at the ends make the spline fit questionable?
Yes, or at least it makes the choice of knots questionable. For that
type of variable with many ties I tend to use a quadratic effect
(pol(x,2) in Design or rms packages).
Frank
>
> Thanks,
>
> Rachel Hayes
--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University
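
To illustrate the point about ties, here is a minimal base-R sketch. It is not the poster's data: `x` is simulated, with 1904 exact zeros as in the cut2 summary above and the remainder drawn from an arbitrary Beta distribution. The quantile probabilities are an assumption chosen to show how quantile-based knot placement degenerates when more than half the values are tied; they are not presented as the exact rule used by rcspline.eval. The quadratic fit uses `poly(x, 2, raw = TRUE)` as a base-R stand-in, assuming that pol(x,2) denotes an ordinary quadratic polynomial in x. The response `y` and its coefficients are made up for the example.

```r
# Hypothetical stand-in for amb$post_op_prw: many exact zeros plus
# values spread over (0, 1). None of this is the poster's data.
set.seed(1)
n_zero <- 1904
n_pos  <- 1672
x <- c(rep(0, n_zero), rbeta(n_pos, 5, 1))

# Quantile-based candidate knots (probabilities are an assumption).
# More than half of x equals 0, so the lower quantiles all coincide.
probs <- c(0.05, 0.275, 0.5, 0.725, 0.95)
knots <- quantile(x, probs)
n_distinct <- length(unique(knots))
print(knots)
print(n_distinct)

# Quadratic effect instead of a spline, fitted by OLS.
# The outcome is simulated purely for illustration.
y <- 1 + 2 * x - 1.5 * x^2 + rnorm(length(x), sd = 0.5)
fit <- lm(y ~ poly(x, 2, raw = TRUE))
print(coef(fit))
```

With this simulated vector, five requested quantiles yield fewer distinct knot values than requested, which is the kind of collapse described in the question. The quadratic term needs no knots at all, so it sidesteps the problem.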