similar to: Rpart -- using predict() when missing data is present?


2007 Apr 13
0
How consistent is predict() syntax?
I have a situation where lagged values of a time-series are used to predict future values. I have packed together the time-series and the lagged values into a data frame: > str(D) 'data.frame': 191 obs. of 13 variables: $ y : num -0.21 -2.28 -2.71 2.26 -1.11 1.71 2.63 -0.45 -0.11 4.79 ... $ y.l1 : num NA -0.21 -2.28 -2.71 2.26 -1.11 1.71 2.63 -0.45 -0.11 ... $ y.l2 : num
2004 Apr 13
0
In-sample / Out-of-sample using R
I'm trying to learn how to use R to: * Make a random partition of a data frame between in-sample and out-of-sample * Estimate a model (e.g. lm()) for the in-sample * Make predictions for all observations * Compare the in-sample error sigma against the out-of-sample error sigma. I came up with the following code. I think it's okay, but I can't help feeling this is
2005 Aug 04
1
Puzzled at rpart prediction
I'm in a situation where I say: > predict(m.rpart, newdata=D[N1+t,]) 0 1 173 0.8 0.2 which I interpret as meaning: an 80% chance of "0" and a 20% chance of "1". Okay. This is consistent with: > predict(m.rpart, newdata=D[N1+t,], type="class") [1] 0 Levels: 0 1 But I'm puzzled at the following. If I say: > predict(m.rpart,
2006 Mar 06
3
Interleaving elements of two vectors?
Suppose one has x <- c(1, 2, 7, 9, 14) y <- c(71, 72, 77) How would one write an R function which alternates between elements of one vector and the next? In other words, one wants z <- c(x[1], y[1], x[2], y[2], x[3], y[3], x[4], y[4], x[5], y[5]) I couldn't think of a clever and general way to write this. I am aware of gdata::interleave() but it deals
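A minimal sketch of one way to get the interleaving asked about above, assuming the shorter vector should be padded with NA (which is what y[4] and y[5] evaluate to in the question). The helper name interleave2 is hypothetical and does not come from the original post.

```r
# Hypothetical helper: alternate elements of x and y, padding the shorter with NA.
interleave2 <- function(x, y) {
  n <- max(length(x), length(y))
  # Indexing past the end of a vector yields NA, so both rows have length n.
  # rbind() stacks them as a 2 x n matrix; as.vector() reads it column by column.
  as.vector(rbind(x[seq_len(n)], y[seq_len(n)]))
}

x <- c(1, 2, 7, 9, 14)
y <- c(71, 72, 77)
z <- interleave2(x, y)
```

If the NA padding is not wanted, z[!is.na(z)] drops it, at the cost of also dropping any NA present in the inputs.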
2006 Jan 26
2
Prediction when using orthogonal polynomials in regression
Folks, I'm doing fine with using orthogonal polynomials in a regression context: # We will deal with noisy data from the d.g.p. y = sin(x) + e x <- seq(0, 3.141592654, length.out=20) y <- sin(x) + 0.1*rnorm(10) d <- lm(y ~ poly(x, 4)) plot(x, y, type="l"); lines(x, d$fitted.values, col="blue") # Fits great! all.equal(as.numeric(d$coefficients[1] + m
2005 Jun 07
1
R and MLE
I learned R & MLE in the last few days. It is great! I wrote up my explorations as http://www.mayin.org/ajayshah/KB/R/mle/mle.html I will be most happy if R gurus will look at this and comment on how it can be improved. I have a few specific questions: * Should one use optim() or should one use stats4::mle()? I felt that mle() wasn't adding much value compared with optim, and
2005 Jul 12
2
Puzzled at ifelse()
I have a situation where this is fine: > if (length(x)>15) { clever <- rr.ATM(x, maxtrim=7) } else { clever <- rr.ATM(x) } > clever $ATM [1] 1848.929 $sigma [1] 1.613415 $trim [1] 0 $lo [1] 1845.714 $hi [1] 1852.143 But this variant, using ifelse(), breaks: > clever <- ifelse(length(x)>15, rr.ATM(x, maxtrim=7), rr.ATM(x))
2005 May 27
1
R commandline editor question
I am using R 2.1 on Apple OS X. When I get the ">" prompt, I find it works well with emacs commandline editing. Keys like M-f C-k etc. work fine. The one thing that I really yearn for, which is missing, is bracket matching. When I am doing something which ends in )))) it is really useful to have emacs or vi-style bracket matching, so as to be able to visually keep track of whether I
2005 May 08
2
Need a factor level even though there are no observations
I'm in this situation: factorlabels <- c("School", "College", "Beyond") with data for 8 families: education.man <- c(1,2,1,2,1,2,1,2) # Note : no "3" values education.wife <- c(1,2,3,1,2,3,1,2) # 1,2,3 are all present. My goal is to create this table: School College Beyond
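A small sketch of how an unobserved level can be kept, assuming the codes 1, 2, 3 map to the three labels in order. The snippet is cut off before the full target table, so this only shows the one-way count for education.man; the variable names man and counts are illustrative.

```r
factorlabels <- c("School", "College", "Beyond")
education.man <- c(1, 2, 1, 2, 1, 2, 1, 2)  # no "3" values

# Declaring all levels up front keeps "Beyond" even though it never occurs.
man <- factor(education.man, levels = 1:3, labels = factorlabels)

# table() reports a count of zero for the empty level.
counts <- table(man)
```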
2005 Aug 19
1
Problem with get.hist.quote() in tseries
When using get.hist.quote(), I find the dates are broken. This is with R 2.1.1 on Mac OS X `panther'. > library(tseries) Loading required package: quadprog 'tseries' version: 0.9-27 'tseries' is a package for time series analysis and computational finance. See 'library(help="tseries")' for details. > x <-
2005 May 24
1
Catching an error with lm()
Folks, I'm in a situation where I do a few thousand regressions, and some of them are bad data. How do I get back an error value (return code such as NULL) from lm(), instead of an error _message_? Here's an example: > x <- c(NA, 3, 4) > y <- c(2, NA, NA) > d <- lm(y ~ x) Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases
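One possible approach to the question above, sketched with base R's tryCatch(): wrap the lm() call so that an error is turned into a NULL return value. The wrapper name safe_lm and the data frames bad and good are hypothetical, introduced only for illustration.

```r
# Hypothetical wrapper: return the fitted model, or NULL if lm() signals an error.
safe_lm <- function(formula, data) {
  tryCatch(lm(formula, data = data), error = function(e) NULL)
}

# The data from the question: no row has both x and y observed.
bad <- data.frame(x = c(NA, 3, 4), y = c(2, NA, NA))
# A well-behaved data set for comparison.
good <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 3.9, 6.2, 7.8))

fit_bad <- safe_lm(y ~ x, bad)    # NULL instead of an error message
fit_good <- safe_lm(y ~ x, good)  # an ordinary "lm" object
```

In a loop over thousands of regressions, the caller can then test is.null() on each result and skip the failures.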
2005 Jun 14
1
Puzzled in utilising summary.lm() to obtain Var(x)
I have a program which is doing a few thousand runs of lm(). Suppose it is a simple model y = a + bx1 + cx2 + e I have the R object "d" where d <- summary(lm(y ~ x1 + x2)) I would like to obtain Var(x2) out of "d". How might I do it? I can, of course, always do sd(x2). But it would be much more convenient if I could snoop around the contents of summary.lm and
2005 Aug 16
1
Extracting some rows from a data frame - lapses into a vector
I have a data frame with one column "x": > str(data) `data.frame': 20 obs. of 1 variable: $ x: num 0.0495 0.0986 0.9662 0.7501 0.8621 ... Normally, I know that the notation dataframe[indexes,] gives you a new data frame which is the specified set of rows. But I find: > str(data[1:10,]) num [1:10] 0.0495 0.0986 0.9662 0.7501 0.8621 ... Here, it looks like the operation
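The behaviour described above can be reproduced in a few lines. This sketch assumes a small one-column data frame built from the values shown in the snippet; it illustrates the drop argument of the `[` operator for data frames, which is one way to keep the result a data frame.

```r
data <- data.frame(x = c(0.0495, 0.0986, 0.9662, 0.7501, 0.8621))

# With a single column, the default drop = TRUE collapses the result to a vector.
v <- data[1:3, ]

# drop = FALSE keeps the data frame structure.
d <- data[1:3, , drop = FALSE]
```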
2005 Sep 25
1
Question on lm(): When does R-squared come out as NA?
I have a situation with a large dataset (3000+ observations), where I'm doing lags as regressors, where I get: Call: lm(formula = rj ~ rM + rM.1 + rM.2 + rM.3 + rM.4) Residuals: 1990-06-04 1994-11-14 1998-08-21 2002-03-13 2005-09-15 -5.64672 -0.59596 -0.04143 0.55412 8.18229 Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.003297 0.017603
2006 Aug 14
1
Presentation of multiple models in one table using xtable
Consider this situation: > x1 <- runif(100); x2 <- runif(100); y <- 2 + 3*x1 - 4*x2 + rnorm(100) > m1 <- summary(lm(y ~ x1)) > m2 <- summary(lm(y ~ x2)) > m3 <- summary(lm(y ~ x1 + x2)) Now you have estimated 3 different "competing" models, and suppose you want to present the set of models in one table. xtable(m1) is cool, but doing that thrice would give
2007 Jan 18
2
The math underlying the `betareg' package?
Folks, The betareg package appears to be polished and works well. But I would like to look at the exact formulas for the underlying model being estimated, the likelihood function, etc. E.g. if one has to compute \frac{\partial E(y)}{\partial x_i}, this requires careful calculations through these formulas. I read "Regression analysis of variates observed on (0,1): percentages, proportions and
2005 Aug 26
1
update.packages() is broken?
Folks, I am using R 2.1.1 on Apple OS X 10.3. Earlier, I used to say $ sudo R > update.packages() and all the packages used to get installed. For several weeks, I noticed that nothing has been coming through. I used the R-for-Mac graphics console and I find that there are many packages where new versions have come out which I don't have. Is something wrong with update.packages()? I
2005 Jun 06
1
A performance anomaly
I wrote a simple log likelihood (for the ordinary least squares (OLS) model), in two ways. The first works out the likelihood. The second merely calls the first, but after transforming the variance parameter, so as to allow an unconstrained maximisation. So the second suffers a slight cost for one exp() and then it pays the cost of calling the first. I did performance measurement. One would
2005 Oct 01
1
Placing axes label strings closer to the graph?
Folks, I have placed an example of a self-contained R program later in this mail. It generates a file inflation.pdf. When I stare at the picture, I see the "X label string" and "Y label string" sitting lonely and far away from the axes. How can these distances be adjusted? I read ?par and didn't find this directly. I want to hang on to 2.8 x 2.8 inches as the overall size
2006 Jan 19
2
Tobit estimation?
Folks, Based on http://www.biostat.wustl.edu/archives/html/s-news/1999-06/msg00125.html I thought I should experiment with using survreg() to estimate tobit models. I start by simulating a data frame with 100 observations from a tobit model > x1 <- runif(100) > x2 <- runif(100)*3 > ystar <- 2 + 3*x1 - 4*x2 + rnorm(100)*2 > y <- ystar > censored <- ystar <= 0