Displaying 20 results from an estimated 300 matches similar to: "Cook's distance"
2009 Nov 08
influence.measures(stats): hatvalues(model, ...)
I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite.
A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied.
Whatever the
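For reference, a minimal sketch of the comparison described in this post, assuming an unweighted lm fit on the built-in mtcars data (the model and variables are illustrative and not taken from the post). In this setting the output of hatvalues() can be checked directly against the diagonal of X(X'X)^(-1)X'.

```r
# Illustrative model on a built-in dataset (an assumption, not the poster's data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Plain hat matrix H = X (X'X)^(-1) X'
X <- model.matrix(fit)
H <- X %*% solve(crossprod(X)) %*% t(X)

# Compare its diagonal with hatvalues(); names are dropped before comparing
same <- isTRUE(all.equal(unname(diag(H)), unname(hatvalues(fit))))
print(same)
```

If the two differ for a particular model, weights or dropped observations are worth checking first.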
2011 Oct 18
cygwin warning when creating a package in Windows
Dear All,
I am a beginner creating R packages. I followed the Leisch (2009) tutorial
and the document "Writing R Extensions" to write an example.
I installed R 2.12.2 (I also tried R 2.13.2), the latest version of Rtools and
the recommended packages on a PC with Windows 7 Home Premium.
I can run R CMD INSTALL linmod in the command prompt and the R CMD check
linmod. The following outputs are
2009 Oct 26
What is the most efficient practice to develop an R package?
I am reading Section 5 and 6 of
It seems that I have to do the following two steps in order to make an
R package. But when I am testing this package, these two steps will
run many times, which may take a lot of time. So while I am still developing
the package, shall I always source('linmod.R') to test it? Once the
code in
2012 Feb 09
passing an extra argument to an S3 generic
I'm trying to write some functions extending influence measures to
multivariate linear models and also
allow subsets of size m>=1 to be considered for deletion diagnostics.
I'd like these to work roughly parallel
to those functions for the univariate lm where only single case deletion
(m=1) diagnostics are considered.
Corresponding to stats::hatvalues.lm, the S3 method for class
2010 Apr 27
Problem calculating multiple regressions on a data frame.
Hi there,
I am stuck trying to solve what should be a fairly easy problem.
I have a data frame that essentially consists of (ID, time as seqMonth,
variable, value) and I want to find the regression coefficient of value vs
time for each combination of ID and Variable.
I have tried several approaches and none of them seems to work as I
For example, I have tried:
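One possible approach to the per-group regression described here, sketched in base R. The data frame dat and its contents are made up for illustration; only the column roles (ID, variable, seqMonth, value) come from the post.

```r
# Hypothetical long-format data: ID, variable, seqMonth, value
set.seed(1)
dat <- expand.grid(ID = c("a", "b"), variable = c("v1", "v2"), seqMonth = 1:12,
                   stringsAsFactors = FALSE)
dat$value <- 2 * dat$seqMonth + rnorm(nrow(dat))

# Split by each ID/variable combination and fit value ~ seqMonth in each piece
pieces <- split(dat, list(dat$ID, dat$variable), drop = TRUE)
slopes <- sapply(pieces, function(d) coef(lm(value ~ seqMonth, data = d))[["seqMonth"]])
print(slopes)
```

The result is a named vector with one slope per ID/variable combination.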
2006 Jan 11
updating formula inside function
Dear R-Helpers
Given a function like
foo <- function(data,var1,var2,var3) {
f <- formula(paste(var1,'~',paste(var2,var3,sep='+'),sep=''))
linmod <- lm(f)
By typing
I get the result of the linear model a~b+c.
How can I rewrite the function so that the formula can be updated inside
the function,
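A hedged sketch of one way to build and then modify the formula inside the function. The argument names response, predictors and change are hypothetical, as is the example data; reformulate() and update() are standard functions in the stats package.

```r
# Hypothetical rewrite: variable names arrive as strings, data is passed explicitly
foo <- function(data, response, predictors, change = NULL) {
  f <- reformulate(predictors, response = response)  # builds response ~ p1 + p2
  if (!is.null(change)) f <- update(f, change)       # e.g. change = . ~ . + x
  lm(f, data = data)
}

# Made-up example data
set.seed(1)
d <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20), x = rnorm(20))

m1 <- foo(d, "a", c("b", "c"))                      # fits a ~ b + c
m2 <- foo(d, "a", c("b", "c"), change = . ~ . + x)  # fits a ~ b + c + x
print(formula(m2))
```

Passing data explicitly avoids relying on variables being visible in the calling environment.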
2009 Mar 05
I am struggling a bit with this function 'hatvalues'. I would like a little more understanding than taking the black-box and using the values. I looked at the Fortran source and it is quite opaque to me. So I am asking for some help in understanding the theory. First, I take the simplest case of a single variant. For this I turn to John Fox's book, "Applied Regression Analysis
2010 Jun 18
How to calculate the robust standard error of the dependent variable
Hi, folks
The summary of linmod shows the standard error of the coefficients. How can
we get the sd of y and the robust standard errors in R?
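A minimal base-R sketch of one common reading of "robust standard errors": White's heteroskedasticity-consistent (HC0) estimator, computed by hand. The data are simulated, and the choice of HC0 is an assumption, since the post does not say which estimator is meant.

```r
# Simulated heteroskedastic data (an assumption, not the poster's data)
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.5 + x)
linmod <- lm(y ~ x)

X <- model.matrix(linmod)
e <- residuals(linmod)

# Sandwich estimator: (X'X)^(-1) X' diag(e^2) X (X'X)^(-1)
bread <- solve(crossprod(X))
meat <- crossprod(X * e)          # each row of X is scaled by its residual
robust_se <- sqrt(diag(bread %*% meat %*% bread))

sd_y <- sd(y)                     # sample standard deviation of the response
print(robust_se)
print(sd_y)
```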
2010 Jun 21
How to predict the mean and variance of the dependent variable after regression
Hi, folks,
As seen in the following code:
y=rlnorm(10,mean=10)### Fake dataset
After the regression, I would like to know the mean of y. Since log(y) is
normal and y is lognormal, I need to know the mean and variance of log(y)
first. I tried mean(y) and mean(linmod), but neither one is what I want.
Any tips?
Thanks in
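A sketch of the lognormal calculation the post describes: if log(y) has mean mu and variance s2, then y has mean exp(mu + s2/2) and variance (exp(s2) - 1) * exp(2*mu + s2). The intercept-only model on the log scale is an assumption, since the poster's regression is not shown, and the sample size is enlarged for stability.

```r
set.seed(1)
y <- rlnorm(1000, meanlog = 10)          # fake data; larger n than in the post
linmod <- lm(log(y) ~ 1)                 # hypothetical intercept-only model on the log scale

mu <- unname(coef(linmod)[1])            # estimated mean of log(y)
s2 <- summary(linmod)$sigma^2            # estimated variance of log(y)

mean_y <- exp(mu + s2 / 2)                  # implied mean of y
var_y  <- (exp(s2) - 1) * exp(2 * mu + s2)  # implied variance of y
print(c(mean_y = mean_y, var_y = var_y))
```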
2009 Sep 14
Eliminate cases in a subset of a dataframe
Hi folks,
I created a subset of a dataframe (i.e., selected only men):
subdata <- subset(data,data$gender==1)
After a residual diagnostic of a regression analysis, I detected three
linmod <- lm(y ~ x, data=subdata)
Say, the cases 11,22, and 33 were outliers.
Here comes the problem: When I want to exclude these three cases in a
further regression analysis,
- for
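One way to drop the flagged cases, sketched under the assumption that the case numbers 11, 22 and 33 are row names carried over from the original data frame (subset() keeps the original row names, so they need not match row positions in subdata). The data below are made up.

```r
# Made-up data; gender is arranged so that rows 11, 22 and 33 are in the subset
set.seed(1)
data <- data.frame(gender = rep(c(1, 1, 1, 2), 15), x = rnorm(60))
data$y <- data$x + rnorm(60)
subdata <- subset(data, gender == 1)     # row names are kept from 'data'

# Exclude by row name rather than by position
outliers <- c("11", "22", "33")
cleaned <- subdata[!(rownames(subdata) %in% outliers), ]
linmod2 <- lm(y ~ x, data = cleaned)
print(nrow(subdata) - nrow(cleaned))
```

Indexing with subdata[-c(11, 22, 33), ] would instead remove the 11th, 22nd and 33rd rows of the subset, which are different cases.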
2008 Nov 20
Identify command in R
Hi all,
In using the identify command, I get the following message
> plot(hatvalues(scireg3))
> abline(h=.0154,lty=2) # plots a reference line at (k + 1)/n
> identify(1:1165, hatvalues(scireg3),row.names(sciach))
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
which doesn't allow me to see the observation number when I scroll over
with the mouse. What
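One possible cause of this error, offered as an assumption because the post is cut off: lm() drops rows with missing values by default, so hatvalues() can be shorter than the original data frame. The sketch below uses made-up data and takes the x positions and labels from the hat values themselves so that all lengths agree.

```r
# Made-up data with missing values; lm() drops incomplete rows by default
d <- data.frame(x = c(1:9, NA),
                y = c(2.1, 3.9, 6.2, 7.8, 10.1, NA, 14.2, 15.9, 18.1, 20.0))
fit <- lm(y ~ x, data = d)

h <- hatvalues(fit)
print(c(n_rows = nrow(d), n_hat = length(h)))   # differ when rows were dropped

# identify() needs an interactive graphics device, so guard the call
if (interactive()) {
  plot(h)
  identify(seq_along(h), h, labels = names(h))
}
```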
2007 Oct 19
In a SLR, Why Does the Hat Matrix Depend on the Weights?
I understand that the hat matrix is a function of the predictor variable
alone. So, in the following example why do the values on the diagonal of the
hat matrix change when I go from an unweighted fit to a weighted fit? Is the
function hatvalues giving me something other than what I think it is?
fit <- lm(short.velocity ~ blood.glucose)
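A short sketch of why the diagonal changes: for a weighted fit, hatvalues() corresponds to the diagonal of W^(1/2) X (X'WX)^(-1) X' W^(1/2), which depends on the weights as well as on the predictor. The data and weights below are simulated stand-ins, not the poster's short.velocity and blood.glucose values.

```r
# Simulated stand-in data and arbitrary positive weights (assumptions)
set.seed(1)
x <- runif(20, 5, 20)
y <- 1 + 0.02 * x + rnorm(20, sd = 0.2)
w <- runif(20, 0.5, 2)

fit_w <- lm(y ~ x, weights = w)
X <- model.matrix(fit_w)

# Weighted hat matrix: W^(1/2) X (X'WX)^(-1) X' W^(1/2)
Xw <- sqrt(w) * X                        # scale each row of X by sqrt(weight)
H_w <- Xw %*% solve(crossprod(Xw)) %*% t(Xw)

agree <- isTRUE(all.equal(unname(diag(H_w)), unname(hatvalues(fit_w))))
differ <- !isTRUE(all.equal(unname(hatvalues(lm(y ~ x))), unname(hatvalues(fit_w))))
print(c(agree = agree, differ = differ))
```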
2012 May 29
setting parameters equal in lm
Forgive me if this is a trivial question, but I couldn't find an answer
in earlier forum posts. I'm trying to reproduce some SAS results where they set
two parameters equal. For example:
y = b1X1 + b2X2 + b1X3
Notice that the variables X1 and X3 both have the same slope and the
intercept has been removed. How do I get an estimate of this regression
model? I know how to remove the intercept
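Since y = b1*X1 + b2*X2 + b1*X3 can be rewritten as y = b1*(X1 + X3) + b2*X2, one way to impose the constraint in lm() is to regress on the sum. A minimal sketch with simulated data (the true values 1.5 and 0.5 are made up):

```r
# Simulated data with a shared slope on X1 and X3 (values are assumptions)
set.seed(1)
n <- 200
X1 <- rnorm(n); X2 <- rnorm(n); X3 <- rnorm(n)
y <- 1.5 * X1 + 0.5 * X2 + 1.5 * X3 + rnorm(n, sd = 0.1)

# "0 +" removes the intercept; I() makes "+" mean arithmetic addition
fit <- lm(y ~ 0 + I(X1 + X3) + X2)
print(coef(fit))
```

The coefficient on I(X1 + X3) is the common estimate of b1.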
2008 Mar 10
Weighting data when running regressions
Dear R-Help,
I'm new to R and struggling with weighting data when I run regression. I've
tried to use search to solve my problem but haven't found anything helpful
so far.
I (successfully) import data from SPSS (15) and try to run a linear
regression on a subset of my data file where WEIGHT is the name of my
weighting variable (numeric), e.g.:
2003 Jul 12
Problem with library "car"
I am using the Unix version of R (version 1.7.0), installed via fink on a G4
Macintosh. I recently upgraded from version 1.6.0 and found that the "car"
library now has a problem:
---Begin transcript---
Attaching package 'car':
The following object(s) are masked from package:base :
dfbeta dfbeta.lm dfbetas dfbetas.lm hatvalues hatvalues.lm
2011 Mar 27
Sweave: include a multi-page-pdf plot
I'm just starting out with Sweave, and I can't get a plot(linmod) to
display all four plots:
<< bild >>=
x1 <- runif(100)
x2 <- rexp(100)
y <- 3 + 4*x1 + 5*x2 + rnorm(100)
mod <- lm(y~x1+x2)
Some Text
This plots only the first image of the four-page plot.lm() result.
I don't want to use
2009 Nov 05
Using a by() function to process several regression (lm()) functions
Thank you very much for looking at this. I am a "seasonal" user of R. I
teach my undergraduate and graduate students statistics using R and often find
myself trying to solve problems to process student collected data in an
efficient way.
In this case, I have a data.frame with multiple observations. These are gas
concentrations in a chamber and are used to measure into rates,
2006 Jan 12
Firth's bias correction for log-linear models
Dear R-Help List,
I'm trying to implement Firth's (1993) bias correction for log-linear models.
Firth (1993) states that such a correction can be implemented by supplementing
the data with a function of h_i, the diagonals from the hat matrix, but doesn't
provide further details. I can see that for a saturated log-linear model, h_i=1
for all i, hence one just adds 1/2 to each count,
2009 Feb 17
plot.lm: "Cook's distance" label can overplot point labels
The following code demonstrates an annoyance with plot.lm():
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)
OR try the following
xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)
The "Cook's distance" text overplots the label for the point with the
2006 Oct 24
Cook's Distance in GLM (PR#9316)
Hi Community,
I'm trying to reconcile Cook's Distances computed in glm. The
following snippet of code shows that the Cook's Distances contours on
the plot of Residuals v Leverage do not seem to be the same as the
values produced by cooks.distance() or in the Cook's Distance against
observation number plot.
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
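For orientation, a sketch of how Cook's distance for a GLM can be reproduced by hand in recent versions of R, using Pearson residuals r_i, hat values h_i, the dispersion and the model rank p: D_i = (r_i / (1 - h_i))^2 * h_i / (dispersion * p). The treatment factor and the Poisson model are assumptions taken from the example on the glm help page, since the post is cut off; behaviour at the time of the 2006 report may have differed.

```r
counts <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)                    # assumption: as in the ?glm example
fit <- glm(counts ~ outcome + treatment, family = poisson())

h <- hatvalues(fit)
rp <- residuals(fit, type = "pearson")
p <- fit$rank
phi <- summary(fit)$dispersion           # fixed at 1 for the Poisson family

# Manual Cook's distance, compared with cooks.distance()
cd_manual <- (rp / (1 - h))^2 * h / (phi * p)
agree <- isTRUE(all.equal(unname(cd_manual), unname(cooks.distance(fit))))
print(agree)
```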