Displaying 20 results from an estimated 300 matches similar to: "Cook's distance"
2009 Nov 08
2
influence.measures(stats): hatvalues(model, ...)
Hello:
I am trying to understand the method 'hatvalues(...)', which returns something similar to the diagonals of the plain vanilla hat matrix [X(X'X)^(-1)X'], but not quite.
A Fortran programmer I am not, but tracing through the code it looks like perhaps some sort of correction based on the notion of 'leave-one-out' variance is being applied.
Whatever the
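A minimal sketch (not part of the original post) comparing hatvalues() with the hat-matrix diagonal computed directly from model.matrix(); the data are made up for illustration. For an ordinary unweighted lm the two agree exactly, so any discrepancy usually comes from weights or a different model class.
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(20)
fit <- lm(y ~ x1 + x2, data = d)
X <- model.matrix(fit)
h_manual <- diag(X %*% solve(crossprod(X)) %*% t(X))    # diag of X(X'X)^(-1)X'
all.equal(unname(hatvalues(fit)), unname(h_manual))     # TRUE for an unweighted lm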
2011 Oct 18
1
cygwin warning when creating a package in Windows
Dear All,
I am a beginner creating R packages. I followed the Leisch (2009) tutorial
and the document "Writing R Extensions" to write an example.
I installed R 2.12.2 (I also tried R 2.13.2), the latest version of Rtools and
the recommended packages on a PC with Windows 7 Home Premium.
I can run R CMD INSTALL linmod at the command prompt, and also R CMD check
linmod. The following outputs are
2009 Oct 26
2
What is the most efficient practice to develop an R package?
I am reading Section 5 and 6 of
http://cran.r-project.org/doc/contrib/Leisch-CreatingPackages.pdf
It seems that I have to do the following two steps in order to make an
R package. But when I am testing the package, these two steps will
run many times, which may take a lot of time. So while I am still developing
the package, shall I always source('linmod.R') to test it? Once the
code in
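A rough sketch of the quicker loop most people use while developing (the paths and calls below are illustrative, not taken from the tutorial): source() the R file for interactive testing, and only rebuild and check the whole package once the code has settled.
source("linmod/R/linmod.R")        # reload the package functions after each edit
# ... exercise the functions interactively on a small test case here ...
# Rebuild and check only occasionally, from the shell:
#   R CMD build linmod
#   R CMD check linmod_1.0.tar.gz   # tarball name depends on the package version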
2012 Feb 09
1
passing an extra argument to an S3 generic
I'm trying to write some functions extending influence measures to
multivariate linear models and also
allow subsets of size m>=1 to be considered for deletion diagnostics.
I'd like these to work roughly parallel
to those functions for the univariate lm where only single case deletion
(m=1) diagnostics are considered.
Corresponding to stats::hatvalues.lm, the S3 method for class
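A hypothetical sketch (names are illustrative, not the poster's code) of how an S3 generic can carry an extra argument such as a subset size m: declare the generic with ..., give the method an explicit m formal, and dispatch takes care of the rest. The m > 1 branch below is only a placeholder to show the argument flowing through, not a real subset-deletion diagnostic.
hatvalues2 <- function(model, ...) UseMethod("hatvalues2")

hatvalues2.lm <- function(model, m = 1, ...) {
  h <- hatvalues(model)                      # single-case leverages from stats
  if (m == 1) return(h)
  idx <- combn(length(h), m)                 # all subsets of size m
  apply(idx, 2, function(i) sum(h[i]))       # toy aggregate, for illustration only
}

fit <- lm(dist ~ speed, data = cars)
head(hatvalues2(fit))          # m = 1: ordinary leverages
head(hatvalues2(fit, m = 2))   # extra argument reaches the method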
2010 Apr 27
3
Problem calculating multiple regressions on a data frame.
Hi there,
I am stuck trying to solve what should be a fairly easy problem.
I have a data frame that essentially consists of (ID, time as seqMonth,
variable, value) and I want to find the regression coefficient of value vs
time for each combination of ID and variable.
I have tried several approaches and none of them seems to work as I
expected.
For example, I have tried:
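One common pattern (a sketch with made-up data, column names following the description above) is to let by() split the data frame on ID and variable and fit value ~ time within each piece, keeping only the slope.
set.seed(2)
df <- expand.grid(ID = c("a", "b"), variable = c("v1", "v2"), seqMonth = 1:12)
df$value <- rnorm(nrow(df))

slopes <- by(df, list(df$ID, df$variable), function(d)
  coef(lm(value ~ seqMonth, data = d))[["seqMonth"]])   # slope of value vs time
slopes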
2006 Jan 11
1
updating formula inside function
Dear R-Helpers
Given a function like
foo <- function(data, var1, var2, var3) {
  # build the formula var1 ~ var2 + var3 from the column names
  f <- reformulate(c(var2, var3), response = var1)
  linmod <- lm(f, data = data)   # fit against the supplied data frame
  return(linmod)
}
By typing
foo(mydata,'a','b','c')
I get the result of the linear model a~b+c.
How can I rewrite the function so that the formula can be updated inside
the function,
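A hypothetical sketch of one way to do this, assuming the goal is to refit a reduced formula inside the function (names are illustrative): keep data= in the lm() call so that update() can find the variables in the function's own frame.
foo2 <- function(data, var1, var2, var3) {
  f <- reformulate(c(var2, var3), response = var1)
  linmod <- lm(f, data = data)
  # update to a reduced formula (drop var3) without rebuilding the call by hand
  reduced <- update(linmod, reformulate(var2, response = var1))
  list(full = linmod, reduced = reduced)
}

foo2(mtcars, "mpg", "wt", "hp")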
2009 Mar 05
1
hatvalues?
I am struggling a bit with this function 'hatvalues'. I would like a little more understanding than taking the black box and using the values. I looked at the Fortran source and it is quite opaque to me. So I am asking for some help in understanding the theory. First, I take the simplest case of a single variable. For this I turn to John Fox's book, "Applied Regression Analysis
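For the single-predictor case mentioned above, the leverage has the standard closed form h_i = 1/n + (x_i - mean(x))^2 / sum((x - mean(x))^2). A small sketch with made-up numbers checks it against hatvalues():
x <- c(1.2, 2.5, 3.1, 4.8, 6.0)
y <- c(2.0, 2.9, 3.7, 5.1, 6.2)
fit <- lm(y ~ x)
h_formula <- 1/length(x) + (x - mean(x))^2 / sum((x - mean(x))^2)
all.equal(unname(hatvalues(fit)), h_formula)   # TRUE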
2010 Jun 18
1
How to calculate the robust standard error of the dependent variable
Hi, folks
linmod <- lm(y ~ x + z)
summary(linmod)
The summary of linmod shows the standard errors of the coefficients. How can
we get the sd of y and the robust standard errors in R?
Thanks!
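A sketch of the usual route, assuming the sandwich and lmtest add-on packages are available (data made up, names follow the post): sd(y) gives the standard deviation of the response, and heteroscedasticity-consistent ("robust") standard errors for the coefficients come from vcovHC() fed into coeftest().
set.seed(3)
x <- rnorm(50); z <- rnorm(50); y <- 1 + 2 * x - z + rnorm(50)
linmod <- lm(y ~ x + z)

sd(y)                                                    # sample sd of the response
library(sandwich)
library(lmtest)
coeftest(linmod, vcov. = vcovHC(linmod, type = "HC1"))   # robust (HC1) coefficient SEs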
2010 Jun 21
2
How to predict the mean and variance of the dependent variable after regression
Hi, folks,
As seen in the following codes:
x1 <- rlnorm(10)
x2 <- rlnorm(10, mean = 2)
y  <- rlnorm(10, mean = 10)   ### Fake dataset
linmod <- lm(log(y) ~ log(x1) + log(x2))
After the regression, I would like to know the mean of y. Since log(y) is
normal and y is lognormal, I need to know the mean and variance of log(y)
first. I tried mean(y) and mean(linmod), but neither one is what I want.
Any tips?
Thanks in
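If log(y) is taken to be normal with mean mu and variance sigma^2, the implied lognormal moments are E[y] = exp(mu + sigma^2/2) and Var[y] = (exp(sigma^2) - 1) * exp(2*mu + sigma^2). A sketch along the lines of the fake data above, using the fitted values as the conditional mean of log(y) and the residual standard error as sigma:
x1 <- rlnorm(10); x2 <- rlnorm(10, mean = 2); y <- rlnorm(10, mean = 10)
linmod <- lm(log(y) ~ log(x1) + log(x2))

mu_hat    <- fitted(linmod)              # conditional mean of log(y), per observation
sigma_hat <- summary(linmod)$sigma       # residual sd of log(y)
Ey_hat   <- exp(mu_hat + sigma_hat^2 / 2)                           # implied mean of y
Vary_hat <- (exp(sigma_hat^2) - 1) * exp(2 * mu_hat + sigma_hat^2)  # implied variance
head(cbind(Ey_hat, Vary_hat))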
2009 Sep 14
3
Eliminate cases in a subset of a dataframe
Hi folks,
I created a subset of a dataframe (i.e., selected only men):
subdata <- subset(data,data$gender==1)
After a residual diagnostic of a regression analysis, I detected three
outliers:
linmod <- lm(y ~ x, data=subdata)
plot(linmod)
Say, the cases 11, 22, and 33 were outliers.
Here comes the problem: When I want to exclude these three cases in a
further regression analysis,
- for
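One detail worth keeping in mind: because subdata was built with subset(), its row names are inherited from the original data frame, so the labels plot.lm prints are row names rather than positions. A sketch with made-up data (11, 22, 33 stand in for the flagged cases):
set.seed(4)
subdata <- data.frame(x = rnorm(40))
subdata$y <- 1 + 2 * subdata$x + rnorm(40)

keep <- !(rownames(subdata) %in% c("11", "22", "33"))   # match row *names*, not positions
linmod2 <- lm(y ~ x, data = subdata[keep, ])
# equivalently: lm(y ~ x, data = subdata, subset = keep)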
2008 Nov 20
2
Identify command in R
Hi all,
In using the identify command, I get the following message
> plot(hatvalues(scireg3))
> abline(h=.0154,lty=2) # plots a reference line at (k + 1)/n
> identify(1:1165, hatvalues(scireg3),row.names(sciach))
Error in xy.coords(x, y) : 'x' and 'y' lengths differ
which doesn't allow me to see the observation number when I scroll over
with the mouse. What
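The error comes from the first two arguments to identify() having different lengths (1:1165 versus however many leverages the fitted model actually returns, e.g. after rows with NAs are dropped). A sketch with a toy model (scireg3 and sciach are not reproducible here) builds the x coordinates from the leverages themselves so the lengths always match:
fit <- lm(dist ~ speed, data = cars)
h <- hatvalues(fit)
plot(h)
abline(h = 2 * mean(h), lty = 2)                 # a common rough cut-off, 2(k + 1)/n
# identify(seq_along(h), h, labels = names(h))   # interactive: click points, then finish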
2007 Oct 19
2
In a SLR, Why Does the Hat Matrix Depend on the Weights?
I understand that the hat matrix is a function of the predictor variable
alone. So, in the following example why do the values on the diagonal of the
hat matrix change when I go from an unweighted fit to a weighted fit? Is the
function hatvalues giving me something other than what I think it is?
library(ISwR)
data(thuesen)
attach(thuesen)
fit <- lm(short.velocity ~ blood.glucose)
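A hedged sketch (made-up data rather than thuesen) of why the leverages move: under weighted least squares the hat matrix is H = X (X'WX)^(-1) X'W, so its diagonal depends on the weights, and hatvalues() reproduces exactly that diagonal.
set.seed(5)
x <- rnorm(15); y <- 1 + x + rnorm(15); w <- runif(15, 0.5, 2)
wfit <- lm(y ~ x, weights = w)

X <- model.matrix(wfit)
W <- diag(w)
h_wls <- diag(X %*% solve(t(X) %*% W %*% X) %*% t(X) %*% W)
all.equal(unname(hatvalues(wfit)), unname(h_wls))   # TRUE: leverages change with the weights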
2012 May 29
2
setting parameters equal in lm
Forgive me if this is a trivial question, but I couldn't find an answer
in earlier threads. I'm trying to reproduce some SAS results where they set
two parameters equal. For example:
y = b1X1 + b2X2 + b1X3
Notice that the variables X1 and X3 both have the same slope and the
intercept has been removed. How do I get an estimate of this regression
model? I know how to remove the intercept
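One common device (a sketch with made-up data, since the original variables are not shown): give X1 and X3 a single shared slope by summing them inside I(), and drop the intercept with - 1.
set.seed(6)
X1 <- rnorm(30); X2 <- rnorm(30); X3 <- rnorm(30)
y  <- 2 * (X1 + X3) + 3 * X2 + rnorm(30)

fit_eq <- lm(y ~ I(X1 + X3) + X2 - 1)
coef(fit_eq)   # first coefficient is the common slope b1, second is b2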
2008 Mar 10
3
Weighting data when running regressions
Dear R-Help,
I'm new to R and struggling with weighting data when I run regressions. I've
tried searching for a solution but haven't found anything helpful
so far.
I (successfully) import data from SPSS (15) and try to run a linear
regression on a subset of my data file where WEIGHT is the name of my
weighting variable (numeric), e.g.:
library(foreign)
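A sketch of the usual call, assuming the SPSS file has been read into a data frame with a numeric WEIGHT column (file name and variable names are illustrative): the weights go straight into lm()'s weights argument. (For design-based survey inference the survey package is the more rigorous route.)
library(foreign)
# mydata <- read.spss("mysurvey.sav", to.data.frame = TRUE)   # hypothetical file
mydata <- data.frame(y = rnorm(100), x = rnorm(100),
                     WEIGHT = runif(100, 0.5, 2))              # stand-in data
wfit <- lm(y ~ x, data = mydata, weights = WEIGHT)
summary(wfit)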
2003 Jul 12
1
Problem with library "car"
I am using the Unix version of R (version 1.7.0), installed via fink on a G4
Macintosh. I recently upgraded from version 1.6.0 and found that the "car"
library now has a problem:
---Begin transcript---
>library(car)
Attaching package 'car':
The following object(s) are masked from package:base :
dfbeta dfbeta.lm dfbetas dfbetas.lm hatvalues hatvalues.lm
2011 Mar 27
1
Sweave: include a multi-page-pdf plot
Hi,
I'm just starting out with Sweave, and I can't get plot() of a fitted lm to
display all four plots:
<< bild >>=
x1 <- runif(100)
x2 <- rexp(100)
y <- 3 + 4*x1 + 5*x2 + rnorm(100)
mod <- lm(y~x1+x2)
plot(mod)
@
Some Text
<<fig=TRUE>>=
<<bild>>
@
This plots only the first image of the four-page plot.lm() result.
I don't want to use
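One workaround (a sketch, not from the original post): plot.lm() produces four separate pages and a fig=TRUE chunk captures only the first, so put all four panels on a single page with par(mfrow = c(2, 2)) inside the figure chunk (or write four chunks using plot(mod, which = 1) through which = 4).
<<fig=TRUE>>=
x1 <- runif(100); x2 <- rexp(100)
y <- 3 + 4*x1 + 5*x2 + rnorm(100)
mod <- lm(y ~ x1 + x2)
par(mfrow = c(2, 2))   # 2 x 2 layout: all four plot.lm() panels on one page
plot(mod)
@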
2009 Nov 05
2
Using a by() function to process several regression (lm()) functions
Hello,
Thank you very much for looking at this. I am a "seasonal" user of R. I
teach statistics to my undergraduate and graduate students using R and often
find myself trying to process student-collected data in an efficient way.
In this case, I have a data.frame with multiple observations. These are gas
concentrations in a chamber and are used to measure into rates,
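A sketch in the same spirit (made-up chamber data, column names illustrative): keep one lm fit per chamber with split() and lapply(), then pull the slope out of each fit as the estimated rate.
set.seed(7)
dat <- data.frame(chamber = rep(c("A", "B", "C"), each = 6),
                  minutes = rep(seq(0, 25, by = 5), 3))
dat$conc <- 400 + 2 * dat$minutes + rnorm(nrow(dat), sd = 3)

fits  <- lapply(split(dat, dat$chamber), function(d) lm(conc ~ minutes, data = d))
rates <- sapply(fits, function(m) coef(m)[["minutes"]])   # concentration change per minute
rates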
2006 Jan 12
1
Firth's bias correction for log-linear models
Dear R-Help List,
I'm trying to implement Firth's (1993) bias correction for log-linear models.
Firth (1993) states that such a correction can be implemented by supplementing
the data with a function of h_i, the diagonals from the hat matrix, but doesn't
provide further details. I can see that for a saturated log-linear model, h_i=1
for all i, hence one just adds 1/2 to each count,
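A very rough one-step sketch of that idea (not a full Firth fit, which iterates the adjustment; packages such as brglm2 implement the proper bias-reduced fit): take h_i from the fitted Poisson log-linear model and refit with counts + h_i/2. For a saturated model the h_i are all 1, so this reduces to adding 1/2 to each cell, as noted above. (glm() warns about non-integer counts but still fits.)
counts <- c(10, 3, 2, 15)                           # made-up 2 x 2 table
a <- gl(2, 2); b <- gl(2, 1, 4)
fit_sat <- glm(counts ~ a * b, family = poisson)    # saturated log-linear model
hatvalues(fit_sat)                                  # numerically all equal to 1
fit_adj <- glm(counts + hatvalues(fit_sat)/2 ~ a * b, family = poisson)
coef(fit_adj)                                       # same as refitting with counts + 0.5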
2009 Feb 17
1
plot.lm: "Cook's distance" label can overplot point labels
The following code demonstrates an annoyance with plot.lm():
library(DAAGxtras)
x11(width=3.75, height=4)
nihills.lm <- lm(log(time) ~ log(dist) + log(climb), data = nihills)
plot(nihills.lm, which=5)
Or try the following:
xy <- data.frame(x=c(3,1:5), y=c(-2, 1:5))
plot(lm(y ~ x, data=xy), which=5)
The "Cook's distance" text overplots the label for the point with the
2006 Oct 24
1
Cook's Distance in GLM (PR#9316)
Hi Community,
I'm trying to reconcile Cook's distances computed in glm. The
following snippet of code shows that the Cook's distance contours on
the Residuals vs Leverage plot do not seem to match the
values produced by cooks.distance() or the Cook's distance against
observation number plot.
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
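For comparison, a sketch using the Dobson (1990) example from ?glm, which the truncated snippet above appears to be quoting: cooks.distance() gives the values behind the which = 4 plot, and which = 5 draws the Residuals vs Leverage panel with its Cook's distance contours.
counts  <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome <- gl(3, 1, 9)
treatment <- gl(3, 3)
gfit <- glm(counts ~ outcome + treatment, family = poisson)

cooks.distance(gfit)      # values plotted by plot(gfit, which = 4)
plot(gfit, which = 4)     # Cook's distance against observation number
plot(gfit, which = 5)     # Residuals vs Leverage with Cook's distance contours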