Displaying 20 results from an estimated 10000 matches similar to: "Formula for whether hat value is influential?"
2010 Feb 21
1
tests for measures of influence in regression
influence.measures gives several measures of influence for each
observation (Cook's Distance, etc) and actually flags observations
that it determines are influential by any of the measures. Looks
good! But how does it discriminate between the influential and non-
influential observations by each of the measures? Like does it do a
Bonferroni-corrected t on the residuals identified by
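
A minimal sketch of how the hat-value flag can be reproduced by hand. It assumes the cutoff used internally by the stats package is a hat value greater than 3*k/n, with k coefficients and n observations; that cutoff is recalled from the package source rather than stated in this thread. The built-in `cars` data and the names `fit`, `hat_flag`, and `agree` are illustrative only. As far as I recall, the other columns use fixed cutoffs as well (|dfbetas| > 1, |dffits| > 3*sqrt(k/(n-k)), |1 - cov.r| > 3*k/(n-k), and pf(cook.d, k, n-k) > 0.5), not a Bonferroni-corrected test.

```r
# Sketch: reproduce the hat-value flag reported by influence.measures().
fit <- lm(dist ~ speed, data = cars)   # built-in example data, illustrative only
im  <- influence.measures(fit)

n <- nrow(cars)            # number of observations
k <- length(coef(fit))     # number of estimated coefficients
h <- hatvalues(fit)        # leverage (hat) values

# Assumed cutoff: a hat value is flagged when it exceeds 3 * k / n
hat_flag <- h > 3 * k / n

# Compare with the "hat" column of the logical matrix im$is.inf
agree <- identical(unname(hat_flag), unname(im$is.inf[, "hat"]))
print(agree)
which(hat_flag)            # indices of the flagged observations
```

If `agree` is TRUE on your installation, the assumed cutoff matches what `influence.measures` applies to the hat column.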
2005 Sep 13
4
plot(<lm>): new behavior in R-2.2.0 alpha
As some of you R-devel readers may know, the plot() method for
"lm" objects is based in large parts on contributions by John
Maindonald, subsequently "massaged" by me and other R-core
members.
In the statistics literature on applied regression, people have
had diverse opinions on what (and how many!) plots should be
used for goodness-of-fit / residual diagnostics, and to my
1999 Jun 23
1
Influence.measures
I am using rw0641 with Windows 98. To list just the influential
repetitions that result from "influence.measures", I am using the input
result <- lm(y~x)
and the code from the example in the help for "influence.measures"
INFLM <- function(result){
inflm <- influence.measures(result)
which(apply(inflm$is.inf,1,any))
}
It works fine up to now with the
2004 Feb 10
1
make check in 1.8.1.
I just (finally!!!) got R version 1.8.1 to configure and build under
Solaris 9 (after much travail; there were funnies in my environment
variables that mucked things up, but that's another story).
Anyhow, when I ran ``make check'' I got an error right toward the
end. Looking in the directory ``tests'' I found that the error was
associated with the file reg-tests-3.R, and the
2004 Mar 23
1
influence.measures, cooks.distance, and glm
Dear list,
I've noticed that influence.measures and cooks.distance give different
results for non-gaussian GLMs. For example, using R-1.9.0 alpha
(2003-03-17) under Windows:
> ## Dobson (1990) Page 93: Randomized Controlled Trial :
> counts <- c(18,17,15,20,10,20,25,13,12)
> outcome <- gl(3,1,9)
> treatment <- gl(3,3)
> glm.D93 <- glm(counts ~ outcome +
2011 Nov 07
2
ordination in vegan: what does downweight() do?
Can anyone point me in the right direction of figuring out what downweight()
is doing?
I am using vegan to perform CCA on diatom assemblage data. I have a lot of
rare species, so I want to reduce the influence of rare species in my CCA. I
have read that some authors reduce rare species by only including species
with an abundance of at least 1% in at least one sample (other authors use
5% as a
2005 Feb 11
1
cook's distance in weighted regression
I have a puzzle as to how R is computing Cook's distance in weighted linear
regression.
In this case Cook's distance should be given not as in the OLS case by
h_ii*r_i^2/(1-h_ii)^2 divided by k*s^2 (1)
(where r is the plain unadjusted residual, k is the number of parameters in the model,
etc.)
but rather by
w_ii*h_ii*r_i^2/(1-h_ii)^2 divided by k*s^2,
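
A small sketch that checks the weighted form numerically against `cooks.distance()`. The simulated data and the names `fit`, `w`, and `d_manual` are assumptions made for illustration. Here h_ii is taken from the weighted fit and s^2 is the weighted residual variance, which is my reading of the formula, not something the thread confirms.

```r
# Sketch: hand-computed Cook's distance for a weighted linear model.
set.seed(1)
n <- 30
x <- runif(n)
w <- runif(n, 0.5, 2)                      # hypothetical positive case weights
y <- 1 + 2 * x + rnorm(n) / sqrt(w)

fit <- lm(y ~ x, weights = w)

h  <- hatvalues(fit)                       # hat values from the weighted fit
r  <- residuals(fit)                       # plain (unweighted) residuals
k  <- length(coef(fit))                    # number of parameters
s2 <- sum(w * r^2) / df.residual(fit)      # weighted residual variance

# w_ii * h_ii * r_i^2 / (1 - h_ii)^2, divided by k * s^2
d_manual <- w * h * r^2 / ((1 - h)^2 * k * s2)

all.equal(unname(d_manual), unname(cooks.distance(fit)))
```

The final comparison shows whether the weighted formula agrees with what R computes for this fit.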
2010 Aug 03
4
Remove level with zero observations
I have a column with 2 levels, but one level has no remaining
observations. Can I remove the level?
Had intended to do it as listed below, but soon realized that even though
there are no observations, the level is still there.
For instance
summary(dbs3.train.sans.influential.obs$HAC)
yields
   0    1
4685    0
nlevels(dbs3.train.sans.influential.obs$HAC)
yields
[1] 2
drop.list <- NULL
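
One common way to drop an empty level, sketched on a hypothetical stand-in for the HAC column (the vector `hac` below is invented for illustration and is not the poster's data). It assumes a version of R that provides `droplevels()`; re-applying `factor()` is an alternative that does not depend on it.

```r
# Hypothetical stand-in for the HAC column: two levels, one of them empty
hac <- factor(c("0", "0", "0"), levels = c("0", "1"))
nlevels(hac)                       # the empty level "1" is still counted

hac_dropped <- droplevels(hac)     # drops levels that have no observations
nlevels(hac_dropped)

hac_refactored <- factor(hac)      # re-applying factor() also drops unused levels
levels(hac_refactored)
```

For a whole data frame, `droplevels()` can also be applied to the data frame itself, which drops unused levels from every factor column.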
2004 Sep 12
2
Variable Importance in pls: R or B? (and in glpls?)
Dear R-users, dear Ron
I use pls from the pls.pcr package for classification. Since I need to
know which variables are most influential for the classification
performance, what criteria shall I look at:
a) B, the array of regression coefficients for a certain model (means a
certain number of latent variables) (and: squared or absolute values?)
OR
b) the weight matrix RR (or R in the De
2006 Jan 10
2
standardized residuals (rstandard & plot.lm) (PR#8468)
This bug is not quite fixed - the example from my original report now
works using R-2.2.1, but
plot(Uniform, 6)
does not. The bug is due to
if (show[6]) {
ymx <- max(cook, na.rm = TRUE) * 1.025
g <- hatval/(1 - hatval) # Potential division by zero here #
plot(g, cook, xlim = c(0, max(g)), ylim = c(0, ymx),
main = main, xlab =
2010 Feb 28
1
Gradient Boosting Trees with correlated predictors in gbm
Dear R users,
I’m trying to understand how correlated predictors impact the Relative
Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman
described “…with single decision trees (referring to Breiman’s CART
algorithm), the relative importance measure is augmented by a strategy
involving surrogate splits intended to uncover the masking of influential
variables by others
2016 Jul 27
3
[RFC] One or many git repositories?
On 7/27/2016 12:17 PM, Chris Bieneman wrote:
>
> This is a really bad argument for large influential changes like this.
Quite the contrary---anybody can participate and anybody can express
their concerns, explain their goals, their workflow, etc. For large
influential changes like this, "zoning out" is a poor choice of action.
> I suspect this is why the idea of having a
2010 May 05
2
OLS Regression diagnostic measures check list - what to consider?
Hello dear R help list,
I wish to compile a check-list for diagnostic measures for OLS regression.
My question:
Can you offer more (or newer) tests/measures for the validity of a linear
model than what is given here:
http://www.statmethods.net/stats/rdiagnostics.html
This resource gives a list of measures to test for:
OUTLIERS, INFLUENTIAL OBSERVATIONS, NON-NORMALITY, NON-CONSTANT ERROR
2011 Jan 27
1
Minor typo in influence.measures.Rd ?
Dear list,
There is, I believe, a minor typo in the example section of
influence.measures.Rd. In the final example the word `does` appears
where I suspect `dose` is required:
I couldn't remember exactly what format patches should be in, so here is
one as diff would produce:
Index: devel/src/library/stats/man/influence.measures.Rd
2011 Jan 17
1
Problem about for loop
Hi everyone, my function looks like this:
e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10 #influential observation
data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
data.y
2011 Feb 08
2
Ken Olsen of DEC, 1927-2011
A lot of us wouldn't be here without him. DEC made good, really reliable
hardware.
mark
<http://www.networkworld.com/news/2011/020711-kenneth-olsen-dec-obit.html>
2007 Jul 21
1
Gamma MLE
Hello,
I was asked to try the following code on R,
gamma.mles
function (xx,shape0,rate0)
{
n<- length(xx)
xbar<- mean(xx)
logxbar<- mean(log(xx))
theta<-c(shape0,rate0)
repeat {
theta0<- theta
shape<- theta0[1]
rate<- theta0[2]
S<- n*matrix(c(log(rate)-digamma(shape)+logxbar,shape/rate-xbar),ncol=1)
I<- n*matrix(c(trigamma(shape),-1/rate,-1/rate,shape/rate^2),ncol=2)
2009 Mar 31
1
CV and GCV for finding smoothness parameter
I received an assignment that I have to do in R, but I'm really not very
good at it.
The task is the following:
http://www.nabble.com/file/p22804957/question8.jpg
To do this, we also get the following pieces of code (not in correct order):
http://www.nabble.com/file/p22804957/hints.jpg
I'm terrible at this and I'm completely stuck. The model I chose can be
found in here:
2011 Mar 20
2
Why unique(sample) decreases the performance ?
Hi,
I am interested in how the results differ when a sample contains all of its
elements versus only its distinct elements. With all elements the run takes
about 120 sec., but with only distinct elements it takes about 4.5 to 5 times
longer. I expected the opposite, because unique(sample) has fewer elements
than the full sample. Code
as
2017 Nov 19
2
Changing logarithms
Hi!
I'm using a large panel data set, and now I have faced some difficulties with
my analysis. The predictors are not normally distributed and there are
quite many outliers (some of them are influential though).
I have tried to change the logarithm, but I'm not sure how to do that. I
also want to draw a plot in which the logarithms of predictors x and y are
changed. How could I do that?