similar to: Formula for whether hat value is influential?

Displaying 20 results from an estimated 10000 matches similar to: "Formula for whether hat value is influential?"

2010 Feb 21
1
tests for measures of influence in regression
influence.measures gives several measures of influence for each observation (Cook's distance, etc.) and actually flags observations that it determines are influential by any of the measures. Looks good! But how does it discriminate between influential and non-influential observations for each of the measures? For example, does it do a Bonferroni-corrected t-test on the residuals identified by
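The per-measure cutoffs are documented in ?influence.measures; a minimal sketch of seeing which rows get flagged (the model and data here are illustrative, not from the post):

    fit <- lm(dist ~ speed, data = cars)
    im  <- influence.measures(fit)
    summary(im)                       # prints only the observations flagged as influential
    which(apply(im$is.inf, 1, any))   # row indices flagged by any measure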
2005 Sep 13
4
plot(<lm>): new behavior in R-2.2.0 alpha
As some of you R-devel readers may know, the plot() method for "lm" objects is based in large part on contributions by John Maindonald, subsequently "massaged" by me and other R-core members. In the statistics literature on applied regression, people have had diverse opinions on what (and how many!) plots should be used for goodness-of-fit / residual diagnostics, and to my
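A minimal sketch of the diagnostic plots under discussion (the model is illustrative; since R 2.2.0, plot.lm offers six plot types via the `which` argument):

    fit <- lm(dist ~ speed, data = cars)
    par(mfrow = c(2, 3))
    plot(fit, which = 1:6)   # residuals vs fitted, Q-Q, scale-location,
                             # Cook's distance, and the two leverage plots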
1999 Jun 23
1
Influence.measures
I am using rw0641 with Windows 98. To list just the influential observations that result from "influence.measures", I am using the input result <- lm(y ~ x) and the code from the example in the help for "influence.measures":

    INFLM <- function(result) {
        inflm <- influence.measures(result)
        which(apply(inflm$is.inf, 1, any))
    }

It works fine up to now with the
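Hypothetical usage of the INFLM helper above (the model and data are made up for illustration):

    result <- lm(dist ~ speed, data = cars)
    INFLM(result)   # indices of the observations flagged by any influence measure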
2004 Feb 10
1
make check in 1.8.1.
I just (finally!!!) got R version 1.8.1 to configure and build under Solaris 9 (after much travail; there were funnies in my environment variables that mucked things up, but that's another story). Anyhow, when I ran ``make check'' I got an error right toward the end. Looking in the directory ``tests'' I found that the error was associated with the file reg-tests-3.R, and the
2004 Mar 23
1
influence.measures, cooks.distance, and glm
Dear list, I've noticed that influence.measures and cooks.distance give different results for non-gaussian GLMs. For example, using R-1.9.0 alpha (2003-03-17) under Windows:

    > ## Dobson (1990) Page 93: Randomized Controlled Trial :
    > counts <- c(18,17,15,20,10,20,25,13,12)
    > outcome <- gl(3,1,9)
    > treatment <- gl(3,3)
    > glm.D93 <- glm(counts ~ outcome +
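This is the Dobson example from ?glm; completing the truncated call (my assumption) and putting the two quantities side by side, continuing the definitions above:

    glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())
    cooks.distance(glm.D93)                         # generic, via the GLM approximation
    influence.measures(glm.D93)$infmat[, "cook.d"]  # the column being compared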
2011 Nov 07
2
ordination in vegan: what does downweight() do?
Can anyone point me in the right direction for figuring out what downweight() is doing? I am using vegan to perform CCA on diatom assemblage data. I have a lot of rare species, so I want to reduce the influence of rare species in my CCA. I have read that some authors reduce rare species by only including species with an abundance of at least 1% in at least one sample (other authors use 5% as a
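A sketch assuming vegan's downweight() (see ?downweight) is what's wanted: it shrinks the abundances of species rarer than 1/fraction of the frequency of the commonest species, before the ordination is fitted.

    library(vegan)
    data(varespec)                                 # example data shipped with vegan
    data(varechem)
    spec.dw <- downweight(varespec, fraction = 5)  # downweight the rare species
    mod <- cca(spec.dw ~ Al + P + K, data = varechem)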
2005 Feb 11
1
cook's distance in weighted regression
I have a puzzle as to how R is computing Cook's distance in weighted linear regression. In this case Cook's distance should be given not as in the OLS case by

    D_i = h_ii * r_i^2 / ((1 - h_ii)^2 * k * s^2)        (1)

(where r_i is the plain unadjusted residual, k is the number of parameters in the model, etc.) but rather by

    D_i = w_i * h_ii * r_i^2 / ((1 - h_ii)^2 * k * s^2),
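A minimal numerical check of the weighted formula (my own sketch; R's cooks.distance() works with weighted residuals internally):

    set.seed(1)
    x <- runif(30); w <- runif(30, 0.5, 2)
    y <- 1 + 2 * x + rnorm(30)
    fit <- lm(y ~ x, weights = w)
    h  <- lm.influence(fit)$hat             # hat values of the weighted fit
    r  <- residuals(fit)                    # plain unadjusted residuals
    s2 <- deviance(fit) / df.residual(fit)  # weighted residual variance
    k  <- length(coef(fit))                 # number of parameters
    D  <- w * h * r^2 / ((1 - h)^2 * k * s2)
    all.equal(unname(D), unname(cooks.distance(fit)))   # TRUE if formula (2) is what R uses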
2010 Aug 03
4
REmove level with zero observations
If I have a column with 2 levels, but one level has no remaining observations, can I remove the level? I had intended to do it as listed below, but soon realized that even though there are no observations, the level is still there. For instance:

    summary(dbs3.train.sans.influential.obs$HAC)
    #    0    1
    # 4685    0

    nlevels(dbs3.train.sans.influential.obs$HAC)
    # [1] 2

    drop.list <- NULL
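A sketch of the usual fix (not from the thread): re-apply factor(), or use droplevels() in R >= 2.12.0, to drop levels with no remaining observations.

    f <- factor(c("0", "0", "1"))[1:2]   # subsetting leaves the unused level "1"
    nlevels(f)        # 2
    f <- factor(f)    # re-levelling drops the empty level (or: droplevels(f))
    nlevels(f)        # 1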
2004 Sep 12
2
Variable Importance in pls: R or B? (and in glpls?)
Dear R-users, dear Ron, I use pls from the pls.pcr package for classification. Since I need to know which variables are most influential on the classification performance, which criterion should I look at: a) B, the array of regression coefficients for a certain model (meaning a certain number of latent variables) (and: squared or absolute values?), OR b) the weight matrix RR (or R in the De
2006 Jan 10
2
standardized residuals (rstandard & plot.lm) (PR#8468)
This bug is not quite fixed - the example from my original report now works using R-2.2.1, but plot(Uniform, 6) does not. The bug is due to

    if (show[6]) {
        ymx <- max(cook, na.rm = TRUE) * 1.025
        g <- hatval/(1 - hatval)   # Potential division by zero here
        plot(g, cook, xlim = c(0, max(g)), ylim = c(0, ymx),
             main = main, xlab =
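A self-contained illustration of why plot(<lm>, 6) fails there (my illustration, not the actual patch): a hat value of exactly 1 makes g infinite, so xlim = c(0, max(g)) is invalid.

    hatval <- c(0.2, 0.5, 1)      # leverage of exactly 1 for the third point
    g <- hatval / (1 - hatval)    # -> Inf for that point
    max(g)                        # Inf, which breaks the xlim computation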
2010 Feb 28
1
Gradient Boosting Trees with correlated predictors in gbm
Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described, “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others
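A small sketch of the masking question (simulated data, my own setup; assumes the gbm package):

    library(gbm)
    set.seed(1)
    d <- data.frame(x1 = rnorm(500))
    d$x2 <- d$x1 + rnorm(500, sd = 0.1)   # x2 is nearly a copy of x1
    d$y  <- d$x1 + rnorm(500)             # only x1 drives the response
    fit <- gbm(y ~ x1 + x2, data = d, distribution = "gaussian",
               n.trees = 500, interaction.depth = 2)
    summary(fit)   # relative influence is typically split between x1 and x2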
2016 Jul 27
3
[RFC] One or many git repositories?
On 7/27/2016 12:17 PM, Chris Bieneman wrote: > > This is a really bad argument for large influential changes like this. Quite the contrary---anybody can participate and anybody can express their concerns, explain their goals, their workflow, etc. For large influential changes like this, "zoning out" is a poor choice of action. > I suspect this is why the idea of having a
2010 May 05
2
OLS Regression diagnostic measures check list - what to consider?
Hello dear R help list, I wish to compile a check-list of diagnostic measures for OLS regression. My question: can you offer more (or newer) tests/measures for the validity of a linear model than what is given here: http://www.statmethods.net/stats/rdiagnostics.html This resource gives a list of measures to test for: OUTLIERS, INFLUENTIAL OBSERVATIONS, NON-NORMALITY, NON-CONSTANT ERROR
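A sketch of base-R checks for the four categories named (the car package adds formal tests; the model and data here are illustrative):

    fit <- lm(mpg ~ wt + hp, data = mtcars)
    rstudent(fit)                             # outliers: studentized residuals
    summary(influence.measures(fit))          # influential observations
    shapiro.test(residuals(fit))              # non-normality of residuals
    plot(fitted(fit), abs(residuals(fit)))    # non-constant error variance (eyeball check)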
2011 Jan 27
1
Minor typo in influence.measures.Rd ?
Dear list, There is, I believe, a minor typo in the example section of influence.measures.Rd. In the final example the word `does` appears where I suspect `dose` is required. I couldn't remember exactly what format patches should be in, so here is one as diff would produce:

    Index: devel/src/library/stats/man/influence.measures.Rd
2011 Jan 17
1
Problem about for loop
Hi everyone, my function is like:

    e  <- rnorm(n = 50, mean = 0, sd = sqrt(0.5625))
    x0 <- rep(1, 50)
    x1 <- rnorm(n = 50, mean = 2, sd = 1)
    x2 <- rnorm(n = 50, mean = 2, sd = 1)
    x3 <- rnorm(n = 50, mean = 2, sd = 1)
    x4 <- rnorm(n = 50, mean = 2, sd = 1)
    y <- 1 + 2*x1 + 4*x2 + 3*x3 + 2*x4 + e
    x2[1] <- 10   # influential observation
    y[1]  <- 10   # influential observation
    data.x <- matrix(c(x0, x1, x2, x3, x4), ncol = 5)
    data.y
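Not from the truncated post, but a quick way to confirm that observation 1 really is influential in the simulated data above:

    fit <- lm(y ~ x1 + x2 + x3 + x4)
    which.max(cooks.distance(fit))   # should point at observation 1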
2011 Feb 08
2
Ken Olsen of DEC, 1927-2011
A lot of us wouldn't be here without him. DEC made good, really reliable hardware. mark <http://www.networkworld.com/news/2011/020711-kenneth-olsen-dec-obit.html>
2007 Jul 21
1
Gamma MLE
Hello, I was asked to try the following code in R:

    gamma.mles <- function(xx, shape0, rate0) {
        n <- length(xx)
        xbar <- mean(xx)
        logxbar <- mean(log(xx))
        theta <- c(shape0, rate0)
        repeat {
            theta0 <- theta
            shape <- theta0[1]
            rate <- theta0[2]
            S <- n * matrix(c(log(rate) - digamma(shape) + logxbar,
                              shape/rate - xbar), ncol = 1)
            I <- n * matrix(c(trigamma(shape), -1/rate,
                              -1/rate, shape/rate^2), ncol = 2)
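The loop is cut off above; a self-contained reconstruction of the Newton-Raphson iteration it appears to implement (the update step and stopping rule are my assumptions):

    gamma.mle.sketch <- function(xx, shape, rate, tol = 1e-8) {
        n <- length(xx); xbar <- mean(xx); logxbar <- mean(log(xx))
        repeat {
            S <- n * c(log(rate) - digamma(shape) + logxbar, shape/rate - xbar)
            I <- n * matrix(c(trigamma(shape), -1/rate,
                              -1/rate, shape/rate^2), ncol = 2)
            step <- solve(I, S)                  # Newton step: I^{-1} S
            shape <- shape + step[1]; rate <- rate + step[2]
            if (max(abs(step)) < tol) return(c(shape = shape, rate = rate))
        }
    }
    xx <- rgamma(1000, shape = 2, rate = 3)
    m <- mean(xx); v <- var(xx)
    gamma.mle.sketch(xx, shape = m^2/v, rate = m/v)   # method-of-moments start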
2009 Mar 31
1
CV and GCV for finding smoothness parameter
I received an assignment that I have to do in R, but I'm absolutely not very good at it. The task is the following: http://www.nabble.com/file/p22804957/question8.jpg To do this, we also get the following pieces of code (not in correct order): http://www.nabble.com/file/p22804957/hints.jpg I'm terrible at this and I'm completely stuck. The model I chose can be found in here:
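Not from the assignment itself (which is in the linked images), but base R's smooth.spline() chooses the smoothness parameter by exactly these criteria:

    fit.gcv <- smooth.spline(cars$speed, cars$dist, cv = FALSE)  # GCV
    fit.cv  <- smooth.spline(cars$speed, cars$dist, cv = TRUE)   # leave-one-out CV
    c(GCV = fit.gcv$spar, CV = fit.cv$spar)                      # chosen smoothing parameters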
2011 Mar 20
2
Why unique(sample) decreases the performance ?
Hi, I am interested in the difference in sample()'s performance when the input consists of all elements versus only the distinct elements. When the input consists of all elements it takes about 120 sec., but with only the distinct elements it takes about 4.5 or 5 times longer. I expected the opposite result, because unique(sample) has fewer elements than the full sample. Code as
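The poster's code is cut off; a sketch of the comparison as I read it (sizes are made up):

    x <- sample(1:1000, 1e6, replace = TRUE)
    system.time(for (i in 1:100) sample(x, 100))           # draw from the full vector
    system.time(for (i in 1:100) sample(unique(x), 100))   # unique() re-scans 1e6 elements every iteration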
2017 Nov 19
2
Changing logarithms
Hi! I'm using a large panel data set, and now I have faced some difficulties with my analysis. The predictors are not normally distributed and there are quite a few outliers (some of them are influential, though). I have tried to change the logarithm, but I'm not sure how to do that. I also want to draw a plot in which the logarithms of predictors x and y are changed. How could I do that?
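A minimal sketch (not from the thread) of log-transforming the variables and plotting them on a log-log scale:

    x <- rlnorm(100)
    y <- x^1.5 * rlnorm(100, sd = 0.2)
    plot(log(x), log(y))       # transform the variables yourself...
    plot(x, y, log = "xy")     # ...or let plot() draw logarithmic axes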