thr3ads.net - similar to: "Formula for whether hat value is influential?"

Displaying 20 results from an estimated 10000 matches similar to: "Formula for whether hat value is influential?"

tests for measures of influence in regression

2010 Feb 21

influence.measures gives several measures of influence for each observation (Cook's Distance, etc) and actually flags observations that it determines are influential by any of the measures. Looks good! But how does it discriminate between the influential and non-influential observations by each of the measures? Like does it do a Bonferroni-corrected t on the residuals identified by

plot(<lm>): new behavior in R-2.2.0 alpha

2005 Sep 13

As some of you R-devel readers may know, the plot() method for "lm" objects is based in large part on contributions by John Maindonald, subsequently "massaged" by me and other R-core members. In the statistics literature on applied regression, people have had diverse opinions on what (and how many!) plots should be used for goodness-of-fit / residual diagnostics, and to my

Influence.measures

1999 Jun 23

I am using rw0641 with Windows 98. To list just the influential repetitions that result from "influence.measures", I am using the input result <- lm(y~x) and the code from the example in the help for "influence.measures": INFLM <- function(result){ inflm <- influence.measures(result); which(apply(inflm$is.inf,1,any)) } It works fine up to now with the

make check in 1.8.1.

2004 Feb 10

I just (finally!!!) got R version 1.8.1 to configure and build under Solaris 9 (after much travail; there were funnies in my environment variables that mucked things up, but that's another story). Anyhow, when I ran ``make check'' I got an error right toward the end. Looking in the directory ``tests'' I found that the error was associated with the file reg-tests-3.R, and the

influence.measures, cooks.distance, and glm

2004 Mar 23

Dear list, I've noticed that influence.measures and cooks.distance give different results for non-Gaussian GLMs. For example, using R-1.9.0 alpha (2003-03-17) under Windows: > ## Dobson (1990) Page 93: Randomized Controlled Trial : > counts <- c(18,17,15,20,10,20,25,13,12) > outcome <- gl(3,1,9) > treatment <- gl(3,3) > glm.D93 <- glm(counts ~ outcome +

ordination in vegan: what does downweight() do?

2011 Nov 07

Can anyone point me in the right direction of figuring out what downweight() is doing? I am using vegan to perform CCA on diatom assemblage data. I have a lot of rare species, so I want to reduce the influence of rare species in my CCA. I have read that some authors reduce rare species by only including species with an abundance of at least 1% in at least one sample (other authors use 5% as a

cook's distance in weighted regression

2005 Feb 11

I have a puzzle as to how R is computing Cook's distance in weighted linear regression. In this case Cook's distance should be given not, as in the OLS case, by h_ii*r_i^2/(1-h_ii)^2 divided by k*s^2 (1) (where r is the plain unadjusted residual, k is the number of parameters in the model, etc.) but rather by w_ii*h_ii*r_i^2/(1-h_ii)^2 divided by k*s^2,
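
For readability, the two expressions contrasted in this snippet can be typeset as below. This is only a transcription of the poster's plain-text formulas, assuming h_ii is the i-th hat value, r_i the raw residual, w_ii the i-th case weight, k the number of model parameters and s^2 the residual variance estimate. The labels D_i and D_i^(w) are introduced here for illustration and do not appear in the snippet. The weighted form is what the poster expects, not a statement of what R actually computes.

```latex
% Equation (1) in the snippet: Cook's distance in the OLS case
D_i = \frac{h_{ii}\, r_i^{2}}{k\, s^{2}\, (1 - h_{ii})^{2}}

% The weighted form the poster expects (hypothetical label D_i^{(w)})
D_i^{(w)} = \frac{w_{ii}\, h_{ii}\, r_i^{2}}{k\, s^{2}\, (1 - h_{ii})^{2}}
```

The two expressions differ only by the factor w_ii, so they coincide whenever every weight equals 1.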

Remove level with zero observations

2010 Aug 03

If I have a column with 2 levels, but one level has no remaining observations, can I remove the level? I had intended to do it as listed below, but soon realized that even though there are no observations, the level is still there. For instance summary(dbs3.train.sans.influential.obs$HAC) yields 0 ,1 4685,0 nlevels(dbs3.train.sans.influential.obs$HAC) yields [1] 2 drop.list <- NULL

Variable Importance in pls: R or B? (and in glpls?)

2004 Sep 12

Dear R-users, dear Ron, I use pls from the pls.pcr package for classification. Since I need to know which variables are most influential on the classification performance, what criteria shall I look at: a) B, the array of regression coefficients for a certain model (meaning a certain number of latent variables) (and: squared or absolute values?) OR b) the weight matrix RR (or R in the De

standardized residuals (rstandard & plot.lm) (PR#8468)

2006 Jan 10

This bug is not quite fixed - the example from my original report now works using R-2.2.1, but plot(Uniform, 6) does not. The bug is due to if (show[6]) { ymx <- max(cook, na.rm = TRUE) * 1.025 g <- hatval/(1 - hatval) # Potential division by zero here # plot(g, cook, xlim = c(0, max(g)), ylim = c(0, ymx), main = main, xlab =

Gradient Boosting Trees with correlated predictors in gbm

2010 Feb 28

Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described, “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others

[RFC] One or many git repositories?

2016 Jul 27

On 7/27/2016 12:17 PM, Chris Bieneman wrote: > > This is a really bad argument for large influential changes like this. Quite the contrary---anybody can participate and anybody can express their concerns, explain their goals, their workflow, etc. For large influential changes like this, "zoning out" is a poor choice of action. > I suspect this is why the idea of having a

OLS Regression diagnostic measures check list - what to consider?

2010 May 05

Hello dear R help list, I wish to compile a check-list of diagnostic measures for OLS regression. My question: Can you offer more (or newer) tests/measures for the validity of a linear model than what is given here: http://www.statmethods.net/stats/rdiagnostics.html This resource gives a list of measures to test for: OUTLIERS, INFLUENTIAL OBSERVATIONS, NON-NORMALITY, NON-CONSTANT ERROR

Minor typo in influence.measures.Rd ?

2011 Jan 27

Dear list, There is, I believe, a minor typo in the example section of influence.measures.Rd. In the final example the word `does` appears where I suspect `dose` is required: I couldn't remember exactly what format patches should be in, so here is one as diff would produce: Index: devel/src/library/stats/man/influence.measures.Rd

Problem about for loop

2011 Jan 17

Hi everyone, my function is like this: e <- rnorm(n=50, mean=0, sd=sqrt(0.5625)) x0 <- c(rep(1,50)) x1 <- rnorm(n=50,mean=2,sd=1) x2 <- rnorm(n=50,mean=2,sd=1) x3 <- rnorm(n=50,mean=2,sd=1) x4 <- rnorm(n=50,mean=2,sd=1) y <- 1+ 2*x1+4*x2+3*x3+2*x4+e x2[1] = 10 #influential observation y[1] = 10 #influential observation data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5) data.y

Ken Olsen of DEC, 1927-2011

2011 Feb 08

A lot of us wouldn't be here without him. DEC made good, really reliable hardware. mark <http://www.networkworld.com/news/2011/020711-kenneth-olsen-dec-obit.html>

Gamma MLE

2007 Jul 21

Hello, I was asked to try the following code in R, gamma.mles <- function (xx,shape0,rate0) { n<- length(xx) xbar<- mean(xx) logxbar<- mean(log(xx)) theta<-c(shape0,rate0) repeat { theta0<- theta shape<- theta0[1] rate<- theta0[2] S<- n*matrix(c(log(rate)-digamma(shape)+logxbar,shape/rate-xbar),ncol=1) I<- n*matrix(c(trigamma(shape),-1/rate,-1/rate,shape/rate^2),ncol=2)

CV and GCV for finding smoothness parameter

2009 Mar 31

I received an assignment that I have to do in R, but I'm absolutely not very good at it. The task is the following: http://www.nabble.com/file/p22804957/question8.jpg To do this, we also get the following pieces of code (not in correct order): http://www.nabble.com/file/p22804957/hints.jpg I'm terrible at this and I'm completely stuck. The model I chose can be found in here:

Why unique(sample) decreases the performance ?

2011 Mar 20

Hi, I am interested in the differences between results when samples consist of all elements and when they consist of only distinct elements. When the sample consists of all elements it takes about 120 sec., but when it consists of only distinct elements it takes about 4.5 or 5 times longer. I expected the opposite of this result, because unique(sample) has fewer elements than the full sample. Code as

Changing logarithms

2017 Nov 19

Hi! I'm using a large panel data set, and now I have faced some difficulties with my analysis. The predictors are not normally distributed and there are quite a few outliers (some of them are influential, though). I have tried to change the logarithm, but I'm not sure how to do that. I also want to draw a plot in which the logarithms of predictors x and y are changed. How could I do that?
