R-users
E-mail: r-help@r-project.org

Hi! R-users.

I am just wondering what the definition of "dffits" in the R language is.
Let me show you a simple example.

function() {
  library(MASS)

  xx <- c(1,2,3,4,5)
  yy <- c(1,3,4,2,4)

  data1 <- data.frame(x=xx, y=yy)
  lm.out <- lm(y~., data=data1, x=TRUE)
  lev1 <- lm.influence(lm.out)$hat
  sig1 <- lm.influence(lm.out)$sigma
  res1 <- residuals(lm.out)

  ey <- fitted(lm.out)
  py <- ey + res1/(1-lev1)

  df1 <- dffits(lm.out, infl = lm.influence(lm.out))
  df1 <- dffits(lm.out)
  print("df1: dffits")
  print(df1)

  my_df1 <- (ey-py)/(sig1*sqrt(lev1))
  print("my_df1")
  print(my_df1)

  my_df2 <- -lev1*(ey-py)/(sig1*sqrt(lev1))

  print("my_df2")
  print(my_df2)
}

[1] "df1: dffits"
         1          2          3          4          5
-1.3333333  0.4082483  0.6000000 -1.0475699  0.2672612
[1] "my_df1"
         1          2          3          4          5
 2.2222222 -1.3608276 -3.0000000  3.4918995 -0.4454354
[1] "my_df2"
         1          2          3          4          5
-1.3333333  0.4082483  0.6000000 -1.0475699  0.2672612

I think that "my_df1" is "dffits" ( http://en.wikipedia.org/wiki/DFFITS ),
but in the R language, "my_df2" gives the definition of "dffits".
Please let me know why.

--
***** r.otasuke@gmail.com *****
http://cse.naro.affrc.go.jp/takezawa/intro.html
Check out: http://en.wikipedia.org/wiki/DFFITS

On Sun, Oct 19, 2008 at 1:26 AM, Kunio takezawa <r.otasuke at gmail.com> wrote:
> I am just wondering what the definition of "dffits" in R language is.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Dear Kunio,

The approach in dffits() in R is equivalent to the definition of DFFITS_i in Belsley, Kuh, and Welsch, Regression Diagnostics (which is, I believe, the original source, or close to it), generalized to WLS. Possibly a more transparent definition would be

dfs <- function(mod){
    rs <- rstudent(mod)
    h <- hatvalues(mod)
    sqrt(h/(1 - h))*rs
}

I hope this helps,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Kunio takezawa
> Sent: October-19-08 1:27 AM
> To: r-help at r-project.org
> Subject: [R] definition of "dffits"
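For readers following the thread, the numbers in the original post can probably be reconciled as follows. This is a minimal sketch, assuming that `py` in the post was intended to be the leave-one-out prediction at x_i; the names `d`, `loo_pred`, `loo_refit`, `dffits_by_hand` and `dffits_rstudent` are introduced here for illustration and do not appear in the thread. The leave-one-out identity is y_i - yhat_(i) = e_i/(1 - h_i), so yhat_(i) = y_i - e_i/(1 - h_i), whereas the post computes ey + res1/(1 - lev1). With the prediction computed from the identity, yhat_i - yhat_(i) = h_i*e_i/(1 - h_i), which is -lev1 times the post's (ey - py). That would explain why my_df2, not my_df1, reproduces dffits().

```r
# Toy data from the original post
d <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 3, 4, 2, 4))
fit <- lm(y ~ x, data = d)

h   <- hatvalues(fit)            # leverages h_i
e   <- residuals(fit)            # ordinary residuals e_i
s_i <- lm.influence(fit)$sigma   # leave-one-out sigma estimates s_(i)

# Leave-one-out prediction at x_i, from y_i - yhat_(i) = e_i / (1 - h_i)
loo_pred <- d$y - e / (1 - h)

# Brute-force version of the same thing: refit without observation i
# and predict at the deleted point
loo_refit <- sapply(seq_len(nrow(d)), function(i) {
  fit_i <- lm(y ~ x, data = d[-i, ])
  unname(predict(fit_i, newdata = d[i, ]))
})

# DFFITS as the scaled change in the fitted value
dffits_by_hand <- (fitted(fit) - loo_pred) / (s_i * sqrt(h))

# The form given in John Fox's reply
dffits_rstudent <- sqrt(h / (1 - h)) * rstudent(fit)

# Compare the pieces (unname() so only the numbers are compared)
print(all.equal(unname(loo_pred), loo_refit))
print(all.equal(unname(dffits_by_hand), unname(dffits(fit))))
print(all.equal(unname(dffits_rstudent), unname(dffits(fit))))
```

The brute-force refit is included only as an independent check on the identity; for anything beyond a toy data set the closed-form version avoids refitting the model n times.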