R-users
E-mail: r-help@r-project.org
   Hi, R-users.
   I am wondering how "dffits" is defined in R.
Let me show you a simple example.
xx <- c(1,2,3,4,5)
yy <- c(1,3,4,2,4)
data1 <- data.frame(x=xx, y=yy)
lm.out <- lm(y~., data=data1)
infl <- lm.influence(lm.out)
lev1 <- infl$hat      # leverages h_i
sig1 <- infl$sigma    # residual standard errors with observation i deleted
res1 <- residuals(lm.out)
ey <- fitted(lm.out)
py <- ey + res1/(1-lev1)   # fitted value plus e_i/(1 - h_i)
df1 <- dffits(lm.out)
print("df1: dffits")
print(df1)
my_df1 <- (ey-py)/(sig1*sqrt(lev1))
print("my_df1")
print(my_df1)
my_df2 <- -lev1*(ey-py)/(sig1*sqrt(lev1))
print("my_df2")
print(my_df2)
[1] "df1: dffits"
         1          2          3          4          5
-1.3333333  0.4082483  0.6000000 -1.0475699  0.2672612
[1] "my_df1"
         1          2          3          4          5
 2.2222222 -1.3608276 -3.0000000  3.4918995 -0.4454354
[1] "my_df2"
         1          2          3          4          5
-1.3333333  0.4082483  0.6000000 -1.0475699  0.2672612
I think that "my_df1" is "dffits" ( http://en.wikipedia.org/wiki/DFFITS ),
but in R, "my_df2" is what reproduces the output of "dffits".
   Please let me know why.
-- 
*****    r.otasuke@gmail.com    *****
http://cse.naro.affrc.go.jp/takezawa/intro.html
Check out: http://en.wikipedia.org/wiki/DFFITS
On Sun, Oct 19, 2008 at 1:26 AM, Kunio takezawa <r.otasuke at gmail.com> wrote:
> [...]
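For reference, the definition on that page, (yhat_i - yhat_i(i)) / (s_(i) * sqrt(h_ii)), can be checked by brute force: refit the model once per observation with that observation left out. The following is only a sketch on the data from the original post; the names fit, loo and dffits_loo are illustrative and do not appear in the thread.

```r
# Data from the original post
data1 <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 3, 4, 2, 4))
fit <- lm(y ~ x, data = data1)
h <- hatvalues(fit)          # leverages h_ii
n <- nrow(data1)

# Refit n times, each time leaving out one observation, and record
# the prediction at the deleted point and the residual standard error
loo <- sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ x, data = data1[-i, ])
  c(pred  = unname(predict(fit_i, newdata = data1[i, ])),
    sigma = summary(fit_i)$sigma)
})

# (yhat_i - yhat_i(i)) / (s_(i) * sqrt(h_ii))
dffits_loo <- (fitted(fit) - loo["pred", ]) / (loo["sigma", ] * sqrt(h))

# Compare with the built-in function
all.equal(unname(dffits_loo), unname(dffits(fit)))
```

If the comparison holds, dffits() in R and the definition on that page agree for this model, and any difference must come from how the leave-one-out prediction was computed by hand.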
Dear Kunio,
The approach in dffits() in R is equivalent to the definition of DFFITS_i in
Belsley, Kuh, and Welsch, Regression Diagnostics (which is, I believe, the
original source, or close to it), generalized to WLS. Possibly a more
transparent definition would be
dfs <- function(mod){
    rs <- rstudent(mod)    # externally studentized residuals
    h <- hatvalues(mod)    # leverages (diagonal of the hat matrix)
    sqrt(h/(1 - h))*rs
}
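
A quick check of this function on the data from the original post is sketched below. It also evaluates the difference-in-fits form using the standard closed form for the leave-one-out prediction, y_i - e_i/(1 - h_i). The original post used yhat_i + e_i/(1 - h_i) instead, which appears to explain why "my_df1" differs from dffits() by a factor of -1/h_i. The names mod, s_i, py and wiki are illustrative only.

```r
# Data from the original post
data1 <- data.frame(x = c(1, 2, 3, 4, 5), y = c(1, 3, 4, 2, 4))
mod <- lm(y ~ x, data = data1)

dfs <- function(mod) {
  rs <- rstudent(mod)    # externally studentized residuals
  h <- hatvalues(mod)    # leverages
  sqrt(h / (1 - h)) * rs
}

# dfs() should agree with the built-in dffits()
all.equal(unname(dfs(mod)), unname(dffits(mod)))

e   <- residuals(mod)
h   <- hatvalues(mod)
s_i <- lm.influence(mod)$sigma   # sigma estimated without observation i

# Leave-one-out prediction in closed form: y_i - e_i / (1 - h_i)
py <- data1$y - e / (1 - h)

# Difference-in-fits definition with that prediction
wiki <- (fitted(mod) - py) / (s_i * sqrt(h))
all.equal(unname(wiki), unname(dffits(mod)))
```

Algebraically, fitted(mod) - py reduces to h*e/(1 - h), so wiki equals e*sqrt(h)/(s_i*(1 - h)), which is the same quantity as sqrt(h/(1 - h))*rs.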
I hope this helps,
 John
------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Kunio takezawa
> Sent: October-19-08 1:27 AM
> To: r-help at r-project.org
> Subject: [R] definition of "dffits"
>
> [...]