Hi everyone,
I have a data frame (d), wich has the results of mosquitoes trapping in
three different places.
I suspect that one of these places (Local=='Palm') is biased by low
numbers and will yield slower slopes in the variance-mean regression over
the areas. I wonder if these slopes are diferents.
I've looked trought the support list for methods for comparing slopes and
found the three explained above:
#d - whole datafrmae
d$Local <- as.factor(d$Local)
dHT <- subset(d, Local!='Palm')
dP <- subset(d, Local!='Palm')
1 - Single dataframe procedure:
fit1 <- lm(log(varCDI) ~ Local + I(log(CDI)) + Local:I(log(CDI)), data=d)
fit2 <- lm(log(varCDI) ~ Local + I(log(CDI)) + Local:I(log(CDI)),
data=dHT)
a1 <- anova(fit1)
a2 <- anova(fit2)
a1
Local:I(log(CDI)) 2 1.183 0.592 7.8741 0.0005574 ***
Residuals 152 11.421 0.075
a2
Local:I(log(CDI)) 1 0.061 0.061 0.8156 0.36859
Residuals 102 7.643 0.075
>>>> Concluding that the places are different
2- proposed by Spencer Graves:
Spencer <- function(df1, df2){
fit1 <- lm( log(df1$varCDI)~log(df1$CDI) )
fit2 <- lm( log(df2$varCDI)~log(df2$CDI) )
s1 <- summary(fit1)$coefficients
s2 <- summary(fit2)$coefficients
db <- (s2[2,1]-s1[2,1])
#### TEST DEGREES OF FREEDOM
df <- (fit1$df.residual+fit2$df.residual)
### Pooled Variance ####
sd <- sqrt(s2[2,2]^2+s1[2,2]^2)
td <- db/sd
2*pt(-abs(td), df)
}
Spencer(dP, dHT)
[1] 0.0001717730
3- I wrote one based on Zar (1999) and Kleinbaum (1987) procedure:
Zar <- function(df1, df2){
fit1 <- lm( log(df1$varCDI)~log(df1$CDI) )
fit2 <- lm( log(df2$varCDI)~log(df2$CDI) )
s1 <- summary(fit1)$coefficients
s2 <- summary(fit2)$coefficients
db <- (s1[2,1]-s2[2,1])
#### TEST DEGREES OF FREEDOM
n1 <- fit1$df.residual
n2 <- fit2$df.residual
n <- n1 + n2
### Pooled Variance ####
ssx1 <- var(df1$CDI) * (n1+1)
ssx2 <- var(df2$CDI) * (n2+1)
RSS1 <- anova(fit1)[2,2]
RSS2 <- anova(fit2)[2,2]
sspvm <- ( RSS1 + RSS2 ) / n
sd1_2 <- sqrt( sspvm/ssx1 + sspvm/ssx2 )
td <- db/sd1_2
2*pt(-abs(td), n)
}
Zar(dP, dHT)
[1] 0.4040566
Is there anything wrong with the function I wrote?
Assuming that there is nothing wrong with the function I wrote, I'd like
to know why R dos not follow this procedure.
Why does the variation over the X-axis is not considered for the pooled
variance?
Perhaps a simpler question: how is calculated the coeficient standart
error wich is printed by 'summary'?
Thanks,
Arthur