babs
2012-Jul-27  13:35 UTC
[R] producing a graph with glm poisson distributed respons count data and categorical independant variables
Hello,
I am working on my thesis and can't really figure out how to produce a
reasonable graph from the output from my glm., 
I could just give the R-output in my results and then discuss them, but it
would be more interesting if I could visualise what is going on.
My research is how bees react to different fieldmargins, for this I have 4
different types of field margin (A,B,C & D) and two different experiments,
one where the field margins are adjecent and one where they are seperated.
I wanted to know if the bees react differently on the different types of
field margin and whether there were differences between the two
experiments... I also used an offset function to correct for the different
number of field margins of the same type were the counts have been going on.
I counted the counts of the same fied margins together and then put in the
offset function.
So i used the model that is underneath: mengsel A, B, C & D=type of field
margin and proefopzet 1 and 2= experiment p1 and p2
I already checked if this saturated model is better then that without an
interaction effect:
so I think i have a good model for my data
Call:
glm(formula = count ~ mengsel * proefopzet, family = poisson, 
    data = a.data, offset = log(opp))
Deviance Residuals: 
[1]  0  0  0  0  0  0  0  0
Coefficients:
                      Estimate Std. Error z value Pr(>|z|)    
(Intercept)            5.01507    0.04704 106.622  < 2e-16 ***
mengselB               0.82654    0.05640  14.656  < 2e-16 ***
mengselC              -1.36441    0.12329 -11.067  < 2e-16 ***
mengselD              -2.30702    0.18854 -12.237  < 2e-16 ***
proefopzetp2          -0.92909    0.10303  -9.017  < 2e-16 ***
mengselB:proefopzetp2  0.14373    0.13399   1.073 0.283411    
mengselC:proefopzetp2 -2.02842    0.52307  -3.878 0.000105 ***
mengselD:proefopzetp2  3.03323    0.22821  13.291  < 2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 
(Dispersion parameter for poisson family taken to be 1)
    Null deviance:  1.8216e+03  on 7  degrees of freedom
Residual deviance: -1.2212e-14  on 0  degrees of freedom
AIC: 67.589
Number of Fisher Scoring iterations: 3
> 
Now...how can i visualise this? I don't seem to find how i could do this...
The data on which this is computed is the following:> a.data
  count proefopzet mengsel opp
1  1033        p1               B            3
2    77           p1               C             2
3   452         p1               A             3
4    30          p1               D             2
5   157         p2               B             1
6     4            p2              C             2
7   119         p2               A             2
8   123         p2               D             1
Thanks,
Babs
 
--
View this message in context:
http://r.789695.n4.nabble.com/producing-a-graph-with-glm-poisson-distributed-respons-count-data-and-categorical-independant-variabs-tp4638110.html
Sent from the R help mailing list archive at Nabble.com.
ONKELINX, Thierry
2012-Jul-27  14:44 UTC
[R] producing a graph with glm poisson distributed respons count data and categorical independant variables
Dear Babs,
This is how I would present the model, if I had enough data to support the
model. The model is too complicated for your data and leads to a perfect fit. Is
this the aggregated dataset, or does your design has no replicates?
Best regards,
Thierry
dataset$combinatie <- dataset$proefopzet:dataset$mengsel
model <- glm(count ~ offset(log(opp)) + proefopzet * mengsel, data = dataset,
family = poisson)
modelPlot <- glm(count ~ offset(log(opp)) + 0 + combinatie, data = dataset,
family = poisson)
summary(model)
summary(modelPlot)
all.equal(fitted(model), fitted(modelPlot))
library(multcomp)
extra <- as.data.frame(confint(glht(modelPlot))$confint)
extra <- exp(extra)
extra$proefopzet <- factor(substr(rownames(extra), 11, 12))
extra$mengsel <- factor(substr(rownames(extra), 14, 14))
dataset <- merge(dataset, extra)
library(ggplot2)
ggplot(dataset, aes(x = mengsel, y = count / opp, ymin = lwr, ymax = upr, colour
= proefopzet)) + geom_errorbar() + geom_point(shape = "O") +
geom_point(aes(y = Estimate))
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
Thierry.Onkelinx at inbo.be
www.inbo.be
To call in the statistician after the experiment is done may be no more than
asking him to perform a post-mortem examination: he may be able to say what the
experiment died of.
~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data.
~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
-----Oorspronkelijk bericht-----
Van: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
Namens babs
Verzonden: vrijdag 27 juli 2012 15:35
Aan: r-help at r-project.org
Onderwerp: [R] producing a graph with glm poisson distributed respons count data
and categorical independant variables
Hello,
I am working on my thesis and can't really figure out how to produce a
reasonable graph from the output from my glm.,
I could just give the R-output in my results and then discuss them, but it
would be more interesting if I could visualise what is going on.
My research is how bees react to different fieldmargins, for this I have 4
different types of field margin (A,B,C & D) and two different experiments,
one where the field margins are adjecent and one where they are seperated.
I wanted to know if the bees react differently on the different types of
field margin and whether there were differences between the two
experiments... I also used an offset function to correct for the different
number of field margins of the same type were the counts have been going on.
I counted the counts of the same fied margins together and then put in the
offset function.
So i used the model that is underneath: mengsel A, B, C & D=type of field
margin and proefopzet 1 and 2= experiment p1 and p2
I already checked if this saturated model is better then that without an
interaction effect:
so I think i have a good model for my data
Call:
glm(formula = count ~ mengsel * proefopzet, family = poisson,
    data = a.data, offset = log(opp))
Deviance Residuals:
[1]  0  0  0  0  0  0  0  0
Coefficients:
                      Estimate Std. Error z value Pr(>|z|)
(Intercept)            5.01507    0.04704 106.622  < 2e-16 ***
mengselB               0.82654    0.05640  14.656  < 2e-16 ***
mengselC              -1.36441    0.12329 -11.067  < 2e-16 ***
mengselD              -2.30702    0.18854 -12.237  < 2e-16 ***
proefopzetp2          -0.92909    0.10303  -9.017  < 2e-16 ***
mengselB:proefopzetp2  0.14373    0.13399   1.073 0.283411
mengselC:proefopzetp2 -2.02842    0.52307  -3.878 0.000105 ***
mengselD:proefopzetp2  3.03323    0.22821  13.291  < 2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance:  1.8216e+03  on 7  degrees of freedom
Residual deviance: -1.2212e-14  on 0  degrees of freedom
AIC: 67.589
Number of Fisher Scoring iterations: 3
>
Now...how can i visualise this? I don't seem to find how i could do this...
The data on which this is computed is the following:> a.data
  count proefopzet mengsel opp
1  1033        p1               B            3
2    77           p1               C             2
3   452         p1               A             3
4    30          p1               D             2
5   157         p2               B             1
6     4            p2              C             2
7   119         p2               A             2
8   123         p2               D             1
Thanks,
Babs
--
View this message in context:
http://r.789695.n4.nabble.com/producing-a-graph-with-glm-poisson-distributed-respons-count-data-and-categorical-independant-variabs-tp4638110.html
Sent from the R help mailing list archive at Nabble.com.
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the writer
and may not be regarded as stating an official position of INBO, as long as the
message is not confirmed by a duly signed document.
babs
2012-Jul-28  10:10 UTC
[R] producing a graph with glm poisson distributed respons count data and categorical independant variables
Hi,
I do have replicates, but not many. The offsets are my the number of
replicates actually, i misled myself by thinking i could add the counts of
the replicates all up and then go further with the offset function. I 
decided to split up my experiments in order to interpret them, they were
actually from the beginning split up experiments. Because i thought i could
see if the bees reacted in a same manner to the different experiments i
thought i could analyse it with an interaction factor to reveal whether this
was true. But i want now want to analyse them apart so i can draw
conclusions from the both experiments apart.
I did the following  with the counts from p1, including the replicates so  i
have only the type of field-margin as a variable, as i was only interested
in this from the beginning:
 
mengsel	count
C	39
C	38
A	79
A	96
A	278
D	15
D	15
B	322
B	449
B	262
a.data.p1<-read.table("a.data.p1.txt",header=TRUE,sep="")
a.data.p1
fit.sat.a.p1<-glm(count~mengsel,data=a.data.p1,family=poisson)
anova(fit.sat.a.p1,test="Chisq")
fit.main.a.p1<-glm(count~1,data=a.data.p1,family=poisson)
anova(fit.sat.a.p1,fit.main.a.p1,test="Chisq")
extractAIC(fit.sat.a.p1)
extractAIC(fit.main.a.p1)
summary(fit.sat.a.p1)
This tells me that the saturated model is better explaning then the other
so:
Call:
glm(formula = count ~ mengsel, family = poisson, data = a.data.p1)
Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-6.4531  -3.7799  -0.0404   0.0603   9.2385  
Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  5.01728    0.04698  106.79   <2e-16 ***
mengselB     0.82433    0.05635   14.63   <2e-16 ***
mengselC    -1.36662    0.12327  -11.09   <2e-16 ***
mengselD    -2.30923    0.18852  -12.25   <2e-16 ***
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 1385.58  on 9  degrees of freedom
Residual deviance:  202.01  on 6  degrees of freedom
AIC: 273.15
Number of Fisher Scoring iterations: 5
How can i produce a graph for this?
I am worried that i still do not have enough replicates to actually draw
descent conclusions or conclusions at all...in my second experiment i do not
have replicates  for 2 out of four types of field margins, for the other two
i only have 2 replicates. 
this being the counts from the second experiment:
C	3	p2
C	1	p2
A	90	p2
A	29	p2
D	0	p2
B	157	p2
thanks,
babs
--
View this message in context:
http://r.789695.n4.nabble.com/producing-a-graph-with-glm-poisson-distributed-respons-count-data-and-categorical-independant-variabs-tp4638110p4638192.html
Sent from the R help mailing list archive at Nabble.com.
Possibly Parallel Threads
- new CRAN package : modelplotr
- new CRAN package : modelplotr
- Automatic printer-driver download - slow respons from printers
- chan_sip.c:10173 handle_response: Dont know how to handle a 202 Accepted respons
- glmulti runs indefinitely when using genetic algorithm with lme4