Hi A.K Regarding my question on comparing normal/ obese/overweight with blood pressure change, I did finally as per the first suggestion of stacking the data and creating a normal category . This only gives me a obese not obese 14, but when I did with the wide format hoping to get a obese14,normal14,overweight 14 Vs hibp 21, i could not complete any of the models. This time I classified obese=1 & overweight=1 as obese itself. Can you tell me if I can use the geese or geeglm function with this data eg: : HIBP~ time* Age Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no. It says geese/geeglm: contrast can be applied only with factor with 2 or more levels. What is the way to overcome this. Can I manipulate the data to make it work. I need to know if the demogrphic variables affect change in blood pressure status over time? How to get the p values with gee model? Thanks On Thu, Jan 3, 2013 at 5:06 AM, arun kirshna [via R] < ml-node+s789695n4654438h5@n4.nabble.com> wrote:> HI Rex, > If I take a small subset from your whole dataset, and go through your > codes: > BP_2b<-read.csv("BP_2b.csv",sep="\t") > BP.sub<-BP_2b[410:418,c(1,8:11,13)] #deleted the columns that are not > needed > BP.stacknormal<- subset(BP.subnew,Obese14==0 & Overweight14==0) > BP.stackObese <- subset(BP.subnew,Obese14==1) > BP.stackOverweight <- subset(BP.subnew,Overweight14==1) > BP.stacknormal$Categ<-"Normal14" > BP.stackObese$Categ<-"Obese14" > BP.stackOverweight$Categ <- "Overweight14" > BP.newObeseOverweightNormal<-rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight) > > BP.newObeseOverweightNormal > # CODEA Obese14 Overweight14 Overweight21 Obese21 hibp21 Categ > #411 541 0 0 0 0 0 Normal14 > #415 545 0 0 1 1 1 Normal14 > #418 549 0 0 1 0 0 Normal14 > #413 543 1 0 1 1 0 Obese14 > #417 548 0 1 1 0 0 Overweight14 > BP.newObeseOverweightNormal$Categ<- > factor(BP.newObeseOverweightNormal$Categ) > BP.stack3 <- > reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") > > library(car) > BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21") > BP.stack3 #Here Normal14 gets repeated even at time==21. Given that you > are using the "Categ" and "time" #columns in the analysis, it will give > incorrect results. > # CODEA hibp21 Categ time Obese Overweight > #541.1 541 0 Normal14 14 0 0 > #545.1 545 1 Normal14 14 0 0 > #549.1 549 0 Normal14 14 0 0 > #543.1 543 0 Obese14 14 1 0 > #548.1 548 0 Overweight14 14 0 1 > #541.2 541 0 Normal14 21 0 0 > #545.2 545 1 Normal14 21 1 1 > #549.2 549 0 Normal14 21 0 1 > #543.2 543 0 Obese14 21 1 1 > #548.2 548 0 Overweight14 21 0 1 > #Even if I correct the above codes, this will give incorrect > results/(error as you shown) because the response variable (hibp21) gets > #repeated when you reshape it from wide to long. > > The correct classification might be: > BP_2b<-read.csv("BP_2b.csv",sep="\t") > BP.sub<-BP_2b[410:418,c(1,8:11,13)] > BP.subnew<-reshape(BP.sub,idvar="CODEA",timevar="time",sep="",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") > > BP.subnew$time<-recode(BP.subnew$time,"1=14;2=21") > BP.subnew<-na.omit(BP.subnew) > > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14 & > BP.subnew$Obese==0]<-"Overweight14" > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21 & > BP.subnew$Obese==0]<-"Overweight21" > BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==14 & > BP.subnew$Overweight==0]<-"Obese14" > BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==21 & > BP.subnew$Overweight==0]<-"Obese21" > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21& > BP.subnew$Obese==1]<-"ObeseOverweight21" > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14& > BP.subnew$Obese==1]<-"ObeseOverweight14" > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 > &BP.subnew$time==14]<-"Normal14" > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 > &BP.subnew$time==21]<-"Normal21" > > BP.subnew$Categ<-factor(BP.subnew$Categ) > BP.subnew$time<-factor(BP.subnew$time) > BP.subnew > # CODEA hibp21 time Obese Overweight Categ > #541.1 541 0 14 0 0 Normal14 > #543.1 543 0 14 1 0 Obese14 > #545.1 545 1 14 0 0 Normal14 > #548.1 548 0 14 0 1 Overweight14 > #549.1 549 0 14 0 0 Normal14 > #541.2 541 0 21 0 0 Normal21 > #543.2 543 0 21 1 1 ObeseOverweight21 > #545.2 545 1 21 1 1 ObeseOverweight21 > #548.2 548 0 21 0 1 Overweight21 > #549.2 549 0 21 0 1 Overweight21 > > #NOw with the whole dataset: > BP.sub<-BP_2b[,c(1,8:11,13)] #change here and paste the above lines: > head(BP.subnew) > # CODEA hibp21 time Obese Overweight Categ > #3.1 3 0 14 0 0 Normal14 > #7.1 7 0 14 0 0 Normal14 > #8.1 8 0 14 0 0 Normal14 > #9.1 9 0 14 0 0 Normal14 > #14.1 14 1 14 0 0 Normal14 > #21.1 21 0 14 0 0 Normal14 > > tail(BP.subnew) > # CODEA hibp21 time Obese Overweight Categ > #8485.2 8485 0 21 1 1 ObeseOverweight21 > #8506.2 8506 0 21 0 1 Overweight21 > #8520.2 8520 0 21 0 0 Normal21 > #8529.2 8529 1 21 1 1 ObeseOverweight21 > #8550.2 8550 0 21 1 1 ObeseOverweight21 > #8554.2 8554 0 21 0 0 Normal21 > > summary(lme.1 <- lme(hibp21~time+Categ+ time*Categ, > data=BP.subnew,random=~1|CODEA, na.action=na.omit)) > #Error in MEEM(object, conLin, control$niterEM) : > #Singularity in backsolve at level 0, block 1 > #May be because of the reasons I mentioned above. > > #YOu didn't mention the library(gee) > BP.gee8 <- gee(hibp21~time+Categ+time*Categ, > data=BP.subnew,id=CODEA,family=binomial, > corstr="exchangeable",na.action=na.omit) > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.subnew, id > CODEA, : > #rank-deficient model matrix > With your codes, it might have worked, but the results may be inaccurate > # After running your whole codes: > BP.gee8 <- gee(hibp21~time+Categ+time*Categ, > data=BP.stack3,id=CODEA,family=binomial, > corstr="exchangeable",na.action=na.omit) > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 > #running glm to get initial regression estimate > # (Intercept) time CategObese14 > # -2.456607e+01 9.940875e-15 2.087584e-13 > # CategOverweight14 time:CategObese14 time:CategOverweight14 > # 2.087584e-13 -9.940875e-15 -9.940875e-15 > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.stack3, id > CODEA, : > # Cgee: error: logistic model for probability has fitted value very close > to 1. > #estimates diverging; iteration terminated. > > In short, I think it would be better to go with the suggestion in my > previous email with adequate changes in "Categ" variable (adding > ObeseOverweight14, ObeseOverweight21 etc) as I showed here. > > A.K. > > > > > > > > > ------------------------------ > If you reply to this email, your message will be added to the discussion > below: > http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654438.html > To unsubscribe from random effects model, click here<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4654346&code=dXNoYS5uYXRoYW5AZ21haWwuY29tfDQ2NTQzNDZ8MjAyMjE1NDI0> > . > NAML<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >-- View this message in context: http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654776.html Sent from the R help mailing list archive at Nabble.com. [[alternative HTML version deleted]]
Hi, I am? not very familiar with the geese/geeglm().? Is it from library(geepack)? Regarding your question: " Can you tell me if I can use the geese or geeglm function with this data eg: : HIBP~ time* Age Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no. From your original data: BP_2b<-read.csv("BP_2b.csv",sep="\t") head(BP_2b,2) #? CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14 #1???? 1? NA?????????? 3???????? 4????????? 1?????? NA?????? NA????? NA #2???? 3?? 2?????????? 3???????? 3????????? 1??????? 0??????? 0?????? 0 ?# Overweight14 Overweight21 Obese21 hibp14 hibp21 #1?????????? NA?????????? NA????? NA???? NA???? NA #2??????????? 0??????????? 1?????? 0????? 0????? 0 If I understand your new classification: BP.stacknormal<- subset(BP_2b,Obese14==0 & Overweight14==0 & Obese21==0 & Overweight21==0) BP.stackObese <- subset(BP_2b,(Obese14==1& Overweight14==0 & Obese14==1&Overweight14==1)|(Obese14==1&Overweight14==1 & Obese21==1 & Overweight21==0)|(Obese14==1&Overweight14==0 & Obese21==0 & Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 & Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 & Overweight21==1)|(Obese14==0 & Overweight14==1 & Obese21==1 &Overweight21==1)|(Obese14==1& Overweight14==1 & Obese21==1& Overweight21==1)) #check whether there are more classification that fits to #Obese ?BP.stackOverweight <- subset(BP_2b,(Obese14==0 & Overweight14==1 & Obese21==0 & Overweight21==1)|(Obese14==0 &Overweight14==1 & Obese21==0 & Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==0 & Overweight21==1)) BP.stacknormal$Categ<-"Normal" BP.stackObese$Categ<-"Obese" BP.stackOverweight$Categ <- "Overweight" ?BP.newObeseOverweightNormal<-na.omit(rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight)) ?nrow(BP.newObeseOverweightNormal) #[1] 1581 BP.stack3 <- reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21"),c("hibp14","hibp21")),v.names=c("Obese","Overweight","hibp"),direction="long") library(car) BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21") head(BP.stack3,2) ? #? CODEA Sex MaternalAge Education Birthplace AggScore IntScore? Categ time #8.1???? 8?? 2?????????? 4???????? 4????????? 1??????? 0??????? 0 Normal?? 14 #9.1???? 9?? 1?????????? 3???????? 6????????? 2??????? 0??????? 0 Normal?? 14 ? #? Obese Overweight hibp #8.1???? 0????????? 0??? 0 Now, your formula: (HIBP~time*Age), is it MaternalAge? If it is, it has three values unique(BP.stack3$MaternalAge) #[1] 4 3 5 and for time (14,21) # If it says that geese/geeglm, contrasts could be applied with factors>=2 levels, what is the problem? If you take "Categ" variable, it also has 3 levels (Normal, Obese, Overweight). ?BP.stack3$MaternalAge<-factor(BP.stack3$MaternalAge) ?BP.stack3$time<-factor(BP.stack3$time) library(geepack) For your last question about how to get the p-values: # Using one of the example datasets: data(seizure) ???? seiz.l <- reshape(seizure, ?????????????????????? varying=list(c("base","y1", "y2", "y3", "y4")), ?????????????????????? v.names="y", times=0:4, direction="long") ???? seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),] ???? seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2) ???? seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1) ???? m1 <- geese(y ~ offset(log(t)) + x + trt + x:trt, id = id, ???????????????? data=seiz.l, corstr="exch", family=poisson) ???? summary(m1) ???? ?summary(m1)$mean["p"] #??????????????????? p #(Intercept) 0.0000000 #x?????????? 0.3347040 #trt???????? 0.9011982 #x:trt?????? 0.6236769 #If you need the p-values of the scale ?? summary(m1)$scale["p"] ?# ????????????????? p #(Intercept) 0.0254634 Hope it helps. A.K. ----- Original Message ----- From: rex2013 <usha.nathan at gmail.com> To: r-help at r-project.org Cc: Sent: Sunday, January 6, 2013 4:55 AM Subject: Re: [R] random effects model Hi A.K Regarding my question on comparing normal/ obese/overweight with blood pressure change, I did finally as per the first suggestion of stacking the data and creating a normal category . This only gives me a obese not obese 14, but when I did with the wide format hoping to? get? a obese14,normal14,overweight 14 Vs hibp 21, i could not complete any of the models. This time I classified obese=1 & overweight=1 as obese itself. Can you tell me if I can use the geese or geeglm function with this data eg: : HIBP~ time* Age Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no. It says geese/geeglm: contrast can be applied only with factor with 2 or more levels. What is the way to overcome this. Can I manipulate the data to make it work. I need to know if the demogrphic variables affect change in blood pressure status over time? How to get the p values with gee model? Thanks On Thu, Jan 3, 2013 at 5:06 AM, arun kirshna [via R] < ml-node+s789695n4654438h5 at n4.nabble.com> wrote:> HI Rex, > If I take a small subset from your whole dataset, and go through your > codes: > BP_2b<-read.csv("BP_2b.csv",sep="\t") >? BP.sub<-BP_2b[410:418,c(1,8:11,13)] #deleted the columns that are not > needed >? BP.stacknormal<- subset(BP.subnew,Obese14==0 & Overweight14==0) > BP.stackObese <- subset(BP.subnew,Obese14==1) >? BP.stackOverweight <- subset(BP.subnew,Overweight14==1) > BP.stacknormal$Categ<-"Normal14" > BP.stackObese$Categ<-"Obese14" > BP.stackOverweight$Categ <- "Overweight14" >? BP.newObeseOverweightNormal<-rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight) > >? BP.newObeseOverweightNormal > #? ? CODEA Obese14 Overweight14 Overweight21 Obese21 hibp21? ? ? ? Categ > #411? 541? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 0? ? ? 0? ? ? 0? ? Normal14 > #415? 545? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 1? ? ? 1? ? Normal14 > #418? 549? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 0? ? ? 0? ? Normal14 > #413? 543? ? ? 1? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 1? ? ? 0? ? ? Obese14 > #417? 548? ? ? 0? ? ? ? ? ? 1? ? ? ? ? ? 1? ? ? 0? ? ? 0 Overweight14 > BP.newObeseOverweightNormal$Categ<- > factor(BP.newObeseOverweightNormal$Categ) > BP.stack3 <- > reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") > > library(car) > BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21") > BP.stack3 #Here Normal14 gets repeated even at time==21.? Given that you > are using the "Categ" and "time" #columns in the analysis, it will give > incorrect results. > #? ? ? CODEA hibp21? ? ? ? Categ time Obese Overweight > #541.1? 541? ? ? 0? ? Normal14? 14? ? 0? ? ? ? ? 0 > #545.1? 545? ? ? 1? ? Normal14? 14? ? 0? ? ? ? ? 0 > #549.1? 549? ? ? 0? ? Normal14? 14? ? 0? ? ? ? ? 0 > #543.1? 543? ? ? 0? ? ? Obese14? 14? ? 1? ? ? ? ? 0 > #548.1? 548? ? ? 0 Overweight14? 14? ? 0? ? ? ? ? 1 > #541.2? 541? ? ? 0? ? Normal14? 21? ? 0? ? ? ? ? 0 > #545.2? 545? ? ? 1? ? Normal14? 21? ? 1? ? ? ? ? 1 > #549.2? 549? ? ? 0? ? Normal14? 21? ? 0? ? ? ? ? 1 > #543.2? 543? ? ? 0? ? ? Obese14? 21? ? 1? ? ? ? ? 1 > #548.2? 548? ? ? 0 Overweight14? 21? ? 0? ? ? ? ? 1 > #Even if I correct the above codes, this will give incorrect > results/(error as you shown) because the response variable (hibp21) gets > #repeated when you reshape it from wide to long. > > The correct classification might be: > BP_2b<-read.csv("BP_2b.csv",sep="\t") >? BP.sub<-BP_2b[410:418,c(1,8:11,13)] > BP.subnew<-reshape(BP.sub,idvar="CODEA",timevar="time",sep="",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") > > BP.subnew$time<-recode(BP.subnew$time,"1=14;2=21") >? BP.subnew<-na.omit(BP.subnew) > > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14 & > BP.subnew$Obese==0]<-"Overweight14" > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21 & > BP.subnew$Obese==0]<-"Overweight21" >? BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==14 & > BP.subnew$Overweight==0]<-"Obese14" >? BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==21 & > BP.subnew$Overweight==0]<-"Obese21" >? BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21& > BP.subnew$Obese==1]<-"ObeseOverweight21" >? BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14& > BP.subnew$Obese==1]<-"ObeseOverweight14" > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 > &BP.subnew$time==14]<-"Normal14" >? BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 > &BP.subnew$time==21]<-"Normal21" > > BP.subnew$Categ<-factor(BP.subnew$Categ) > BP.subnew$time<-factor(BP.subnew$time) > BP.subnew > #? ? ? CODEA hibp21 time Obese Overweight? ? ? ? ? ? Categ > #541.1? 541? ? ? 0? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 > #543.1? 543? ? ? 0? 14? ? 1? ? ? ? ? 0? ? ? ? ? Obese14 > #545.1? 545? ? ? 1? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 > #548.1? 548? ? ? 0? 14? ? 0? ? ? ? ? 1? ? ? Overweight14 > #549.1? 549? ? ? 0? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 > #541.2? 541? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 > #543.2? 543? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 > #545.2? 545? ? ? 1? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 > #548.2? 548? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 > #549.2? 549? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 > > #NOw with the whole dataset: > BP.sub<-BP_2b[,c(1,8:11,13)] #change here and paste the above lines: >? head(BP.subnew) >? ? # CODEA hibp21 time Obese Overweight? ? Categ > #3.1? ? ? 3? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 > #7.1? ? ? 7? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 > #8.1? ? ? 8? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 > #9.1? ? ? 9? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 > #14.1? ? 14? ? ? 1? 14? ? 0? ? ? ? ? 0 Normal14 > #21.1? ? 21? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 > > tail(BP.subnew) >? #? ? CODEA hibp21 time Obese Overweight? ? ? ? ? ? Categ > #8485.2? 8485? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 > #8506.2? 8506? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 > #8520.2? 8520? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 > #8529.2? 8529? ? ? 1? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 > #8550.2? 8550? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 > #8554.2? 8554? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 > > summary(lme.1 <- lme(hibp21~time+Categ+ time*Categ, > data=BP.subnew,random=~1|CODEA, na.action=na.omit)) > #Error in MEEM(object, conLin, control$niterEM) : >? #Singularity in backsolve at level 0, block 1 > #May be because of the reasons I mentioned above. > > #YOu didn't mention the library(gee) > BP.gee8 <- gee(hibp21~time+Categ+time*Categ, > data=BP.subnew,id=CODEA,family=binomial, > corstr="exchangeable",na.action=na.omit) > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.subnew, id > CODEA,? : >? #rank-deficient model matrix > With your codes, it might have worked, but the results may be inaccurate > # After running your whole codes: >? BP.gee8 <- gee(hibp21~time+Categ+time*Categ, > data=BP.stack3,id=CODEA,family=binomial, > corstr="exchangeable",na.action=na.omit) > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 > #running glm to get initial regression estimate >? ? #? ? ? ? (Intercept)? ? ? ? ? ? ? ? ? time? ? ? ? ? CategObese14 >? ? ? #? ? -2.456607e+01? ? ? ? ? 9.940875e-15? ? ? ? ? 2.087584e-13 >? ? # CategOverweight14? ? ? time:CategObese14 time:CategOverweight14 >? ? ? #? ? 2.087584e-13? ? ? ? ? -9.940875e-15? ? ? ? ? -9.940875e-15 > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.stack3, id > CODEA,? : >? # Cgee: error: logistic model for probability has fitted value very close > to 1. > #estimates diverging; iteration terminated. > > In short, I think it would be better to go with the suggestion in my > previous email with adequate changes in "Categ" variable (adding > ObeseOverweight14, ObeseOverweight21 etc) as I showed here. > > A.K. > > > > > > > > > ------------------------------ >? If you reply to this email, your message will be added to the discussion > below: > http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654438.html >? To unsubscribe from random effects model, click here<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4654346&code=dXNoYS5uYXRoYW5AZ21haWwuY29tfDQ2NTQzNDZ8MjAyMjE1NDI0> > . > NAML<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >-- View this message in context: http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654776.html Sent from the R help mailing list archive at Nabble.com. ??? [[alternative HTML version deleted]] ______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
HI, BP.stack5 is the one without missing values. na.omit(....).? Otherwise, I have to use the option na.action=.. in the ?geese() statement You need to read about the correlation structures.? IN unstructured option, more number of parameters needs to be estimated,? In repeated measures design, when the underlying structure is not known, it would be better to compare using different options (exchangeable is similar to compound symmetry) and select the one which provide the least value for AIC or BIC.? Have a look at http://stats.stackexchange.com/questions/21771/how-to-perform-model-selection-in-gee-in-r It's not clear to me? "reference to write about missing values". ?? A.K. ----- Original Message ----- From: Usha Gurunathan <usha.nathan at gmail.com> To: arun <smartpink111 at yahoo.com> Cc: Sent: Monday, January 7, 2013 6:12 PM Subject: Re: [R] random effects model Hi AK 2)I shall try putting exch. and check when I get home. Btw, what is BP.stack5? is it with missing values or only complete cases? I guess I am still not clear about the unstructured and exchangeable options, as in which one is better. 1)Rgding the summary(p): NA thing, I tried putting one of my gee equation. Can you suggest me a reference to write about" missing values and the implications for my results" Thanks. On 1/8/13, arun <smartpink111 at yahoo.com> wrote:> HI, > > Just to add: > fit3<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="exch",scale.fix=TRUE) > #works >? summary(fit3)$mean["p"] > #? ? ? ? ? ? ? ? ? ? ? ? ? ? p > #(Intercept)? ? ? ? 0.00000000 > #MaternalAge4? ? ? ? 0.49099242 > #MaternalAge5? ? ? ? 0.04686295 > #time21? ? ? ? ? ? ? 0.86164351 > #MaternalAge4:time21 0.59258221 > #MaternalAge5:time21 0.79909832 > > fit4<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack5,family=binomial,corstr="unstructured",scale.fix=TRUE) > #when the correlation structure is changed to "unstructured" > #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : >? # contrasts can be applied only to factors with 2 or more levels > #In addition: Warning message: > #In is.na(rows) : is.na() applied to non-(list or vector) of type 'NULL' > > > Though, it works with data(Ohio) > > fit1<-geese(resp~age+smoke+age:smoke,id=id,data=ohio1,family=binomial,corstr="unstructured",scale.fix=TRUE) >? summary(fit1)$mean["p"] > #? ? ? ? ? ? ? ? ? ? ? p > #(Intercept)? 0.00000000 > #age-1? ? ? ? 0.60555454 > #age0? ? ? ? 0.45322698 > #age1? ? ? ? 0.01187725 > #smoke1? ? ? 0.86262269 > #age-1:smoke1 0.17239050 > #age0:smoke1? 0.32223942 > #age1:smoke1? 0.36686706 > > > > By checking: >? with(BP.stack5,table(MaternalAge,time)) > #? ? ? ? ? time > #MaternalAge? 14? 21 >? #? ? ? ? 3 1104? 864 >? ? #? ? ? 4? 875? 667 >? ? #? ? 5? 67? 53 #less number of observations > > >? BP.stack6 <- BP.stack5[order(BP.stack5$CODEA, BP.stack5$time),] >? head(BP.stack6)? # very few IDs with? MaternalAge==5 > #? ? ? X CODEA Sex MaternalAge Education Birthplace AggScore IntScore > #1493 3.1? ? 3? 2? ? ? ? ? 3? ? ? ? 3? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 > #3202 3.2? ? 3? 2? ? ? ? ? 3? ? ? ? 3? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 > #1306 7.1? ? 7? 2? ? ? ? ? 4? ? ? ? 6? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 > #3064 7.2? ? 7? 2? ? ? ? ? 4? ? ? ? 6? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 > #1? ? 8.1? ? 8? 2? ? ? ? ? 4? ? ? ? 4? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 > #2047 8.2? ? 8? 2? ? ? ? ? 4? ? ? ? 4? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 >? #? ? ? ? Categ time Obese Overweight hibp > #1493 Overweight? 14? ? 0? ? ? ? ? 0? ? 0 > #3202 Overweight? 21? ? 0? ? ? ? ? 1? ? 0 > #1306? ? ? Obese? 14? ? 0? ? ? ? ? 0? ? 0 > #3064? ? ? Obese? 21? ? 1? ? ? ? ? 1? ? 0 > #1? ? ? ? Normal? 14? ? 0? ? ? ? ? 0? ? 0 > #2047? ? Normal? 21? ? 0? ? ? ? ? 0? ? 0 > BP.stack7<-BP.stack6[BP.stack6$MaternalAge!=5,] >? BP.stack7$MaternalAge<-factor(as.numeric(as.character(BP.stack7$MaternalAge) >? fit5<-geese(hibp~MaternalAge*time,id=CODEA,data=BP.stack7,family=binomial,corstr="unstructured",scale.fix=TRUE) > #Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : >? # contrasts can be applied only to factors with 2 or more levels > >? with(BP.stack7,table(MaternalAge,time))? #It looks like the combinations > are still there > #? ? ? ? ? time > #MaternalAge? 14? 21 >? #? ? ? ? 3 1104? 864 >? ? #? ? ? 4? 875? 667 > > It works also with corstr="ar1".? Why do you gave the option > "unstructured"? > A.K. > > > > > > > ----- Original Message ----- > From: rex2013 <usha.nathan at gmail.com> > To: r-help at r-project.org > Cc: > Sent: Monday, January 7, 2013 6:15 AM > Subject: Re: [R] random effects model > > Hi A.K > > Below is the comment I get, not sure why. > > BP.sub3 is the stacked data without the missing values. > > BP.geese3 <- geese(HiBP~time*MaternalAge,data=BP.sub3,id=CODEA, > family=binomial, corstr="unstructured", na.action=na.omit)Error in > `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : >? contrasts can be applied only to factors with 2 or more levels > > Even though age has 3 levels; time has 14 years & 21 years; HIBP is a > binary response outcome. > > 2) When you mentioned summary(m1)$mean["p"] what did the p mean? i > used this in one of the gee command, it produced NA as answer? > > Many thanks > > > > On Mon, Jan 7, 2013 at 5:26 AM, arun kirshna [via R] < > ml-node+s789695n4654795h72 at n4.nabble.com> wrote: > >> Hi, >> >> I am? not very familiar with the geese/geeglm().? Is it from >> library(geepack)? >> Regarding your question: >> " >> Can you tell me if I can use the geese or geeglm function with this data >> eg: : HIBP~ time* Age >> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no. >> >> From your original data: >> BP_2b<-read.csv("BP_2b.csv",sep="\t") >> head(BP_2b,2) >> #? CODEA Sex MaternalAge Education Birthplace AggScore IntScore Obese14 >> #1? ? 1? NA? ? ? ? ? 3? ? ? ? 4? ? ? ? ? 1? ? ? NA? ? ? NA? ? ? NA >> #2? ? 3? 2? ? ? ? ? 3? ? ? ? 3? ? ? ? ? 1? ? ? ? 0? ? ? ? 0? ? ? 0 >>? # Overweight14 Overweight21 Obese21 hibp14 hibp21 >> #1? ? ? ? ? NA? ? ? ? ? NA? ? ? NA? ? NA? ? NA >> #2? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 0? ? ? 0? ? ? 0 >> >> If I understand your new classification: >> BP.stacknormal<- subset(BP_2b,Obese14==0 & Overweight14==0 & Obese21==0 & >> Overweight21==0) >> BP.stackObese <- subset(BP_2b,(Obese14==1& Overweight14==0 & >> Obese14==1&Overweight14==1)|(Obese14==1&Overweight14==1 & Obese21==1 & >> Overweight21==0)|(Obese14==1&Overweight14==0 & Obese21==0 & >> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 & >> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==1 & >> Overweight21==1)|(Obese14==0 & Overweight14==1 & Obese21==1 >> &Overweight21==1)|(Obese14==1& Overweight14==1 & Obese21==1& >> Overweight21==1)) #check whether there are more classification that fits >> to >> #Obese >>? BP.stackOverweight <- subset(BP_2b,(Obese14==0 & Overweight14==1 & >> Obese21==0 & Overweight21==1)|(Obese14==0 &Overweight14==1 & Obese21==0 & >> Overweight21==0)|(Obese14==0 & Overweight14==0 & Obese21==0 & >> Overweight21==1)) >> BP.stacknormal$Categ<-"Normal" >> BP.stackObese$Categ<-"Obese" >> BP.stackOverweight$Categ <- "Overweight" >> >> BP.newObeseOverweightNormal<-na.omit(rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight)) >> >>? nrow(BP.newObeseOverweightNormal) >> #[1] 1581 >> BP.stack3 <- >> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21"),c("hibp14","hibp21")),v.names=c("Obese","Overweight","hibp"),direction="long") >> >> library(car) >> BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21") >> head(BP.stack3,2) >>? #? CODEA Sex MaternalAge Education Birthplace AggScore IntScore? Categ >> time >> #8.1? ? 8? 2? ? ? ? ? 4? ? ? ? 4? ? ? ? ? 1? ? ? ? 0? ? ? ? 0 Normal >> 14 >> #9.1? ? 9? 1? ? ? ? ? 3? ? ? ? 6? ? ? ? ? 2? ? ? ? 0? ? ? ? 0 Normal >> 14 >>? #? Obese Overweight hibp >> #8.1? ? 0? ? ? ? ? 0? ? 0 >> >> Now, your formula: (HIBP~time*Age), is it MaternalAge? >> If it is, it has three values >> unique(BP.stack3$MaternalAge) >> #[1] 4 3 5 >> and for time (14,21) # If it says that geese/geeglm, contrasts could be >> applied with factors>=2 levels, what is the problem? >> If you take "Categ" variable, it also has 3 levels (Normal, Obese, >> Overweight). >> >>? BP.stack3$MaternalAge<-factor(BP.stack3$MaternalAge) >>? BP.stack3$time<-factor(BP.stack3$time) >> >> library(geepack) >> For your last question about how to get the p-values: >> # Using one of the example datasets: >> data(seizure) >>? ? ? seiz.l <- reshape(seizure, >>? ? ? ? ? ? ? ? ? ? ? ? varying=list(c("base","y1", "y2", "y3", "y4")), >>? ? ? ? ? ? ? ? ? ? ? ? v.names="y", times=0:4, direction="long") >>? ? ? seiz.l <- seiz.l[order(seiz.l$id, seiz.l$time),] >>? ? ? seiz.l$t <- ifelse(seiz.l$time == 0, 8, 2) >>? ? ? seiz.l$x <- ifelse(seiz.l$time == 0, 0, 1) >>? ? ? m1 <- geese(y ~ offset(log(t)) + x + trt + x:trt, id = id, >>? ? ? ? ? ? ? ? ? data=seiz.l, corstr="exch", family=poisson) >>? ? ? summary(m1) >> >>? summary(m1)$mean["p"] >> #? ? ? ? ? ? ? ? ? ? p >> #(Intercept) 0.0000000 >> #x? ? ? ? ? 0.3347040 >> #trt? ? ? ? 0.9011982 >> #x:trt? ? ? 0.6236769 >> >> >> #If you need the p-values of the scale >>? ? summary(m1)$scale["p"] >>? #? ? ? ? ? ? ? ? ? p >> #(Intercept) 0.0254634 >> >> Hope it helps. >> >> A.K. >> >> >> >> >> >> >> ----- Original Message ----- >> From: rex2013 <[hidden >> email]<http://user/SendEmail.jtp?type=node&node=4654795&i=0>> >> >> To: [hidden email] <http://user/SendEmail.jtp?type=node&node=4654795&i=1> >> Cc: >> Sent: Sunday, January 6, 2013 4:55 AM >> Subject: Re: [R] random effects model >> >> Hi A.K >> >> Regarding my question on comparing normal/ obese/overweight with blood >> pressure change, I did finally as per the first suggestion of stacking the >> data and creating a normal category . This only gives me a obese not obese >> 14, but when I did with the wide format hoping to? get? a >> obese14,normal14,overweight 14 Vs hibp 21, i could not complete any of the >> models. >> This time I classified obese=1 & overweight=1 as obese itself. >> >> Can you tell me if I can use the geese or geeglm function with this data >> eg: : HIBP~ time* Age >> Here age is a factor with 3 levels, time: 2 levels, HIBP = yes/no. >> >> It says geese/geeglm: contrast can be applied only with factor with 2 or >> more levels. What is the way to overcome this. Can I manipulate the data >> to >> make it work. >> >> I need to know if the demogrphic variables affect change in blood pressure >> status over time? >> >> How to get the p values with gee model? >> >> Thanks >> On Thu, Jan 3, 2013 at 5:06 AM, arun kirshna [via R] < >> [hidden email] <http://user/SendEmail.jtp?type=node&node=4654795&i=2>> >> wrote: >> >> > HI Rex, >> > If I take a small subset from your whole dataset, and go through your >> > codes: >> > BP_2b<-read.csv("BP_2b.csv",sep="\t") >> >? BP.sub<-BP_2b[410:418,c(1,8:11,13)] #deleted the columns that are not >> > needed >> >? BP.stacknormal<- subset(BP.subnew,Obese14==0 & Overweight14==0) >> > BP.stackObese <- subset(BP.subnew,Obese14==1) >> >? BP.stackOverweight <- subset(BP.subnew,Overweight14==1) >> > BP.stacknormal$Categ<-"Normal14" >> > BP.stackObese$Categ<-"Obese14" >> > BP.stackOverweight$Categ <- "Overweight14" >> > >> BP.newObeseOverweightNormal<-rbind(BP.stacknormal,BP.stackObese,BP.stackOverweight) >> >> > >> >? BP.newObeseOverweightNormal >> > #? ? CODEA Obese14 Overweight14 Overweight21 Obese21 hibp21? ? ? ? Categ >> > #411? 541? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 0? ? ? 0? ? ? 0? ? Normal14 >> > #415? 545? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 1? ? ? 1? ? Normal14 >> > #418? 549? ? ? 0? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 0? ? ? 0? ? Normal14 >> > #413? 543? ? ? 1? ? ? ? ? ? 0? ? ? ? ? ? 1? ? ? 1? ? ? 0? ? ? Obese14 >> > #417? 548? ? ? 0? ? ? ? ? ? 1? ? ? ? ? ? 1? ? ? 0? ? ? 0 Overweight14 >> > BP.newObeseOverweightNormal$Categ<- >> > factor(BP.newObeseOverweightNormal$Categ) >> > BP.stack3 <- >> > >> reshape(BP.newObeseOverweightNormal,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") >> >> > >> > library(car) >> > BP.stack3$time<-recode(BP.stack3$time,"1=14;2=21") >> > BP.stack3 #Here Normal14 gets repeated even at time==21.? Given that you >> > are using the "Categ" and "time" #columns in the analysis, it will give >> > incorrect results. >> > #? ? ? CODEA hibp21? ? ? ? Categ time Obese Overweight >> > #541.1? 541? ? ? 0? ? Normal14? 14? ? 0? ? ? ? ? 0 >> > #545.1? 545? ? ? 1? ? Normal14? 14? ? 0? ? ? ? ? 0 >> > #549.1? 549? ? ? 0? ? Normal14? 14? ? 0? ? ? ? ? 0 >> > #543.1? 543? ? ? 0? ? ? Obese14? 14? ? 1? ? ? ? ? 0 >> > #548.1? 548? ? ? 0 Overweight14? 14? ? 0? ? ? ? ? 1 >> > #541.2? 541? ? ? 0? ? Normal14? 21? ? 0? ? ? ? ? 0 >> > #545.2? 545? ? ? 1? ? Normal14? 21? ? 1? ? ? ? ? 1 >> > #549.2? 549? ? ? 0? ? Normal14? 21? ? 0? ? ? ? ? 1 >> > #543.2? 543? ? ? 0? ? ? Obese14? 21? ? 1? ? ? ? ? 1 >> > #548.2? 548? ? ? 0 Overweight14? 21? ? 0? ? ? ? ? 1 >> > #Even if I correct the above codes, this will give incorrect >> > results/(error as you shown) because the response variable (hibp21) gets >> > #repeated when you reshape it from wide to long. >> > >> > The correct classification might be: >> > BP_2b<-read.csv("BP_2b.csv",sep="\t") >> >? BP.sub<-BP_2b[410:418,c(1,8:11,13)] >> > >> BP.subnew<-reshape(BP.sub,idvar="CODEA",timevar="time",sep="",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),direction="long") >> >> > >> > BP.subnew$time<-recode(BP.subnew$time,"1=14;2=21") >> >? BP.subnew<-na.omit(BP.subnew) >> > >> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14 & >> > BP.subnew$Obese==0]<-"Overweight14" >> > BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21 & >> > BP.subnew$Obese==0]<-"Overweight21" >> >? BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==14 & >> > BP.subnew$Overweight==0]<-"Obese14" >> >? BP.subnew$Categ[BP.subnew$Obese==1 & BP.subnew$time==21 & >> > BP.subnew$Overweight==0]<-"Obese21" >> >? BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==21& >> > BP.subnew$Obese==1]<-"ObeseOverweight21" >> >? BP.subnew$Categ[BP.subnew$Overweight==1 & BP.subnew$time==14& >> > BP.subnew$Obese==1]<-"ObeseOverweight14" >> > BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 >> > &BP.subnew$time==14]<-"Normal14" >> >? BP.subnew$Categ[BP.subnew$Overweight==0 & BP.subnew$Obese==0 >> > &BP.subnew$time==21]<-"Normal21" >> > >> > BP.subnew$Categ<-factor(BP.subnew$Categ) >> > BP.subnew$time<-factor(BP.subnew$time) >> > BP.subnew >> > #? ? ? CODEA hibp21 time Obese Overweight? ? ? ? ? ? Categ >> > #541.1? 541? ? ? 0? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 >> > #543.1? 543? ? ? 0? 14? ? 1? ? ? ? ? 0? ? ? ? ? Obese14 >> > #545.1? 545? ? ? 1? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 >> > #548.1? 548? ? ? 0? 14? ? 0? ? ? ? ? 1? ? ? Overweight14 >> > #549.1? 549? ? ? 0? 14? ? 0? ? ? ? ? 0? ? ? ? ? Normal14 >> > #541.2? 541? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 >> > #543.2? 543? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 >> > #545.2? 545? ? ? 1? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 >> > #548.2? 548? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 >> > #549.2? 549? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 >> > >> > #NOw with the whole dataset: >> > BP.sub<-BP_2b[,c(1,8:11,13)] #change here and paste the above lines: >> >? head(BP.subnew) >> >? ? # CODEA hibp21 time Obese Overweight? ? Categ >> > #3.1? ? ? 3? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 >> > #7.1? ? ? 7? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 >> > #8.1? ? ? 8? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 >> > #9.1? ? ? 9? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 >> > #14.1? ? 14? ? ? 1? 14? ? 0? ? ? ? ? 0 Normal14 >> > #21.1? ? 21? ? ? 0? 14? ? 0? ? ? ? ? 0 Normal14 >> > >> > tail(BP.subnew) >> >? #? ? CODEA hibp21 time Obese Overweight? ? ? ? ? ? Categ >> > #8485.2? 8485? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 >> > #8506.2? 8506? ? ? 0? 21? ? 0? ? ? ? ? 1? ? ? Overweight21 >> > #8520.2? 8520? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 >> > #8529.2? 8529? ? ? 1? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 >> > #8550.2? 8550? ? ? 0? 21? ? 1? ? ? ? ? 1 ObeseOverweight21 >> > #8554.2? 8554? ? ? 0? 21? ? 0? ? ? ? ? 0? ? ? ? ? Normal21 >> > >> > summary(lme.1 <- lme(hibp21~time+Categ+ time*Categ, >> > data=BP.subnew,random=~1|CODEA, na.action=na.omit)) >> > #Error in MEEM(object, conLin, control$niterEM) : >> >? #Singularity in backsolve at level 0, block 1 >> > #May be because of the reasons I mentioned above. >> > >> > #YOu didn't mention the library(gee) >> > BP.gee8 <- gee(hibp21~time+Categ+time*Categ, >> > data=BP.subnew,id=CODEA,family=binomial, >> > corstr="exchangeable",na.action=na.omit) >> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 >> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.subnew, id >> >> > CODEA,? : >> >? #rank-deficient model matrix >> > With your codes, it might have worked, but the results may be inaccurate >> > # After running your whole codes: >> >? BP.gee8 <- gee(hibp21~time+Categ+time*Categ, >> > data=BP.stack3,id=CODEA,family=binomial, >> > corstr="exchangeable",na.action=na.omit) >> > #Beginning Cgee S-function, @(#) geeformula.q 4.13 98/01/27 >> > #running glm to get initial regression estimate >> >? ? #? ? ? ? (Intercept)? ? ? ? ? ? ? ? ? time? ? ? ? ? CategObese14 >> >? ? ? #? ? -2.456607e+01? ? ? ? ? 9.940875e-15? ? ? ? ? 2.087584e-13 >> >? ? # CategOverweight14? ? ? time:CategObese14 time:CategOverweight14 >> >? ? ? #? ? 2.087584e-13? ? ? ? ? -9.940875e-15? ? ? ? ? -9.940875e-15 >> > #Error in gee(hibp21 ~ time + Categ + time * Categ, data = BP.stack3, id >> >> > CODEA,? : >> >? # Cgee: error: logistic model for probability has fitted value very >> close >> > to 1. >> > #estimates diverging; iteration terminated. >> > >> > In short, I think it would be better to go with the suggestion in my >> > previous email with adequate changes in "Categ" variable (adding >> > ObeseOverweight14, ObeseOverweight21 etc) as I showed here. >> > >> > A.K. >> > >> > >> > >> > >> > >> > >> > >> > >> > ------------------------------ >> >? If you reply to this email, your message will be added to the >> discussion >> > below: >> > >> >> > . >> > NAML< >> http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >> >> > >> >> >> >> >> -- >> View this message in context: >> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654776.html >> Sent from the R help mailing list archive at Nabble.com. >>? ? [[alternative HTML version deleted]] >> >> ______________________________________________ >> [hidden email] >> <http://user/SendEmail.jtp?type=node&node=4654795&i=3>mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> ______________________________________________ >> [hidden email] >> <http://user/SendEmail.jtp?type=node&node=4654795&i=4>mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> >> ------------------------------ >>? If you reply to this email, your message will be added to the discussion >> below: >> http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654795.html >>? To unsubscribe from random effects model, click >> here<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=unsubscribe_by_code&node=4654346&code=dXNoYS5uYXRoYW5AZ21haWwuY29tfDQ2NTQzNDZ8MjAyMjE1NDI0> >> . >> NAML<http://r.789695.n4.nabble.com/template/NamlServlet.jtp?macro=macro_viewer&id=instant_html%21nabble%3Aemail.naml&base=nabble.naml.namespaces.BasicNamespace-nabble.view.web.template.NabbleNamespace-nabble.view.web.template.NodeNamespace&breadcrumbs=notify_subscribers%21nabble%3Aemail.naml-instant_emails%21nabble%3Aemail.naml-send_instant_email%21nabble%3Aemail.naml> >> > > > > > -- > View this message in context: > http://r.789695.n4.nabble.com/random-effects-model-tp4654346p4654826.html > Sent from the R help mailing list archive at Nabble.com. >? ? [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > >
Hi, For your first question, M1<-as.table(rbind(c(825,2407),c(828,2200))) ?dimnames(M1)<- list(gender=c("Male","Female"), MV=c("Study","NonStudy/missing")) M1 #??????? MV #gender?? Study NonStudy/missing ?# Male???? 825???????????? 2407 ?# Female?? 828???????????? 2200 Xsq<-chisq.test(M1) Xsq #??? Pearson's Chi-squared test with Yates' continuity correction #data:? M1 #X-squared = 2.5684, df = 1, p-value = 0.109 I will take a look at your second question later. A.K. ________________________________ From: Usha Gurunathan <usha.nathan at gmail.com> To: arun <smartpink111 at yahoo.com> Sent: Sunday, January 13, 2013 1:51 AM Subject: Re: [R] random effects model HI AK Thanks a lot? for explaining that. 1. With the chi sq. ( in order to find out if the diffce is significant between groups) do I have create a separate excel file and make a dataframe.How do I go about it? On Sun, Jan 13, 2013 at 1:22 PM, arun <smartpink111 at yahoo.com> wrote: HI,>???? >table(BP_2b$Sex) #original dataset >#?? 1??? 2 >#3232 3028 >?nrow(BP_2b) >#[1] 6898 >?nrow(BP_2bSexNoMV) >#[1] 6260 >?6898-6260 >#[1] 638 #these rows were removed from the BP_2b to create BP_2bSexNoMV >BP_2bSexMale<-BP_2bSexNoMV[BP_2bSexNoMV$Sex=="Male",] >?nrow(BP_2bSexMale) >#[1] 3232 >?nrow(BP_2bSexMale[!complete.cases(BP_2bSexMale),]) #Missing rows with Male >#[1] 2407 >?nrow(BP_2bSexMale[complete.cases(BP_2bSexMale),]) #Non missing rows with Male >#[1] 825 > > >You did the chisquare test on the new dataset with 6260 rows, right. >I removed those 638 rows because these doesn't belong to either male or female, but you want the % of missing value per male or female.? So, I thought this will bias the results.? If you want to include the missing values, you could do it, but I don't know where you would put that missing values as it cannot be classified as belonging specifically to males or females.? I hope you understand it. > >Sometimes, the maintainer's respond a bit slow.? You have to sent an email reminding him again. > >Regarding the vmv package, you could email Waqas Ahmed Malik (malik at math.uni-augsburg.de) regarding options for changing the title and the the font etc. >You could also use this link (http://www.r-bloggers.com/visualizing-missing-data-2/ ) to plot missing value (?plot.missing()).? I never used that package, but you could try.? Looks like it gives more information. > >A.K. > > > > > > > > >________________________________ >From: Usha Gurunathan <usha.nathan at gmail.com> >To: arun <smartpink111 at yahoo.com> >Sent: Saturday, January 12, 2013 9:05 PM > >Subject: Re: [R] random effects model > > >Hi A.K > >So it is number of females missing/total female participants enrolled: 72.65% >Number of females missing/total (of males+ females) ?participants enrolled : 35.14% > >The total no. with the master data: Males: 3232, females: 3028 ( I got this before removing any missing values) > >with table(Copy.of.BP_2$ Sex) ?## BP > > >If I were to write a table ( ?and do a chi sq. later),? > >as Gender ? ? ? ? ? ?Study ? ? ? ? ? ? ? ? ? ?Non study/missing ? ? Total >? ? ??Male ? ? ? ? ? ? ?825 (25.53%)? ? ? ? ? ? ?2407 (74.47%) ? ? ? 3232 (100%) >? ??Female ? ? ? ? ? 828 (27.35%) ? ? ? ? ? ? 2200 (72.65%) ? ? ? 3028 ( 100%) >? ? ?Total ? ? ? ? ? ? ?1653 ? ? ? ? ? ? ? ? ? ? ? ? ?4607 ? ? ? ? ? ? ? ? ? ? ?6260 ? ? > > >The problem is when I did? >>colSums(is.na(Copy.of.BP_2), the sex category showed N=638. > >I cannot understand the discrepancy.Also, when you have mentioned to remove NA, is that not a missing value that needs to be included in the total number missing. I am a bit confused. Can you help? > >## I tried sending email to gee pack maintainer at the ID with R site, mail didn't go through?? > >Many thanks > > > > > > >On Sun, Jan 13, 2013 at 9:17 AM, arun <smartpink111 at yahoo.com> wrote: > >Hi, >>Yes, you are right. ?72.655222% was those missing among females. ?35.14377% of values in females are missing from among the whole dataset (combined total of Males+Females data after removing the NAs from the variable "Sex").? >> >>A.K. >> >> >> >>________________________________ >>From: Usha Gurunathan <usha.nathan at gmail.com> >>To: arun <smartpink111 at yahoo.com> >>Cc: R help <r-help at r-project.org> >>Sent: Saturday, January 12, 2013 5:59 PM >> >>Subject: Re: [R] random effects model >> >> >> >>Hi AK >>That works. I was trying to get? similar results from any other package. Being a beginner, I was not sure how to modify the syntax to get my output. >> >>lapply(split(BP_2bSexNoMV,BP_ >>2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females >>#$Female >>#[1] 72.65522 >># >>#$Male >>#[1] 74.47401 >> >>#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column) >>?lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100) >>#$Female >>#[1] 35.14377 >># >>#$Male >>#[1] 38.45048 >> >>How do I interpret the above 2 difft results? 72.66% of values were missing among female participants?? Can you pl. clarify. >> >>Many thanks. >> >> >>On Sun, Jan 13, 2013 at 3:28 AM, arun <smartpink111 at yahoo.com> wrote: >> >>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females >>>#$Female >>>#[1] 72.65522 >>># >>>#$Male >>>#[1] 74.47401 >>> >>>#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column) >>>?lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100) >>>#$Female >>>#[1] 35.14377 >>># >>>#$Male >>>#[1] 38.45048 >> >
HI, I think I mentioned to you before that when you reshape the columns excluding the response variable, response variable gets repeated (in this case hibp14 or hibp21) and creates the error" I run your code, there are obvious problems in the code so I didn't reach up to BP.gee BP_2b<-read.csv("BP_2b.csv",sep="\t") BP.stack3 <- reshape(BP_2b,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),times=factor(c(1,2)),direction="long") BP.stack3 <- transform(BP.stack3,CODEA=factor(CODEA),Sex=factor(Sex,labels=c("Male","Female")),MaternalAge=factor(MaternalAge,labels=c("39years or less","40-49 years","50 years or older")),Education=factor(Education,labels=c("Primary/special","Started secondary","Completed grade10", "Completed grade12", "College","University")),Birthplace=factor(Birthplace,labels=c("Australia","Other English-speaking","Other"))) ?BP.stack3$Sex <- factor(BP.stack3$Sex,levels=levels(BP.stack3$Sex)[c(2,1)]) ?BP.sub3a <-? subset(BP.stack3,subset=!(is.na(Sex)| is.na(Education)|is.na(Birthplace)|is.na(Education)|is.na(hibp14)| is.na(hibp21)))?? ?nrow(BP.sub3a) #[1] 3364 ?BP.sub5a <- BP.sub3a[order(BP.sub3a$CODEA),] # your code was BP.sub5a <- BP.sub3a[order(BP.sub5a$CODEA),]? ???????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????????? ^^^^^ was not defined before #Next line BPsub3$Categ[BPsub6$Overweight==1&BPsub3$time==1&BPsub3$Obese==0]<- "Overweight14"? #It should be BP.sub3 and what is BPsub6, it was not defined previously. #Error in BPsub3$Categ[BPsub6$Overweight == 1 & BPsub3$time == 1 & BPsub3$Obese ==? : ? #object 'BPsub3' not found A.K. ________________________________ From: Usha Gurunathan <usha.nathan at gmail.com> To: arun <smartpink111 at yahoo.com> Sent: Sunday, January 13, 2013 1:51 AM Subject: Re: [R] random effects model HI AK Thanks a lot? for explaining that. 1. With the chi sq. ( in order to find out if the diffce is significant between groups) do I have create a separate excel file and make a dataframe.How do I go about it? I have resent a mail to Jun Yan at a difft email ad( first add.didn't work, mail not delivered). 2. With my previous query ( reg. Obese/Overweight/ Normal at age 14 Vs change of blood pressure status at 21), even though I had compromised without the age-specific regression, but I am still keen to explore why the age-specific regression didn't work despite it looking okay. I have given below the syntax. If you get time, could you kindly look at it and see if it could work by any chance? Apologies for persisting with this query. BP.stack3 <- reshape(Copy.of.BP_2,idvar="CODEA",timevar="time",sep="_",varying=list(c("Obese14","Obese21"),c("Overweight14","Overweight21")),v.names=c("Obese","Overweight"),times=factor(c(1,2)),direction="long BP.stack3 head(BP.stack3) tail(BP.stack3) ?names(BP.stack3)[c(2,3,4,5,6,7)] <- c("Sex","MaternalAge","Education","Birthplace","AggScore","IntScore") BP.stack3 <- transform(BP.stack3,CODEA=factor(CODEA),Sex=factor(Sex,labels=c("Male","Female")),MaternalAge=factor(MaternalAge,labels=c("39years or less","40-49 years","50 years or older")),Education=factor(Education,labels=c("Primary/special","Started secondary","Completed grade10", "Completed grade12", "College","University")),Birthplace=factor(Birthplace,labels=c("Australia","Other English-speaking","Other"))) table(BP.stack3$Sex)?? BP.stack3$Sex <- factor(BP.stack3$Sex,levels=levels(BP.stack3$Sex)[c(2,1)]) levels(BP.stack3$Sex) BP.sub3a <-? subset(BP.stack3,subset=!(is.na(Sex)| is.na(Education)|is.na(Birthplace)|is.na(Education)|is.na(hibp14)| is.na(hibp21)))??? summary(BP.sub3a) BP.sub5a <- BP.sub3a[order(BP.sub5a$CODEA),]? ?BPsub3$Categ[BPsub6$Overweight==1&BPsub3$time==1&BPsub3$Obese==0] <- "Overweight14" BPsub3$Categ[BPsub6$Overweight==1&BPsub3$time==2&BPsub3$Obese==0] <- "Overweight21" BPsub3$Categ[BPsub3$Obese==1&BPsub3$time==1&BPsub3$Overweight==0|BPsub3$Obese==1&BPsub3$time==1&BPsub3$Overweight==1 ] <- "Obese14" BPsub3$Categ[BPsub3$Obese==0&BPsub3$time==1&BPsub3 BPsub3$Categ[BPsub6$Overweight==1&BPsub3$time==1&BPsub3$Obese==0] <- "Overweight14"$Overweight==0] <- "Normal14" BPsub3$Categ[BPsub3$Obese==0&BPsub3$time==2&BPsub3$Overweight==0] <- "Normal21" BPsub3$Categ[BPsub3$Obese==1&BPsub3$time==2&BPsub3$Overweight==0|BPsub3$Obese==1&BPsub3$time==2&BPsub3$Overweight==1] <- "Obese21" BPsub3$Categ <- factor(BPsub3$Categ) BPsub3$time <- factor(BPsub3$time) summary(BPsub3$Categ) BPsub7 <- subset(BPsub6,subset=!(is.na(Categ))) BPsub7$time <- recode(BPsub7$time,"1=14;2=21") BPsub7$hibp14 <- factor(BPsub7$hibp14) levels(BPsub7$hibp14) levels(BPsub7$Categ) names(BPsub7) head(BPsub7)??? ### this was looking quite okay. tail(BPsub7) str(BPsub7) library(gee) BP.gee <- geese(hibp14~ time*Categ, data=BPsub7,id=CODEA,family=binomial, corstr="exchangeable",na.action=na.omit) Thanks again. On Sun, Jan 13, 2013 at 1:22 PM, arun <smartpink111 at yahoo.com> wrote: HI,>???? >table(BP_2b$Sex) #original dataset >#?? 1??? 2 >#3232 3028 >?nrow(BP_2b) >#[1] 6898 >?nrow(BP_2bSexNoMV) >#[1] 6260 >?6898-6260 >#[1] 638 #these rows were removed from the BP_2b to create BP_2bSexNoMV >BP_2bSexMale<-BP_2bSexNoMV[BP_2bSexNoMV$Sex=="Male",] >?nrow(BP_2bSexMale) >#[1] 3232 >?nrow(BP_2bSexMale[!complete.cases(BP_2bSexMale),]) #Missing rows with Male >#[1] 2407 >?nrow(BP_2bSexMale[complete.cases(BP_2bSexMale),]) #Non missing rows with Male >#[1] 825 > > >You did the chisquare test on the new dataset with 6260 rows, right. >I removed those 638 rows because these doesn't belong to either male or female, but you want the % of missing value per male or female.? So, I thought this will bias the results.? If you want to include the missing values, you could do it, but I don't know where you would put that missing values as it cannot be classified as belonging specifically to males or females.? I hope you understand it. > >Sometimes, the maintainer's respond a bit slow.? You have to sent an email reminding him again. > >Regarding the vmv package, you could email Waqas Ahmed Malik (malik at math.uni-augsburg.de) regarding options for changing the title and the the font etc. >You could also use this link (http://www.r-bloggers.com/visualizing-missing-data-2/ ) to plot missing value (?plot.missing()).? I never used that package, but you could try.? Looks like it gives more information. > >A.K. > > > > > > > > >________________________________ >From: Usha Gurunathan <usha.nathan at gmail.com> >To: arun <smartpink111 at yahoo.com> >Sent: Saturday, January 12, 2013 9:05 PM > >Subject: Re: [R] random effects model > > >Hi A.K > >So it is number of females missing/total female participants enrolled: 72.65% >Number of females missing/total (of males+ females) ?participants enrolled : 35.14% > >The total no. with the master data: Males: 3232, females: 3028 ( I got this before removing any missing values) > >with table(Copy.of.BP_2$ Sex) ?## BP > > >If I were to write a table ( ?and do a chi sq. later),? > >as Gender ? ? ? ? ? ?Study ? ? ? ? ? ? ? ? ? ?Non study/missing ? ? Total >? ? ??Male ? ? ? ? ? ? ?825 (25.53%)? ? ? ? ? ? ?2407 (74.47%) ? ? ? 3232 (100%) >? ??Female ? ? ? ? ? 828 (27.35%) ? ? ? ? ? ? 2200 (72.65%) ? ? ? 3028 ( 100%) >? ? ?Total ? ? ? ? ? ? ?1653 ? ? ? ? ? ? ? ? ? ? ? ? ?4607 ? ? ? ? ? ? ? ? ? ? ?6260 ? ? > > >The problem is when I did? >>colSums(is.na(Copy.of.BP_2), the sex category showed N=638. > >I cannot understand the discrepancy.Also, when you have mentioned to remove NA, is that not a missing value that needs to be included in the total number missing. I am a bit confused. Can you help? > >## I tried sending email to gee pack maintainer at the ID with R site, mail didn't go through?? > >Many thanks > > > > > > >On Sun, Jan 13, 2013 at 9:17 AM, arun <smartpink111 at yahoo.com> wrote: > >Hi, >>Yes, you are right. ?72.655222% was those missing among females. ?35.14377% of values in females are missing from among the whole dataset (combined total of Males+Females data after removing the NAs from the variable "Sex").? >> >>A.K. >> >> >> >>________________________________ >>From: Usha Gurunathan <usha.nathan at gmail.com> >>To: arun <smartpink111 at yahoo.com> >>Cc: R help <r-help at r-project.org> >>Sent: Saturday, January 12, 2013 5:59 PM >> >>Subject: Re: [R] random effects model >> >> >> >>Hi AK >>That works. I was trying to get? similar results from any other package. Being a beginner, I was not sure how to modify the syntax to get my output. >> >>lapply(split(BP_2bSexNoMV,BP_ >>2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females >>#$Female >>#[1] 72.65522 >># >>#$Male >>#[1] 74.47401 >> >>#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column) >>?lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100) >>#$Female >>#[1] 35.14377 >># >>#$Male >>#[1] 38.45048 >> >>How do I interpret the above 2 difft results? 72.66% of values were missing among female participants?? Can you pl. clarify. >> >>Many thanks. >> >> >>On Sun, Jan 13, 2013 at 3:28 AM, arun <smartpink111 at yahoo.com> wrote: >> >>lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(x))*100) #gives the percentage of rows of missing #values from the overall rows for Males and Females >>>#$Female >>>#[1] 72.65522 >>># >>>#$Male >>>#[1] 74.47401 >>> >>>#iF you want the percentage from the total number rows in Males and Females (without NA's in the the Sex column) >>>?lapply(split(BP_2bSexNoMV,BP_2bSexNoMV$Sex),function(x) (nrow(x[!complete.cases(x[,-2]),])/nrow(BP_2bSexNoMV))*100) >>>#$Female >>>#[1] 35.14377 >>># >>>#$Male >>>#[1] 38.45048 >> >??????