It means you have selected a response variable from one data frame
(unmarried.male) and a predictor from another data frame (fieder.male) and they
have different lengths.
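To see the mismatch in isolation, here is a minimal sketch on made-up data. The data frame 'toy' and its column names (sex, married, age) are hypothetical stand-ins, not the columns of the posted file. It only shows that taking the response from one subset and a predictor from another gives vectors of different lengths, which is what model.frame objects to.

```r
# Toy data; all names here are hypothetical, not taken from the posted file.
toy <- data.frame(sex     = c(1, 1, 1, 1, 0, 0),
                  married = c(1, 0, 1, 0, 1, 0),
                  age     = c(25, 40, 31, 52, 29, 47))

all.male       <- subset(toy, sex == 1)                 # every man
unmarried.only <- subset(toy, sex == 1 & married == 1)  # a smaller subset

n.all <- nrow(all.male)        # rows in the larger data frame
n.sub <- nrow(unmarried.only)  # rows in the smaller data frame

# Response from one data frame, predictor from the other:
# the two vectors differ in length, so building the model frame fails.
fit <- tryCatch(
  glm(unmarried.only$married ~ all.male$age, family = binomial),
  error = function(e) conditionMessage(e)   # keep the message, do not stop
)
```

Comparing nrow() of the two data frames before fitting is usually enough to spot this kind of problem.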
You would be better off using the names of the variables in the data frame
rather than selecting columns in a form such as 'some.data.frame[, 3]'. This just
confuses the issue and makes it very easy to make mistakes, as indeed you have
done.
Also, to fit models on subsets of the data, you do not have to create separate
data frames. See the 'subset' argument of glm, which is standard for
most fitting functions. This is also a way to avoid problems and would have
helped you here as well.
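As a rough sketch of both suggestions together, the call below uses one data frame, refers to variables by name, and picks out the men with 'subset'. The data are simulated and every column name (sex, unmarried, educ, age, income) is a guess made up for illustration; substitute whatever names(data) shows for the real file.

```r
# Simulated stand-in for the posted data; all column names are hypothetical.
set.seed(1)
n <- 200
toy <- data.frame(sex       = rbinom(n, 1, 0.5),
                  unmarried = rbinom(n, 1, 0.4),
                  educ      = sample(1:3, n, replace = TRUE),
                  age       = sample(20:60, n, replace = TRUE),
                  income    = runif(n, 0, 90000))

# Every variable comes from the same data frame, so all lengths agree.
# 'subset' restricts the fit to men without creating a second data frame.
fit.male <- glm(unmarried ~ factor(educ) + age + I(age^2) + sqrt(income),
                family = binomial(link = "logit"),
                data = toy, subset = sex == 1)

n.used <- length(fitted(fit.male))  # one fitted value per row used
n.men  <- sum(toy$sex == 1)         # number of men in the data
```

Because the transformations (I(age^2), sqrt(income)) sit inside the formula, there are no separately computed vectors that could get out of step with the rows being modelled.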
Bill Venables.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of gked
Sent: Monday, 14 March 2011 4:33 AM
To: r-help at r-project.org
Subject: [R] troubles with logistic regression
Hello everyone,
I am working on the dataset for my class project and got stuck trying to
run a logistic regression. Here is my code:
data <- read.csv(file="C:/Users/fieder.data.2000.csv")
# creating subset of men
fieder.male<-subset(data,data[,8]==1)
unmarried.male<-subset(data,data[,8]==1&data[,6]==1)
# glm fit
agesq.male<-(unmarried.male[,5])^2
male.sqrtincome<-sqrt(unmarried.male[,9])
fieder.male.mar.glm<-glm(as.factor(unmarried.male[,6])~
    factor(fieder.male[,7])+fieder.male[,5]+agesq.male+
    male.sqrtincome,binomial(link="logit") )
par(mfrow=c(1,1))
plot(c(0,300),c(0,1),pch=" ",
    xlab="sqrt income, truncated at 90000",
    ylab="modeled probability of being never-married")
junk<- lowess(male.sqrtincome,
    log(fieder.male.mar.glm$fitted.values/
    (1-fieder.male.mar.glm$fitted.values)))
lines(junk$x,exp(junk$y)/(1+exp(junk$y)))
title(main="probability of never marrying\n males, by sqrt(income)")
points(male.sqrtincome[unmarried.male==0],
    fieder.male.mar.glm$fitted.values[unmarried.male==0],pch=16)
points(male.sqrtincome[unmarried.male==1],
    fieder.male.mar.glm$fitted.values[unmarried.male==1],pch=1)
The error says:
Error in model.frame.default(formula = as.factor(unmarried.male[, 6]) ~ :
variable lengths differ (found for 'factor(fieder.male[, 7])')
What does it mean? Where am I making a mistake?
Thank you
P.S. I am also attaching the data file in .csv format:
http://r.789695.n4.nabble.com/file/n3352356/fieder.data.2000.csv