Displaying 20 results from an estimated 20000 matches similar to: "How to do cross validation with glm?"
2011 Aug 23
3
GLM question
Hi All,
I am trying to fit a glm model to my data. My data is a matrix of size n*100, so I have n rows and 100 columns, and my vector y of size n contains the labels (0 or 1).
My question is:
instead of manually typing the model as
glm.fit = glm(y~ x[,1]+x[,2]+...+x[,100], family=binomial())
I have a for loop that concatenates the x variables as follows:
final_str=NULL
for
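A loop is not strictly needed to build such a model. The following is a minimal sketch, assuming the predictors can be placed in a data frame; the simulated data and the names dat, fit_dot, and fit_built are illustrative and not from the original post. It shows the `y ~ .` shorthand and, as an alternative, building the formula from the column names with reformulate().

```r
# Simulated stand-in data (an assumption): n rows, 100 predictors, 0/1 labels.
set.seed(1)
n <- 500
x <- matrix(rnorm(n * 100), nrow = n, ncol = 100)
y <- rbinom(n, size = 1, prob = plogis(x[, 1] - x[, 2]))

# Put response and predictors in one data frame; the unnamed matrix
# columns become X1, ..., X100.
dat <- data.frame(y = y, x)

# `y ~ .` means "regress y on every other column of dat".
fit_dot <- glm(y ~ ., data = dat, family = binomial())

# Equivalent: build the formula programmatically from the column names.
f <- reformulate(setdiff(names(dat), "y"), response = "y")
fit_built <- glm(f, data = dat, family = binomial())
```

Both calls fit the same model; the reformulate() version is handy when only a subset of columns should enter the formula.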
2011 Aug 26
2
How to find the accuracy of the predicted glm model with family = binomial (link = logit)
Hi All,
When modeling with glm and family = binomial (link = logit) and response values of 0 and 1, I get the predicted probabilities of assignment to class one. I would then like to compare them with my vector y, which holds the original labels. How should I change the probabilities into values of zero and 1 and then compare it with my vector y to find out about the accuracy of my
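One common way to do this is to threshold the predicted probabilities and compare the result with the labels. The sketch below assumes a cutoff of 0.5 (a convention, not something stated in the post) and uses simulated data; the names dat, prob, pred, and conf_mat are illustrative.

```r
# Simulated stand-in data (an assumption): two predictors and a 0/1 response.
set.seed(2)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(1.5 * dat$x1 - dat$x2))

fit <- glm(y ~ x1 + x2, data = dat, family = binomial(link = "logit"))

# type = "response" returns probabilities rather than log-odds.
prob <- predict(fit, type = "response")

# Assumed cutoff of 0.5: probabilities above it are labelled 1.
pred <- as.integer(prob > 0.5)

# Share of observations whose predicted label matches the true label.
accuracy <- mean(pred == dat$y)

# Cross-tabulation of predicted versus actual labels.
conf_mat <- table(predicted = pred, actual = dat$y)
```

Accuracy computed on the data used for fitting tends to be optimistic, so held-out data or cross-validation gives a fairer estimate.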
2011 Aug 20
2
a Question regarding glm for linear regression
Hello All,
I have a question about glm in R. I would like to fit a model with the glm function. I have a vector y (size n), which is my response variable, and a matrix X of size (n*f), where f is the number of features or columns. I have about 80 features, and when I fit a model using the following formula,
glmfit = glm(y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13
2011 Sep 07
1
Question about model selection for glm -- how to select features based on BIC?
Hi All,
After fitting a model with glm function, I would like to do the model selection and select some of the features and I am using the "step function" as follows:
glm.fit <- glm (Y ~ . , data = dat, family = binomial(link=logit))
AIC_fitted = step(glm.fit, direction = "both")
I was wondering whether there is any way to select the features based on BIC rather than AIC. Is there
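For reference, step() takes a penalty argument k, and setting k = log(n) makes the criterion correspond to BIC. The sketch below uses simulated data; the names dat, full_fit, and BIC_fitted are illustrative and not from the original post.

```r
# Simulated stand-in data (an assumption): six candidate predictors,
# only x1 and x2 actually influence Y.
set.seed(3)
n <- 300
dat <- data.frame(matrix(rnorm(n * 6), ncol = 6))
names(dat) <- paste0("x", 1:6)
dat$Y <- rbinom(n, 1, plogis(1.2 * dat$x1 - 1.0 * dat$x2))

full_fit <- glm(Y ~ ., data = dat, family = binomial(link = "logit"))

# k = 2 (the default) gives AIC; k = log(n) gives the BIC penalty.
# trace = 0 suppresses the step-by-step printout.
BIC_fitted <- step(full_fit, direction = "both",
                   k = log(nrow(dat)), trace = 0)

formula(BIC_fitted)
```

Because the BIC penalty grows with the sample size, this usually retains fewer predictors than the default AIC search.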
2011 Aug 10
2
glmnet
Hi All,
I have been trying to use the glmnet package to do LASSO linear regression. My x data is a matrix n_row by n_col and y is a vector of size n_row corresponding to the vector data. The number of n_col is much larger than the number of n_row. I do the following:
fits = glmnet(x, y, family="multinomial")
I have been following this
2011 Aug 11
1
cv.glmnet question -- why is it giving me an error
Hi All,
I am trying to run cv.glmnet(x,y,family="multinomial", nfolds =4). I only have 8 observations and 1000 features, so my x matrix is 8 by 1000. When I run it, I get the following error, and I am not sure what is causing the problem.
Error in predmat[which, , seq(nlami)] = preds : number of items to replace is not a multiple of replacement length
Can
2011 Aug 27
1
Grouping variables in a data frame
Hi All,
I have a data frame as follow:
user_id time age location gender
.....
and I fit a logistic regression to learn the weights (glm with family = binomial(link = logit)). My response value is either zero or one. I would like to group the users based on user_id and time and see the y values and predicted y values at the same time, or plot them somehow. Is there any way to somehow group them
2012 Jan 06
3
How to fit my data with a distribution?
Dear All,
I have a bunch of data points as follows:
x 100
y 200
z 300
...
where 100, 200, 300 are the values. I would like to know the distribution of my data. How can I fit a distribution to my data?
Thanks a lot,
Andra
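One possible starting point is fitdistr() from the MASS package, which fits a named distribution by maximum likelihood. The sketch below is only an illustration under stated assumptions: the data are simulated, and the normal and log-normal candidates are arbitrary choices, not something the post specifies.

```r
library(MASS)

# Simulated stand-in for the observed values (an assumption).
set.seed(4)
values <- rlnorm(200, meanlog = 5, sdlog = 0.5)

# Fit two candidate distributions by maximum likelihood.
fit_norm  <- fitdistr(values, "normal")
fit_lnorm <- fitdistr(values, "lognormal")

# Estimated parameters of the log-normal fit (meanlog, sdlog).
fit_lnorm$estimate

# Lower AIC suggests the better-fitting candidate among those tried.
aic_values <- c(normal = AIC(fit_norm), lognormal = AIC(fit_lnorm))
aic_values
```

Plotting a histogram of the values with the fitted density on top is a useful visual check alongside the AIC comparison.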
2008 Jun 09
1
Cross-validation in R
Folks; I am having a problem with the cv.glm and would appreciate someone
shedding some light here. It seems obvious but I cannot get it. I did read
the manual, but I could not get more insight. This is a database containing
3363 records and I am trying a cross-validation to understand the process.
When using the cv.glm, code below, I get mean of perr1 of 0.2336 and SD of
0.000139. When using a
2007 May 11
1
model selection by leave-one-out cross-validation
Hi, all
When I am using mle.cv(wle), I find an interesting problem: I can't do
leave-one-out cross-validation with mle.cv(wle). I will illustrate the
problem as follows:
> xx=matrix(rnorm(20*3),ncol=3)
> bb=c(1,2,0)
> yy=xx%*%bb+rnorm(20,0,0.001)+0
> summary(mle.cv(yy~xx,split=nrow(xx)-1,monte.carlo=2*nrow(xx),verbose=T),
num.max=1)[[1]]
mle.cv: dimension of the split subsample
2011 Aug 25
1
How to combine two learned regression models?
Hi All,
I have a set of features of size p and I would like to separate my feature space into two sets so that p = p1 + p2, where p1 is one set of features and p2 is another, and I want to fit a glm model for each set of features separately. Then I want to combine the results of the two glm models with a parameter beta. For example, beta * F(p1) + (1-beta) * F(p2) where F(p1) is a learned
2010 Apr 02
2
Cross-validation for parameter selection (glm/logit)
My aim is to select a good subset of parameters for my final logit
model built using glm(). What is the best way to cross-validate the
results so that they are reliable?
Let's say that I have a large dataset of 1000's of observations. I
split this data into two groups, one that I use for training and
another for validation. First I use the training set to build a model,
and then the
2007 May 14
1
cross-validation / sensitivity analysis for logistic regression model
Hi,
I have developed a logistic regression model in the form of (factor_1~ numeric
+ factor_2) and would like to perform a cross-validation or some similar
form of sensitivity analysis on this model.
using cv.glm() from the boot package:
# dataframe from which model was built in 'z'
# model is called 'm_geo.lrm'
# as suggested in the man page for a binomial model:
cost <-
2004 Sep 15
1
Cross-validation for Linear Discriminant Analysis
Hello:
I am new to R and statistics and I have two questions.
First I need help to interpret the cross-validation result from the R
linear discriminant analysis function "lda". I did the following:
lda (group ~ Var1 + Var2, CV=T)
where "CV=T" tells the lda to do cross-validation. The output of lda are
the posterior probabilities among other things, but I can't find an
2011 Sep 03
2
ROCR package question for evaluating two regression models
Hello All,
I have used logistic regression glm in R and I am evaluating two models both learned with glm but with different predictors.
model1 <- glm (Y ~ x4+ x5+ x6+ x7, data = dat, family = binomial(link=logit))
model2 <- glm (Y~ x1 + x2 +x3 , data = dat, family = binomial(link=logit))
and I would like to compare these two models based on the prediction that I get from each model:
pred1 =
2005 Mar 17
1
Cross validation, one more time (hopefully the last)
I apologize for posting on this question again, but unfortunately, I don't have and can't get access to MASS for at least three weeks. I have found some code on the web however which implements the prediction error algorithm in cv.glm.
http://www.bioconductor.org/workshops/NGFN03/modelsel-exercise.pdf
Now I've tried to adapt it to my purposes, but since I'm not deeply familiar
2011 Aug 29
1
How to order based on the second two columns?
Hello All,
I have a data frame consisting of 4 columns (id1, id2, y, pred)
where pred is the predicted value based on the glm function and my data frame is called "all". "data" is another data frame that has all data but I want to put together some important columns from my original data frame (data) into another data frame (all) as follows and I would like them to be sorted
2012 Mar 01
1
GLM with regularization
Hello,
Apologies for a question that is probably not new, but I am new to R.
Do any packages offer something like glm+regularization? So far I
see something probably close to that in the ridge regression in MASS, but
I think I need something like GLM, in particular binomial regularized
versions of polynomial regression.
Also I am not sure how some of the K-fold crossvalidation helpers out
there
2011 Jul 22
1
cv.glm and "longer object length is not a multiple of shorter object length" error
Hi,
I've done some searching where others have had trouble with this error (or
"warning" actually), but I'm unable to solve my problem. I have a data
sheet with 13 columns and 36 rows. Each column has exactly the same number
of rows. I've created glms and now want to do cross-validation on 2 of
them. Please be gentle-- I'm new to R (and statistics, too, for that
2006 Nov 15
1
cross-validation for count data
Hi everybody,
I'm trying to use cross-validation (cv.glm) for count data. Does someone know which cost function is appropriate for a Poisson distribution?
Thank you in advance.
Valerio.
Conservation Biology Unit
Department of Environmental and Territory Sciences
University of Milano-Bicocca
Piazza della Scienza,1
20126 Milano, Italy.
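cv.glm() from the boot package accepts a cost argument, a function of the observed responses and the predicted means. One candidate for count data is the mean Poisson deviance. The sketch below assumes that choice (other costs, such as squared error, are also possible) and uses simulated data; poisson_dev_cost, dat, and cv_dev are illustrative names.

```r
library(boot)

# Simulated stand-in count data (an assumption).
set.seed(5)
n <- 150
dat <- data.frame(x = runif(n))
dat$count <- rpois(n, lambda = exp(0.5 + 1.2 * dat$x))

fit <- glm(count ~ x, data = dat, family = poisson())

# Mean Poisson deviance; cv.glm passes observed y and predicted mean mu.
# The y * log(y / mu) term is defined as 0 when y == 0.
poisson_dev_cost <- function(y, mu) {
  term <- ifelse(y == 0, 0, y * log(y / mu))
  mean(2 * (term - (y - mu)))
}

# 10-fold cross-validation; delta holds the raw and adjusted estimates.
cv_dev <- cv.glm(dat, fit, cost = poisson_dev_cost, K = 10)
cv_dev$delta
```

The first element of delta is the raw cross-validated cost and the second is a bias-adjusted version.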