similar to: Regression with very high number of categorical variables

Displaying 20 results from an estimated 4000 matches similar to: "Regression with very high number of categorical variables"

2011 May 13
6
Powerful PC to run R
Dear all, I'm currently running R on my laptop -- a Lenovo Thinkpad X201 (Intel Core i7 CPU, M620, 2.67 GHz, 8 GB RAM). The problem is that some of my calculations run for several days, sometimes even weeks (mainly simulations over a large parameter space). Depending on the external conditions, my laptop sometimes shuts down due to overheating. I'm now thinking about buying a more
2011 Apr 12
2
Testing equality of coefficients in coxph model
Dear all, I'm running a coxph model of the form: coxph(Surv(Start, End, Death.ID) ~ x1 + x2 + a1 + a2 + a3) Within this model, I would like to compare the influence of x1 and x2 on the hazard rate. Specifically I am interested in testing whether the estimated coefficient for x1 is equal (or not) to the estimated coefficient for x2. I was thinking of using a Chow-test for this but the Chow
2016 Apr 16
1
Social Network Simulation
Dear all, I am trying to simulate a series of networks that have characteristics similar to real life social networks. Specifically I am interested in networks that have (a) a reasonable degree of clustering (as measured by the transitivity function in igraph) and (b) a reasonable degree of degree polarization (as measured by the average degree of the top 10% nodes with highest degree divided by
2011 Mar 26
1
Effect size in multiple regression
Dear all, is there a convenient way to determine the effect size for a regression coefficient in a multiple regression model? I have a model of the form lm(y ~ A*B*C*D) and would like to determine Cohen's f2 (http://en.wikipedia.org/wiki/Effect_size) for each predictor without having to do it manually. Thanks, Michael Michael Haenlein Associate Professor of Marketing ESCP Europe Paris,
2013 Jan 22
2
Approximating discrete distribution by continuous distribution
Dear all, I have a discrete distribution showing how age is distributed across a population using a certain set of bands: Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1, dimnames=list(c("<18", "18-34", "35-64", "65+"),c())) Age_dist <- Age/sum(Age) For example I know that 23.94% of all people are between 0-18 years, 23.28%
2012 May 29
2
Wilcoxon-Mann-Whitney U value: outcomes from different stat packages
Given this example #start code a<-c(0,70,50,100,70,650,1300,6900,1780,4930,1120,700,190,940, 760,100,300,36270,5610,249680,1760,4040,164890,17230,75140,1870,22380,5890,2430) b<-c(0,0,10,30,50,440,1000,140,70,90,60,60,20,90,180,30,90, 3220,490,20790,290,740,5350,940,3910,0,640,850,260) wilcox.test(a, b, paired=FALSE) #sum of rank for first sample sum.rank.a <-
2010 Nov 11
2
predict.coxph and predict.survreg
Dear all, I'm struggling with predicting "expected time until death" for a coxph and survreg model. I have two datasets. Dataset 1 includes a certain number of people for which I know a vector of covariates (age, gender, etc.) and their event times (i.e., I know whether they have died and when if death occurred prior to the end of the observation period). Dataset 2 includes another
2010 Jul 14
1
Printing status updates in while-loop
Dear all, I'm using a while loop in the context of an iterative optimization procedure. Within my while loop I have a counter variable that helps me to determine how long the loop has been running. Before the loop I initialize it as counter <- 0 and the last condition within my loop is counter <- counter + 1. I'd like to print out the current status of "counter" while the
2011 Sep 21
2
Cannot allocate vector of size x
Dear all, I am running a simulation in which I randomly generate a series of vectors to test whether they fulfill a certain condition. In most cases, there is no problem. But from time to time, the (randomly) generated vectors are too large for my system and I get the error message: "Cannot allocate vector of size x". The problem is that in those cases my simulation stops and I have to
2010 Sep 08
1
Aggregating data from two data frames
Dear all, I'm working with two data frames. The first frame (agg_data) consists of two columns. agg_data[,1] is a unique ID for each row and agg_data[,2] contains a continuous variable. The second data frame (geo_data) consists of several columns. One of these columns (geo_data$ZCTA) corresponds to the unique ID in the first data frame. The problem is that only a subset of the unique ID
2011 Jul 15
2
Convert continuous variable into discrete variable
Dear all, I have a continuous variable that can take on values between 0 and 100, for example: x<-runif(100,0,100) I also have a second variable that defines a series of thresholds, for example: y<-c(3, 4.5, 6, 8) I would like to convert my continuous variable into a discrete one using the threshold variables: If x is between 0 and 3 the discrete variable should be 1 If x is between 3
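A minimal sketch of one way to do this kind of binning, using base R's `findInterval()`. The snippet above is truncated, so the handling of values that fall exactly on a threshold, and of values above the last threshold, is an assumption here rather than the poster's stated rule.

```r
# x and y as defined in the question; set.seed() is added only for reproducibility
set.seed(1)
x <- runif(100, 0, 100)
y <- c(3, 4.5, 6, 8)

# findInterval() counts how many thresholds are <= each x value,
# so adding 1 makes the lowest band (0 to 3) equal to 1.
# Assumption: values above the last threshold (8) form a fifth band.
x_discrete <- findInterval(x, y) + 1
```

`cut(x, breaks = c(0, y, 100), labels = FALSE)` would be an alternative if right-closed intervals are preferred.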
2011 Sep 19
1
Binary optimization problem in R
Dear all, I would like to solve a problem similar to a multiple knapsack problem and am looking for a function in R that can help me. Specifically, my situation is as follows: I have a list of n items which I would like to allocate to m groups with fixed size. Each item has a certain profit value and this profit depends on the type of group the item is in. My problem is to allocate the items
2010 Jul 25
1
Equivalent to go-to statement
Dear all, I'm working with a code that consists of two parts: In Part 1 I'm generating a random graph using the igraph library (which represents the relationships between different nodes) and a vector (which represents a certain characteristic for each node): library(igraph) g <- watts.strogatz.game(1,100,5,0.05) z <- rlnorm(100,0,1) In Part 2 I'm iteratively changing the
2012 Apr 12
2
Curve fitting, probably splines
Dear all, This is probably more related to statistics than to [R] but I hope someone can give me an idea how to solve it nevertheless: Assume I have a variable y that is a function of x: y=f(x). I know the average value of y for different intervals of x. For example, I know that in the interval [0;x1] the average y is y1, in the interval [x1;x2] the average y is y2 and so forth. I would like to
2017 Dec 14
1
Aggregation across two variables in data.table
Dear all, I have a data.frame that includes a series of demographic variables for a set of respondents plus a dependent variable (Theta). For example: Age Education Marital Familysize Income Housing Theta 1: 50 Associate degree Divorced 4 70K+ Owned with mortgage 9.147777 2: 65
2010 Nov 19
2
question about constraint minimization
Hi, I am a beginner in R. I have a question about constrained minimization. A function, y=f(x1,x2,x3....x12), needs to be minimized. There are 3 requirements for the minimization: (1) x2+x3+...+x12=1.5 (x1 is excluded); (2) x1=x3=x4; (3) x1, x3 and x5 are each in the range of -1~0. The remaining variables (x2, x4, x6, x7, ...., x12) are each in the range of 0~1. The
2012 Nov 15
1
Stepwise regression scope: all interacting terms (.^2)
Dear Gurus, Thank you in advance for your assistance. I'm trying to understand scope better when performing stepwise regression using "step." I have a model with a binary response variable and 10 predictor variables. When I perform stepwise regression I define scope=.^2 to allow interactions between all terms. But I am missing something. When I perform stepwise regression (both
2012 Nov 15
1
Step-wise method for large dimension
Hi, I want to apply the following code to my data with 400 predictors. I was wondering if there is an alternative to typing all 400 predictors in the following code. I would really appreciate your help. fit0<-lm(Y~1, data= mydata) fit.final<- lm(Y~X1+X2+X3+.....+X400, data=mydata) ??? step(fit0, scope=list(lower=fit0, upper=fit.final), data=mydata, direction="forward")
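One possible approach, sketched here under the assumption that `mydata` contains only `Y` and the candidate predictors: the formula `Y ~ .` expands to every other column, so the upper scope never has to be typed out. The simulated data (20 predictors standing in for 400) and the names `fit_full`, `fit_step` and `selected` are hypothetical.

```r
# Hypothetical data: a response Y and predictors X1..X20 (stand-in for X1..X400)
set.seed(1)
n <- 200
p <- 20
mydata <- as.data.frame(matrix(rnorm(n * p), nrow = n))
names(mydata) <- paste0("X", seq_len(p))
mydata$Y <- 2 * mydata$X1 - 1.5 * mydata$X2 + rnorm(n)

# Null model and full model; "." expands to every column except Y
fit0 <- lm(Y ~ 1, data = mydata)
fit_full <- lm(Y ~ ., data = mydata)

# Forward selection from the null model towards the full model
fit_step <- step(fit0,
                 scope = list(lower = formula(fit0), upper = formula(fit_full)),
                 direction = "forward", trace = 0)

# Names of the terms that forward selection kept
selected <- names(coef(fit_step))
```

If the data frame holds columns that should not enter the model, building the formula explicitly with `reformulate(paste0("X", 1:400), response = "Y")` is another option.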
2011 Jun 18
1
Applying function to all elements of a formula
Hi, I would like to do a regression like: reg <- lm(y~log(.), data) where the log function is applied to "." in the form: log(x1)+ log(x2)+ log(x3)... instead of in the form log(x1+x2+x3+...) Is this possible? Thank you, Scott
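A sketch of one way to get the term-by-term form: build the formula programmatically with base R's `reformulate()`. The data frame below is hypothetical (strictly positive predictors so that `log()` is defined), and the names `predictors` and `log_formula` are introduced only for illustration.

```r
# Hypothetical data with strictly positive predictors
set.seed(1)
data <- data.frame(x1 = runif(50, 1, 10),
                   x2 = runif(50, 1, 10),
                   x3 = runif(50, 1, 10))
data$y <- 1 + 2 * log(data$x1) - log(data$x3) + rnorm(50, sd = 0.1)

# Wrap every predictor name in log() and assemble the model formula
predictors <- setdiff(names(data), "y")
log_formula <- reformulate(sprintf("log(%s)", predictors), response = "y")

# Fits y ~ log(x1) + log(x2) + log(x3)
reg <- lm(log_formula, data = data)
```

An alternative would be to log-transform the predictor columns first and then fit `lm(y ~ ., data)`, at the cost of losing the `log()` labels in the coefficient names.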
2009 Aug 31
3
Two way joining vs heatmap
Hi, STATISTICA has a function called "Two-way joining" (see http://www.statsoft.com/TEXTBOOK/stcluan.html#twotwo) and the reference material states that this is based on the method published by Hartigan (I found this paper: http://www.jstor.org/pss/2284710 through Wikipedia). What is the relationship (if any) between the "heatmap" function in R and this technique? Is there an