similar to: Convert character string to top levels + NAN

Displaying 20 results from an estimated 10000 matches similar to: "Convert character string to top levels + NAN"

2009 Mar 16
2
FW: Select a random subset of rows out of matrix
Dear all, I have a large dataset (N=100,000 with 89 variables per subject). This dataset is stored in a 100,000 x 89 matrix where each row describes one individual and each column one variable. What is the easiest way of selecting a subset of, let's say, 1,000 individuals out of that whole matrix? Thanks, Michael Michael Haenlein Associate Professor of Marketing ESCP-EAP European School of
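For the row-subsetting question above, a minimal sketch in base R may help. It assumes the data sit in a numeric matrix; the simulated matrix `X` is a stand-in and not the poster's data.

```r
# Hypothetical stand-in for the poster's 100,000 x 89 matrix
set.seed(42)
X <- matrix(rnorm(100000 * 89), nrow = 100000, ncol = 89)

# Draw 1,000 distinct row indices (sample() works without replacement by default)
idx <- sample(nrow(X), size = 1000)

# Index the matrix by those rows; drop = FALSE keeps the result a matrix
X_sub <- X[idx, , drop = FALSE]

dim(X_sub)
```

Calling `set.seed()` first makes the draw reproducible; omit it if a fresh random subset is wanted on every run.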
2011 Apr 12
2
Testing equality of coefficients in coxph model
Dear all, I'm running a coxph model of the form: coxph(Surv(Start, End, Death.ID) ~ x1 + x2 + a1 + a2 + a3) Within this model, I would like to compare the influence of x1 and x2 on the hazard rate. Specifically I am interested in testing whether the estimated coefficient for x1 is equal (or not) to the estimated coefficient for x2. I was thinking of using a Chow-test for this but the Chow
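One common way to compare two coefficients from the same fitted model is a Wald test on their difference, using the estimated covariance matrix. The sketch below is an illustration under assumptions, not the answer given in the thread: it uses the `lung` data shipped with the survival package, with `age` and `ph.karno` standing in for x1 and x2. Such a comparison is only meaningful when both covariates are measured on comparable scales.

```r
library(survival)

# Stand-in model: age and ph.karno play the roles of x1 and x2 (an assumption)
fit <- coxph(Surv(time, status) ~ age + ph.karno + sex, data = lung)

b <- coef(fit)   # estimated log hazard ratios
V <- vcov(fit)   # their estimated covariance matrix
i <- match("age", names(b))
j <- match("ph.karno", names(b))

# Wald statistic for H0: b[i] - b[j] = 0
d  <- unname(b[i] - b[j])
se <- sqrt(V[i, i] + V[j, j] - 2 * V[i, j])
z  <- d / se
p  <- 2 * pnorm(-abs(z))   # two-sided p-value from the normal approximation

c(difference = d, se = se, z = z, p.value = p)
```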
2010 Nov 11
2
predict.coxph and predict.survreg
Dear all, I'm struggling with predicting "expected time until death" for a coxph and survreg model. I have two datasets. Dataset 1 includes a certain number of people for which I know a vector of covariates (age, gender, etc.) and their event times (i.e., I know whether they have died and when if death occurred prior to the end of the observation period). Dataset 2 includes another
2008 Apr 18
1
spdep question - Moran's I
Dear all, I would like to calculate a Moran's I statistic using the moran function in the spdep package. The problem I'm having deals with how to create the listw object. My data stems from the area of social network analysis. I have a list of people and for each pair of them I have a measure of their relationship strength. So my dataset looks like: Jim; Bob; 0.5 This measure of
2011 Mar 26
1
Effect size in multiple regression
Dear all, is there a convenient way to determine the effect size for a regression coefficient in a multiple regression model? I have a model of the form lm(y ~ A*B*C*D) and would like to determine Cohen's f2 (http://en.wikipedia.org/wiki/Effect_size) for each predictor without having to do it manually. Thanks, Michael Michael Haenlein Associate Professor of Marketing ESCP Europe Paris,
2016 Apr 16
1
Social Network Simulation
Dear all, I am trying to simulate a series of networks that have characteristics similar to real life social networks. Specifically I am interested in networks that have (a) a reasonable degree of clustering (as measured by the transitivity function in igraph) and (b) a reasonable degree of degree polarization (as measured by the average degree of the top 10% nodes with highest degree divided by
2010 Aug 03
2
Collinearity in Moderated Multiple Regression
Dear all, I have one dependent variable y and two independent variables x1 and x2 which I would like to use to explain y. x1 and x2 are design factors in an experiment and are not correlated with each other. For example assume that: x1 <- rbind(1,1,1,2,2,2,3,3,3) x2 <- rbind(1,2,3,1,2,3,1,2,3) cor(x1,x2) The problem is that I do not only want to analyze the effect of x1 and x2 on y but
2013 Jan 22
2
Approximating discrete distribution by continuous distribution
Dear all, I have a discrete distribution showing how age is distributed across a population using a certain set of bands: Age <- matrix(c(74045062, 71978405, 122718362, 40489415), ncol=1, dimnames=list(c("<18", "18-34", "35-64", "65+"),c())) Age_dist <- Age/sum(Age) For example I know that 23.94% of all people are between 0-18 years, 23.28%
2011 Feb 22
1
System of related regression equations
Dear all, I would like to estimate a system of regression equations of the following form: y1 = a1 + b1 x1 + b2x2 + e1 y2 = a2 + c1 y1 + c2 x2 + c3 x3 + e2 Specifically the dependent variable in Equation 1 appears as an independent variable in Equation 2. Additionally some independent variables that appear in Equation 1 are also included in Equation 2. I assume that I cannot estimate these two
2011 May 27
1
Help to improve existing R-Code
Dear all, I have written a relatively brief R-Code to run a series of simulations. Currently the code runs for a very long time (up to several days, depending on the conditions) and I expect this to be the case because it might not be very efficiently written. I am, for example, relying on several for(...) loops which could probably be done much faster using a different way of programming. I am
2010 Jul 28
1
Time-dependent covariates in survreg function
Dear all, I'm asking this question again as I didn't get a reply last time: I'm doing a survival analysis with time-dependent covariates. Until now, I have used a simple Cox model for this, specifically the coxph function from the survival library. Now, I would like to try out an accelerated failure time model with a parametric specification as implemented for example in the survreg
2011 May 13
6
Powerful PC to run R
Dear all, I'm currently running R on my laptop -- a Lenovo Thinkpad X201 (Intel Core i7 CPU, M620, 2.67 GHz, 8 GB RAM). The problem is that some of my calculations run for several days, sometimes even weeks (mainly simulations over a large parameter space). Depending on the external conditions, my laptop sometimes shuts down due to overheating. I'm now thinking about buying a more
2011 May 11
1
Total effect of X on Y under presence of interaction effects
Dear all, this is probably more a statistics question than an R question but probably there is somebody who can help me nevertheless. I'm running a regression with four predictors (a, b, c, d) and all their interaction effects using lm. Based on theory I assume that a influences y positively. In my output (see below) I see, however, a negative regression coefficient for a. But several of the
2011 Sep 19
1
Binary optimization problem in R
Dear all, I would like to solve a problem similar to a multiple knapsack problem and am looking for a function in R that can help me. Specifically, my situation is as follows: I have a list of n items which I would like to allocate to m groups with fixed size. Each item has a certain profit value and this profit depends on the type of group the item is in. My problem is to allocate the items
2011 Mar 12
2
Identifying unique pairs
Dear R helpers Suppose I have a data frame as given below mydat = data.frame(x = c(1,1,1, 2, 2, 2, 2, 2, 5, 5, 6), y = c(10, 10, 10, 8, 8, 8, 7, 7, 2, 2, 4)) mydat
    x   y
1   1  10
2   1  10
3   1  10
4   2   8
5   2   8
6   2   8
7   2   7
8   2   7
9   5   2
10  5   2
11  6   4
unique(mydat$x) will give me 1,
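For the data frame in this post, the distinct (x, y) pairs can be obtained by de-duplicating whole rows rather than a single column. A minimal sketch in base R; the names `uniq_pairs` and `uniq_pairs2` are introduced here for illustration only.

```r
mydat <- data.frame(x = c(1, 1, 1, 2, 2, 2, 2, 2, 5, 5, 6),
                    y = c(10, 10, 10, 8, 8, 8, 7, 7, 2, 2, 4))

# unique() on a data frame drops duplicated rows, i.e. repeated (x, y) pairs
uniq_pairs <- unique(mydat)

# Same idea spelled out: keep rows not flagged as duplicates of an earlier row
uniq_pairs2 <- mydat[!duplicated(mydat), ]

uniq_pairs
```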
2016 Jul 09
2
Complicated neural network, categories
Hello, This is one way to do it... Note that the first thing I changed is the file "x.csv", replacing the spaces in the names with "_". I also removed the accents and the letter ñ... I used the RNNS package and the "mlp()" function to fit the network. #------------------------------------------- > x <- read.csv("x.csv",
2012 Dec 24
1
How to do it through 1 step?
A data set (dat) has 2 variables, x and a, and 100 rows. I want to add 2 variables and call the new data set dat1: var1: f = a/median(a) var2: x_new = x*f My solution: dat1<-transform(dat,f = a/median(a),x_new = x*f) But I get an error saying that "f" does not exist, since dat has no variable called "f". So I have to do it in 2 steps:
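The error arises because `transform()` evaluates all of its arguments against the original data frame, so `f` is not yet visible when `x_new` is computed. One single-call alternative is `within()`, which runs its statements in order. A minimal sketch, with simulated data standing in for the poster's `dat`:

```r
# Hypothetical data with the two columns described in the post
set.seed(1)
dat <- data.frame(x = rnorm(100), a = runif(100, 1, 10))

# within() evaluates the statements sequentially, so x_new can use f
dat1 <- within(dat, {
  f     <- a / median(a)
  x_new <- x * f
})

head(dat1)
```

Note that `within()` may append the new columns in a different order than they were written; select columns by name if the order matters.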
2011 Sep 15
0
Allocation of data points to groups based on membership probabilities
Dear all, I have a matrix that provides, for a series of data points, the probability that each of these points belongs to a certain group. Take the following example, which represents 20 data points and their group membership probability to five groups (A-E): set.seed(1) probs <- matrix(runif(100),nrow=20,
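The simplest allocation rule is to assign each point to the group with the highest membership probability. The sketch below shows only that rule; the post is cut off, so any further requirements the poster had (for example fixed group sizes) are not handled. The `ncol = 5` argument and the column labels A to E are assumptions based on the description of five groups.

```r
set.seed(1)
# 20 points x 5 groups; ncol and the column labels are assumptions
probs <- matrix(runif(100), nrow = 20, ncol = 5,
                dimnames = list(NULL, LETTERS[1:5]))

# For each row, find the column holding the largest membership probability
best  <- apply(probs, 1, which.max)
group <- colnames(probs)[best]

table(group)
```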
2010 Aug 07
3
plot the dependent variable against one of the predictors with other predictors as constant
Hi, folks, Happy work in weekends >_< My question is how to plot the dependent variable against one of the predictors with other predictors as constant. Not for the original data, but after prediction. It means y is the predicted value of the dependent variable. The constant value of the other predictors may be the average or some fixed value. ####### y=1:10 x=10:1 z=2:11
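A usual approach to this kind of plot is to build a grid in which one predictor varies and the others are fixed, then pass it to `predict()`. The sketch below uses simulated data, because the x and z vectors shown in the post are exactly collinear (z = 12 - x) and cannot both be estimated in one linear model. The model form and all values here are assumptions.

```r
# Hypothetical data; simulated because the vectors in the post are collinear
set.seed(1)
x <- runif(50, 0, 10)
z <- runif(50, 0, 10)
y <- 1 + 2 * x - 0.5 * z + rnorm(50)
fit <- lm(y ~ x + z)

# Vary x over its observed range while holding z fixed at its mean
grid <- data.frame(x = seq(min(x), max(x), length.out = 100), z = mean(z))
grid$y_hat <- predict(fit, newdata = grid)

plot(grid$x, grid$y_hat, type = "l",
     xlab = "x", ylab = "predicted y (z held at its mean)")
```

Replacing `mean(z)` with any fixed value gives the corresponding curve for that value of z.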
2009 Aug 12
3
Obtaining the value of x at a given value of y in a smooth.spline object
I have some data fit to a smooth.spline object as follows: (x=vector of data for the predictor variable, y=vector of data for the response variable) fit <- smooth.spline(x,y) Now, given a spline fit point y_new, I want to be able to find out what value of x_new yielded this fit value. How to do so? (This problem is the inverse of the predict.smooth.spline function, which takes x_new as input
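One way to invert a fitted spline numerically is to look for a root of the fitted curve minus the target value with `uniroot()`. The sketch below assumes the fit crosses `y_new` exactly once on the search interval; if the curve is not monotone there may be several solutions, and the interval would need to be narrowed. The data and the target value are hypothetical.

```r
set.seed(1)
# Hypothetical monotone data standing in for the poster's x and y
x <- seq(0, 10, length.out = 100)
y <- log1p(x) + rnorm(100, sd = 0.01)
fit <- smooth.spline(x, y)

y_new <- 1.5   # target fitted value (an arbitrary choice)

# Root of (fitted value at x0) - y_new over the observed x range
g <- function(x0) predict(fit, x0)$y - y_new
x_new <- uniroot(g, interval = range(x), tol = 1e-8)$root

x_new
```

`uniroot()` stops with an error if the function has the same sign at both ends of the interval, which is a useful signal that the target lies outside the fitted range.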