thr3ads.net - similar to: "manupulating a data frame column"

Displaying 20 results from an estimated 70000 matches similar to: "manupulating a data frame column"

Tabulating Sparse Contingency Table

2008 Mar 29

I have a sparse contingency table (most cells are 0):
> xtabs(~.,data[,idx:(idx+4)])
, , x3 = 1, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0  31
  2   0   0 112
  3   0   0  94
, , x3 = 2, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0   0
  2   0   0   0
  3   0   0   0
, , x3 = 3, x4 = 1, x5 = 1
   x2
x1    1   2   3
  1   0   0   0
  2   0   0   0
  3   0   0   0
, , x3 = 1, x4

code speed help? -- example and results provided

2003 Aug 05

I have the following piece of code that combines lists comprised of components of varying length into a list with components of constant length. I have found 2 ways to do it, and the faster of the two is posted below along with sample results. Do you have any suggestions on how to decrease the calculation time by modifying the code? > ####Function########### >

sub setting a data frame with binomial responses

2012 Aug 01

Hi everyone, Suppose I have a data frame named "mydata", created as below:
> n=c(5,5,5,5) #number of trials
> x1=c(2,3,1,3) #number of successes
> x2=c(5,5,5,5) #number of successes
> x3=c(0,0,0,0) #number of successes
> x4=c(5,0,5,0) #number of successes
> mydata=data.frame(n,x1,x2,x3,x4)
> mydata
  n x1 x2 x3 x4
1 5  2  5  0  5
2 5  3  5  0  0
3 5  1  5  0  5
4 5  3  5  0

lm design matrix bug?

2007 Oct 29

Hi all, Maybe I don't understand it, but I would have expected the design matrix to have as many rows as there were observations available to fit the model. Below, a small artificial dataset is created, then one model is fitted and the design matrix is output, having 27 rows. Then I delete 6 obs and fit the model on the remaining 21 obs, but the design matrix that comes out has 26 rows? Thanks for your

Standard Error for difference in predicted probabilities

2010 Sep 24

Is there a way to estimate the standard error for the difference in predicted probabilities obtained from a logistic regression model? For example, this code gives the difference for the predicted probability of when x2==1 vs. when x2==0, holding x1 constant at its mean:
y=rbinom(100,1,.4)
x1=rnorm(100, 3, 2)
x2=rbinom(100, 1, .7)
mod=glm(y ~ x1 + x2, family=binomial)
pred=predict(mod,

excluding a column from a data frame

2009 Apr 15

Dear R People: Suppose I have the following data frame:
          x1         x2       x3
1 -0.1582116 0.06635783 1.765448
2 -1.1407422 0.47235664 0.615931
3  0.8702362 2.32301341 2.653805
> str(xx)
'data.frame': 3 obs. of 3 variables:
 $ x1: num -0.158 -1.141 0.87
 $ x2: num 0.0664 0.4724 2.323
 $ x3: num 1.765 0.616 2.654
I can exclude the second column nicely via:
>

vector output loop or function

2011 Sep 01

Dear all, Sorry for the simple question: I want to put the following operation into a loop, as the number of X variables is large (1000):
X1 <- sample(c(1,2, 3, 4),10, replace = T, prob = c(0.4, 0.2, 0.2, 0.2))
cv1 <- round(runif(2, 1, 10))
# X2 is copy of X1
X2 <- X1
# now X2 is different in cv1 random positions
X2[cv1] <- 5
cv2 <- round(runif(2, 1, 10))
# X3 is copy of X2
X3 <- X2

error in model specification for cfa with lavaan-package

2011 Jun 01

Dear R-List, (I am not sure whether this list is the right place for my question...) I have a dataframe df.cfa

how to apply the function cut( ) to many columns in a data.frame?

2007 Mar 01

Dear useRs, In a data.frame (df) I have several columns (x1, x2, x3....xn) containing data as a continuous numerical response:
df
var  x1  x2  x3
  1 143 147 137
  2  93  93 117
  3 164  39 101
  4 123 118  97
  5  63 125  97
  6 129  83 124
  7 123  93 136
  8 123  80  79
  9  89 107 150
 10  78  95 121
I want to

Summing Select Columns of a Data Frame?

2009 Jan 20

Hi, I would like to operate on certain columns in a dataframe, but not others. My data looks like this:
x1 x2 x3
 1  2  3
 4  5  6
 7  8  9
I want to create a new column named x4 that is the sum of x1 and x2, but NOT x3. I looked at colSums and apply, but those functions seem to use all the columns in a dataframe. How do I only use select columns? If it helps, in Stata this would be gen x4

how to create a substraction matrix (subtract a row of every column from the same row in other columns)

2012 Sep 12

Hello, I have data like this:
x1 x2 x3 x4 x5
I want to create a matrix similar to a correlation matrix, but with the difference between the two values, like this:
   x1 x2    x3    x4    x5
x1    x2-x1 x3-x1 x4-x1 x5-x1
x2          x3-x2 x4-x2 x5-x2
x3                x4-x3 x5-x3
x4                      x5-x4
x5
Then I

sort by column and row names

2011 Feb 17

Hello, All, How can one sort on column and row names? For example, how can this
   X1 X3 X2
X1  1  0  0
X3  0  1  0
X2  0  0  1
become this?
   X1 X2 X3
X1  1  0  0
X2  0  1  0
X3  0  0  1
Thank you for your time! Jim

Data frame manipulation by eliminating rows containing extreme values

2011 Oct 22

Dear All, I have got the limits for removing extreme values for each variable using the following function:
f=function(x){quantile(x, c(0.25, 0.75),na.rm = TRUE) - matrix(IQR(x,na.rm = TRUE) * c(1.5), nrow = 1) %*% c(-1, 1)}
#Example:
n <- 100
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 + x2 + runif(n)/10
x4 <- x1 + x2 + x3 + runif(n)/10
x5 <-

Column order in stacking/unstacking

2011 Mar 12

Dear R users, I'm having some problems with the stack() and unstack() functions, and wondered if you could help. I have a large data frame (400 rows x 2000 columns), which I need to reduce to a single column of values (and therefore 800000 rows), so that I can use it in other operations (e.g., generating predictions from a GLM object). However, the problem I'm having can be reproduced

Still avoiding loops

2005 Jan 26

Dear all, I have a matrix X with 47 rows and, say, 500 columns; values are in {0,1}. I'd like to compare rows. For that, I first did:
for (i in 1:(dim(X)[1]-1))
for (j in (i+1):dim(X)[1]) {
Y <- X[i,]+Y[j,]
etc.
but, since it takes a long time, I would prefer avoiding loops; for that, my first idea was to add this matrix:
X1=X[,rep(1:46,46:1)]
to this one:
res=NULL
for (i in

problem when extacting columns of a data frame in a new data frame

2008 Jan 08

Dear R-users, I would like to create a new data frame composed of 2 columns of another data frame. But it does not give me what I want...
> casesCNST[1:10,]
  case X1 X2 X3 X4 expected
1   A1  0  0  0  0        E
2   A2  0  0  0  1        C
3   A3  0  0  0  2        C
4   A4  0  0  0  3        C
5   A5  0  0  0  4        C
6   A6  0  0  1  0        C
7   A7  0  0  1  1        C
8

How to provide list as an argument for the data.frame()

2009 Jul 14

Hi R-users, I have a table as described below. I'm reading the numeric values presented in this table to populate a list.
#table
#============
#X A B C
#x1 2 3 4
#x2 5 7 10
#x4 2 3 5
#============
rawData <- read.table("raw_data.txt",header=T, sep="\t")
myList=list()
counter=0
for (i in c(1:length(rawData$X))) {
print (i)

as.data.frame: Error in "names<-.default" (PR#7808)

2005 Apr 22

Hello, I found a potential problem in R 2.1.0 (and R 2.0.1). I expect that
> tmp <- FUN(x1, x2, x3, x4)
> as.data.frame(tmp)
is the same as
> as.data.frame(FUN(x1, x2, x3, x4))
since the tmp variable in this case is unnecessary. However, below I will demonstrate that under an odd set of conditions, I can correctly perform as.data.frame(tmp), but not as.data.frame(FUN(x1, x2, x3,

t.tests on a data.frame using an apply-type function

2010 Aug 21

I have a data.frame with ~250 observations (rows) in each of ~50 categories (columns). I would like to perform t.tests on subsets of observations within each column, with the subsets according to index vectors contained in other columns of the data.frame. My data.frame looks something like this:
x<-data.frame(matrix(rnorm(200,mean=5,sd=.5),nrow=20))
colnames(x)<-c("site",

re stricting points in a data frame

2008 Jan 30

useR's, Consider some variables and a data frame of points:
x1 <- c(1,2,3)
x2 <- c(3,4,5)
xk1 <- seq(min(x1)-.5, max(x1)+.5,.5)
xk2 <- seq(min(x2)-.5, max(x2)+.5,.5)
expand.grid(xk1=xk1,xk2=xk2)
   xk1 xk2
1  0.5 2.5
2  1.0 2.5
3  1.5 2.5
4  2.0 2.5
5  2.5 2.5
6  3.0 2.5
7  3.5 2.5
...
46 2.0 5.5
47 2.5 5.5
48 3.0 5.5
49 3.5 5.5
I want to restrict the data frame to only contain
