similar to: splitting and saving a large dataframe

Displaying 20 results from an estimated 700 matches similar to: "splitting and saving a large dataframe"

2005 Oct 20
5
spliting an integer
Hi there, From the vector X of integers, X = c(11999, 122000, 81997) I would like to make these two vectors: Z= c(1999, 2000, 1997) Y =c(1 , 12 , 8) That is, each entry of vector Z receives the four last digits of each entry of X, and Y receives "the rest". Any suggestions? Thanks in advance, Dimitri
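One possible approach to the question above, sketched under the assumption that every entry of X is a non-negative whole number: the remainder and integer quotient of a division by 10000 give the last four digits and "the rest". This is an illustration, not a reply taken from the thread.

```r
# Vector from the question
X <- c(11999, 122000, 81997)

# Last four digits: remainder after dividing by 10^4
Z <- X %% 10000

# "The rest": integer quotient of the same division
Y <- X %/% 10000

print(Z)
print(Y)
```

Note that leading zeros are lost in Z (for example, 120005 would give 5, not "0005"), so a character-based split may suit better if the digits are labels rather than numbers.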
2011 Apr 07
1
Assigning a larger number of levels to a factor that has fewer levels
Hello! I have a larger and a smaller data frame with 1 factor in each - it's the same factor: large.frame<-data.frame(myfactor=LETTERS[1:10]) small.frame<-data.frame(myfactor=LETTERS[c(9,7,5,3,1)]) levels(large.frame$myfactor) levels(small.frame$myfactor) table(large.frame$myfactor) table(small.frame$myfactor) myfactor has 10 levels in large.frame and 5 levels in small.frame. All 5
2005 Oct 25
2
Inf in regressions
Hi, Suppose I wish to run lm( y ~ x + z + log(w) ) where w assumes non-negative values. A problem arises when w=0, as log(0) = -Inf, and R doesn't accept that (as it "accepts" NA). Is there a way to tell R to treat -Inf the same way it treats NA, i.e., to ignore it? ( Otherwise I have to do something like w[w==0] <- NA which doesn't hurt, but might be a bit
2006 Jan 26
1
efficiency with "%*%"
Hi, x and y are (numeric) vectors. I wonder if one of the following is more efficient than the other: x%*%y or sum(x*y) ? Thanks, Dimitri Szerman
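A small sketch of how the two expressions could be compared on one's own machine. The vector length and repetition count are arbitrary assumptions, and no claim is made here about which form is faster; timings depend on the R build and the BLAS in use.

```r
set.seed(1)
x <- rnorm(1e5)
y <- rnorm(1e5)

# x %*% y returns a 1x1 matrix; drop() reduces it to a plain number
a <- drop(x %*% y)

# sum(x * y) returns a plain number directly
b <- sum(x * y)

# Both compute the same inner product, up to floating-point rounding
print(all.equal(a, b))

# Rough timing of repeated evaluation of each form
print(system.time(for (i in 1:100) x %*% y))
print(system.time(for (i in 1:100) sum(x * y)))
```

Apart from speed, the two differ in return type, which matters if the result feeds into code that expects a scalar rather than a matrix.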
2012 Mar 28
1
discrepancy between paired t test and glht on lme models
Hi folks, I am working with repeated measures data and I ran into issues where the paired t-test results did not match those obtained by employing glht() contrasts on a lme model. While the lme model itself appears to be fine, there seems to be some discrepancy with using glht() on the lme model (unless I am missing something here). I was wondering if someone could help identify the issue. On
2009 May 06
4
tapply changing order of factor levels?
Hi, Does tapply change the order when applied on a factor? Below is the code I tried. > mylevels<-c("IN0020020155","IN0019800021","IN0020020064") >
2012 Jan 18
1
drop rare factors
I have a data frame with some factor columns. I want to drop the rows with rare factor values (and remove the factor values from the factors). E.g., frame$MyFactor takes values A 1,000 times, B 2,000 times, C 30 times and D 4 times. I want to remove all rows which assume rare values (<1%), i.e., C and D. i.e., frame <- frame[[! (frame$MyFactor %in% c("A","B"))]] except
2010 Jul 05
2
repeated measures with missing data
Dear R help group, I am teaching myself linear mixed models with missing data since I would like to analyze a stats design with this kind of model. The textbook example is for the procedure "proc MIXED" in SAS, but I would like to know if there is an equivalent in R. This example only includes two time-measurements across subjects (a t-test "with missing values"), but I
2006 Jul 05
1
creating a data frame from a list
Dear all, I have a list with three (named) numeric vectors: > lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) ) > lst $a A B 1 8 $b A B C 2 3 0 $c B D 2 0 Now, I'd love to use this list to create the following data frame: > dtf = data.frame(a=c(A=1,B=8,C=NA,D=NA), + b=c(A=2,B=3,C=0,D=NA), + c=c(A=NA,B=2,C=NA,D=0) ) > dtf a b
2006 Jul 12
1
help in vectorization
Hi, I have two data frames. One is like > dtf = data.frame(y=c(rep(2002,4), rep(2003,5)), + m=c(9:12, 1:5), + def=c(.74,.75,.76,.78,.80,.82,.85,.85,.87)) and the other dtf2 = data.frame(y=rep( c(2002,2003),20), m=c(trunc(runif(20,1,5)),trunc(runif(20,9,12))), inc=rnorm(40,mean=300,sd=150) ) What I want is to divide
2012 Nov 24
1
Adding a new variable to each element of a list
Hello, I have a list of data with multiple elements, and each element in the list has multiple variables in it. Here's an example: ### Make the fake data dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6) subject <- factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3",
2011 Mar 30
2
summing values by week - based on daily dates - but with some dates missing
Dear everybody, I have the following challenge. I have a data set with 2 subgroups, dates (days), and corresponding values (see example code below). Within each subgroup: I need to aggregate (sum) the values by week - for weeks that start on a Monday (for example, 2008-12-29 was a Monday). I find it difficult because I have missing dates in my data - so that sometimes I don't even have the
2006 Apr 26
1
help using tapply
Dear R-mates, # Here's what I am trying to do. I have a dataset like this: id = c(rep(1,8), rep(2,8)) dur1 <- c( 17,18,19,18,24,19,24,24 ) est1 <- c( rep(1,5), rep(2,3) ) dur2 <- c(1,1,3,4,8,12,13,14) est2 <- rep(1,8) mydata = data.frame(id, estat=c(est1, est2), durat=c(dur1, dur2)) # I want to have this: id = c(rep(1,8), rep(2,8))
2009 Mar 03
1
repeated measures anova, sphericity, epsilon, etc
I have 3 questions (below). Background: I am teaching an introductory statistics course in which we are covering (among other things) repeated measures anova. This time around teaching it, we are using R for all of our computations. We are starting by covering the univariate approach to repeated measures anova. Doing a basic repeated measures anova (univariate approach) using aov() seems
2011 Nov 04
2
Efficiency of factor objects
R factors are the natural way to represent factors -- and should be efficient since they use small integers. But in fact, for many (but not all) operations, R factors are considerably slower than integers, or even character strings. This appears to be because whenever a factor vector is subsetted, the entire levels vector is copied. For example: > i1 <- sample(1e4,1e6,replace=T) > c1
2003 Jun 04
2
convert factor to numeric
Hi R-experts! Every once in a while I need to convert a factor to a vector of numeric values. as.numeric(myfactor) of course returns a nice numeric vector of the indexes of the levels which is usually not what I had in mind: > v <- c(25, 3.78, 16.5, 37, 109) > f <- factor(v) > f [1] 25 3.78 16.5 37 109 Levels: 3.78 16.5 25 37 109 > as.numeric(f) [1] 3 1 2 4 5 > What I
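The snippet above is cut off, so the following is only a sketch under the assumption that the poster wants the original numeric values back rather than the level indexes. Both idioms shown are described in R's documentation for factor; the example data are the ones from the question.

```r
v <- c(25, 3.78, 16.5, 37, 109)
f <- factor(v)

# Level indexes, as in the question
idx <- as.numeric(f)
print(idx)

# Option 1: convert the labels to character first, then to numeric
v1 <- as.numeric(as.character(f))

# Option 2: convert the levels once, then index them by the factor
v2 <- as.numeric(levels(f))[f]

print(v1)
print(v2)
```

The second form converts each distinct level only once, which can help when the factor is long and has few levels.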
2003 Nov 25
2
O2 optimization produces wrong code (PR#5315)
Full_Name: jean coursol Version: 1.7.1, 1.8.0 OS: linux & Windows-XP Submission from: (NULL) (129.175.52.7) Binary MS-Windows akima module from CRAN (1.8.0 version) produces wrong results with some data. Installing akima source in linux, with same data: -with gcc-2.95.3 -O2 : give correct results (under R 1.7.1); -with gcc-3.2.3 -O2 : give wrong results (under R-1.7.1 and R-1.8.0); -with
2004 Nov 30
1
RecordPlot
I want to do a zoom with recordPlot(). I have problems with lists. (R-2.0.1 patched 2004-11-30 , various linux). I have problems with RecordPlot class structure. > plot(1:10) > saveP <- recordPlot() > dev.off() > sx <- saveP[[1]][[2]][[2]] > saveP[[1]][[2]][[2]] <- sx Error in "[[<-"(`*tmp*`, 1, value = list(list( .Primitive("plot.new")),
2007 Dec 09
2
adjusting "levels" after subset a table
An embedded text with no specified character set was attached... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20071208/5409f1a7/attachment.pl
2006 May 24
3
How to make attributes persist after indexing?
Dear All! For descriptive purposes I would like to add attributes to objects. These attributes should be kept, even if by indexing only part of the object is used. I noted that some attributes like levels and class of a factor exist also after indexing, while others, like comment or label vanish. Is there a way to make an arbitrary attribute to be kept after indexing? This would be especially