Displaying 20 results from an estimated 700 matches similar to: "splitting and saving a large dataframe"
2005 Oct 20
5
splitting an integer
Hi there,
From the vector X of integers,
X = c(11999, 122000, 81997)
I would like to make these two vectors:
Z= c(1999, 2000, 1997)
Y =c(1 , 12 , 8)
That is, each entry of vector Z receives the four last digits of each entry of X, and Y receives "the rest".
Any suggestions?
Thanks in advance,
Dimitri
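One possible approach to the question above, sketched with integer division and the modulo operator. This assumes every entry of X is a non-negative whole number and that "the four last digits" can be treated numerically (so a value such as 10005 would give 5, not "0005").

```r
# Split each integer into its last four digits (Z) and the leading part (Y).
X <- c(11999, 122000, 81997)

Z <- X %% 10000    # remainder after dividing by 10^4: 1999 2000 1997
Y <- X %/% 10000   # integer quotient, i.e. "the rest": 1 12 8
```

If leading zeros in the last four digits matter, a string-based split (for example with substr() on as.character(X)) would be the alternative to consider.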
2011 Apr 07
1
Assigning a larger number of levels to a factor that has fewer levels
Hello!
I have a larger and a smaller data frame with 1 factor in each - it's
the same factor:
large.frame<-data.frame(myfactor=LETTERS[1:10])
small.frame<-data.frame(myfactor=LETTERS[c(9,7,5,3,1)])
levels(large.frame$myfactor)
levels(small.frame$myfactor)
table(large.frame$myfactor)
table(small.frame$myfactor)
myfactor has 10 levels in large.frame and 5 levels in small.frame. All 5
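The post is cut off here, so the exact goal is not visible. Going by the subject line alone, a minimal sketch of giving the smaller factor the full level set of the larger one might look like this. The names large, small and small_full are illustrative, and the factors are built explicitly with factor() because recent R versions no longer convert strings to factors in data.frame() by default.

```r
# Two factors: one with 10 levels, one using only 5 of them.
large <- factor(LETTERS[1:10])
small <- factor(LETTERS[c(9, 7, 5, 3, 1)])

# Re-create the smaller factor with the larger factor's levels.
# The values are unchanged; only the set of allowed levels grows.
small_full <- factor(small, levels = levels(large))

nlevels(small_full)   # 10
table(small_full)     # unused levels appear with a count of 0
```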
2005 Oct 25
2
Inf in regressions
Hi,
Suppose I wish to run
lm( y ~ x + z + log(w) )
where w assumes non-negative values. A problem arises when w=0, as log(0)
= -Inf, and R doesn't accept that (as it "accepts" NA). Is there a way to
tell R to treat -Inf the same way it treats NA, i.e., to ignore it? (
Otherwise I have to do something like
w[w==0] <- NA
which doesn't hurt, but might be a bit
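As a sketch of one alternative to overwriting w, the subset argument of lm() can exclude the rows where w is zero before the model is fitted. The data frame d and its simulated columns below are made up purely for illustration.

```r
# Illustrative data: two rows have w == 0, so log(w) is -Inf there.
set.seed(1)
d <- data.frame(x = rnorm(20), z = rnorm(20),
                w = c(0, 0, runif(18, 1, 10)))
d$y <- 1 + d$x + d$z + rnorm(20)

# Fit only on rows with w > 0; the -Inf rows never reach the fitting step.
fit <- lm(y ~ x + z + log(w), data = d, subset = w > 0)

nobs(fit)   # 18 of the 20 rows are used
```

This leaves w itself untouched, which may be preferable when the same variable is needed elsewhere with its zeros intact.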
2006 Jan 26
1
efficiency with "%*%"
Hi,
x and y are (numeric) vectors. I wonder if one of the following is more
efficient than the other:
x%*%y
or
sum(x*y)
?
Thanks,
Dimitri Szerman
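A small sketch for comparing the two expressions. It only shows that they agree numerically and how one might time them; which one is faster depends on the R build and the BLAS library in use, so no timing result is claimed here. The vector length and repetition count are arbitrary.

```r
x <- c(1, 2, 3)
y <- c(4, 5, 6)

a <- drop(x %*% y)   # %*% returns a 1 x 1 matrix; drop() makes it a scalar
b <- sum(x * y)      # plain scalar
all.equal(a, b)      # TRUE, both are 32

# Rough timing harness on longer vectors (results vary by machine).
n  <- 1e5
xx <- runif(n)
yy <- runif(n)
t_matmul <- system.time(for (i in 1:100) xx %*% yy)
t_sum    <- system.time(for (i in 1:100) sum(xx * yy))
```

Note the difference in return type: x %*% y is a matrix, while sum(x * y) is a plain number.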
2012 Mar 28
1
discrepancy between paired t test and glht on lme models
Hi folks,
I am working with repeated measures data and I ran into issues where the
paired t-test results did not match those obtained by employing glht()
contrasts on a lme model. While the lme model itself appears to be fine,
there seems to be some discrepancy with using glht() on the lme model
(unless I am missing something here). I was wondering if someone could
help identify the issue. On
2009 May 06
4
tapply changing order of factor levels?
Hi,
Does tapply change the order when applied on a factor? Below is the code I
tried.
> mylevels<-c("IN0020020155","IN0019800021","IN0020020064")
>
2012 Jan 18
1
drop rare factors
I have a data frame with some factor columns.
I want to drop the rows with rare factor values
(and remove the factor values from the factors).
E.g., frame$MyFactor takes values
A 1,000 times,
B 2,000 times,
C 30 times and
D 4 times.
I want to remove all rows which assume rare values (<1%), i.e., C and D.
i.e.,
frame <- frame[frame$MyFactor %in% c("A","B"), ]
except
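The post is truncated at this point, but a minimal sketch of dropping rare values without hard-coding the level names could look like the following. The 1% threshold comes from the post; the example data and the helper names share and keep are illustrative.

```r
# Example data with the counts given in the post.
frame <- data.frame(
  MyFactor = factor(rep(c("A", "B", "C", "D"),
                        times = c(1000, 2000, 30, 4)))
)

# Relative frequency of each level.
share <- prop.table(table(frame$MyFactor))

# Keep only levels that make up at least 1% of the rows.
keep  <- names(share)[share >= 0.01]
frame <- frame[frame$MyFactor %in% keep, , drop = FALSE]

# Remove the now-unused levels from the factor itself.
frame <- droplevels(frame)
levels(frame$MyFactor)   # "A" "B"
```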
2010 Jul 05
2
repeated measures with missing data
Dear R help group, I am teaching myself linear mixed models with missing data since I would like to analyze a stats design with this kind of model. The textbook example is for the procedure "proc MIXED" in SAS, but I would like to know if there is an equivalent in R. This example only includes two time-measurements across subjects (a t-test "with missing values"), but I
2006 Jul 05
1
creating a data frame from a list
Dear all,
I have a list with three (named) numeric vectors:
> lst = list(a=c(A=1,B=8) , b=c(A=2,B=3,C=0), c=c(B=2,D=0) )
> lst
$a
A B
1 8
$b
A B C
2 3 0
$c
B D
2 0
Now, I'd love to use this list to create the following data frame:
> dtf = data.frame(a=c(A=1,B=8,C=NA,D=NA),
+ b=c(A=2,B=3,C=0,D=NA),
+ c=c(A=NA,B=2,C=NA,D=0) )
> dtf
a b
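One way this might be done, sketched under the assumption that the rows should be the sorted union of all element names: index every vector by the full set of names, so that missing names yield NA, and bind the results as columns. The names nms and m are illustrative.

```r
lst <- list(a = c(A = 1, B = 8),
            b = c(A = 2, B = 3, C = 0),
            c = c(B = 2, D = 0))

# Union of all element names, used as the row labels.
nms <- sort(unique(unlist(lapply(lst, names))))

# Indexing a named vector by a name it lacks returns NA,
# so every column ends up with one entry per row label.
m <- sapply(lst, function(v) unname(v[nms]))
rownames(m) <- nms

dtf <- as.data.frame(m)
dtf
```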
2006 Jul 12
1
help in vectorization
Hi,
I have two data frames. One is like
> dtf = data.frame(y=c(rep(2002,4), rep(2003,5)),
+ m=c(9:12, 1:5),
+ def=c(.74,.75,.76,.78,.80,.82,.85,.85,.87))
and the other
dtf2 = data.frame(y=rep( c(2002,2003),20),
m=c(trunc(runif(20,1,5)),trunc(runif(20,9,12))),
inc=rnorm(40,mean=300,sd=150) )
What I want is to divide
2012 Nov 24
1
Adding a new variable to each element of a list
Hello,
I have a list of data with multiple elements, and each element in the list
has multiple variables in it. Here's an example:
### Make the fake data
dv <- c(1,3,4,2,2,3,2,5,6,3,4,4,3,5,6)
subject <- factor(c("s1","s1","s1","s2","s2","s2","s3","s3","s3",
2011 Mar 30
2
summing values by week - based on daily dates - but with some dates missing
Dear everybody,
I have the following challenge. I have a data set with 2 subgroups,
dates (days), and corresponding values (see example code below).
Within each subgroup: I need to aggregate (sum) the values by week -
for weeks that start on a Monday (for example, 2008-12-29 was a
Monday).
I find it difficult because I have missing dates in my data - so that
sometimes I don't even have the
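The example code from the post is not shown in this excerpt, so the following is only a sketch on made-up data. It maps each date to the Monday that starts its week and then sums within subgroup and week. The column names group, date, value and week_start are assumptions, and the "%u" format code (weekday as 1 to 7, Monday = 1) is assumed to be available on the platform.

```r
d <- data.frame(
  group = c("g1", "g1", "g1", "g2"),
  date  = as.Date(c("2008-12-29", "2008-12-31", "2009-01-05", "2008-12-30")),
  value = c(1, 2, 4, 8)
)

# Step back to the Monday of each date's week.
d$week_start <- d$date - (as.integer(format(d$date, "%u")) - 1L)

# Sum values per subgroup and week; missing days simply contribute nothing.
weekly <- aggregate(value ~ group + week_start, data = d, FUN = sum)
weekly
```

Weeks with no observations at all do not appear in the result. If they are needed with a sum of zero, the output would have to be merged against a complete grid of groups and week starts.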
2006 Apr 26
1
help using tapply
Dear R-mates,
# Here's what I am trying to do. I have a dataset like this:
id = c(rep(1,8), rep(2,8))
dur1 <- c( 17,18,19,18,24,19,24,24 )
est1 <- c( rep(1,5), rep(2,3) )
dur2 <- c(1,1,3,4,8,12,13,14)
est2 <- rep(1,8)
mydata = data.frame(id,
estat=c(est1, est2),
durat=c(dur1, dur2))
# I want to have this:
id = c(rep(1,8), rep(2,8))
2009 Mar 03
1
repeated measures anova, sphericity, epsilon, etc
I have 3 questions (below).
Background: I am teaching an introductory statistics course in which we are
covering (among other things) repeated measures anova. This time around
teaching it, we are using R for all of our computations. We are starting by
covering the univariate approach to repeated measures anova.
Doing a basic repeated measures anova (univariate approach) using aov()
seems
2011 Nov 04
2
Efficiency of factor objects
R factors are the natural way to represent factors -- and should be
efficient since they use small integers. But in fact, for many (but
not all) operations, R factors are considerably slower than integers,
or even character strings. This appears to be because whenever a
factor vector is subsetted, the entire levels vector is copied. For
example:
> i1 <- sample(1e4,1e6,replace=T)
> c1
2003 Jun 04
2
convert factor to numeric
Hi R-experts!
Every once in a while I need to convert a factor to a vector of numeric
values. as.numeric(myfactor) of course returns a nice numeric vector of
the indexes of the levels which is usually not what I had in mind:
> v <- c(25, 3.78, 16.5, 37, 109)
> f <- factor(v)
> f
[1] 25 3.78 16.5 37 109
Levels: 3.78 16.5 25 37 109
> as.numeric(f)
[1] 3 1 2 4 5
>
What I
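The post breaks off here, but for the situation it describes, the two idioms below recover the original numbers from the level labels rather than the internal integer codes. Both are described on the ?factor help page.

```r
v <- c(25, 3.78, 16.5, 37, 109)
f <- factor(v)

# Convert via the labels: factor -> character -> numeric.
n1 <- as.numeric(as.character(f))

# Convert the levels once, then index them by the factor's codes.
n2 <- as.numeric(levels(f))[f]

n1   # 25.00 3.78 16.50 37.00 109.00
```

The second form converts each distinct level only once, which can matter for long factors with few levels.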
2003 Nov 25
2
O2 optimization produces wrong code (PR#5315)
Full_Name: jean coursol
Version: 1.7.1, 1.8.0
OS: linux & Windows-XP
Submission from: (NULL) (129.175.52.7)
Binary MS-Windows akima module from CRAN (1.8.0 version) produces wrong results
with some data.
Installing akima source in linux, with same data:
-with gcc-2.95.3 -O2 : give correct results (under R 1.7.1);
-with gcc-3.2.3 -O2 : give wrong results (under R-1.7.1 and R-1.8.0);
-with
2004 Nov 30
1
RecordPlot
I want to do a zoom with recordPlot(). I have problems with lists.
(R-2.0.1 patched 2004-11-30 , various linux). I have problems
with RecordPlot class structure.
> plot(1:10)
> saveP <- recordPlot()
> dev.off()
> sx <- saveP[[1]][[2]][[2]]
> saveP[[1]][[2]][[2]] <- sx
Error in "[[<-"(`*tmp*`, 1, value = list(list(
.Primitive("plot.new")),
2007 Dec 09
2
adjusting "levels" after subset a table
An embedded text with no specified character set was scrubbed...
Name: not available
URL: https://stat.ethz.ch/pipermail/r-help/attachments/20071208/5409f1a7/attachment.pl
2006 May 24
3
How to make attributes persist after indexing?
Dear All!
For descriptive purposes I would like to add attributes to objects. These
attributes should be kept, even if by indexing only part of the object is
used.
I noted that some attributes like levels and class of a factor exist also
after indexing, while others, like comment or label vanish.
Is there a way to make an arbitrary attribute to be kept after indexing?
This would be especially
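A common pattern for this, sketched here under assumptions, is to give the object an S3 class and define a `[` method that restores the attribute after subsetting. The class name "kept" and the attribute name "label" are hypothetical choices for illustration, and this covers only single-bracket indexing.

```r
# Subsetting method for the illustrative class "kept":
# do the normal subsetting, then put the label and class back.
`[.kept` <- function(x, ...) {
  out <- NextMethod("[")
  attr(out, "label") <- attr(x, "label")
  class(out) <- class(x)
  out
}

x <- structure(1:5, label = "height in cm", class = "kept")
y <- x[2:3]

attr(y, "label")   # "height in cm"
```

Other operations (arithmetic, c(), and so on) would still drop the attribute unless they get their own methods.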