Displaying 20 results from an estimated 10000 matches similar to: "Bug in tapply with factors containing NAs (PR#6672)"
2005 Aug 08
1
tapply huge speed difference if X has names
Hi all,
Apologies if this has been raised before ... R's tapply is very fast, but if
X has names in this example, there seems to be a huge slowdown: under 1
second compared to 151 seconds. The following timings are repeatable and
were timed properly on a single-user machine:
> X = 1:100000
> names(X) = X
> system.time(fast<<-tapply(as.vector(X), rep(1:10000,each=10), mean))
2004 Jan 07
2
problem assigning an array to a variable in a data frame
Dear r-devel list members,
Dirk Eddelbuettel brought the following problem to my attention. The code
is abstracted from the appendix on mixed models from my R and S-PLUS Companion:
> set.seed(12345) # for reproducibility
> library(nlme)
Loading required package: lattice
> data(MathAchieve)
> data(MathAchSchool)
> attach(MathAchieve)
> mses <- tapply(SES, School,
2004 Apr 08
1
Why are Split and Tapply so slow with named vectors, why is a for loop faster than mapply
First, here's the problem I'm working on so you understand the context. I
have a data frame of travel activity characteristics with 70,000+ records.
These activities are identified by unique chain numbers. (Activities are
part of trip chains.) There are 17,500 chains.
I use the chain numbers as factors to split various data fields into lists
of chain characteristics with each element of
2001 Feb 21
1
Specification of factors in tapply
After some fiddling around with the tapply command, I discovered that the
factors (the INDEX argument) given to tapply must be specified in
fastest-cycling first order.
The following code shows how I discovered my error: (R version 1.2.1)
-o-o-o-o-o-
x <- as.data.frame(list(data=c(-9,0,3,1,-9,1,0,-9,0,3,1,-9,1,0),
subj=c(rep(1,7),rep(2,7)),
2004 Apr 15
1
tapply() and barplot() help files for 1.8.1
Hi,
I've just upgraded to 1.9.0 and one of my Sweave files that produces a
number of barplots in a standard manner now produces them in a
different way. I have made a couple of small changes to my code to
get back the output I was getting before upgrading and now (mostly
out of curiosity) would like to understand what has changed.
I *think* I've tracked it down to tapply() and/or
2001 Apr 17
1
tapply using 2 factors
Hi
I'm trying to use tapply with two factors and the results are not what I
expected. Using one factor at a time it works well but when combining
the two factors the results are:
> tapply(mat[,1],list(mat[,3],mat[,6]),FUN=rep.boot)
1 2 3 4 5 6
9 "Numeric,12" "Numeric,13" "Numeric,13"
1998 Jan 06
1
More on tapply and factors
Here is an even more awkward property of tapply with factors. What we
want to do is to determine the modal value of the first factor given
the level of the second factor. We do that by applying table in a
tapply call then finding the maximum count then ...
Turns out that table gives unexpected results in tapply.
> table(Machines$Machine)
A B C
18 18 18
> tapply( Machines$Machine,
2004 May 13
2
tapply & hist
I'm learning how to use tapply.
Now I'm having a go at the following code in which dati contains almost 600
lines, Pot - numeric - are the capacities of power plants and SGruppo - text
- the corresponding six technologies ("CCC", "CIC","TGC", "CSC","CPC", "TE").
.....................................................
2007 Nov 06
1
A suggestion for an amendment to tapply
Dear R-developers,
when tapply() is invoked on factors that have empty levels, it returns
NA. This behaviour is in accord with the tapply documentation, and is
reasonable in many cases. However, when FUN is sum, it would also
seem reasonable to return 0 instead of NA, because "the sum of an
empty set is zero, by definition."
I'd like to raise a discussion of the possibility of an
2010 Feb 02
3
tapply for function taking of >1 argument?
I'm sure I can put this together from the various 'apply's and split, but I
wonder if anyone has a quick incantation:
E.g. I can do tapply( data, groups, mean)
but how can I do something like: tapply( list(data,weights), groups,
weighted.mean ) ?
(or: mapply is to sapply as ? is to tapply )
Thanks for your help.
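One possible answer to the question above, offered as a sketch rather than the thread's actual reply: run `tapply` over positions and index both vectors inside the function, or split both vectors by group and pair the pieces with `mapply`. The values below are made up for illustration, and `data`, `weights`, and `groups` are assumed to be vectors of equal length.

```r
# Illustrative values only; they do not come from the original post.
data    <- c(1, 2, 3, 4)
weights <- c(1, 3, 1, 1)
groups  <- c("a", "a", "b", "b")

# Option 1: tapply over positions, then index both vectors per group.
wm_tapply <- tapply(seq_along(data), groups,
                    function(i) weighted.mean(data[i], weights[i]))

# Option 2: split both vectors by group and pair the pieces with mapply.
wm_mapply <- mapply(weighted.mean,
                    split(data, groups),
                    split(weights, groups))

print(wm_tapply)
print(wm_mapply)
```

Both options give one weighted mean per group; the `mapply` form generalises more directly to functions that take more than two arguments.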
2012 Sep 03
1
Scatter plot from tapply output, labels of data
Hei,
I am trying to plot the means of two variables (d13C and d15N), by 2
grouping factors (Species and Year) that I obtained with the function tapply.
I would like to plot with different colours according to the Year and show
the "Species" as data labels.
My data looks like this:
Species d13C d13N Year
"Species1" 14,4 11.5 2009
"Species2"
2008 Nov 14
1
# values used in a function in tapply
Hello,
I am using tapply to pull out data by the day of week and then perform
functions (e.g. mean). I would like to have the number of values used for
the calculation for the functions, sorted by each day of week. A number of
entries in any given column are NAs.
I have tried the following code and simple variants with no luck.
for (i in 1:length(a[1,])){
x<-tapply(a[,i],a[,1],mean,
2007 Jun 18
1
getting tapply() to work across multiple columns
I have the following data.frame:
index <- c("a","a","b","b","b")
alpha <- c(1,2,3,4,5)
beta <- c(2,3,4,5,6)
table <-data.frame(index,alpha,beta)
I'm now interested in getting means of alpha and beta for each of the
index values and do a tapply() for each of the columns, e.g.
means.alpha <- tapply(table$alpha, index,mean)
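A minimal sketch of how the per-column calls might be collapsed into one step, using the data from the post. The data frame is named `df` here (an assumption for this example, chosen to avoid masking `base::table`), and both approaches shown are common idioms rather than the answer given in the thread.

```r
index <- c("a", "a", "b", "b", "b")
alpha <- c(1, 2, 3, 4, 5)
beta  <- c(2, 3, 4, 5, 6)
df <- data.frame(index, alpha, beta)

# Apply tapply to each numeric column; sapply binds the results into a matrix
# with one row per index value and one column per variable.
means <- sapply(df[c("alpha", "beta")],
                function(col) tapply(col, df$index, mean))

# Same group means as a data frame, via the formula interface of aggregate().
agg <- aggregate(cbind(alpha, beta) ~ index, data = df, FUN = mean)

print(means)
print(agg)
```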
2008 Sep 01
1
how to pass additional parameters to a function called in tapply?
Hi all,
the following problem is still beyond my R-knowledge:
I have one data vector containing the signal from 4 channels that are measured
subsequently and in repeating cycles (with one factor vector for cycle and
one for channel identification).
To extract the mean of each channel during each cycle tapply is the method of
choice. However, I cannot use the whole measuring period for each
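As a general illustration of the question in the title: any arguments given to `tapply` after `FUN` are passed on to `FUN` through `...`. The vectors `signal`, `cycle`, and `channel` below are hypothetical stand-ins for the poster's data, and the sketch does not attempt to reproduce the restriction on the measuring period that the truncated post goes on to describe.

```r
# Hypothetical data: two cycles, two channels, one missing reading.
signal  <- c(1, 2, NA, 4, 5, 6, 7, 8)
cycle   <- rep(1:2, each = 4)
channel <- rep(c("ch1", "ch2"), times = 4)

# na.rm = TRUE is not an argument of tapply; it is forwarded to mean().
m_dots <- tapply(signal, list(cycle, channel), mean, na.rm = TRUE)

# Equivalent form with an anonymous function, useful when the extra
# arguments have to be computed or combined inside the call.
m_anon <- tapply(signal, list(cycle, channel),
                 function(v) mean(v, na.rm = TRUE))

print(m_dots)
```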
2008 Sep 28
2
using tapply on a data frame in a function
Hello,
I'm trying to use tapply to find group means in a function. It works
outside of a function, but I get the error message from the following code:
"Error in tapply(index, cluster, mean) : arguments must have same length."
Any suggestions? Thanks.
eric
d <- data.frame(cbind(cluster=1:2, value1=1:10, value2=11:20))
d
FindClusterTraits <- function(framename, index){
2017 Jan 27
1
RFC: tapply(*, ..., init.value = NA)
The "no factor combination" case is distinguishable by 'tapply' with simplify=FALSE.
> D2 <- data.frame(n = gl(3,4), L = gl(6,2, labels=LETTERS[1:6]), N=3)
> D2 <- D2[-c(1,5), ]
> DN <- D2; DN[1,"N"] <- NA
> with(DN, tapply(N, list(n,L), FUN=sum, simplify=FALSE))
A B C D E F
1 NA 6 NULL NULL NULL NULL
2 NULL NULL 3 6
2009 Dec 01
1
Remark on tapply().
Consider the following:
> set.seed(42)
> ff <- factor(sample(c(1,3,5),42,TRUE),levels=1:5)
> x <- runif(42)
> tapply(x,ff,sum)
1 2 3 4 5
3.675436 NA 7.519675 NA 9.094210
I got bitten by those NAs in the result of tapply(). Effectively
one is summing over the empty set, and consequently (according to what
I learned as a child)
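For readers on a newer R: from R 3.4.0 onward, `tapply` accepts a `default` argument that sets the value used for cells with no data. The sketch below uses small fixed vectors instead of the random data in the post, so the output does not depend on the R version's random number generator.

```r
# Levels 2 and 4 are declared but never occur.
ff <- factor(c(1, 3, 5, 1, 3), levels = 1:5)
x  <- c(1, 2, 3, 4, 5)

# Default behaviour: empty levels yield NA.
res_na <- tapply(x, ff, sum)

# With default = 0 (R >= 3.4.0), empty levels yield 0 instead.
res_zero <- tapply(x, ff, sum, default = 0)

print(res_na)
print(res_zero)
```

On older versions, a similar effect can be had by replacing the `NA` entries afterwards, provided no genuine `NA` sums are expected in the result.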
2011 Mar 15
1
Questions on dividing lists and tapply
Hello R community,
I have two questions about using R.
The first is about dividing each element of a list by another similarly
sized list. So, if the first list has two elements and so does the second,
then the result should also be a list with two elements.
For example, the inputs are:
list(matrix(1:6,ncol=2),matrix(1:6,ncol=2))->l1
l2<-list(1:3,2)
I want to get a list, l3 with the
2009 Apr 13
3
tapply output as a dataframe
i use tapply and by often, but i always end up banging my head against
the wall with the output.
is there a simpler way to convert the output of the following tapply to
a dataframe or matrix than what i have here:
# setup data for tapply
dt = data.frame(bucket=rep(1:4,25),val=rnorm(100))
fn = function(x) {
ret =
c(unname(quantile(x,probs=seq(.25,.75,.25),na.rm=T)),mean(x,na.rm=T))
}
a =
2008 Sep 09
2
exporting tapply objects to csv-files
Dear Everyone,
I try to create a csv-file with different results from the table function.
Imagine a data-frame with two vectors a and b where b is of the class factor.
I use the tapply function to count a for the different values of b.
tapply(a,b,table)
and I use the table function to have a look at the frequencies as a total
table(a)
I would like to put both results together in one txt or