similar to: efficiently picking one row from a data frame per unique key

Displaying 20 results from an estimated 2000 matches similar to: "efficiently picking one row from a data frame per unique key"

2020 Nov 21
3
Error in unsplit() with tibbles
Hello, using the `unsplit()` function with tibbles currently leads to the following error:
> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
> s <- split(mtcars_tb, mtcars_tb$gear)
> unsplit(s, mtcars_tb$gear)
Error: Must subset rows with a valid subscript vector.
i Logical subscripts must match the size of the indexed input.
x Input has size 15 but subscript `rep(NA, len)` has
2002 Jul 28
1
[R] bug in unsplit()? (PR#1843)
Hedderik van Rijn <hedderik@cmu.edu> writes:
> If the second argument to unsplit is not a simple vector (but a "list
> containing multiple lists"), the function seems to have some problems.
>
> Given a slight modification of the examples in help(split):
>
> > xg <- split(x,list(g1=g,g2=g))
> > unsplit(xg,list(g1=g,g2=g))
> [1] -0.7877109
2020 Nov 21
2
Error in unsplit() with tibbles
I get the sentiment, but this is really just bad coding (on my own part, I suspect), so we might as well just fix it...
-pd
> On 21 Nov 2020, at 17:42, Marc Schwartz via R-devel <r-devel at r-project.org> wrote:
>
>> On Nov 21, 2020, at 10:55 AM, Mario Annau <mario.annau at gmail.com> wrote:
>>
>> Hello,
>>
>> using the `unsplit()`
2010 Apr 19
2
Using split and then unsplit
Hello everyone, I use the split function, splitting with the f function, on a data frame with 3 columns and more than 100 000 rows. Once it's split I have a list of data frames, each still with 3 columns and n rows. I manipulate those list elements and get a list of data frames still with 3 columns but fewer rows. So when I unsplit it, I get an error, as I use the same factor function I used to split ( f in
2006 Jun 08
1
NAs in unsplit factor
R-devel,
Below is a simple example calling split and unsplit on a numeric vector of length 2 where 'f' is c(1,NA).
> unsplit(split(c(1,2), c(1,NA)), c(1,NA))
[1] 1 0
I noticed that the call to vector in unsplit gives us 0 as the 2nd element of the result. Is this the intended result, as opposed to NA?
Thanks for your help,
Jeff
--
Jeff Enos Kane Capital Management jeff at
2005 Sep 27
2
Using unsplit - unsplit does not seem to reverse the effect of split
In data OME in MASS I would like to extract the first 5 observations per subject (=ID). So I do
library(MASS)
OMEsub <- split(OME, OME$ID)
OMEsub <- lapply(OMEsub, function(x) x[1:5,])
unsplit(OMEsub, OME$ID)
- which results in
[[1]]
[1] 1 1 1 1 1
[[2]]
[1] 30 30 30 30 30
[[3]]
[1] low low low low low
Levels: N/A high low
[[4]]
[1] 35 35 40 40 45
[[5]]
[1] coherent incoherent coherent
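A minimal sketch of one way to get the first rows per group without `unsplit()`, which expects pieces that still line up with the full grouping vector. The toy data frame `df` and the names `pieces`, `first_rows`, `result` and `first_one` are hypothetical stand-ins (not from the post), and stacking with `do.call(rbind, ...)` is a swapped-in technique rather than anything proposed in the thread. The last line shows the special case of picking exactly one row per unique key with `duplicated()`.

```r
# Hypothetical toy data standing in for OME: 3 subjects with 4 rows each
df <- data.frame(ID = rep(1:3, each = 4), value = 1:12)

# Split by subject, keep the first 2 rows of each piece, then stack the pieces.
# head() is used instead of x[1:2, ] so short groups are not padded with NA rows.
pieces <- split(df, df$ID)
first_rows <- lapply(pieces, head, n = 2)
result <- do.call(rbind, first_rows)

# Special case: exactly one row (the first) per unique key, without splitting
first_one <- df[!duplicated(df$ID), ]

print(result)
print(first_one)
```

For large data frames the `duplicated()` form avoids building a list of pieces altogether, which is usually the cheaper route when only one row per key is needed.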
2009 May 08
1
unsplit list of data.frames with one column
Perhaps this is the intended behavior, but I discovered that unsplit throws an error when it tries to set rownames of a variable that has no dimension. This occurs when unsplit is passed a list of data.frames that have only a single column. An example:
df <- data.frame(letters[seq(25)])
fac <- rep(seq(5), 5)
unsplit(split(df, fac), fac)
For reference, I'm using R version 2.9.0
2011 May 19
1
Problems with unsplit()
Hi everyone, I have already used split() and unsplit() on data frames without problems, but now I’m applying these functions to other data and, when using unsplit(), I receive the following message:
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "2", "3", "4", :
  duplicate 'row.names' are not allowed
In
2011 Mar 10
1
getting percentiles by factor
Hello, I'm trying to get percentiles (PERCENTRANK for Excel users) by factor in the following data.frame:
myExample <- data.frame(Ret=seq(-2, 2.5, by=0.5),PE=seq(10,19),Sectors=rep(c("Financial","Industrial"),5))
myExample <- na.omit(myExample)
Thanks to Patrick I managed to put together the following lines, which do it for the "Ret" column:
myecdf
2009 Nov 19
2
Efficient cbind of elements from two lists
Hi! I have a data.frame "data" and split it.
data <- split(data, data[,1])
This is quite a slow procedure, and I do not want to do it again. So any unsplit and "resplit" is not an option for me. But: I have to cbind "variables" to the split data from another list, which consists of vectors with matching sizes, so
for (i in 1:length(data)) { data[[i]]
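A minimal sketch of binding columns onto already-split pieces without re-splitting, assuming the second list holds one vector per piece, in the same order and with matching lengths. The names `df`, `pieces`, `extras`, `combined` and the column name `extra` are hypothetical; `Map()` is one possible way to pair the two lists and is not taken from the thread.

```r
# Hypothetical data: a data frame split once, plus a list of extra vectors
df <- data.frame(key = rep(c("a", "b"), times = c(2, 3)), x = 1:5)
pieces <- split(df, df$key)
extras <- list(a = c(10, 20), b = c(30, 40, 50))  # lengths match the pieces

# Pair the two lists element by element (by position, not by name)
# and bind each vector onto its piece as a new column
combined <- Map(function(d, v) cbind(d, extra = v), pieces, extras)

print(combined$b)
```

Because `Map()` pairs elements by position, the two lists have to be in the same order; reordering `extras` by `names(pieces)` first is a cheap safeguard.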
2024 Oct 06
0
Coda: On the efficiency of unsplit() for Rolf Turner's recent post
(only of interest -- maybe! -- to those who followed this thread of a couple of weeks ago) Just for the heckuva it, I compared the timing of Deepayan's unsplit(x,f) solution to my as.vector(do.call(rbind, x)) approach to the query for a list of 3 vectors each of length 1000 (the original toy example was for a list of 3 vectors of length 5). Unsurprisingly, I think, because the unsplit()
2024 Sep 27
1
Is there a sexy way ...?
>>>>> Chris Evans via R-help
>>>>> on Fri, 27 Sep 2024 12:20:47 +0200 writes:
> Oh glorious! Thanks Duncan.
> Fortune cookie nomination!
I don't disagree with the nomination -- thank you, Duncan! However, please note that I'm sure Rolf's challenge / question was meant to work correctly for all factors `f` with levels
2005 Jun 25
1
group means: split and unsplit
Took me a while but I figured out how to put in common values of group means/counts, etc. to do the same thing as egen: lapply with split and then unsplit.
Thomas Davidoff Assistant Professor Haas School of Business UC Berkeley Berkeley, CA 94720 phone: (510) 643-1425 fax: (510) 643-7357 davidoff@haas.berkeley.edu http://faculty.haas.berkeley.edu/davidoff
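A minimal sketch of the split / lapply / unsplit pattern for attaching a group mean to every observation, assuming a numeric vector and a grouping factor. The names `x`, `g`, `group_mean` and `group_mean_ave` are hypothetical; `ave()` is shown alongside as a base R shortcut for the same result and is not mentioned in the post.

```r
# Hypothetical data: five values in two groups
x <- c(1, 2, 3, 10, 20)
g <- factor(c("a", "b", "a", "b", "a"))

# Split by group, replace each piece by its mean (repeated to the piece's
# length), then put the values back in the original order
group_mean <- unsplit(
  lapply(split(x, g), function(v) rep(mean(v), length(v))),
  g
)

# ave() does the same in one call (its default FUN is mean)
group_mean_ave <- ave(x, g)

print(group_mean)      # 8 6 8 6 8
print(group_mean_ave)  # 8 6 8 6 8
```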
2009 Nov 25
0
Possible bug in "unsplit" (PR#14084)
Dear R-bug-people, I have encountered a problem with "unsplit", which I believe may be caused by a bug in the function. However, being inexperienced with bug reports, I apologise if this is merely a user problem rather than a problem within R. The problem occurs if an object is split by several grouping factors with levels not occurring in the data, and using drop = TRUE. This may appear as
2010 Aug 20
2
Problem with POSIXct in ave
Hi, I am having trouble using the ave function with a POSIXct object. For example:
x<-Sys.time()+0:9*3600
dat<-data.frame(id=rep(c('a',' b','c'),each=10),dt=rep(x,3),i=rep(1:10,3))
dat
# This is what I want to do:
dat$time.elapsed<-unsplit(lapply(split(dat$dt,dat$id),function(x) x-x[1]),f=dat$id)
dat
# The above code does the trick, but from the standpoint of
2006 Jul 13
1
Scalling/Centering the Data by an Index
Dear All: I would like to center the data in 'x' by 'group'. The following code scales the data, and I have not been able to figure out how to change it so that I get the centered data.
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
group <- c(1,1,1,2,2,2,2,2)
unsplit(lapply(split(x,group),scale),group)
I would appreciate your help. Ashraf
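A minimal sketch of centering by group for the data in the question, assuming "centered" means subtracting each group's mean. The names `centered` and `centered_split` are hypothetical, and neither variant is taken from a reply in the thread: the first subtracts the per-group mean returned by `ave()`, the second keeps the split / unsplit shape of the original code.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
group <- c(1, 1, 1, 2, 2, 2, 2, 2)

# Variant 1: ave() returns the group mean for every element, so subtract it
centered <- x - ave(x, group)

# Variant 2: same split/unsplit structure as in the question,
# but subtracting the mean instead of calling scale()
centered_split <- unsplit(
  lapply(split(x, group), function(v) v - mean(v)),
  group
)

print(centered)        # -1 0 1 -2 -1 0 1 2
print(centered_split)  # -1 0 1 -2 -1 0 1 2
```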
2011 May 06
1
Cumsum in Lattice Panel Function
I'm trying to create an xyplot with a "groups" argument where the y-variable is the cumsum of the values stored in the input data frame. I almost have it, but I can't get it to automatically adjust the y-axis scale. How do I get the y-axis to automatically scale as it would have if the cumsum values had been stored in the data frame? Here is the code I have so far:
2005 Jul 21
1
Clustered standard errors in a panel
I want to do the following: glm(y ~ x1 + x2 +...) within a panel. Hence y, x1, and x2 all vary at the individual level. However, there is likely correlation of these variables within an individual, so standard errors need adjustment. I do not want to estimate fixed effects, but do want to cluster standard errors at the individual level. Is there an automated way to do this? Nothing in
2020 Nov 21
0
Error in unsplit() with tibbles
Cool - thank you Peter! @Marc: This is really not a tidyverse vs base-R debate and I personally think that they should both work together for most parts. The common environment is still R. But just to give you the full picture I also filed a bug for tibbles (https://github.com/tidyverse/tibble/issues/829). With these two fixes I think that split/unsplit would work for tibbles and users (like me)
2020 Nov 21
0
Error in unsplit() with tibbles
> On Nov 21, 2020, at 10:55 AM, Mario Annau <mario.annau at gmail.com> wrote:
>
> Hello,
>
> using the `unsplit()` function with tibbles currently leads to the
> following error:
>
>> mtcars_tb <- as_tibble(mtcars, rownames = NULL)
>> s <- split(mtcars_tb, mtcars_tb$gear)
>> unsplit(s, mtcars_tb$gear)
> Error: Must subset rows with a valid