thr3ads.net - similar to: "Strange case of partial matching in .[ - possible bug / wrong documentation?"

Displaying 20 results from an estimated 4000 matches similar to: "Strange case of partial matching in .[ - possible bug / wrong documentation?"

Partial matching performance in data frame rownames using [

2023 Dec 16

On Wed, 13 Dec 2023 09:04:18 +0100 Hilmar Berger via R-devel <r-devel at r-project.org> wrote:
> Still, I feel that default partial matching cripples the functionality
> of data.frame for larger tables.
Changing the default now would require a long deprecation cycle to give everyone who uses `[.data.frame` and relies on partial matching (whether they know it or not) enough time to

Partial matching performance in data frame rownames using [

2023 Dec 19

Hi Hilmar and Ivan, I have used your code examples to write a blog post about this topic, which has figures that show the asymptotic time complexity of the various approaches, https://tdhock.github.io/blog/2023/df-partial-match/ The asymptotic complexity of partial matching appears to be quadratic O(N^2) whereas the other approaches are asymptotically faster: linear O(N) or log-linear O(N log N).
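To make the behaviour under discussion concrete, here is a minimal sketch (not taken from the thread or the linked blog post) that contrasts partial row-name matching in `[.data.frame` with exact lookup through match(). The data frame, its row names and the variable names are made up for illustration.

```r
# Toy data frame with character row names (hypothetical example data)
df <- data.frame(value = 1:3, row.names = c("alpha", "beta", "gamma"))

# `[.data.frame` matches row names partially:
# "al" is a unique prefix of "alpha", so the row "alpha" is returned
partial <- df["al", , drop = FALSE]

# Exact lookup: match() yields NA for keys without an exact match
keys <- c("al", "beta")
idx <- match(keys, rownames(df))

# Keep only the keys that matched exactly, then index by position
exact <- df[idx[!is.na(idx)], , drop = FALSE]
```

Indexing by the integer positions that match() returns avoids the partial-matching code path for character indices.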

Partial matching performance in data frame rownames using [

2023 Dec 13

Dear Ivan, thanks a lot, that is helpful. Still, I feel that default partial matching cripples the functionality of data.frame for larger tables. Thanks again and best regards, Hilmar
On 12.12.23 13:55, Ivan Krylov wrote:
> On Mon, 11 Dec 2023 21:11:48 +0100
> Hilmar Berger via R-devel <r-devel at r-project.org> wrote:
>
>> What was unexpected is that in this case was that

tapply on empty data.frames (PR#10644)

2008 Jan 27

Full_Name: Hilmar Berger
Version: 2.4.1/2.6.2alpha
OS: WinXP
Submission from: (NULL) (84.185.128.110)
Hi all, If I use tapply on an empty data.frame I get an error. I'm not quite sure if one can actually expect the function to return with a result. However, the error message suggests that this case does not get handled well. This happens both in R-2.4.1 and 2.6.2alpha (version 2008-01-26).

Find strings in a array

2003 Feb 08

I need to know which strings of one array are also in another array. What is the best solution? Tks, Francisco.
Francisco Júnior, Computer Science - UFPE-Brazil
"One life has more value that the world whole"
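One common way to answer this kind of question in R is the `%in%` operator. The sketch below uses made-up vectors `a` and `b`; it is an illustration and not a reply from the original thread.

```r
# Two character vectors (hypothetical example data)
a <- c("apple", "pear", "kiwi")
b <- c("kiwi", "apple", "plum")

# Logical vector: TRUE where an element of `a` also occurs in `b`
present <- a %in% b

# The strings of `a` that are found in `b`, in the order they appear in `a`
common <- a[present]
```

intersect(a, b) gives the same set of strings with duplicates removed.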

indexing list elements with lapply?

2011 Apr 22

Dear colleagues, I have a list that looks like what the code below produces. I need a function to go through each list element and work on the second column of each list element (the first column is irrelevant to me; if the proposed function also works on the first column as a consequence of writing something simple, that's fine). I need to index the second column of each list element to the

Refactor all factors in a data frame

2007 Jun 05

Hi all, Assume I have a data frame with numerical and factor variables that I got through merging various other data frames and subsetting the resulting data frame afterwards. The number of levels of the factors seems to be the same as in the original data frames, probably because subset() calls [.factor without drop = TRUE (that's what I gather from scanning the mailing lists). I wonder if
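The effect described here, and one way to undo it, can be sketched with droplevels(), which was added to base R after this 2007 post and is therefore not something the thread itself could have suggested. The data frame below is made up for illustration.

```r
# Data frame with one factor column (hypothetical example data)
df <- data.frame(id = 1:4, grp = factor(c("a", "b", "c", "a")))

# Subsetting keeps the full set of levels, including the now-unused "c"
sub <- subset(df, grp != "c")
levels_before <- levels(sub$grp)

# droplevels() removes unused levels from every factor in the data frame
clean <- droplevels(sub)
levels_after <- levels(clean$grp)
```

Non-factor columns such as `id` pass through droplevels() unchanged.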

Inconsistencies in subassignment (PR#7210)

2004 Sep 04

I have made the 3-d case do the same as the vector case, which is what the C code clearly intended (a goto label was in the wrong place). This leaves the bigger question of the right thing to do. I note that data frames give an error when any indices are NA.
-thomas
On Fri, 3 Sep 2004 ripley@stats.ox.ac.uk wrote:
> Apart from the inconsistencies, there are two clear bugs here:
>
> 1)

Take care with codes()! (was type of representation)

2003 Jan 03

Ahh yes, sorry about that. Here's the corrected snippet:
# Create an Example Data Frame Containing Car x Color data
carnames <- c("bmw","renault","mercedes","seat")
carcolors <- c("red","white","silver","green")
datavals <- round(rnorm(16, mean=10, sd=4),1)
data <- data.frame(Car=rep(carnames,4),

divide column in a dataframe based on a character

2010 Oct 26

Hello, If I have a dataframe:
example(data.frame)
zz<-c("aa_bb","bb_cc","cc_dd","dd_ee","ee_ff","ff_gg","gg_hh","ii_jj","jj_kk","kk_ll")
ddd <- cbind(dd, group = zz)
and I want to divide the column named group by the "_", how would I do this? so instead of the first row being x
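A possible approach to splitting such a column, sketched with base R only: strsplit() breaks each string at the underscore and rbind() stacks the pieces into a two-column matrix. The shortened `zz` vector and the column names `first` and `second` are assumptions made for this illustration, and the sketch assumes every string contains exactly one "_".

```r
# A few values in the same "xx_yy" form as in the question
zz <- c("aa_bb", "bb_cc", "cc_dd")

# Split each string at the literal underscore; the result is a list of pairs
pieces <- strsplit(zz, "_", fixed = TRUE)

# Stack the pairs row by row into a character matrix with two columns
parts <- do.call(rbind, pieces)

# Build a data frame with one column per half (hypothetical column names)
split_df <- data.frame(first = parts[, 1], second = parts[, 2],
                       stringsAsFactors = FALSE)
```

The two new columns could then be attached to the original data frame with cbind().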

Median

2007 May 08

Hello. I need to calculate the median of several columns of a data.frame and store it in a new column of this data frame, but the median operator only works on a vector. I have written a function that calculates the median, but it is very slow. Is there a method in any package to calculate this? Best regards. Jose Sierra.
A B C -0.01678042
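One base-R way to get a per-row median across several columns is apply() with MARGIN = 1. The sketch below uses made-up numbers and the column names A, B and C from the snippet; the new column name `med` is an assumption.

```r
# Small numeric data frame (hypothetical example data)
df <- data.frame(A = c(1, 4, 7), B = c(2, 5, 8), C = c(9, 6, 3))

# Apply median() to each row of the selected columns
# and store the result as a new column
df$med <- apply(as.matrix(df[, c("A", "B", "C")]), 1, median)
```

For very large tables a dedicated row-median routine from an add-on package may be faster, but the apply() form needs nothing beyond base R.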

Crash after (wrongly) applying product operator on object from LIMMA package

2017 Apr 24

Hi Hilmar, weird. The memory problem seems to be due to recursion (my R, version 3.3.3, says: Error: evaluation nested too deeply: infinite recursion / options(expressions=)?, just write traceback() to see how it happens), but why does it segfault with xlsx? Nb xlsx is the culprit: neither rJava nor xlsxjars cause the problem. On the other hand, quick googling for r+xlsx+segfault returns tons of

'==' operator: inconsistency in data.frame(...) == NULL

2019 Sep 11

Dear Martin,
On 11/09/2019 09:56, Martin Maechler wrote:
>
> > I wonder if data.frame(<some non-empty data>) == NULL should also return
> > a value instead of an error. R help reads:
>
> > "At least one of |x| and |y| must be an atomic vector, but
> > if the other is a list R attempts to coerce it to the
> > type of the atomic

type of representation

2003 Jan 03

Hi, I have some data that I want to plot but I can't find how to do it. I have car types (bmw, renault, mercedes, seat ...), colors and a number for each car type-color relation. I want to come up with a matrix representation of cars vs colors where in each intersection I could set a dot proportional in size to my third variable. Can anybody give me a clue of how to come up with such

Partial matching performance in data frame rownames using [

2023 Dec 11

Dear all, I have seen that others have discussed the partial matching behaviour of data.frame[idx,] in the past, in particular with respect to unexpected result sets. I am aware of the fact that one can work around this using either match() or switching to tibble/data.table or similar altogether. I have a different issue with the partial matching, in particular its performance when used on

Crash after (wrongly) applying product operator on object from LIMMA package

2017 Apr 18

Hi, this is a problem that occurs in the presence of two libraries (limma, xlsx) and leads to a crash of R. The problematic code is the wrong application of sweep or the product ("*") function on a LIMMA MAList object. To my knowledge, limma does not define a "*" method for MAList objects. If only LIMMA is loaded but not package xlsx, the code does not crash but rather

Seg fault stats::runmed

2018 Oct 05

Dear all, I just found this issue:
dd1 = c(rep(NaN,82), rep(-1, 144), rep(1, 74))
xx = runmed(dd1, 21)
-> R crashes reproducibly in R 3.4.3, R3.4.4 (Ubuntu 14.04/Ubuntu 16.04)
With GDB:
Program received signal SIGSEGV, Segmentation fault.
swap (l=53, r=86, window=window@entry=0xc59308, outlist=outlist@entry=0x12ea2e8, nrlist=nrlist@entry=0x114fdd8, print_level=print_level@

Is there a faster way to do it?

2009 Oct 28

#Mdarts is a matrix 2343x788
#frequencia is a vector 2343x1
# 9 in Mdarts[fri,frj] stands for my missing values which i want to replace by the value in the vector frequencia
Mdarts<-t(matrix(scan("C:/GWS/CNB/dartg.txt"),ncol=nindT,nrow=nm, byrow=T))
frequencia <- matrix(scan("C:/GWS/CNB/freq.txt"),ncol=1)
for (fri in 1:nindT){
for (frj in 1:nm){
Mdarts[fri,frj] <- if
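The snippet is cut off before it shows how `frequencia` is indexed, so the following is only a sketch under a stated assumption: one replacement value per matrix row. It shows the general vectorised pattern of locating the missing-value code 9 with which() and assigning all replacements in one step instead of looping over every cell. The small matrix `M` and vector `freq` are made-up stand-ins for `Mdarts` and `frequencia`.

```r
# Toy stand-ins for Mdarts and frequencia (hypothetical example data)
M <- matrix(c(1, 9, 0,
              9, 2, 9), nrow = 2, byrow = TRUE)
freq <- c(0.25, 0.75)   # ASSUMPTION: one replacement value per row of M

# Row/column positions of every cell holding the missing-value code 9
miss <- which(M == 9, arr.ind = TRUE)

# Replace all of them at once, taking the value that belongs to each cell's row
M[miss] <- freq[miss[, "row"]]
```

If the replacement values belong to columns instead, the same pattern applies with miss[, "col"].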

'==' operator: inconsistency in data.frame(...) == NULL

2019 Sep 11

Sorry, I can't reproduce the example below even on the same machine. However, the following example produces the same error as NULL values in prior examples:
> setClass("FOOCLASS",
+          representation("list")
+ )
> ma = new("FOOCLASS", list(M=matrix(rnorm(300), 30,10)))
> isS4(ma)
[1] TRUE
> data.frame(a=1:3) == ma
Error in

Crash after (wrongly) applying product operator on S4 object that derives from list

2017 Apr 19

Dear Hilmar
Perhaps this gives an indication of why the infinite recursion happens:
## after calling `*` on ma and a matrix:
> showMethods(classes=class(ma), includeDefs=TRUE, inherited = TRUE)
Function: * (package base)
e1="FOOCLASS", e2="matrix"
    (inherited from: e1="vector", e2="structure")
    (definition from function "Ops")