Displaying 20 results from an estimated 1000 matches similar to: "[.data.frame speedup"
2023 Dec 16
2
Partial matching performance in data frame rownames using [
On Wed, 13 Dec 2023 09:04:18 +0100
Hilmar Berger via R-devel <r-devel at r-project.org> wrote:
> Still, I feel that default partial matching cripples the functionality
> of data.frame for larger tables.
Changing the default now would require a long deprecation cycle to give
everyone who uses `[.data.frame` and relies on partial matching
(whether they know it or not) enough time to
2023 Dec 19
1
Partial matching performance in data frame rownames using [
Hi Hilmar and Ivan,
I have used your code examples to write a blog post about this topic,
which has figures that show the asymptotic time complexity of the
various approaches,
https://tdhock.github.io/blog/2023/df-partial-match/
The asymptotic complexity of partial matching appears to be quadratic
O(N^2) whereas the other approaches are asymptotically faster: linear
O(N) or log-linear O(N log N).
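For readers unfamiliar with the behavior under discussion, here is a minimal sketch of row-name partial matching in `[.data.frame`, next to an exact-match lookup via match(). The data frame `df` and its row names are made up for illustration; this is not the benchmark code from the thread or the blog post.

```r
# Hypothetical example data; "app" is a unique prefix of one row name.
df <- data.frame(x = 1:3, row.names = c("apple", "banana", "cherry"))

# `[.data.frame` partially matches character row indices,
# so the prefix "app" selects the "apple" row.
partial <- df["app", "x"]

# match() requires exact equality, so the same prefix finds nothing
# and the lookup yields NA.
exact <- df[match("app", rownames(df)), "x"]

# A full row name works with both approaches.
both <- c(df["banana", "x"], df[match("banana", rownames(df)), "x"])
```

Going through match() is one way to opt out of partial matching without changing the default.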
2006 Nov 07
1
data frame subscription operator
Hi all,
I was looking at the data frame subscription operator (attached at the end
of this e-mail) and was puzzled by the following line:
class(x) <- attr(x, "row.names") <- NULL
This appears to set the class and row.names attributes of the incoming data
frame to NULL. So far I have not been able to figure out why this is necessary.
Could anyone help?
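A small sketch of what the quoted line does to an object, offered as an illustration rather than an answer from the R developers. The data frame `x` below is hypothetical. The likely motivation (an assumption on my part) is that once the class is removed, later indexing inside the method works on a plain list and no longer dispatches to data frame methods.

```r
# Hypothetical data frame used only to show the effect of the quoted line.
x <- data.frame(a = 1:3, b = c("u", "v", "w"))

# Chained assignment, evaluated right to left: the row.names attribute is
# removed first, then the class attribute.
class(x) <- attr(x, "row.names") <- NULL

# What remains is a plain named list of the columns.
remaining <- names(attributes(x))
```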
The reason I am
2023 Dec 13
1
Partial matching performance in data frame rownames using [
Dear Ivan,
thanks a lot, that is helpful.
Still, I feel that default partial matching cripples the functionality
of data.frame for larger tables.
Thanks again and best regards
Hilmar
On 12.12.23 13:55, Ivan Krylov wrote:
> On Mon, 11 Dec 2023 21:11:48 +0100
> Hilmar Berger via R-devel <r-devel at r-project.org> wrote:
>
>> What was unexpected in this case was that
2009 May 26
1
Bug in "$<-.data.frame" yields corrupt data frame (PR#13724)
Full_Name: Steven McKinney
Version: 2.9.0
OS: Mac OS X 10.5.6
Submission from: (NULL) (142.103.207.10)
A corrupt data frame can be constructed as follows:
foo <- matrix(1:12, nrow = 3)
bar <- data.frame(foo)
bar$NewCol <- foo[foo[, 1] == 4, 4]
bar
lapply(bar, length)
> foo <- matrix(1:12, nrow = 3)
> bar <- data.frame(foo)
> bar$NewCol <- foo[foo[, 1] == 4, 4]
2009 Oct 14
1
using mapply to avoid loops
Hello, I would like to use mapply to avoid using a loop but for some reason, I can't seem to get it to work. I've included copies of my code below. The first set of code uses a loop (and it works fine), and the second set of code attempts to use mapply but I get a "subscript out of bounds" error. Any guidance would be greatly appreciated. Xj, Yj, and Wj are also lists, and s2,
2023 Nov 14
1
data.frame weirdness
They differ in whether the row names are "automatic":
> .row_names_info(a1)
[1] -3
> .row_names_info(a2)
[1] 3
Best,
-Deepayan
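A minimal sketch of how .row_names_info() separates automatic from explicit row names. The data frames `auto` and `named` are made up for illustration and are not the a1/a2 objects from the thread.

```r
# Hypothetical data frames for illustration.
auto  <- data.frame(a = 1:3)                                   # automatic row names
named <- data.frame(a = 1:3, row.names = c("r1", "r2", "r3"))  # explicit row names

# With the default type, the row count is negated when row names are automatic.
info_auto  <- .row_names_info(auto)
info_named <- .row_names_info(named)

# type = 2L always gives the plain number of rows.
n_auto <- .row_names_info(auto, 2L)
```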
On Tue, 14 Nov 2023 at 08:23, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
>
> What is going on here? In the lines ending in #### the inputs and outputs
> are identical yet one gives a warning and the other does not.
>
>
2023 Nov 14
1
data.frame weirdness
In that case identical should be FALSE but it is TRUE
identical(a1, a2)
## [1] TRUE
On Tue, Nov 14, 2023 at 8:58 AM Deepayan Sarkar
<deepayan.sarkar at gmail.com> wrote:
>
> They differ in whether the row names are "automatic":
>
> > .row_names_info(a1)
> [1] -3
> > .row_names_info(a2)
> [1] 3
>
> Best,
> -Deepayan
>
> On Tue, 14 Nov
2004 May 24
1
as.matrix.data.frame() in R 1.9.0 converts to character when it should (?) convert to numeric
Conversion of a data frame to a matrix using as.matrix() when a
column of the data frame is POSIXt and all other columns are numeric
has changed in R 1.9.0 from R 1.8.1. The new behavior issues a
warning message and converts to a character matrix. In R 1.8.1, such
an object was converted to a numeric matrix.
Here is an example.
#### R 1.9.0 ####
> foo <- data.frame(
2023 Nov 14
1
data.frame weirdness
Also why should that difference result in different behavior?
On Tue, Nov 14, 2023 at 9:38 AM Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
>
> In that case identical should be FALSE but it is TRUE
>
> identical(a1, a2)
> ## [1] TRUE
>
>
> On Tue, Nov 14, 2023 at 8:58 AM Deepayan Sarkar
> <deepayan.sarkar at gmail.com> wrote:
> >
> >
2017 Dec 01
1
Bug in as.matrix.data.frame with nested data.frame
Converting a data.frame with a nested data.frame to a matrix fails:
x <- structure(list(a = data.frame(letters)),
class = "data.frame",
row.names = .set_row_names(26))
as.matrix(x)
#> Error in ncol(xj) : object 'xj' not found
The offending code is here, in the definition of as.matrix.data.frame
(source/base/all.R):
for (j in pseq) {
2009 Oct 29
3
Weird error: Error in xj[i] : invalid subscript type 'list'
I got the error below. I haven't been able to construct a standalone case
that I can show here. Could somebody give me a clue about what could
cause this error? Since I never defined xj[i], I don't understand
where this error comes from.
Error in xj[i] : invalid subscript type 'list'
2023 Nov 14
1
data.frame weirdness
On Tue, 14 Nov 2023 at 09:41, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
>
> Also why should that difference result in different behavior?
That's justifiable, I think; consider:
> d1 = data.frame(a = 1:4)
> d2 = d3 = data.frame(b = 1:2)
> row.names(d3) = c("a", "b")
> data.frame(d1, d2)
a b
1 1 1
2 2 2
3 3 1
4 4 2
> data.frame(d1,
2003 Jun 18
1
suggestion for make.names
I would like to suggest a modification to the make.names() function.
The current implementation has two problems:
1. It doesn't check if a name matches an R keyword (like "function").
2. The uniqueness algorithm is not invariant to concatenation.
In other words,
make.names(c("a","a","a"),unique=T) !=
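A short sketch of how the two properties raised above can be checked in a recent version of R. The 2003 post describes the R version current at the time, so the poster may have seen different results.

```r
# Reserved words are not syntactically valid names, so make.names() alters them.
kw <- make.names("function")

# Duplicates are disambiguated with numeric suffixes.
u1 <- make.names(c("a", "a", "a"), unique = TRUE)

# Make a prefix unique first, append another "a", and make unique again;
# this can be compared with the one-shot result above.
u2 <- make.names(c(make.names(c("a", "a"), unique = TRUE), "a"), unique = TRUE)

same <- identical(u1, u2)
```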
2009 Nov 04
4
unexpected results in comparison (x == y)
Dear readers of the list,
I have a problem with a comparison of two values from a vector. The comparison
yields FALSE but should be TRUE. I have checked mode(), length() and
attributes(). See the following code (R 2.10.0):
-----------------------------------------------
# data vector of 66 double data
X =
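A frequent cause of this kind of surprise is floating-point rounding. The sketch below uses made-up values, not the poster's data (which is cut off above), so it illustrates the general effect rather than diagnosing this case.

```r
# 0.1 + 0.2 and 0.3 differ in the last bits of their binary representation.
a <- 0.1 + 0.2
b <- 0.3

strict   <- a == b                   # exact comparison
tolerant <- isTRUE(all.equal(a, b))  # comparison within a small tolerance
```

When values come from arithmetic, isTRUE(all.equal(x, y)) is usually the safer test of equality.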
2017 Oct 09
1
Using response variable in interaction as explanatory variable in glm crashes R
>>>>> Jan van der Laan <rhelp at eoos.dds.nl>
>>>>> on Fri, 6 Oct 2017 12:13:39 +0200 writes:
> It is actually model.matrix that crashes, not glm. Same
> crash occurs with e.g. lm.
> model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> also crashes R.
Yes, segmentation fault.
It only happens when these are *logical*
2005 May 26
1
Simplify formula for heterogeneity
Dear R-ians,
I'm looking for a computational simplified formula to calculate a
measure for heterogeneity (let's say H ):
H = sqrt [ (Si (Sj (Xi - Xj)² ) ) /n ]
where:
sqrt = square root
Si = summation over i (= 0 to n)
Sj = summation over j (= 0 to n)
Xi = element of X with index i
Xj = element of X with index j
I can simplify the formula to:
H = sqrt [ ( 2 * n * Si (Xi) - 2 Si (Sj
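A numerical sketch of one way the double sum can be simplified. It assumes both sums run over all N elements of X (with the post's index range 0 to n, that would be N = n + 1) and divides by N; adjust the divisor if the intended definition differs. The data are random and purely illustrative.

```r
set.seed(1)
x <- rnorm(10)        # arbitrary test data
N <- length(x)

# Direct double sum over all pairs (i, j).
direct <- sum(outer(x, x, "-")^2)

# Expanding (x_i - x_j)^2 = x_i^2 - 2 x_i x_j + x_j^2 and summing over i and j
# gives 2 N sum(x^2) - 2 (sum(x))^2, which needs only single sums.
closed <- 2 * N * sum(x^2) - 2 * sum(x)^2

H_direct <- sqrt(direct / N)
H_closed <- sqrt(closed / N)
```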
2010 Mar 26
2
R loop help
Hi,
I am trying to write a loop to compute this,
==========================
x1=c(
rep(-1,4),
rep(1,4)
)
x2=c(
rep(c(-1,-1,1,1),2)
)
x3=c(
rep(c(-1,1),4)
)
x1*x2
x1*x3
x2*x3
========================
Suppose I have x1, x2, x3 and
I want to compute their 'two-factor interactions', x1x2, x1x3 and x2x3.
I wrote
========================
for(i in 1:2){
for( j in i+1:3){
xij=c()
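One point about the loop above: `i+1:3` is parsed as `i + (1:3)` because `:` binds more tightly than `+`, so `(i+1):3` is probably what was intended. As an alternative to nested loops, here is a minimal sketch that builds all two-factor products with combn(). The names `X`, `pairs` and `inter` are my own and do not come from the post.

```r
x1 <- c(rep(-1, 4), rep(1, 4))
x2 <- rep(c(-1, -1, 1, 1), 2)
x3 <- rep(c(-1, 1), 4)
X <- cbind(x1, x2, x3)

# All pairs of column indices: (1,2), (1,3), (2,3).
pairs <- combn(ncol(X), 2)

# One product column per pair, named after the two factors involved.
inter <- apply(pairs, 2, function(p) X[, p[1]] * X[, p[2]])
colnames(inter) <- apply(pairs, 2, function(p) paste(colnames(X)[p], collapse = ""))
```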
2003 Oct 02
3
Query: weighting cells in histogram
I have the 'breaks' for the histogram ('hist'), but I want to weight the cells instead of using actual observations. I thought that using freq=FALSE implied that the numbers in 'x' were weights, but this turned out to be wrong.
Any help and/or comment is very much appreciated.
Regards,
Mårten
Mårten Bjellerup
Doctoral Student in Economics
School of Management and Economics
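A minimal sketch of one possible reading of the question above: each observation carries a weight, and the cell heights should be sums of weights rather than counts. The vectors `x` and `w` are made up, and the cut()/tapply() approach is a workaround of my own choosing, not an option of hist().

```r
x <- c(0.5, 1.5, 1.7, 2.2)   # observations (made up)
w <- c(1, 2, 3, 4)           # one weight per observation (made up)
breaks <- 0:3

# Assign each observation to a cell, then sum the weights per cell.
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
weighted <- tapply(w, bins, sum)
weighted[is.na(weighted)] <- 0   # empty cells come back as NA

# barplot(weighted) would draw the result; left out to keep the sketch non-graphical.
```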