Ivan Krylov
2023-Dec-12 12:55 UTC
[Rd] Partial matching performance in data frame rownames using [
On Mon, 11 Dec 2023 21:11:48 +0100, Hilmar Berger via R-devel <r-devel at r-project.org> wrote:

> What was unexpected in this case was that [.data.frame was
> hanging for a long time (I waited about 10 minutes and then restarted
> R). Also, this cannot be interrupted in interactive mode.

That's unfortunate. If an operation takes a long time, it ought to be interruptible. Here's a patch that passes make check-devel:

--- src/main/unique.c (revision 85667)
+++ src/main/unique.c (working copy)
@@ -1631,6 +1631,7 @@
         }
     }

+    unsigned int ic = 9999;
     if(nexact < n_input) {
         /* Second pass, partial matching */
         for (R_xlen_t i = 0; i < n_input; i++) {
@@ -1642,6 +1643,10 @@
             mtch = 0;
             mtch_count = 0;
             for (int j = 0; j < n_target; j++) {
+                if (!--ic) {
+                    R_CheckUserInterrupt();
+                    ic = 9999;
+                }
                 if (no_dups && used[j]) continue;
                 if (strncmp(ss, tar[j], temp) == 0) {
                     mtch = j + 1;

-- 
Best regards,
Ivan
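The countdown idiom in the patch can be tried outside of R. The following is only a minimal standalone sketch, not the code in unique.c: check_user_interrupt(), interrupt_checks, CHECK_EVERY and count_prefix_matches() are hypothetical names made up for illustration, and the stub merely counts how often it is called, whereas the real R_CheckUserInterrupt() would let a pending user interrupt abort the computation. It shows how a nested prefix-matching loop, which does n_input * n_target comparisons in the worst case, can poll for interrupts once every 9999 inner iterations instead of on every comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for R_CheckUserInterrupt(): it only counts calls. */
unsigned long interrupt_checks = 0;
void check_user_interrupt(void) { interrupt_checks++; }

/* Poll once per this many inner-loop iterations (same value as in the patch). */
#define CHECK_EVERY 9999u

/* Count (input, target) pairs where target starts with input.
 * The doubly nested loop is quadratic, so a shared countdown keeps
 * the cost of interrupt polling negligible per comparison. */
size_t count_prefix_matches(const char **input, size_t n_input,
                            const char **target, size_t n_target)
{
    assert(input != NULL && target != NULL);

    size_t total = 0;
    unsigned int ic = CHECK_EVERY;   /* countdown shared by both loops */

    for (size_t i = 0; i < n_input; i++) {
        size_t len = strlen(input[i]);
        for (size_t j = 0; j < n_target; j++) {
            if (!--ic) {             /* countdown hit zero: poll and reset */
                check_user_interrupt();
                ic = CHECK_EVERY;
            }
            if (strncmp(input[i], target[j], len) == 0)
                total++;
        }
    }
    return total;
}
```

Because the counter is declared outside both loops, the polling interval is measured in total comparisons rather than per input string, so many short inner loops still reach the check.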
Hilmar Berger
2023-Dec-13 08:04 UTC
[Rd] Partial matching performance in data frame rownames using [
Dear Ivan,

thanks a lot, that is helpful.

Still, I feel that default partial matching cripples the functionality of data.frame for larger tables.

Thanks again and best regards,
Hilmar

On 12.12.23 13:55, Ivan Krylov wrote:
> On Mon, 11 Dec 2023 21:11:48 +0100,
> Hilmar Berger via R-devel <r-devel at r-project.org> wrote:
>
>> What was unexpected in this case was that [.data.frame was
>> hanging for a long time (I waited about 10 minutes and then restarted
>> R). Also, this cannot be interrupted in interactive mode.
> That's unfortunate. If an operation takes a long time, it ought to be
> interruptible. Here's a patch that passes make check-devel:
>
> --- src/main/unique.c (revision 85667)
> +++ src/main/unique.c (working copy)
> @@ -1631,6 +1631,7 @@
>          }
>      }
>
> +    unsigned int ic = 9999;
>      if(nexact < n_input) {
>          /* Second pass, partial matching */
>          for (R_xlen_t i = 0; i < n_input; i++) {
> @@ -1642,6 +1643,10 @@
>              mtch = 0;
>              mtch_count = 0;
>              for (int j = 0; j < n_target; j++) {
> +                if (!--ic) {
> +                    R_CheckUserInterrupt();
> +                    ic = 9999;
> +                }
>                  if (no_dups && used[j]) continue;
>                  if (strncmp(ss, tar[j], temp) == 0) {
>                      mtch = j + 1;
>