Displaying 20 results from an estimated 7000 matches similar to: "read.csv quadratic time in number of columns"
2023 Mar 30
1
write.csv performance improvements?
Dear R-devel,
I did a systematic comparison of write.csv with similar functions, and
observed two asymptotic inefficiencies that could be improved.
1. write.csv is quadratic time (N^2) in the number of columns N.
Can write.csv be improved to use a linear time algorithm, so it can handle
CSV files with larger numbers of columns?
For more details including figures and session info, please see
2019 Feb 22
1
Bug: time complexity of substring is quadratic as string size and number of substrings increases
On 2/20/19 7:55 PM, Toby Hocking wrote:
> Update: I have observed that stringi::stri_sub is linear time complexity,
> and it computes the same thing as base::substring. figure
> https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png
> source:
> https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R
>
> To me this is a
2019 Feb 20
0
Bug: time complexity of substring is quadratic as string size and number of substrings increases
Update: I have observed that stringi::stri_sub is linear time complexity,
and it computes the same thing as base::substring. figure
https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png
source:
https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.R
To me this is a clear indication of a bug in substring, but again it would
be nice to have
2019 Feb 20
2
Bug: time complexity of substring is quadratic as string size and number of substrings increases
Hi all, (and especially hi to Tomas Kalibera who accepted my patch sent
yesterday)
I believe that I have found another bug, this time in the substring
function. The use case that I am concerned with is when there is a single
(character scalar) text/subject, and many substrings to extract. For example
substring("AAAA", 1:4, 1:4)
or more generally,
N=1000
2019 Feb 19
1
patch for gregexpr(perl=TRUE)
Hi all,
Several people have noticed that gregexpr is very slow for large subject
strings when perl=TRUE is specified.
- https://stackoverflow.com/questions/31216299/r-faster-gregexpr-for-very-large-strings
- http://r.789695.n4.nabble.com/strsplit-perl-TRUE-gregexpr-perl-TRUE-very-slow-for-long-strings-td4727902.html
- https://stat.ethz.ch/pipermail/r-help/2008-October/178451.html
I figured out
2019 Oct 29
0
stats::reshape quadratic in number of input columns
Hi R-core,
I have been performance testing R packages for wide-to-tall data reshaping
and for the most part I see they differ by constant factors.
However in one test, which involves converting into multiple output
columns, I see that stats::reshape is in fact quadratic in the number of
input columns. For example take the iris data, which has 4 input columns to
reshape, and the desired output
2006 May 17
1
install.packages bug (PR#8873)
Hello,
I've been using R for about 3 years now and I'm pretty sure this is a bug.
I'm using R 2.2.0.
The way R is set up to get packages from CRAN using install.packages is
really convenient --- if you are installing to your system's main package
directory. However, I observe the following problem:
I want package X but it requires package Y. Further, I have neither
package
2015 Sep 02
0
mclapply memory leak?
Well it's only a leak if you don't get the memory back after it returns,
right?
Anyway, one (untested by me) possibility is the copying of memory pages
when the garbage collector touches objects, as pointed out by Radford Neal
here:
http://r.789695.n4.nabble.com/Re-R-devel-Digest-Vol-149-Issue-22-td4710367.html
If so, I don't think this would be easily avoidable, but there may be
2015 Sep 03
0
mclapply memory leak?
Toby,
> On Sep 2, 2015, at 1:12 PM, Toby Hocking <tdhock5 at gmail.com> wrote:
>
> Dear R-devel,
>
> I am running mclapply with many iterations over a function that modifies
> nothing and makes no copies of anything. It is taking up a lot of memory,
> so it seems to me like this is a bug. Should I post this to
> bugs.r-project.org?
>
> A minimal reproducible
2020 Jun 10
0
valgrind false positive on R startup?
It is known, with a known workaround, see e.g.
https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt . Set
suppressions in ~/.valgrindrc, e.g. the CRAN check machine has
--suppressions=/data/blackswan/ripley/wcsrtombs.supp
It is an issue in your OS (glibc), not TRE nor R.
On 10/06/2020 00:21, Toby Hocking wrote:
> Hi all,
>
> I'm on Ubuntu 18.04, running R-4.0.0 which I
2019 May 30
0
R pkg install should fail for unsuccessful DLL copy on windows?
Hi Toby,
AFAIK it has not been addressed in R. You can handle the problem on
your package side, see
https://github.com/Rdatatable/data.table/pull/3237
Regards,
Jan
On Thu, May 30, 2019 at 4:46 AM Toby Hocking <tdhock5 at gmail.com> wrote:
>
> Hi all,
>
> I am having an issue related to installing packages on windows with
> R-3.6.0. When installing a package that is in use, I
2015 Sep 02
4
mclapply memory leak?
Dear R-devel,
I am running mclapply with many iterations over a function that modifies
nothing and makes no copies of anything. It is taking up a lot of memory,
so it seems to me like this is a bug. Should I post this to
bugs.r-project.org?
A minimal reproducible example can be obtained by first starting a memory
monitoring program such as htop, and then executing the following code
while
2020 May 13
1
docs about _R_CHECK_FORCE_SUGGESTS_ ?
Hi Toby,
As Gabor pointed out, the various levers that R CMD check supports are
documented in the R-internals manual, but there is a link directly to that
section in
https://cloud.r-project.org/doc/manuals/r-release/R-exts.html#Checking-packages
It could perhaps be more prominent, perhaps by moving the paragraph that
appears in to before the detailed list of exact tests that are performed?
2020 Jun 09
2
valgrind false positive on R startup?
Hi all,
I'm on Ubuntu 18.04, running R-4.0.0 which I compiled from source, and
using valgrind I am always seeing the following message. Does anybody
else see that? Is that a known false positive? Any ideas how to
fix/suppress? Seems related to TRE, do I need to upgrade that?
(base) tdhock@maude-MacBookPro:~/R/binsegRcpp$ R --vanilla -d valgrind -e 'extSoftVersion()'
==9565==
2024 Oct 25
1
Could .Primitive("[") stop forcing R_Visible = TRUE?
On Thu, 24 Oct 2024 13:23:56 -0400
Toby Hocking <tdhock5 at gmail.com> wrote:
> The patch you are proposing to base R is
> https://github.com/Rdatatable/data.table/issues/6566#issuecomment-2428912338
> right?
Yes, it's this one, thank you for providing the link.
Surprisingly, a very cursory check of 100 packages most downloaded from
cloud.r-project.org in the last month
2011 Feb 25
0
Named capture in regexp
Dear R core developers,
One feature from Python that I have been wanting in R is the ability
to capture groups in regular expressions using names. Consider the
following example in R.
> notables <- c(" Ben Franklin and Jefferson Davis","\tMillard Fillmore")
> name.rex <- "(?<first>[A-Z][a-z]+) (?<last>[A-Z][a-z]+)"
> (parsed <-
2020 Jan 08
1
add jsslogo.jpg to R sources?
On Wed, 8 Jan 2020, Iñaki Ucar wrote:
> On Wed, 8 Jan 2020 at 19:21, Toby Hocking <tdhock5 at gmail.com> wrote:
>>
>> Hi R-core, I was wondering if somebody could please add jsslogo.jpg to the
>> R sources? (as I reported yesterday in this bug)
>>
>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17687
>>
>> R already includes jss.cls which
2023 Dec 19
1
Partial matching performance in data frame rownames using [
Hi Hilmar and Ivan,
I have used your code examples to write a blog post about this topic,
which has figures that show the asymptotic time complexity of the
various approaches,
https://tdhock.github.io/blog/2023/df-partial-match/
The asymptotic complexity of partial matching appears to be quadratic
O(N^2) whereas the other approaches are asymptotically faster: linear
O(N) or log-linear O(N log N).
2020 Jun 27
1
Error in substring: invalid multibyte string
Thanks for the quick response Ivan. readLines with encoding='latin1' works
for me (on Ubuntu).
However I was more concerned with the inconsistency in results between
substr and regexpr. I was expecting that if one of them errors because of
an unknown encoding then the other should as well. Even better, if regexpr
works, why shouldn't substr work as well?
Incidentally the analogous
2015 May 04
0
Print output during long tests?
Dear Toby,
Have you tried adding output to the tests with the context() function?
Best regards,
Thierry
On 4 May 2015 at 18:28, "Toby Hocking" <tdhock5 at gmail.com> wrote:
I am the author of R package animint which uses testthat for unit tests.
This means that there is a single test file (animint/tests/testthat.R) and
during R CMD check we will see the following output
*