similar to: read big text file into R

Displaying 20 results from an estimated 2000 matches similar to: "read big text file into R"

2007 Oct 22
2
Help interpreting output of Rprof
Hello there, I am not quite sure how to interpret the output of Rprof (in the following the output I was staring at). I was poking around the web a little bit for documentation but without much success. I guess if I want to figure out what takes so long in my code the 2nd table $by.total and the total.pct column (pct = percent) is the most helpful. What does it mean that [ or [.data.frame is
2012 Dec 05
1
Understanding svd usage and its necessity in generalized inverse calculation
Dear R-devel: I could use some advice about matrix calculations and steps that might make for faster computation of generalized inverses. It appears in some projects there is a bottleneck at the use of svd in calculation of generalized inverses. Here's some Rprof output I need to understand. > summaryRprof("Amelia.out") $by.self self.time self.pct
2009 Mar 03
1
profiler and loops
Hello, (This is a follow-up to this thread: http://www.nabble.com/execution-time-of-.packages-td22304833.html but with a different focus) I am often confused by the results of the profiler when a loop is involved. Consider these two scripts: script1: Rprof( ) x <- numeric( ) for( i in 1:10000){ x <- c( x, rnorm(10) ) } Rprof( NULL ) print( summaryRprof( ) ) script2:
2013 Apr 05
2
line profiling
Hello, This is about the new "line profiling" feature in R 3.0.0. As I was testing it, I found the results somewhat disappointing, so I'd like to get your opinion. I put some poorly written code in a test.R file, here are the contents: double <- function(x) { out <- c() for (i in x) { out <- c(out, 2*i) # line 4 } return(out) } Then this is how I source the file
2009 Jun 12
1
Rprof loses all system() time
Rprof seems to ignore all time spent inside system() calls. E.g., this simple example actually takes about 10 seconds, but Rprof thinks the total time is only 0.12 seconds: > Rprof("sleep-system.out") ; system.time(system(command="sleep 10")) ; Rprof(NULL) user system elapsed 0.000 0.004 10.015 > summaryRprof("sleep-system.out")$by.total
2004 Oct 16
3
Lazy loading... advices
Hello, I am looking for more information about lazy loading introduced in R 2.0.0. Doing ?lazyLoad I got some and there is a 'see also' section that points to 'makeLazyLoading'... But I cannot reach this page. My problem is: I recompiled a library that uses a lot of functions from other libraries (of course I can give details if needed). I load it in my computer: library(svGUI),
2004 Jul 16
3
interpreting profiling output
I have some trouble interpreting the output from profiling. I have read the help pages Rprof, summaryRprof and consulted the R extensions manual, but I still have problems understanding the output. Basically the output consists of self.time and total.time. I have the understanding that total.time is the time spent in a given function including any subcalls or child functions or whatever the
2007 Mar 31
1
Problem with argument "append" in "Rprof"
Hello, Appending information to the profiler's output seems to generate problems. Here is a small example of code : <code r> require(boot) Rprof( memory.profiling = TRUE) Rprof(NULL) for(i in 1:2){ Rprof( memory.profiling = TRUE, append = TRUE) example(boot) Rprof(NULL) } </code> The problem is that the file Rprof.out contains more than once the header information: $ grep
2013 Mar 28
1
make R program faster
Hi, there are some good tips in "The R Inferno" http://www.burns-stat.com/documents/books/the-r-inferno/ or connect C++ to R with Rcpp http://dirk.eddelbuettel.com/code/rcpp.html or byte code compiler (library(compiler)) or library(data.table), but do you have an idea how to speed up standard R source code, with the following Rprof output self.time self.pct total.time
2012 Oct 26
1
Parsing very large xml datafiles with SAX: How to profile <anonymous> functions?
Hello everyone, I'm trying to parse a very large XML file using SAX with the XML package (i.e., mainly the xmlEventParsing function). This function takes as an argument a list of other functions (handlers) that will be called to handle particular xml nodes. When I use Rprof(), all the handler functions are lumped together under the <anonymous> label, and I get something like this:
2009 Oct 19
2
how to get rid of 2 for-loops and optimize runtime
Short: get rid of the loops I use and optimize runtime Dear all, I want to calculate for each row the amount of the month ago. I use a matrix with 2100 rows and 22 columns (which is still a very small matrix. nrows of other matrices can easily be more than 100000) Table before Year month quarter yearmonth Service ... Amount 2009 9 Q3 092009 A ...
2009 Nov 10
1
standardGeneric seems slow; any way to get around it?
Hi, I'm running some routines with standard matrix operations like solve() and diag(). When I do a profile, the lead item under total time is standardGeneric(). Furthermore, solve() and diag() have much greater total time than self time. ??? I assume there is some time-consuming decision going on in the usual functions; is there any way to avoid that and go straight to the calculations? Thanks
2003 Nov 29
3
performance gap between R 1.7.1 and 1.8.0
Dear R-help, A colleague of mine was running some code on two of our boxes, and noticed a rather large difference in running time. We've so far isolated the problem to the difference between R 1.7.1 and 1.8.0, but not more than that. The exact same code took 933.5 seconds in 1.7.1, and 3594.4 seconds in 1.8.1, on the same box. Basically, the code calls boot() to bootstrap fitting mixture
2012 Sep 14
1
please comment on my function
this function is supposed to canonicalize the language: --8<---------------cut here---------------start------------->8--- canonicalize.language <- function (s) { s <- tolower(s) long <- nchar(s) == 5 s[long] <- sub("^([a-z]{2})[-_][a-z]{2}$","\\1",s[long]) s[nchar(s) != 2 & s != "c"] <- "unknown" s }
2008 Nov 26
2
Very slow: using double apply and cor.test to compute correlation p.values for 2 matrices
My two matrices are roughly the sizes of m1 and m2. I tried using two apply calls and cor.test to compute the correlation p.values. After more than an hour, the code is still running. Please help to make it more efficient. m1 <- matrix(rnorm(100000), ncol=100) m2 <- matrix(rnorm(10000000), ncol=100) cor.pvalues <- apply(m1, 1, function(x) { apply(m2, 1, function(y) { cor.test(x,y)$p.value
2005 Oct 25
1
performance of nchar
Hi, Is the nchar function known to be slow in R? I'm doing some string formatting that requires multiple calls to nchar, and nchar seems to be very slow. Experiment 1, pass nchar inside sprintf, and it takes 0.7 seconds > system.time(for (i in 1:10000) + str = sprintf('0005%020d', nchar(op)) + )[3] [1] 0.7 Experiment 2, get the length of op separately using nchar, and then pass
2023 Oct 15
2
Create new data frame with conditional sums
Under the hood, sapply() is also a loop (at the interpreted level). As is lapply(), etc. -- Bert On Sun, Oct 15, 2023 at 2:34 AM Jason Stout, M.D. <jason.stout at duke.edu> wrote: > > That's very helpful and instructive, thank you! > > Jason Stout, MD, MHS > Box 102359-DUMC > Durham, NC 27710 > FAX 919-681-7494 > ________________________________ > From: John
2009 Mar 18
2
Profiling question: string formatting extremely slow
Hi all, I'm using R to find duplicates in a set of 6 files containing Part Number information. Before applying the intersect method to identify the duplicates I need to normalize the P/Ns. Converting the P/N to uppercase if alphanumerical and applying an 18 char long zero padding if numerical. When I apply the pn_formatting function (see code below) to "Part Number" column of the
2023 Oct 14
1
Create new data frame with conditional sums
That's very helpful and instructive, thank you! Jason Stout, MD, MHS Box 102359-DUMC Durham, NC 27710 FAX 919-681-7494 ________________________________ From: John Fox <jfox at mcmaster.ca> Sent: Saturday, October 14, 2023 10:13 AM To: Jason Stout, M.D. <jason.stout at duke.edu> Cc: r-help at r-project.org <r-help at r-project.org> Subject: Re: [R] Create new data frame with
2023 Oct 14
2
Create new data frame with conditional sums
Well, here's one way to do it: (dat is your example data frame) Cutoff <- seq(0, .15, .01) Pop <- with(dat, sapply(Cutoff, \(p)sum(Totpop[Pct >= p]))) I think there must be a more efficient way to do it with cumsum(), though. Cheers, Bert On Sat, Oct 14, 2023 at 12:53 AM Jason Stout, M.D. <jason.stout at duke.edu> wrote: > > This seems like it should be simple but I