thr3ads.net - similar to: "Newbie wants to compare 2 huge RDSs row by row."

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 27

Also, it will be easier to provide helpful information if you describe what in your data you want to compare and what you hope to get out of the comparison.

Best wishes,
Ulrik

Eric Berger <ericjberger at gmail.com> wrote on Sat, 27 Jan 2018, 08:18:
> Hi Marsh,
> An RDS is not a data structure such as a data.frame. It can be anything.
> For example if I want to save my

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 27

Hi Guys, I apologize for my rank & utter newness at R. I used summary() and found about 95 variables, both character and numeric, all with "Length:368842", which I assume is the # of records. I'd like to know the record number (row #?) of any record where the data doesn't match between the 2 files, which should contain the same output. Thanks in advance, M. //

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 27

If your two objects have class "data.frame" (look at class(objectName)), they both have the same number of columns in the same order, and the column types match closely enough (use all.equal(x1, x2) to check that), then you can try

    which( rowSums( x1 != x2 ) > 0)

E.g.,

    > x1 <- data.frame(X=1:5, Y=rep(c("A","B"),c(3,2)))
    > x2 <-

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 27

Hi Marsh,
An RDS is not a data structure such as a data.frame. It can be anything. For example, if I want to save my objects a, b, c I could do:

    > saveRDS( list(a,b,c), file="tmp.RDS")

Then read them back later with

    > myList <- readRDS( "tmp.RDS" )

Do you have additional information about your "RDSs"?

Eric

On Sat, Jan 27, 2018 at 6:54 AM, Marsh Hardy
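
As a rough illustration of the point in this reply (an RDS file holds whatever single R object was saved, not necessarily a data.frame), here is a minimal sketch. The objects a, b, c and the temporary file path are placeholders chosen for this example, not taken from the thread.

```r
# Illustrative objects of different types (placeholder names and values).
a <- 1:3
b <- "some text"
c <- data.frame(x = c(1.5, 2.5))

# Bundle them in a named list and write the list to a temporary file.
rds_path <- tempfile(fileext = ".RDS")
saveRDS(list(a = a, b = b, c = c), file = rds_path)

# Read the object back and check what it actually is before comparing anything.
myList <- readRDS(rds_path)
top_class <- class(myList)
element_classes <- vapply(myList, function(el) class(el)[1], character(1))
print(top_class)
print(element_classes)
```

Checking class() on the result of readRDS() first is what tells you whether a row-by-row comparison even makes sense for the object you loaded.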

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 28

The anti_join function from the dplyr package might also be handy.

    install.packages("dplyr")
    library(dplyr)
    anti_join(x1, x2)

You can get help on any function with ?function.name, so ?anti_join will bring up help - and examples - for the anti_join function. It might be worth testing your approach on a small subset of the data. That makes it easier for you to follow what happens
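
A small sketch of the anti_join() suggestion on toy data, assuming dplyr is installed. The data frames x1 and x2 and their columns X and Y are made up for illustration. Note that anti_join() matches rows by value, not by position, so it reports rows of x1 that appear nowhere in x2 rather than row numbers that differ.

```r
# Toy data frames (illustrative only); x2 differs from x1 in rows 2 and 4.
x1 <- data.frame(X = 1:5, Y = rep(c("A", "B"), c(3, 2)), stringsAsFactors = FALSE)
x2 <- x1
x2$X[2] <- 20L
x2$Y[4] <- "A"

# Guarded so the sketch does nothing if dplyr is unavailable.
if (requireNamespace("dplyr", quietly = TRUE)) {
  # Rows of x1 with no matching (X, Y) combination anywhere in x2.
  only_in_x1 <- dplyr::anti_join(x1, x2, by = c("X", "Y"))
  print(only_in_x1)
}
```

Passing the by argument explicitly avoids the message dplyr prints when it has to guess the join columns.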

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 28

Cool, looks like that'd do it, almost as if converting an entire record to a character string and comparing strings. -- M. B. Hardy, statistician work: Applied Research Associates, S. E. Div. 8537 Six Forks Rd., # 6000 / Raleigh, NC 27615-2963 (919) 582-3329, fax: 582-3301 home: 1020 W. South St. / Raleigh, NC 27603-2162 (919) 834-1245

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 28

Thanks, I think I've found the most succinct expression of differences in two data.frames...

    length(which( rowSums( x1 != x2 ) > 0))

gives a count of the # of records in two data.frames that do not match. //

________________________________________
From: Henrik Bengtsson [henrik.bengtsson at gmail.com]
Sent: Sunday, January 28, 2018 11:12 AM
To: Ulrik Stervbo
Cc: Marsh Hardy ARA/RISK;
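
The expression above can be tried on a toy example like the following sketch. The data frames are invented for illustration, and the na.rm = TRUE argument is an addition not in the thread: without it, any row containing an NA comparison yields NA and is silently dropped by which(); with it, NA comparisons are simply ignored, so rows that differ only by an NA are not flagged either way.

```r
# Toy data frames with identical shape and column order (illustrative only).
x1 <- data.frame(X = 1:5, Y = rep(c("A", "B"), c(3, 2)), stringsAsFactors = FALSE)
x2 <- x1
x2$X[2] <- 20L   # introduce a numeric mismatch in row 2
x2$Y[4] <- "A"   # introduce a character mismatch in row 4

# x1 != x2 gives a logical matrix; rowSums counts mismatching cells per row.
mismatch_rows <- which(rowSums(x1 != x2, na.rm = TRUE) > 0)
n_mismatch <- length(mismatch_rows)

print(unname(mismatch_rows))
print(n_mismatch)
```

This relies on both data frames having the same dimensions and column order, as the earlier reply in the thread points out.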

Newbie wants to compare 2 huge RDSs row by row.

2018 Jan 28

The diffobj package (https://cran.r-project.org/package=diffobj) is really helpful here. It provides "diff" functions diffPrint(), diffStr(), and diffChr() to compare two objects 'x' and 'y' and provide neat colorized summary output. Example:

    > iris2 <- iris
    > iris2[122:125,4] <- iris2[122:125,4] + 0.1
    > diffobj::diffPrint(iris2, iris)
    < iris2 >

readRDS and saveRDS

2011 Oct 18

Hi all, Is there any chance that readRDS and saveRDS might one day become read.rds and write.rds? That would make them more consistent with the other reading and writing functions. Hadley -- Assistant Professor / Dobelman Family Junior Chair Department of Statistics / Rice University http://had.co.nz/

Risk of readRDS() not detecting race conditions with parallel saveRDS()?

2012 Sep 15

I hardly know anything about the format used in (non-compressed) serialization/RDS, but I am hoping someone with more knowledge could give me some feedback. Consider two R processes running in parallel on the same unknown file system. Both of them write to and read from the same RDS file foo.rds (without compression) at random times using saveRDS(object, file="foo.rds", compress=FALSE) and

recordPlot/replayPlot not working with saveRDS/readRDS

2018 Apr 02

The documentation for recordPlot says the following:

> As of R 3.3.0, it is possible (again) to replay a plot from another R session using, for example, saveRDS and readRDS.

However, I haven't been able to save and restore a plot displaylist and have it work within the same R session, using R 3.4.3 or 3.3.3. Here's an example:

    # Save displaylist for a simple plot

Problem reading RDS files

2018 Apr 22

Hi there, I faced a weird problem doing a seemingly simple task in R. Specifically, when trying to read an RDS file from the working directory, the following error appears.

Code:
    records <- readRDS("tweets.rds")

Error:
    Error in readRDS("tweets.rds") : error reading from connection
    In addition: Warning message:
    In readRDS("tweets.rds") : invalid or

How to benchmark speed of load/readRDS correctly

2017 Aug 22

Dear all, I was thinking about reading data into R efficiently and tried several ways to test whether load(file.Rdata) or readRDS(file.rds) is faster. The files file.Rdata and file.rds contain the same data, the first created with save(d, file='file.Rdata', compress=FALSE) and the second with saveRDS(d, 'file.rds', compress=FALSE). First I used the function microbenchmark() and was astonished

serializing recordedplot object

2012 Jan 09

I use recordPlot() to save plots to disk that I render later to a variety of formats. This works fine for base R plots and ggplot2 plots, and also used to work for lattice plots. However somewhere in version 2.14 things stopped working for lattice plots. Here is an example: library(lattice); histogram(rnorm(100)); x <- recordPlot(); saveRDS(x, "myplot.rds"); y <-

readRDS, In as.double.xts(fishReport$count) : NAs introduced by coercion

2012 Jul 29

Hello, I searched the R-help archives but could not find a post addressing the following. I would like to convert character to numeric after reading a file with the RDS extension. After using as.numeric, I checked whether it was numeric. It was not converted. Please help. Here is my code:

    > Report <- readRDS(file="RDS/Report.RDS")
    > Report[1:2,]
    dive_id date

[FORGED] recordPlot/replayPlot not working with saveRDS/readRDS

2018 Apr 03

>>>>> Paul Murrell <paul at stat.auckland.ac.nz>
>>>>>     on Tue, 3 Apr 2018 09:41:56 +1200 writes:

    > Hi
    > What you are doing "wrong" is loading a recordedplot into the same session that it was created in. The saveRDS()/readRDS() works if you save in one R session and then read in a different R session. The

Problem reading RDS files

2018 Apr 22

Wouldn't the obvious problem be that your data file is corrupted or was never created using saveRDS in the first place? Can you show us a complete example of creating and attempting to read what was just created?

On April 22, 2018 10:20:05 AM CDT, mohammad moradi <mri.moradi at gmail.com> wrote:
>Hi there,
>
>I faced a weird problem doing a seemingly simple task in R.

Problem reading RDS files

2018 Apr 23

I tried to reproduce the tutorial presented at http://www.rdatamining.com/docs/twitter-analysis-with-r and specifically aimed to use the rds files (tweet records) at http://www.rdatamining.com/data/.

On Sun, Apr 22, 2018 at 9:16 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> Wouldn't the obvious problem be that your data file is corrupted or was
> never created

How to benchmark speed of load/readRDS correctly

2017 Aug 22

The large value for maximum time may be due to garbage collection, which happens periodically. E.g., try the following, where the unlist(as.list()) creates a lot of garbage. I get a very large time every 102 or 51 iterations and a moderately large time more often.

    mb <- microbenchmark::microbenchmark({ x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5) ; sum(x) }, times=1000)

How to benchmark speed of load/readRDS correctly

2017 Aug 22

Note that if you force a garbage collection each iteration the times are more stable. However, on average it is faster to let the garbage collector decide when to leap into action.

    mb_gc <- microbenchmark::microbenchmark(gc(), { x <- as.list(sin(1:5e5)); x <- unlist(x) / cos(1:5e5) ; sum(x) }, times=1000, control=list(order="inorder"))
    with(mb_gc,
