Thanks for the input, Jeff and Burt. I could have been clearer about what I was looking for.

You are of course right that nothing prevents me from using plain text files. I initially went with compressed binary files because they are smaller.

What I was curious about is whether there is any existing program/script/function/tool that converts an .rds file to plain text, so that I could point git at it when creating the change set for a commit. For example, there are programs that convert PDF documents to plain text, and you can configure git to use these programs to convert the PDFs to plain text before comparing the current version with the version in the last commit, and then only add the things that have changed to the new commit. This allows you to track changes to a PDF file the same way you would a text file, and your deltas don't blow up by committing the entire PDF file each time you change part of the file.

It would be simple enough to write something like this for .rds files with some combination of a shell script and an R script, but I wondered if there was something out there already that I was overlooking.

I do see the point that I may be trying to compensate for a flawed premise: perhaps I should not use an .rds file here if I'm concerned about the repo getting too big as time goes on. I was just hoping to have my cake and eat it too.

- Will

On Mon, Nov 23, 2015 at 1:33 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:

> You are sending contradictory signals... do you or do you not want plain
> text files? You have not indicated what advantage you are getting from
> having binary files in your repository. By far the best answer I see to
> your dilemma is to save in ASCII format instead of the default binary
> format.
>
> On November 23, 2015 9:36:21 AM PST, Will Hopper <wjhopper510 at gmail.com>
> wrote:
>
>> Hi all,
>>
>> I'm posting to see if anyone knows of any existing resources that
>> auto-magically convert R objects saved in .rds files to a plain text
>> representation, suitable for diffing?
>>
>> I often save the results of long-running calculations as .rds files, and
>> since I use git for source control, it would be nice if there were a way to
>> convert .rds files to a plain text representation for diffing, so I could
>> avoid having large commits full of binary data. Git allows you to specify a
>> program for binary --> text conversion for any file type, effectively
>> teaching git how to diff binary files. If there is something out there
>> developed to do this with .rds files, I would really like to know about it!
>>
>> I realize I could save the .rds file with ascii=TRUE and compress=FALSE, but
>> that kind of defeats the point of saving as .rds in the first place.
>>
>> If there is no tool out there that anyone knows of, I don't think it would
>> be too hard for me to write something with bash + Rscript to get the job
>> done, but I'd like to avoid re-inventing the wheel if possible.
>>
>> Thanks for the help!
>>
>> ------------------------------
>>
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
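The size trade-off mentioned in the original post (ascii=TRUE and compress=FALSE versus the default compressed binary) can be checked directly. The following is only a rough sketch and not something posted in the thread: the vector `x` and the temporary file names are made up for illustration, and the actual sizes will depend on the data and the R version.

```r
# Illustrative data only: 10,000 random doubles standing in for real results
set.seed(1)
x <- rnorm(1e4)

bin_file <- tempfile(fileext = ".rds")
txt_file <- tempfile(fileext = ".rds")

# Default saveRDS(): compressed binary serialization
saveRDS(x, file = bin_file)

# Plain-text serialization without compression, readable by line-based diff tools
saveRDS(x, file = txt_file, ascii = TRUE, compress = FALSE)

# Compare the on-disk sizes in bytes
sizes <- c(binary = file.size(bin_file), ascii = file.size(txt_file))
print(sizes)

# Both files can be read back with readRDS(); all.equal() is used here
# because it tolerates tiny floating-point differences in the text form
same <- isTRUE(all.equal(readRDS(bin_file), readRDS(txt_file)))
print(same)
```

For numeric data like this, the ASCII file should come out noticeably larger than the compressed binary one, which is the cost being weighed against diff-friendly text.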
Your any-RDS-to-ASCII converter could be the R function

    toASCIIRDS <- function(fromRDS, toRDS)
    {
        saveRDS(readRDS(fromRDS), file = toRDS, ascii = TRUE, compress = FALSE)
    }

which you can call from Rscript with appropriate input and output file names.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Nov 23, 2015 at 12:18 PM, Will Hopper <wjhopper510 at gmail.com> wrote:

> What I was curious about is whether there is any existing
> program/script/function/tool that converts an .rds file to plain text, so
> that I could point git at it when creating the change set for a commit.
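To plug a converter like this into git as a diff helper, git's `textconv` setting expects a program that receives the file name as its argument and writes the text form to standard output. The sketch below adapts the function above to that shape. It is an assumption-laden illustration, not something from the thread: the script name `rds2txt.R`, the driver name `rds`, and the helper `rds_to_text()` are all hypothetical.

```r
#!/usr/bin/env Rscript
# rds2txt.R (hypothetical name): print an ASCII serialization of an .rds file.
#
# Assumed git wiring (the driver name "rds" is arbitrary):
#   in .gitattributes:   *.rds diff=rds
#   in git config:       git config diff.rds.textconv "Rscript /path/to/rds2txt.R"

rds_to_text <- function(path) {
  tmp <- tempfile(fileext = ".rds")
  on.exit(unlink(tmp), add = TRUE)
  # Same conversion as toASCIIRDS(), but written to a temporary file
  saveRDS(readRDS(path), file = tmp, ascii = TRUE, compress = FALSE)
  # Return the text representation as a character vector, one element per line
  readLines(tmp)
}

args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 1L) {
  # git passes the path of the file to convert as the only argument
  writeLines(rds_to_text(args[1L]))
}
```

Note that a textconv helper only changes what `git diff` displays; the file stored in the repository is still the original binary .rds file, so this makes changes readable but does not by itself shrink the commits.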
I was envisioning something involving write.csv(), but this is a better idea, thanks!

On Mon, Nov 23, 2015 at 3:31 PM, William Dunlap <wdunlap at tibco.com> wrote:

> Your any-RDS-to-ASCII converter could be the R function
>
>     toASCIIRDS <- function(fromRDS, toRDS)
>     {
>         saveRDS(readRDS(fromRDS), file = toRDS, ascii = TRUE, compress = FALSE)
>     }
>
> which you can call from Rscript with appropriate input and output file
> names.