I'm trying to streamline a backup system using ZFS. In our situation, we're writing pg_dump files repeatedly, each file being highly similar to the previous file. Is there a file system (e.g., ext4? xfs?) that, when re-writing a similar file, will write only the changed blocks and not rewrite the entire file to a new set of blocks?

Assume that we're writing a 500 MB file with only 100 KB of changes. Other than a utility like diff, is there a file system that would only write 100 KB and not 500 MB of data? In concept, this would work similarly to using the 'diff' utility...

-Ben
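(For context, the write pattern being described looks roughly like the nightly job below; the database name and target path are made up for illustration:)

    # Nightly dump: each run produces a ~500 MB file that differs from the
    # previous one by only ~100 KB, yet the whole file gets rewritten
    pg_dump mydb > /tank/backups/mydb.sql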
m.roth at 5-cent.us
2014-Jul-02 19:57 UTC
[CentOS] block level changes at the file system level?
Lists wrote:
> I'm trying to streamline a backup system using ZFS. In our situation,
> we're writing pg_dump files repeatedly, each file being highly similar
> to the previous file. Is there a file system (e.g., ext4? xfs?) that, when
> re-writing a similar file, will write only the changed blocks and not
> rewrite the entire file to a new set of blocks?
>
> Assume that we're writing a 500 MB file with only 100 KB of changes.
> Other than a utility like diff, is there a file system that would only
> write 100 KB and not 500 MB of data? In concept, this would work
> similarly to using the 'diff' utility...

I think the buzzword you want is dedup.

mark
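(A minimal sketch of what that could look like on ZFS, assuming a dataset named tank/pgdumps; note that dedup only helps if identical data stays block-aligned between dumps:)

    # Enable block-level deduplication on the (hypothetical) backup dataset;
    # it only affects blocks written after the property is set
    zfs set dedup=on tank/pgdumps

    # Check how well blocks are actually deduplicating across the pool
    zpool get dedupratio tank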
On Wed, Jul 2, 2014 at 2:53 PM, Lists <lists at benjamindsmith.com> wrote:
> I'm trying to streamline a backup system using ZFS. In our situation,
> we're writing pg_dump files repeatedly, each file being highly similar
> to the previous file. Is there a file system (e.g., ext4? xfs?) that, when
> re-writing a similar file, will write only the changed blocks and not
> rewrite the entire file to a new set of blocks?
>
> Assume that we're writing a 500 MB file with only 100 KB of changes.
> Other than a utility like diff, is there a file system that would only
> write 100 KB and not 500 MB of data? In concept, this would work
> similarly to using the 'diff' utility...

There is something called rdiff-backup (http://www.nongnu.org/rdiff-backup/ and packaged in EPEL) that does reverse diffs at the application level. If it performs well enough it might be easier to manage than a de-duping filesystem.

Or BackupPC, which would store a complete copy if there are any changes at all between dumps, but would compress them and automatically manage the number you need to keep.

--
  Les Mikesell
  lesmikesell at gmail.com
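(A rough sketch of the rdiff-backup approach; the source and destination paths below are just placeholders:)

    # Each run mirrors the current dumps and keeps reverse diffs of older
    # versions, so a 100 KB change costs roughly 100 KB of extra storage
    rdiff-backup /var/backups/pgdump /srv/rdiff/pgdump

    # List the stored increments, and prune anything older than four weeks
    rdiff-backup --list-increments /srv/rdiff/pgdump
    rdiff-backup --remove-older-than 4W /srv/rdiff/pgdump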
John R Pierce
2014-Jul-03 19:19 UTC
[CentOS] block level changes at the file system level?
On 7/2/2014 12:53 PM, Lists wrote:
> I'm trying to streamline a backup system using ZFS. In our situation,
> we're writing pg_dump files repeatedly, each file being highly similar
> to the previous file. Is there a file system (e.g., ext4? xfs?) that, when
> re-writing a similar file, will write only the changed blocks and not
> rewrite the entire file to a new set of blocks?
>
> Assume that we're writing a 500 MB file with only 100 KB of changes.
> Other than a utility like diff, is there a file system that would only
> write 100 KB and not 500 MB of data? In concept, this would work
> similarly to using the 'diff' utility...

You do realize that adding, removing, or even changing the length of a single line in that pg_dump file will change every block after it, since all the following data will be offset?

May I suggest that instead of pg_dump, you use pg_basebackup and WAL archiving? This is the best way to do delta backups of a SQL database server.

--
john r pierce                                      37N 122W
somewhere on the middle of the left coast
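(For anyone following along, a minimal sketch of that setup; the archive directory and the 9.x-style settings are assumptions, not anything stated in this thread:)

    # postgresql.conf: ship each completed WAL segment to an archive directory
    wal_level = archive
    archive_mode = on
    archive_command = 'test ! -f /srv/wal_archive/%f && cp %p /srv/wal_archive/%f'

    # Periodic full base backup; between base backups only the archived WAL
    # segments (i.e. the actual changes) accumulate
    pg_basebackup -D /srv/basebackup/$(date +%F) -F tar -z -X fetch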