On Sun, 2007-12-30 at 12:21 -0600, reader@newsguy.com wrote:
> I'm looking for a way to accomplish this goal without too much regular
> intervention.
>
> I want to back up... but no, that's not really the word I need here. It
> is a backup but also a continuing non-deleting collection of what
> comes on the spools over time. So the backup will never look like or
> mirror the src, except on the first run.
>
> It is NNTP and Mail spools (in one numbered file per message or post)
> I'm working with, hundreds of thousands of them.
>
> The end result would be (after 12 months) new top level directory started
> every 4th month and no overlap where the changeover occurs.
>
> jan-feb-mar-apr/ may-jun-jul-aug/ sep-oct-nov-dec/
I think these names are unwieldy. You might consider a naming scheme
like 2008T1, 2008T2, 2008T3 (T for third, like Q for quarter) instead.
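
If the directory name is to be generated from a cron job rather than typed
by hand, the label can be derived from the date. This is only a sketch:
period_label is a made-up helper name, and it assumes the thirds run
Jan-Apr, May-Aug and Sep-Dec as in your layout.

```sh
# Hypothetical helper: print a label such as 2008T2 for a year and month number.
period_label() {
    year=$1
    month=${2#0}   # strip one leading zero so "08" is not read as octal
    # Months 1-4 give T1, 5-8 give T2, 9-12 give T3 (integer division).
    printf '%sT%d\n' "$year" $(( (month - 1) / 4 + 1 ))
}

# Label for the current date.
period_label "$(date +%Y)" "$(date +%m)"
```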
> So the problem as I see it is that the src spools continue to accrue
> on the new end and expire on the old, with about a 5000-message queue or
> holding period (works out to close to 1 month).
>
> So when a changeover occurs, since the new top level hierarchy has
> nothing for rsync to find as `uptodate', it copies over whatever is on
> the src spool which will include a massive amount of overlap.
>
> I guess what I'd like to do is redirect the actual destination but
> make rsync look at the old destination to determine what is uptodate
> for some period.
>
> It's a given that rsync has a wonderfully flexible set of flags and
> switches. Still, near as I can tell what I described is not possible,
> using only rsync switches.
There's a switch just for you: --compare-dest.  Copy to the new
destination and specify the old destination as a --compare-dest
directory to have rsync skip copying source files that appear
identically in the old destination.
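
A rough sketch of what that could look like at a changeover. The function
name sync_period, the paths and the 2008T1/2008T2 labels are all made up
for illustration; only -a and --compare-dest are actual rsync options. A
relative --compare-dest path is interpreted relative to the destination
directory, which is why ../2008T1/ works when the two trees sit side by
side.

```sh
# Hypothetical wrapper: copy src into dest_root/new, skipping any file that
# already exists unchanged in dest_root/old.
sync_period() {
    src=$1 dest_root=$2 old=$3 new=$4
    mkdir -p "$dest_root/$new"
    # No --delete, so messages that expire from the spool stay in the backup.
    rsync -a --compare-dest="../$old/" "$src/" "$dest_root/$new/"
}

# Example call (assumed locations, adjust to the real spool and backup root):
# sync_period /var/spool/news /backup/spool 2008T1 2008T2
```

Once the old messages have expired from the spool (about a month, by your
numbers), the --compare-dest option no longer has anything to match and
can be dropped until the next changeover.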
Matt