Chris Deigan
2016-Apr-14 05:30 UTC
"Total File Size" Statistic counts each instance of hard linked files
Hi,

This is a question seeking clarification of intended behaviour.

Right now, rsync reports a statistic of "Total file size". This represents "the total sum of all file sizes in the transfer" (as described in the man page).

A case I've hit in using this statistic is that it counts each instance of a file even when the file has multiple hard links. We are using --hard-links to preserve hard links on the destination. As a result we get a statistic of, for instance, 2TB when the actual sum on disk (counted with du, using its default behaviour of counting hard linked files only once) is only around 80GB.

I'm using the statistic to generate backup disk usage numbers that eventually become billing data, so this has produced a few surprises.

There are a few alternatives for my use-case, but I was wondering whether counting hard links multiple times is actually correct behaviour? My feeling is no, but this consideration isn't apparent in the source or docs that I've read.

I'd appreciate any comments, particularly from the project maintainers.

Thanks,
Chris
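To illustrate the difference between the two ways of counting, here is a minimal Python sketch. It is not how rsync computes its statistic. The function name `total_sizes` is made up for this example. It walks a tree and sums apparent file sizes twice: once per path (every hard link counted) and once per inode (each hard-linked file counted once). Note that du by default reports allocated disk blocks rather than apparent size, so the per-inode figure here only approximates du's output.

```python
import os
import stat


def total_sizes(root):
    """Return (per_path, per_inode) byte totals for regular files under root.

    per_path  counts every directory entry, so a file with N hard links
              inside the tree contributes N times its size.
    per_inode counts each (device, inode) pair only once.
    """
    per_path = 0
    per_inode = 0
    seen = set()  # (st_dev, st_ino) pairs already counted

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat so that symlinks are not followed
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, devices, etc.

            per_path += st.st_size

            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                per_inode += st.st_size

    return per_path, per_inode
```

Calling `total_sizes("/path/to/backup")` on a tree with many hard-linked snapshots should show the same kind of gap as the 2TB versus 80GB case above; the path is a placeholder.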