I just upgraded to Solaris 10 Update 10, and one of the improvements is "zfs diff".

Using the "birthtime" of the sectors, I would expect very high performance. The actual performance doesn't seem better than a standard "rdiff", though. Quite disappointing...

Should I disable "atime" to improve "zfs diff" performance? (Most data doesn't change, but the "atime" of most files would change.)

"""
[root@buffy backups]# zfs list datos/backups/buffy
NAME                  USED  AVAIL  REFER  MOUNTPOINT
datos/backups/buffy  8.95G   553G  7.55G  /backups/buffy
[root@buffy backups]# time zfs diff -Ft datos/backups/buffy@20110926-20:22 datos/backups/buffy@20110926-20:35
1317061842.659141598  M  /  /backups/buffy/root/proc
1317061812.437869058  M  /  /backups/buffy/root/dev/fd
1317061816.752409624  M  |  /backups/buffy/root/etc/saf/_sacpipe
1317061816.791269117  M  |  /backups/buffy/root/etc/saf/zsmon/_pmpipe
1317061817.291653834  M  /  /backups/buffy/root/etc/svc/volatile
1317061934.727002843  M  F  /backups/buffy/var/adm/lastlog
1317061934.796205623  M  F  /backups/buffy/var/adm/wtmpx
1317061938.764996484  M  F  /backups/buffy/var/ntp/ntpstats/loopstats
1317061938.978388173  M  F  /backups/buffy/var/ntp/ntpstats/peerstats.20110926

real    10m0.272s
user    0m0.809s
sys     2m6.693s
"""

10 minutes to "diff" 7.55 GB is... disappointing.

This machine uses a 2-mirror configuration, and there is no other activity going on in the machine. ZPOOL version 29, ZFS version 5.

Am I missing anything?

--
Jesus Cea Avion
jcea at jcea.es - http://www.jcea.es/
jabber / xmpp:jcea at jabber.org
"Things are not so easy"
"My name is Dump, Core Dump"
"Love is placing your happiness in the happiness of another" - Leibniz

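For anyone who wants to post-process a listing like the one in the message above, here is a minimal Python sketch that splits each "zfs diff -Ft" line into its four columns. It assumes the layout shown in that listing (timestamp, change code, file-type code, path, separated by whitespace). The names DiffEntry, parse_diff_line and describe are made up for this illustration and are not part of any ZFS tooling, and the code tables only cover the codes that appear in the listing.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Only the codes that appear in the listing above; anything else is shown as-is.
CHANGE_TYPES = {"M": "modified"}
FILE_TYPES = {"F": "regular file", "/": "directory", "|": "named pipe"}


@dataclass
class DiffEntry:
    when: datetime   # timestamp column printed by -t, converted to UTC
    change: str      # change code, e.g. "M"
    file_type: str   # file-type code printed by -F, e.g. "F"
    path: str


def parse_diff_line(line: str) -> DiffEntry:
    """Parse one 'timestamp change type path' line (hypothetical helper)."""
    # The path comes last and may contain spaces, so split at most 3 times.
    stamp, change, file_type, path = line.rstrip("\n").split(None, 3)
    when = datetime.fromtimestamp(float(stamp), tz=timezone.utc)
    return DiffEntry(when, change, file_type, path)


def describe(entry: DiffEntry) -> str:
    """Render an entry with the codes spelled out where they are known."""
    change = CHANGE_TYPES.get(entry.change, entry.change)
    kind = FILE_TYPES.get(entry.file_type, entry.file_type)
    return f"{entry.when:%Y-%m-%d %H:%M:%S} UTC  {change} {kind}  {entry.path}"


SAMPLE = """\
1317061817.291653834 M / /backups/buffy/root/etc/svc/volatile
1317061934.727002843 M F /backups/buffy/var/adm/lastlog
"""

for raw in SAMPLE.splitlines():
    print(describe(parse_diff_line(raw)))
```

Converting the timestamp through float() drops some of the nanosecond digits, which is fine for display but not for exact comparisons.
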
On Mon, Sep 26, 2011 at 1:55 PM, Jesus Cea <jcea at jcea.es> wrote:
> I just upgraded to Solaris 10 Update 10, and one of the improvements
> is "zfs diff".
>
> Using the "birthtime" of the sectors, I would expect very high
> performance. The actual performance doesn't seem better than a
> standard "rdiff", though. Quite disappointing...
>
> Should I disable "atime" to improve "zfs diff" performance? (most data
> doesn't change, but "atime" of most files would change).

atime has nothing to do with it.

How much work zfs diff has to do depends on how much has changed between snapshots.

Nico
--

On Mon, September 26, 2011 14:55, Jesus Cea wrote:
[...]
> real 10m0.272s
> user 0m0.809s
> sys 2m6.693s
> """
>
> 10 minutes to "diff" 7.55 GB is... disappointing.
>
> This machine uses a 2-mirror configurations, and there is no more
> activity going on in the machine. ZPOOL version 29, ZFS version 5.
>
> Am I missing anything?
[...]

Talking about "7.55 GB" is mostly useless as well. If it's a dozen video files then stat()ing them all will be done very quickly by just running find(1). If however the 7.55 GB is made up of 7,550,000 files then going through them would take quite a long time.

How long would it take for (say) rsync to walk two file systems (or snapshot directories) to come up with the same list? Ten minutes may seem like a lot in 'absolute' terms, but if something like rsync takes an hour or two to stat() every file, then it's a big improvement.

So the question is: by what metric are you comparing that you came up with the "disappointing" conclusion? Why is ten minutes disappointing? What would /not/ be disappointing to you? 8m? 5m? 3.14 seconds?

On 26/09/11 22:29, David Magda wrote:
> Talking about "7.55 GB" is mostly useless as well. If it's a dozen
> video files then stat()ing them all will be done very quickly by
> just running find(1). If however the 7.55 GB is made up of
> 7,550,000 files then going through them would take quite a long
> time.

Point taken, although "zfs diff" time is (or should be) proportional to the changes, not to the number of files.

> How long would it take for (say) rsync to walk two file systems
> (or snapshot directories) to come up with the same list? Ten
> minutes may seem like a lot in 'absolute' terms, but if something
> like rsync takes an hour or two to stat() every file, then it's a
> big improvement.

"rsync" takes a bit less than 7 minutes. So "zfs diff" is actually slower!

> So the question is: by what metric are you comparing that you came
> up with the "disappointing" conclusion? Why is ten minutes
> disappointing? What would /not/ be disappointing to you? 8m? 5m?
> 3.14 seconds?

If I change 10 files in a dataset with a trillion files, I would expect it to take less than a couple of seconds. Given that the tree walk is pruned by "birthdate" age, I actually think this is reasonable (you skip over entire on-disk branches if there are no changes under them).

--
Jesus Cea Avion

On 26/09/11 21:31, Nico Williams wrote:
> atime has nothing to do with it.
>
> How much work zfs diff has to do depends on how much has changed
> between snapshots.

That is what I thought, but look at my example: fewer than 20 changes and more than 10 minutes to locate them...

Technically, if a dataset has "atime" active, the filesystem diverges from its last snapshot even if the data is not changed.

I just took a snapshot right after another one, with no changes in between. "zfs diff" finishes immediately and reports no changes, as it should. But doing a "zfs diff" of "/usr/local/" takes a lot of time, even without changes. I am really starting to think that "atime" is actually playing a role.

In my particular situation, I am doing "zfs diff" between snapshots taken on the receive side of an "rdiff --inplace". I would say that "rdiff" is modifying the "atime" of ALL files in the receiving dataset, and although that is not shown in "zfs diff", it is "breaking" the tree pruning by "birthdate" age.

I just disabled "atime" on this particular dataset and ran a new "rdiff --inplace" onto it (as the destination). After that, "zfs diff" takes 12 seconds instead of the initial 10 minutes. A big improvement.

So, yes, "atime" seems to be harmful. Badly.

PS: I saw something similar with "zfs send" too.

--
Jesus Cea Avion

On 26/09/11 22:54, Jesus Cea wrote:
> On 26/09/11 22:29, David Magda wrote:
>> Talking about "7.55 GB" is mostly useless as well. If it's a
>> dozen video files then stat()ing them all will be done very
>> quickly by just running find(1). If however the 7.55 GB is made
>> up of 7,550,000 files then going through them would take quite a
>> long time.
>
> Point taken, although "zfs diff" time is (should) proportional to
> changes, not to number of files.

For the record, the "used" column in "zfs list" for these snapshots, which gives the "difference" between adjacent snapshots, is around 30 MB (with "atime" active).

10 minutes to dig through 30 MB...

--
Jesus Cea Avion

On Mon, 26 Sep 2011, Jesus Cea wrote:
>
> "rsync" takes a bit less than 7 minutes. So "zfs diff" is actually
> slower!

It is important to define what is meant by "rsync". For example, a common rsync operating mode is to simply compare whole-file timestamps and file size in order to determine that a file has changed. However, zfs surely works at the zfs block level, so it does more work due to files being comprised of multiple blocks.

Rsync may be executed in a mode (--checksum) by which it compares blocks of data. This mode would be considerably slower.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

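To make the two comparison modes described above concrete, here is a rough Python sketch that compares two directory trees (for example two snapshot directories) either by size and modification time or by hashing file contents. It is only an illustration of the idea, not rsync's implementation: the function names are invented for this example, and SHA-256 is an arbitrary stand-in for whatever checksum rsync actually uses.

```python
import hashlib
import os


def walk_files(root):
    """Yield (relative_path, full_path) for every regular file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full):
                yield os.path.relpath(full, root), full


def quick_signature(path):
    """Cheap check: one stat() per file, no data is read."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def checksum_signature(path, chunk_size=1 << 20):
    """Expensive check: reads every byte of the file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(old_root, new_root, signature=quick_signature):
    """Return relative paths that were added, removed, or whose signature differs."""
    old = {rel: signature(full) for rel, full in walk_files(old_root)}
    new = {rel: signature(full) for rel, full in walk_files(new_root)}
    return sorted(rel for rel in old.keys() | new.keys()
                  if old.get(rel) != new.get(rel))
```

Calling changed_files(a, b) costs one stat() per file on each side, while changed_files(a, b, checksum_signature) has to read all the data, which is why the second mode is expected to be much slower. Either way the cost grows with the number (or size) of files rather than with the number of changes.
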
On 09/26/11 12:31, Nico Williams wrote:
> On Mon, Sep 26, 2011 at 1:55 PM, Jesus Cea <jcea at jcea.es> wrote:
>> Should I disable "atime" to improve "zfs diff" performance? (most data
>> doesn't change, but "atime" of most files would change).
>
> atime has nothing to do with it.

Based on my experience with time-based snapshots and atime on a server which had cron-driven file tree walks running every night, I can easily believe atime has a lot to do with it: the atime updates associated with a tree walk mean that much of a filesystem's metadata will diverge between the writeable filesystem and its last snapshot.

- Bill

Ah yes, of course. I'd misread your original post. Yes, disabling atime updates will reduce the number of superfluous transactions. It's *all* transactions that count, not just the ones the app explicitly caused, and atime implies lots of transactions.

Nico
--

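The effect discussed in this part of the thread can be illustrated with a toy model. The Python sketch below is NOT how ZFS is implemented; it only assumes a copy-on-write tree in which every block records the transaction group (txg) that last wrote it, and in which rewriting a leaf also rewrites all of its ancestors. All names (Block, build_tree, rewrite_leaf, count_visited) are invented for this illustration.

```python
import itertools


class Block:
    """A node in a toy copy-on-write tree; birth is the txg that last wrote it."""

    def __init__(self, birth, children=None):
        self.birth = birth
        self.children = children or []


def build_tree(depth, fanout, birth=1):
    """Build a full tree in which every block was written in txg `birth`."""
    if depth == 0:
        return Block(birth)
    return Block(birth, [build_tree(depth - 1, fanout, birth)
                         for _ in range(fanout)])


def rewrite_leaf(root, path, txg):
    """Model a copy-on-write update: the leaf and every ancestor get a new birth."""
    node = root
    node.birth = txg
    for index in path:
        node = node.children[index]
        node.birth = txg


def count_visited(node, since_txg):
    """Count blocks a diff must read; subtrees born at or before since_txg are skipped."""
    if node.birth <= since_txg:
        return 0
    return 1 + sum(count_visited(child, since_txg) for child in node.children)


DEPTH, FANOUT, SNAPSHOT_TXG = 3, 10, 1   # 1000 leaves, 1111 blocks in total

unchanged = build_tree(DEPTH, FANOUT)
print("no changes:       ", count_visited(unchanged, SNAPSHOT_TXG))

one_change = build_tree(DEPTH, FANOUT)
rewrite_leaf(one_change, (0, 0, 0), txg=2)
print("one file changed: ", count_visited(one_change, SNAPSHOT_TXG))

atime_everywhere = build_tree(DEPTH, FANOUT)
for leaf_path in itertools.product(range(FANOUT), repeat=DEPTH):
    rewrite_leaf(atime_everywhere, leaf_path, txg=2)   # e.g. an atime update
print("every leaf touched:", count_visited(atime_everywhere, SNAPSHOT_TXG))
```

In this model a single changed leaf makes the walk read one block per tree level, while a metadata update on every leaf (which is what an atime update after a full tree walk amounts to here) leaves nothing to prune, so the walk reads the whole tree even though no file data changed.
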
On 09/27/11 07:55 AM, Jesus Cea wrote:
> I just upgraded to Solaris 10 Update 10, and one of the improvements
> is "zfs diff".
>
> Using the "birthtime" of the sectors, I would expect very high
> performance. The actual performance doesn't seem better than a
> standard "rdiff", though. Quite disappointing...
>
> Should I disable "atime" to improve "zfs diff" performance? (most data
> doesn't change, but "atime" of most files would change).
>

I tend to disable atime in the root filesystem and only enable it on a filesystem if required. So far, it has never been required on any of the systems I look after!

--
Ian.

On 27 September, 2011 - Ian Collins sent me these 0,8K bytes:
> On 09/27/11 07:55 AM, Jesus Cea wrote:
>> I just upgraded to Solaris 10 Update 10, and one of the improvements
>> is "zfs diff".
>>
>> Using the "birthtime" of the sectors, I would expect very high
>> performance. The actual performance doesn't seem better than a
>> standard "rdiff", though. Quite disappointing...
>>
>> Should I disable "atime" to improve "zfs diff" performance? (most data
>> doesn't change, but "atime" of most files would change).
>>
> I tend to disable atime in the root filesystem and only enable it on a
> filesystem if required. So far, it has never been required on any of
> the systems I look after!

I've found it useful time after time: do things and then check atime to see which files were looked at.
(Yes, I know about truss and dtrace.)

/Tomas
--
Tomas Forsman, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se

On 09/27/11 10:59 AM, Tomas Forsman wrote:
> On 27 September, 2011 - Ian Collins sent me these 0,8K bytes:
>
>> On 09/27/11 07:55 AM, Jesus Cea wrote:
>>> I just upgraded to Solaris 10 Update 10, and one of the improvements
>>> is "zfs diff".
>>>
>>> Using the "birthtime" of the sectors, I would expect very high
>>> performance. The actual performance doesn't seem better than a
>>> standard "rdiff", though. Quite disappointing...
>>>
>>> Should I disable "atime" to improve "zfs diff" performance? (most data
>>> doesn't change, but "atime" of most files would change).
>>>
>> I tend to disable atime in the root filesystem and only enable it on a
>> filesystem if required. So far, it has never been required on any of
>> the systems I look after!
> I've found it useful time after time.. do things and then check atime
> to see whatever files it looked at..
> (yes, I know about truss and dtrace)
>

It can be useful, but unless you really want the functionality, it generates a lot of unnecessary writes.

--
Ian.