Hi,

We have seen a huge performance drop in 1.6.3, due to the checksum being
enabled by default. I looked at the algorithm being used, and it is actually
a CRC32, which is a very strong algorithm for detecting all sorts of
problems, such as single-bit errors, swapped bytes, and missing bytes.

I've been experimenting with using a simple XOR algorithm, and I've been
able to recover most of the lost performance. This algorithm will detect
corrupted bytes and words. It will not detect swapped-byte errors, but I
think those are pretty rare. It will not detect missing bytes, but I suspect
that other things in Lustre or LNET will detect this problem. It will also
not detect two errors that offset each other, such as the same single-bit
error in two words that are a multiple of 4 bytes apart.

Should we consider using a more efficient checksum algorithm, in order to
regain performance? Should the algorithm be configurable?

-Roger
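Roger's patch is not posted here, but a minimal sketch of a word-wise XOR
checksum (the function name and details below are my own, purely
illustrative) shows both the speed appeal and the blind spot he describes:
two identical single-bit errors a multiple of 4 bytes apart cancel out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word-wise XOR checksum: fold the buffer 32 bits at a
 * time.  One XOR per word is about as cheap as a checksum can get,
 * which is why it recovers most of the CRC32 overhead. */
static uint32_t xor_checksum(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 4 <= len; i += 4) {
		uint32_t w;

		memcpy(&w, p + i, 4);	/* avoid unaligned loads */
		sum ^= w;
	}
	for (; i < len; i++)		/* fold in any trailing bytes */
		sum ^= (uint32_t)p[i] << (8 * (i % 4));
	return sum;
}
```

The weakness is easy to demonstrate: flip the same bit in two words that are
4 bytes apart and the checksum is unchanged, so the corruption goes
undetected.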
Roger,

We've been running with checksums enabled in our release for some time now
and have seen the exact same impact on performance. In our case single-node
performance is impacted, but aggregate FS performance remains good when
enough clients are involved. We are tracking the performance issue under bug
13805 and would love any input/insight you might have on the issue.

Bug 13805 <https://bugzilla.lustre.org/show_bug.cgi?id=13805>

My view on the issue is that it is madness to run with checksums disabled,
and we need to investigate more efficient checksum algorithms. The current
crc32 algorithm may be too heavyweight, but the simple XOR algorithm you
propose I fear is not strong enough. I've seen too many cases now of various
network components corrupting data in all sorts of interesting ways.
Happily, we have a lot of other choices for algorithms to investigate.

If you have the time, I'd encourage you to investigate an assortment of
algorithms and see which work best. Making this a runtime option via proc I
think is also an excellent idea.

--
Thanks,
Brian

> Hi,
>
> We have seen a huge performance drop in 1.6.3, due to the checksum being
> enabled by default. I looked at the algorithm being used, and it is
> actually a CRC32, which is a very strong algorithm for detecting all sorts
> of problems, such as single bit errors, swapped bytes, and missing bytes.
>
> I've been experimenting with using a simple XOR algorithm. I've been able
> to recover most of the lost performance. This algorithm will detect
> corrupted bytes and words. This algorithm will not detect swapped bytes
> errors, but I think that these are pretty rare. This algorithm will not
> detect missing bytes, but I suspect that other things in Lustre or LNET
> will detect this problem. This algorithm will not detect two errors that
> offset each other, such as a single bit error in two words that are a
> multiple of 4 bytes apart.
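For reference, the cost Brian calls "heavyweight" comes from CRC32's
per-byte table lookup and shift. A standard table-driven CRC-32 (the
reflected 0xEDB88320 polynomial used by zlib and Ethernet; whether 1.6.3
uses exactly this variant is an assumption on my part) looks like this:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];

/* Build the 256-entry lookup table for the reflected 0xEDB88320 poly. */
static void crc32_init(void)
{
	for (uint32_t i = 0; i < 256; i++) {
		uint32_t c = i;

		for (int k = 0; k < 8; k++)
			c = (c & 1) ? 0xEDB88320u ^ (c >> 1) : c >> 1;
		crc32_table[i] = c;
	}
}

/* One table lookup, XOR and shift per byte -- cheap, but still several
 * times the work of a word-wise XOR over the same buffer. */
static uint32_t crc32(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t c = 0xFFFFFFFFu;

	while (len--)
		c = crc32_table[(c ^ *p++) & 0xFF] ^ (c >> 8);
	return c ^ 0xFFFFFFFFu;
}
```

The payoff for that extra work is detection of all single-bit errors, all
two-bit errors within the polynomial's coverage, and any burst error up to
32 bits, none of which a plain XOR guarantees.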
>
> Should we consider using a more efficient checksum algorithm, in order to
> regain performance? Should the algorithm be configurable?
>
> -Roger
Brian,

How does the crc mechanism work? I assume that the crc is done at the
client; does the server verify the crc? Also, are the crcs stored on disk?

thanks,
paul

Brian Behlendorf wrote:
> Roger,
>
> We've been running with checksums enabled in our release for some time now
> and have seen the exact same impact on performance. In our case single node
> performance is impacted but aggregate FS performance remains good when enough
> clients are involved. We are tracking the performance issue under bug 13805
> and would love any input/insight you might have on the issue.
>
> Bug 13805 <https://bugzilla.lustre.org/show_bug.cgi?id=13805>
>
> My view on the issue is that it is madness to run with checksums disabled
> and we need to investigate more efficient checksum algorithms. The current
> crc32 algorithm may be too heavyweight but the simple XOR algorithm you
> propose I fear is not strong enough. I've seen too many cases now of various
> network components corrupting data in all sorts of interesting ways.
> Happily we have a lot of other choices for algorithms to investigate.
>
> If you have the time I'd encourage you to investigate an assortment of
> algorithms and see which work best. Making this a runtime option via
> proc I think is also an excellent idea.
>
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at clusterfs.com
> https://mail.clusterfs.com/mailman/listinfo/lustre-devel
On Tue, 6 Nov 2007, Brian Behlendorf wrote:

> My view on the issue is that it is madness to run with checksums disabled
> and we need to investigate more efficient checksum algorithms. The current
> crc32 algorithm may be too heavyweight but the simple XOR algorithm you
> propose I fear is not strong enough. I've seen too many cases now of various
> network components corrupting data in all sorts of interesting ways.
> Happily we have a lot of other choices for algorithms to investigate.
>
> If you have the time I'd encourage you to investigate an assortment of
> algorithms and see which work best. Making this a runtime option via
> proc I think is also an excellent idea.

I'd strongly recommend looking at which algorithms are used for checksumming
in ZFS; they have done rather extensive investigations on the subject. If I
remember correctly, ZFS uses fletcher by default with good performance, with
sha256 as an option for those who want it.

Anyhow, given the nowadays tight bond between CFS and Sun, finding the
relevant info on the subject should be a no-brainer :)

/Nikke
--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
 Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      | nikke at hpc2n.umu.se
---------------------------------------------------------------------------
 OH NO, my wife burned the rice crispies--AGAIN!!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
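To make the Fletcher suggestion concrete: a Fletcher checksum keeps two
running sums, and the second sum makes the result position-dependent, so it
catches the swapped- and missing-byte cases a plain XOR misses while still
using only adds (no per-byte table lookup). Below is a sketch of classic
Fletcher-32 over 16-bit words; note that ZFS's fletcher2/fletcher4 operate
on 64-bit lanes and differ in detail, so this is an illustration of the
family, not ZFS's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fletcher-32 over 16-bit words.  `lo` is a simple sum of the data;
 * `hi` is a sum of the running `lo` values, which weights each word by
 * its position -- that is what makes reordering detectable. */
static uint32_t fletcher32(const uint16_t *data, size_t words)
{
	uint32_t lo = 0, hi = 0;

	while (words) {
		/* 359 words is the longest run before the 32-bit
		 * accumulators can overflow between reductions. */
		size_t n = words > 359 ? 359 : words;

		words -= n;
		while (n--) {
			lo += *data++;
			hi += lo;
		}
		lo = (lo & 0xffff) + (lo >> 16);
		hi = (hi & 0xffff) + (hi >> 16);
	}
	lo = (lo & 0xffff) + (lo >> 16);
	hi = (hi & 0xffff) + (hi >> 16);
	return (hi << 16) | lo;
}
```

Swapping two words changes `hi` even though `lo` (and an XOR checksum) would
be unchanged, which is exactly the class of reordering error the thread is
worried about.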
On Nov 06, 2007 11:59 -0500, RS RS wrote:
> We have seen a huge performance drop in 1.6.3, due to the checksum
> being enabled by default. I looked at the algorithm being used, and it is
> actually a CRC32, which is a very strong algorithm for detecting all sorts
> of problems, such as single bit errors, swapped bytes, and missing bytes.
>
> I've been experimenting with using a simple XOR algorithm. I've
> been able to recover most of the lost performance. This algorithm
> will detect corrupted bytes and words. This algorithm will not
> detect swapped bytes errors, but I think that these are pretty rare.
> This algorithm will not detect missing bytes, but I suspect that other
> things in Lustre or LNET will detect this problem. This algorithm will
> not detect two errors that offset each other, such as a single bit error
> in two words that are a multiple of 4 bytes apart.

Note that it is possible to disable checksums and get the previous
behaviour back at runtime with (on all clients that should skip checksums):

    for C in /proc/fs/lustre/osc/*/checksums; do
        echo 0 > $C
    done

in the Lustre configuration:

    mgs> lctl conf_param testfs-OST0001.osc.checksums=0

or at compile time with "configure --disable-checksum ...".

Cheers, Andreas
--
Andreas Dilger
Sr. Software Engineer, Lustre Group
Sun Microsystems of Canada, Inc.