sss@cray.com
2007-Jan-08 09:13 UTC
[Lustre-devel] [Bug 10744] ior surveys for Catamount at ORNL
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=10744

This bug reports performance problems: poor scaling on reads, and erratic and poor performance on shared-file I/O. Alex and Peter Braam have posted some comments, but there have been no fixes. I don't think this bug should be closed until the problems have been resolved.
sss@cray.com
2007-Jan-08 18:51 UTC
[Lustre-devel] [Bug 10744] ior surveys for Catamount at ORNL
I think we need to know how the DDNs were misconfigured. It is not very helpful to say that the performance problems were caused by misconfiguration without saying how they *should* be configured. (It also seems pretty weak to point to misconfiguration six months after the problem report.) Are you saying that if we reconfigure the DDNs and re-run these tests, then the performance will all scale nicely? (I'm thinking of moving to Missouri.)
braam@clusterfs.com
2007-Jan-09 10:02 UTC
[Lustre-devel] [Bug 10744] ior surveys for Catamount at ORNL
The precise settings for DDN 8500 arrays were published recently on the lustre-devel mailing list (with graphs). Also, our wiki was updated at the same time:

https://mail.clusterfs.com/wikis/lustre/LustreDdnTuning

For the DDN 9500 we are still discovering the settings and waiting on DDN.
nic@cray.com
2007-Jan-09 10:15 UTC
[Lustre-devel] [Bug 10744] ior surveys for Catamount at ORNL
(In reply to comment #30)
> The precise settings for DDN 8500 arrays were published recently on the
> lustre-devel mailing list (with graphs). Also our wiki was updated at the
> same time:
> https://mail.clusterfs.com/wikis/lustre/LustreDdnTuning

As far as I can tell, those setting recommendations were made on the DDN 8500s used at ORNL for these surveys. I fail to see a magic bullet in that wiki page that would solve the performance issues we saw in this bug. As I recall, we were told it was the reservation-based allocator that would solve the performance problems here. We've not seen that yet, nor have we verified the claimed performance improvements, so I would want to leave this bug open as well until we can verify any "fix".
scjody@clusterfs.com
2007-Jan-17 14:56 UTC
[Lustre-devel] [Bug 10744] ior surveys for Catamount at ORNL
Created an attachment (id=9365): survey of various software RAID 0 methods
https://bugzilla.lustre.org/attachment.cgi?id=9365&action=view

Here is a survey of various software RAID 0 methods. Since current versions of Lustre typically perform 1 MB I/Os, I only plotted (and in most cases only surveyed) that size. As you can see, the results show modest improvements for reads at high region counts, but not much improvement at lower counts. There is not much performance difference between LVM (DM) and MD RAID: LVM is faster at low region counts, but not by much. Performance is also largely insensitive to chunk size.

Unfortunately, writes are a different story. Both subsystems show consistently worse write performance, even lower than with only one LUN. Again, larger chunk sizes don't improve things much. I will discuss this issue with our IO specialists and see if there is a way to improve write performance.
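For anyone wanting to run a similar comparison, the sketch below shows roughly how such a survey can be driven: several processes each issue sequential 1 MB direct-I/O writes into disjoint regions of the device, and the region count is swept upwards. This is illustrative only and is not the tool used for the attached survey; the device path, region counts, and per-region transfer count are hypothetical, and writing to a raw device is destructive.

#!/usr/bin/env python3
# Illustrative sketch (not the actual survey tool): time 1 MB sequential
# direct-I/O writes to a block device, with several processes each driving
# its own disjoint region. WARNING: this overwrites data on DEV.
import mmap
import os
import time
from multiprocessing import Process

DEV = "/dev/md0"          # hypothetical RAID 0 device under test
IO_SIZE = 1024 * 1024     # 1 MB, the transfer size Lustre typically issues
IOS_PER_REGION = 1024     # 1 GB written per region (hypothetical)

def drive_region(region):
    """Write IOS_PER_REGION sequential 1 MB transfers within one region."""
    fd = os.open(DEV, os.O_WRONLY | os.O_DIRECT)
    buf = mmap.mmap(-1, IO_SIZE)          # page-aligned buffer for O_DIRECT
    base = region * IOS_PER_REGION * IO_SIZE
    for i in range(IOS_PER_REGION):
        os.lseek(fd, base + i * IO_SIZE, os.SEEK_SET)
        os.write(fd, buf)
    os.close(fd)

def survey(regions):
    """Run one data point: 'regions' concurrent writers, report MB/s."""
    start = time.time()
    procs = [Process(target=drive_region, args=(r,)) for r in range(regions)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    elapsed = time.time() - start
    mb_total = regions * IOS_PER_REGION
    print(f"{regions:3d} regions: {mb_total / elapsed:8.1f} MB/s")

if __name__ == "__main__":
    for regions in (1, 2, 4, 8, 16):      # region counts to sweep
        survey(regions)

A corresponding read pass (O_RDONLY with os.read into the same buffer) would give the read side of the comparison.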