Does Lustre increase random access performance? I would like to know this because I have a large random access file (a hash table). I have striped this file across multiple OSTs. The file is 24 gigabytes, and the stripe size was 1 GB across 10 OSTs. I also tried a stripe size of 100 MB. Neither stripe size seemed to improve random access performance. Am I doing something wrong?

Hello!

On Apr 6, 2009, at 12:16 PM, sethpn at gmail.com wrote:
> Does Lustre increase random access performance? I would like to know
> this because I have a large random access file (a hash table). I have
> striped this file across multiple OSTs. The file is 24 gigabytes, and
> the stripe size was 1 GB across 10 OSTs. I also tried a stripe size
> of 100 MB. Neither stripe size seemed to improve random
> access performance. Am I doing something wrong?

If you are seeing significantly less performance than what you can get from a single disk times 10, that is one matter. If you are at around that single-disk-times-10 level, you are simply disk bound and there is nothing we can do about it. In a sense, Lustre just works as a huge RAID0 over your network, so the more spindles you throw at it, the better.

If the file eventually fits into the client cache and there is no parallel write activity to the file, you might want to increase the readahead size from the default 40 MB so that the file is cached in memory more quickly.

Also, random write performance is somewhat improved because of all the caching that happens before the eventual flush, but in the end we cannot go faster than the sum of the underlying disk devices (which tend to perform pretty weakly on random I/O).

Bye,
Oleg
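
To make the RAID0 comparison concrete, here is a rough sketch of how a byte offset maps to a stripe under plain round-robin striping. It only models the layout arithmetic; the function name `stripe_index` is made up for illustration, and the sizes assume "24 gigabytes" and "1 GB" mean binary gigabytes.

```python
from collections import Counter
import random

def stripe_index(offset, stripe_size, stripe_count):
    """Index (0 .. stripe_count - 1) of the stripe object that holds byte
    `offset` under simple round-robin (RAID0-style) striping."""
    return (offset // stripe_size) % stripe_count

GIB = 1024 ** 3
FILE_SIZE = 24 * GIB      # size from the original question (assumed GiB)
STRIPE_SIZE = 1 * GIB     # first stripe size that was tried
STRIPE_COUNT = 10         # number of OSTs the file is striped over

# Spread 10,000 uniformly random offsets over the file and count how many
# land on each stripe object.
rng = random.Random(0)
offsets = [rng.randrange(FILE_SIZE) for _ in range(10_000)]
hits = Counter(stripe_index(o, STRIPE_SIZE, STRIPE_COUNT) for o in offsets)

for index in range(STRIPE_COUNT):
    print(f"stripe object {index}: {hits[index]} accesses")
```

With these numbers the file consists of 24 chunks of 1 GiB dealt out over 10 objects, so four objects hold three chunks and six hold two. Uniformly random accesses are spread over all ten objects either way, which is consistent with the point above: the stripe size changes where each access lands, not how fast the disk behind it can seek.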

sethpn at gmail.com wrote:
> Does Lustre increase random access performance? I would like to know
> this because I have a large random access file (a hash table). I have
> striped this file across multiple OSTs. The file is 24 gigabytes, and
> the stripe size was 1 GB across 10 OSTs. I also tried a stripe size
> of 100 MB. Neither stripe size seemed to improve random
> access performance. Am I doing something wrong?

I may be wrong, but I would think the performance of a single random access would still be mostly limited by disk seek times, etc. Lustre should do better with multiple random queries, since they should be spread across multiple disk spindles, so there is less chance of two queries contending for the same spindle. But there is nothing we do that will make a single disk access any faster, afaik. If you are jumping about randomly, it's going to be up to the disk heads. Changing the stripe size won't do anything here. If you are doing multiple random queries, increasing the number of OSTs would spread the load out. This is a case where a future feature (OST cache) might help.

cliffw
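
As a sketch of the "multiple random queries" case, the snippet below keeps several reads in flight at once, so that independent lookups have a chance to land on different spindles. It is only an illustration under assumptions: `read_records`, `RECORD_SIZE`, and the worker count are hypothetical, and the demo runs against a small temporary file rather than a real striped file.

```python
import os
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor

RECORD_SIZE = 4096  # hypothetical size of one hash-table bucket

def read_records(path, offsets, workers=16):
    """Read RECORD_SIZE bytes at each offset, with up to `workers` reads
    in flight at a time. Results come back in the order of `offsets`."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # os.pread takes an explicit offset and does not move the file
            # position, so all threads can share one descriptor.
            return list(pool.map(lambda off: os.pread(fd, RECORD_SIZE, off), offsets))
    finally:
        os.close(fd)

# Demo on a small temporary file of 256 records; on a real system `path`
# would point at the striped file instead.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    for i in range(256):
        tmp.write(bytes([i]) * RECORD_SIZE)
    demo_path = tmp.name

rng = random.Random(0)
demo_offsets = [rng.randrange(256) * RECORD_SIZE for _ in range(1000)]
results = read_records(demo_path, demo_offsets)
os.remove(demo_path)
```

Whether this helps depends on the application being able to issue lookups independently. A single chain of dependent lookups still pays one seek per access, as described above.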