Hi all,

The numbers I've heard say that the IOPS for a raidzN volume should be about the IOPS of the slowest drive in the set. While this might sound like a good starting point, I tend to disagree. I've been doing some testing on raidz2 volumes of various sizes and with a similarly varied number of VDEVs. It seems, with iozone, that the number of IOPS per drive is rather high, up to 250 for these 7k2 drives, even with an 8-drive raidz2 VDEV. The testing has not used a high number of threads (yet), but still, it looks like raidzN performance should be quite decent for most systems.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
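(For reference, a test along these lines would typically be set up roughly as follows; the pool name, device names, record size and file size below are illustrative placeholders, not the exact values used in the tests above:)

    # create an 8-drive raidz2 pool (device names are examples only)
    zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 c0t6d0 c0t7d0

    # sequential write plus random read/write, 8 KB records, results in ops/sec (-O)
    iozone -i 0 -i 2 -r 8k -s 4g -O -f /tank/iozone.tmp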
On Mon, Dec 6 at 23:22, Roy Sigurd Karlsbakk wrote:

> Hi all,
>
> The numbers I've heard say that the IOPS for a raidzN volume should be
> about the IOPS of the slowest drive in the set. While this might sound
> like a good starting point, I tend to disagree. I've been doing some
> testing on raidz2 volumes of various sizes and with a similarly varied
> number of VDEVs. It seems, with iozone, that the number of IOPS per
> drive is rather high, up to 250 for these 7k2 drives, even with an
> 8-drive raidz2 VDEV. The testing has not used a high number of threads
> (yet), but still, it looks like raidzN performance should be quite
> decent for most systems.

I think that statement is meant to describe random IO. I doubt anyone is getting 250 IOPS out of a 7200rpm drive, unless it has been significantly short-stroked or is using a very deep SCSI queue depth.

Sequential IO should be fast in any configuration with lots of drives.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
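(As a rough sanity check, using typical catalogue figures rather than measurements of the drives discussed above: average rotational latency at 7200 rpm is 60/7200/2 = ~4.2 ms, and a typical average seek for a 7.2k drive is ~8.5 ms, so at queue depth 1 a single drive tops out around 1000 / (4.2 + 8.5) = ~80 random IOPS. A result of 250 IOPS per drive therefore points at caching, write aggregation or deep queues rather than at the platters.)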
> From: Roy Sigurd Karlsbakk [mailto:roy at karlsbakk.net]
>
> The numbers I've heard say that the IOPS for a raidzN volume should be
> about the IOPS of the slowest drive in the set. While this might sound
> like a good starting point, I tend to disagree. I've been doing some
> testing on raidz2 volumes of various sizes and with a similarly varied
> number of VDEVs. It seems, with iozone, that the number of IOPS per
> drive is rather high, up to 250 for these 7k2 drives, even with an
> 8-drive raidz2 VDEV. The testing has not used a high number of threads
> (yet), but still, it looks like raidzN performance should be quite
> decent for most systems.

Bear a few things in mind:

IOPS is not IOPS. When you perform lots of writes, ZFS accelerates them by aggregating them into sequential disk blocks, and therefore greatly exceeds the true random IOPS limit of the drive. For a single disk, using iozone, I measured around 550 to 600 write IOPS on 15krpm drives. You have to compare apples to apples; don't compare miles to kilometers.

raidz: seek limited, or bandwidth limited? If you're running a single thread, which performs a small random read or write and blocks until that's done before issuing the next command, then you're going to get the worst-case performance of any one disk in the set. But if you're allowing the system to queue up commands, then while 6 disks are sitting idle waiting for disk 7, those other 6 disks can already begin on the next task. In either case, your performance is going to be limited by either the worst case or the average case of a single drive. And thanks to ZFS write acceleration, you'll see approximately 10x higher write IOPS. But there's nothing you can do to accelerate random reads, aside from command queueing.

If you're performing large sequential writes or reads, then you're going to get stripe-like performance, with many disks all working simultaneously as a team.

Long story short, the performance you measure will vary enormously with the type of workload you're generating. For some workloads, absolutely, you WILL see the performance of just a single disk. For other workloads, you'll scale right up to the maximum number of disks, getting N times the performance of a single disk.
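(One way to see the difference between the blocking single-thread case and the queued case described above is to vary the iozone thread count while watching per-disk activity; the pool, file names and sizes below are placeholders:)

    # one thread: random reads see roughly single-disk performance
    iozone -i 0 -i 2 -r 8k -s 2g -t 1 -O -F /tank/f1

    # several threads: the other disks can work on queued requests in parallel
    iozone -i 0 -i 2 -r 8k -s 2g -t 4 -O -F /tank/f1 /tank/f2 /tank/f3 /tank/f4

    # in another terminal, watch per-disk IOPS while the tests run
    zpool iostat -v tank 1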
> Bear a few things in mind:
>
> IOPS is not IOPS.
<snip/>

I am totally aware of these differences, but it seems some people think raidz is nonsense unless you don't need speed at all. My testing shows (so far) that the speed is quite good, far better than single drives. Also, as Eric said, those speeds are for random I/O. I doubt there is very much out there that is truly random I/O except perhaps databases, but then, I would never use RAID5/raidz for a DB unless at gunpoint.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
On Dec 7, 2010, at 12:46 PM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:

>> Bear a few things in mind:
>>
>> IOPS is not IOPS.
> <snip/>
>
> I am totally aware of these differences, but it seems some people think
> raidz is nonsense unless you don't need speed at all. My testing shows
> (so far) that the speed is quite good, far better than single drives.
> Also, as Eric said, those speeds are for random I/O. I doubt there is
> very much out there that is truly random I/O except perhaps databases,
> but then, I would never use RAID5/raidz for a DB unless at gunpoint.

Well, besides databases there are VM datastores, busy email servers, busy LDAP servers, busy web servers, and I'm sure the list goes on and on.

I'm sure it is much harder to list servers whose IO is truly sequential than ones whose IO is random. This is especially true when you have thousands of users hitting them.

-Ross
> From: Roy Sigurd Karlsbakk [mailto:roy at karlsbakk.net]
>
> > Bear a few things in mind:
> >
> > IOPS is not IOPS.
> <snip/>
>
> I am totally aware of these differences, but it seems some people think
> raidz is nonsense unless you don't need speed at all. My testing shows
> (so far) that the speed is quite good, far better than single drives.

There is a grain of truth to it. For sequential IO, either reads or writes, raidz will be much faster than a single drive. For random IO, it's more complex. If you're doing random writes, then ZFS will turn them into sequential IO, and hence your raidz will greatly outperform a single drive. If you're doing random reads, you will get the performance of a single drive, at best.

In order to test random reads, you have to configure iozone to use a data set which is much larger than physical RAM. Since iozone will write a big file and then, immediately afterward, start reading it, the whole file will still be in cache unless it is much larger than physical RAM, and you'll get false read results which are unnaturally high.

For this reason, when I'm running an iozone benchmark, I remove as much RAM from the system as possible.
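(For example, on a box with 8 GB of RAM, something like the following keeps the data set well beyond what the cache can hold; sizes and names are illustrative only:)

    # 64 GB file on an 8 GB machine, so reads cannot be served from cache
    iozone -i 0 -i 2 -r 8k -s 64g -O -f /tank/iozone.tmp

(An alternative to physically removing RAM is to cap the ARC, e.g. with "set zfs:zfs_arc_max = 0x20000000" in /etc/system and a reboot, though that changes the system under test as well.)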
> From: Ross Walker [mailto:rswwalker at gmail.com]
>
> Well, besides databases there are VM datastores, busy email servers,
> busy LDAP servers, busy web servers, and I'm sure the list goes on and
> on.
>
> I'm sure it is much harder to list servers that are truly sequential in
> IO than random. This is especially true when you have thousands of
> users hitting it.

It depends on the purpose of your server. For example, I have a ZFS server whose sole purpose is to receive a backup data stream from another machine and then write it to tape. This is a highly sequential operation, and I use raidz.

Some people have video streaming servers. And http/ftp servers with large files. And a fileserver which is the destination for laptop whole-disk backups. And a repository that stores ISO files and RPMs used for OS installs on other machines. And data capture from lab equipment. And a packet sniffer / compliance email/data logger.

And I'm sure the list goes on and on. ;-)
> > I am totally aware of these differences, but it seems some people
> > think raidz is nonsense unless you don't need speed at all. My
> > testing shows (so far) that the speed is quite good, far better than
> > single drives. Also, as Eric said, those speeds are for random I/O.
> > I doubt there is very much out there that is truly random I/O
> > except perhaps databases, but then, I would never use RAID5/raidz
> > for a DB unless at gunpoint.
>
> Well, besides databases there are VM datastores, busy email servers,
> busy LDAP servers, busy web servers, and I'm sure the list goes on and
> on.
>
> I'm sure it is much harder to list servers that are truly sequential
> in IO than random. This is especially true when you have thousands of
> users hitting it.

For busy web servers, I would guess most of the data can be cached, at least over time, and with good amounts of ARC/L2ARC this should remove most of that penalty. A spooling server is another matter; I don't think raidz would be suitable there, although async I/O will streamline at least some of it. For VM datastores, I totally agree.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
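(Adding cache devices for such a mostly-cacheable working set would look roughly like this; the pool and device names are placeholders:)

    # add an SSD as L2ARC to an existing pool
    zpool add tank cache c2t0d0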
On Dec 7, 2010, at 9:49 PM, Edward Ned Harvey <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:

>> From: Ross Walker [mailto:rswwalker at gmail.com]
>>
>> Well, besides databases there are VM datastores, busy email servers,
>> busy LDAP servers, busy web servers, and I'm sure the list goes on
>> and on.
>>
>> I'm sure it is much harder to list servers that are truly sequential
>> in IO than random. This is especially true when you have thousands of
>> users hitting it.
>
> It depends on the purpose of your server. For example, I have a ZFS
> server whose sole purpose is to receive a backup data stream from
> another machine and then write it to tape. This is a highly sequential
> operation, and I use raidz.
>
> Some people have video streaming servers. And http/ftp servers with
> large files. And a fileserver which is the destination for laptop
> whole-disk backups. And a repository that stores ISO files and RPMs
> used for OS installs on other machines. And data capture from lab
> equipment. And a packet sniffer / compliance email/data logger.
>
> And I'm sure the list goes on and on. ;-)

OK, single-stream backup servers are one type, but as soon as you have multiple streams, even for large files, IOPS trumps throughput to a degree; of course, if throughput is very bad then that's no good either.

Knowing your workload is key, or having enough $$ to implement RAID10 everywhere.

-Ross
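(For comparison, the "RAID10 everywhere" option in ZFS terms is a pool of striped mirrors rather than raidz; device names below are placeholders:)

    # raidz2: capacity-efficient, but random-read IOPS roughly that of one disk per vdev
    zpool create tank raidz2 c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0

    # striped mirrors ("RAID10"): half the usable capacity, but random-read IOPS scale with the number of mirror pairs
    zpool create tank mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0 mirror c0t4d0 c0t5d0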
> From: Edward Ned Harvey
> [mailto:opensolarisisdeadlongliveopensolaris at nedharvey.com]
>
> In order to test random reads, you have to configure iozone to use a
> data set which is much larger than physical RAM. Since iozone will
> write a big file and then, immediately afterward, start reading it,
> the whole file will still be in cache unless it is much larger than
> physical RAM, and you'll get false read results which are unnaturally
> high.
>
> For this reason, when I'm running an iozone benchmark, I remove as
> much RAM from the system as possible.

Sorry, there's a better way. This is straight from the mouth of Don Capps, author of iozone:

If you use the -w option, then the test file will be left behind. Then reboot, or unmount and remount. If you then run the read test, without the write test, and again use -w, you will achieve what you are describing.

Example:

    iozone -i 0 -w -r $recsize -s $filesize

Unmount, then remount, and then:

    iozone -i 1 -w -r $recsize -s $filesize
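(On a ZFS test pool the same two-pass procedure might look roughly like the following; exporting and re-importing the pool is one way to drop the cached file data between the passes, and the sizes and names are placeholders:)

    # pass 1: write only (-i 0), keep the test file afterwards (-w)
    iozone -i 0 -w -r 8k -s 32g -f /tank/iozone.tmp

    # drop whatever the ARC has cached for this pool
    zpool export tank
    zpool import tank

    # pass 2: read only (-i 1) against the file left behind by pass 1
    iozone -i 1 -w -r 8k -s 32g -f /tank/iozone.tmp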