Hi all,

I've been following ZFS progress since the first public announcements, and I'm glad it's finally available for public use.

My background is large mail servers on Linux, so I have some related questions regarding ZFS.

As mail serving is mostly a gazillion small random writes, the actual bottleneck is disk seek time, which is around 4 ms even on 15k rpm drives. Does ZFS scheduling have some logic to use stripes efficiently to decrease seek times? Can I expect at least a linear decrease in seek time with the number of spindles?

Can ZFS handle different-size disks in its raid-z?

If I have a mirror with three disks, is a single block saved on all three disks or on two dynamically picked disks?

From what I read, ATA-over-Ethernet technology (www.coraid.com) seems to be a natural partner to ZFS. When can we expect sol10 drivers for AoE?

Thanks for answers,

--
Jure Pečar
http://jure.pecar.org
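As a rough illustration of the arithmetic behind the seek-time question: if every random I/O costs one seek of about 4 ms, a single spindle tops out near 250 operations per second, and the best case for N independent spindles is N times that. The sketch below is back-of-the-envelope only. The function name is hypothetical, and it assumes perfectly even load across disks and ignores rotational latency, caching, and anything ZFS itself does.

```python
def ideal_random_iops(seek_ms: float, spindles: int = 1) -> float:
    """Upper bound on random I/Os per second when each I/O costs one seek.

    Assumption: load is spread perfectly evenly over all spindles and
    nothing but seek time limits throughput.
    """
    if seek_ms <= 0:
        raise ValueError("seek_ms must be positive")
    if spindles < 1:
        raise ValueError("spindles must be at least 1")
    per_disk = 1000.0 / seek_ms  # seeks that fit into one second
    return per_disk * spindles


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} spindle(s): {ideal_random_iops(4.0, n):.0f} IOPS at best")
```

Note that striping raises aggregate throughput in this model; the time for any single seek stays at about 4 ms.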
On Thu, Nov 17, 2005 at 01:00:26PM +0100, Jure Pečar wrote:
>
> Hi all,
>
> I've been following ZFS progress since the first public announcements,
> and I'm glad it's finally available for public use.
>
> My background is large mail servers on Linux, so I have some related
> questions regarding ZFS.
>
> As mail serving is mostly a gazillion small random writes, the actual
> bottleneck is disk seek time, which is around 4 ms even on 15k rpm
> drives. Does ZFS scheduling have some logic to use stripes efficiently
> to decrease seek times? Can I expect at least a linear decrease in seek
> time with the number of spindles?

ZFS is designed to turn random file writes into sequential disk writes; see Bill Moore's response in:

http://www.opensolaris.org/jive/thread.jspa?threadID=3615&tstart=0

> Can ZFS handle different-size disks in its raid-z?

Yes, but only by truncating all of the disks to the smallest disk's size.

> If I have a mirror with three disks, is a single block saved on all
> three disks or on two dynamically picked disks?

It's saved on all three disks; with a three-way mirror, you can survive the loss of up to two disks (or corruption of the same block on up to two disks).

> From what I read, ATA-over-Ethernet technology (www.coraid.com) seems
> to be a natural partner to ZFS. When can we expect sol10 drivers for
> AoE?

I have no idea.

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
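To make the two answers above concrete, here is a minimal sketch of the capacity arithmetic. It assumes single-parity raid-z (roughly one disk's worth of space goes to parity) and ignores metadata and allocation overhead, so the numbers are approximate. The function names are hypothetical and are not part of any ZFS tool.

```python
def raidz_usable_gb(disk_sizes_gb):
    """Approximate usable space of a single-parity raid-z group.

    Every disk is truncated to the smallest disk's size, and about one
    disk's worth of space is consumed by parity (assumption: overhead
    from metadata and padding is ignored).
    """
    if len(disk_sizes_gb) < 2:
        raise ValueError("raid-z needs at least two disks")
    smallest = min(disk_sizes_gb)
    return (len(disk_sizes_gb) - 1) * smallest


def mirror_usable_gb(disk_sizes_gb):
    """Every block is written to all sides, so capacity is one (smallest) disk."""
    if len(disk_sizes_gb) < 2:
        raise ValueError("a mirror needs at least two disks")
    return min(disk_sizes_gb)


def mirror_tolerated_failures(disk_sizes_gb):
    """An n-way mirror keeps the data as long as one side survives."""
    return len(disk_sizes_gb) - 1


if __name__ == "__main__":
    mixed = [250, 300, 400]  # example sizes in GB, chosen for illustration
    print("raid-z usable:", raidz_usable_gb(mixed), "GB")
    print("3-way mirror usable:", mirror_usable_gb([300, 300, 300]), "GB")
    print("3-way mirror survives", mirror_tolerated_failures([300, 300, 300]), "disk losses")
```

With the mixed sizes above, the 300 GB and 400 GB disks each contribute only 250 GB, which is why matching disk sizes wastes the least space.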
On Thu, Nov 17, 2005 at 09:46:24AM -0800, Jonathan Adams wrote:
> > From what I read, ATA-over-Ethernet technology (www.coraid.com) seems
> > to be a natural partner to ZFS. When can we expect sol10 drivers for
> > AoE?
>
> I have no idea.

Around here, we have iSCSI initiator support in S10U1 (as well as the upcoming S11), and we have an almost complete iSCSI target implementation as well.

What advantages does AoE have over iSCSI? From what I can tell, its only possible advantage is that it sits directly on top of Ethernet rather than using TCP (as iSCSI does). 5-10 years ago, I could have understood this argument. But with every network vendor driving TCP performance through the stratosphere, I really don't see the advantage. I do see a disadvantage, though: AoE is not routable and can only work with other machines sharing the same subnet.

So to answer your question, I really don't see the point of ever doing AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris (initiator is done, target is almost done).

--Bill
On Thu, 2005-11-17 at 19:22, Bill Moore wrote:
> So to answer your question, I really don't see the point of ever doing
> AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris
> (initiator is done, target is almost done).

I think it is purely a hardware cost issue. IIRC AoE is much cheaper hardware-wise than iSCSI at this time.

So I suspect it probably wouldn't come from Sun, but if someone in the OpenSolaris community developed AoE drivers and the quality was up to Solaris standards, I'm pretty sure we would be interested in integrating them into Solaris.

--
Darren J Moffat
On Fri, Nov 18, 2005 at 10:40:01AM +0000, Darren J Moffat wrote:
> On Thu, 2005-11-17 at 19:22, Bill Moore wrote:
>
> > So to answer your question, I really don't see the point of ever doing
> > AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris
> > (initiator is done, target is almost done).
>
> I think it is purely a hardware cost issue. IIRC AoE is much cheaper
> hardware-wise than iSCSI at this time.

Huh? If they both use an Ethernet adapter, how is one cheaper than the other? Or am I missing something that's blindingly obvious?

--Bill
Bill Moore wrote:
> On Fri, Nov 18, 2005 at 10:40:01AM +0000, Darren J Moffat wrote:
>
>> On Thu, 2005-11-17 at 19:22, Bill Moore wrote:
>>
>>> So to answer your question, I really don't see the point of ever doing
>>> AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris
>>> (initiator is done, target is almost done).
>>>
>> I think it is purely a hardware cost issue. IIRC AoE is much cheaper
>> hardware-wise than iSCSI at this time.
>>
>
> Huh? If they both use an Ethernet adapter, how is one cheaper than the
> other? Or am I missing something that's blindingly obvious?
>

Most of the time when people talk about iSCSI, they start with TOE hardware costs. Or they talk about the additional load on the CPU and the cost of the hardware needed to offset it.
On Fri, 2005-11-18 at 17:17, Bill Moore wrote:
> On Fri, Nov 18, 2005 at 10:40:01AM +0000, Darren J Moffat wrote:
> > On Thu, 2005-11-17 at 19:22, Bill Moore wrote:
> >
> > > So to answer your question, I really don't see the point of ever doing
> > > AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris
> > > (initiator is done, target is almost done).
> >
> > I think it is purely a hardware cost issue. IIRC AoE is much cheaper
> > hardware-wise than iSCSI at this time.
>
> Huh? If they both use an Ethernet adapter, how is one cheaper than the
> other? Or am I missing something that's blindingly obvious?

Actually, I think it might be me who is missing something obvious.

I was assuming, quite probably incorrectly, that because it was iSCSI, the disks at the other end would be SCSI and thus more expensive than *ATA disks (as they so often are).

--
Darren J Moffat
Darren J Moffat wrote:
> On Fri, 2005-11-18 at 17:17, Bill Moore wrote:
>
>> On Fri, Nov 18, 2005 at 10:40:01AM +0000, Darren J Moffat wrote:
>>
>>> On Thu, 2005-11-17 at 19:22, Bill Moore wrote:
>>>
>>>> So to answer your question, I really don't see the point of ever doing
>>>> AoE, given iSCSI. And Sun is delivering iSCSI support into Solaris
>>>> (initiator is done, target is almost done).
>>>>
>>> I think it is purely a hardware cost issue. IIRC AoE is much cheaper
>>> hardware-wise than iSCSI at this time.
>>>
>> Huh? If they both use an Ethernet adapter, how is one cheaper than the
>> other? Or am I missing something that's blindingly obvious?
>>
>
> Actually, I think it might be me who is missing something obvious.
>
> I was assuming, quite probably incorrectly, that because it was
> iSCSI, the disks at the other end would be SCSI and thus
> more expensive than *ATA disks (as they so often are).

The on-wire protocol doesn't impact the drive type unless you're going straight from the controller to the disk. In many cases you'll see a SCSI controller that has SATA drives on the backend.