Hi,

I'm getting involved in a pre-production test and want to be sure of the
means I'll have to use.

The setup: two SunFire X4150s and one Cisco 3750 Gb switch, with one private
VLAN on the Gb ports of the switch.

One X4150 is going to be the ESX4 (aka vSphere) server: one hardware mirror
of 146 GB and 32 GB of RAM, booting ESX 4 from local disk.

The other is going to be used as a poor man's SAN: 8x 146 GB 15k SAS disks,
8 GB RAM, Solaris 10. The first two disks are a hardware mirror of 146 GB
with Solaris 10 on a UFS filesystem. The other six will be used as a raidz2
ZFS pool of 535 GB, with compression and shareiscsi=on. I'm going to add
CHAP protection soon.

I'm going to put two ZFS volumes on it:

zfs create -V 250G SAN/ESX1
zfs create -V 250G SAN/ESX2

and use them for VMFS. Oh, by the way, I have no VMotion plugin.

In my tests ESX4 seems to work fine with this, but I haven't stressed it
yet ;-)

Therefore, I don't know if 1 Gb full duplex per port will be enough, and I
don't know whether I have to set up some sort of redundant access from ESX
to the SAN, etc.

Is my configuration OK? It's only a preprod install, so I'm able to break
almost everything if necessary.

Thanks for all your answers.

Yours faithfully,
Lycée Alfred Nobel, Clichy sous bois
http://www.lyceenobel.org
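P.S. In case it helps to see it, the pool and the two volumes were created
along these lines (the disk device names here are only illustrative, not
copied from the box):

# zpool create SAN raidz2 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0
# zfs set compression=on SAN
# zfs create -V 250G SAN/ESX1
# zfs create -V 250G SAN/ESX2
# zfs set shareiscsi=on SAN/ESX1
# zfs set shareiscsi=on SAN/ESX2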
On Wed, June 24, 2009 08:42, Philippe Schwarz wrote:

> In my tests ESX4 seems to work fine with this, but I haven't stressed it
> yet ;-)
>
> Therefore, I don't know if 1 Gb full duplex per port will be enough, and I
> don't know whether I have to set up some sort of redundant access from ESX
> to the SAN, etc.
>
> Is my configuration OK? It's only a preprod install, so I'm able to break
> almost everything if necessary.

At least in 3.x, VMware had a limitation of only being able to use one
connection per iSCSI target (even if there were multiple LUNs on it):

http://mail.opensolaris.org/pipermail/zfs-discuss/2009-June/028731.html

Not sure if that's changed in 4.x, so if you're going to have more than one
LUN, then having more than one target may be advantageous. See also:

http://www.vmware.com/files/pdf/iSCSI_design_deploy.pdf

You may want to go to the VMware lists / forums to see what the people there
say as well.

Out of curiosity, any reason why you went with iSCSI and not NFS? There
seems to be some debate on which is better under which circumstances.
> The first two disks are a hardware mirror of 146 GB with Solaris 10 on a
> UFS filesystem. The other six will be used as a raidz2 ZFS pool of 535 GB,
> with compression and shareiscsi=on. I'm going to add CHAP protection soon.

You're not going to get the random read and write performance you need for a
VM backend out of any kind of parity RAID. Just go with 3 sets of mirrors,
unless you're OK with subpar performance (and if you think you are, you
should really reconsider). You might also get significant mileage out of
putting an SSD in and using it for the ZIL.

Here's a good post from Roch's blog about parity vs. mirrored setups:

http://blogs.sun.com/roch/entry/when_to_and_not_to
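To be concrete, what I have in mind is something like this (device names are
made up; the log device would be the optional SSD):

# zpool create SAN mirror c1t2d0 c1t3d0 mirror c1t4d0 c1t5d0 mirror c1t6d0 c1t7d0
# zpool add SAN log c2t0d0

That gets you three striped mirrors for random IO, plus a dedicated slog for
synchronous writes if you add the SSD.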
See this thread for information on load testing for VMware:

http://communities.vmware.com/thread/73745?tstart=0&start=0

Within the thread there are instructions for using iometer to load test your
storage. You should test your solution before going live, and compare what
you get with what you need. Just because striping 3 mirrors *will* give you
more performance than raidz2 doesn't always mean it is the best solution.
Choose the best solution for your use case.

You should have at least two NICs per connection to storage and LAN (4 total
in this simple example), for redundancy if nothing else. Performance-wise,
vSphere can now have multiple SW iSCSI connections to a single LUN.

My testing showed compression increased iSCSI performance by 1.7x, so I like
compression. But again, these are my tests in my situation. Your results may
differ from mine.

Regarding ZIL usage, from what I have read you will only see benefits if you
are using NFS-backed storage, but that the benefit can be significant.
Remove the ZIL for testing to see the maximum benefit you could get. Don't
do this in production!

-Scott
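P.S. By "remove the ZIL" I mean disabling it for the duration of the test.
As far as I know, the usual way (per the Evil Tuning Guide) is a tunable in
/etc/system plus a reboot, e.g.:

set zfs:zil_disable = 1

Take the line back out and reboot to return to normal. Again: testing only,
never production.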
> Within the thread there are instructions for using iometer to load test
> your storage. You should test your solution before going live, and compare
> what you get with what you need. Just because striping 3 mirrors *will*
> give you more performance than raidz2 doesn't always mean it is the best
> solution. Choose the best solution for your use case.

Multiple VM disks that have any kind of load on them will bury a raidz or
raidz2. Out of a 6x raidz2 you are going to get the IOPS and random seek
latency of a single drive (realistically the random seek will probably be
slightly worse, actually). How could that be adequate for a virtual machine
backend? If you set up a raidz2 with 6x 15k drives, for the majority of use
cases you are pretty much throwing your money away. You are going to roll
your own SAN, buy a bunch of 15k drives, use 2-3U of rack space and four (or
more) switch ports, and what you're getting out of it is essentially a
500 GB 15k drive with a high MTTDL and a really huge theoretical transfer
speed for sequential operations (which you won't be able to saturate anyway
because you're delivering over GigE)? For this particular setup I can't
really think of a situation where that would make sense.

> Regarding ZIL usage, from what I have read you will only see benefits if
> you are using NFS-backed storage, but that the benefit can be significant.

Link?
Bottom line with virtual machines is that your IO will be random by
definition, since it all goes into the same pipe. If you want to be able to
scale, go with RAID 1 vdevs. And don't skimp on the memory.

Our current experience hasn't shown a need for an SSD for the ZIL, but it
might be useful for the L2ARC (using iSCSI for VMs, NFS for templates and
ISO images).

Regards,

Erik Ableson
+33.6.80.83.58.28
Sent from my iPhone

On 24 June 2009, at 18:56, milosz <mewash at gmail.com> wrote:

> [...]
milosz wrote:

> Multiple VM disks that have any kind of load on them will bury a raidz or
> raidz2. Out of a 6x raidz2 you are going to get the IOPS and random seek
> latency of a single drive (realistically the random seek will probably be
> slightly worse, actually). How could that be adequate for a virtual
> machine backend? If you set up a raidz2 with 6x 15k drives, for the
> majority of use cases you are pretty much throwing your money away. You
> are going to roll your own SAN, buy a bunch of 15k drives, use 2-3U of
> rack space and four (or more) switch ports, and what you're getting out of
> it is essentially a 500 GB 15k drive with a high MTTDL and a really huge
> theoretical transfer speed for sequential operations (which you won't be
> able to saturate anyway because you're delivering over GigE)? For this
> particular setup I can't really think of a situation where that would
> make sense.

Ouch! Pretty direct answer. That's very interesting, however. Let me focus
on a few more points:

- The hardware can't really be extended any more. No budget ;-(
- The VMs will mostly be low-IO systems:
  -- WS2003 with Trend OfficeScan, WSUS (for 300 XP clients) and RDP
  -- Solaris 10 with SRSS 4.2 (Sun Ray server)

(File and DB servers won't move to VM+SAN in the near future.)

I thought (but could be wrong) that those systems could cope with
high-latency IO.

> What you're getting out of it is essentially a 500 GB 15k drive with a
> high MTTDL

That's what I wanted: a rock-solid disk area, despite
not-as-good-as-I'd-like random IO. I'll give it a try with sequential
transfers.

Anyway, thanks for your answer.
David Magda wrote:

> Out of curiosity, any reason why you went with iSCSI and not NFS? There
> seems to be some debate on which is better under which circumstances.

iSCSI instead of NFS? Because of the overwhelming difference in transfer
rate between them... at least, that's what I had read. And setting up an
iSCSI target is so simple that I didn't even look for another solution.

Thanks for your answer.
> - The hardware can't really be extended any more. No budget ;-(
> - The VMs will mostly be low-IO systems:
>   -- WS2003 with Trend OfficeScan, WSUS (for 300 XP clients) and RDP
>   -- Solaris 10 with SRSS 4.2 (Sun Ray server)
>
> (File and DB servers won't move to VM+SAN in the near future.)
>
> I thought (but could be wrong) that those systems could cope with
> high-latency IO.

Might be fine most of the time... RDP in particular is vulnerable to IO
spiking and disk latency; it depends on how many users you have on that RDP
VM. Also, WSUS is surprisingly (or not, given it's a Microsoft production)
resource-hungry. If those servers are on physical boxes right now, I'd do
some perfmon captures and add up the IOPS.

>> What you're getting out of it is essentially a 500 GB 15k drive with a
>> high MTTDL
>
> That's what I wanted: a rock-solid disk area, despite
> not-as-good-as-I'd-like random IO.

Fair enough.
On Jun 24, 2009, at 16:54, Philippe Schwarz wrote:

>> Out of curiosity, any reason why you went with iSCSI and not NFS? There
>> seems to be some debate on which is better under which circumstances.
>
> iSCSI instead of NFS? Because of the overwhelming difference in transfer
> rate between them... at least, that's what I had read.

That would depend on the I/O pattern, wouldn't it? If you have mostly random
I/O then it's unlikely you'd saturate a GigE link, as you're not streaming.

Well, this is with 3.x. I don't have any experience with 4.x, so I guess
it's best to test. Everyone's going to have to build up all their knowledge
from scratch with the new software. :)

http://tinyurl.com/d8urpx
http://vmetc.com/2009/05/01/reasons-for-using-nfs-with-vmware-virtual-infrastructure/

Cloning Windows images (assuming one VMDK per FS) would be a possibility as
well. Either way, you may want to tweak some of the TCP settings for best
results:

http://serverfault.com/questions/13190

> And setting up an iSCSI target is so simple that I didn't even look for
> another solution.

# zfs set sharenfs=on mypool/myfs1

http://docs.sun.com/app/docs/doc/819-5461/gamnd
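If you do test NFS against ESX, note that the ESX host will likely need root
access to the share; something along these lines should do it (the hostname
is just an example; see share_nfs(1M) for the exact access-list syntax):

# zfs set sharenfs='rw=esxhost,root=esxhost' mypool/myfs1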
> If those servers are on physical boxes right now, I'd do some perfmon
> captures and add up the IOPS.

Using perfmon to get a sense of what is required is a good idea. Use the
95th percentile to be conservative. The counters I have used are in the
Physical Disk object. Don't ignore the latency counters either; in my book,
anything consistently over 20 ms or so is excessive.

I run 30+ VMs on an EqualLogic array with 14 SATA disks, broken up as two
striped 6-disk RAID 5 sets (RAID 50) with 2 hot spares. That array is, on
average, about 25% loaded from an IO standpoint. Obviously my VMs are pretty
light. And the EQL gear is *fast*, which makes me feel better about spending
all of that money :).

>> Regarding ZIL usage, from what I have read you will only see benefits if
>> you are using NFS-backed storage, but that the benefit can be
>> significant.
>
> Link?

From the ZFS Evil Tuning Guide
(http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide):

"ZIL stands for ZFS Intent Log. It is used during synchronous write
operations."

Further down:

"If you've noticed terrible NFS or database performance on SAN storage
array, the problem is not with ZFS, but with the way the disk drivers
interact with the storage devices. ZFS is designed to work with storage
devices that manage a disk-level cache. ZFS commonly asks the storage device
to ensure that data is safely placed on stable storage by requesting a cache
flush. For JBOD storage, this works as designed and without problems. For
many NVRAM-based storage arrays, a problem might come up if the array takes
the cache flush request and actually does something rather than ignoring it.
Some storage will flush their caches despite the fact that the NVRAM
protection makes those caches as good as stable storage.

ZFS issues infrequent flushes (every 5 seconds or so) after the uberblock
updates. The problem here is fairly inconsequential. No tuning is warranted
here. ZFS also issues a flush every time an application requests a
synchronous write (O_DSYNC, fsync, NFS commit, and so on). The completion of
this type of flush is waited upon by the application and impacts
performance. Greatly so, in fact. From a performance standpoint, this
neutralizes the benefits of having an NVRAM-based storage."

When I was testing iSCSI vs. NFS, it was clear iSCSI was not doing syncs
while NFS was. Here are some zpool iostat numbers.

iSCSI testing, using iometer with the RealLife workload (65% read, 60%
random, 8k transfers; see the link in my previous post). It is clear that
writes are being cached in RAM and then spun off to disk:
# zpool iostat data01 1

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      55.5G  20.4T    691      0  4.21M      0
data01      55.5G  20.4T    632      0  3.80M      0
data01      55.5G  20.4T    657      0  3.93M      0
data01      55.5G  20.4T    669      0  4.12M      0
data01      55.5G  20.4T    689      0  4.09M      0
data01      55.5G  20.4T    488  1.77K  2.94M  9.56M
data01      55.5G  20.4T     29  4.28K   176K  23.5M
data01      55.5G  20.4T     25  4.26K   165K  23.7M
data01      55.5G  20.4T     20  3.97K   133K  22.0M
data01      55.6G  20.4T    170  2.26K  1.01M  11.8M
data01      55.6G  20.4T    678      0  4.05M      0
data01      55.6G  20.4T    625      0  3.74M      0
data01      55.6G  20.4T    685      0  4.17M      0
data01      55.6G  20.4T    690      0  4.04M      0
data01      55.6G  20.4T    679      0  4.02M      0
data01      55.6G  20.4T    664      0  4.03M      0
data01      55.6G  20.4T    699      0  4.27M      0
data01      55.6G  20.4T    423  1.73K  2.66M  9.32M
data01      55.6G  20.4T     26  3.97K   151K  21.8M
data01      55.6G  20.4T     34  4.23K   223K  23.2M
data01      55.6G  20.4T     13  4.37K  87.1K  23.9M
data01      55.6G  20.4T     21  3.33K   136K  18.6M
data01      55.6G  20.4T    468    496  2.89M  1.82M
data01      55.6G  20.4T    687      0  4.13M      0

Testing against NFS shows writes going to disk continuously.

NFS testing:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01      59.6G  20.4T     57    216   352K  1.74M
data01      59.6G  20.4T     41     21   660K  2.74M
data01      59.6G  20.4T     44     24   655K  3.09M
data01      59.6G  20.4T     41     23   598K  2.97M
data01      59.6G  20.4T     34     33   552K  4.21M
data01      59.6G  20.4T     46     24   757K  3.09M
data01      59.6G  20.4T     39     24   593K  3.09M
data01      59.6G  20.4T     45     25   687K  3.22M
data01      59.6G  20.4T     45     23   683K  2.97M
data01      59.6G  20.4T     33     23   492K  2.97M
data01      59.6G  20.4T     16     41   214K  1.71M
data01      59.6G  20.4T      3  2.36K  53.4K  30.4M
data01      59.6G  20.4T      1  2.23K  20.3K  29.2M
data01      59.6G  20.4T      0  2.24K  30.2K  28.9M
data01      59.6G  20.4T      0  1.93K  30.2K  25.1M
data01      59.6G  20.4T      0  2.22K      0  28.4M
data01      59.7G  20.4T     21    295   317K  4.48M
data01      59.7G  20.4T     32     12   495K  1.61M
data01      59.7G  20.4T     35     25   515K  3.22M
data01      59.7G  20.4T     36     11   522K  1.49M
data01      59.7G  20.4T     33     24   508K  3.09M
data01      59.7G  20.4T     35     23   536K  2.97M
data01      59.7G  20.4T     32     23   483K  2.97M
data01      59.7G  20.4T     37     37   538K  4.70M

Note that the ZIL is being used, just not on a separate device. The periodic
bursts of writes show it being flushed. You can also see reads stall to
nearly zero as the ZIL is dumping. Not good. This thread discusses this
behavior:

http://www.opensolaris.org/jive/thread.jspa?threadID=106453

Coming from a mostly Windows world, I really like the tools you get on
OpenSolaris to see this kind of stuff.

-Scott
>>>>> "sm" == Scott Meilicke <no-reply at opensolaris.org> writes:sm> Some storage will flush their caches despite the fact that the sm> NVRAM protection makes those caches as good as stable sm> storage. [...] ZFS also issues a flush every time an sm> application requests a synchronous write (O_DSYNC, fsync, NFS sm> commit, and so on). [...] this neutralizes the benefits of sm> having an NVRAM-based storage." if the external RAID array or the solaris driver is broken, yes. If not broken, the NVRAM should provide an extra-significant speed boost for exactly the case of frequent synchronous writes. Isn''t that section of the evil tuning guide you''re quoting actually about checking if the NVRAM/driver connection is working right or not? sm> When I was testing iSCSI vs. NFS, it was clear iSCSI was not sm> doing sync, NFS was. I wonder if this is a bug in iSCSI, in either the VMWare initiator or the Sun target. With VM''s there shouldn''t be any opening and closing of files to provoke an extra sync on NFS, only read, write, and sync to the middle of big files, so I wouldn''t think NFS should do any more or less syncing than iSCSI. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090625/525e0a94/attachment.bin>
> Isn't that section of the evil tuning guide you're quoting actually about
> checking whether the NVRAM/driver combination is working right or not?

Miles, yes, you are correct. I just thought it was interesting reading about
how syncs and such work within ZFS.

Regarding my NFS test, you remind me that my test was flawed: my iSCSI
numbers were using the ESXi iSCSI SW initiator, while the NFS tests were
performed with the VM as the NFS client, not ESX. I'll give ESX as the NFS
client, with the VMDKs on NFS, a go and get back to you.

Thanks!
Scott
I ran the RealLife iometer profile on NFS-backed storage (vs. SW iSCSI), and
got nearly identical results to having the disks on iSCSI:

iSCSI
  IOPS: 1003.8
  MB/s: 7.8
  Avg latency (ms): 27.9

NFS
  IOPS: 1005.9
  MB/s: 7.9
  Avg latency (ms): 29.7

Interesting!

Here is how the pool was behaving during the testing. Again, this is
NFS-backed storage:

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
data01       122G  20.3T    166     63  2.80M  4.49M
data01       122G  20.3T    145     59  2.28M  3.35M
data01       122G  20.3T    168     58  2.89M  4.38M
data01       122G  20.3T    169     59  2.79M  3.69M
data01       122G  20.3T     54    935   856K  18.1M
data01       122G  20.3T      9  7.96K   183K   134M
data01       122G  20.3T     49  3.82K   900K  61.8M
data01       122G  20.3T    160     61  2.73M  4.23M
data01       122G  20.3T    166     63  2.62M  4.01M
data01       122G  20.3T    162     64  2.55M  4.24M
data01       122G  20.3T    163     61  2.63M  4.14M
data01       122G  20.3T    145     54  2.37M  3.89M
data01       122G  20.3T    163     63  2.69M  4.35M
data01       122G  20.3T    171     64  2.80M  3.97M
data01       122G  20.3T    153     67  2.68M  4.65M
data01       122G  20.3T    164     66  2.63M  4.10M
data01       122G  20.3T    171     66  2.75M  4.51M
data01       122G  20.3T    175     53  3.02M  3.83M
data01       122G  20.3T    157     59  2.64M  3.80M
data01       122G  20.3T    172     59  2.85M  4.11M
data01       122G  20.3T    173     68  2.99M  4.11M
data01       122G  20.3T     97     35  1.66M  2.61M
data01       122G  20.3T    170     58  2.87M  3.62M
data01       122G  20.3T    160     64  2.72M  4.17M
data01       122G  20.3T    163     63  2.68M  3.77M
data01       122G  20.3T    160     60  2.67M  4.29M
data01       122G  20.3T    165     65  2.66M  4.05M
data01       122G  20.3T    191     59  3.25M  3.97M
data01       122G  20.3T    159     65  2.76M  4.18M
data01       122G  20.3T    154     52  2.64M  3.50M
data01       122G  20.3T    164     61  2.76M  4.38M
data01       122G  20.3T    154     62  2.66M  4.08M
data01       122G  20.3T    160     58  2.71M  3.95M
data01       122G  20.3T     84     34  1.48M  2.37M
data01       122G  20.3T      9  7.27K   156K   125M
data01       122G  20.3T     25  5.20K   422K  84.3M
data01       122G  20.3T    170     60  2.77M  3.64M
data01       122G  20.3T    170     63  2.85M  3.85M

So it appears NFS is doing syncs while iSCSI is not (see my earlier zpool
iostat data for iSCSI). Isn't this what we expect, given that NFS does syncs
while iSCSI (presumably) does not?

-Scott
On Fri, 26 Jun 2009, Scott Meilicke wrote:

> I ran the RealLife iometer profile on NFS-backed storage (vs. SW iSCSI),
> and got nearly identical results to having the disks on iSCSI:

Both of them are using TCP to access the server.

> So it appears NFS is doing syncs while iSCSI is not (see my earlier zpool
> iostat data for iSCSI). Isn't this what we expect, given that NFS does
> syncs while iSCSI (presumably) does not?

If iSCSI does not do syncs (presumably it should when a cache flush is
requested), then NFS is safer in case the server crashes and reboots.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Fri, Jun 26, 2009 at 6:04 PM, Bob Friesenhahn
<bfriesen at simple.dallas.tx.us> wrote:

> If iSCSI does not do syncs (presumably it should when a cache flush is
> requested), then NFS is safer in case the server crashes and reboots.

I'll chime in here, as I've had experience with this subject as well (ZFS
over NFS and iSCSI).

It depends on your NFS client! I was using the FreeBSD NFSv3 client, which
by default does an fsync() for every NFS block (8 KB, afaik). However, I
changed the source and recompiled so it would only fsync() on file close or,
I believe, after 5 MB. I went from 3 MB/sec to over 100 MB/sec after my
change. I detailed my struggle here:

http://www.brentrjones.com/?p=29

As for iSCSI, I am currently benchmarking the COMSTAR iSCSI target. I
previously used the old iscsitgtd framework with ZFS and would get about
35-40 MB/sec. My initial testing with the new COMSTAR iSCSI target is not
revealing any substantial performance increase at all. I've tried zvol-based
LUs and file-based LUs with no perceived performance difference at all.

The iSCSI target is an X4540, 64 GB RAM, and 48x 1 TB disks configured with
8 vdevs of 5-6 disks each. No SSD; ZIL enabled.

My NFS performance is now over 100 MB/sec, and I can get over 100 MB/sec
with CIFS as well. However, my iSCSI performance is still rather low for the
hardware. It is a standard GigE network; jumbo frames are currently
disabled. When I get some time I may make a VLAN with jumbo frames enabled
and see if that changes anything at all (not likely).

I am CCing the storage-discuss group as well for coverage, as this covers
both ZFS and storage. If anyone has some thoughts, code, or tests, I can run
them on my X4540s and see how it goes.

Thanks

--
Brent Jones
brent at servuhome.net
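P.S. For reference, each COMSTAR LU was set up more or less like this (the
volume name is illustrative; the GUID comes from the sbdadm output, and the
stmf and iscsi/target services need to be enabled first):

# zfs create -V 500G data01/vmlun0
# sbdadm create-lu /dev/zvol/rdsk/data01/vmlun0
# stmfadm add-view <GUID reported by sbdadm>
# itadm create-target

The view here is wide open (all hosts, all targets); in practice you would
scope it with host and target groups.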