I'm testing an iSCSI multipath configuration on a T2000 with two disk
devices provided by a NetApp filer.  Both the T2000 and the NetApp
have two ethernet interfaces for iSCSI, going to separate switches on
separate private networks.  The scsi_vhci devices look like this in
`format':

       1. c4t60A98000433469764E4A413571444B63d0 <NETAPP-LUN-0.2-50.00GB>
          /scsi_vhci/ssd@g60a98000433469764e4a413571444b63
       2. c4t60A98000433469764E4A41357149432Fd0 <NETAPP-LUN-0.2-50.00GB>
          /scsi_vhci/ssd@g60a98000433469764e4a41357149432f

These are concatenated in the ZFS pool.  There are two network paths
to each of the two devices, managed by the scsi_vhci driver.  The pool
looks like this:

# zpool status
  pool: space
 state: ONLINE
 scrub: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        space                                    ONLINE       0     0     0
          c4t60A98000433469764E4A413571444B63d0  ONLINE       0     0     0
          c4t60A98000433469764E4A41357149432Fd0  ONLINE       0     0     0

errors: No known data errors

The /kernel/drv/scsi_vhci.conf file, unchanged from the default, specifies:

load-balance="round-robin";

Indeed, when I generate I/O on a ZFS filesystem, I see TCP traffic with
`snoop' on both of the iSCSI ethernet interfaces.  It certainly appears
to be doing round-robin.  The I/O are going to the same disk devices,
of course, but by two different paths.  Is this a correct configuration
for ZFS?  I assume it's safe, but I thought I should check.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
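A minimal sketch of how the two paths behind each of those scsi_vhci
devices can be confirmed from the initiator side, assuming the stock
Solaris 10 MPxIO and iSCSI tools (mpathadm, iscsiadm); the LUN name is
reused from the format listing above, and the exact output fields vary
by release:

  #!/bin/sh
  # Sketch: confirm that a NetApp LUN really has two operational paths.
  LUN=/dev/rdsk/c4t60A98000433469764E4A413571444B63d0s2

  mpathadm list lu          # every multipathed LUN, with total/operational path counts
  mpathadm show lu $LUN     # initiator port, target port and state of each path
  iscsiadm list target -v   # one iSCSI session per portal (two expected here)

If both LUNs report an operational path count of 2, round-robin
traffic on both interfaces is the expected behaviour.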
This is the same configuration we use on 4 separate servers (T2000, two
X4100, and a V215).  We do use a different iSCSI solution, but we have
the same multipath config set up with scsi_vhci: dual GigE switches on
separate NICs, on both the server and the iSCSI node side.  We suffered
from the e1000g interface-flapping bug on two of these systems, and one
time a SAN interface went down to stay (until reboot).  The vhci
multipath performed flawlessly.  I scrubbed the pools (one of them is
10TB) and no errors were found, even though we had heavy IO at the time
of the NIC failure.  I think this configuration is a good one.

Jon

Gary Mills wrote:
> I'm testing an iSCSI multipath configuration on a T2000 with two disk
> devices provided by a NetApp filer.  Both the T2000 and the NetApp
> have two ethernet interfaces for iSCSI, going to separate switches on
> separate private networks.  The scsi_vhci devices look like this in
> `format':
>
>        1. c4t60A98000433469764E4A413571444B63d0 <NETAPP-LUN-0.2-50.00GB>
>           /scsi_vhci/ssd@g60a98000433469764e4a413571444b63
>        2. c4t60A98000433469764E4A41357149432Fd0 <NETAPP-LUN-0.2-50.00GB>
>           /scsi_vhci/ssd@g60a98000433469764e4a41357149432f
>
> These are concatenated in the ZFS pool.  There are two network paths
> to each of the two devices, managed by the scsi_vhci driver.  The pool
> looks like this:
>
> # zpool status
>   pool: space
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME                                     STATE     READ WRITE CKSUM
>         space                                    ONLINE       0     0     0
>           c4t60A98000433469764E4A413571444B63d0  ONLINE       0     0     0
>           c4t60A98000433469764E4A41357149432Fd0  ONLINE       0     0     0
>
> errors: No known data errors
>
> The /kernel/drv/scsi_vhci.conf file, unchanged from the default, specifies:
>
> load-balance="round-robin";
>
> Indeed, when I generate I/O on a ZFS filesystem, I see TCP traffic with
> `snoop' on both of the iSCSI ethernet interfaces.  It certainly appears
> to be doing round-robin.  The I/O are going to the same disk devices,
> of course, but by two different paths.  Is this a correct configuration
> for ZFS?  I assume it's safe, but I thought I should check.

-- 
- Jonathan Loran -              IT Manager
- Space Sciences Laboratory, UC Berkeley
- (510) 643-5146   jloran at ssl.berkeley.edu
- AST:7731^29u18e3
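For anyone repeating that check, the scrub Jon describes is only a
couple of commands; the pool name below is the `space' pool from the
original post, so substitute your own:

  #!/bin/sh
  # Sketch: verify pool integrity after a path or NIC failure.
  zpool scrub space       # re-read and checksum every allocated block
  zpool status -v space   # watch scrub progress and the READ/WRITE/CKSUM counters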
On Fri, Dec 14, 2007 at 10:55:10PM -0800, Jonathan Loran wrote:
> This is the same configuration we use on 4 separate servers (T2000,
> two X4100, and a V215).  We do use a different iSCSI solution, but we
> have the same multipath config set up with scsi_vhci: dual GigE
> switches on separate NICs, on both the server and the iSCSI node
> side.  We suffered from the e1000g interface-flapping bug on two of
> these systems, and one time a SAN interface went down to stay (until
> reboot).  The vhci multipath performed flawlessly.  I scrubbed the
> pools (one of them is 10TB) and no errors were found, even though we
> had heavy IO at the time of the NIC failure.  I think this
> configuration is a good one.

Thanks for the response.  I did a failover test by disconnecting
ethernet cables yesterday.  It didn't behave the way it was supposed
to.  Likely there's something wrong with my multipath configuration.
I'll have to review it, but that's why I have a test server.

I was concerned about simultaneous SCSI commands over the two paths
that might get executed out of order, but something must ensure that
that never happens.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
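A hedged checklist for that configuration review, on the assumption
that the initiator is the Solaris 10 software iSCSI stack; the NetApp
side needs its own checks, and the file location may differ on other
releases:

  #!/bin/sh
  # Sketch: things to review when iSCSI path failover doesn't behave.
  grep mpxio-disable /kernel/drv/iscsi.conf   # MPxIO for iSCSI is on when this reads mpxio-disable="no"
  iscsiadm list target -v                     # should show a live session through each portal
  mpathadm list lu                            # each LUN should report 2 operational paths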
Gary Mills wrote:
> Thanks for the response.  I did a failover test by disconnecting
> ethernet cables yesterday.  It didn't behave the way it was supposed
> to.  Likely there's something wrong with my multipath configuration.
> I'll have to review it, but that's why I have a test server.
>
> I was concerned about simultaneous SCSI commands over the two paths
> that might get executed out of order, but something must ensure that
> that never happens.

From the Sun side, the scsi_vhci is pretty simple.  There are a number
of options you can tweak with mdadm, but I haven't ever needed to.
Perhaps you will, however.  The iSCSI targets may be persisting on the
failed path for some reason, I don't know.  I'm not familiar with the
NetApp in an iSCSI config.  Our targets simply respond on whatever path
a SCSI command is sent on.  This means the initiator side (scsi_vhci)
drives the path assignment for each iSCSI command.  At least this is
how I understand it.  The scsi_vhci will never send two of the same
commands down both paths as long as they are up.  If a path fails, then
a retry of any pending operations will occur on the operational path,
and that's it.  If a path returns to service, it will be re-utilized.

Jon

-- 
- Jonathan Loran -              IT Manager
- Space Sciences Laboratory, UC Berkeley
- (510) 643-5146   jloran at ssl.berkeley.edu
- AST:7731^29u18e3
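One hedged way to watch that retry-and-reuse behaviour during a
cable-pull test; the device name is reused from the original post, and
the grep pattern just picks out the state and path-count lines, which
may be worded differently on other releases:

  #!/bin/sh
  # Sketch: poll per-path state while a cable is pulled and reconnected.
  LUN=/dev/rdsk/c4t60A98000433469764E4A413571444B63d0s2
  while true; do
      date
      mpathadm show lu $LUN | egrep -i 'state|path count'
      sleep 5
  done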
Jonathan Loran wrote:
> From the Sun side, the scsi_vhci is pretty simple.  There are a number
> of options you can tweak with mdadm,

I meant "mpathadm".  Oops.  I need more coffee (or less).

Jon

> but I haven't ever needed to.  Perhaps you will, however.  The iSCSI
> targets may be persisting on the failed path for some reason, I don't
> know.  I'm not familiar with the NetApp in an iSCSI config.  Our
> targets simply respond on whatever path a SCSI command is sent on.
> This means the initiator side (scsi_vhci) drives the path assignment
> for each iSCSI command.  At least this is how I understand it.  The
> scsi_vhci will never send two of the same commands down both paths as
> long as they are up.  If a path fails, then a retry of any pending
> operations will occur on the operational path, and that's it.  If a
> path returns to service, it will be re-utilized.
>
> Jon

-- 
- Jonathan Loran -              IT Manager
- Space Sciences Laboratory, UC Berkeley
- (510) 643-5146   jloran at ssl.berkeley.edu
- AST:7731^29u18e3
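The knobs mpathadm exposes hang off the mpath-support object rather
than the individual LUNs; a hedged way to look at the defaults,
assuming libmpscsi_vhci.so is the support library scsi_vhci registers
on Solaris 10:

  #!/bin/sh
  # Sketch: inspect the scsi_vhci defaults that mpathadm can tweak.
  mpathadm list mpath-support                     # normally just libmpscsi_vhci.so
  mpathadm show mpath-support libmpscsi_vhci.so   # load-balance type, auto-failback, auto-probing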