We already use multipathd in our install, but this is something I have wondered about. We use Sun disk arrays, and Sun mentions using its RDAC driver for multipathing on Linux. Since it's from the vendor, one would think it would be better. What does the collective think?

Sun StorageTek RDAC Multipath Failover Driver for Linux
http://download.oracle.com/docs/cd/E19373-01/820-4738-13/chapsing.html

David

--
Personally, I liked the university. They gave us money and facilities, we didn't have to produce anything! You've never been out of college! You don't know what it's like out there! I've worked in the private sector. They expect results. -Ray Ghostbusters
I assume you are using the ST25xx or ST6xxx storage with Lustre? Exactly which arrays?

I've been happy with RDAC, but I don't think Oracle has released RHEL6 support yet (then again, Oracle does not support Lustre servers on RHEL6 yet either).

If your multipath config is working (i.e., you've tested it by unplugging and replugging cables under load and were happy with the behavior), I'm not going to tell you to change.

Kevin
They are 2540s, and I'm running EL5 (CentOS).

The thought came up because I had to rebuild a node after a hardware problem, so I went ahead and gave it a shot. I think I posted about this problem on the mailing list before: we were getting stray I/O errors for /dev/sdX devices that were the other path to the same device (at least, that's the conclusion we came to). After installing the Sun RDAC module and disabling multipathd, I can happily say those messages are gone, so I suppose Sun's module talks to the disk array better than multipathd does. I haven't failed the Lustre OSTs back to this particular node yet (I'll wait until the weekend). I'll post again if anything goes wrong, but I think going with the RDAC module might be better.

PS: One thing that has nagged me since Lustre was installed and set up by a vendor is that the disk arrays were never set up with initiators or hosts in the configuration (using CAM). We have a similar disk array (a 6140) that we set up for another filesystem, and I know initiators/hosts were set up on that array. I can't say that this has caused any problems, but it's something in the back of my mind.

Thanks,
David

On Wed, Jul 20, 2011 at 4:15 PM, Kevin Van Maren <kevin.van.maren at oracle.com> wrote:
> I assume you are using the ST25xx or ST6xxx storage with Lustre? Exactly
> which arrays?
>
> I've been happy with RDAC, but I don't think Oracle has released RHEL6
> support yet (but Oracle also does not support Lustre servers on RHEL6 yet).
>
> If your multipath config is working (i.e., you've tested it by
> unplugging/replugging cables under load and were happy with the behavior),
> I'm not going to tell you to change.
>
> Kevin
Yes, the controllers are active/passive, so while both controllers export each LUN, the LUN can only be used through the active controller. In the event of a path or controller failure, RDAC will migrate the LUN so that it is active on the working controller/path. Seeing those problems indicates either that your multipath driver doesn't properly support asymmetric multipath, or that there is a configuration issue.

I believe some firmware versions allow automatic failover, in which the LUN is migrated on access. That was meant to work around multipath drivers that didn't migrate the LUN, but it will perform very poorly if more than one path is used.

Note that it is also possible to have multiple paths to each controller, which can also be load balanced or zoned (more useful for, e.g., the ST6780).

[If you want to experience pain, access a LUN from two hosts at the same time, with each host connected to a different controller. It will still work, but slowly, kind of like reading two CDs at the same time in a CD changer.]

Kevin
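For anyone following the thread, here is a toy Python model of the LUN ownership behaviour Kevin describes. It is only a sketch under stated assumptions: the class and function names (Lun, io_plain, io_failover_driver, io_auto_failover) are hypothetical and do not correspond to any real RDAC, multipathd, or array firmware API. It illustrates three cases: I/O sent down the path to the non-owning controller fails, a failover-style driver migrates the LUN only when the owner's path is lost, and access-triggered migration from two sides makes ownership bounce on nearly every request.

```python
class PassivePathError(IOError):
    """I/O was sent to a controller that does not currently own the LUN."""


class Lun:
    """Hypothetical model of a LUN on an active/passive dual-controller array."""

    def __init__(self, owner="A"):
        self.owner = owner      # controller on which the LUN is active
        self.migrations = 0     # number of ownership changes so far

    def migrate(self, controller):
        if controller != self.owner:
            self.owner = controller
            self.migrations += 1


def io_plain(lun, controller):
    """No failover logic: I/O succeeds only via the owning controller."""
    if controller != lun.owner:
        raise PassivePathError(f"I/O error on path to controller {controller}")
    return controller


def io_failover_driver(lun, healthy_paths):
    """Failover-driver style: stay on the owner, migrate only if its path is down."""
    if lun.owner in healthy_paths:
        return io_plain(lun, lun.owner)
    if not healthy_paths:
        raise PassivePathError("no healthy path left")
    lun.migrate(sorted(healthy_paths)[0])
    return io_plain(lun, lun.owner)


def io_auto_failover(lun, controller):
    """Access-triggered failover: the LUN moves to whichever controller is used."""
    lun.migrate(controller)
    return io_plain(lun, controller)


def demo():
    results = {}

    # Case 1: touching the passive path produces a stray I/O error.
    lun = Lun(owner="A")
    try:
        io_plain(lun, "B")
        results["stray_error"] = False
    except PassivePathError:
        results["stray_error"] = True

    # Case 2: the path to controller A is lost, so the LUN is migrated once.
    lun = Lun(owner="A")
    io_failover_driver(lun, {"A", "B"})
    io_failover_driver(lun, {"B"})
    results["failover_owner"] = lun.owner
    results["failover_migrations"] = lun.migrations

    # Case 3: two hosts on different controllers alternate requests.
    lun = Lun(owner="A")
    for i in range(10):
        io_auto_failover(lun, "A" if i % 2 == 0 else "B")
    results["ping_pong_migrations"] = lun.migrations

    return results


if __name__ == "__main__":
    print(demo())
```

In this model, ten alternating requests cause nine ownership changes, which is the "two CDs in a CD changer" effect. A real array adds a large delay to each migration, which the sketch does not attempt to model.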