Hello All, We are currently testing a NFS+Sun Cluster solution with ZFS in our environment. Currently we have 2 HP DL360s each with a 2-port LSI SAS 9200-8e controller (mpt_sas driver) connected to a Xyratex OneStor SP1224s 24-bay sas tray. The xyratex sas tray has 2 ports on the controller which can connect to each server. We have a zpool of 2x (8+2) drives and 1 hot spare and also 3 intel X25-E ssds in the tray. We were hoping to have the ssds work as slog/cache devices however when we add them to the pool (as cache or log) we start to get an insane number of scsi resets. When the storage is connected to a single node the resets do not happen. scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk at g50015179591cdd18 (sd24): Error for Command: write(10) Error Level: Retryable scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or bus reset occurred), ASCQ: 0x0, FRU: 0x0 Could this error be because the Intel SSDs are sata and we need a real SAS interface for multi-initiator support or is it a bug in the firmware somewhere that needs to be addressed? Where can we go from here to troubleshoot this oddity? Thanks! Steve Jost -- This message posted from opensolaris.org
Brian Wilson
2010-Jun-01 19:32 UTC
[zfs-discuss] Solaris 10U8, Sun Cluster, and SSD issues.
Silly question - you''re not trying to have the ZFS pool imported on both hosts at the same time, are you? Maybe I misread, had a hard time following the full description of what exact configuration caused the scsi resets. On Jun 1, 2010, at 2:22 PM, Steve Jost wrote:> Hello All, > We are currently testing a NFS+Sun Cluster solution with ZFS in our > environment. Currently we have 2 HP DL360s each with a 2-port LSI > SAS 9200-8e controller (mpt_sas driver) connected to a Xyratex > OneStor SP1224s 24-bay sas tray. The xyratex sas tray has 2 ports > on the controller which can connect to each server. We have a zpool > of 2x (8+2) drives and 1 hot spare and also 3 intel X25-E ssds in > the tray. We were hoping to have the ssds work as slog/cache > devices however when we add them to the pool (as cache or log) we > start to get an insane number of scsi resets. When the storage is > connected to a single node the resets do not happen. > > scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ > disk at g50015179591cdd18 (sd24): > Error for Command: write(10) Error Level: Retryable > scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or > bus reset occurred), ASCQ: 0x0, FRU: 0x0 > > Could this error be because the Intel SSDs are sata and we need a > real SAS interface for multi-initiator support or is it a bug in the > firmware somewhere that needs to be addressed? Where can we go from > here to troubleshoot this oddity? Thanks! > > Steve Jost > -- > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1668 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100601/dd10d3a8/attachment.bin>
Steve D. Jost
2010-Jun-01 19:43 UTC
[zfs-discuss] Solaris 10U8, Sun Cluster, and SSD issues.
Definitely not a silly question. And no, we create the pool on node1 then set up the cluster resources. Once setup, sun cluster manages importing/exporting the pool into only the active cluster node. Sorry for the lack of clarity.. not much sleep has been had recently. When connecting to our SAS tray with a single box and with the pool configuration as stated and SSD''s in the pool as slog devices everything works as it should. When connecting the second box AND when the SSDs are in the pool, there are tons of SCSI resets and even some panics on the nodes. With the sas tray connected to both nodes and no ssds everything works as it should. Sorry for the confusion, Steve Jost> Silly question - you''re not trying to have the ZFS pool imported on > both hosts at the same time, are you? Maybe I misread, had a hard > time following the full description of what exact configuration caused > the scsi resets. > > > > On Jun 1, 2010, at 2:22 PM, Steve Jost wrote: > > > Hello All, > > We are currently testing a NFS+Sun Cluster solution with ZFS in our > > environment. Currently we have 2 HP DL360s each with a 2-port LSI > > SAS 9200-8e controller (mpt_sas driver) connected to a Xyratex > > OneStor SP1224s 24-bay sas tray. The xyratex sas tray has 2 ports > > on the controller which can connect to each server. We have a zpool > > of 2x (8+2) drives and 1 hot spare and also 3 intel X25-E ssds in > > the tray. We were hoping to have the ssds work as slog/cache > > devices however when we add them to the pool (as cache or log) we > > start to get an insane number of scsi resets. When the storage is > > connected to a single node the resets do not happen. > > > > scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ > > disk at g50015179591cdd18 (sd24): > > Error for Command: write(10) Error Level: Retryable > > scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or > > bus reset occurred), ASCQ: 0x0, FRU: 0x0 > > > > Could this error be because the Intel SSDs are sata and we need a > > real SAS interface for multi-initiator support or is it a bug in the > > firmware somewhere that needs to be addressed? Where can we go from > > here to troubleshoot this oddity? Thanks! > > > > Steve Jost > > -- > > This message posted from opensolaris.org > > _______________________________________________ > > zfs-discuss mailing list > > zfs-discuss at opensolaris.org > > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Brian Wilson
2010-Jun-01 19:49 UTC
[zfs-discuss] Solaris 10U8, Sun Cluster, and SSD issues.
On Jun 1, 2010, at 2:43 PM, Steve D. Jost wrote:> Definitely not a silly question. And no, we create the pool on > node1 then set up the cluster resources. Once setup, sun cluster > manages importing/exporting the pool into only the active cluster > node. Sorry for the lack of clarity.. not much sleep has been had > recently. > > When connecting to our SAS tray with a single box and with the pool > configuration as stated and SSD''s in the pool as slog devices > everything works as it should. When connecting the second box AND > when the SSDs are in the pool, there are tons of SCSI resets and > even some panics on the nodes. With the sas tray connected to both > nodes and no ssds everything works as it should. >Ah, gotcha Hmm.... Only thought I have is if the sas tray and/or the SSD''s properly work with the SCSI-3 reservation flags that Sun Cluster uses. I know I had to do things to EMC storage to make those reservation flags work - might or might not be pointing you in the right direction.> Sorry for the confusion, > > Steve Jost > >> Silly question - you''re not trying to have the ZFS pool imported on >> both hosts at the same time, are you? Maybe I misread, had a hard >> time following the full description of what exact configuration >> caused >> the scsi resets. >> >> >> >> On Jun 1, 2010, at 2:22 PM, Steve Jost wrote: >> >>> Hello All, >>> We are currently testing a NFS+Sun Cluster solution with ZFS in our >>> environment. Currently we have 2 HP DL360s each with a 2-port LSI >>> SAS 9200-8e controller (mpt_sas driver) connected to a Xyratex >>> OneStor SP1224s 24-bay sas tray. The xyratex sas tray has 2 ports >>> on the controller which can connect to each server. We have a zpool >>> of 2x (8+2) drives and 1 hot spare and also 3 intel X25-E ssds in >>> the tray. We were hoping to have the ssds work as slog/cache >>> devices however when we add them to the pool (as cache or log) we >>> start to get an insane number of scsi resets. When the storage is >>> connected to a single node the resets do not happen. >>> >>> scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/ >>> disk at g50015179591cdd18 (sd24): >>> Error for Command: write(10) Error Level: Retryable >>> scsi: [ID 107833 kern.notice] ASC: 0x29 (power on, reset, or >>> bus reset occurred), ASCQ: 0x0, FRU: 0x0 >>> >>> Could this error be because the Intel SSDs are sata and we need a >>> real SAS interface for multi-initiator support or is it a bug in the >>> firmware somewhere that needs to be addressed? Where can we go from >>> here to troubleshoot this oddity? Thanks! >>> >>> Steve Jost >>> -- >>> This message posted from opensolaris.org >>> _______________________________________________ >>> zfs-discuss mailing list >>> zfs-discuss at opensolaris.org >>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss >-------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 1668 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100601/8cdd4463/attachment.bin>
Andreas GrĂ¼ninger
2010-Jun-02 05:06 UTC
[zfs-discuss] Solaris 10U8, Sun Cluster, and SSD issues.
The Intel SSD is not a dual ported SAS device. This device must be supported by the SAS expander in your external chassis. Did you use an AAMUX transposer card for the SATA device between the connector of the chassis and the SATA drive? Andreas -- This message posted from opensolaris.org
Steve D. Jost
2010-Jun-02 21:40 UTC
[zfs-discuss] Solaris 10U8, Sun Cluster, and SSD issues.
Andreas, We actually are not using one and hadn''t thought about that at all. Do you have a recommendation on a particular model? I see some that do SAS->SATA and some that are just A/A SATA switches, is one better than the other? We were looking at one based on a newer LSI part number that does 6gb SAS but can''t seem to find a source on the gear any ideas? Thanks! Steve Jost> The Intel SSD is not a dual ported SAS device. This device must be supported > by the SAS expander in your external chassis. > Did you use an AAMUX transposer card for the SATA device between the > connector of the chassis and the SATA drive? > > Andreas > -- > This message posted from opensolaris.org > _______________________________________________ > zfs-discuss mailing list > zfs-discuss at opensolaris.org > http://mail.opensolaris.org/mailman/listinfo/zfs-discuss