So I dropped an Intel SSD in our test x4500 last week and have been playing with it a bit. Performance-wise, it's great. A source code repository that took 18 minutes to check out into NFS-mounted ZFS space took only 3 minutes after adding the SSD as a slog (the performance was almost as good as simply disabling the ZIL). Untarring files and other common operations involving lots of files were also much quicker.

Unfortunately, fma isn't very happy with it :(. It keeps complaining that the self test failed and marks the drive as faulty. It's not a functional issue; the drive remains available and works fine, but the chassis fault light is on, the drive fault light for the bay the SSD is in is on, the IPMI management drive failure indicator is asserted for that bay, and the fault management logs are cluttered with spurious alerts.

I got an official confirmation that the x4540 SSD is basically an Intel X25-E, probably with different firmware containing the test methods fma is looking for. I'm not sure what other differences there might be. The price difference is pretty drastic (I picked up a stock X25-E for about $350; list price on the x4540 SSD is $1500), but I guess, as with the markup on regular hard drives and memory, you're paying for the privilege of having it under service contract and having full support for any problems that arise. Even if we could pay that cost (ah, the California budget), since the drive won't be qualified for our x4500s, there's no way to get to a supported configuration anyway :(, although presumably the x4540 SSD in an x4500 would at least pass the self test (anybody tried it?).

On a side note, is there any easy way to get fma to not test a particular drive?

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
We found lots of SAS controller resets and errors to the SSD on our servers (OpenSolaris 2008.05 and 2009.06 with third-party JBOD and X25-E). Whenever there is an error, a MySQL insert takes more than 4 seconds. It was quite scary. Eventually our engineer disabled the Fault Management SMART polling, and that seems to be working.

-- This message posted from opensolaris.org
We finally resolved this issue by changing the LSI driver. For details, please refer to http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/
Alex Li wrote:
> We finally resolved this issue by change LSI driver. For details, please
> refer to here
> http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/

Anyone from Sun have any knowledge of when the open source mpt driver will be less broken? Things improved greatly for me re: bus resets with a recent Sol 10 patch, but after my upgrade to OpenSolaris, they're back with a vengeance. An update to b118 didn't improve things, and I dare not go to anything more recent until the ZFS bug fixes hit the dev repo.

--
Carson
On Thu, 10 Sep 2009, Alex Li wrote:
> We finally resolved this issue by change LSI driver. For details, please
> refer to here
> http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/

I believe you hijacked my thread ;). x4500s have Marvell SATA controllers, not LSI. My issue with Intel SSDs being marked faulty in X4500s has yet to be resolved. The last time I rebooted it, fm started marking the SSD failed again due to invalid self-check log data. I had some correspondence with Eric Schrock, who indicated it looked like a combination of buggy Intel firmware and a bug in the Solaris SATL driver, but I haven't heard back from him as to whether they might fix it.

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
On Sep 11, 2009, at 8:48 PM, Paul B. Henson wrote:
> x4500's have Marvell SATA controllers, not LSI. My issue with Intel SSD's
> being marked faulty in X4500's has yet to be resolved. The last time I
> rebooted it fm started marking the SSD failed again due to invalid
> self-check log data. I had some correspondence with Eric Schrock who
> indicated it looked like a combination of buggy Intel firmware and a bug in
> the Solaris SATL driver, but haven't heard back from him as to whether they
> might fix it.

It's clearly bad firmware - there's no bug in the sata driver. That drive basically returns random data, and if you're unlucky that randomness will look like a valid failure response. In the process I found one or two things that could be tightened up with the FMA analysis, but when your drive is returning random log data it's impossible to actually fix the problem in software.

- Eric

--
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
On Fri, 11 Sep 2009, Eric Schrock wrote:
> It's clearly bad firmware - there's no bug in the sata driver. That
> drive basically returns random data, and if you're unlucky that
> randomness will look like a valid failure response. In the process I
> found one or two things that could be tightened up with the FMA analysis,
> but when your drive is returning random log data it's impossible to
> actually fix the problem in software.

Well, I won't claim the drive firmware is completely innocent, but as evidenced in

http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000436.html

smartctl on a Linux box seems to work just fine. The exact same model drive also works just fine in an x4540. So I think the assertion that the drive returns random data is demonstrably false. There's something about the SSD in an x4500 that just doesn't play nice -- it might be partially the drive firmware, it might be the SAS controller, it might be something else -- but it's *not* simply random data being returned from the drive.

It would be really appreciated if that problem could be tracked down so the drive works as well SMART-wise in an x4500 as it does in a Linux box or an x4540, but I understand Sun does not certify the x4500 with SSDs so there's no expectation that would happen. But it would be really really appreciated :)...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
On Sep 12, 2009, at 12:00 AM, Paul B. Henson wrote:
> Well, I won't claim the drive firmware is completely innocent, but as
> evidenced in
>
> http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000436.html
>
> smartctl on a Linux box seems to work just fine. The exact same model drive
> also works just fine in an x4540. So I think the assertion that the drive
> returns random data is demonstrably false.

Your statement that it is "just fine" is false:

---
SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
---

Like I said, there are ways we could tighten up the FMA code to better handle bad data before going off the rails - most likely smartctl gives up when it sees this invalid record, while we (via SATL) keep going. But any way you slice it, the drive is returning invalid data.

- Eric

--
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
Also, were you ever able to get this disk behind a SAS transport (X4540, J4400, J4500, etc)? It would be interesting to see how hardware SATL deals with this invalid data. Output from 'smartctl -d sat' and 'smartctl -d scsi' on such a system would show both the ATA data and the translated SCSI data. My guess is that it just gives up at the first invalid version record, something we should probably be doing.

- Eric

--
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
On Thu, 10 Sep 2009 12:31:11 -0700 Carson Gaspar <carson at taltos.org> wrote:
> Alex Li wrote:
> > We finally resolved this issue by change LSI driver. For details, please
> > refer to here
> > http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/
>
> Anyone from Sun have any knowledge of when the open source mpt driver will be
> less broken? Things improved greatly for me re: bus resets with a recent Sol 10
> patch, but after my upgrade to OpenSolaris, they're back with a vengeance. An
> update to b118 didn't improve things, and I dare not go to anything more recent
> until the ZFS bug fixes hit the dev repo.

From reading your blog post, it appears that mpt and fma were trying really hard to tell you that your SSD was misbehaving, and therefore you should do something about it. Turning _off_ disk fma and then totally replacing the driver with one that doesn't support fma were definitely not the recommended actions!

Given the rest of this thread, I'm really keen to see (as somebody who works on mpt(7d)) how your system behaves with fixed SSD firmware, using mpt(7d) and with disk fma turned on again.

After that, let's talk about "broken" drivers.

James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp http://www.jmcp.homeunix.com/blog
James C. McPherson wrote:
> On Thu, 10 Sep 2009 12:31:11 -0700
> Carson Gaspar <carson at taltos.org> wrote:
>
>> Alex Li wrote:
>>> We finally resolved this issue by change LSI driver. For details, please
>>> refer to here
>>> http://enginesmith.wordpress.com/2009/08/28/ssd-faults-finally-resolved/
>> Anyone from Sun have any knowledge of when the open source mpt driver will be
>> less broken? Things improved greatly for me re: bus resets with a recent Sol 10
>> patch, but after my upgrade to OpenSolaris, they're back with a vengeance. An
>> update to b118 didn't improve things, and I dare not go to anything more recent
>> until the ZFS bug fixes hit the dev repo.
>
> From reading your blog post, it appears that mpt and fma were
> trying really hard to tell you that your SSD was misbehaving,
> and therefore you should do something about it. Turning _off_
> disk fma and then totally replacing the driver with one that
> doesn't support fma were definitely not the recommended actions!
>
> Given the rest of this thread, I'm really keen to see (as somebody
> who works on mpt(7d)) how your system behaves with fixed SSD
> firmware, using mpt(7d) and with disk fma turned on again.
>
> After that, let's talk about "broken" drivers.

Except you replied to me, not to the person who has SSDs. I have dead standard hard disks, and the mpt driver is just not happy. After applying 141737-04 to my Sol 10 system, things improved greatly, and the constant bus resets went away. After upgrading to OpenSolaris 6/09 things went back to being crappy. Updating to b118 did not help.

--
Carson
Carson Gaspar wrote:
> Except you replied to me, not to the person who has SSDs. I have dead
> standard hard disks, and the mpt driver is just not happy. After
> applying 141737-04 to my Sol 10 system, things improved greatly, and
> the constant bus resets went away. After upgrading to OpenSolaris 6/09
> things went back to being crappy. Updating to b118 did not help.

And for the curious, here is one week of uniq'd log messages I receive when I'm having problems:

Log info 0x31110b00 received for target 1.
Log info 0x31130000 received for target 0.
Log info 0x31130000 received for target 1.
Log info 0x31140000 received for target 0.
Log info 0x31140000 received for target 1.
Log info 0x31140000 received for target 3.
mpt_handle_event_sync: IOCStatus=0x8000, IOCLogInfo=0x31110b00
mpt_handle_event: IOCStatus=0x8000, IOCLogInfo=0x31110b00

All disks are identical. An example iostat -nE output (note the 93 transport errors...):

c7t1d0 Soft Errors: 0 Hard Errors: 6 Transport Errors: 93
Vendor: ATA Product: HDS725050KLA360 Revision: A10C Serial No:
Size: 500.11GB <500107861504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 6 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0

--
Carson
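For anyone who wants to track these counters over time, here is a minimal Python sketch (not from the thread) that pulls the per-device error counters out of `iostat -nE` text. The function name `parse_iostat_errors` and the regular expression are assumptions based only on the sample output quoted above; the real output layout may differ between releases.

```python
import re

# Matches the first line of each device stanza in `iostat -nE` output, e.g.
# "c7t1d0 Soft Errors: 0 Hard Errors: 6 Transport Errors: 93".
# The pattern is an assumption based on the sample quoted in this thread.
_STANZA = re.compile(
    r"(?P<dev>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+"
    r"Transport Errors:\s*(?P<transport>\d+)"
)


def parse_iostat_errors(text):
    """Return {device: {"soft": n, "hard": n, "transport": n}}."""
    counters = {}
    for match in _STANZA.finditer(text):
        counters[match.group("dev")] = {
            key: int(match.group(key)) for key in ("soft", "hard", "transport")
        }
    return counters


if __name__ == "__main__":
    sample = "c7t1d0 Soft Errors: 0 Hard Errors: 6 Transport Errors: 93\n"
    print(parse_iostat_errors(sample))
```

Feeding it the saved output of periodic `iostat -nE` runs would make it easy to see whether the transport error count is still climbing after a patch or driver change.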
On Sat, 12 Sep 2009, Eric Schrock wrote:
> Your statement that it is "just fine" is false:

I didn't say it worked "perfectly", I said it worked "fine". Yes, it gave a *warning* that the "SMART Selective Self-Test Log Data Structure Revision Number" was 0 instead of 1, **however** other than that warning the data smartctl returned from the drive appeared correct.

Results from the virgin drive:

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

Results after manually initiating self tests:

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error  00%        68               -
# 2  Short offline       Completed without error  00%        68               -

The exact same drive in the x4500 running your test program to check self-test results:

self-test-failure = (embedded nvlist)
nvlist version: 0
        result-code = 0x4
        timestamp = 0x48a5
        segment = 0x0
        address = 0xa548a548a548
(end self-test-failure)

There's definitely invalid data all right, but it's **not** originating from the drive. For that matter, the warning is about the "SMART Selective Self-Test Log Data Structure Revision Number", not the "SMART Self-test log structure revision number" -- which is correctly version 1.

> Like I said, there are ways we could tighten up the FMA code to better
> handle bad data before going off the rails - most likely smartctl gives
> up when it sees this invalid record, while we (via SATL) keep going.
> But any way you slice it, the drive is returning invalid data.

The drive is not returning invalid data in a Linux box running smartctl. Other than a *warning* about the wrong revision of a data structure for a different self test, the drive seems to work just fine.

I really appreciated the help you provided with figuring out what was going on with this drive in an x4500 under Solaris.
I understand there's no obligation on anybody's part to make this unsupported drive work. However, given it does work correctly (at least in regards to returning SMART self-test logs) under Linux, I don't see why it could not work correctly under Solaris. If it doesn't get fixed, it doesn't get fixed, but I don't understand why you're saying the drive is returning invalid data when the evidence does not support that conclusion.

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
On Sat, 12 Sep 2009, Eric Schrock wrote:
> Also, were you ever able to get this disk behind a SAS transport (X4540,
> J4400, J4500, etc)? It would be interesting to see how hardware SATL
> deals with this invalid data. Output from 'smartctl -d sat' and
> 'smartctl -d scsi' on such a system would show both the ATA data and the
> translated SCSI data. My guess is that it just gives up at the first
> invalid version record, something we should probably be doing.

Phil Steinbachs gave you some data from an X25-E in a J4400 attached to an X4240 via an LSI 1068E based HBA, as well as one in one of the X4240's SAS slots connected to the internal Adaptec RAID controller:

http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000432.html

and:

http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000435.html

Your last email on the subject was:

http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000447.html

in which you said:

"The primary thing is that this drive is completely busted - it's reporting totally invalid data in response to the ATA READ EXT LOG command for log 0x07 (Extended SMART self-test log). The spec defines that byte 0 must be 0x1 and that byte 1 is reserved."

Phil might still be in a position to run smartctl on the drives if you're still interested in the data.

I guess this is why you're now saying the drive is returning invalid data. I had forgotten the details; that was almost three months ago.

In any case, I agree with you that the firmware is buggy; however I disagree with you as to the outcome of that bug. The drive is not returning random garbage, it has *one* byte wrong. Other than that all of the data seems ok, at least to my inexpert eyes. smartctl under Linux issues a warning about that invalid byte and reports everything else ok. Solaris on an x4500 evidently barfs over that invalid byte and returns garbage.

Overall, I think the Linux approach seems more useful.
Be strict in what you generate, and lenient in what you accept ;), or something like that. As I already said, it would be really really nice if the Solaris driver could be fixed to be a little more forgiving and deal better with the drive, but I've got no expectation that it should be done. But it could be :).

Thanks again for your help. I apologize if I've been a bit antagonistic; I tend to go "dog with a bone" when I start debating something.

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
On Sat, 12 Sep 2009, Paul B. Henson wrote:
> In any case, I agree with you that the firmware is buggy; however I
> disagree with you as to the outcome of that bug. The drive is not
> returning random garbage, it has *one* byte wrong. Other than that all of
> the data seems ok, at least to my inexpert eyes. smartctl under Linux
> issues a warning about that invalid byte and reports everything else ok.
> Solaris on an x4500 evidentally barfs over that invalid byte and returns
> garbage.

On another note, my understanding is that the official Sun sold and supported SSD for the x4540 is basically just an OEM'd Intel X25-E. Did Sun install their own fixed firmware on their version of that drive, or does it have the same buggy firmware as the street version? It would be funny if you guys were shipping a drive with buggy firmware that just happens to work because the x4540 hardware doesn't trip over the one invalid byte :)...

--
Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst | henson at csupomona.edu
California State Polytechnic University | Pomona CA 91768
On Sun, Sep 13, 2009 at 1:14 AM, Paul B. Henson <henson at acm.org> wrote:
> On Sat, 12 Sep 2009, Paul B. Henson wrote:
>
>> In any case, I agree with you that the firmware is buggy; however I
>> disagree with you as to the outcome of that bug. The drive is not
>> returning random garbage, it has *one* byte wrong. Other than that all of
>> the data seems ok, at least to my inexpert eyes. smartctl under Linux
>> issues a warning about that invalid byte and reports everything else ok.
>> Solaris on an x4500 evidentally barfs over that invalid byte and returns
>> garbage.
>
> On another note, my understanding is that the official Sun sold
> and supported SSD for the x4540 is basically just an OEM'd Intel X25-E. Did
> Sun install their own fixed firmware on their version of that drive, or
> does it have the same buggy firmware as the street version? It would be
> funny if you guys were shipping a drive with buggy firmware that just
> happens to work because the x4540 hardware doesn't trip over the one
> invalid byte :)...

Perhaps some of their fixes have made it upstream. Your message at http://mail.opensolaris.org/pipermail/fm-discuss/2009-June/000436.html from June 10 suggests you are running firmware release (045C)8626. On August 11 they released firmware revisions 8820, 8850, and 02G9, depending on the drive model.

http://downloadcenter.intel.com/Detail_Desc.aspx?agr=Y&ProdId=3043&DwnldID=17485&lang=eng

--
Mike Gerdts http://mgerdts.blogspot.com/
On Sep 12, 2009, at 11:14 PM, Paul B. Henson wrote:
> On another note, my understanding is that the official Sun sold
> and supported SSD for the x4540 is basically just an OEM'd Intel X25-E. Did
> Sun install their own fixed firmware on their version of that drive, or
> does it have the same buggy firmware as the street version? It would be
> funny if you guys were shipping a drive with buggy firmware that just
> happens to work because the x4540 hardware doesn't trip over the one
> invalid byte :)...

The X4540 uses SAS, not SATA. So the translation via SATL is done in hardware, not software.

- Eric

--
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
On Sep 12, 2009, at 10:49 PM, Paul B. Henson wrote:
> In any case, I agree with you that the firmware is buggy; however I
> disagree with you as to the outcome of that bug. The drive is not returning
> random garbage, it has *one* byte wrong. Other than that all of the data
> seems ok, at least to my inexpert eyes. smartctl under Linux issues a
> warning about that invalid byte and reports everything else ok. Solaris on
> an x4500 evidentally barfs over that invalid byte and returns garbage.

Actually, it's not one byte - the entire page is garbage (as we saw in the dtrace output). But I'm guessing that smartctl (and hardware SATL) is aborting on the first invalid record, while we keep going and blindly "translate" one form of garbage into another.

> Overall, I think the Linux approach seems more useful. Be strict in what
> you generate, and lenient in what you accept ;), or something like that. As
> I already said, it would be really really nice if the Solaris driver could
> be fixed to be a little more forgiving and deal better with the drive, but
> I've got no expectation that it should be done. But it could be :).

Absolutely. The SATA code could definitely be cleaned up to bail when processing an invalid record. I can file a CR for you if you haven't already done so. Also, I'd encourage any developers out there with one of these drives to take a shot at fixing the issue via the OpenSolaris sponsor process.

- Eric

--
Eric Schrock, Fishworks http://blogs.sun.com/eschrock
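To illustrate the kind of "bail on the first invalid record" check being discussed, here is a minimal Python sketch. It is not the Solaris SATL code and not smartctl's logic; it only encodes the one rule quoted earlier in the thread (byte 0 of the Extended SMART self-test log must be 0x1) plus the standard 512-byte ATA log sector size. The names `check_ext_selftest_header` and `InvalidLogPage` are hypothetical.

```python
EXPECTED_REVISION = 0x01  # per the spec text quoted in this thread
PAGE_SIZE = 512           # one ATA log sector


class InvalidLogPage(ValueError):
    """Raised when a log page fails the header sanity check."""


def check_ext_selftest_header(page):
    """Return the page unchanged if its header looks sane, else raise.

    Callers are expected to stop processing (rather than translate the
    remaining bytes) when InvalidLogPage is raised.
    """
    if len(page) != PAGE_SIZE:
        raise InvalidLogPage(
            "expected %d bytes, got %d" % (PAGE_SIZE, len(page))
        )
    if page[0] != EXPECTED_REVISION:
        raise InvalidLogPage(
            "revision byte is 0x%02x, expected 0x%02x"
            % (page[0], EXPECTED_REVISION)
        )
    return page


if __name__ == "__main__":
    good = bytes([0x01, 0x00]) + bytes(510)
    bad = bytes([0xA5, 0x48]) * 256  # made-up garbage page for illustration
    check_ext_selftest_header(good)
    try:
        check_ext_selftest_header(bad)
    except InvalidLogPage as err:
        print("rejected:", err)
```

The point of the design is simply that a failed header check ends processing with a warning, instead of the remaining bytes being translated into a plausible-looking failure report.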
On Sun, 13 Sep 2009, Mike Gerdts wrote:
> August 11 they released firmware revisions 8820, 8850, and 02G9,
> depending on the drive model.

Ooooh, cool, last time I checked they only had updates for the X25-M. Thanks for the pointer.
I can confirm that on an X4240 with the LSI (mpt) controller:

X25-M G1 with 8820 still returns invalid selftest data
X25-E G1 with 8850 now returns correct selftest data

(I haven't got any X25-M G2.)

I'm going to replace an X25-E with the old firmware in one of our X4500s soon, and we'll see if things work right there.

I still see heavy write load-induced bus resets with the 8850-firmware X25-Es on the X4240 though. (Unless I wrap the X25-E inside a DiskSuite SVM metadevice, for some strange reason.)
I have now tested a firmware 8850 X25-E in one of our X4500s, and things look better:

> # /ifm/bin/smartctl -d scsi -l selftest /dev/rdsk/c5t7d0s0
> smartctl version 5.38 [i386-pc-solaris2.10] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> No self-tests have been logged

No "scsi" console errors so far.
On Sun, 13 Sep 2009, Eric Schrock wrote:
> Actually, it's not one byte - the entire page is garbage (as we saw in
> the dtrace output). But I'm guessing that smartctl (and hardware SATL)
> is aborting on the first invalid record, while we keep going and blindly
> "translate" one form of garbage into another.

I updated to the new X25-E firmware, and I think it might have resolved the problem. smartctl under Linux no longer gives a warning, and the diskstat check under Solaris no longer appears to have garbage. I attached output from smartctl, diskstat, and the dtrace script at the bottom; does it look like the firmware is returning valid stuff now?

> Absolutely. The SATA code could definitely be cleaned up to bail when
> processing an invalid record. I can file a CR for you if you haven't
> already done so.

I haven't; even if the new firmware does resolve the problem, I like robustness :), so it would still be nice in general for the code to be more forgiving and perhaps just log a warning.

Thanks...

------------------------------------------------------------------
smartctl version 5.38 [x86_64-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SSDSA2SH032G1GN INTEL
Serial Number:    CVEM902600J6032HGN
Firmware Version: 045C8850
User Capacity:    32,000,000,000 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   7
ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 1
Local Time is:    Mon Sep 14 18:26:09 2009 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled.
Self-test execution status: ( 32) The self-test routine was interrupted by the host with a hard or soft reset.
Total time to complete Offline data collection: ( 1) seconds.
Offline data collection capabilities: (0x75)
    SMART execute Offline immediate.
    No Auto Offline data collection support.
    Abort Offline collection upon new command.
    No Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 2) minutes.
Conveyance self-test routine recommended polling time: ( 1) minutes.

SMART Attributes Data Structure revision number: 5
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0000 100   000   000    Old_age  Offline In_the_past 0
  4 Start_Stop_Count        0x0000 100   000   000    Old_age  Offline In_the_past 0
  5 Reallocated_Sector_Ct   0x0002 100   100   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0002 100   100   000    Old_age  Always  -           68
 12 Power_Cycle_Count       0x0002 100   100   000    Old_age  Always  -           151
192 Power-Off_Retract_Count 0x0002 100   100   000    Old_age  Always  -           22
232 Unknown_Attribute       0x0003 100   100   010    Pre-fail Always  -           0
233 Unknown_Attribute       0x0002 099   099   000    Old_age  Always  -           0
225 Load_Cycle_Count        0x0000 200   200   000    Old_age  Offline -           50147
226 Load-in_Time            0x0002 255   000   000    Old_age  Always  In_the_past 4294967295
227 Torq-amp_Count          0x0002 000   000   000    Old_age  Always  FAILING_NOW 281474976710655
228 Power-off_Retract_Count 0x0002 000   000   000    Old_age  Always  FAILING_NOW 4294967295

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error  00%        68               -
# 2  Short offline       Completed without error  00%        68               -
# 3  Short offline       Completed without error  00%        68               -
# 4  Short offline       Completed without error  00%        68               -
# 5  Short offline       Completed without error  00%        68               -
# 6  Short offline       Completed without error  00%        68               -
# 7  Short offline       Completed without error  00%        68               -
# 8  Short offline       Completed without error  00%        68               -
# 9  Short offline       Completed without error  00%        68               -
#10  Short offline       Completed without error  00%        68               -
#11  Short offline       Completed without error  00%        68               -
#12  Short offline       Completed without error  00%        68               -
#13  Short offline       Completed without error  00%        68               -
#14  Short offline       Completed without error  00%        68               -
#15  Short offline       Completed without error  00%        68               -
#16  Short offline       Completed without error  00%        68               -
#17  Short offline       Completed without error  00%        68               -
#18  Selective offline   Completed without error  00%        68               -
#19  Selective offline   Completed without error  00%        68               -
#20  Selective offline   Completed without error  00%        68               -
#21  Conveyance offline  Completed without error  00%        68               -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1       20       30  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Interrupted [00% left] (0-65535)
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
------------------------------------------------------------------

nvlist version: 0
    protocol = scsi
    status = (embedded nvlist)
    nvlist version: 0
        command-length = 6
        modepages = (embedded nvlist)
        nvlist version: 0
            informational-exceptions = (embedded nvlist)
            nvlist version: 0
                dexcpt = 0
                logerr = 0
                mrie = 0x6
                test = 0
                ewasc = 0
                perf = 0
                ebf = 0
                interval-timer = 0x0
                report-count = 0x0
                changed = 0
            (end informational-exceptions)
        (end modepages)
        logpages = (embedded nvlist)
        nvlist version: 0
            informational-exceptions = (embedded nvlist)
            nvlist version: 0
                length = 0x8
                general = 1
            (end informational-exceptions)
            self-test = (embedded nvlist)
            nvlist version: 0
                length = 0x190
                invalid-param-code = 0x0
            (end self-test)
        (end logpages)
    (end status)
    predictive-failure = (embedded nvlist)
    nvlist version: 0
        additional-sense-code = 0x0
        additional-sense-code-qualifier = 0x0
    (end predictive-failure)
    faults = (embedded nvlist)
    nvlist version: 0
        predictive-failure = 0
    (end faults)

------------------------------------------------------------------

Tracing sata self test queries...
sata_ext_smart_selftest_read_log succeeded

        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
    0: 01 00 14 00 00 00 00 00 00 00 1e 00 00 00 00 00  ................
   10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1f0: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 c8  ................

sata_ext_smart_selftest_read_log succeeded

        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
    0: 01 00 14 00 00 00 00 00 00 00 1e 00 00 00 00 00  ................
   10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1f0: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 c8  ................

sata_ext_smart_selftest_read_log succeeded

        0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f  0123456789abcdef
    0: 01 00 14 00 00 00 00 00 00 00 1e 00 00 00 00 00  ................
   10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  110: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  130: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  150: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  170: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  190: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  1f0: 00 00 00 00 05 00 00 00 00 00 00 00 00 00 00 c8  ................

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
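One quick way to sanity-check a dump like the one above is to verify its checksum. The sketch below assumes the usual ATA convention for 512-byte SMART log sectors, where the last byte is chosen so that all 512 bytes sum to zero modulo 256. The function name `smart_log_checksum_ok` is made up for illustration, and the sector is rebuilt by hand from the non-zero bytes in the dump rather than read from a drive.

```python
def smart_log_checksum_ok(sector: bytes) -> bool:
    """Return True if the 512-byte log sector sums to zero modulo 256.

    Assumption: the last byte is the two's-complement checksum of the
    first 511 bytes, as is conventional for ATA SMART log sectors.
    """
    if len(sector) != 512:
        raise ValueError("expected a 512-byte log sector")
    return sum(sector) % 256 == 0


# Non-zero bytes transcribed from the dump (offset -> value);
# every other byte in the dump is 0x00.
nonzero = {0x000: 0x01, 0x002: 0x14, 0x00A: 0x1E, 0x1F4: 0x05, 0x1FF: 0xC8}

buf = bytearray(512)
for offset, value in nonzero.items():
    buf[offset] = value
sector = bytes(buf)

checksum_ok = smart_log_checksum_ok(sector)
print("checksum ok:", checksum_ok)
```

A passing checksum only shows that the sector is internally consistent; it says nothing about whether the individual fields are laid out the way the SATA code expects.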
On Sep 15, 2009, at 8:32 PM, Paul B. Henson wrote:

> I updated to the new X25-E firmware, and I think it might have resolved the
> problem. smartctl under Linux no longer give a warning, and the diskstat
> check under Solaris no longer appears to have garbage. I attached output
> from smartctl, diskstat, and the dtrace script at the bottom, does it look
> like the firmware is returning valid stuff now?

I don't have the ATA spec in front of me, but that looks like pretty normal output to me. Glad to hear they addressed the issue.

- Eric

>> Absolutely. The SATA code could definitely be cleaned up to bail when
>> processing an invalid record. I can file a CR for you if you haven't
>> already done so.
>
> I haven't; even if the new firmware does resolve the problem, I like
> robustness :), so it would still be nice in general for the code to be more
> forgiving and perhaps just log a warning.
>
> Thanks...
>
> ------------------------------------------------------------------
>
> smartctl version 5.38 [x86_64-pc-linux-gnu] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Device Model:     SSDSA2SH032G1GN INTEL
> Serial Number:    CVEM902600J6032HGN
> Firmware Version: 045C8850
> User Capacity:    32,000,000,000 bytes
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   7
> ATA Standard is:  ATA/ATAPI-7 T13 1532D revision 1
> Local Time is:    Mon Sep 14 18:26:09 2009 PDT
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
> See vendor-specific Attribute list for marginal Attributes.
>
> General SMART Values:
> Offline data collection status:  (0x00) Offline data collection activity
>                                         was never started.
>                                         Auto Offline Data Collection: Disabled.
> Self-test execution status:      (  32) The self-test routine was interrupted
>                                         by the host with a hard or soft reset.
> Total time to complete Offline
> data collection:                 (   1) seconds.
> Offline data collection
> capabilities:                    (0x75) SMART execute Offline immediate.
>                                         No Auto Offline data collection support.
>                                         Abort Offline collection upon new
>                                         command.
>                                         No Offline surface scan supported.
>                                         Self-test supported.
>                                         Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   2) minutes.
> Extended self-test routine
> recommended polling time:        (   2) minutes.
> Conveyance self-test routine
> recommended polling time:        (   1) minutes.
>
> SMART Attributes Data Structure revision number: 5
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   3 Spin_Up_Time            0x0000   100   000   000    Old_age   Offline  In_the_past 0
>   4 Start_Stop_Count        0x0000   100   000   000    Old_age   Offline  In_the_past 0
>   5 Reallocated_Sector_Ct   0x0002   100   100   000    Old_age   Always   -           0
>   9 Power_On_Hours          0x0002   100   100   000    Old_age   Always   -           68
>  12 Power_Cycle_Count       0x0002   100   100   000    Old_age   Always   -           151
> 192 Power-Off_Retract_Count 0x0002   100   100   000    Old_age   Always   -           22
> 232 Unknown_Attribute       0x0003   100   100   010    Pre-fail  Always   -           0
> 233 Unknown_Attribute       0x0002   099   099   000    Old_age   Always   -           0
> 225 Load_Cycle_Count        0x0000   200   200   000    Old_age   Offline  -           50147
> 226 Load-in_Time            0x0002   255   000   000    Old_age   Always   In_the_past 4294967295
> 227 Torq-amp_Count          0x0002   000   000   000    Old_age   Always   FAILING_NOW 281474976710655
> 228 Power-off_Retract_Count 0x0002   000   000   000    Old_age   Always   FAILING_NOW 4294967295
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error  00%        68               -
> # 2  Short offline       Completed without error  00%        68               -
> # 3  Short offline       Completed without error  00%        68               -
> # 4  Short offline       Completed without error  00%        68               -
> # 5  Short offline       Completed without error  00%        68               -
> # 6  Short offline       Completed without error  00%        68               -
> # 7  Short offline       Completed without error  00%        68               -
> # 8  Short offline       Completed without error  00%        68               -
> # 9  Short offline       Completed without error  00%        68               -
> #10  Short offline       Completed without error  00%        68               -
> #11  Short offline       Completed without error  00%        68               -
> #12  Short offline       Completed without error  00%        68               -
> #13  Short offline       Completed without error  00%        68               -
> #14  Short offline       Completed without error  00%        68               -
> #15  Short offline       Completed without error  00%        68               -
> #16  Short offline       Completed without error  00%        68               -
> #17  Short offline       Completed without error  00%        68               -
> #18  Selective offline   Completed without error  00%        68               -
> #19  Selective offline   Completed without error  00%        68               -
> #20  Selective offline   Completed without error  00%        68               -
> #21  Conveyance offline  Completed without error  00%        68               -
>
> SMART Selective self-test log data structure revision number 1
>  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
>     1       20       30  Not_testing
>     2        0        0  Not_testing
>     3        0        0  Not_testing
>     4        0        0  Not_testing
>     5        0        0  Interrupted [00% left] (0-65535)
> Selective self-test flags (0x0):
>   After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
-- 
Eric Schrock, Fishworks                        http://blogs.sun.com/eschrock
On Tue, 15 Sep 2009, Eric Schrock wrote:

> I don't have the ATA spec in front of me, but that looks like pretty
> normal output to me. Glad to hear they addressed the issue.

Excellent; I reinstalled it in my test x4500. If no other issues show up, I can try to get my proposal to install them in production going again. They make a huge difference for common sysadmin operations such as tarball extraction, and for code development scenarios like revision control checkouts.

If I'm lucky, maybe the ability to import a pool with a dead slog will make it into U8. That was the only other potential snag in my deployment plan, as I'd only have one SSD in each system.

-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Just a quick followup: the same issue still seems to be there on our X4500s running the latest Solaris 10 with all the latest patches and the following SSDs:

  Intel X25-M G1 firmware 8820 (80GB MLC)
  Intel X25-M G2 firmware 02HD (160GB MLC)

However, things seem to work smoothly with:

  Intel X25-E G1 firmware 8850 (32GB SLC)
  OCZ Vertex 2 firmware 1.00 and 1.02 (100GB MLC)

I'm currently testing a setup with dual OCZ Vertex 2 100GB SSDs that will be used both as mirrored boot/root (32GB of the 100GB) and, with the rest of each disk, as L2ARC cache devices for the big data zpool. Two mirrored X25-Es serve as slog devices:

  zpool create DATA raidz2 c0t0d0 c0t1d0 c1t0d0 c1t1d0 c2t0d0 c2t1d0 c3t1d0 \
                    raidz2 c4t0d0 c4t1d0 c5t0d0 c5t1d0 c0t2d0 c0t3d0 c3t2d0 \
                    raidz2 c1t2d0 c1t3d0 c2t2d0 c2t3d0 c4t2d0 c4t3d0 c3t3d0 \
                    raidz2 c5t2d0 c5t3d0 c0t4d0 c0t5d0 c1t4d0 c1t5d0 c3t5d0 \
                    raidz2 c2t4d0 c2t5d0 c4t4d0 c4t5d0 c5t4d0 c5t5d0 c3t6d0 \
                    raidz2 c0t6d0 c0t7d0 c1t6d0 c1t7d0 c2t6d0 c2t7d0 c3t7d0 \
                    spare c4t6d0 c5t6d0 \
                    cache c3t0d0s3 c3t4d0s3 \
                    log mirror c4t7d0 c5t7d0
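A layout with this many devices is easy to mistype, so here is a small sketch that tallies the command's device arguments by role. It is only an illustration: `parse_layout` is a hypothetical helper, not part of any ZFS tooling, and it understands just the keywords used in this command (`raidz2`, `spare`, `cache`, `log`, `mirror`, plus the other raidz levels).

```python
import re

# Device arguments copied from the zpool create command above
# (pool name and line-continuation backslashes removed).
ZPOOL_ARGS = """
raidz2 c0t0d0 c0t1d0 c1t0d0 c1t1d0 c2t0d0 c2t1d0 c3t1d0
raidz2 c4t0d0 c4t1d0 c5t0d0 c5t1d0 c0t2d0 c0t3d0 c3t2d0
raidz2 c1t2d0 c1t3d0 c2t2d0 c2t3d0 c4t2d0 c4t3d0 c3t3d0
raidz2 c5t2d0 c5t3d0 c0t4d0 c0t5d0 c1t4d0 c1t5d0 c3t5d0
raidz2 c2t4d0 c2t5d0 c4t4d0 c4t5d0 c5t4d0 c5t5d0 c3t6d0
raidz2 c0t6d0 c0t7d0 c1t6d0 c1t7d0 c2t6d0 c2t7d0 c3t7d0
spare c4t6d0 c5t6d0
cache c3t0d0s3 c3t4d0s3
log mirror c4t7d0 c5t7d0
"""

ROLES = {"spare", "cache", "log"}
VDEV_TYPES = {"mirror", "raidz", "raidz1", "raidz2", "raidz3"}
PARITY = {"raidz": 1, "raidz1": 1, "raidz2": 2, "raidz3": 3}


def parse_layout(args):
    """Group devices into (role, vdev_type, devices) tuples (hypothetical helper)."""
    groups = []
    role = "data"
    for token in args.split():
        if token in ROLES:
            role = token
            groups.append((role, "stripe", []))
        elif token in VDEV_TYPES:
            if groups and groups[-1][0] == role and not groups[-1][2]:
                # A type keyword right after a role keyword, e.g. "log mirror".
                groups[-1] = (role, token, [])
            else:
                groups.append((role, token, []))
        else:
            if not groups:
                groups.append((role, "stripe", []))
            groups[-1][2].append(token)
    return groups


layout = parse_layout(ZPOOL_ARGS)
data_vdevs = [g for g in layout if g[0] == "data"]
# Only raidz data vdevs are handled here; each gives up PARITY disks.
data_disks = sum(len(devs) - PARITY[vtype] for _, vtype, devs in data_vdevs)
# Strip a trailing slice suffix (the "s3" in c3t0d0s3) to count physical disks.
physical = {re.sub(r"s\d+$", "", dev) for _, _, devs in layout for dev in devs}

for role, vtype, devs in layout:
    print(f"{role:5} {vtype:7} {len(devs)} device(s)")
print("data vdevs:", len(data_vdevs))
print("data disks excluding parity:", data_disks)
print("distinct physical disks:", len(physical))
```

For this command the tally comes to six 7-disk raidz2 vdevs (30 disks' worth of data capacity after parity), two spares, two cache slices and one mirrored log pair, covering 48 distinct disks in total with no device listed twice.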
On Thu, Jun 10, 2010 at 05:46:19AM -0700, Peter Eriksson wrote:

> Just a quick followup that the same issue still seems to be there on our X4500s with the latest Solaris 10 with all the latest patches and the following SSD disks:
>
> Intel X25-M G1 firmware 8820 (80GB MLC)
> Intel X25-M G2 firmware 02HD (160GB MLC)

What problems did you have with the X25-M models?

-- Pasi
On Thu, Jun 10, 2010 at 04:04:42PM +0300, Pasi Kärkkäinen wrote:

> > Intel X25-M G1 firmware 8820 (80GB MLC)
> > Intel X25-M G2 firmware 02HD (160GB MLC)
>
> What problems did you have with the X25-M models?

I'm not the OP, but I've had two X25-M G2s (80 and 160 GByte) suddenly die on me, out of a sample size of maybe 20.

-- 
Eugen* Leitl http://leitl.org
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE