mad.scientist.at.large at tutanota.com
2017-Aug-10 02:44 UTC
[CentOS] Errors on an SSD drive
What file system are you using? SSD drives have different characteristics that need to be accommodated (including a relatively slow write process, which becomes obvious as soon as the buffer is full), and never, never put a swap partition on one; the high activity will wear it out rather quickly. You might also check the cables, which are often a problem, particularly if they are older SATA cables being run at a possibly higher than rated speed. In any case, reformatting it might not be a bad idea, and you can always use the command line program badblocks to exercise and test it. Keep in mind the drive will invisibly remap any bad sectors if possible. If the reported size of the drive is smaller than it should be, the drive has run out of spare blocks and dying blocks are being removed from the storage space with no replacements.

9. Aug 2017 18:44 by eliezer at ngtech.co.il:

> I have yet to see a SSD read/write error which wasn't related to disk issues
> like a bad sector, but the controller might have an issue with the drive.
> To verify it you will need to burn some read/write IOPS of the drive, but if
> it's under warranty then it's better to verify it now than later.
>
> Eliezer
>
> ----
> Eliezer Croitoru
> Linux System Administrator
>
> -----Original Message-----
> From: CentOS [mailto:centos-bounces at centos.org] On Behalf Of Robert Moskowitz
> Sent: Wednesday, August 9, 2017 17:03
> To: CentOS mailing list <centos at centos.org>
> Subject: [CentOS] Errors on an SSD drive
>
> I am building a new system using a Kingston 240GB SSD drive I pulled
> from my notebook (when I had to upgrade to a 500GB SSD drive). CentOS
> install went fine and ran for a couple days, then got errors on the
> console. Here is an example:
>
> [168176.995064] sd 0:0:0:0: [sda] tag#14 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168177.004050] sd 0:0:0:0: [sda] tag#14 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
> [168177.011615] blk_update_request: I/O error, dev sda, sector 17066160
> [168487.534510] sd 0:0:0:0: [sda] tag#17 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168487.543576] sd 0:0:0:0: [sda] tag#17 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
> [168487.551206] blk_update_request: I/O error, dev sda, sector 17066160
> [168787.813941] sd 0:0:0:0: [sda] tag#20 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
> [168787.822951] sd 0:0:0:0: [sda] tag#20 CDB: Read(10) 28 00 01 04 68 b0 00 00 08 00
> [168787.830544] blk_update_request: I/O error, dev sda, sector 17066160
>
> Eventually, I could not do anything on the system. Not even a
> 'reboot'. I had to do a cold power cycle to bring things back.
>
> Is there anything to do about this or trash the drive and start anew?
>
> Thanks
>
> _______________________________________________
> CentOS mailing list
> CentOS at centos.org
> https://lists.centos.org/mailman/listinfo/centos
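A side note on the quoted console errors: all three failures carry the same CDB, so the same read is failing each time. The sketch below decodes that Read(10) command to show that it targets the sector the kernel reports. The function name `decode_read10` is made up for illustration, and the 4 KiB figure in the comment assumes 512-byte logical sectors, which the log does not state outright.

```python
import struct


def decode_read10(cdb_hex: str) -> dict:
    """Decode a SCSI READ(10) command descriptor block given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    # READ(10) layout: opcode, flags, 4-byte big-endian LBA, group number,
    # 2-byte big-endian transfer length (in blocks), control byte.
    _opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    return {"lba": lba, "blocks": blocks}


# CDB copied from the kernel log lines quoted above.
info = decode_read10("28 00 01 04 68 b0 00 00 08 00")
print(info)  # {'lba': 17066160, 'blocks': 8}
# With 512-byte logical sectors (an assumption), 8 blocks is one 4 KiB read.
```

The decoded LBA matches the "sector 17066160" in the blk_update_request lines, so the errors all point at one small region of the drive rather than at scattered locations.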
On Thu, 10 Aug 2017, mad.scientist.at.large at tutanota.com wrote:

> what file system are you using? ssd drives have different characteristics
> that need to be accommodated (including a relatively slow write process which
> is obvious as soon as the buffer is full), and never, never put a swap
> partition on it, the high activity will wear it out rather quickly.

I know this is common doctrine, but is this still generally held true? For a well configured desktop that rarely needs to swap, I struggle to see the load on the SSD as being significant, and yet obviously the performance of an SSD would make it ideal for swap.

> might also check cables, often a problem particularly if they are older sata
> cables being run at a possibly higher than rated speed. in any case,
> reformatting it might not be a bad idea, and you can always use the command
> line program badblocks to exercise and test it.

Exercising an SSD? smartctl will give you sensible information on what the drive thinks of itself, and will give you actual figures on wear levelling and such like.

> keep in mind the drive will invisibly remap any bad sectors if possible. if
> the reported size of the drive is smaller than it should be the drive has
> run out of spare blocks and dying blocks are being removed from the storage
> place with no replacements.

Coo, I've never seen a disk actually shrink due to failed sectors. I don't think I've got an SSD into a worn state yet to see this.

jh
On 08/09/2017 10:44 PM, mad.scientist.at.large at tutanota.com wrote:

> what file system are you using? ssd drives have different characteristics
> that need to be accommodated (including a relatively slow write process which
> is obvious as soon as the buffer is full), and never, never put a swap
> partition on it, the high activity will wear it out rather quickly. might
> also check cables, often a problem particularly if they are older sata
> cables being run at a possibly higher than rated speed.

When working with a Cubieboard SoC (or most of the other armv7 boards), you tend to have everything hanging out:

http://medon.htt-consult.com/~rgm/cubieboard/cubietower-2.JPG

I have checked the cables and they are all tight.

> in any case, reformatting it might not be a bad idea, and you can always use
> the command line program badblocks to exercise and test it.

I will have to look into that.
Robert Moskowitz wrote:

> On 08/09/2017 10:44 PM, mad.scientist.at.large at tutanota.com wrote:
>> what file system are you using? ssd drives have different
>> characteristics that need to be accommodated (including a relatively slow
>> write process which is obvious as soon as the buffer is full), and
>> never, never put a swap partition on it, the high activity will wear it
>> out rather quickly. might also check cables, often a problem
>> particularly if they are older sata cables being run at a possibly
>> higher than rated speed.
>
> When working with a Cubieboard SoC (or most of the other armv7 boards),
> you tend to have everything hanging out:
> http://medon.htt-consult.com/~rgm/cubieboard/cubietower-2.JPG
>
> I have checked the cables and they are all tight.
>
>> in any case, reformatting it might not be a bad idea, and you can always
>> use the command line program badblocks to exercise and test it.
>
> I will have to look into that.

Here's a thought: I've not done this, but could you use smartctl to check the drive?

mark
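For anyone wanting to try the smartctl suggestion, here is a rough sketch of wrapping it in a script. It assumes smartmontools is installed, that the drive is /dev/sda as in the kernel log, and that the script runs with enough privilege to query the device. The helper names are invented for illustration, and the sample row in the usage note is made up to show the column layout; it is not output from the drive in this thread. Attribute names also vary between vendors.

```python
import shutil
import subprocess


def smartctl_command(device: str) -> list:
    """Build a smartctl call: -H prints overall health, -A the attribute table."""
    return ["smartctl", "-H", "-A", device]


def parse_attributes(output: str) -> dict:
    """Map attribute name -> raw value from the table printed by `smartctl -A`."""
    attrs = {}
    for line in output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) >= 10 and fields[0].isdigit():
            attrs[fields[1]] = " ".join(fields[9:])
    return attrs


def check_drive(device: str = "/dev/sda"):
    """Run smartctl and return parsed attributes, or None if it is not installed."""
    if shutil.which("smartctl") is None:
        return None
    # smartctl's exit status is a bitmask of conditions, so a nonzero status
    # does not mean the command failed to run; do not raise on it.
    result = subprocess.run(
        smartctl_command(device), capture_output=True, text=True, check=False
    )
    return parse_attributes(result.stdout)
```

Usage would be along the lines of `check_drive("/dev/sda")`, then looking at entries such as a reallocated-sector count. A made-up row like `  5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0` parses to `{"Reallocated_Sector_Ct": "0"}`.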
On Aug 10, 2017, at 2:07 AM, John Hodrien <J.H.Hodrien at leeds.ac.uk> wrote:

> For a well configured desktop that rarely needs to swap, I struggle to see the
> load on the SSD as being significant, and yet obviously the performance of an
> SSD would make it ideal for swap.

I agree. It's a bad idea to do without swap even if you almost never use it, because today's bloated apps often have many pages of virtual memory they rarely or never actually touch. You want those pages to get swapped out quickly so that the precious RAM can be used more productively; by the buffer cache, if nothing else.

I once used a web application server on a headless VPS that still had GUI libraries linked to its binary because one of the underlying technologies it uses was also used in a GUI app, and it was too difficult to tear all that GUI code out, even if it was never called. Because the VPS technology didn't support swap, I directly paid the price for those megs of unused (and unusable!) libraries in my monthly VPS rental fees.

> Coo, I've never seen a disk actually shrink due to failed sectors. I don't
> think I've got an SSD into a worn state yet to see this.

Me, neither. I'm pretty sure the spare sector pool's size isn't reported to the OS, and the drive isn't allowed to dip into the sectors it does expose externally for spares. When the spare pool is used up, the drive just starts failing in a way that even SMART can see.
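On the swap question, one way to put a number on "rarely needs to swap" is to look at the kernel's own counters. The sketch below reads the pswpin and pswpout fields from /proc/vmstat, which count pages swapped in and out since boot, and converts them to bytes. The function names are invented for illustration, and this only measures traffic at the OS level; how that translates into wear on a particular SSD is not something it can tell you.

```python
import os


def swap_traffic(vmstat_text: str, page_size: int) -> dict:
    """Convert the pswpin/pswpout page counters from /proc/vmstat into bytes."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line is "<name> <value>"; keep only the swap counters.
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return {
        "bytes_swapped_in": counters.get("pswpin", 0) * page_size,
        "bytes_swapped_out": counters.get("pswpout", 0) * page_size,
    }


def read_swap_traffic(path: str = "/proc/vmstat") -> dict:
    """Read the live counters on a Linux system (values are totals since boot)."""
    with open(path) as f:
        return swap_traffic(f.read(), os.sysconf("SC_PAGE_SIZE"))
```

Calling `read_swap_traffic()` after a few days of normal use gives the total written to swap over that uptime, which can then be compared against the drive's rated write endurance.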