build 133
zpool version 22

I'm getting:

zpool status:

        NAME        STATE     READ WRITE CKSUM
        z3          DEGRADED     0     0   167
          mirror-0  DEGRADED     0     0   334
            c5d0    DEGRADED     0     0   335  too many errors
            c6d0    DEGRADED     0     0   334  too many errors
[...]

When I saw it, I deleted all the files listed in the `zpool status -v'
report, as the report itself recommends.

Then I ran a new scrub... and now I get the same message, and it still
lists all the files I deleted as being the problem.
On 10/21/10 03:47 PM, Harry Putnam wrote:
> build 133
> zpool version 22
>
> I'm getting:
>
> zpool status:
>
>         NAME        STATE     READ WRITE CKSUM
>         z3          DEGRADED     0     0   167
>           mirror-0  DEGRADED     0     0   334
>             c5d0    DEGRADED     0     0   335  too many errors
>             c6d0    DEGRADED     0     0   334  too many errors
> [...]
>
> When I saw it, I deleted all the files listed in the `zpool status -v'
> report, as the report itself recommends.
>
> Then I ran a new scrub... and now I get the same message, and it still
> lists all the files I deleted as being the problem.

It looks like you have some dead hardware.

-- 
Ian.
Ian Collins <ian at ianshome.com> writes:

> On 10/21/10 03:47 PM, Harry Putnam wrote:
>> build 133
>> zpool version 22
>> [...]
>> Then I ran a new scrub... and now I get the same message, and it still
>> lists all the files I deleted as being the problem.
>
> It looks like you have some dead hardware.

If that were true, I wouldn't be able to read and write to that pool,
would I?

It's a mirrored pool of two 750GB WD disks (both new as of a couple of
months ago).
Cindy Swearingen
2010-Oct-21 16:25 UTC
[zfs-discuss] When `zpool status' reports bad news
Hi Harry,

Generally, you need to use `zpool clear' to clear the pool errors, but
I can't reproduce the removed files reappearing in zpool status on my
own system when I corrupt data, so I'm not sure this will help. Some
other, larger problem is going on here...

Did any hardware changes lead up to the z3 pool being in this state? I
would suspect controller/cable issues with c5d0 and c6d0 if your root
pool is running fine. Otherwise, the hardware problem might be CPU,
memory...

Can you review the error messages in /var/adm/messages and also
`fmdump -eV' for clues?

Thanks,

Cindy

On 10/20/10 20:47, Harry Putnam wrote:
> build 133
> zpool version 22
> [...]
> Ran a new scrub.... Now I get the same message and it still lists all
> the files I deleted as being the problem.
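For reference, the checks Cindy suggests amount to something like the
following (a sketch: the pool name z3 and the disk names come from this
thread; egrep is used because its pattern alternation is portable):

    # fault management telemetry for recent events
    fmdump -eV | less

    # kernel/driver messages mentioning the two mirror disks
    egrep -i 'c5d0|c6d0' /var/adm/messages

    # clear the error counters, then re-scrub and re-check
    zpool clear z3
    zpool scrub z3
    zpool status -v z3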
Could you show us `iostat -En' please?

On 21 Oct 2010 13:31, "Harry Putnam" <reader at newsguy.com> wrote:

> Ian Collins <ian at ianshome.com> writes:
>> It looks like you have some dead hardware.
>
> If that were true, I wouldn't be able to read and write to that pool,
> would I?
>
> It's a mirrored pool of two 750GB WD disks (both new as of a couple of
> months ago).
Cindy Swearingen <cindy.swearingen at oracle.com> writes:

> Hi Harry,
>
> Generally, you need to use `zpool clear' to clear the pool errors, but
> I can't reproduce the removed files reappearing in zpool status on my
> own system when I corrupt data, so I'm not sure this will help. Some
> other, larger problem is going on here...
>
> Did any hardware changes lead up to the z3 pool being in this state? I
> would suspect controller/cable issues with c5d0 and c6d0 if your root
> pool is running fine. Otherwise, the hardware problem might be CPU,
> memory...
>
> Can you review the error messages in /var/adm/messages and also
> `fmdump -eV' for clues?

You didn't ask me to post that info, and so I haven't, but on review I
didn't have a clue what to make of it.

Since the output of:

    fmdump -eV -t 11Oct2010

was quite extensive, I have posted it for the days where I suspect the
problems would show (Oct 11-21) here (beware that it is 83,633 lines):

    http://www.jtan.com/~reader/fmdump_out.txt

I've learned from someone who was present when the activity described
below occurred that there was a point where the machine began beeping
in two different tones: two beeps high, two beeps low, with a pause
between. The machine could not be accessed and so was given a hard
shutdown. I was not present, but am told:

,----
| It took several reboots before the machine would start up and stay
| running.
`----

I'm not sure what occurred, or how to tell from the fmdump output. I'm
not even sure now exactly when this happened, but it was some few days,
or at least one day, before I started noticing the `zpool status'
reports.

Some of the problem seems to have cleared up after a series of
zpool scrub / zpool clear cycles. The message from zpool status is more
friendly looking now, but still will not clear.

------- --------- ---=--- --------- --------

zpool status z3

      pool: z3
     state: ONLINE
    status: One or more devices has experienced an error resulting in
            data corruption.  Applications may be affected.
    action: Restore the file in question if possible.  Otherwise restore
            the entire pool from backup.
       see: http://www.sun.com/msg/ZFS-8000-8A
     scrub: scrub completed after 2h2m with 83 errors on Thu Oct 21 23:46:23 2010
    config:

            NAME        STATE     READ WRITE CKSUM
            z3          ONLINE       0     0     0
              mirror-0  ONLINE       0     0     0
                c5d0    ONLINE       0     0     0
                c6d0    ONLINE       0     0     0

    errors: 36 data errors, use '-v' for a list

,----
| NOTE: The 36 errors mentioned are all references to files that were
| deleted several days ago.
|
| I've only included a few lines to show that they are not present.
|
| We look for the top-level directory from the output of `zpool status -v',
| which is in a format like this (wrapped for mail):
|
|   z3/projects@zfs-auto-snap:daily-2010-09-26-00:00:/prjtmp_bak/m2_d/\
|   sources/audio/FullclipRedo/bjp_t2'20100912 16.34.39_cp3.avi
|
| First we go to the problem `.zfs/snapshot' directory and try to list
| each snapshot directory named in the report:
|
|   cd /projects/.zfs/snapshot
|
|   zpool status -v z3 | \
|     awk -F '/' '/projects@/{sub(/^.*projects@/,"");print $1}' | \
|     xargs ls -d
|
|   ls: cannot access zfs-auto-snap:daily-2010-09-26-00:00:: No such file or directory
|   ls: cannot access zfs-auto-snap:daily-2010-09-26-00:00:: No such file or directory
|   ls: cannot access zfs-auto-snap:daily-2010-09-26-00:00:: No such file or directory
|   ls: cannot access zfs-auto-snap:daily-2010-09-26-00:00:: No such file or directory
|
| And so it goes for all 36. The parent directories of the files have
| been deleted (hence the repeated references above).
`----

Several cycles of zpool scrub / zpool clear have been performed since
the deletions (resulting only in the friendlier message referenced
above).

------- --------- ---=--- --------- --------

I'm going to attempt writing a semi-hefty amount of data to that mirror
now and see what happens.
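For a write test along those lines, something like this would do (a
sketch only: the mountpoint /z3 and the size are assumptions, not from
the thread):

    # write ~10GB to the pool; mkfile is the Solaris-native tool
    mkfile 10g /z3/bigtestfile

    # or write non-zero data instead
    dd if=/dev/urandom of=/z3/bigtestfile bs=1024k count=10240

    # then scrub again and watch the checksum counters
    zpool scrub z3
    zpool status -v z3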
Roy Sigurd Karlsbakk
2010-Oct-24 14:22 UTC
[zfs-discuss] When `zpool status' reports bad news
----- Original Message -----
> build 133
> zpool version 22
> [...]
> Ran a new scrub.... Now I get the same message and it still lists all
> the files I deleted as being the problem.

I'd guess you have either bad drives, bad cables, or a bad backplane.
I'd start by adding another drive (if there's room for it in the box)
and `zpool attach'-ing it. If there isn't room, `zpool detach' one
drive first, replace the drive, and attach the new one (after doing
some cfgadm magic - I can't remember the syntax - man cfgadm).

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy, it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.
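In concrete terms, the steps Roy describes look roughly like this (a
sketch: c7d0 is a hypothetical new disk; confirm the real device name
with `format' or `cfgadm -al' first):

    # option 1: room for a third disk - grow the mirror to a 3-way,
    # wait for the resilver to finish, then drop the suspect disk
    zpool attach z3 c5d0 c7d0
    zpool status z3
    zpool detach z3 c5d0

    # option 2: no room - replace a suspect disk in place
    zpool replace z3 c5d0 c7d0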
On Wed, 20 Oct 2010, Harry Putnam wrote:
>
> When I saw it I deleted all the files listed in status -v report as
> recommended on the report
>
> Ran a new scrub.... Now I get the same message and it still lists all
> the files I deleted as being the problem.

Note that if you have been using snapshots, "deleting" the file does
not cause its blocks to be freed if a snapshot still references them.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
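In practice, that means the errors only go away once the last snapshot
referencing the bad blocks is destroyed - roughly like this (a sketch:
the snapshot name is taken from Harry's listing earlier in the thread;
destroying a snapshot is irreversible, so double-check first):

    # list snapshots of the affected dataset
    zfs list -t snapshot -r z3/projects

    # destroy the snapshot(s) named in `zpool status -v'
    zfs destroy z3/projects@zfs-auto-snap:daily-2010-09-26-00:00

    # then clear and re-scrub
    zpool clear z3
    zpool scrub z3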
On 24.10.10 17:31, Bob Friesenhahn wrote:
> On Wed, 20 Oct 2010, Harry Putnam wrote:
>> [...]
>
> Note that if you have been using snapshots, "deleting" the file does
> not cause its blocks to be freed if a snapshot still references them.
>
> Bob

Hmm... that makes sense, but what about this:

errors: Permanent errors have been detected in the following files:

        obelixData/JvMpreprint@2010-10-02_2359:/DTP/Jobs/Mercedes-Benz/C_Klasse/RZ in CI vor ET 10.6.2010/13404_41_08017 S_204 Konvertierung_Datenversand/S_204 C-Klasse Konvertierung/Dealer_Launch_Standard/DL_Flyer_standard_4pp_CS2/Links/Vorhang_Innen.eps
        obelixData/JvMpreprint@2010-10-02_2359:/DTP/Jobs/Mercedes-Benz/C_Klasse/RZ in CI vor ET 10.6.2010/13404_41_07008 Estate HandelsMarketing/Dealer_Launch_Invitations Fremddokumente/Dealer_Launch_S204/Images/Vorhang_Innen.eps
        obelixData/JvMpreprint@2010-10-02_2359:/DTP/Jobs/Mercedes-Benz/C_Klasse/RZ in CI vor ET 10.6.2010/13404_41_07008 Estate HandelsMarketing/Salesfolder Estate RZ/Inhalt_Bilder_PS-Dateien/DL_Flyer_standard_4pp.ps
        obelixData/JvMpreprint@2010-10-06_2359:/DTP/Jobs/Mercedes-Benz/C_Klasse/RZ in CI vor ET 10.6.2010/13404_41_07008 Estate HandelsMarketing/Dealer_Launch_Invitations Fremddokumente/Dealer_Launch_S204/Images/Vorhang_Innen.eps
        <0x3b2>:<0x1bca86>
        <0x3b2>:<0x1bba92>
        <0x3b2>:<0x1bbeba>

I believe that the lower ones were files in snapshots that have since
been deleted, but why are they still referenced like this?

budy

--
Stephan Budach
Jung von Matt/it-services GmbH
Glashüttenstraße 79
20357 Hamburg

Tel: +49 40-4321-1353
Fax: +49 40-4321-1114
E-Mail: stephan.budach at jvm.de
Internet: http://www.jvm.com

Geschäftsführer: Ulrich Pallas, Frank Wilhelm
AG HH HRB 98380
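For what it's worth, entries of the form <0x3b2>:<0x1bca86> are
dataset-id:object-id pairs that ZFS can no longer resolve to a path,
typically because the file or snapshot is gone. If you want to dig at
them anyway, something like this can help (a sketch only: zdb is an
unstable diagnostic tool whose output varies by build, and the dataset
name below is a placeholder):

    printf '%d\n' 0x3b2                 # dataset id 0x3b2 -> 946 decimal
    zdb -d obelixData | grep 'ID 946'   # find the dataset with that id
    # dump object 0x1bca86 (1821318 decimal) from that dataset
    zdb -dddd obelixData/<dataset> 1821318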
On Sun, 24 Oct 2010, Stephan Budach wrote:
>
> I believe that the lower ones were files in snapshots that have since
> been deleted, but why are they still referenced like this?

Have you used `zpool clear' to clear the errors in the pool?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On 25.10.10 01:48, Bob Friesenhahn wrote:
> On Sun, 24 Oct 2010, Stephan Budach wrote:
>>
>> I believe that the lower ones were files in snapshots that have since
>> been deleted, but why are they still referenced like this?
>
> Have you used `zpool clear' to clear the errors in the pool?
>
> Bob

Yes, I did - several times. The last time, I ran zpool clear right
before I started the scrub.

budy

--
Stephan Budach
Jung von Matt/it-services GmbH
Glashüttenstraße 79
20357 Hamburg

Tel: +49 40-4321-1353
Fax: +49 40-4321-1114
E-Mail: stephan.budach at jvm.de
Internet: http://www.jvm.com

Geschäftsführer: Ulrich Pallas, Frank Wilhelm
AG HH HRB 98380