I've got a small home fileserver, Chenowith case with 8 hot-swap bays. Of course, at this level, I don't have cute little lights next to each drive that the OS knows about and can control to indicate things to me.

The configuration I think I have is three mirror pairs. I've got motherboard SATA connections, and an add-in SAS card with SAS-to-SATA cabling (all drives are SATA), and I've tried to wire it so each mirror is split across the two controllers.

However -- the old disks were already a pool before. So if I put them in the "wrong" physical slots, when I imported the pool it would have still found them. So I could have the disks in slots that aren't what I expected, without knowing it.

I'm planning to upgrade the first mirror by attaching new, larger drives, letting the resilver finish, and eventually detaching the old drives. I just installed the first new drive, located what controller it was on, and typed an attach command that did what I wanted:

    bash-4.0$ zpool status zp1
      pool: zp1
     state: ONLINE
    status: One or more devices is currently being resilvered.  The pool will
            continue to function, possibly in a degraded state.
    action: Wait for the resilver to complete.
     scrub: resilver in progress for 0h4m, 3.13% done, 2h5m to go
    config:

            NAME        STATE     READ WRITE CKSUM
            zp1         ONLINE       0     0     0
              mirror-0  ONLINE       0     0     0
                c9t3d0  ONLINE       0     0     0
                c5t1d0  ONLINE       0     0     0
                c9t5d0  ONLINE       0     0     0  14.0G resilvered
              mirror-1  ONLINE       0     0     0
                c9t4d0  ONLINE       0     0     0
                c6t1d0  ONLINE       0     0     0
              mirror-2  ONLINE       0     0     0
                c9t2d0  ONLINE       0     0     0
                c5t0d0  ONLINE       0     0     0

    errors: No known data errors

As you can see, the new drive being resilvered is in fact associated with the first mirror, as I had intended. (The old drives in the first mirror are older than in the second two, and all three are the same size, so that's definitely the one to replace first.)

HOWEVER...the activity lights on the drives aren't doing what I expect.
The activity light on the new drive is on pretty solidly (that I expected), but the OTHER activity puzzles me. (User activity is so close to nil that I'm quite confident that's not confusing me; 95%+ of the access right now is the resilver. Besides, usage could light up other drives, but it couldn't turn off the lights on the ones being resilvered.)

At first, I saw the second drive in the rack light up. I believe that to be c5t1d0, the second disk in mirror-0, and it's the drive I specified as the old drive in the attach command. However, soon I started seeing the fourth drive in the rack light up. I believe that to be c6t1d0, which is part of mirror-1 and thus has no place in this resilver. It remained active. And after a while, the second drive's activity light went off. For some minutes now, I've been seeing activity ONLY on the new drive and on drive 4 (the one I don't think is part of mirror-0).

The activity lights aren't connected by separate cables, so I don't see how I could have them hooked up differently from the disks.

It's clear from zpool status that I have attached the new drive to the right mirror. So things are fine for now; I can let the resilver run to completion. I can detach one of the old drives fine, because that's done with logical names, and those are shown in zpool status, so I have no doubt which logical names are the old drives in mirror-0.

However, eventually it will be time to physically remove the old drives. If I remove only one at a time, I "shouldn't" cause a disaster even if I pull the wrong one, and I can tell by checking zpool status right away whether I pulled the right or wrong one. But this gets me into what I regard as risky territory -- if I pull a live drive, I'm going to suddenly need to know the commands needed to reattach it. Can somebody point me at clear examples of that (or post them)?
I just found zpool iostat -v; now that I'm seeing traffic on the individual drives in the pool, it's clearly reading from both the old drives and writing to the new drive, exactly as expected. But only one activity light is lit on any of the old drives.

Is there a clever way to figure out which drive is which? And if I have to fall back on removing a drive I think is right, and seeing if that's true, what admin actions will I have to perform to get the pool back to safety? (I've got backups, but it's a pain to restore of course.)

(Hmmm; in single-user mode, use dd to read huge chunks of one disk, and see which lights come on? Do I even need to be in single-user mode to do that?)

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
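For the "what do I type if I pull the wrong drive" question, here is a minimal sketch. It assumes the drive goes straight back into the same bay and reappears under the same device name. The pool name zp1 and the device c5t1d0 are taken from the status output above; the helper name print_recovery_steps is made up for illustration. The function only prints the commands so they can be reviewed before anything is run.

```shell
#!/bin/sh
# Print (do not run) a typical sequence for bringing back a mirror
# member that was pulled by mistake and then reinserted.
# print_recovery_steps is a hypothetical helper name.
print_recovery_steps() {
    pool=$1
    dev=$2
    echo "zpool status -x $pool"      # see which device ZFS reports as missing
    echo "zpool online $pool $dev"    # ask ZFS to reopen the reinserted disk
    echo "zpool clear $pool $dev"     # reset the error counters on that disk
    echo "zpool status $pool"         # watch the catch-up resilver
}

print_recovery_steps zp1 c5t1d0
```

Depending on the controller, the reinserted disk may also have to be made visible to the OS again (for example with cfgadm) before zpool online succeeds; that step is hardware-specific and is not shown here.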
On Feb 5, 2011, at 2:43 PM, David Dyer-Bennet wrote:

> Is there a clever way to figure out which drive is which? And if I have
> to fall back on removing a drive I think is right, and seeing if that's
> true, what admin actions will I have to perform to get the pool back to
> safety? (I've got backups, but it's a pain to restore of course.)
>
> (Hmmm; in single-user mode, use dd to read huge chunks of one disk, and
> see which lights come on? Do I even need to be in single-user mode to
> do that?)

Obviously this depends on your lights working to some extent (the right light doing something when the right disk is accessed), but I've used:

    dd if=/dev/rdsk/c8t3d0s0 of=/dev/null bs=4k count=100000

which someone mentioned on this list. Assuming you can actually read from the disk (it isn't completely dead), it should allow you to direct traffic to each drive individually.

Good luck,
Ware
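The same dd read can be walked across every disk in the pool, one at a time, to see which bay lights up for each. The following is only a sketch: the device names are the ones from the zpool status output earlier in the thread, the s0 slice follows the command above and may differ on other systems, and blink_each and DRY_RUN are hypothetical names. With DRY_RUN=1 (the default here) it prints the commands instead of running them.

```shell
#!/bin/sh
# Read from one disk at a time so that only its activity light should blink.
# DRY_RUN=1 prints the dd commands instead of running them.
# blink_each is a hypothetical helper name; s0 follows the slice used above.
DRY_RUN=${DRY_RUN:-1}

blink_each() {
    for dev in "$@"; do
        cmd="dd if=/dev/rdsk/${dev}s0 of=/dev/null bs=4k count=100000"
        echo "== $dev =="
        if [ "$DRY_RUN" = "1" ]; then
            echo "$cmd"
        else
            $cmd          # read-only: the disk is the input, /dev/null the output
            sleep 5       # pause so the lights settle before the next disk
        fi
    done
}

blink_each c9t3d0 c5t1d0 c9t4d0 c6t1d0 c9t2d0 c5t0d0
```

Running it with DRY_RUN=0 needs enough privilege to read the raw devices, and the result is only as trustworthy as the lights themselves.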
Will this not ruin the zpool? If you overwrite one of the discs in the zpool, won't the zpool break, so that you need to repair it?

--
This message posted from opensolaris.org
> Will this not ruin the zpool? If you overwrite one of discs in the
> zpool won't the zpool go broke, so you need to repair it?

The suggested command, dd if=/dev/rdsk/c8t3d0s0 of=/dev/null bs=4k count=100000, reads from the disk and will do its best to overwrite /dev/null, which the system is likely to allow :P

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
On 2011-02-06 05:58, Orvar Korvar wrote:

> Will this not ruin the zpool? If you overwrite one of discs in the
> zpool won't the zpool go broke, so you need to repair it?

Without quoting I can't tell what you think you're responding to, but from my memory of this thread, I THINK you're forgetting how dd works. The dd commands being proposed to create drive traffic are all read-only accesses, so they shouldn't damage anything.

--
David Dyer-Bennet
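The read-only point can be checked without going near a real disk. In this small sketch a scratch file stands in for the device (an assumption made purely for the demonstration): its checksum is taken before and after a dd run that uses it as input and /dev/null as output.

```shell
#!/bin/sh
# Show that dd with of=/dev/null does not modify its input.
# A scratch file stands in for the disk; nothing here touches a real device.
scratch=$(mktemp)
printf 'pretend this is disk data\n' > "$scratch"

before=$(cksum < "$scratch")
dd if="$scratch" of=/dev/null bs=4k count=100000 2>/dev/null
after=$(cksum < "$scratch")

if [ "$before" = "$after" ]; then
    result="unchanged"
else
    result="modified"
fi
echo "input is $result"
rm -f "$scratch"
```

The danger with dd lies in swapping if= and of=; as long as the disk only ever appears after if=, the command can only read from it.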
David Dyer-Bennet
2011-Feb-06 17:49 UTC
[zfs-discuss] Identifying drives (SATA), question about hot spare allocation
Following up to myself, I think I've got things sorted, mostly.

1. The thing I was most sure of, I was wrong about. Some years back, I must have split the mirrors so that they used different brand disks. I probably did this, maybe even accidentally, when I had to restore from backups at one point. I suppose I could have physically labeled the carriers...no, that's crazy talk!

2. The dd trick doesn't produce reliable activity light activation in my system. I think some of the drives and/or controllers only turn on the activity light for writes.

3. However, in spite of all this, I have replaced the disks in mirror-0 with the bigger disks (via attach-new-resilver-detach-old), and added the third drive I bought as a hot spare. All without having to restore from backups.

4. AND I know which physical drive the detached 400GB drive is. It occurs to me I could make that a second hot spare -- there are 4 remaining 400GB drives in the pool, so it's useful for 2/3 of the failures by drive count.

Leading to a new question -- is ZFS smart about hot spare sizes? Will it skip over too-small drives? Will it, even better, prefer smaller drives to larger so long as they are big enough (thus leaving the big drives for bigger failures)?

--
David Dyer-Bennet
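The attach-new-resilver-detach-old sequence from item 3, followed by adding a spare, can be sketched as below. The pool zp1, the old disk c5t1d0 and the new disk c9t5d0 come from earlier in the thread; the spare's device name c9t6d0 and the helper name upgrade_plan are placeholders, since the post does not give them. The function prints the commands rather than running them.

```shell
#!/bin/sh
# Print (do not run) the attach / resilver / detach sequence for one mirror,
# followed by adding a hot spare. upgrade_plan is a hypothetical helper name.
upgrade_plan() {
    pool=$1; old=$2; new=$3; spare=$4
    echo "zpool attach $pool $old $new"   # new disk joins the mirror holding the old one
    echo "zpool status $pool"             # repeat until the resilver has completed
    echo "zpool detach $pool $old"        # only after the resilver is done
    echo "zpool add $pool spare $spare"   # extra disk becomes a hot spare
}

# zp1, c5t1d0 and c9t5d0 appear earlier in the thread; c9t6d0 is a placeholder.
upgrade_plan zp1 c5t1d0 c9t5d0 c9t6d0
```

The attach and detach steps are repeated for each old disk in the mirror; the mirror only gains the extra capacity once every member is one of the larger disks.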
Heh. My bad. Didn't read the command. Yes, that should be safe.
Roy, I read your question on the OpenIndiana mailing lists: how can you rebalance your huge raid without implementing block pointer rewrite? You have an old vdev full of data, and now you have added a new vdev, and you want the data to be evenly spread out across all vdevs.

I answer here because it is easier for me than mailing OpenIndiana. I think it should work to create a new zfs filesystem, which will reside on all vdevs, and then move your old data to the new filesystem. Then all data will be evenly spread out.
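One way to do that move is with a snapshot and zfs send/receive, sketched below under several assumptions: the dataset names tank/data and tank/data.new, the snapshot name rebalance, and the helper name rebalance_plan are all hypothetical, and the pool needs enough free space to hold a second copy for a while. The function prints the commands rather than running them. How even the resulting spread is depends on the allocator and on how much free space each vdev has.

```shell
#!/bin/sh
# Print (do not run) one way to rewrite a dataset so that its blocks are
# allocated afresh across all current vdevs. rebalance_plan and the
# dataset names are hypothetical.
rebalance_plan() {
    src=$1; dst=$2; snap=$3
    echo "zfs snapshot $src@$snap"
    echo "zfs send $src@$snap | zfs receive $dst"
    echo "zfs destroy -r $src"      # only after verifying the copy
    echo "zfs rename $dst $src"
}

rebalance_plan tank/data tank/data.new rebalance
```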