This is the current state of my pool:

ethan at save:~# zpool import
  pool: q
    id: 5055543090570728034
 state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: sun.com/msg/ZFS-8000-5E
config:

        q             UNAVAIL  insufficient replicas
          raidz1      UNAVAIL  insufficient replicas
            c9t4d0s8  UNAVAIL  corrupted data
            c9t5d0s2  ONLINE
            c9t2d0s8  UNAVAIL  corrupted data
            c9t1d0s8  UNAVAIL  corrupted data
            c9t0d0s8  UNAVAIL  corrupted data

Back story: I was previously running and using this pool on Linux using
zfs-fuse. One day the zfs-fuse daemon behaved strangely: zpool and zfs
commands gave a message about not being able to connect to the daemon, but
the filesystems in pool q were still available and seemed to be working
correctly. I started the zfs-fuse daemon again. I'm not sure whether that
left two daemons running, since the filesystems were still available but I
couldn't get any response from the zfs or zpool commands. I then decided
just to reboot instead.

After rebooting, the pool appeared to import successfully, but `zfs list`
showed no filesystems. I rebooted again, not really having any better ideas.
After that, `zpool import` just hung forever.

I decided I should get off the fuse/Linux implementation and use a more
recent version of ZFS in its native environment, so I installed OpenSolaris.
I had been running the pool on truecrypt-encrypted volumes, so I copied the
data off the encrypted volumes onto blank disks and moved them to
OpenSolaris. I got the output above when I tried to import.

Now I have no idea where to go from here. It doesn't seem like my data
should just be gone - there is no problem with the physical drives, and it
seems unlikely that a misbehaving zfs-fuse would completely corrupt the data
on 4 out of 5 drives (or so I am hoping). Is there any hope for my data? I
have some not-very-recent backups of some fraction of it, but if recovering
this is possible, that would of course be infinitely preferable.

-Ethan
On Tue, Feb 16, 2010 at 10:06:13PM -0500, Ethan wrote:
> This is the current state of my pool:
>
> ethan at save:~# zpool import
>   pool: q
>     id: 5055543090570728034
>  state: UNAVAIL
> status: One or more devices contains corrupted data.
> action: The pool cannot be imported due to damaged devices or data.
>    see: sun.com/msg/ZFS-8000-5E
> config:
>
>         q             UNAVAIL  insufficient replicas
>           raidz1      UNAVAIL  insufficient replicas
>             c9t4d0s8  UNAVAIL  corrupted data
>             c9t5d0s2  ONLINE
>             c9t2d0s8  UNAVAIL  corrupted data
>             c9t1d0s8  UNAVAIL  corrupted data
>             c9t0d0s8  UNAVAIL  corrupted data
>
> back story:
> I was previously running and using this pool on linux using zfs-fuse.

Two things to try:

 - import -F (with -n, the first time) on a recent build

 - zdb -l <dev> for each of the devs above; compare and/or post. This
   helps ensure that you copied correctly, with respect to all the various
   translation, labelling and partitioning differences between the
   platforms. Since you apparently got at least one right, hopefully this
   is less of an issue if you did the same for all.

--
Dan.
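For reference, a rough sketch of those two checks, assuming the device names
from the listing above (the -F/-n recovery options only exist on recent
builds, not on older releases such as snv_111):

# dry-run pool recovery: report what a rollback import would do, without doing it
zpool import -nF q

# dump the ZFS vdev labels from each device; all four labels should unpack cleanly
for d in c9t4d0s8 c9t5d0s2 c9t2d0s8 c9t1d0s8 c9t0d0s8; do
    zdb -l /dev/dsk/$d
done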
On Wed, Feb 17, 2010 at 02:30:28PM +1100, Daniel Carosone wrote:
> >             c9t4d0s8  UNAVAIL  corrupted data
> >             c9t5d0s2  ONLINE
> >             c9t2d0s8  UNAVAIL  corrupted data
> >             c9t1d0s8  UNAVAIL  corrupted data
> >             c9t0d0s8  UNAVAIL  corrupted data

> - zdb -l <dev> for each of the devs above; compare and/or post. This
>   helps ensure that you copied correctly, with respect to all the various
>   translation, labelling and partitioning differences between the
>   platforms. Since you apparently got at least one right, hopefully this
>   is less of an issue if you did the same for all.

Actually, looking again, is there any significance to the fact that s2
on one disk is OK, while s8 on the others is not? Perhaps start with
the zdb -l, and make sure you're pointing at the right data before
importing.

--
Dan.
On Tue, Feb 16, 2010 at 22:35, Daniel Carosone <dan at geek.com.au> wrote:
> Actually, looking again, is there any significance to the fact that s2
> on one disk is OK, while s8 on the others is not? Perhaps start with
> the zdb -l, and make sure you're pointing at the right data before
> importing.

I do not know if there is any significance to the okay disk being s2 and the
others s8 - in fact I do not know what the numbers mean at all, being out of
my element in OpenSolaris (but trying to learn as much as I can, as quickly
as I can).

As for the copying, all that I did was `dd if=<the truecrypt volume> of=<the
new disk>` for each of the five disks.

The output of zdb -l looks identical for each of the five volumes, apart
from guid. I have pasted one of them below.

Thanks for your help.

-Ethan

ethan at save:/dev# zdb -l dsk/c9t2d0s8
--------------------------------------------
LABEL 0
--------------------------------------------
    version=13
    name='q'
    state=1
    txg=361805
    pool_guid=5055543090570728034
    hostid=8323328
    hostname='that'
    top_guid=441634638335554713
    guid=13840197833631786818
    vdev_tree
        type='raidz'
        id=0
        guid=441634638335554713
        nparity=1
        metaslab_array=23
        metaslab_shift=32
        ashift=9
        asize=7501483868160
        is_log=0
        children[0]
                type='disk'
                id=0
                guid=4590162841999933602
                path='/dev/mapper/truecrypt3'
                whole_disk=0
        children[1]
                type='disk'
                id=1
                guid=12502103998258102871
                path='/dev/mapper/truecrypt2'
                whole_disk=0
        children[2]
                type='disk'
                id=2
                guid=13840197833631786818
                path='/dev/mapper/truecrypt1'
                whole_disk=0
        children[3]
                type='disk'
                id=3
                guid=3763020893739678459
                path='/dev/mapper/truecrypt5'
                whole_disk=0
        children[4]
                type='disk'
                id=4
                guid=4929061713231157616
                path='/dev/mapper/truecrypt4'
                whole_disk=0
--------------------------------------------
LABEL 1
--------------------------------------------
    (contents identical to LABEL 0)
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
On Feb 16, 2010, at 7:57 PM, Ethan wrote:
> On Tue, Feb 16, 2010 at 22:35, Daniel Carosone <dan at geek.com.au> wrote:
> > >             c9t4d0s8  UNAVAIL  corrupted data
> > >             c9t5d0s2  ONLINE
> > >             c9t2d0s8  UNAVAIL  corrupted data
> > >             c9t1d0s8  UNAVAIL  corrupted data
> > >             c9t0d0s8  UNAVAIL  corrupted data
> [...]
> The output of zdb -l looks identical for each of the five volumes, apart
> from guid. I have pasted one of them below.
> [...]
> --------------------------------------------
> LABEL 2
> --------------------------------------------
> failed to unpack label 2
> --------------------------------------------
> LABEL 3
> --------------------------------------------
> failed to unpack label 3

Slice 8 tends to be tiny and slice 2 is the whole disk, which is why
you can't find labels 2 or 3, which are at the end of the disk.

Try exporting the pool and then import.
 -- richard

ZFS storage and performance consulting at RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
nexenta-atlanta.eventbrite.com (March 15-17, 2010)
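For background on why labels 2 and 3 go missing (an illustration, not
something stated in the thread): ZFS keeps four copies of the vdev label,
two at the front of the device and two in the last 512 KB, so a device that
appears shorter than the one the pool was created on will only show labels 0
and 1. Roughly:

| L0 | L1 | boot reserve | ...... allocatable pool space ...... | L2 | L3 |
 256K  256K                                                      256K  256K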
On Tue, Feb 16, 2010 at 23:24, Richard Elling <richard.elling at gmail.com> wrote:
> Slice 8 tends to be tiny and slice 2 is the whole disk, which is why
> you can't find labels 2 or 3, which are at the end of the disk.
>
> Try exporting the pool and then import.
>  -- richard

The pool is never successfully importing, so I can't export. The import just
gives the output I pasted, and the pool is not imported.

If slice 2 is the whole disk, why is zpool trying to use slice 8 for all but
one disk? Can I explicitly tell zpool to use slice 2 for each device?

-Ethan
On Tue, Feb 16, 2010 at 11:39:39PM -0500, Ethan wrote:
> If slice 2 is the whole disk, why is zpool trying to use slice 8 for all
> but one disk?

Because it's finding at least part of the labels for the pool member there.

Please check the partition tables of all the disks, and use zdb -l on
the various partitions, to make sure that you haven't got funny
offsets or other problems hiding the data from import.

In a default Solaris label, s2 and s8 start at cylinder 0 but are
vastly different sizes. You need to arrange for your labels to match
however the data you copied got laid out.

> Can I explicitly tell zpool to use slice 2 for each device?

Not for import, only at creation time. On import, devices are chosen
by inspection of the ZFS labels within. zdb -l will print those for
you; when you can see all 4 labels for all devices your import has a
much better chance of success.

--
Dan.
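Concretely, checking one disk's slice layout against its ZFS labels might
look something like this (the slice numbers are just the ones from the
listing above):

# show the slice table for one disk
prtvtoc /dev/rdsk/c9t2d0s2

# see which slices expose readable ZFS labels
zdb -l /dev/dsk/c9t2d0s2
zdb -l /dev/dsk/c9t2d0s8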
On Tue, Feb 16, 2010 at 23:57, Daniel Carosone <dan at geek.com.au> wrote:
> In a default Solaris label, s2 and s8 start at cylinder 0 but are
> vastly different sizes. You need to arrange for your labels to match
> however the data you copied got laid out.
>
> Not for import, only at creation time. On import, devices are chosen
> by inspection of the ZFS labels within. zdb -l will print those for
> you; when you can see all 4 labels for all devices your import has a
> much better chance of success.

How would I go about arranging labels?

I only see labels 0 and 1 (and do not see labels 2 and 3) on every device,
for both slice 8 (which makes sense if 8 is just part of the drive; the ZFS
devices take up the whole drive) and slice 2 (which doesn't seem to make
sense to me).

Since only two of the four labels are showing up for each of the drives on
both slice 2 and slice 8, I guess that causes zpool to not have a preference
between slice 2 and slice 8? So it just picks whichever it sees first, which
happened to be slice 2 for one of the drives, but 8 for the others? (I am
really just guessing at this.)

So, on one hand, the fact that it says slice 2 is online for one drive makes
me think that if I could get it to use slice 2 for the rest, maybe it would
work. On the other hand, the fact that I can't see labels 2 and 3 on slice 2
for any drive (even the one that says it's online) is worrisome, and I want
to figure out what's up with that.

Labels 2 and 3 _do_ show up (and look right) in zdb -l running in zfs-fuse
on Linux, on the truecrypt volumes.

If it might just be a matter of arranging the labels so that the beginning
and end of a slice are in the right places, that sounds promising, although
I have no idea how to go about arranging labels. Could you point me in the
direction of what utility I might use, or some documentation to get me
started in that direction?

Thanks,
-Ethan
On Wed, Feb 17, 2010 at 00:27, Ethan <notethan at gmail.com> wrote:
> On the other hand, the fact that I can't see labels 2 and 3 on slice 2
> for any drive (even the one that says it's online) is worrisome, and I
> want to figure out what's up with that.

And I just realized - yes, labels 2 and 3 are in the wrong place relative to
the end of the drive; I did not take into account the overhead taken up by
truecrypt when dd'ing the data. The raw drive is 1500301910016 bytes; the
truecrypt volume is 1500301647872 bytes. That is off by 262144 bytes - I
need a slice that is sized like the truecrypt volume.
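The arithmetic behind that, for reference:

  1500301910016 bytes   (raw drive)
- 1500301647872 bytes   (truecrypt volume, i.e. the size of each dd image)
---------------------
         262144 bytes   (= 256 KiB of truecrypt overhead per drive)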
On Wed, Feb 17, 2010 at 12:31:27AM -0500, Ethan wrote:
> And I just realized - yes, labels 2 and 3 are in the wrong place relative
> to the end of the drive; I did not take into account the overhead taken up
> by truecrypt when dd'ing the data. The raw drive is 1500301910016 bytes;
> the truecrypt volume is 1500301647872 bytes. That is off by 262144 bytes -
> I need a slice that is sized like the truecrypt volume.

It shouldn't matter if the slice is larger than the original; this is
how autoexpand works. Labels 0 and 1 should be near the start, and labels
2 and 3 near the logical end.

Did this resolve the issue? You didn't say, and I have my doubts. I'm
not sure this is your problem, but it seems you're on the track to
finding the real problem.

In the labels you can see, are the txgs the same for all pool
members? If not, you may still need import -F, once all the
partitioning gets sorted out.

Also, re-reading what I wrote above, I realised I was being ambiguous
in my use of "label". Sometimes I meant the ZFS labels that zdb -l
prints, and sometimes I meant the VTOC that format uses for slices. In
the BSD world we call those labels too, and I didn't realise I was
mixing terms. Sorry for any confusion, but it seems you figured out
what I meant :)

--
Dan.
On Wed, Feb 17, 2010 at 15:22, Daniel Carosone <dan at geek.com.au> wrote:
> Did this resolve the issue? You didn't say, and I have my doubts. I'm
> not sure this is your problem, but it seems you're on the track to
> finding the real problem.

I have not yet successfully imported. I can see two ways of making progress.

One is forcing zpool to attempt the import using slice 2 for each disk
rather than slice 8. If this is how autoexpand works, as you say, it seems
like it should work fine here. But I don't know how, or whether it is even
possible, to make it use slice 2.

The other is to make a slice that is the correct size of the volumes as I
had them before (262144 bytes less than the size of the disk). It seems like
this should cause zpool to prefer that slice over slice 8, since it can find
all 4 labels there rather than just labels 0 and 1. I don't know how to go
about this either, or whether it's possible. I have been starting to read
documentation on slices in Solaris but haven't had time to get far enough to
figure out what I need.

I also have my doubts about this solving my actual issue - the one that
caused me to be unable to import in zfs-fuse. But I need to solve this issue
before I can move on to figuring out and solving whatever that was.

The txg is the same for every volume.

-Ethan
On Wed, Feb 17, 2010 at 03:37:59PM -0500, Ethan wrote:
> I have not yet successfully imported. I can see two ways of making
> progress. One is forcing zpool to attempt the import using slice 2 for
> each disk rather than slice 8. If this is how autoexpand works, as you
> say, it seems like it should work fine here. But I don't know how, or
> whether it is even possible, to make it use slice 2.

Just get rid of 8? :-)

Normally, when using the whole disk, convention is that slice 0 is
used, and there's a small initial offset (for the EFI label). I think
you probably want to make a slice 0 that spans the right disk sectors.

Were you using some other partitioning inside the truecrypt "disks"?
What devices were given to zfs-fuse, and what was their starting
offset? You may need to account for that, too. How did you copy the
data, to what target device, and on what platform? Perhaps the
truecrypt device's partition table is now at the start of the physical
disk, but Solaris can't read it properly. If that's an MBR partition
table (which you look at with fdisk), you could try zdb -l on
/dev/dsk/c...p[01234] as well.

We're just guessing here. To provide more concrete help, you'll need to
show us some of the specifics, both of what you did and what you've ended
up with. fdisk and format partition tables and zdb -l output would be a
good start.

Figuring out what is different about the disk where s2 was used would
be handy too. That may be a synthetic label, because something is
missing from that disk that the others have.

> The other is to make a slice that is the correct size of the volumes as I
> had them before (262144 bytes less than the size of the disk). It seems
> like this should cause zpool to prefer that slice over slice 8, since it
> can find all 4 labels there rather than just labels 0 and 1. I don't know
> how to go about this either, or whether it's possible.

format will let you examine and edit these. Start by making sure they
all have the same partitioning, flags, etc.

> I also have my doubts about this solving my actual issue - the one that
> caused me to be unable to import in zfs-fuse. But I need to solve this
> issue before I can move on to figuring out and solving whatever that was.

Yeah - my suspicion is that import -F may help here. That is a pool
recovery mode, where it rolls back progressive transactions until it
finds one that validates correctly. It was only added recently and is
probably missing from the fuse version.

--
Dan.
On Thu, Feb 18, 2010 at 08:14:03AM +1100, Daniel Carosone wrote:
> I think you probably want to make a slice 0 that spans the right disk
> sectors.
[..]
> you could try zdb -l on /dev/dsk/c...p[01234] as well.

Depending on how and what you copied, you may have ZFS data that starts
at sector 0, with no space for any partitioning labels at all. If
zdb -l /dev/rdsk/c..p0 shows a full set, this is what has happened.
Trying to write partition tables may overwrite some of the ZFS labels.

ZFS won't import such a pool by default (it doesn't check those
devices). You could cheat by making a directory with symlinks to the
p0 devices and using import -d, but this will not work at boot. It
would be a way to verify the current state, so you can then plan next
steps.

--
Dan.
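A minimal sketch of that workaround, assuming the pool data really does
start at sector 0 of each new disk (directory name is arbitrary; device
names are the ones from this thread):

mkdir /var/tmp/qdsk
for d in c9t0d0 c9t1d0 c9t2d0 c9t4d0 c9t5d0; do
    ln -s /dev/dsk/${d}p0 /var/tmp/qdsk/
done

# confirm all four labels are now visible, then import from that directory
zdb -l /var/tmp/qdsk/c9t0d0p0
zpool import -d /var/tmp/qdsk q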
On Wed, Feb 17, 2010 at 16:14, Daniel Carosone <dan at geek.com.au> wrote:
> Just get rid of 8? :-)

That sounds like an excellent idea, but, being very new to OpenSolaris, I
have no idea how to do this. I'm reading through
multiboot.solaris-x86.org/iv/3.html at the moment. You mention the format
utility below, which I will read more into.

> Were you using some other partitioning inside the truecrypt "disks"?
> What devices were given to zfs-fuse, and what was their starting
> offset? You may need to account for that, too. How did you copy the
> data, to what target device, and on what platform?

There was no partitioning on the truecrypt disks. The truecrypt volumes
occupied the whole raw disks (1500301910016 bytes each). The devices that I
gave to the zpool on Linux were the whole raw devices that truecrypt exposed
(1500301647872 bytes each). There were no partition tables on either the raw
disks or the truecrypt volumes - just truecrypt headers on the raw disks and
ZFS on the truecrypt volumes.

I copied the data simply using

    dd if=/dev/mapper/truecrypt1 of=/dev/sdb

on Linux, where /dev/mapper/truecrypt1 is the truecrypt volume for one hard
disk (which was on /dev/sda) and /dev/sdb is a new blank drive of the same
size as the old drive (but slightly larger than the truecrypt volume). And
likewise for each of the five drives.

The labels 2 and 3 should be on the drives, but they are 262144 bytes
further from the end of slice 2 than zpool must be looking.

I could create a partition table on each drive, specifying a partition with
the size of the truecrypt volume, and re-copy the data onto that partition
(it would have to be a re-copy, since creating the partition table would
overwrite ZFS data, as the ZFS data starts at byte 0). Would this be
preferable? I was under some impression that zpool devices were preferred to
be raw drives, not partitions, but I don't recall where I came to believe
that, much less whether it's at all correct.

> format will let you examine and edit these. Start by making sure they
> all have the same partitioning, flags, etc.

I will have a look at format, but if it operates on partition tables, well,
my disks have none at the moment, so I'll have to remedy that.

> Yeah - my suspicion is that import -F may help here.

As for using import -F, I am on snv_111b, which I am not sure has -F for
import. I tried to update to the latest dev build (using the instructions at
pkg.opensolaris.org/dev/en/index.shtml), but things are behaving very
strangely. I get error messages on boot - "gconf-sanity-check-2 exited with
error status 256" - and when I dismiss this and go into GNOME, the terminal
is messed up and doesn't echo anything I type, and I can't ssh in (error
message about not being able to allocate a TTY). Anyway, the zfs mailing
list isn't really the place to be discussing that, I suppose.

-Ethan
On Wed, Feb 17, 2010 at 16:25, Daniel Carosone <dan at geek.com.au> wrote:
> Depending on how and what you copied, you may have ZFS data that starts
> at sector 0, with no space for any partitioning labels at all. If
> zdb -l /dev/rdsk/c..p0 shows a full set, this is what has happened.
> Trying to write partition tables may overwrite some of the ZFS labels.
>
> ZFS won't import such a pool by default (it doesn't check those
> devices). You could cheat by making a directory with symlinks to the
> p0 devices and using import -d, but this will not work at boot.

It looks like using p0 is exactly what I want, actually. Are s2 and p0 both
the entire disk?

The idea of symlinking to the full-disk devices from a directory and using
-d had crossed my mind, but I wasn't sure about it. I think that is
something worth trying. I'm not too concerned about it not working at boot -
I just want to get something working at all, at the moment.

-Ethan
On Wed, Feb 17, 2010 at 04:48:23PM -0500, Ethan wrote:
> It looks like using p0 is exactly what I want, actually. Are s2 and p0
> both the entire disk?

No. s2 depends on there being a Solaris partition table (Sun or EFI),
and if there's also an fdisk partition table (disk shared with another
OS), s2 will only cover the Solaris part of the disk. It also
typically doesn't cover the last 2 cylinders, which Solaris calls
"reserved" for hysterical raisins.

> The idea of symlinking to the full-disk devices from a directory and
> using -d had crossed my mind, but I wasn't sure about it. I think that is
> something worth trying.

Note, I haven't tried it either.

> I'm not too concerned about it not working at boot -
> I just want to get something working at all, at the moment.

Yup.

--
Dan.
On Wed, Feb 17, 2010 at 04:44:19PM -0500, Ethan wrote:
> There was no partitioning on the truecrypt disks. The truecrypt volumes
> occupied the whole raw disks (1500301910016 bytes each). The devices that
> I gave to the zpool on Linux were the whole raw devices that truecrypt
> exposed (1500301647872 bytes each). There were no partition tables on
> either the raw disks or the truecrypt volumes - just truecrypt headers on
> the raw disks and ZFS on the truecrypt volumes.
>
> I copied the data simply using
>
>     dd if=/dev/mapper/truecrypt1 of=/dev/sdb

OK, then as you noted, you want to start with the ..p0 device, as the
equivalent.

> The labels 2 and 3 should be on the drives, but they are 262144 bytes
> further from the end of slice 2 than zpool must be looking.

I don't think so. They're found by counting from the start; the end can
move out further (LUN expansion), and with autoexpand the vdev can be
extended (adding metaslabs) and the labels will be rewritten at the new
end, after the last metaslab. I think the issue is that there are no
partitions on the devices that allow import to read that far. Fooling it
into using p0 would work around this.

> I could create a partition table on each drive, specifying a partition
> with the size of the truecrypt volume, and re-copy the data onto that
> partition. Would this be preferable?

Eventually, probably, yes - once you've confirmed all the speculation,
gotten past the partitioning issue to whatever other damage is in the
pool, resolved that, and have some kind of access to your data. There
are other options as well, including using replace one disk at a time, or
send|recv.

> I was under some impression that zpool devices were preferred to be raw
> drives, not partitions, but I don't recall where I came to believe that,
> much less whether it's at all correct.

Sort of. zfs commands can be handed bare disks; internally they put EFI
labels on them automatically (though evidently not the fuse variants).
ZFS mostly makes partitioning go away (hooray), but it still becomes
important in cases like this - shared disks and migration between
operating systems.

> As for using import -F, I am on snv_111b, which I am not sure has -F for
> import.

Nope.

> I tried to update to the latest dev build (using the instructions at
> pkg.opensolaris.org/dev/en/index.shtml), but things are behaving very
> strangely.

Not really, but read the release notes. Alternatively, if this is a new
machine, you could just reinstall (or boot from) a current livecd/usb,
downloaded from genunix.org.

--
Dan.
On Wed, Feb 17, 2010 at 17:44, Daniel Carosone <dan at geek.com.au> wrote:
> > The idea of symlinking to the full-disk devices from a directory and
> > using -d had crossed my mind, but I wasn't sure about it. I think that
> > is something worth trying.
>
> Note, I haven't tried it either.

Success!

I made a directory and symlinked p0's for all the disks:

ethan at save:~/qdsk# ls -al
total 13
drwxr-xr-x  2 root  root    7 Feb 17 23:06 .
drwxr-xr-x 21 ethan staff  31 Feb 17 14:16 ..
lrwxrwxrwx  1 root  root   17 Feb 17 23:06 c9t0d0p0 -> /dev/dsk/c9t0d0p0
lrwxrwxrwx  1 root  root   17 Feb 17 23:06 c9t1d0p0 -> /dev/dsk/c9t1d0p0
lrwxrwxrwx  1 root  root   17 Feb 17 23:06 c9t2d0p0 -> /dev/dsk/c9t2d0p0
lrwxrwxrwx  1 root  root   17 Feb 17 23:06 c9t4d0p0 -> /dev/dsk/c9t4d0p0
lrwxrwxrwx  1 root  root   17 Feb 17 23:06 c9t5d0p0 -> /dev/dsk/c9t5d0p0

I attempt to import using -d:

ethan at save:~/qdsk# zpool import -d .
  pool: q
    id: 5055543090570728034
 state: ONLINE
status: The pool is formatted using an older on-disk version.
action: The pool can be imported using its name or numeric identifier, though
        some features will not be available without an explicit 'zpool upgrade'.
config:

        q                                     ONLINE
          raidz1                              ONLINE
            /export/home/ethan/qdsk/c9t4d0p0  ONLINE
            /export/home/ethan/qdsk/c9t5d0p0  ONLINE
            /export/home/ethan/qdsk/c9t2d0p0  ONLINE
            /export/home/ethan/qdsk/c9t1d0p0  ONLINE
            /export/home/ethan/qdsk/c9t0d0p0  ONLINE

The pool is not imported at this point, but this does look promising. I
attempt to import using the name:

ethan at save:~/qdsk# zpool import -d . q

It sits there for a while. I worry that it's going to hang forever like it
did on Linux. But then it returns!

ethan at save:~/qdsk# zpool status
  pool: q
 state: ONLINE
status: The pool is formatted using an older on-disk format. The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'. Once this is done, the
        pool will no longer be accessible on older software versions.
 scrub: scrub in progress for 0h2m, 0.43% done, 8h57m to go
config:

        NAME                                  STATE     READ WRITE CKSUM
        q                                     ONLINE       0     0     0
          raidz1                              ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t4d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t5d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t2d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t1d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t0d0p0  ONLINE       0     0     0

errors: No known data errors

All the filesystems are there, all the files are there. Life is good.

Thank you all so much.

-Ethan
On Wed, Feb 17, 2010 at 06:15:25PM -0500, Ethan wrote:
> Success!

Awesome. Let that scrub finish before celebrating completely, but
this looks like a good place to stop and consider what you want for an
end state.

--
Dan.
On Wed, Feb 17, 2010 at 18:24, Daniel Carosone <dan at geek.com.au> wrote:
> Awesome. Let that scrub finish before celebrating completely, but
> this looks like a good place to stop and consider what you want for an
> end state.

True. Thinking about where to end up: I will be staying on OpenSolaris
despite having no truecrypt. My paranoia likes having encryption, but it's
not really necessary for me, and it looks like encryption will be coming to
ZFS itself soon enough. So there is no need to consider getting things
working on zfs-fuse again.

I should have a partition table, for one thing, I suppose. The partition
table would be an EFI GUID Partition Table, looking at the relevant
documentation. So I'll need to somehow shift my ZFS data down by 17408 bytes
(34 512-byte LBAs, the size of the GPT's structures at the beginning of the
disk) - perhaps just by copying from the truecrypt volumes as I did before,
but with an offset of 17408 bytes. Then I should be able to use format to
create the correct partition information and use the s0 partition for each
drive, as seems to be the standard way of doing things.

Or maybe I can format (write the GPT) first, then get Linux to recognize the
GPT, and copy from truecrypt into the partition.

Does that sound correct / sensible? Am I missing or mistaking anything?

Thanks,
-Ethan

PS: scrub in progress for 4h4m, 65.43% done, 2h9m to go - no errors yet.
Looking good.
On Wed, 17 Feb 2010, Ethan wrote:
> I should have a partition table, for one thing, I suppose. The partition
> table would be an EFI GUID Partition Table, looking at the relevant
> documentation. So I'll need to somehow shift my ZFS data down by 17408
> bytes (34 512-byte LBAs, the size of the GPT's structures at the beginning
> of the disk) - perhaps just by [...]
>
> Does that sound correct / sensible? Am I missing or mistaking anything?

It seems to me that you could also use the approach of 'zpool replace' for
each device in turn until all of the devices are re-written to normal
Solaris/zfs defaults. This would also allow you to expand the partition
size a bit for a larger pool.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, simplesystems.org/users/bfriesen
GraphicsMagick Maintainer, GraphicsMagick.org
On Wed, Feb 17, 2010 at 23:21, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:
> It seems to me that you could also use the approach of 'zpool replace' for
> each device in turn until all of the devices are re-written to normal
> Solaris/zfs defaults. This would also allow you to expand the partition
> size a bit for a larger pool.

That is true. It seems like it would then have to rebuild from parity for
every drive, though, which I think would take rather a long while, wouldn't
it?

I could put in a new drive to overwrite. Then the replace command would just
copy from the old drive rather than rebuilding from parity (I think? That
seems like the sensible thing for it to do, anyway.) But I don't have a
spare drive for this - I have the original drives that still contain the
truecrypt volumes, but I am disinclined to start overwriting those, this
episode having given me a healthy paranoia about having good backups.

I guess this question just comes down to weighing whether rebuilding each
drive from parity or re-copying from the truecrypt volumes to a different
offset is more of a hassle.

-Ethan
Create a new empty pool on the Solaris system and let it format the disks,
i.e. use the whole-disk names cXtXd0. This should put the EFI label on the
disks and set up the partitions for you. Just in case, here is an example.

Then go back to the Linux box and see if you can use its tools to see the
same partition layout. If you can, dd the data to the correct spot, which in
this example on Solaris is c5t2d0s0. (zfs send|zfs recv would be easier.)

-bash-4.0$ pfexec fdisk -R -W - /dev/rdsk/c5t2d0p0

* /dev/rdsk/c5t2d0p0 default fdisk table
* Dimensions:
*     512 bytes/sector
*     126 sectors/track
*     255 tracks/cylinder
*   60800 cylinders
*
* systid:
*     1: DOSOS12
*   238: EFI_PMBR
*   239: EFI_FS
*
*  Id  Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect  Numsect
  238   0    255    63    1023   255    63    1023      1  1953525167
    0   0      0     0       0     0     0       0      0           0
    0   0      0     0       0     0     0       0      0           0
    0   0      0     0       0     0     0       0      0           0

-bash-4.0$ pfexec prtvtoc /dev/rdsk/c5t2d0
* /dev/rdsk/c5t2d0 partition map
*
* Dimensions:
*     512 bytes/sector
* 1953525168 sectors
* 1953525101 accessible sectors
*
* Flags:
*   1: unmountable
*  10: read-only
*
* Unallocated space:
*       First     Sector    Last
*       Sector     Count    Sector
*           34       222       255
*
*                          First     Sector    Last
* Partition  Tag  Flags    Sector     Count    Sector  Mount Directory
       0      4    00         256 1953508495 1953508750
       8     11    00  1953508751      16384 1953525134
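For the send|recv route mentioned above, the copy itself might look roughly
like this, assuming a second pool (called newq here; not something that
exists in this thread) has already been created on freshly labeled disks:

# snapshot everything in the old pool, then replicate it to the new pool
zfs snapshot -r q@migrate
zfs send -R q@migrate | zfs recv -dF newq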
On Wed, Feb 17, 2010 at 11:37:54PM -0500, Ethan wrote:
> That is true. It seems like it would then have to rebuild from parity for
> every drive, though, which I think would take rather a long while,
> wouldn't it?

No longer than copying - plus, it will only resilver active data, so
unless the pool is close to full it could save some time. Certainly it
will save some hassle, and the risk of error from plugging and swapping
drives between machines yet more times. As a further benefit, all this
work will count towards a qualification cycle for the current hardware
setup.

I would recommend using replace, one drive at a time. Since you still
have the original drives to fall back on, you can do this now (before
making more changes to the pool with new data) without being overly
worried about a second failure killing your raidz1 pool. Normally,
when doing replacements like this on a singly-redundant pool, it's a
good idea to run a scrub after each replace, making sure everything
you just wrote is valid before relying on it to resilver the next
disk.

If you're keen on copying, I'd suggest doing it over the network; that
way your write target is a system that knows the target partitioning,
and there's no (mis)calculation of offsets.

--
Dan.
On Thu, Feb 18, 2010 at 04:14, Daniel Carosone <dan at geek.com.au> wrote:
> I would recommend using replace, one drive at a time. Since you still
> have the original drives to fall back on, you can do this now (before
> making more changes to the pool with new data) without being overly
> worried about a second failure killing your raidz1 pool. Normally,
> when doing replacements like this on a singly-redundant pool, it's a
> good idea to run a scrub after each replace, making sure everything
> you just wrote is valid before relying on it to resilver the next
> disk.

These are good points - it sounds like replacing one drive at a time is the
way to go. Thanks for pointing out these benefits.

I do notice that right now it imports just fine using the p0 devices with
just `zpool import q`, no longer needing import -d with the directory of
symlinks to the p0 devices. I guess this has to do with having repaired the
labels and such? Or whatever got repaired by having successfully imported
and scrubbed.

After the scrub finished, this is the state of my pool:

# zpool status
  pool: q
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: sun.com/msg/ZFS-8000-9P
 scrub: scrub completed after 7h18m with 0 errors on Thu Feb 18 06:25:44 2010
config:

        NAME                                  STATE     READ WRITE CKSUM
        q                                     DEGRADED     0     0     0
          raidz1                              DEGRADED     0     0     0
            /export/home/ethan/qdsk/c9t4d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t5d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t2d0p0  ONLINE       0     0     0
            /export/home/ethan/qdsk/c9t1d0p0  DEGRADED     4     0    60  too many errors
            /export/home/ethan/qdsk/c9t0d0p0  ONLINE       0     0     0

errors: No known data errors

I have no idea what happened to the one disk, but "No known data errors" is
what makes me happy. I'm not sure if I should be concerned about the
physical disk itself, or just assume that some data got screwed up in all
this mess. I guess I'll see how the disk behaves during the replace
operations (restoring to it and then restoring from it four times seems like
a pretty good test of it), and if it continues to error, replace the
physical drive and, if necessary, restore from the original truecrypt
volumes.

So, the current plan (roughly the commands sketched below):
- Export the pool.
- Format c9t1d0 to have one slice covering the entire disk.
- Import. The pool should come up degraded, missing c9t1d0p0.
- Replace the missing c9t1d0p0 with c9t1d0 (should this be c9t1d0s0? My
  understanding is that zfs will treat the two about the same, since it adds
  the partition table to raw devices if that's what it's given and ends up
  using s0 anyway).
- Wait for the resilver.
- Repeat with the other four disks.

Sound good?

-Ethan
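A rough command sketch of that plan (untested; the replace refers to the old
vdev by the path the pool currently knows it by, and handing zpool the
whole-disk name lets it write its own EFI label):

zpool export q
format                      # select c9t1d0, write a fresh label/slice table
zpool import q              # pool comes up DEGRADED, c9t1d0p0 missing
zpool replace q /export/home/ethan/qdsk/c9t1d0p0 c9t1d0
zpool status q              # watch the resilver
zpool scrub q               # scrub before moving on to the next disk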
Hi Ethan, Great job putting this pool back together... I would agree with the disk-by-disk replacement by using the zpool replace command. You can read about this command here: docs.sun.com/app/docs/doc/817-2271/gazgd?a=view Having a recent full backup of your data before making any more changes is always recommended. You might be able to figure out if c9t1d0p0 is a failing disk by checking the fmdump -eV output but with the changing devices, it might be difficult to isolate with gobs of output. Also, if you are using whole disks, then use the c9t*d* designations. The s* designations are unnecessary for whole disks and building pools with p* devices isn''t recommended. Thanks, Cindy On 02/18/10 10:42, Ethan wrote:> On Thu, Feb 18, 2010 at 04:14, Daniel Carosone <dan at geek.com.au > <mailto:dan at geek.com.au>> wrote: > > On Wed, Feb 17, 2010 at 11:37:54PM -0500, Ethan wrote: > > > It seems to me that you could also use the approach of ''zpool > replace'' for > > That is true. It seems like it then have to rebuild from parity > for every > > drive, though, which I think would take rather a long while, > wouldn''t it? > > No longer than copying - plus, it will only resilver active data, so > unless the pool is close to full it could save some time. Certainly > it will save some hassle and risk of error, plugging and swapping drives > between machines more times. As a further benefit, all this work will > count towards a qualification cycle for the current hardware setup. > > I would recommend using replace, one drive at a time. Since you still > have the original drives to fall back on, you can do this now (before > making more changes to the pool with new data) without being overly > worried about a second failure killing your raidz1 pool. Normally, > when doing replacements like this on a singly-redundant pool, it''s a > good idea to run a scrub after each replace, making sure everything > you just wrote is valid before relying on it to resilver the next > disk. > > If you''re keen on copying, I''d suggest doing over the network; that > way your write target is a system that knows the target partitioning > and there''s no (mis)caclulation of offsets. > > -- > Dan. > > > > These are good points - it sounds like replacing one at a time is the > way to go. Thanks for pointing out these benefits. > Although I do notice that right now, it imports just fine using the p0 > devices using just `zpool import q`, no longer having to use import -d > with the directory of symlinks to p0 devices. I guess this has to do > with having repaired the labels and such? Or whatever it''s repaired > having successfully imported and scrubbed. > After the scrub finished, this is the state of my pool: > > > # zpool status > pool: q > state: DEGRADED > status: One or more devices has experienced an unrecoverable error. An > attempt was made to correct the error. Applications are unaffected. > action: Determine if the device needs to be replaced, and clear the errors > using ''zpool clear'' or replace the device with ''zpool replace''. 
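For what it's worth, one way to wade through that fmdump output might be to
start with the compact listing and then filter the verbose form down to the
class and path fields. This is only a sketch - the egrep pattern is an
example, not a tested recipe for this particular pool:

    # one line per error report, with timestamp and class - a quick way
    # to see whether disk or zfs ereports are still arriving
    fmdump -e

    # full detail, trimmed to class names and any device/vdev path
    # fields so a particular drive can be picked out
    fmdump -eV | egrep -i 'class|path'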
Ethan wrote:
> So, current plan:
> [...]
> Sound good?

Almost. You can run into an issue with size - slice 0 on an EFI-labeled
(whole-) disk may not be sufficient to replace a disk in your raidz1.

regards,
victor
On Thu, Feb 18, 2010 at 13:22, Victor Latushkin <Victor.Latushkin at sun.com> wrote:
> Almost. You can run into an issue with size - slice 0 on an EFI-labeled
> (whole-) disk may not be sufficient to replace a disk in your raidz1.
>
> regards,
> victor

This should be okay, I think. The overhead from truecrypt was 262144 bytes,
so I have that much to spare on the non-truecrypted disks. An EFI GPT is 34
512-byte LBAs at each end, or 34816 bytes total. So there should be plenty
of room.

-Ethan
Ethan wrote:
> On Thu, Feb 18, 2010 at 13:22, Victor Latushkin <Victor.Latushkin at sun.com> wrote:
> > Almost. You can run into an issue with size - slice 0 on an EFI-labeled
> > (whole-) disk may not be sufficient to replace a disk in your raidz1.
>
> This should be okay, I think. The overhead from truecrypt was 262144
> bytes, so I have that much to spare on the non-truecrypted disks. An EFI
> GPT is 34 512-byte LBAs at each end, or 34816 bytes total. So there
> should be plenty of room.

By default ZFS creates s0 on an EFI-labeled disk at an offset of 256
sectors from the beginning of the disk. Also, there's an 8MB reserved
partition, number 8.
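To put rough numbers on that - a back-of-the-envelope sketch assuming
512-byte sectors and treating the 8MB reservation as 8 MiB (the exact
figures can vary by release):

    # sectors consumed by the default whole-disk EFI label that zpool writes:
    #   256    default starting offset of slice 0
    #   16384  8 MB reserved slice 8
    #   34     backup GPT at the end of the disk
    echo $(( (256 + 16384 + 34) * 512 ))   # => 8537088 bytes lost to the label
    echo $(( 262144 ))                     # => headroom left by the truecrypt overhead

So slice 0 on a freshly labeled whole disk ends up roughly 8.5 MB smaller
than the raw p0 device, which dwarfs the ~256 KB of slack left by the
truecrypt headers - consistent with the "device is too small" error that
turns up later in the thread.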
On Thu, Feb 18, 2010 at 12:42:58PM -0500, Ethan wrote:
> Although I do notice that right now the pool imports just fine from the p0
> devices with a plain `zpool import q`; I no longer have to use import -d
> with the directory of symlinks to the p0 devices. I guess this has to do
> with the labels and such having been repaired - or whatever it was that
> got repaired when the pool imported and scrubbed successfully.

It's the zpool.cache file at work, storing extra copies of labels with
corrected device paths. For curiosity's sake, what happens when you
remove (rename) your dir with the symlinks?

> After the scrub finished, this is the state of my pool:
>             /export/home/ethan/qdsk/c9t1d0p0  DEGRADED     4     0    60  too many errors

Ick. Note that there are device errors as well as content (checksum)
errors, which means it can't only be correctly-copied damage from
your original pool that was having problems.

zpool clear and rescrub, for starters, and see if they continue.

I suggest also:
- carefully checking and reseating cables, etc.
- taking backups now of anything you really wanted out of the pool,
  while it's still available.
- choosing that disk as the first to replace, and scrubbing again
  after replacing onto it, perhaps twice.
- doing a dd to overwrite that entire disk with random data and let
  it remap bad sectors, before the replace (not just zeros, and not
  just the sectors a zfs resilver would hit; openssl enc of /dev/zero
  with a lightweight cipher and whatever key; for extra caution, read
  back and compare with a second openssl stream using the same key).
- being generally very watchful and suspicious of that disk in
  particular; look at error logs for clues, etc.
- being very happy that zfs deals so well with all this abuse, and
  you know your data is ok.

> I have no idea what happened to the one disk, but "No known data errors"
> is what makes me happy. I'm not sure whether I should be concerned about
> the physical disk itself

Given that it's reported disk errors as well as damaged content, yes.

> or just assume that some data got scrambled in all this mess. I'll see
> how the disk behaves during the replace operations (resilvering onto it
> and then reading from it four more times seems like a pretty good test of
> it), and if it continues to produce errors, replace the physical drive
> and, if necessary, restore from the original truecrypt volumes.

Good plan; note the extra scrubs at key points in the process above.

> So, current plan:
> - export the pool.

Shouldn't be needed; zpool offline <dev> would be enough.

> - format c9t1d0 to have one slice spanning the entire disk.

Might not have been needed, but given Victor's comments about reserved
space, you may need to do this manually, yes. Be sure to use EFI
labels. Pick the suspect disk first.

> - import. The pool should come up degraded, missing c9t1d0p0.

No need if you didn't export.

> - replace the missing c9t1d0p0 with c9t1d0

Yup, or if you've manually partitioned you may need to mention the
slice number to prevent it repartitioning with the default reserved
space again. You may even need to use some other slice (s5 or
whatever), but I don't think so.

> - wait for the resilver.
> - repeat with the other four disks.

- tell us how it went
- drink beer.

--
Dan.
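A concrete way to do the openssl trick Dan describes might look roughly
like the following. This is only a sketch: the cipher, the throwaway
passphrase and the block size are arbitrary choices, -nosalt is what makes
the two runs generate the identical stream, the block device under
/dev/dsk is used to sidestep raw-device alignment rules, and the write
destroys everything on the named device:

    # fill the suspect disk end-to-end with a reproducible pseudo-random
    # stream (this overwrites c9t1d0p0 completely - check the name twice)
    openssl enc -aes-128-cbc -nosalt -pass pass:scratchkey < /dev/zero | \
        dd of=/dev/dsk/c9t1d0p0 bs=1024k

    # regenerate the same stream and compare it against what the disk
    # reads back; cmp reports the first difference, and an "EOF" message
    # at the end of the device just means the disk ran out of sectors first
    openssl enc -aes-128-cbc -nosalt -pass pass:scratchkey < /dev/zero | \
        cmp - /dev/dsk/c9t1d0p0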
On Thu, Feb 18, 2010 at 15:31, Daniel Carosone <dan at geek.com.au> wrote:
> It's the zpool.cache file at work, storing extra copies of labels with
> corrected device paths. For curiosity's sake, what happens when you
> remove (rename) your dir with the symlinks?

I'll let you know when the current scrub finishes.

> Ick. Note that there are device errors as well as content (checksum)
> errors, which means it can't only be correctly-copied damage from
> your original pool that was having problems.
>
> zpool clear and rescrub, for starters, and see if they continue.

Doing that now.

> I suggest also:
> - carefully checking and reseating cables, etc.
> [...]
> - doing a dd to overwrite that entire disk with random data and let
>   it remap bad sectors, before the replace (openssl enc of /dev/zero
>   with a lightweight cipher and whatever key; [...])
> - being generally very watchful and suspicious of that disk in
>   particular; look at error logs for clues, etc.

Very thorough. I have no idea how to do that with openssl, but I will look
into learning this.

> - being very happy that zfs deals so well with all this abuse, and
>   you know your data is ok.

Yes indeed - very happy.

> Given that it's reported disk errors as well as damaged content, yes.

Okay. Well, it's a brand-new disk and I can exchange it easily enough.

> Good plan; note the extra scrubs at key points in the process above.

Will do. Thanks for the tip.

> > So, current plan:
> > - export the pool.
>
> Shouldn't be needed; zpool offline <dev> would be enough.
>
> > - format c9t1d0 to have one slice spanning the entire disk.
>
> Might not have been needed, but given Victor's comments about reserved
> space, you may need to do this manually, yes. Be sure to use EFI
> labels. Pick the suspect disk first.
>
> > - import. The pool should come up degraded, missing c9t1d0p0.
>
> No need if you didn't export.
>
> > - replace the missing c9t1d0p0 with c9t1d0
>
> Yup, or if you've manually partitioned you may need to mention the
> slice number to prevent it repartitioning with the default reserved
> space again. You may even need to use some other slice (s5 or
> whatever), but I don't think so.
>
> > - wait for the resilver.
> > - repeat with the other four disks.
>
> - tell us how it went
> - drink beer.
>
> --
> Dan.

Okay. Plan is updated to reflect your suggestions. Beer was already in the
plan, but I forgot to list it. Speaking of which, I see your e-mail address
is .au, but if you're ever in New York City I'd love to buy you a beer as
thanks for all your excellent help with this. And anybody else in this
thread - you guys are awesome.

-Ethan
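For reference, the per-disk cycle being settled on here would look roughly
like the sketch below. It assumes pool q with members under
/export/home/ethan/qdsk/ as shown earlier, and takes the suspect disk
first; as it turns out further down the thread, the default s0 ends up too
small on these disks and the replace is eventually done onto the p0 device
instead:

    # take the suspect member offline (no export needed)
    zpool offline q /export/home/ethan/qdsk/c9t1d0p0

    # relabel c9t1d0 with an EFI label and a single large slice using
    # format -e, then hand that slice to the pool in place of the old member
    zpool replace q /export/home/ethan/qdsk/c9t1d0p0 c9t1d0s0

    # watch the resilver, then verify what was just written before
    # moving on to the next disk
    zpool status q
    zpool scrub q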
On Thu, Feb 18, 2010 at 16:03, Ethan <notethan at gmail.com> wrote:
> [...]
Update: I'm stuck. Again.

To answer "For curiosity's sake, what happens when you remove (rename) your
dir with the symlinks?": it finds the devices on p0 with no problem, with
the symlinks directory deleted.

After clearing the errors and scrubbing again, no errors were encountered
in the second scrub. Then I offlined the disk which had errors in the first
scrub.

I followed the suggestion to thoroughly test the disk (and remap any bad
sectors), filling it with random-looking data by encrypting /dev/zero.
Reading back and decrypting the drive, it all read back as zeros - all
good.

I then checked the SMART status of the drive, which had 0 error rates for
everything. I ran the several-hour "extended self-test", whatever that
does, after which I had two write errors on one drive which weren't there
before. I believe it's the same drive that had the zfs errors, but I did
the SMART stuff in linux, not being able to find SMART tools in solaris,
and I haven't been able to figure out which drive is which. Is there a way
to get a drive's serial number in solaris? I could identify it by that.

I scrubbed again with the pool degraded. No errors.

        NAME                     STATE     READ WRITE CKSUM
        q                        ONLINE       0     0     0
          raidz1                 ONLINE       0     0     0
            c9t4d0p0             ONLINE       0     0     0
            c9t5d0p0             ONLINE       0     0     0
            c9t2d0p0             ONLINE       0     0     0
            3763020893739678459  UNAVAIL      0     0     0  was /dev/dsk/c9t1d0p0
            c9t0d0p0             ONLINE       0     0     0

errors: No known data errors

I tried zpool replace on the drive:

# zpool replace q 3763020893739678459 c9t1d0
cannot replace 3763020893739678459 with c9t1d0: device is too small

Victor was right. I went into 'format' and fought with it for a while.
Moving the beginning of slice 0 from block 256 down to block 34 was simple
enough, but I cannot figure out how to tell it I don't want 8MB in slice 8.
Is it even possible? I haven't got 8MB to spare (as silly as that sounds
for a 1.5TB drive) - if I can't get rid of slice 8, I may have to stick
with using p0's. I haven't encountered a problem using them so far (who
needs partition tables anyway?) but I figured I'd ask if anybody had ideas
about getting back that space.

What's the 8MB for, anyway? Some stuff seems to indicate that it has to do
with booting the drive, but this will never be a boot drive. That seems to
be for VTOC stuff, not EFI, though. I did look at switching to VTOC labels,
but it seems they don't support disks as large as I am using, so I think
that's out.

I also see "Information that was stored in the alternate cylinders area,
the last two cylinders of the disk, is now stored in slice 8."
(docsun.cites.uiuc.edu/sun_docs/C/solaris_9/SUNWaadm/SYSADV1/p117.html)
Not sure what an "alternate cylinders area" is - it sounds sort of like
remapping bad sectors, but that's something the disk does on its own.

So, can I get the 8MB back? Should I use p0? Is there another option I'm
not thinking of? (I could always try diving into the EFI label with a hex
editor and setting it the way I please, with no silly slice 8.)

-Ethan
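On the serial-number question: Solaris can usually show this without extra
tools. The extended iostat error report prints per-device counters along
with vendor, product and serial number fields, which should be enough to
match a zfs device name to a physical drive - a sketch, since the exact
output format varies a bit by release:

    # extended error/inquiry report for every device, including Serial No
    iostat -En

    # trim it down to the device names and serial numbers
    iostat -En | egrep 'c9t|Serial No'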
On Sun, Feb 21, 2010 at 12:41, Ethan <notethan at gmail.com> wrote:
> [...]
> So, can I get the 8MB back? Should I use p0? Is there another option I'm
> not thinking of?

I did a replace onto p0 of the drive I'd randomized, and did a scrub. No
errors. (Then I did another scrub, just for the hell of it; no errors
again.) I feel fairly content staying with p0's, unless there's a good
reason not to.

There are a few things I'm not entirely certain about:
- Is there any significant advantage to having a partition table?
- If there is, is it possible to drop the 8MB slice 8 so that I can
  actually have enough space to put my raid on slice 0?
- Should I replace the disk that had errors on the initial scrub, or is it
  probably sufficient to just be wary of it, scrub frequently, and replace
  it if it encounters any more errors?

-Ethan
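On the "scrub frequently" option: one low-effort way to do that is to
schedule the scrub from cron and glance at the status afterwards. A minimal
sketch - the weekly timing is an arbitrary choice, not something
recommended elsewhere in this thread:

    # root crontab entry (add with `crontab -e` as root):
    # scrub pool q every Sunday at 03:00
    0 3 * * 0 /usr/sbin/zpool scrub q

    # later, check whether the suspect disk has accumulated new
    # READ/WRITE/CKSUM errors
    zpool status -v q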