Phillip Oldham
2010-Apr-23 11:06 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
I''m trying to provide some "disaster-proofing" on Amazon EC2 by using a ZFS-based EBS volume for primary data storage with Amazon S3-backed snapshots. My aim is to ensure that, should the instance terminate, a new instance can spin-up, attach the EBS volume and auto-/re-configure the zpool. I''ve created an OpenSolaris 2009.06 x86_64 image with the zpool structure already defined. Starting an instance from this image, without attaching the EBS volume, shows the pool structure exists and that the pool state is "UNAVAIL" (as expected). Upon attaching the EBS volume to the instance the status of the pool changes to "ONLINE", the mount-point/directory is accessible and I can write data to the volume. Now, if I terminate the instance, spin-up a new one, and connect the same (now unattached) EBS volume to this new instance the data is no longer there with the EBS volume showing as blank. EBS is supposed to ensure persistence of data after EC2 instance termination, so I''m assuming that when the newly attached drive is seen by ZFS for the first time it is wiping the data some how? Or possibly that some ZFS logs or details on file location/allocation aren''t being persisted? Assuming this, I created an additional EBS volume to persist the intent-logs across instances but I''m seeing the same problem. I''m new to ZFS, and would really appreciate the community''s help on this. -- This message posted from opensolaris.org
Mark Musante
2010-Apr-23 11:16 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 23 Apr, 2010, at 7.06, Phillip Oldham wrote:> > I''ve created an OpenSolaris 2009.06 x86_64 image with the zpool structure already defined. Starting an instance from this image, without attaching the EBS volume, shows the pool structure exists and that the pool state is "UNAVAIL" (as expected). Upon attaching the EBS volume to the instance the status of the pool changes to "ONLINE", the mount-point/directory is accessible and I can write data to the volume. > > Now, if I terminate the instance, spin-up a new one, and connect the same (now unattached) EBS volume to this new instance the data is no longer there with the EBS volume showing as blank.Could you share with us the zpool commands you are using? Regards, markm -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20100423/aadfbb23/attachment.html>
Phillip Oldham
2010-Apr-23 11:31 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
I''m not actually issuing any when starting up the new instance. None are needed; the instance is booted from an image which has the zpool configuration stored within, so simply starts and sees that the devices aren''t available, which become available after I''ve attached the EBS device. Before the image was bundled the following zpool commands were issued with the EBS volumes attached at "10" (primary), "6" (log main) and "7" (log mirror): # zpool create foo c7d16 log mirror c7d6 c7d7 # zpool status pool: mnt state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM mnt ONLINE 0 0 0 c7d1p0 ONLINE 0 0 0 c7d2p0 ONLINE 0 0 0 errors: No known data errors pool: foo state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM foo ONLINE 0 0 0 c7d16 ONLINE 0 0 0 logs ONLINE 0 0 0 mirror ONLINE 0 0 0 c7d6 ONLINE 0 0 0 c7d7 ONLINE 0 0 0 errors: No known data errors pool: rpool state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM rpool ONLINE 0 0 0 c7d0s0 ONLINE 0 0 0 errors: No known data errors After booting a new instance based on the image I see this: # zpool status pool: foo state: UNAVAIL status: One or more devices could not be opened. There are insufficient replicas for the pool to continue functioning. action: Attach the missing device and online it using ''zpool online''. see: http://www.sun.com/msg/ZFS-8000-3C scrub: none requested config: NAME STATE READ WRITE CKSUM foo UNAVAIL 0 0 0 insufficient replicas c7d16 UNAVAIL 0 0 0 cannot open logs UNAVAIL 0 0 0 insufficient replicas mirror UNAVAIL 0 0 0 insufficient replicas c7d6 UNAVAIL 0 0 0 cannot open c7d7 UNAVAIL 0 0 0 cannot open Which changes to "ONLINE" (as previous) when the EBS volumes are attached. After reading through the documentation a little more, could this be due to the zpool.cache file being stored on the image (& therefore refreshed after each boot) rather than somewhere more persistent? -- This message posted from opensolaris.org
Mark Musante
2010-Apr-23 12:00 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 23 Apr, 2010, at 7.31, Phillip Oldham wrote:> I''m not actually issuing any when starting up the new instance. None are needed; the instance is booted from an image which has the zpool configuration stored within, so simply starts and sees that the devices aren''t available, which become available after I''ve attached the EBS device. >Forgive my ignorance with EC2/EBS, but why doesn''t the instance remember that there were EBS volumes attached? Why aren''t they automatically attached prior to booting solaris within the instance? The error output from "zpool status" that you''re seeing matches what I would expect if we are attempting to import the pool at boot, and the disks aren''t present.
Phillip Oldham
2010-Apr-23 12:38 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
The instances are "ephemeral"; once terminated they cease to exist, as do all their settings. Rebooting an image keeps any EBS volumes attached, but this isn''t the case I''m dealing with - its when the instance terminates unexpectedly. For instance, if a reboot operation doesn''t succeed or if there''s an issue with the data-centre. There isn''t any way (yet, AFACT) to attach an EBS during the boot process, so they must be attached after boot. -- This message posted from opensolaris.org
Phillip Oldham
2010-Apr-23 12:42 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
One thing I''ve just noticed is that after a reboot of the new instance, which showed no data on the EBS volume, the files return. So: 1. Start new instance 2. Attach EBS vols 3. `ls /foo` shows no data 4. Reboot instance 5. Wait a few minutes 6. `ls /foo` shows data as expected Not sure if this helps track down why, after the initial start + attach, the EBS vol shows no data. -- This message posted from opensolaris.org
Mark Musante
2010-Apr-23 13:22 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 23 Apr, 2010, at 8.38, Phillip Oldham wrote:> The instances are "ephemeral"; once terminated they cease to exist, as do all their settings. Rebooting an image keeps any EBS volumes attached, but this isn''t the case I''m dealing with - its when the instance terminates unexpectedly. For instance, if a reboot operation doesn''t succeed or if there''s an issue with the data-centre.OK, I think if this issue can be addressed, it would be by people familiar with how EC2 & EBS interact. The steps I see are: - start a new instance - attach the EBS volumes to it - log into the instance and "zpool online" the disks I know the last step can be automated with a script inside the instance, but I''m not sure about the other two steps. Regards, markm
Robert Milkowski
2010-Apr-23 13:24 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 23/04/2010 13:38, Phillip Oldham wrote:> The instances are "ephemeral"; once terminated they cease to exist, as do all their settings. Rebooting an image keeps any EBS volumes attached, but this isn''t the case I''m dealing with - its when the instance terminates unexpectedly. For instance, if a reboot operation doesn''t succeed or if there''s an issue with the data-centre. > > There isn''t any way (yet, AFACT) to attach an EBS during the boot process, so they must be attached after boot. >Then perhaps you should do zpool import -R / pool *after* you attach EBS. That way Solaris won''t automatically try to import the pool and your scripts will do it once disks are available. -- Robert Milkowski http://milek.blogspot.com
Phillip Oldham
2010-Apr-23 14:06 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
I can replicate this case; Start new instance > attach EBS volumes > reboot instance > data finally available. Guessing that it''s something to do with the way the volumes/devices are "seen" & then made available. I''ve tried running various operations (offline/online, scrub) to see whether it will force zfs to recognise the drive, but nothing seems to work other than a restart after attaching. -- This message posted from opensolaris.org
Phillip Oldham
2010-Apr-26 08:27 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
> Then perhaps you should do zpool import -R / pool > *after* you attach EBS. > That way Solaris won''t automatically try to import > the pool and your > scripts will do it once disks are available.zpool import doesn''t work as there was no previous export. I''m trying to solve the case where the instance terminates unexpectedly; think of someone just pulling the plug. There''s no way to do the export operation before it goes down, but I still need to bring it back up, attach the EBS drives and continue as previous. The start/attach/reboot/available cycle is interesting, however. I may be able to init a reboot after attaching the drives, but it''s not optimal - there''s always a chance the instance might not come back up after the reboot. And it still doesn''t answer *why* the drives aren''t showing any data after they''re initially attached. -- This message posted from opensolaris.org
Robert Milkowski
2010-Apr-26 09:21 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 26/04/2010 09:27, Phillip Oldham wrote:>> Then perhaps you should do zpool import -R / pool >> *after* you attach EBS. >> That way Solaris won''t automatically try to import >> the pool and your >> scripts will do it once disks are available. >> > zpool import doesn''t work as there was no previous export. > > I''m trying to solve the case where the instance terminates unexpectedly; think of someone just pulling the plug. There''s no way to do the export operation before it goes down, but I still need to bring it back up, attach the EBS drives and continue as previous. > > The start/attach/reboot/available cycle is interesting, however. I may be able to init a reboot after attaching the drives, but it''s not optimal - there''s always a chance the instance might not come back up after the reboot. And it still doesn''t answer *why* the drives aren''t showing any data after they''re initially attached. >You don''t have to do exports as I suggested to use ''zpool -R / pool'' (notice -R). If you do so that a pool won''t be added to zpool.cache and therefore after a reboot (unexpected or not) you will be able to import it again (and do so with -R). That way you can easily script it so import happens after your disks ara available. -- Robert Milkowski http://milek.blogspot.com
Phillip Oldham
2010-Apr-26 10:14 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
> You don''t have to do exports as I suggested to use > ''zpool -R / pool'' > (notice -R).I tried this after your suggestion (including the -R switch) but it failed, saying the pool I was trying to import didn''t exist.> If you do so that a pool won''t be added to > zpool.cache and therefore > after a reboot (unexpected or not) you will be able > to import it again > (and do so with -R). That way you can easily script > it so import happens > after your disks ara available.I''m pretty sure that the zpool.cache is part of the image, and that its state/contents isn''t persisted from one instance to the next. -- This message posted from opensolaris.org
Robert Milkowski
2010-Apr-26 13:51 UTC
[zfs-discuss] Re-attaching zpools after machine termination [amazon ebs & ec2]
On 26/04/2010 11:14, Phillip Oldham wrote:>> You don''t have to do exports as I suggested to use >> ''zpool -R / pool'' >> (notice -R). >> > I tried this after your suggestion (including the -R switch) but it failed, saying the pool I was trying to import didn''t exist. > >which means it couldn''t discover it. does ''zpool import'' (no other options) list the pool? -- Robert Milkowski http://milek.blogspot.com