Hi,

As part of the next stages of the time-slider project we are looking into doing actual backups onto removable media devices such as USB media. The goal is to be able to view snapshots stored on the media and merge these into the list of viewable snapshots in nautilus, giving the user a broader selection of restore points. In an ideal world we would like to detect the insertion of the selected media, have it automatically mounted, and back up to it automatically.

More info on time-slider: blogs.sun.com/erwann/entry/zfs_on_the_desktop_zfs

To realise this we need to be sending datasets to a ZFS-formatted device instead of file blobs stored on a FAT32-formatted storage device etc. We would aim to provide the user with a GUI to do this and create a zpool on the selected storage device.

I started out by testing how ZFS handles hotplugging a ZFS-formatted USB storage stick. I tried creating a zpool both on the existing fdisk primary partition and by letting the zpool have the whole device and create its own EFI disk label.

Creating a pool on the primary partition, hald complained that it could not mount the volume. It did somehow manage to mount the old FAT32 filesystem that I thought I had overwritten with the zpool create command, however. Running "zpool import" gets the zpool mounted, but I can then easily cause errors on the pool by writing into the mounted FAT32 filesystem.

When allowing ZFS to use the whole device for the zpool, I don't get any error messages, but no attempt to automatically mount the device appears to be made. I then have to run "zpool import" to get the pool mounted. If I then pull out the USB stick, the system appears not to notice it: "zpool status" reports the pool as still online. Running the sync command seems to block for a very long time though, indicating the unplugging has upset ZFS.

So, is ZFS usable on removable media in this manner? What steps could I take to avoid the kind of problems I'm seeing, and are these considered bugs in either hald or ZFS? It seems we are a long way off being able to just plug something in and out and have the system deal with it like it does for other filesystems.

Thanks,
Niall
--
This message posted from opensolaris.org
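For reference, what I ran was roughly the following (the pool name and device name are just placeholders for whatever the stick shows up as on your box):

  # pool on the existing fdisk primary partition (the old FAT32 one):
  zpool create usbpool c5t0d0p1

  # versus giving ZFS the whole device so it writes its own EFI label:
  zpool create usbpool c5t0d0

  # after re-plugging the stick, nothing gets mounted until I manually run:
  zpool import usbpool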
Hi James,

James Litchfield wrote:
> I believe the answer is in the last email in that thread. hald doesn't
> offer the notifications and it's not clear that ZFS can handle them.
> As is noted, there are complications with ZFS due to the possibility
> of multiple disks comprising a volume, etc. It would be a lot of work
> to make it work correctly for any but the simplest single disk case.

For the kind of use cases we have in mind, the percentage of people wanting to back up to some kind of hot-pluggable multi-device volume is in all likelihood minutely small. Would it be a lot of work to make it work for single-disk volumes? It seems a shame not to provide this functionality and convenience because of a few very rare/exotic configurations for which it wouldn't work. What would we stand to lose by getting at least some (majority?) of configurations working?

Thanks,
Niall
Bueller? Anyone?
--
This message posted from opensolaris.org
Hi Niall,

I noticed that ZFS won't automatically import pools myself. I didn't really consider it a problem since I wanted to script a bunch of stuff on USB insertion. I was hoping to be able to write a script that would detect the insertion, attempt to automatically mount pools on devices that are recognised by the system, and issue a zfs send to the device.

Regarding the hot-plugging of USB devices: yes, that can cause problems. I created a number of bug reports after finding out that ZFS can continue writing to a removed USB hard drive for some considerable period after the drive was removed. The main thread where this is documented is here:
opensolaris.org/jive/thread.jspa?threadID=68748
What you probably want is section 4 of the attached PDF.

I reported a range of bugs from that; two that I think are probably relevant are:

Data loss when ZFS doesn't react to device removal:
bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6735932
(Hopefully not as severe with snv_100 onwards now that the zpool status hang bug has been resolved.)

ZFS has inconsistent handling of device removal:
bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6735853

Ross
--
This message posted from opensolaris.org
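Something along these lines is roughly what I had in mind (pool and dataset names are placeholders, and the insertion-detection step is still hand-waving):

  # placeholders throughout; assumes the previous snapshot already exists
  # on both sides, and that detecting the insertion is handled elsewhere
  zpool import -d /dev/dsk usbbackup              # pick up the pool on the stick
  SNAP=backup-`date +%Y%m%d`
  zfs snapshot -r rpool/export/home@$SNAP
  zfs send -i rpool/export/home@previous rpool/export/home@$SNAP | \
      zfs receive -d usbbackup
  zpool export usbbackup                          # so the stick can be pulled safely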
Hi Tim,

Tim Foster wrote:
> Niall Power wrote:
>> Bueller? Anyone?
>
> Yeah, I'd love to know the answer too. The furthest I got into
> investigating this last time was:
>
> mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html
>
> - does that help at all Niall?

I dug around and found those few hald pieces for zpools also. Seems to me that there was at least an intention or desire to make things work with hald. Some further searching around reveals this conversation thread:
opensolaris.org/jive/thread.jspa?messageID=257186
The trail goes cold there though.

> The context to Niall's question is to extend Time Slider to do proper
> backups to usb devices whenever a device is inserted. I nearly had this
> working with:
>
> blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
> blogs.sun.com/timf/entry/zfs_automatic_backup_0_1
>
> but I used pcfs on the storage device to store flat zfs send-streams as
> I didn't have a chance to work out what was going on. Getting ZFS plug
> n' play on usb disks would be much much cooler though[1].

Exactly. Having ZFS as the native filesystem would enable snapshot browsing from within nautilus, so it's a requirement for this project.

> cheers,
> tim
>
> [1] and I reckon that by relying on the 'zfs/interval' 'none' setting
> for the auto-snapshot service, doing this now will be a lot easier than
> my previous auto-backup hack.

That could be quite useful alright. We might need to come up with a mechanism to delete the snapshot after it's taken and backed up.

Cheers,
Niall
Niall Power wrote:
> Bueller? Anyone?

Yeah, I'd love to know the answer too. The furthest I got into investigating this last time was:

mail.opensolaris.org/pipermail/zfs-discuss/2007-December/044787.html

- does that help at all, Niall?

The context to Niall's question is to extend Time Slider to do proper backups to USB devices whenever a device is inserted. I nearly had this working with:

blogs.sun.com/timf/entry/zfs_backups_to_usb_mass
blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

but I used pcfs on the storage device to store flat zfs send-streams, as I didn't have a chance to work out what was going on. Getting ZFS plug 'n' play on USB disks would be much, much cooler though[1].

cheers,
tim

[1] and I reckon that by relying on the 'zfs/interval' 'none' setting for the auto-snapshot service, doing this now will be a lot easier than my previous auto-backup hack.
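In other words, the current hack stores the stream as a flat file on the pcfs stick, where what we'd really like is to receive it into a pool living on the stick itself. Roughly (dataset names and mount points below are just examples):

  # current hack: flat send-stream stored as a file on the pcfs-formatted stick
  zfs send rpool/export/home@monday > /media/USBSTICK/home-monday.zfs

  # what we'd prefer: receive straight into a pool on the stick, so the
  # snapshots show up as a browsable filesystem
  zfs send rpool/export/home@monday | zfs receive usbpool/home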
James Litchfield
2008-Oct-28 17:00 UTC
[zfs-discuss] Hotplug issues on USB removable media.
I believe the answer is in the last email in that thread. hald doesn't offer the notifications and it's not clear that ZFS can handle them. As is noted, there are complications with ZFS due to the possibility of multiple disks comprising a volume, etc. It would be a lot of work to make it work correctly for any but the simplest single disk case.

Jim
Hey guys,

This may be a dumb thought from an end user, but why does it have to be hard for ZFS to automatically mount volumes on removable media? Mounting single volumes should be straightforward, and couldn't you just try to import any others and silently fail if any required pieces are missing? That way you're using the existing ZFS import behaviour. It means nothing would mount as you insert the first disk of a raid-z volume, but as soon as you plug enough of the disks in, ZFS would mount it automatically (albeit in a degraded state). Then once the pool is mounted, the existing SATA and USB auto-mount behaviour should be enough to incorporate any remaining devices that are inserted.

You might want to just allow simple mounts by default though. Could you have a generic ZFS automount property, with settings of 'off', 'simple', 'all'? Simple pools are definitely going to be the most common usage, but it would be nice to have support for more complex setups too. Especially since this would allow people to do things like easily expand their USB pool once it fills up, just by adding extra USB drives to the pool.

Ross
--
This message posted from opensolaris.org
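To make that concrete, I'm imagining something along these lines (the property name and values are purely made up, nothing like this exists today):

  # hypothetical syntax only; no such property currently exists
  zpool set automount=off usbpool      # never import/mount on hotplug
  zpool set automount=simple usbpool   # import on hotplug only for single-device pools
  zpool set automount=all usbpool      # import whenever enough devices are present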
>>>>> "tf" == Tim Foster <Tim.Foster at Sun.COM> writes:

    tf> store flat zfs send-streams

I thought it was said over and over that 'zfs send' streams could never be stored, only piped to 'zfs recv'. If you store one and then find it's corrupt, the answer is ``didn't let ZFS handle redundancy,'' ``sysadmin's fault, not a bug,'' ``it was corrupted by weak TCP checksums, USB gremlins, poor FLASH ECC, traces on your motherboard without parity (it does happen! be glad it didn't happen silently! restore the pool from ba---oh, that was your backup. shit.)''

Also I don't think it is currently safe to allow mounting of stick-based USB ZFS filesystems on a multi-user machine, because someone could show up at a SunRay cluster with one of these poison-sticks that panics on import. I stumbled onto a bug number with a wild idea for addressing this:

bugs.opensolaris.org/view_bug.do?bug_id=4879357

The suggestion is scary in that problems with one pool will restart ZFS for all pools, and it seems like something that could loop, but the idea of a single bulleted-whitepaper-feechur addressing a whole class of problems is pretty attractive. I guess using FUSE for all removable media is another path, but it feels like defeat---the hotplug stuff isn't always perfect on Macs, but at least they don't seem to panic from corrupt filesystems often, and they do proper high-speed in-kernel filesystems.

I guess I'm asking for something more drastic and beyond common practice with the SunRay reference though---to treat USB sticks as untrusted input, analogous to network packets, meaning that if you can create a stick that makes the kernel panic, you've potentially discovered a kernel-level privilege-escalation exploit, not just a broken stick. With this whole power-saving theme of ``containers'' and so on, it's no longer reasonable to punt and say, ``well, he had physical access to the machine anyway---he could have taken the cover off and done whatever,'' because we'd like to allow people to introduce USB sticks over the network.
I don't remember anyone saying they couldn't be stored, just that if they are stored it's not ZFS's fault if they go corrupt, as it's outside of its control. I'm actually planning to store the zfs send dump on external USB devices myself, but since the USB device will be running on ZFS I expect to be able to find out if there are any corruption problems, at which point I'll just run the send / receive again.

However, I don't think that's what they're talking about here. I think they're talking about a ZFS pool that consists of an external USB device, and doing a send / receive directly to that pool. That way the USB device is a true backup copy of your ZFS pool, and I think the idea is that you can then delete snapshots from your main system, confident that they are still present on the USB backup. If it works it's a nice idea, especially with an integrated restore interface.

And yeah, auto-mounting isn't something you're going to want to enable on production servers. But for desktop / development / small-scale use, it's a great idea.

And I agree 100% about hotplug stuff being untrusted input. You shouldn't be able to crash anything with a USB stick, especially not a bunch of unrelated ZFS filesystems.
--
This message posted from opensolaris.org
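Something like this is what I understand them to mean (pool, dataset and snapshot names are only examples):

  # initial full copy to the pool on the USB device, then incrementals
  zfs send tank/home@2008-10-01 | zfs receive usbbackup/home
  zfs send -i tank/home@2008-10-01 tank/home@2008-10-28 | zfs receive usbbackup/home

  # once the incremental is safely on the USB pool, the old snapshot can be
  # dropped from the main system and still restored from usbbackup later
  zfs destroy tank/home@2008-10-01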
> However, I don't think that's what they're talking about here. I think
> they're talking about a ZFS pool that consists of an external USB
> device, and doing a send / receive directly to that pool. That way the
> USB device is a true backup copy of your ZFS pool, and I think the idea
> is that you can then delete snapshots from your main system, confident
> that they are still present on the USB backup.

That's exactly it. While it's great to have snapshots stored on the same device as the filesystem for convenience, this is limited in that it's not a full recovery solution, since it doesn't offer any protection from physical hardware failure, especially on laptops and workstations where there will most likely be just one hard disk in the system. What we have right now is protection from user and software errors, which is probably the source of most cases of data loss or corruption, but we need to extend this to cover hardware failure, and we need to be able to back up to secondary devices to achieve that. For most single users that means an external USB disk or thumb drive.

> If it works it's a nice idea, especially with an integrated restore
> interface.

That would also be part of our plan.

Cheers,
Niall
--
This message posted from opensolaris.org
>>>>> "r" == Ross <myxiplx at googlemail.com> writes:

     r> I don't remember anyone saying they couldn't be stored, just
     r> that if they are stored it's not ZFS' fault if they go corrupt
     r> as it's outside of its control.

They can't be stored, because they ``go corrupt'' in a transactional, all-or-nothing way that other checksummed storable things like zpools, tarballs, and zipfiles do not. The software's written with the assumption it'll be used in a pipe, and I think fails appropriately for that use. But it's utterly without grace if you accidentally separate a stored 'zfs send' from your opportunity to re-send it.

     r> I'm actually planning to store the zfs send dump on external
     r> USB devices myself, but since the USB device will be running
     r> on ZFS I expect to be able to find out if there are any
     r> corruption problems,

yeah, I still think it's not a good idea. Our best practices are based on a feel for the statistics of bit-flips and their consequences. If you flip one bit:

  in tar: you get the file back, with the flipped bit, and a warning from tar.
  in UFS: you get the file back with a bit flipped.
  in ZFS with redundancy: you get the file back with the bit unflipped.
  in ZFS without redundancy: you are warned about the flipped bit by losing the whole file.
  in 'zfs send': you lose the ENTIRE FILESYSTEM because of one bit.

'zfs recv' will do a great job of warning you about corruption problems, whether it has ZFS underneath it or not. I don't think anyone could mistake the signal it sends for affection or negotiability. That's not the issue. The issue is the probability you'll be able to successfully restore your backup <n> years from now. We have a feel, not as good as we should but some, for this probability given 'tar' on tape, DVD-R, disk, stick. I think 'zfs send' reduces this probability enough compared to other kinds of backup that you'd be crazy to use it. I also think this observation isn't adequately communicated to the average sysadmin by the disclaimery-sounding IT warning ``*BRRK* *BRRK* 'zfs send' is not a Backup Soloooshun!!'' It sounds like someone wants you to spend more money on CYAware and not, as it should, like they are begging you to use tar instead.

There are other problems besides fragility with storing 'zfs send' that are solved by both 'tar' and piping immediately to 'zfs recv':

 * the stream format has not been as forward-portable across zfs versions and endianness as the pool structure itself. You could easily find yourself with a stream that won't restore, with no user-readable marker as to what it requires to restore itself. The choices are unstable Nevada builds you may not even be able to obtain two years from now, much less get to boot on the hardware for sale at the time, and machine endiannesses(!)---much harder to iteratively test than versions of GNU tar, if tar had this same problem, which it doesn't. It may even demand to be restored ONTO a specific version of zpool, so your desperate restore-attempt strategy is to reinstall an older Nevada, destroy and recreate a blank zpool, 'zfs recv', repeat. It's a bad situation. Store a zpool instead and you won't be in it, because there's a commitment to forward-portability and endianness-independence.

 * there's no equivalent of 'zpool scrub' or 'tar t'. The only way to verify a 'zfs send' dump is to restore it. I think it's a best practice when making backups to read back what you wrote. This catches medium errors, broken network pipes, filesystems you didn't quite sync before ejecting, whatever. You don't need to do silly Windows things like compare the backup to the files still on the disk and then scream OMG OMG something Changed!, but you ought to at least read it back now if you expect to restore it later.

Is something missing from the toolbox? hell yes! Even beyond these inadequacies I'd like a way to restore a ZFS snapshot tree onto an LVM2 or NetApp snapshot tree, so I rather think it'll be some stream-storing rsync or deduplicating GNU tar instead of something coming from the ZFS project. (AIUI GNU tar incrementals include whole changed files, which is great for portability but useless for VM disk images.)

There are interesting questions, like: would it drastically improve the efficiency of incremental backups if ZFS exposed some interface to its snapshot tree besides two copies of the file? Maybe the 'zfs send' stream itself is sufficient exposure, or maybe it needs to be some file-level rather than filesystem-level map. But this should probably come after we have userspace tools that do the job correctly but inefficiently given two copies of the file. And how do we design a storable incremental format that will take bit-flips gracefully, without seg-faulting and without killing the entire snapshot tree because one block is lost forever?

Thirdly, I think it'd be good to write backups so one ``tape'' can be lost, e.g.:

 day 1:  full backup
 day 3:  incremental based on day 1
 day 5:  incremental based on day 1
 day 6:  full backup
 day 7:  incremental based on day 1
 day 8:  incremental based on day 6
 day 9:  incremental based on day 1
 day 10: incremental based on day 6
 day 11: full backup
 day 12: incremental based on day 6
 day 13: incremental based on day 11
 day 14: incremental based on day 6
 day 15: incremental based on day 11

Maybe it's a bit quaint given how people do backups now, I don't know. The point of it: you can combine backups on a single ``tape'' (stick) if there's enough space, but you can never combine something written on an odd day with something written on an even day. If you insist on foolishly using 'zfs send', maybe a two-stick strategy of that sort can boost your odds a bit. I still think they're not good enough.
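In 'zfs send' terms the bookkeeping for that scheme would look roughly like this (dataset names, snapshot names, and mount points are made up; this illustrates the rotation, it is not an endorsement of storing the streams):

  # odd-day stick: everything relative to the day-1 full
  zfs send tank/home@day1 > /odd-stick/full-day1.zfs
  zfs send -i tank/home@day1 tank/home@day3 > /odd-stick/incr-day3.zfs
  zfs send -i tank/home@day1 tank/home@day5 > /odd-stick/incr-day5.zfs

  # even-day stick: fresh full on day 6, later incrementals relative to it
  zfs send tank/home@day6 > /even-stick/full-day6.zfs
  zfs send -i tank/home@day6 tank/home@day8 > /even-stick/incr-day8.zfs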
Hi Miles,

I probably should have explained that storing the zfs send stream on a USB device is just one part of the strategy; in fact, that's just our way of getting the backup off-site. Once off-site, we do a zfs receive of that into another pool, and we plan to have two off-site zfs pools, plus standard tar backups on tape (just in case the zpools all get corrupted somehow).

We're also considering streaming zfs send to a file, and then FTP'ing that file to the remote servers. I don't fancy restarting an entire zfs send because a packet got lost 4/5 of the way through. I'd much rather use restartable FTP, and do the zfs receive later on.

Ross
--
This message posted from opensolaris.org
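i.e. roughly this (pool names, paths, and snapshot names are only examples):

  # on the source machine: dump the incremental stream to a file
  zfs send -i tank/home@sunday tank/home@monday > /usbbackup/home-monday.zfs

  # ...transfer the file to the remote site with restartable FTP (details omitted)...

  # on the remote machine: receive it into the off-site pool whenever it arrives
  # (assumes @sunday has already been received there)
  zfs receive offsite/home < /incoming/home-monday.zfs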