I'm trying to figure out an algorithm for taking an arbitrary mounted
btrfs directory and breaking it down into:

<device(s), subvolume, subpath>

where, keep in mind, <subpath> may not actually be part of the mount.

/proc/self/mountinfo seems to have some of that information; however, it
does not appear to distinguish between non-default subvolumes and
directories. At the same time, once I have mounted a subvolume I see
its name in the root btrfs directory even if I didn't access it.

Questions, thus:

a. Are subvolumes always part of the "root" namespace? If so, is it the
mounted root, the default subvolume, or subvolume 0 which always exposes
these other subvolumes? Are there disambiguation rules so that if I
have /btrfs/root/blah and "blah" is both a subvolume and a directory (I
presume that can happen?)

b. Are there better ways (walking the tree using BTRFS_IOC_TREE_SEARCH?)
to accomplish this than using /proc/self/mountinfo?

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Mon, 2012-06-18 at 17:39 -0700, H. Peter Anvin wrote:
> I'm trying to figure out an algorithm for taking an arbitrary mounted
> btrfs directory and breaking it down into:
>
> <device(s), subvolume, subpath>
>
> where, keep in mind, <subpath> may not actually be part of the mount.
>
> /proc/self/mountinfo seems to have some of that information, however, it
> does not appear to distinguish between non-default subvolumes and
> directories. At the same time, once I have mounted a subvolume I see
> its name in the root btrfs directory even if I didn't access it.
>
> Questions, thus:
>
> a. Are subvolumes always part of the "root" namespace?

Yes. There is only one namespace in btrfs that all files, directories,
and subvolumes are contained in.

> If so, is it the
> mounted root, the default subvolume, or subvolume 0 which always exposes
> these other subvolumes?

All subvolumes are accessible from the volume mounted when you use -o
subvolid=0. (Note that 0 is not the real ID of the root volume, it's
just a shortcut for mounting it.)

The 'default' subvolume can be arbitrarily changed to any subvolume by a
user; the result is equivalent to having a file-system default
'subvolid=' value that's used when none is specified by the user.

> Are there disambiguation rules so that if I
> have /btrfs/root/blah and "blah" is both a subvolume and a directory (I
> presume that can happen?)

This cannot happen; see my first answer.

> b. Are there better ways (walking the tree using BTRFS_IOC_TREE_SEARCH?)
> to accomplish this than using /proc/self/mountinfo?

I'm not sure; you might want to look into how the
btrfs subvolume list <path>
tool reads the list of subvolumes. (Note that this tool lists the
subvolume paths relative to the root, and the <path> parameter is only
used to determine which btrfs filesystem you're looking at.)

Unless it has changed recently, mounting a subvolume by path on btrfs is
(almost) the same as mounting the root volume, then doing a bind mount
like "mount --bind /mnt/subvolume/path /mnt". It used to not even check
whether you were attempting to mount a directory or a subvolume.

--
Calvin Walton <calvin.walton@kepstin.ca>

On 06/19/2012 07:22 AM, Calvin Walton wrote:
>
> All subvolumes are accessible from the volume mounted when you use -o
> subvolid=0. (Note that 0 is not the real ID of the root volume, it's
> just a shortcut for mounting it.)
>

Could you clarify this bit? Specifically, what is the real ID of the
root volume, then?

I found that after having set the default subvolume to something other
than the root, and then mounting it without the -o subvol= option,
the subvolume name does *not* show in /proc/self/mountinfo; the same
happens if a subvolume is mounted by -o subvolid= rather than -o subvol=.

Is this a bug? This would seem to give the worst of both worlds in
terms of actually knowing what the underlying filesystem path would end
up looking like.

-hpa

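For readers following along, here is a minimal sketch of pulling the relevant fields out of one /proc/self/mountinfo line. The field layout is the one documented in proc(5); the function name and the sample line are made up for illustration, and the sketch does not decode octal escapes such as \040 in paths. As discussed in this thread, the "root" field may not name the subvolume when the mount was made via the default subvolume or -o subvolid=, so this is only a starting point.

```python
def parse_mountinfo_line(line):
    """Split one mountinfo line into the fields relevant here.

    Hypothetical helper; layout per proc(5):
    id parent major:minor root mount_point options [optional...] - fstype source super_options
    """
    left, _, right = line.partition(" - ")
    fields = left.split()
    fstype, source, super_opts = right.split()[:3]
    return {
        "root": fields[3],          # path within the filesystem that is mounted
        "mount_point": fields[4],   # where it is mounted
        "fstype": fstype,
        "source": source,
        "super_options": super_opts.split(","),
    }


# Illustrative line only, not captured from a real system.
sample = "45 22 0:38 /snap /mnt rw,relatime shared:27 - btrfs /dev/sda2 rw,space_cache"
info = parse_mountinfo_line(sample)
print(info["root"], info["mount_point"], info["fstype"])
```
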
On Mon, Jun 18, 2012 at 06:39:31PM -0600, H. Peter Anvin wrote:
> I'm trying to figure out an algorithm for taking an arbitrary mounted
> btrfs directory and breaking it down into:
>
> <device(s), subvolume, subpath>
>
> where, keep in mind, <subpath> may not actually be part of the mount.

Do you want an API for this, or is it enough to wander through /dev/disk
style symlinks?

The big reason it isn't here yet is because Kay had this neat patch to
blkid and udev to just put all the info you need into /dev/btrfs (or
some other suitable location). It would allow you to see which devices
belong to which filesystems etc.

>
> /proc/self/mountinfo seems to have some of that information, however, it
> does not appear to distinguish between non-default subvolumes and
> directories. At the same time, once I have mounted a subvolume I see
> its name in the root btrfs directory even if I didn't access it.
>
> Questions, thus:
>
> a. Are subvolumes always part of the "root" namespace? If so, is it the
> mounted root, the default subvolume, or subvolume 0 which always exposes
> these other subvolumes? Are there disambiguation rules so that if I
> have /btrfs/root/blah and "blah" is both a subvolume and a directory (I
> presume that can happen?)

Subvolumes may become disconnected from the root namespace. In this
case we can find it just by the subvol id, and mount it into an
arbitrary directory.

>
> b. Are there better ways (walking the tree using BTRFS_IOC_TREE_SEARCH?)
> to accomplish this than using /proc/self/mountinfo?

Not yet, but I'm definitely open to adding them. Let's just hash out
what you need and we'll either go through Kay's stuff or add ioctls for
you.

-chris

On 06/19/2012 04:49 PM, Chris Mason wrote:
> On Mon, Jun 18, 2012 at 06:39:31PM -0600, H. Peter Anvin wrote:
>> I'm trying to figure out an algorithm for taking an arbitrary mounted
>> btrfs directory and breaking it down into:
>>
>> <device(s), subvolume, subpath>
>>
>> where, keep in mind, <subpath> may not actually be part of the mount.
>
> Do you want an API for this, or is it enough to wander through /dev/disk
> style symlinks?
>
> The big reason it isn't here yet is because Kay had this neat patch to
> blkid and udev to just put all the info you need into /dev/btrfs (or
> some other suitable location). It would allow you to see which devices
> belong to which filesystems etc.
>

I want an algorithm; it doesn't have to be an API per se. I would really
like to avoid relying on blkid and udev for this, though... that is
pretty much a nonstarter.

If the answer is to walk the tree then I'm fine with that.

> subvolumes may become disconnected from the root namespace. In this
> case we can find it just by the subvol id, and mount it into an
> arbitrary directory.

OK, so it sounds like the best thing is actually to record the subvolume
*number* (ID) where (in my case) Syslinux is installed. This is actually
a good thing because the fewer O(n) strings I have to stick into the
boot block the better.

>> b. Are there better ways (walking the tree using BTRFS_IOC_TREE_SEARCH?)
>> to accomplish this than using /proc/self/mountinfo?
>
> Not yet, but I'm definitely open to adding them. Lets just hash out
> what you need and we'll either go through Kay's stuff or add ioctls for
> you.

Well, I'd be interested in what Kay's stuff actually does. Other than
that, I would suggest adding a pair of ioctls that, when executed on an
arbitrary btrfs inode, return the corresponding subvolume and the path
relative to the subvolume root, respectively.

-hpa

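To illustrate why recording the subvolume number keeps the boot-block record small, here is a rough sketch of a fixed-width ID followed by a single path string. The layout (little-endian 64-bit ID, then a NUL-terminated UTF-8 path), the function names, and the byte budget are all assumptions made up for this example; this is not the format Syslinux actually uses.

```python
import struct


def pack_install_location(subvol_id, subpath, budget):
    """Pack a 64-bit subvolume ID plus a NUL-terminated path (hypothetical layout)."""
    record = struct.pack("<Q", subvol_id) + subpath.encode("utf-8") + b"\0"
    if len(record) > budget:
        raise ValueError(
            "record needs %d bytes, only %d available" % (len(record), budget)
        )
    return record


def unpack_install_location(record):
    """Inverse of pack_install_location."""
    (subvol_id,) = struct.unpack_from("<Q", record, 0)
    end = record.index(b"\0", 8)  # the path starts right after the 8-byte ID
    return subvol_id, record[8:end].decode("utf-8")


# Illustrative values only.
blob = pack_install_location(257, "/boot/syslinux", budget=64)
print(len(blob), unpack_install_location(blob))
```

The only variable-length part is the path inside the subvolume; the subvolume itself costs a constant 8 bytes regardless of where it sits in the namespace.
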
Hello, Chris,

You wrote on 19.06.12:

>> I'm trying to figure out an algorithm for taking an arbitrary
>> mounted btrfs directory and breaking it down into:
>>
>> <device(s), subvolume, subpath>
>>
>> where, keep in mind, <subpath> may not actually be part of the
>> mount.

> Do you want an API for this, or is it enough to wander through
> /dev/disk style symlinks?

> The big reason it isn't here yet is because Kay had this neat patch
> to blkid and udev to just put all the info you need into /dev/btrfs
> (or some other suitable location). It would allow you to see which
> devices belong to which filesystems etc.

"btrfs" should work even without any "udev" installation.

Best regards,
Helmut

>> The big reason it isn't here yet is because Kay had this neat patch
>> to blkid and udev to just put all the info you need into /dev/btrfs
>> (or some other suitable location). It would allow you to see which
>> devices belong to which filesystems etc.
>
> "btrfs" should work even without any "udev" installation.

It does; you can always mount with an explicit -o
device=/dev/foo,device=/dev/bar if you're inclined to punish
yourself^w^w^w^w^w your requirements dictate that you don't rely on
udev.

On Wed, Jun 20, 2012 at 6:35 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> On 06/19/2012 07:22 AM, Calvin Walton wrote:
>>
>> All subvolumes are accessible from the volume mounted when you use -o
>> subvolid=0. (Note that 0 is not the real ID of the root volume, it's
>> just a shortcut for mounting it.)
>>
>
> Could you clarify this bit? Specifically, what is the real ID of the
> root volume, then?

5

--
Fajar

On 06/19/2012 06:16 PM, cwillu wrote:
>>> The big reason it isn't here yet is because Kay had this neat patch
>>> to blkid and udev to just put all the info you need into /dev/btrfs
>>> (or some other suitable location). It would allow you to see which
>>> devices belong to which filesystems etc.
>>
>> "btrfs" should work even without any "udev" installation.
>
> It does; you can always mount with an explicit -o
> device=/dev/foo,device=/dev/bar if you're inclined to punish
> yourself^w^w^w^w^w your requirements dictate that you don't rely on
> udev.

I think you're misunderstanding what this is about.

I'm working on trying to make the Syslinux installer for btrfs as robust
as it possibly can be. I really don't like leaving corner cases where
it will do the wrong thing and leave your system unbootable.

Now, that having been said, there are a lot of things for which it is
not really clear how they should work given btrfs. Specifically, what is
needed is:

1. The underlying device(s) for boot block installation.
2. A concept of a root.
3. A concept of a path within that root to the installation directory,
   where we can find syslinux.cfg and the other bootloader modules.

All of this needs to be installed in the fixed-size boot block, so a
compact representation is very much a plus.

The concept of what is the "root" and what is the "path" is
straightforward for lesser filesystems: the root of the filesystem is
defined by the root inode, and the path is a unique sequence of
directories from that root. Note that this is completely independent of
how the filesystem was mounted when the boot loader was installed.

For btrfs, a lot of things aren't so clear-cut, especially in the light
of explicit and implicit subvolumes.

Furthermore, sorting out these semantic issues is really important in
order to support the "atomic update" scenario:

a. Make a snapshot of the current root;
b. Mount said snapshot;
c. Install the new distro on the snapshot;
d. Change the bootloader configuration *inside* the snapshot to point
   to the snapshot as the root;
e. Install the bootloader on the snapshot, thereby making the boot
   block point to it and making it "live".

If the root also contains subvolumes, e.g. /boot may be a subvolume
because it has different policies, this gets pretty gnarly to get right.
It is also of very high value to get right.

So it is possible I'm approaching this wrong. I would love to have a
discussion about this.

-hpa

On Wed, Jun 20, 2012 at 10:22 AM, H. Peter Anvin <hpa@zytor.com> wrote:
> a. Make a snapshot of the current root;
> b. Mount said snapshot;
> c. Install the new distro on the snapshot;
> d. Change the bootloader configuration *inside* the snapshot to point
> to the snapshot as the root;
> e. Install the bootloader on the snapshot, thereby making the boot
> block point to it and making it "live".

IMHO a more elegant solution would be similar to what
(open)solaris/indiana does: keep the boot parts (bootloader,
configuration) in a separate area, separate from root snapshots. In
the Solaris case, IIRC, this is /rpool/grub.

A similar approach should be implementable in linux, at least on
certain configurations, since if you put /boot as part of "/" (thus,
also on btrfs), AND you don't change the default subvolume, AND the
roots are on their own subvolumes, the paths to vmlinuz and initrd in
grub.cfg will have the subvolume name in them. So it's possible to have a
single grub.cfg with several entries that point to different
subvolumes. You then don't need to install a new bootloader to make a
particular subvolume live; you only need to select it from the boot menu.

I'm doing this currently with ubuntu precise, but with a
manually-created grub.cfg. I still haven't found a way to manage
this automatically.

--
Fajar

On Tue, Jun 19, 2012 at 04:35:59PM -0700, H. Peter Anvin wrote:
> On 06/19/2012 07:22 AM, Calvin Walton wrote:
> >
> > All subvolumes are accessible from the volume mounted when you use -o
> > subvolid=0. (Note that 0 is not the real ID of the root volume, it's
> > just a shortcut for mounting it.)
> >
>
> Could you clarify this bit? Specifically, what is the real ID of the
> root volume, then?
>
> I found that after having set the default subvolume to something other
> than the root, and then mounting it without the -o subvol= option, then
> the subvolume name does *not* show in /proc/self/mountinfo; the same
> happens if a subvolume is mounted by -o subvolid= rather than -o subvol=.
>
> Is this a bug? This would seem to give the worst of both worlds in
> terms of actually knowing what the underlying filesystem path would end
> up looking like.

Yes, it's a bug, and rather an irritating one at that. I know that
David Sterba looked at fixing it, but apparently it was trickier to
fix than was expected. (I don't recall the reason, and probably
wouldn't have understood it anyway, so I'll leave it to Dave to tell
you about it in detail).

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Doughnut furs ache me, Omar Dorlin. ---

On Tue, Jun 19, 2012 at 06:03:49PM -0600, H. Peter Anvin wrote:
> On 06/19/2012 04:49 PM, Chris Mason wrote:
> > On Mon, Jun 18, 2012 at 06:39:31PM -0600, H. Peter Anvin wrote:
> >> I'm trying to figure out an algorithm for taking an arbitrary mounted
> >> btrfs directory and breaking it down into:
> >>
> >> <device(s), subvolume, subpath>
> >>
> >> where, keep in mind, <subpath> may not actually be part of the mount.
> >
> > Do you want an API for this, or is it enough to wander through /dev/disk
> > style symlinks?
> >
> > The big reason it isn't here yet is because Kay had this neat patch to
> > blkid and udev to just put all the info you need into /dev/btrfs (or
> > some other suitable location). It would allow you to see which devices
> > belong to which filesystems etc.
> >
>
> I want an algorithm; it doesn't have to be an API per se. I would really
> like to avoid relying on blkid and udev for this, though... that is
> pretty much a nonstarter.
>
> If the answer is to walk the tree then I'm fine with that.

Ok, fair enough.

>
> > subvolumes may become disconnected from the root namespace. In this
> > case we can find it just by the subvol id, and mount it into an
> > arbitrary directory.
>
> OK, so it sounds like the best thing is actually to record the subvolume
> *number* (ID) where (in my case) Syslinux is installed. This is actually
> a good thing because the fewer O(n) strings I have to stick into the
> boot block the better.

Right, the subvolume number doesn't change over the life of the subvol,
regardless of the path that was used for mounting the subvol. So all
you need is that number (64 bits) and the filename relative to the
subvol root and you're set.

We'll have to add an ioctl for that.

Finding the path relative to the subvol is easy: just walk backwards up
the directory chain (cd ..) until you get to inode number 256. All the
subvol roots have inode number 256.

>
> >> b. Are there better ways (walking the tree using BTRFS_IOC_TREE_SEARCH?)
> >> to accomplish this than using /proc/self/mountinfo?
> >
> > Not yet, but I'm definitely open to adding them. Lets just hash out
> > what you need and we'll either go through Kay's stuff or add ioctls for
> > you.
>
> Well, I'd be interested in what Kay's stuff actually does. Other than
> that, I would suggest adding a pair of ioctls that when executed on an
> arbitrary btrfs inode returns the corresponding subvolume and one which
> returns the path relative to the subvolume root.

udev already scans block devices as they appear. When it finds btrfs,
it calls the btrfs dev scan ioctl for that one device. It also reads in
the FS uuid and the device uuid and puts them into a tree.

Very simple stuff, but it gets rid of the need to manually call btrfs
dev scan yourself.

-chris

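The "walk up until inode 256" idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name is made up, the stat call is injectable so the logic can be tried without a btrfs mount, and the sketch assumes the subvolume root is actually reachable through parent directories (bind mounts or a mount of a plain directory inside the subvolume would defeat it). It also does not verify that the path is on btrfs at all.

```python
import os

SUBVOL_ROOT_INO = 256  # per the thread: every subvolume root has inode number 256


def split_at_subvol_root(path, stat=os.stat):
    """Return (subvolume_root, path_inside_subvolume) by walking up parents.

    Hypothetical helper; `stat` is a parameter so a fake can be used for testing.
    """
    current = os.path.realpath(path)
    parts = []
    while True:
        if stat(current).st_ino == SUBVOL_ROOT_INO:
            return current, "/" + "/".join(reversed(parts))
        parent = os.path.dirname(current)
        if parent == current:
            # Reached "/" without ever seeing inode 256.
            raise LookupError("no inode 256 found above %r" % path)
        parts.append(os.path.basename(current))
        current = parent
```

On a real system one would call split_at_subvol_root("/some/btrfs/dir") and still need a separate way to learn the subvolume ID of the returned root, which is what the proposed ioctl would provide.
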
On 06/20/2012 06:34 AM, Chris Mason wrote:
>>
>> I want an algorithm; it doesn't have to be an API per se. I would really
>> like to avoid relying on blkid and udev for this, though... that is
>> pretty much a nonstarter.
>>
>> If the answer is to walk the tree then I'm fine with that.
>
> Ok, fair enough.
>
>
> Right, the subvolume number doesn't change over the life of the subvol,
> regardless of the path that was used for mounting the subvol. So all
> you need is that number (64 bits) and the filename relative to the
> subvol root and you're set.
>
> We'll have to add an ioctl for that.
>
> Finding the path relative to the subvol is easy, just walk backwards up
> the directory chain (cd ..) until you get to inode number 256. All the
> subvol roots have inode number 256.
>

... assuming I can actually see the root (that it is not obscured
because of bindmounts and so on). Yes, I know I'm weird for worrying
about these hyper-obscure corner cases, but I have a pretty explicit
goal of trying to write Syslinux so it will very rarely if ever do the
wrong thing, especially since I can already get that information for
other filesystems.

The other thing, of course, is what the desirable behavior is, which I
have brought up in a few posts already. Specifically, I see two
possibilities:

a. Always handle a path from the global root, and treat subvolumes as
   directories. This would mostly require that the behavior of
   /proc/self/mountinfo with regard to mount -o subvolid= be fixed.
   I also have no idea how one would deal with a detached subvolume,
   or if that subcase even matters.

   A major problem with this is that it may be *very* confusing to a
   user to have to specify a path in their bootloader configuration as
   /subvolume/foo/bar when the string "subvolume" doesn't show up in
   any way in their normal filesystem.

b. Treat the subvolume as the root (which is what I so far have been
   assuming.) In this case, I think the <subvolume ID,
   path_in_subvolume> ioctl is the way to go, unless there is a way to
   do this with BTRFS_IOC_TREE_SEARCH already.

I think I'm still leaning toward "b" just because of the very high
potential for user confusion with "a".

>>
>> Well, I'd be interested in what Kay's stuff actually does. Other than
>> that, I would suggest adding a pair of ioctls that when executed on an
>> arbitrary btrfs inode returns the corresponding subvolume and one which
>> returns the path relative to the subvolume root.
>
> udev already scans block devices as they appear. When it finds btrfs,
> it calls the btrfs dev scan ioctl for that one device. It also reads in
> the FS uuid and the device uuid and puts them into a tree.
>
> Very simple stuff, but it gets rid of the need to manually call btrfs
> dev scan yourself.
>

For the record, I implemented the use of BTRFS_IOC_DEV_INFO yesterday;
it is still way better than what I had there before and will make an
excellent fallback for a new ioctl.

This would be my suggestion for a new ioctl:

1. Add the device number to the information already returned by
   BTRFS_IOC_DEV_INFO.
2. Allow returning more than one device at a time. Userspace can
   already know the number of devices from BTRFS_IOC_FS_INFO(*), and
   it'd be better to just size a buffer and return N items rather than
   having to iterate over the potentially sparse devid space.

I might write this one up if I can carve out some time today...

-hpa

(*) - because race conditions are still possible, a buffer size/limit
check is still needed.

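For context, the iteration over a sparse devid space that the proposal above wants to avoid looks roughly like the sketch below. It is a simulation, not real ioctl code: dev_info is a hypothetical callable standing in for a BTRFS_IOC_DEV_INFO wrapper that returns None when no device has the given devid, and num_devices and max_id are assumed to come from BTRFS_IOC_FS_INFO. The sketch also assumes devids start at 1.

```python
def collect_devices(num_devices, max_id, dev_info):
    """Probe devids 1..max_id until num_devices devices have been found.

    `dev_info(devid)` is a hypothetical stand-in for a BTRFS_IOC_DEV_INFO
    wrapper; it returns None for a devid that is not in use.
    """
    found = []
    for devid in range(1, max_id + 1):
        info = dev_info(devid)
        if info is None:
            continue  # hole in the devid space
        found.append(info)
        if len(found) == num_devices:
            break  # stop early once every reported device is accounted for
    return found


# Illustrative data only: two devices with a gap between their IDs.
present = {1: "/dev/sda2", 4: "/dev/sdb2"}
print(collect_devices(2, 10, present.get))
```

Because devices can be added or removed between the FS_INFO call and the probes, the result may hold fewer entries than num_devices; a caller should check for that rather than assume the list is complete.
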
On 06/19/2012 11:31 PM, Fajar A. Nugraha wrote:
>
> IMHO a more elegant solution would be similar to what
> (open)solaris/indiana does: keep the boot parts (bootloader,
> configuration) in a separate area, separate from root snapshots. In
> the Solaris case, IIRC, this is /rpool/grub.
>

It is both more and less elegant; it means you don't get the same kind
of atomic update for the bootloader itself.

-hpa