Hi,

How could one find out if 2 files share any extents on a btrfs file system?

A more generic variation of the above: How to list files on the same
file system/subvolume sharing content?

Thanks,
Gábor
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Tue, Oct 30, 2012 at 04:20:05PM +0100, Gábor Nyers wrote:
> Hi,
>
> How could one find out if 2 files share any extents on a btrfs file system?
>
> A more generic variation of the above: How to list files on the same
> file system/subvolume sharing content?

You have direct (read-only) access to the metadata trees through the
TREE_SEARCH ioctl. It should be possible to walk through the extents of a
given file, and (I think) follow back-refs from the extent back to the
other files that share it. There's no simple code to do that right now,
though.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- And what rough beast, its hour come round at last / slouches ---
towards Bethlehem, to be born?

On Tue, October 30, 2012 at 16:39 (+0100), Hugo Mills wrote:
> It should be possible to walk through the
> extents of a given file, and (I think) follow back-refs from the
> extent back to the other files that share it.

You wish :-) Backrefs are not made to be walked while the file system is
online. However, "btrfs inspect logical" manages quite well; at least I
haven't heard otherwise so far. You still need to get the logical block
numbers, either through the TREE_SEARCH ioctl or with filefrag.

-Jan
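
For the two-file case, once the extent addresses of both files have been collected (for example from filefrag output), finding shared extents comes down to intersecting two lists of byte ranges. The following is a minimal sketch of that step only, not an existing tool. The names `merge` and `shared_ranges` are illustrative, and it assumes each extent is given as a (start, length) pair in bytes, both lists using the same address space. On btrfs the addresses that filefrag reports are, as far as I know, btrfs logical addresses, which is also what "btrfs inspect logical" expects.

```python
from typing import List, Tuple

# One extent as (start, length) in bytes. Both files must use the same
# address space (assumption: the values were collected with the same tool).
Extent = Tuple[int, int]


def merge(extents: List[Extent]) -> List[List[int]]:
    """Sort the extents and fuse overlapping or touching ones into [start, end] pairs."""
    merged: List[List[int]] = []
    for start, length in sorted(extents):
        end = start + length
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def shared_ranges(a: List[Extent], b: List[Extent]) -> List[Extent]:
    """Return the (start, length) byte ranges referenced by both extent lists."""
    ra, rb = merge(a), merge(b)
    out: List[Extent] = []
    i = j = 0
    while i < len(ra) and j < len(rb):
        lo = max(ra[i][0], rb[j][0])
        hi = min(ra[i][1], rb[j][1])
        if lo < hi:
            out.append((lo, hi - lo))
        # Advance whichever range ends first; the other may still overlap more.
        if ra[i][1] <= rb[j][1]:
            i += 1
        else:
            j += 1
    return out
```

An empty result means the two files share nothing in the address space that was sampled. A non-empty result lists the byte ranges that both files reference, and each start address could then be passed to "btrfs inspect logical" to see which other files point at it.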

On 10/30/2012 11:20 PM, Gábor Nyers wrote:
> Hi,
>
> How could one find out if 2 files share any extents on a btrfs file system?
>
> A more generic variation of the above: How to list files on the same
> file system/subvolume sharing content?

Indeed, ocfs2 already has a feature where you can get the shared parts via
'du'; we're planning to support this in btrfs, too.

thanks,
liubo

On 10/31/2012 08:40 AM, Liu Bo wrote:
> On 10/30/2012 11:20 PM, Gábor Nyers wrote:
>> Hi,
>>
>> How could one find out if 2 files share any extents on a btrfs file system?
>>
>> A more generic variation of the above: How to list files on the same
>> file system/subvolume sharing content?

One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED, so that
we can walk a file through fiemap(2) to figure out how many of its extents
are shared, and calculate the real storage (fs/subvolume) footprint in the
end.

Thanks,
-Jeff

> Indeed ocfs2 already has the feature where you can get shared parts via 'du',
> we're planning to support this in btrfs, too.
>
> thanks,
> liubo
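
To make the fiemap(2) idea concrete, here is a rough sketch of walking a file's extents from userspace and adding up those flagged FIEMAP_EXTENT_SHARED. The constants and struct layouts follow linux/fiemap.h; the function names (`parse_extents`, `fiemap`, `shared_bytes`) are made up for this example. It assumes a Linux file system that implements FIEMAP and actually sets the flag, which is exactly what is being proposed for btrfs in this thread, so on a file system without that support the shared total is simply 0.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B      # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x0001       # sync the file before mapping it
FIEMAP_EXTENT_LAST = 0x0001     # last extent of the file
FIEMAP_EXTENT_SHARED = 0x2000   # extent is referenced by other files too

# struct fiemap header: fm_start, fm_length, fm_flags, fm_mapped_extents,
# fm_extent_count, fm_reserved (32 bytes).
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length, fe_reserved64[2],
# fe_flags, fe_reserved[3] (56 bytes).
EXTENT = struct.Struct("=QQQ2QI3I")


def parse_extents(buf):
    """Decode a filled-in fiemap buffer into (logical, physical, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]
    extents = []
    for n in range(mapped):
        f = EXTENT.unpack_from(buf, HEADER.size + n * EXTENT.size)
        extents.append((f[0], f[1], f[2], f[5]))
    return extents


def fiemap(path, batch=128):
    """Return all extents of `path`, asking the kernel for `batch` extents per call."""
    extents = []
    start = 0
    with open(path, "rb") as fh:
        while True:
            buf = bytearray(HEADER.size + batch * EXTENT.size)
            HEADER.pack_into(buf, 0, start, 0xFFFFFFFFFFFFFFFF - start,
                             FIEMAP_FLAG_SYNC, 0, batch, 0)
            fcntl.ioctl(fh.fileno(), FS_IOC_FIEMAP, buf, True)
            got = parse_extents(buf)
            extents.extend(got)
            if not got or got[-1][3] & FIEMAP_EXTENT_LAST:
                return extents
            # Continue right after the last extent returned so far.
            start = got[-1][0] + got[-1][2]


def shared_bytes(extents):
    """Sum the lengths of all extents carrying FIEMAP_EXTENT_SHARED."""
    return sum(length for _, _, length, flags in extents
               if flags & FIEMAP_EXTENT_SHARED)
```

Something like `shared_bytes(fiemap("/mnt/btrfs/some-file"))` (the path is only a placeholder) would then give the number of bytes the file shares with others. The ioctl raises OSError on file systems that do not implement FIEMAP.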

On Wed, Oct 31, 2012 at 10:30:22AM +0800, Jeff Liu wrote:
> One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED so that
> we can go through a file to figure out how many extents are shared
> through fiemap(2), and calculate the real storage(fs/subvolume) footprint
> in the end.

This will cost at least one more seek per extent to find out that the
extent is shared, which could be quite expensive.

And without any possibility to turn this off, I'm afraid this will render
FIEMAP unusable in practice.

david

On 10/31/2012 07:31 PM, David Sterba wrote:
> On Wed, Oct 31, 2012 at 10:30:22AM +0800, Jeff Liu wrote:
>> One idea is to mark those cloned extents as FIEMAP_EXTENT_SHARED so that
>> we can go through a file to figure out how many extents are shared
>> through fiemap(2), and calculate the real storage(fs/subvolume) footprint
>> in the end.
>
> This will cost at least one more seek per extent to find out that the
> extent is shared, could be quite expensive.

I proposed this because OCFS2 reports shared space this way, in
combination with du(1).

An old patch set that makes du(1) aware of reflinked files:
https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html

Do you mean that the per-file extent status check from userland is very
expensive? If so: I once tested a 50 GB OCFS2 partition filled with
reflinked files on an old laptop, and it took around 4 minutes to show the
total results, if I recall correctly. This definitely depends on the
real-world scenario, though.

> And without any possibility to turn this off, I'm afraid this will render
> FIEMAP unusable in practice.

For OCFS2, the FIEMAP_EXTENT_SHARED flag is set at fiemap ioctl(2) time if
an extent is OCFS2_EXT_REFCOUNTED (i.e. reflinked or cloned), which means
that FIEMAP_EXTENT_SHARED is not a persistent flag. I have no idea how
Btrfs would handle this point, though. :(

Thanks,
-Jeff
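
The du(1)-style accounting mentioned here can be illustrated with a little interval arithmetic: the apparent size counts every extent of every file, while the real footprint counts each on-disk byte only once. This is only a sketch of the arithmetic, not the OCFS2 patch. The function names are invented, and it assumes extents arrive as (physical_start, length) pairs in bytes, all taken from the same file system.

```python
from typing import Dict, List, Tuple

# One extent as (physical_start, length) in bytes.
Extent = Tuple[int, int]


def union_length(extents: List[Extent]) -> int:
    """Total number of distinct bytes covered by the extents."""
    total = 0
    cur_start = cur_end = None
    for start, length in sorted(extents):
        end = start + length
        if cur_end is None or start > cur_end:
            # A new, disjoint run begins: account for the finished one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def footprint(files: Dict[str, List[Extent]]) -> Tuple[int, int]:
    """Return (apparent_bytes, real_bytes) for a set of files."""
    all_extents = [e for extents in files.values() for e in extents]
    apparent = sum(length for _, length in all_extents)
    return apparent, union_length(all_extents)
```

For two files whose extents are (0, 8192) and (4096, 8192), the apparent size is 16384 bytes and the real footprint is 12288 bytes, so 4096 bytes are shared.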

On Wed, Oct 31, 2012 at 09:02:15PM +0800, Jeff Liu wrote:
> I propose this because OCFS2 report shared space in this way combine with du(1).
>
> An old patch set to teach du(1) aware of reflinked file:
> https://oss.oracle.com/pipermail/ocfs2-devel/2010-September/007293.html

Patch looks ok; the shared size is requested by an option.

> Do you means that the costs is very expensive for userland extent status checkup per file?

The most expensive part is IMO not in userspace, which only does in-memory
lookups.

> > And without any possibility to turn this off, I'm afraid this will render FIEMAP unusable in practice.
> For OCFS2, the FIEMAP_EXTENT_SHARED flag will be set upon fiemap ioctl(2) if an extent
> is OCFS2_EXT_REFCOUNTED(i.e. reflinked or cloned), which means that FIEMAP_EXTENT_SHARED
> is not a persistent flag, but I have no idea how Btrfs would be in this point. :(

After some research, I think this could work for btrfs without unwanted
performance penalties.

There's the fiemap::fm_flags field that can be extended to request the
shared extent info from fiemap, so the information is not computed
unconditionally (that was my concern before). The rest is only
implementation details of how to speed up the file extent -> refcount info
lookups.

david
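
As an illustration of the opt-in idea, the sketch below builds a fiemap request header whose fm_flags carries an extra request bit only when the caller asks for shared-extent info. FIEMAP_FLAG_SYNC is a real flag from linux/fiemap.h. HYPOTHETICAL_FLAG_SHARED_INFO is not: it is a made-up value that stands in for the proposed extension and is not meant to be sent to a kernel. As far as I know, fiemap rejects request flags it does not understand with EBADR.

```python
import struct

FIEMAP_FLAG_SYNC = 0x0001  # real flag: sync the file before mapping it

# HYPOTHETICAL: not defined by any kernel. It only stands in for the opt-in
# "also compute shared-extent info" request bit discussed in this thread.
HYPOTHETICAL_FLAG_SHARED_INFO = 0x8000

# struct fiemap header: fm_start, fm_length, fm_flags, fm_mapped_extents,
# fm_extent_count, fm_reserved (32 bytes).
HEADER = struct.Struct("=QQIIII")


def build_request(want_shared: bool, extent_count: int = 128) -> bytes:
    """Pack a whole-file fiemap request header; shared info is strictly opt-in."""
    flags = FIEMAP_FLAG_SYNC
    if want_shared:
        flags |= HYPOTHETICAL_FLAG_SHARED_INFO
    return HEADER.pack(0, 0xFFFFFFFFFFFFFFFF, flags, 0, extent_count, 0)


def requested_flags(header: bytes) -> int:
    """Read fm_flags back out of a packed request header."""
    return HEADER.unpack(header)[2]
```

The point of the design is that callers who never set the extra bit pay nothing for the refcount lookups, which addresses the concern about making every FIEMAP call more expensive.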

On 11/06/2012 06:45 AM, David Sterba wrote:
> After some research, I think this could work for btrfs without
> unwanted performance penalties.
>
> There's the fiemap::fm_flags field that can be extended to request the
> shared extent info from fiemap, so the information is not computed
> unconditionally (that was my concern before). The rest is only
> implementation details how to speed up the file extent -> refcount info
> lookups.

Thanks for your confirmation.

-Jeff