Hi everyone,

I have used btrfs as a work partition with compression=zlib. The compression ratio is not satisfactory to me.

I traced my workload in btrfs. The zlib module (zlib.c) seems to work well: the data written by each write operation in the writepage function is compressed to about 20% of its original size.

I suspect the workload may affect btrfs behavior. My workload includes a very large number of overwrite operations.

I briefly reviewed the space reclaim code in btrfs and found that btrfs kicks off a defrag when the overwritten range is smaller than 16KB, and that this is the only method of reclaiming freed extents with compression. Am I right?

So my question is whether btrfs can successfully reclaim the overwritten space when the cleaner thread cannot be started, for example when each overwrite operation is larger than 16KB.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 25, 2013 at 10:03:20PM -0600, lonat_front@163.com wrote:
> Hi everyone,
>
> I have used btrfs as a work partition with compression=zlib. The compression ratio is not satisfactory to me.
>

So you probably want compress-force=zlib. With just compress we will bail out of the compression if the compressed pages are larger than the original size, which means that if you wrote a particular file and then compressed it with gzip you'd possibly see different results. If you use compress-force=zlib then you'll see behavior more like gzip.

> I traced my workload in btrfs. The zlib module (zlib.c) seems to work well: the data written by each write operation in the writepage function is compressed to about 20% of its original size.
>
> I suspect the workload may affect btrfs behavior. My workload includes a very large number of overwrite operations.
>
> I briefly reviewed the space reclaim code in btrfs and found that btrfs kicks off a defrag when the overwritten range is smaller than 16KB, and that this is the only method of reclaiming freed extents with compression. Am I right?

It's 64k, and what do you mean by reclaiming freed extents? The freed extents will be reclaimed once they are completely overwritten.

>
> So my question is whether btrfs can successfully reclaim the overwritten space when the cleaner thread cannot be started, for example when each overwrite operation is larger than 16KB.

Not sure what you mean by reclaim. They won't be defragged if the overwrite is above 64k, but if any write is less than 64k then it will defrag the whole file.

Thanks,

Josef
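As an illustration of the compress versus compress-force distinction Josef describes, here is a minimal sketch in Python. It is a toy model, not btrfs code: the function name `store_chunk` and the chunk sizes are hypothetical, and the only rule it encodes is the one stated above (without force, fall back to uncompressed data when compression makes the chunk larger).

```python
import os
import zlib
from typing import Tuple


def store_chunk(data: bytes, force: bool) -> Tuple[str, bytes]:
    """Toy model (hypothetical helper): decide how one chunk gets stored."""
    compressed = zlib.compress(data)
    # Plain "compress" behavior as described in the thread: bail out
    # when the compressed result is larger than the original.
    if not force and len(compressed) > len(data):
        return ("raw", data)
    # "compress-force" behavior: always keep the compressed result.
    return ("zlib", compressed)


text_like = b"btrfs " * 4096            # highly compressible input
random_like = os.urandom(24 * 1024)     # effectively incompressible input

for name, chunk in (("text", text_like), ("random", random_like)):
    for force in (False, True):
        kind, stored = store_chunk(chunk, force)
        mode = "compress-force" if force else "compress"
        print(f"{name:6} {mode:14} stored as {kind:4} ({len(stored)} bytes)")
```

In this model the two modes only differ on data that does not compress, which is consistent with the original poster seeing good per-write ratios and still being unhappy with the overall space usage.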
On Tue, Mar 26, 2013 at 10:27:34AM -0600, yiletian wrote:
> Yes, I use compress-force=zlib for my partition.
>
> Consider this scenario.
>
> We first write a file of 256KB. Assume all of the data is compressed to 128KB.
> btrfs creates an extent item in the extent tree to record the 128KB disk range (named E),
> and also creates a single file extent that records the disk range of E.
>
> Then we overwrite from 16KB to the end of the file, a write of 240KB.
> Btrfs will create a new file extent for the overwritten range.
> That is, the file now has two file extents: the first records the first 16KB and the second records the remaining 240KB.
>
> Then we are in a dilemma:
> 1. The first file extent only occupies a disk range of 16KB, but the entire E is reserved for it. This is because the __btrfs_drop_extents function does not decrease the number of back refs of E.
> 2. Because the overwritten range is large enough, compress_file_range does not call btrfs_add_inode_defrag to kick off a defrag for the file automatically.
>
> With this dilemma, how can btrfs reclaim the 112KB disk range (at least) recorded in E?
>

Oh yeah welcome to btrfs, you must be new here ;). So yeah, this is the way it works: until we overwrite the entire extent we don't reclaim any of the space. This includes the "prealloc an 8 gig vm image and then random write inside of it" workload, where you could end up using up to 16gb in the worst case scenario.

The thing we could do to fix this would be, instead of splitting the file extents and then inc'ing the ref of the original extent, to split the extent ref as well, so we can reclaim this space. It's on my list of things to do down the road, but it keeps getting supplanted by other priorities.

Thanks,

Josef
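The scenario in this message can be worked through with a small accounting model. This is a sketch under stated assumptions, not the kernel's data structures: `DiskExtent`, `FileExtent`, `overwrite` and `disk_usage` are hypothetical names, and the 120KB and 8KB sizes for the new extents simply assume the same 2:1 ratio as the example. The one rule taken from the thread is that a disk extent is released only when no file extent references any part of it.

```python
from dataclasses import dataclass

KB = 1024


@dataclass
class DiskExtent:
    disk_bytes: int   # space held on disk (compressed size)
    refs: int = 0     # number of file extents still pointing here


@dataclass
class FileExtent:
    start: int        # logical offset in the file
    length: int       # logical length
    disk: DiskExtent


def overwrite(file_extents, start, length, new_disk_bytes):
    """Replace [start, start+length) with a new extent, trimming old ones."""
    end = start + length
    result = []
    for fe in file_extents:
        fe_end = fe.start + fe.length
        fe.disk.refs -= 1  # drop the old reference, re-add one per kept piece
        if fe.start < start:  # a piece before the overwritten range survives
            keep_end = min(fe_end, start)
            result.append(FileExtent(fe.start, keep_end - fe.start, fe.disk))
            fe.disk.refs += 1
        if fe_end > end:      # a piece after the overwritten range survives
            keep_start = max(fe.start, end)
            result.append(FileExtent(keep_start, fe_end - keep_start, fe.disk))
            fe.disk.refs += 1
    result.append(FileExtent(start, length, DiskExtent(new_disk_bytes, refs=1)))
    result.sort(key=lambda fe: fe.start)
    return result


def disk_usage(file_extents):
    """A disk extent stays allocated in full while anything references it."""
    live = {id(fe.disk): fe.disk for fe in file_extents if fe.disk.refs > 0}
    return sum(d.disk_bytes for d in live.values())


# 256KB file stored as one 128KB compressed extent E.
E = DiskExtent(disk_bytes=128 * KB, refs=1)
extents = [FileExtent(0, 256 * KB, E)]

# Overwrite 16KB..256KB; the new data is assumed to compress to 120KB.
extents = overwrite(extents, 16 * KB, 240 * KB, new_disk_bytes=120 * KB)
after_partial = disk_usage(extents)
print(after_partial // KB, "KB held, refs on E:", E.refs)

# Only once the last 16KB is overwritten too does E lose its final reference.
extents = overwrite(extents, 0, 16 * KB, new_disk_bytes=8 * KB)
after_full = disk_usage(extents)
print(after_full // KB, "KB held, refs on E:", E.refs)
```

In this model the partial overwrite leaves all 128KB of E allocated next to the new extent, and the space is only returned after the remaining 16KB is overwritten. Splitting the extent ref, as Josef suggests, would amount to letting `disk_usage` count only the part of E that is still referenced.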
I think the biggest problem is how to reclaim the space when the extent is a compressed one. In that case we may need to read and decompress the data in the extent, and then recompress the valid range to generate a new extent. Is this process a performance killer?

At 2013-03-27 02:03:57, "Josef Bacik" <jbacik@fusionio.com> wrote:
> Oh yeah welcome to btrfs, you must be new here ;). So yeah, this is the way it
> works: until we overwrite the entire extent we don't reclaim any of the space.
> This includes the "prealloc an 8 gig vm image and then random write inside of
> it" workload, where you could end up using up to 16gb in the worst case scenario.
>
> The thing we could do to fix this would be, instead of splitting the file extents
> and then inc'ing the ref of the original extent, to split the extent ref
> as well, so we can reclaim this space. It's on my list of things to do down the
> road, but it keeps getting supplanted by other priorities.
>
> Thanks,
>
> Josef
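The read, decompress, recompress cycle proposed in this last message can be sketched as follows. This only illustrates the steps using Python's zlib on an in-memory buffer; `rewrite_live_range` is a hypothetical helper, the sample data is made up, and nothing here reflects how btrfs would implement it or what it would cost inside the kernel.

```python
import zlib

KB = 1024


def rewrite_live_range(compressed_extent: bytes, live_start: int, live_len: int) -> bytes:
    """Hypothetical helper: rebuild a smaller extent from the still-valid bytes."""
    # The whole extent has to be decompressed, even if only a small
    # part of it is still referenced by the file.
    plain = zlib.decompress(compressed_extent)
    live = plain[live_start:live_start + live_len]
    # Recompress just the valid range to form the replacement extent.
    return zlib.compress(live)


# Made-up 256KB of repetitive data standing in for the original file contents.
original = (b"some fairly repetitive file data\n" * 8192)[:256 * KB]
old_extent = zlib.compress(original)

# Only the first 16KB is still valid after the overwrite in the scenario.
new_extent = rewrite_live_range(old_extent, 0, 16 * KB)

assert zlib.decompress(new_extent) == original[:16 * KB]
print(len(old_extent), "->", len(new_extent), "bytes")
```

The sketch makes the cost question concrete: the work is proportional to the size of the whole old extent, not to the size of the range being kept, which is the overhead the question above is asking about.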