thr3ads.net - Btrfs devel - Questions regarding logging upon fsync in btrfs [Sep 2013]

Aastha Mehta

2013-Sep-28 23:35 UTC

Questions regarding logging upon fsync in btrfs

Hi,

I have a few questions regarding logging triggered by calling fsync in BTRFS:

1. If I understand correctly, fsync will log the entire inode in
the log tree. Does this mean that the data extents are also logged
into the log tree? Are they copied into the log tree, or just
referenced? Are they copied into the subvolume's extent tree again
upon replay?

2. During replay, when the extents are added into the extent
allocation tree, do they acquire the physical extent number during
replay? Does the physical extent allocated to the data in the log
tree differ from that in the subvolume?

3. I see there is a mount option of notreelog available. After
disabling tree logging, does fsync still lead to flushing of buffers
to the disk directly?

4. Is it possible to selectively identify certain files in the log
tree and flush them to disk directly, without waiting for the replay
to do it?

Thanks

-- 
Aastha Mehta
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Aastha Mehta

2013-Sep-28 23:46 UTC

Re: Questions regarding logging upon fsync in btrfs

I am using linux kernel 3.1.10-1.16, just to let you know.

Thanks


-- 
Aastha Mehta

Hugo Mills

2013-Sep-29 00:21 UTC

Re: Questions regarding logging upon fsync in btrfs

On Sun, Sep 29, 2013 at 01:46:23AM +0200, Aastha Mehta wrote:
> I am using linux kernel 3.1.10-1.16, just to let you know.

   Not that it invalidates your questions, but that's a really
old kernel. You should update to something recent (3.11, or 3.12-rc2)
as soon as possible. There are major problems in 3.1 (and most of the
subsequent kernels) that have been fixed in 3.11. Of course, there are
still major problems in 3.11 that haven't been fixed yet, but we don't
know about very many of those. :) (And when we do, we'll be
recommending that you upgrade to whatever has them fixed...)

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- <Diablo-D3> My code is never released,  it escapes from the ---   
          git repo and kills a few beta testers on the way out.

Josef Bacik

2013-Sep-29 00:42 UTC

Re: Questions regarding logging upon fsync in btrfs

On Sun, Sep 29, 2013 at 01:35:15AM +0200, Aastha Mehta wrote:
> Hi,
> 
> I have a few questions regarding logging triggered by calling fsync in BTRFS:
> 
> 1. If I understand correctly, fsync will log the entire inode in
> the log tree. Does this mean that the data extents are also logged
> into the log tree? Are they copied into the log tree, or just
> referenced? Are they copied into the subvolume's extent tree again
> upon replay?
> 
The data extents are copied as well, meaning the metadata that points to the
data, not the actual data itself.  For 3.1 it's all of the extents in the inode;
from 3.8 on it's only the extents that have changed in this transaction.

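
As a rough illustration of the answer above, here is a toy Python model, not kernel code. The names `ExtentRef` and `extents_to_log` are made up for this sketch, and using a per-extent generation number to mean "changed in this transaction" is an assumption about how one might model it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtentRef:
    """Toy stand-in for file-extent metadata: it points at data, it holds none."""
    file_offset: int   # logical offset within the file
    disk_offset: int   # where the data already lives on disk
    length: int
    generation: int    # transaction in which the extent was created or changed


def extents_to_log(extents, current_generation, changed_only):
    """Pick the extent metadata a toy fsync would copy into the log.

    changed_only=False models the 3.1 behaviour described above (all extents);
    changed_only=True models 3.8 onwards (only this transaction's extents).
    """
    if not changed_only:
        return list(extents)
    return [e for e in extents if e.generation == current_generation]
```

In both modes only the small `ExtentRef` records are copied; the data they point at is never duplicated.
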
> 2. During replay, when the extents are added into the extent
> allocation tree, do they acquire the physical extent number during
> replay? Does the physical extent allocated to the data in the log
> tree differ from that in the subvolume?
> 
No, the physical location was picked when we wrote the data out during fsync.  If
we crash and re-mount, the replay will just insert the ref into the extent tree
for the disk offset as it replays the extents.

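
The replay step can be sketched the same way. This is a simplified model under stated assumptions: `replay_log` is a hypothetical helper, and the extent tree is reduced to a dictionary of reference counts keyed by disk offset.

```python
def replay_log(logged_extents, extent_tree):
    """Toy replay: add one reference per logged extent at its recorded disk offset.

    logged_extents: iterable of (disk_offset, length) pairs saved at fsync time.
    extent_tree: dict mapping disk_offset -> reference count (a stand-in only).
    """
    for disk_offset, _length in logged_extents:
        # Nothing is allocated here: the offset was fixed when the data was written.
        extent_tree[disk_offset] = extent_tree.get(disk_offset, 0) + 1
    return extent_tree
```

The point of the sketch is that replay reuses the offsets recorded in the log and never picks new ones.
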
> 3. I see there is a mount option of notreelog available. After
> disabling tree logging, does fsync still lead to flushing of buffers
> to the disk directly?
> 
notreelog just means that we write the data, wait on the ordered data extents,
and then commit the transaction.  So you get the data for the inode you are
fsyncing and all of the metadata for the entire file system that has changed in
that transaction.

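
To make the difference in scope concrete, here is a small sketch. `fsync_persisted` is an invented name, and tracking dirty metadata as a set of inode numbers is a deliberate oversimplification.

```python
def fsync_persisted(dirty_inodes, inode, treelog):
    """Return which inodes' metadata a toy fsync makes durable.

    dirty_inodes: inode numbers whose metadata changed in this transaction.
    treelog=True  -> only the fsynced inode is written, through the log tree.
    treelog=False -> the whole transaction commits, so every dirty inode goes out.
    """
    if treelog:
        return {inode}
    return set(dirty_inodes) | {inode}
```

With `treelog=False` the caller pays for everyone else's pending metadata as well, which is the cost the tree log exists to avoid.
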
> 4. Is it possible to selectively identify certain files in the log
> tree and flush them to disk directly, without waiting for the replay
> to do it?
> 
I don't understand this question. Replay only happens on mount after a
crash/power loss, and everything that is in the log is replayed; there is no way
to select which inode is replayed.  Thanks,

Josef

Aastha Mehta

2013-Sep-29 09:22 UTC

Re: Questions regarding logging upon fsync in btrfs

Thank you very much for the reply. That clarifies a lot of things.

I was trying a small test case that opens a file, writes a block of
data, calls fsync and then closes the file. If I understand correctly,
fsync should return only after all in-memory buffers have been
committed to disk. I have added a few print statements in the
__extent_writepage function, and I notice that the function gets
called a little while after fsync returns. It seems that I am not
guaranteed to see the data going to disk by the time fsync returns.
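
A test case of this shape might look like the following Python sketch. The actual test program was not posted to the list, so the 4096-byte block size, the file name, and the helper names `write_and_fsync` and `run_once` are all assumptions.

```python
import os
import tempfile

BLOCK_SIZE = 4096  # assumption: the thread only says "a block of data"


def write_and_fsync(path, payload):
    """Open a file, write the payload, fsync it, then close it."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        while view:
            # os.write may write fewer bytes than asked, so loop until done.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to force this file's data to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)


def run_once(directory=None):
    """Run the test in a scratch directory and return the file's path."""
    directory = directory or tempfile.mkdtemp()
    path = os.path.join(directory, "fsync-test.dat")  # hypothetical file name
    write_and_fsync(path, b"A" * BLOCK_SIZE)
    return path
```

To exercise Btrfs specifically, pass `run_once` a directory on a Btrfs mount; by default it uses whatever file system backs the temporary directory.
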

Am I doing something wrong, or am I looking in the wrong place for
the disk write? This happens both with tree logging enabled and
with notreelog.

Thanks


-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Josef Bacik

2013-Sep-29 13:12 UTC

Re: Questions regarding logging upon fsync in btrfs

On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha Mehta wrote:
> Thank you very much for the reply. That clarifies a lot of things.
> 
> I was trying a small test case that opens a file, writes a block of
> data, calls fsync and then closes the file. If I understand correctly,
> fsync would return only after all in-memory buffers have been
> committed to disk. I have added few print statements in the
> __extent_writepage function, and I notice that the function gets
> called a bit later after fsync returns. It seems that I am not
> guaranteed to see the data going to disk by the time fsync returns.
> 
> Am I doing something wrong, or am I looking at the wrong place for
> disk write? This happens both with tree logging enabled as well as
> with notreelog.
> 
So 3.1 was a long time ago, and to be sure it had issues, but I don't think it
was _that_ broken.  You are probably better off instrumenting a recent kernel,
3.11, or just building btrfs-next from git.  But if I were to make a guess, I'd
say that __extent_writepage was how both data and metadata were written out at
the time (I don't think I changed it until 3.2 or something later), so what you
are likely seeing is the normal transaction commit after the fsync.  In the case
of notreelog we are likely starting another transaction and you are seeing that
commit (at the time the transaction kthread would start a transaction even if
none had been started yet).  Thanks,

Josef

Aastha Mehta

2013-Sep-30 19:32 UTC

Re: Questions regarding logging upon fsync in btrfs


Is there any special handling for very small file writes, smaller than 4K? As
I understand it, there is an optimization to inline the first extent in a file
if it is smaller than 4K; does it affect the writeback on fsync as well? I did
set the max_inline mount option to 0, but even then there seems to be
some difference in fsync behaviour between writing a first extent of less
than 4K and writing 4K or more.

Thanks,
Aastha.


-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Josef Bacik

2013-Sep-30 20:11 UTC

Re: Questions regarding logging upon fsync in btrfs

On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
> Is there any special handling for very small file write, less than 4K? As
> I understand there is an optimization to inline the first extent in a file if
> it is smaller than 4K, does it affect the writeback on fsync as well? I did
> set the max_inline mount option to 0, but even then it seems there is
> some difference in fsync behaviour for writing first extent of less than 4K
> size and writing 4K or more.
> 
Yeah, if the file is an inline extent then it will be copied into the log
directly and the log will be written out, with no trip through the data write
path at all.  max_inline == 0 should make it so we don't inline, so if it isn't
honoring that then that may be a bug.  Thanks,

Josef
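
The inlining rule discussed in this exchange can be approximated by a simple predicate. This is a deliberately simplified sketch and not the kernel's actual check, which has more conditions; `may_inline` and the 4096-byte sector size are assumptions taken from the "4K" figure in the thread.

```python
SECTOR_SIZE = 4096  # assumed; matches the "4K" figure mentioned in the thread


def may_inline(file_size, max_inline, sector_size=SECTOR_SIZE):
    """Simplified rule: a file may be stored inline only if it is non-empty,
    smaller than one sector, and no larger than the max_inline limit."""
    return 0 < file_size < sector_size and file_size <= max_inline
```

Under this model `may_inline(1000, 0)` is false, which is the behaviour Josef expects from max_inline == 0: every write, however small, should take the regular data write path.
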

Aastha Mehta

2013-Sep-30 20:30 UTC

Re: Questions regarding logging upon fsync in btrfs


I tried it on the 3.12-rc2 release, and it seems there is a bug then.
Please find the attached logs to confirm.
It is probably present on the older release as well.

Thanks,
Aastha.

-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Josef Bacik

2013-Sep-30 20:47 UTC

Re: Questions regarding logging upon fsync in btrfs

On Mon, Sep 30, 2013 at 10:30:59PM +0200, Aastha Mehta wrote:
> I tried it on 3.12-rc2 release, and it seems there is a bug then.
> Please find attached logs to confirm.
> Also, probably on the older release.
> 
Oooh ok, I understand, you have your printk's in the wrong place ;).
do_writepages doesn't necessarily mean you are writing something.  If you want
to see if stuff got written to the disk, I'd put a printk at run_delalloc_range
and have it spit out the range it is writing out, since that's what we think is
actually dirty.  Thanks,

Josef

Aastha Mehta

2013-Sep-30 21:07 UTC

Re: Questions regarding logging upon fsync in btrfs


No, but I also placed dump_stack() at the beginning of
__extent_writepage. run_delalloc_range is called only from
__extent_writepage, so if it were called, the dump_stack() at the
top of __extent_writepage would have printed as well, no?

Thanks

-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Josef Bacik

2013-Sep-30 21:17 UTC

Re: Questions regarding logging upon fsync in btrfs

On Mon, Sep 30, 2013 at 11:07:20PM +0200, Aastha Mehta wrote:
> No, but I also placed dump_stack() in the beginning of
> __extent_writepage. run_delalloc_range is being called only from
> __extent_writepage, if it were to be called, the dump_stack() at the
> top of __extent_writepage would have printed as well, no?
> 
Yeah, so I don't know what's going on and I'm in the middle of something. I'll
look at it tomorrow and see if I can figure out what is going on.  I'm sure
it's working: we have an xfstest to test this sort of thing and it's passing, so
we're definitely getting the data to disk properly. I'm probably just missing
some piece around here somewhere.  Thanks,

Josef

Josef Bacik

2013-Oct-01 17:34 UTC

Re: Questions regarding logging upon fsync in btrfs

On Mon, Sep 30, 2013 at 11:07:20PM +0200, Aastha Mehta wrote:
> On 30 September 2013 22:47, Josef Bacik <jbacik@fusionio.com> wrote:
> > On Mon, Sep 30, 2013 at 10:30:59PM +0200, Aastha Mehta wrote:
> >> On 30 September 2013 22:11, Josef Bacik
<jbacik@fusionio.com> wrote:
> >> > On Mon, Sep 30, 2013 at 09:32:54PM +0200, Aastha Mehta wrote:
> >> >> On 29 September 2013 15:12, Josef Bacik
<jbacik@fusionio.com> wrote:
> >> >> > On Sun, Sep 29, 2013 at 11:22:36AM +0200, Aastha
Mehta wrote:
> >> >> >> Thank you very much for the reply. That
clarifies a lot of things.
> >> >> >>
> >> >> >> I was trying a small test case that opens a
file, writes a block of
> >> >> >> data, calls fsync and then closes the file. If I
understand correctly,
> >> >> >> fsync would return only after all in-memory
buffers have been
> >> >> >> committed to disk. I have added few print
statements in the
> >> >> >> __extent_writepage function, and I notice that
the function gets
> >> >> >> called a bit later after fsync returns. It seems
that I am not
> >> >> >> guaranteed to see the data going to disk by the
time fsync returns.
> >> >> >>
> >> >> >> Am I doing something wrong, or am I looking at
the wrong place for
> >> >> >> disk write? This happens both with tree logging
enabled as well as
> >> >> >> with notreelog.
> >> >> >>
> >> >> >
> >> >> > So 3.1 was a long time ago and to be sure it had
issues I don''t think it was
> >> >> > _that_ broken.  You are probably better off
instrumenting a recent kernel, 3.11
> >> >> > or just build btrfs-next from git.  But if I were to
make a guess I''d say that
> >> >> > __extent_writepage was how both data and metadata
was written out at the time (I
> >> >> > don''t think I changed it until 3.2 or
something later) so what you are likely
> >> >> > seeing is the normal transaction commit after the
fsync.  In the case of
> >> >> > notreelog we are likely starting another transaction
and you are seeing that
> >> >> > commit (at the time the transaction kthread would
start a transaction even if
> >> >> > none had been started yet.)  Thanks,
> >> >> >
> >> >> > Josef
> >> >>
> >> >> Is there any special handling for very small file write,
less than 4K? As
> >> >> I understand there is an optimization to inline the first
extent in a file if
> >> >> it is smaller than 4K, does it affect the writeback on
fsync as well? I did
> >> >> set the max_inline mount option to 0, but even then it
seems there is
> >> >> some difference in fsync behaviour for writing first
extent of less than 4K
> >> >> size and writing 4K or more.
> >> >>
> >> >
> >> > Yeah if the file is an inline extent then it will be copied
into the log
> >> > directly and the log will be written out, no going through
the data write path
> >> > at all.  Max inline == 0 should make it so we don''t
inline, so if it isn''t
> >> > honoring that then that may be a bug.  Thanks,
> >> >
> >> > Josef
> >>
> >> I tried it on 3.12-rc2 release, and it seems there is a bug then.
> >> Please find attached logs to confirm.
> >> Also, probably on the older release.
> >>
> >
> > Oooh ok I understand, you have your printk''s in the wrong
place ;).
> > do_writepages doesn''t necessarily mean you are writing
something.  If you want
> > to see if stuff got written to the disk I''d put a printk at
run_delalloc_range
> > and have it spit out the range it is writing out since thats what we
think is
> > actually dirty.  Thanks,
> >
> > Josef
> 
> No, but I also placed dump_stack() in the beginning of
> __extent_writepage. run_delalloc_range is being called only from
> __extent_writepage, if it were to be called, the dump_stack() at the
> top of __extent_writepage would have printed as well, no?
>
Ok I've done the same thing and I'm not seeing what you are seeing.  Are you
using any mount options other than notreelog and max_inline=0?  Could you adjust
your printk to print out the root objectid for the inode as well?  It could be
possible that this is the writeout for the space cache or inode cache.  Thanks,

Josef

Aastha Mehta

2013-Oct-01 19:40 UTC

Re: Questions regarding logging upon fsync in btrfs

On 1 October 2013 19:34, Josef Bacik <jbacik@fusionio.com> wrote:
> Ok I've done the same thing and I'm not seeing what you are seeing.  Are you
> using any mount options other than notreelog and max_inline=0?  Could you adjust
> your printk to print out the root objectid for the inode as well?  It could be
> possible that this is the writeout for the space cache or inode cache.  Thanks,
>
> Josef

I actually printed the stack only when the root objectid is 5. I have
attached another log for writing the first 500 bytes in a file. I also
print the root objectid for the inode in run_delalloc and
__extent_writepage.

Thanks

-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Aastha Mehta

2013-Oct-01 19:42 UTC

Re: Questions regarding logging upon fsync in btrfs

On 1 October 2013 21:40, Aastha Mehta <aasthakm@gmail.com> wrote:
> I actually printed the stack only when the root objectid is 5. I have
> attached another log for writing the first 500 bytes in a file. I also
> print the root objectid for the inode in run_delalloc and
> __extent_writepage.
>
> Thanks
>

Just to clarify, in the latest logs, I allowed printing of debug
printk's and stack dump for all root objectid's.

Aastha Mehta

2013-Oct-01 20:13 UTC

Re: Questions regarding logging upon fsync in btrfs

On 1 October 2013 21:42, Aastha Mehta <aasthakm@gmail.com> wrote:
> Just to clarify, in the latest logs, I allowed printing of debug
> printk's and stack dump for all root objectid's.

Actually, it is the same behaviour when I write anything less than 4K
long, no matter what offset, except if I straddle the page boundary.
To summarise:
1. write 4K -> written in the fsync path
2. write less than 4K, within a single page -> bdi_writeback by flush worker
3. small write that straddles a page boundary, or write 4K+delta -> the
first page gets written in the fsync path, the remaining length that
straddles the page boundary is written in the bdi_writeback path

Please let me know, if I am trying out incorrect cases.

Sorry for too many mails.

Thanks
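
Editor's sketch: one way the three cases in the summary above could be issued with the unbuffered pwrite(2) and fsync(2) system calls, so that no data sits in a userspace buffer when fsync is called. The helper name write_and_sync, the filler byte, and the 4096-byte page size used in the note below are assumptions for illustration; they are not taken from the original test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len filler bytes at offset off in path,
 * then fsync the file. Only raw system calls are used, so there is no
 * stdio buffer between the caller and the kernel.
 * Returns 0 on success, -1 on any error.
 */
int write_and_sync(const char *path, off_t off, size_t len)
{
    char *buf = malloc(len ? len : 1);
    if (buf == NULL)
        return -1;
    memset(buf, 'a', len);

    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    int ok = 1;
    size_t done = 0;
    /* pwrite may write fewer bytes than requested, so loop until done. */
    while (ok && done < len) {
        ssize_t n = pwrite(fd, buf + done, len - done, off + (off_t)done);
        if (n <= 0)
            ok = 0;
        else
            done += (size_t)n;
    }

    /* Ask the kernel to push the file's dirty data to stable storage. */
    if (ok && fsync(fd) != 0)
        ok = 0;
    if (close(fd) != 0)
        ok = 0;
    free(buf);
    return ok ? 0 : -1;
}
```

Assuming 4096-byte pages, case 1 would be write_and_sync(path, 0, 4096), case 2 write_and_sync(path, 0, 500), and case 3 write_and_sync(path, 4000, 200), which crosses the first page boundary.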

Josef Bacik

2013-Oct-02 11:52 UTC

Re: Questions regarding logging upon fsync in btrfs

On Tue, Oct 01, 2013 at 10:13:25PM +0200, Aastha Mehta wrote:
> Actually, it is the same behaviour when I write anything less than 4K
> long, no matter what offset, except if I straddle the page boundary.
> To summarise:
> 1. write 4K -> write in the fsync path
> 2. write less than 4K, within a single page -> bdi_writeback by flush worker
> 3. small write that straddles a page boundary or write 4K+delta -> the
> first page gets written in the fsync path, the remaining length that
> straddles the page boundary is written in the bdi_writeback path
> 
> Please let me know, if I am trying out incorrect cases.
> 
> Sorry for too many mails.
>
This has been bugging me so much I was dreaming about it and now here I am
writing an email at 4:45 in the morning ;).  So I couldn't reproduce earlier
with any of these scenarios and then I realized something, I'm doing something
like this

xfs_io -f -c "pwrite 0 54" -c "fsync" /mnt/btrfs-test/foo

and it is working perfectly.  But I bet what you are doing is something like
this

file = fopen("/mnt/btrfs-test/foo", "w");
fwrite(buf, 54, 1, file);
fsync(fileno(file));
fclose(file);

right?  Please say yes :).  If this is the case then it is likely that these
small writes are getting buffered in the userspace buffering that comes with
fwrite, and so when you fsync it is only flushing the data that is actually in
the kernel, not what is buffered in userspace.  Then when you fclose it flushes
what is in the userspace buffers out to the kernel and then later on the
background writer comes in and writes out the dirty data.  To fix this you want
to do fflush() and then fsync().  Hopefully that is what you are doing and I can
go back to sleep, thanks,

Josef 
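
Editor's sketch of the fix described in this message: call fflush() to move the data from the stdio buffer into the kernel, then fsync() to ask the kernel to write it to disk. The helper name write_durably and its error handling are assumptions added for illustration; only the fflush-then-fsync ordering comes from the message itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes from buf to path through stdio
 * and make sure they have been handed to the kernel and synced before
 * returning. Returns 0 on success, -1 on any error.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    FILE *file = fopen(path, "w");
    if (file == NULL)
        return -1;

    int ok = fwrite(buf, 1, len, file) == len  /* data -> stdio buffer   */
          && fflush(file) == 0                 /* stdio buffer -> kernel */
          && fsync(fileno(file)) == 0;         /* kernel -> disk         */

    if (fclose(file) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Without the fflush() step, fsync() can only act on what the kernel already holds, which for a small fwrite() may be nothing at all.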

Aastha Mehta

2013-Oct-02 20:12 UTC

Re: Questions regarding logging upon fsync in btrfs

On 2 October 2013 13:52, Josef Bacik <jbacik@fusionio.com> wrote:
> This has been bugging me so much I was dreaming about it and now here I am
> writing an email at 4:45 in the morning ;).  So I couldn't reproduce earlier
> with any of these scenarios and then I realized something, I'm doing
> something like this
>
> xfs_io -f -c "pwrite 0 54" -c "fsync" /mnt/btrfs-test/foo
>
> and it is working perfectly.  But I bet what you are doing is something like
> this
>
> file = fopen("/mnt/btrfs-test/foo", "w");
> fwrite(buf, 54, 1, file);
> fsync(fileno(file));
> fclose(file);
>
> right?  Please say yes :).  If this is the case then it is likely that these
> small writes are getting buffered in the userspace buffering that comes with
> fwrite, and so when you fsync it is only flushing the data that is actually
> in the kernel, not what is buffered in userspace.  Then when you fclose it
> flushes what is in the userspace buffers out to the kernel and then later on
> the background writer comes in and writes out the dirty data.  To fix this
> you want to do fflush() and then fsync().  Hopefully that is what you are
> doing and I can go back to sleep, thanks,
>
> Josef
Indeed!! :)

I did mention I am using the f* version of the POSIX API. Sorry for the
confusion. Calling fflush before fsync seems to write everything
perfectly. It works even without the notreelog option, as it should have.
I was under the misconception that fflush and fsync do the same thing.
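
For readers who hit the same confusion, here is a minimal sketch of the
ordering discussed in this thread: fwrite() may only fill the stdio buffer in
userspace, fflush() hands that buffer to the kernel, and fsync() asks the
kernel to write the file's dirty data to disk. The helper name write_durably
and its return convention are made up for illustration and are not from the
original test program.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write len bytes to path
 * through stdio and push them all the way to disk.
 * Returns 0 on success, -1 on any failure.
 */
int write_durably(const char *path, const void *buf, size_t len)
{
    FILE *file = fopen(path, "w");
    if (file == NULL)
        return -1;

    /* fwrite() may only copy the data into the userspace stdio buffer. */
    if (fwrite(buf, 1, len, file) != len)
        goto fail;

    /* fflush() pushes the userspace buffer into the kernel (page cache). */
    if (fflush(file) != 0)
        goto fail;

    /* fsync() asks the kernel to write the file's dirty data to disk. */
    if (fsync(fileno(file)) != 0)
        goto fail;

    return fclose(file) == 0 ? 0 : -1;

fail:
    fclose(file);
    return -1;
}
```

A call such as write_durably("/mnt/btrfs-test/foo", buf, 54) would correspond
to the 54-byte write in the example above. Without the fflush() step, a small
write can still be sitting in the stdio buffer when fsync() runs, which
matches the behaviour reported earlier in the thread.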

Thanks a lot for your quick help.

Regards,

-- 
Aastha Mehta
MPI-SWS, Germany
E-mail: aasthakm@mpi-sws.org

Josef Bacik

2013-Oct-02 23:28 UTC


Re: Questions regarding logging upon fsync in btrfs

On Wed, Oct 02, 2013 at 10:12:20PM +0200, Aastha Mehta wrote:
> Indeed!! :)
> 
> I did mention I am using f* version of the POSIX API. Sorry for the
> confusion. Calling fflush before fsync seems to write everything
> perfectly. It works even without notreelog option, as it should have.
> I was under the misconception that fflush and fsync do the same thing.
> 
> Thanks a lot for your quick help.
> 
Yeah sorry all I read was "there is a bug in fsync" and just assumed we broke
something ;).  Glad I could help,

Josef
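
As a rough illustration of why the xfs_io test quoted in this thread behaved
differently from the stdio version: as far as I know, xfs_io's pwrite command
issues the pwrite(2) system call directly, so there is no userspace buffer
between the write and the fsync. The sketch below does the same thing in C.
The function name pwrite_and_fsync is hypothetical, and retrying on EINTR is
left out for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write len bytes at the given
 * offset with pwrite(2), then fsync(2) the file descriptor.
 * Returns 0 on success, -1 on any failure.
 */
int pwrite_and_fsync(const char *path, const void *buf, size_t len, off_t offset)
{
    /* O_CREAT mirrors xfs_io -f, which creates the file if it is missing. */
    int fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    const char *p = buf;
    size_t done = 0;
    while (done < len) {
        /* pwrite() may write fewer bytes than requested, so loop. */
        ssize_t n = pwrite(fd, p + done, len - done, offset + (off_t)done);
        if (n <= 0) {
            close(fd);
            return -1;
        }
        done += (size_t)n;
    }

    /* The data is already in the kernel, so fsync() covers all of it. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Calling pwrite_and_fsync("/mnt/btrfs-test/foo", buf, 54, 0) is roughly what
the xfs_io command line with "pwrite 0 54" followed by "fsync" does, apart
from the buffer contents, which here are whatever the caller passes in.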
