According to the man page of open(), when a file is opened in O_DIRECT mode, all read/write calls are synchronous. My question is whether, on Lustre, this synchronization reaches only the servers or goes all the way to the disks before a write call returns.

Wei-keng Liao
I respect your question, but do you, in general, have any positive experience with an FS delivering a substantial performance increase with O_DIRECT? I have worked with the performance of about ten FS implementations or flavors, mostly vendor proprietary, and have found only SGI XFS, with large-block access, to provide as much as a 20% speed-up.

Marty Barnaby

Wei-keng Liao wrote:
> According to the man page of open(), when a file is opened in O_DIRECT
> mode, all read/write calls are synchronous. My question is whether this
> synchronization on Lustre only reaches to the servers or all the way
> to the disks, before a write call is returned?

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@clusterfs.com
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
Interestingly, Lustre internally uses I/O extremely similar to direct I/O, because we did find two performance improvements using it (in Linux 2.4):

- using page caches slowed things down (even removing clean pages, of all things!)
- concurrently running threads did much better allocation with direct I/O than with normal I/O

We have not verified whether this has improved in Linux 2.6, and it may very well have, because 2.6 has a much more advanced version of ext3.

The Lustre server loads are probably not applicable to the loads seen on clients, but I thought I'd relate our experience here.

- Peter -

> From: Marty Barnaby
> Sent: Wednesday, September 20, 2006 3:44 PM
> Subject: Re: [Lustre-discuss] O_DIRECT question
>
> I respect your question, but, in general do you have any
> positive experience with an FS delivering substantial
> performance increase with O_DIRECT?
I have an MPI code that performs contiguous, large-chunk, 1 MB aligned, non-overlapping, non-interleaved writes to a shared file, and the data is never read back. I could not get a better data rate when using O_DIRECT compared to not using it, although the access pattern should favor O_DIRECT, so I am trying to understand this result. (I disabled locking when I used O_DIRECT, so locking should not be an issue.)

I wonder whether O_DIRECT makes a read/write call synchronous all the way to the disks on the server side before the call returns, or whether the call returns immediately after the servers receive the data. If it is the former, it is reasonable that O_DIRECT performs worse. Are there other factors that lead to poor O_DIRECT performance, given such an access pattern? Comments are appreciated.

I can provide the I/O kernel and write trace file. Please let me know.

Wei-keng

On Wed, 20 Sep 2006, Peter J. Braam wrote:
> Interstingly Lustre internally uses IO extremely similar to direct IO
> because we did find two performance improvements using that (in Linux 2.4)
On Thu, 2006-09-21 at 12:02 -0500, Wei-keng Liao wrote:
> I have an MPI code that performs contiguous, large chunk, 1 MB aligned,
> non-overlapping, non-interleaved writes to a shared file and data will
> never read back. I could not get a better data rate when using O_DIRECT,
> comparing to not using it, although the access pattern should be better if
> using O_DIRECT. (I disabled locking when I used O_DIRECT. So, locking
> should not be an issue.)

Hi Wei-keng. 1 MB seems smallish. Is something larger possible and, importantly, still valid in your test?

--Lee
Hi Lee,

The write request offsets and lengths are 1 MB "aligned" (offsets and lengths are multiples of 1 MB). Actually, the lengths of the write calls are either 8, 9, 10, 11, or 12 MB. I have 41 write calls from each of the 16 MPI processes, which makes a file size (i.e., the total write size) of 6.4 GB.

Wei-keng

On Thu, 21 Sep 2006, Lee Ward wrote:
> Hi Wei-keng. 1MB seems smallish. Is something larger possible and,
> importantly, still valid in your test?
O_DIRECT I/O is synchronous to the disk, FYI.

On Thu, 21 Sep 2006, Wei-keng Liao wrote:
> I wonder if O_DIRECT makes a read/write call synchronous all the way to
> the disks on the server side before the call returns. Or is it that
> the call returns immediately after the servers receive the data.
On 9/21/06, Wei-keng Liao <wkliao@ece.northwestern.edu> wrote:
> I wonder if O_DIRECT makes a read/write call synchronous all the way to
> the disks on the server side before the call returns.

Yes, that's how it works.

Johann