thr3ads.net - Lustre devel - [Lustre-devel] Important changes to libcfs primitives usage. [Dec 2012]

If this information is useful, please help other people find it:
Share via:

Oleg Drokin

2012-Dec-05 13:54 UTC

[Lustre-devel] Important changes to libcfs primitives usage.

Hello!

   I just landed first patch of the series to reduce usage of our libcfs_
wrappers for kernel primitives like libcfs_spin_lock/unlock...
   You can see actual change here: http://review.whamcloud.com/#change,2829

   It''s highly likely that plenty of patches will be affected. To make
our job easier, there is a
   build/libcfs_cleanup.sed script included, you can run it on all your .c and
.h files to make necessary replacements:
   sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h" -or -name
"*.c"`

   Please be also advised that there are more changes like this are coming
(timeline is not very clear ATM, we might be able to wait with the rest until
   after feature freeze) and the sed script will be updated accordingly.

Bye,
    Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.

Alexey Lyahkov

2012-Dec-14 08:19 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Please don''t kill a portability code.
We have spend many times in past to introduce it''s but now you want to
kill that work and make portable to systems other then Linux too hard.
That is wrong way as supporting a compatibility code have no cost.

On Dec 5, 2012, at 17:54, Oleg Drokin wrote:
> Hello!
> 
>   I just landed first patch of the series to reduce usage of our libcfs_
wrappers for kernel primitives like libcfs_spin_lock/unlock...
>   You can see actual change here: http://review.whamcloud.com/#change,2829
> 
>   It''s highly likely that plenty of patches will be affected. To
make our job easier, there is a
>   build/libcfs_cleanup.sed script included, you can run it on all your .c
and .h files to make necessary replacements:
>   sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h" -or
-name "*.c"`
> 
>   Please be also advised that there are more changes like this are coming
(timeline is not very clear ATM, we might be able to wait with the rest until
>   after feature freeze) and the sed script will be updated accordingly.
> 
> Bye,
>    Oleg
> --
> Oleg Drokin
> Senior Software Engineer
> Whamcloud, Inc.
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Peng, Tao

2012-Dec-14 09:52 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Hi Alex,

We are not killing the portability code. For now we just wanted to replace it
with Linux defined symbols, to a extent that libcfs is unnecessary for Linux
client. The libcfs layer remains for platforms that need porting. They just need
to define things with Linux names rather than current cfs_ prefix. I admit that
some ported code may get broken in during the transition but in theory it is
easy to fix them when interests raise on other platforms.

Cheers,
Tao

> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces
at lists.lustre.org] On Behalf
> Of Alexey Lyahkov
> Sent: Friday, December 14, 2012 4:19 PM
> To: Oleg Drokin
> Cc: wc-discuss Discussion; lustre-devel at lists.lustre.org Development
> Subject: Re: [Lustre-devel] [wc-discuss] Important changes to libcfs
primitives usage.
> 
> Please don''t kill a portability code.
> We have spend many times in past to introduce it''s but now you
want to kill that work and make
> portable to systems other then Linux too hard.
> That is wrong way as supporting a compatibility code have no cost.
> 
> On Dec 5, 2012, at 17:54, Oleg Drokin wrote:
> 
> > Hello!
> >
> >   I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel
> primitives like libcfs_spin_lock/unlock...
> >   You can see actual change here:
http://review.whamcloud.com/#change,2829
> >
> >   It''s highly likely that plenty of patches will be affected.
To make our job easier, there is a
> >   build/libcfs_cleanup.sed script included, you can run it on all your
.c and .h files to make
> necessary replacements:
> >   sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h"
-or -name "*.c"`
> >
> >   Please be also advised that there are more changes like this are
coming (timeline is not very
> clear ATM, we might be able to wait with the rest until
> >   after feature freeze) and the sed script will be updated
accordingly.
> >
> > Bye,
> >    Oleg
> > --
> > Oleg Drokin
> > Senior Software Engineer
> > Whamcloud, Inc.
> >
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel

Alexey Lyahkov

2012-Dec-15 06:27 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Peng,

you may confirm any platfrom isn''t intersect with linux names for that
?
I not sure. cfs_ prefix introduce a private namespace which may implemented at
any way
I remember that discussion in LUG, but we have a decision to continue discussion
about it, but you are stop discussion and make changes without it. I''m
strongly about it.
I remember you idea about landing lustre client in kernel tree but that may
introduce hard problem with supporting stable code, so we need to discuss and
find way - how we will have and support a stable code in opposite to development
version in kernel tree.
I strongly against to do it without discussion with all affected persons and
companies.


On Dec 14, 2012, at 13:52, Peng, Tao wrote:
> Hi Alex,
> 
> We are not killing the portability code. For now we just wanted to replace
it with Linux defined symbols, to a extent that libcfs is unnecessary for Linux
client. The libcfs layer remains for platforms that need porting. They just need
to define things with Linux names rather than current cfs_ prefix. I admit that
some ported code may get broken in during the transition but in theory it is
easy to fix them when interests raise on other platforms.
> 
> Cheers,
> Tao
> 
> 
>> -----Original Message-----
>> From: lustre-devel-bounces at lists.lustre.org
[mailto:lustre-devel-bounces at lists.lustre.org] On Behalf
>> Of Alexey Lyahkov
>> Sent: Friday, December 14, 2012 4:19 PM
>> To: Oleg Drokin
>> Cc: wc-discuss Discussion; lustre-devel at lists.lustre.org Development
>> Subject: Re: [Lustre-devel] [wc-discuss] Important changes to libcfs
primitives usage.
>> 
>> Please don''t kill a portability code.
>> We have spend many times in past to introduce it''s but now you
want to kill that work and make
>> portable to systems other then Linux too hard.
>> That is wrong way as supporting a compatibility code have no cost.
>> 
>> On Dec 5, 2012, at 17:54, Oleg Drokin wrote:
>> 
>>> Hello!
>>> 
>>>  I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel
>> primitives like libcfs_spin_lock/unlock...
>>>  You can see actual change here:
http://review.whamcloud.com/#change,2829
>>> 
>>>  It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>  build/libcfs_cleanup.sed script included, you can run it on all
your .c and .h files to make
>> necessary replacements:
>>>  sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h"
-or -name "*.c"`
>>> 
>>>  Please be also advised that there are more changes like this are
coming (timeline is not very
>> clear ATM, we might be able to wait with the rest until
>>>  after feature freeze) and the sed script will be updated
accordingly.
>>> 
>>> Bye,
>>>   Oleg
>>> --
>>> Oleg Drokin
>>> Senior Software Engineer
>>> Whamcloud, Inc.
>>> 
>> 
>> ----------------------------------------------
>> Alexey Lyahkov
>> alexey_lyashkov at xyratex.com
>> 
>> 
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Dilger, Andreas

2012-Dec-15 09:12 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 2012-12-14, at 23:28, "Alexey Lyahkov" <alexey_lyashkov at
xyratex.com> wrote:> you may confirm any platfrom isn''t intersect with linux names for
that ?
> I not sure. cfs_ prefix introduce a private namespace which may implemented
at any way
> I remember that discussion in LUG, but we have a decision to continue
discussion about it, but you are stop discussion and make changes without it.
I don''t think it is fair to say there was no discussion. It was
discussed at LUG and SC Lustre BOF and on the list.

While I would be thrilled if there was a native MacOS and Windows client for
Lustre, realistically there is nothing today.  The MacOS client is not
maintained, and the Windows client is stick inside Oracle.

In contrast, only the Linux client is on use today, so it makes sense to focus
effort on this instead of spending time on maintaining portability code that is
not useful to us.

If the other clients were usable, perhaps a different decision would be made. 
> I''m strongly about it.
> I remember you idea about landing lustre client in kernel tree but that may
introduce hard problem with supporting stable code, so we need to discuss and
find way - how we will have and support a stable code in opposite to development
version in kernel tree.
Yes, there are potential issues and effort with putting the client into the
kernel, but also benefits as well. Faster support for new kernels, less effort
spent to catch up to new changes in the kernel, better distro support.
> I strongly against to do it without discussion with all affected persons
and companies.
I''m open for discussion. You''ve voiced some objection to this
plan, so please explain the benefits of not cleaning up old code from Lustre,
not removing useless abstractions, and npt putting the client into the kernel.

Cheers, Andreas
> On Dec 14, 2012, at 13:52, Peng, Tao wrote:
> 
>> Hi Alex,
>> 
>> We are not killing the portability code. For now we just wanted to
replace it with Linux defined symbols, to a extent that libcfs is unnecessary
for Linux client. The libcfs layer remains for platforms that need porting. They
just need to define things with Linux names rather than current cfs_ prefix. I
admit that some ported code may get broken in during the transition but in
theory it is easy to fix them when interests raise on other platforms.
>> 
>> Cheers,
>> Tao
>> 
>> 
>>> -----Original Message-----
>>> From: lustre-devel-bounces at lists.lustre.org
[mailto:lustre-devel-bounces at lists.lustre.org] On Behalf
>>> Of Alexey Lyahkov
>>> Sent: Friday, December 14, 2012 4:19 PM
>>> To: Oleg Drokin
>>> Cc: wc-discuss Discussion; lustre-devel at lists.lustre.org
Development
>>> Subject: Re: [Lustre-devel] [wc-discuss] Important changes to
libcfs primitives usage.
>>> 
>>> Please don''t kill a portability code.
>>> We have spend many times in past to introduce it''s but now
you want to kill that work and make
>>> portable to systems other then Linux too hard.
>>> That is wrong way as supporting a compatibility code have no cost.
>>> 
>>> On Dec 5, 2012, at 17:54, Oleg Drokin wrote:
>>> 
>>>> Hello!
>>>> 
>>>> I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel
>>> primitives like libcfs_spin_lock/unlock...
>>>> You can see actual change here:
http://review.whamcloud.com/#change,2829
>>>> 
>>>> It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>> build/libcfs_cleanup.sed script included, you can run it on all
your .c and .h files to make
>>> necessary replacements:
>>>> sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name "*.c"`
>>>> 
>>>> Please be also advised that there are more changes like this
are coming (timeline is not very
>>> clear ATM, we might be able to wait with the rest until
>>>> after feature freeze) and the sed script will be updated
accordingly.
>>>> 
>>>> Bye,
>>>>  Oleg
>>>> --
>>>> Oleg Drokin
>>>> Senior Software Engineer
>>>> Whamcloud, Inc.
>>> 
>>> ----------------------------------------------
>>> Alexey Lyahkov
>>> alexey_lyashkov at xyratex.com
>>> 
>>> 
>>> _______________________________________________
>>> Lustre-devel mailing list
>>> Lustre-devel at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel

Ken Hornstein

2012-Dec-17 05:44 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

>While I would be thrilled if there was a native MacOS and Windows client
>for Lustre, realistically there is nothing today.  The MacOS client is
>not maintained, and the Windows client is stick inside Oracle.
>
>In contrast, only the Linux client is on use today, so it makes sense to
>focus effort on this instead of spending time on maintaining portability
>code that is not useful to us.
>
>If the other clients were usable, perhaps a different decision would be
>made.
So, let me speak to that a bit since I was the one who did the MacOS X port
which has been langunishing.

Andreas is completely right that 90% of libcfs library for Linux at least
is adding to kernel internal interfaces a "cfs_" prefix.  So that part
at
least isn''t going to change the issues with porting Lustre to other
operating
systems; you have to pretty much emulate a whole pile of Linux interfaces.
So that doesn''t really worry me in terms of future portability.

What _does_ concern me is that to be portable Lustre needs some code in the
generic side of things.  Example: locks in MacOS X are done via allocated
memory, so you need to release that memory when you''re done.  Lustre
never
does this (because it''s not needed under Linux since locks use
statically
allocated memory).  So is it going to be possible in the future to put
generic code to aid portabiliy in Lustre?  I don''t know.

Personally, I think putting Lustre in the Linux kernel is a terrible
idea, but since I was obviously in the minority on that one I figured
there was no reason to really say anything about it.  Also, it''s hard
for me to voice objections about possible MacOS X portability issues
when I can''t dedicate any resources to cleaning up the MacOS X port.

--Ken

Liu, Xuezhao

2012-Dec-17 06:52 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Hi,

Current the libcfs cleanup is mainly done by replacing kinds of libcfs calls
with direct kernel primitives, we can implement the portable layer on other OS
with the same name as Linux.
I think the real concern is that on other OS there is possible some kernel
primitives have the same name a Linux kernel but with different
behavior/parameters, that will cause either compiling broken or running crashing
etc.
I don''t have the full list of the above intersecting case, but it
should be very rare right? We can build a separate patch( which should be small)
for those intersecting cases for other OS, and I guess that patch need to be
maintained separately.

And lustre CLIO is structured that all OS-dependent glue is supposed to be
handled in the topmost layer (VVP for linux client). For Mac or
Windows''s native client, it is possibly just implement a new thin top
layer if interests really raised on those platform? This could reduce the
possibility of the above intersecting cases.

Thanks,
Xuezhao
-----Original Message-----
From: Ken Hornstein [mailto:kenh at cmf.nrl.navy.mil] 
Sent: 2012?12?17? 13:44
To: WC-Discuss; lustre-devel at lists.lustre.org
Subject: Re: [Lustre-devel] [wc-discuss] Important changes to libcfs primitives
usage.
>While I would be thrilled if there was a native MacOS and Windows 
>client for Lustre, realistically there is nothing today.  The MacOS 
>client is not maintained, and the Windows client is stick inside Oracle.
>
>In contrast, only the Linux client is on use today, so it makes sense 
>to focus effort on this instead of spending time on maintaining 
>portability code that is not useful to us.
>
>If the other clients were usable, perhaps a different decision would be 
>made.
So, let me speak to that a bit since I was the one who did the MacOS X port
which has been langunishing.

Andreas is completely right that 90% of libcfs library for Linux at least is
adding to kernel internal interfaces a "cfs_" prefix.  So that part at
least isn''t going to change the issues with porting Lustre to other
operating systems; you have to pretty much emulate a whole pile of Linux
interfaces.
So that doesn''t really worry me in terms of future portability.

What _does_ concern me is that to be portable Lustre needs some code in the
generic side of things.  Example: locks in MacOS X are done via allocated
memory, so you need to release that memory when you''re done.  Lustre
never does this (because it''s not needed under Linux since locks use
statically allocated memory).  So is it going to be possible in the future to
put generic code to aid portabiliy in Lustre?  I don''t know.

Personally, I think putting Lustre in the Linux kernel is a terrible idea, but
since I was obviously in the minority on that one I figured there was no reason
to really say anything about it.  Also, it''s hard for me to voice
objections about possible MacOS X portability issues when I can''t
dedicate any resources to cleaning up the MacOS X port.

--Ken

Zhuravlev, Alexey

2012-Dec-17 07:04 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On Dec 17, 2012, at 10:52 AM, "Liu, Xuezhao" <Xuezhao.Liu at
emc.com> wrote:
> Hi,
> 
> Current the libcfs cleanup is mainly done by replacing kinds of libcfs
calls with direct kernel primitives, we can implement the portable layer on
other OS with the same name as Linux.
> I think the real concern is that on other OS there is possible some kernel
primitives have the same name a Linux kernel but with different
behavior/parameters, that will cause either compiling broken or running crashing
etc.
> I don''t have the full list of the above intersecting case, but it
should be very rare right? We can build a separate patch( which should be small)
for those intersecting cases for other OS, and I guess that patch need to be
maintained separately.
originally libcfs_ prefix was introduce to cope with name collisions in Solaris.
I''m not sure how this can be solved with a separate small patch.

to me the both goals look hard to achieve, so I''m fine with any
solution ;)

thanks, Alex


--------------------------------------------------------------------
Closed Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park, 
17 Krylatskaya Str., Bldg 4, Moscow 121614, 
Russian Federation

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Dilger, Andreas

2012-Dec-17 08:42 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 2012-12-16, at 22:44, Ken Hornstein <kenh at cmf.nrl.navy.mil> wrote:
>> While I would be thrilled if there was a native MacOS and Windows
client
>> for Lustre, realistically there is nothing today.  The MacOS client is
>> not maintained, and the Windows client is stick inside Oracle.
>> 
>> In contrast, only the Linux client is on use today, so it makes sense
to
>> focus effort on this instead of spending time on maintaining
portability
>> code that is not useful to us.
>> 
>> If the other clients were usable, perhaps a different decision would be
>> made.
> 
> So, let me speak to that a bit since I was the one who did the MacOS X port
> which has been langunishing.
> 
> Andreas is completely right that 90% of libcfs library for Linux at least
> is adding to kernel internal interfaces a "cfs_" prefix.  So that
part at
> least isn''t going to change the issues with porting Lustre to
other operating
> systems; you have to pretty much emulate a whole pile of Linux interfaces.
> So that doesn''t really worry me in terms of future portability.
> 
> What _does_ concern me is that to be portable Lustre needs some code in the
> generic side of things.  Example: locks in MacOS X are done via allocated
> memory, so you need to release that memory when you''re done. 
Lustre never
> does this (because it''s not needed under Linux since locks use
statically
> allocated memory).  So is it going to be possible in the future to put
> generic code to aid portabiliy in Lustre?  I don''t know.
Yes, we''d discussed this in the past, but the patches for lock cleanup
never got added into the core Lustre code. We will still be the maintainers of
the Lustre code in the kernel, so if there are innocuous no-op calls that can be
put into the code (e.g. spin_lock_free()) or something similar I think that is
fine.

The root of the problem, for which there was no easy solution (wrappers or not)
is that there is no easy way to test for this under Linux, so without either
static code analysis tools (preferably run with the prepare-commit-msg git
hook), or rigorous testing on MacOS, it would always be a game of catch up for
these cleanups.
> Personally, I think putting Lustre in the Linux kernel is a terrible
> idea, but since I was obviously in the minority on that one I figured
> there was no reason to really say anything about it.
I''d like to understand your reasons for thinking it is a bad idea.
There are definite plusses and minuses to being in the kernel, but if there is
some overwhelming badness for being in the kernel is like to know about it.

Cheers, Andreas
> Also, it''s hard
> for me to voice objections about possible MacOS X portability issues
> when I can''t dedicate any resources to cleaning up the MacOS X
port.

John Hammond

2012-Dec-17 17:09 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 12/05/2012 07:54 AM, Oleg Drokin wrote:>     I just landed first patch of the series to reduce usage of our libcfs_
wrappers for kernel primitives like libcfs_spin_lock/unlock...
>     You can see actual change here:
http://review.whamcloud.com/#change,2829
>
>     It''s highly likely that plenty of patches will be affected. To
make our job easier, there is a
>     build/libcfs_cleanup.sed script included, you can run it on all your .c
and .h files to make necessary replacements:
>     sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h" -or
-name "*.c"`
>
>     Please be also advised that there are more changes like this are coming
(timeline is not very clear ATM, we might be able to wait with the rest until
>     after feature freeze) and the sed script will be updated accordingly.
I have been wondering about wrappers and typedefs not affected by this 
change, for example cfs_get_cpu(), cfs_atomic_read() and 
cfs_proc_dir_entry_t.  In new code and patches should we use the cfs 
names or their Linux equivalents, get_cpu(), atomic_read(), and struct 
proc_dir_entry?

Thanks,

John

-- 
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
jhammond at tacc.utexas.edu
(512) 471-9304

Ken Hornstein

2012-Dec-17 17:21 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

>Current the libcfs cleanup is mainly done by replacing kinds of         
>libcfs calls with direct kernel primitives, we can implement the        
>portable layer on other OS with the same name as Linux.  I think        
>the real concern is that on other OS there is possible some kernel      
>primitives have the same name a Linux kernel but with different         
>behavior/parameters, that will cause either compiling broken or running 
>crashing etc.  I don''t have the full list of the above intersecting
>case, but it should be very rare right? We can build a separate patch(  
>which should be small) for those intersecting cases for other OS, and I 
>guess that patch need to be maintained separately.                      
Like I said, I''m fine with that; at least in the MacOS X case I
don''t
recall any symbol names which overlapped.
>And lustre CLIO is structured that all OS-dependent glue is supposed
>to be handled in the topmost layer (VVP for linux client). For Mac or
>Windows''s native client, it is possibly just implement a new thin
top
>layer if interests really raised on those platform? This could reduce
>the possibility of the above intersecting cases.
Well, it''s only a "thin" layer if your VM system looks
exactly like Linux;
otherwise it ain''t so thin.  Yeah, I did some of that, but it
wasn''t
easy.

--Ken

Oleg Drokin

2012-Dec-17 17:30 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Hello!

On Dec 17, 2012, at 12:09 PM, John Hammond wrote:> I have been wondering about wrappers and typedefs not affected by this
change, for example cfs_get_cpu(), cfs_atomic_read() and cfs_proc_dir_entry_t. 
In new code and patches should we use the cfs names or their Linux equivalents,
get_cpu(), atomic_read(), and struct proc_dir_entry?
Those will be converted in later patches.
As such it''s better not to submit cfs_ style names already, though no
super big harm if not.

Bye,
    Oleg
--
Oleg Drokin
Senior Software Engineer
Whamcloud, Inc.

Andreas Dilger

2012-Dec-17 18:34 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 2012-12-17, at 10:09 AM, John Hammond wrote:> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>    I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>    You can see actual change here:
http://review.whamcloud.com/#change,2829
>> 
>>    It''s highly likely that plenty of patches will be affected.
To make our job easier, there is a
>>    build/libcfs_cleanup.sed script included, you can run it on all your
.c and .h files to make necessary replacements:
>>    sed -i -f build/libcfs_cleanup.sed3  `find . -name "*.h"
-or -name "*.c"`
>> 
>>    Please be also advised that there are more changes like this are
coming (timeline is not very clear ATM, we might be able to wait with the rest
until
>>    after feature freeze) and the sed script will be updated
accordingly.
> 
> I have been wondering about wrappers and typedefs not affected by this
change, for example cfs_get_cpu(), cfs_atomic_read() and cfs_proc_dir_entry_t. 
In new code and patches should we use the cfs names or their Linux equivalents,
get_cpu(), atomic_read(), and struct proc_dir_entry?
Ideally, new patches would use the Linux primitives.  However, if they are in
client-side code that is compiled for liblustre, then the liblustre builds would
fail until the wrappers are renamed to their Linux equivalents (i.e. removing
"cfs_" prefix).

For server-side code and/or llite it should be fine to use the native Linux
functions.

Cheers, Andreas
--
Andreas Dilger                       Whamcloud, Inc.
Principal Lustre Engineer            http://www.whamcloud.com/

Ken Hornstein

2012-Dec-18 04:12 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

>Yes, we''d discussed this in the past, but the patches for lock
cleanup
>never got added into the core Lustre code. We will still be the
>maintainers of the Lustre code in the kernel, so if there are innocuous
>no-op calls that can be put into the code (e.g. spin_lock_free()) or
>something similar I think that is fine.
Well as long as that''s not a problem, I don''t think that will
be an issue
then.
>The root of the problem, for which there was no easy solution (wrappers
>or not) is that there is no easy way to test for this under Linux, so
>without either static code analysis tools (preferably run with the
>prepare-commit-msg git hook), or rigorous testing on MacOS, it would
>always be a game of catch up for these cleanups.
I understand that, and I''m personally fine with that.  From my memory,
the issue is not that a lot of lock structures were being created all
of the time and then deallocated (or I was able to find all of those
cases); it was more that a bunch of locks were leaked when Lustre shut
down.  I understand it''s going to be a constant catch-up.
>I''d like to understand your reasons for thinking it is a bad idea.
There
>are definite plusses and minuses to being in the kernel, but if there is
>some overwhelming badness for being in the kernel is like to know about
>it.
Well, fundamentally it just seems to me that you''re tying your product
to the whims of a group of people who, quite frankly, don''t care about
your product (except in a very abstract sense).  Giving away a huge amount
of control of your product looks like a bad idea to me.

But that''s sort of vague; let me talk about specifics. 
There''s a HUGE
difference in practice between "feature X appears in Linux kernel
version Y" and "My RedHat release Z has a particular feature". 
That''s
where life gets complicated.  Many times we''re stuck on a particular
kernel for various complicated reasons, yet we need to upgrade Lustre
... or vice versa.  It''s kind of like Infiniband in the Linux kernel
...
at best, it doesn''t hurt us, but it''s always the
"wrong" version (or so
my Infiniband guys tell me).  We never end up using the Infiniband in
the kernel, and sometimes (depending on the vagarities of the distro)
that screws us up hard; part of that is because for a long time one
particular distro never would distribute the development symbols for
the Infiniband in the kernel that matched what the kernel was running.
That won''t be an issue with Lustre, but you get the idea.

--Ken

Prakash Surya

2012-Dec-18 20:29 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On Mon, Dec 17, 2012 at 11:12:29PM -0500, Ken Hornstein
wrote:> >Yes, we''d discussed this in the past, but the patches for lock
cleanup
> >never got added into the core Lustre code. We will still be the
> >maintainers of the Lustre code in the kernel, so if there are innocuous
> >no-op calls that can be put into the code (e.g. spin_lock_free()) or
> >something similar I think that is fine.
> 
> Well as long as that''s not a problem, I don''t think that
will be an issue
> then.
> 
> >The root of the problem, for which there was no easy solution (wrappers
> >or not) is that there is no easy way to test for this under Linux, so
> >without either static code analysis tools (preferably run with the
> >prepare-commit-msg git hook), or rigorous testing on MacOS, it would
> >always be a game of catch up for these cleanups.
> 
> I understand that, and I''m personally fine with that.  From my
memory,
> the issue is not that a lot of lock structures were being created all
> of the time and then deallocated (or I was able to find all of those
> cases); it was more that a bunch of locks were leaked when Lustre shut
> down.  I understand it''s going to be a constant catch-up.
> 
> >I''d like to understand your reasons for thinking it is a bad
idea. There
> >are definite plusses and minuses to being in the kernel, but if there
is
> >some overwhelming badness for being in the kernel is like to know about
> >it.
> 
> Well, fundamentally it just seems to me that you''re tying your
product
> to the whims of a group of people who, quite frankly, don''t care
about
> your product (except in a very abstract sense).  Giving away a huge amount
> of control of your product looks like a bad idea to me.
> 
> But that''s sort of vague; let me talk about specifics. 
There''s a HUGE
> difference in practice between "feature X appears in Linux kernel
> version Y" and "My RedHat release Z has a particular
feature".  That''s
> where life gets complicated.  Many times we''re stuck on a
particular
> kernel for various complicated reasons, yet we need to upgrade Lustre
> ... or vice versa.  It''s kind of like Infiniband in the Linux
kernel ...
Perhaps I''m being naive, but at some point it would be nice to have a
rock solid "stable" version of a lustre client and protocol. Giving up
some control over the ability to upgrade the client could be perceived
as a good thing.

-- 
Cheers, Prakash
> at best, it doesn''t hurt us, but it''s always the
"wrong" version (or so
> my Infiniband guys tell me).  We never end up using the Infiniband in
> the kernel, and sometimes (depending on the vagarities of the distro)
> that screws us up hard; part of that is because for a long time one
> particular distro never would distribute the development symbols for
> the Infiniband in the kernel that matched what the kernel was running.
> That won''t be an issue with Lustre, but you get the idea.
> 
> --Ken

Brian J. Murrell

2012-Dec-18 22:12 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On Mon, 2012-12-17 at 23:12 -0500, Ken Hornstein wrote:> There''s a HUGE
> difference in practice between "feature X appears in Linux kernel
> version Y" and "My RedHat release Z has a particular
feature".  That''s
> where life gets complicated.  Many times we''re stuck on a
particular
> kernel for various complicated reasons, yet we need to upgrade Lustre
> ... or vice versa.  It''s kind of like Infiniband in the Linux
kernel ...
> at best, it doesn''t hurt us, but it''s always the
"wrong" version (or so
> my Infiniband guys tell me).  We never end up using the Infiniband in
> the kernel, and sometimes (depending on the vagarities of the distro)
> that screws us up hard; part of that is because for a long time one
> particular distro never would distribute the development symbols for
> the Infiniband in the kernel that matched what the kernel was running.
> That won''t be an issue with Lustre, but you get the idea.
So let me just confirm I understand your issue here.  For example, the
Linux kernel is at 3.12.1 (version numbers are purely hypothetical and
far into the future) which has Lustre 2.6.3 in it.  But RHEL 9''s kernel
is only 3.8.1"ish" and has Lustre 2.4.2 (because that''s what
was in the
upstream Linux 3.8.1 kernel) in it.

The complaint is that the Lustre (2.4.2) that''s in the RHEL 9 kernel
that you are using is too old for some reason?

So what''s the remedy?  You can either (a) download the stand-alone
Lustre module source for the version of Lustre you need and build it
against the RHEL 9 kernel you want to use (assuming they are near enough
to be compatible -- which is the same situation that exists today) or
(b) you download the entire upstream kernel that has the version of
Lustre that you want to use and build that.

It seems to me that (a) is what everyone already has to do all of the
time today anyway (assuming the binary RPM that we ship for the given
vendor kernel is not new enough) and (b) just isn''t even possible so
that''s gravy.

There might be something obvious that I''m missing, but I''m
failing to
see how having Lustre (client) in the kernel is any worse than the
status quo (basically scenario (a) above) and indeed should provide
Lustre compatibility to the current stable Linux sooner than we can
typically turn it around.  

Cheers,
b.

Prakash Surya

2012-Dec-19 00:12 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On Tue, Dec 18, 2012 at 05:12:27PM -0500, Brian J. Murrell
wrote:> On Mon, 2012-12-17 at 23:12 -0500, Ken Hornstein wrote:
> > There''s a HUGE
> > difference in practice between "feature X appears in Linux kernel
> > version Y" and "My RedHat release Z has a particular
feature".  That''s
> > where life gets complicated.  Many times we''re stuck on a
particular
> > kernel for various complicated reasons, yet we need to upgrade Lustre
> > ... or vice versa.  It''s kind of like Infiniband in the Linux
kernel ...
> > at best, it doesn''t hurt us, but it''s always the
"wrong" version (or so
> > my Infiniband guys tell me).  We never end up using the Infiniband in
> > the kernel, and sometimes (depending on the vagarities of the distro)
> > that screws us up hard; part of that is because for a long time one
> > particular distro never would distribute the development symbols for
> > the Infiniband in the kernel that matched what the kernel was running.
> > That won''t be an issue with Lustre, but you get the idea.
> 
> So let me just confirm I understand your issue here.  For example, the
> Linux kernel is at 3.12.1 (version numbers are purely hypothetical and
> far into the future) which has Lustre 2.6.3 in it.  But RHEL 9''s
kernel
> is only 3.8.1"ish" and has Lustre 2.4.2 (because that''s
what was in the
> upstream Linux 3.8.1 kernel) in it.
> 
> The complaint is that the Lustre (2.4.2) that''s in the RHEL 9
kernel
> that you are using is too old for some reason?
> 
> So what''s the remedy?  You can either (a) download the stand-alone
> Lustre module source for the version of Lustre you need and build it
> against the RHEL 9 kernel you want to use (assuming they are near enough
> to be compatible -- which is the same situation that exists today) or
That is a big assumption, and is most likely false. Currently autoconf
maintains the compatibility of Lustre with the different kernel versions.
If the client is in tree, are we really going to maintain the same level
of autoconf "foo" to be compatible with older kernels? And if not, we
would have to maintain a second package, outside of the kernel, which did
have this autoconf compatibility layer.

I don''t think option (a) is the same as what we have now.

-- 
Cheers, Prakash
> (b) you download the entire upstream kernel that has the version of
> Lustre that you want to use and build that.
> 
> It seems to me that (a) is what everyone already has to do all of the
> time today anyway (assuming the binary RPM that we ship for the given
> vendor kernel is not new enough) and (b) just isn''t even possible
so
> that''s gravy.
> 
> There might be something obvious that I''m missing, but
I''m failing to
> see how having Lustre (client) in the kernel is any worse than the
> status quo (basically scenario (a) above) and indeed should provide
> Lustre compatibility to the current stable Linux sooner than we can
> typically turn it around.  
> 
> Cheers,
> b.
>

Ken Hornstein

2012-Dec-19 02:30 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

See, now, this is EXACTLY why I didn''t want to say anything.  Sigh.
>The complaint is that the Lustre (2.4.2) that''s in the RHEL 9
kernel
>that you are using is too old for some reason?
Right.
>So what''s the remedy?  You can either (a) download the stand-alone
>Lustre module source for the version of Lustre you need and build it
>against the RHEL 9 kernel you want to use (assuming they are near enough
>to be compatible -- which is the same situation that exists today) or
>(b) you download the entire upstream kernel that has the version of
>Lustre that you want to use and build that.
So (b) isn''t an option for us, ever.  And I think that Prakash has
already
explained why what we have today isn''t exactly (a).  Generally we
have to run one particular version of Lustre with maybe one version
skew.
>There might be something obvious that I''m missing, but I''m
failing to
>see how having Lustre (client) in the kernel is any worse than the
>status quo (basically scenario (a) above) and indeed should provide
>Lustre compatibility to the current stable Linux sooner than we can
>typically turn it around.  
Well, turn it around: Lustre in the Linux kernel doesn''t GAIN me
anything.  So like I said, at best it''s a no-op; I can''t
imagine a
situation where we''ll use it.  I''m personally extremely
skeptical that
having it in the kernel will bring Lustre compatibility to stable Linux
sooner than usual.  Of course, I''m just speaking from my view;
obviously
from the developer side of things it can look very different.

And that''s my last email on this topic.

--Ken

Dilger, Andreas

2012-Dec-19 08:39 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 12/18/12 5:12 PM, "Prakash Surya" <surya1 at llnl.gov>
wrote:>On Tue, Dec 18, 2012 at 05:12:27PM -0500, Brian J. Murrell wrote:
>> On Mon, 2012-12-17 at 23:12 -0500, Ken Hornstein wrote:
>> > There''s a HUGE
>> > difference in practice between "feature X appears in Linux
kernel
>> > version Y" and "My RedHat release Z has a particular
feature".  That''s
>> > where life gets complicated.  Many times we''re stuck on a
particular
>> > kernel for various complicated reasons, yet we need to upgrade
Lustre
>> > ... or vice versa.  It''s kind of like Infiniband in the
Linux kernel
>>...
>> > at best, it doesn''t hurt us, but it''s always the
"wrong" version (or
>>so
>> > my Infiniband guys tell me).  We never end up using the Infiniband
in
>> > the kernel, and sometimes (depending on the vagarities of the
distro)
>> > that screws us up hard; part of that is because for a long time
one
>> > particular distro never would distribute the development symbols
for
>> > the Infiniband in the kernel that matched what the kernel was
running.
>> > That won''t be an issue with Lustre, but you get the idea.
> > 
>> 
>> So let me just confirm I understand your issue here.  For example, the
>> Linux kernel is at 3.12.1 (version numbers are purely hypothetical and
>> far into the future) which has Lustre 2.6.3 in it.  But RHEL
9''s kernel
>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
>> upstream Linux 3.8.1 kernel) in it.
>> 
>> The complaint is that the Lustre (2.4.2) that''s in the RHEL 9
kernel
>> that you are using is too old for some reason?
>> 
>> So what''s the remedy?  You can either (a) download the
stand-alone
>> Lustre module source for the version of Lustre you need and build it
>> against the RHEL 9 kernel you want to use (assuming they are near
enough
>> to be compatible -- which is the same situation that exists today) or
>
>That is a big assumption, and is most likely false. Currently autoconf
>maintains the compatibility of Lustre with the different kernel versions.
>If the client is in tree, are we really going to maintain the same level
>of autoconf "foo" to be compatible with older kernels? And if not,
we
>would have to maintain a second package, outside of the kernel, which did
>have this autoconf compatibility layer.
>
>I don''t think option (a) is the same as what we have now.
We _already_ have to maintain this autoconf layer for the out-of-kernel
tree, and we''ve had to do it basically forever.  At least if Lustre is
in
the kernel, it is _possible_ that the version that is in the vendor
release is "good enough" for a large number of users (probably more
than
exist today).  

If we align the Lustre "maintenance" branch with the kernel version
that
goes into the distro long-term-maintenance kernels (e.g. Lustre 3.0 and
Linux 3.24 for RHEL8) that wouldn''t be very different from what we are
doing today, but it would be a lot less hassle for most users.  If someone
wants to run a new kernel and new Lustre development release, then they
will get both at the same time.

The corner case is a development version of Lustre with an older vendor
kernel, which would require making the new version of Lustre to build
against the vendor kernel.  The in-kernel version of Lustre would no
longer have the autoconf and kernel-portability wrappers, but at least for
a few years there would be an out-of-kernel version of Lustre until it is
available in the vendor kernels (only updated every few years).

Having Lustre in the kernel would also tend to provide a more "stable"
point for the Lustre client and network protocol (as commented elsewhere),
assuming we can line up the Lustre maintenance releases with the kernel
chosen by the vendors.  We already need to provide interop and upgrade
support for Lustre, regardless of whether it is in the kernel or not.

It may mean that the "maintenance" version of Lustre is chosen by Red
Hat
when they decide on which kernel to use for, say, RHEL8.  We would push
Lustre fixes into the "stable" kernel branch used by RHEL8 and/or
directly
to RH.

The most difficult part will be the transition, when there is a version of
Lustre in the kernel, but there is still a need to maintain Lustre out of
the tree for the older kernels that do not have it yet.  That would
probably take 3-6 years before we could reasonably deprecate the oldest
vendor kernel that does not have any Lustre support in it.
>> (b) you download the entire upstream kernel that has the version of
>> Lustre that you want to use and build that.
>> 
>> It seems to me that (a) is what everyone already has to do all of the
>> time today anyway (assuming the binary RPM that we ship for the given
>> vendor kernel is not new enough) and (b) just isn''t even
possible so
>> that''s gravy.
>> 
>> There might be something obvious that I''m missing, but
I''m failing to
>> see how having Lustre (client) in the kernel is any worse than the
>> status quo (basically scenario (a) above) and indeed should provide
>> Lustre compatibility to the current stable Linux sooner than we can
>> typically turn it around.
>> 
>> Cheers,
>> b.

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

Alexey Lyahkov

2012-Dec-19 12:26 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

Nice idea, what about conflicts with names between linux and freebsd?
both have a atomic but with different arguments
atomic_set, atomic_add as example.

did you research before make that patch to be sure no conflicts exist?



On Dec 17, 2012, at 22:34, Andreas Dilger wrote:
> On 2012-12-17, at 10:09 AM, John Hammond wrote:
>> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>>   I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>>   You can see actual change here:
http://review.whamcloud.com/#change,2829
>>> 
>>>   It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>   build/libcfs_cleanup.sed script included, you can run it on all
your .c and .h files to make necessary replacements:
>>>   sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name "*.c"`
>>> 
>>>   Please be also advised that there are more changes like this are
coming (timeline is not very clear ATM, we might be able to wait with the rest
until
>>>   after feature freeze) and the sed script will be updated
accordingly.
>> 
>> I have been wondering about wrappers and typedefs not affected by this
change, for example cfs_get_cpu(), cfs_atomic_read() and cfs_proc_dir_entry_t. 
In new code and patches should we use the cfs names or their Linux equivalents,
get_cpu(), atomic_read(), and struct proc_dir_entry?
> 
> Ideally, new patches would use the Linux primitives.  However, if they are
in client-side code that is compiled for liblustre, then the liblustre builds
would fail until the wrappers are renamed to their Linux equivalents (i.e.
removing "cfs_" prefix).
> 
> For server-side code and/or llite it should be fine to use the native Linux
functions.
> 
> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Lustre Engineer            http://www.whamcloud.com/
> 
> 
> 
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Alexey Lyahkov

2012-Dec-19 12:32 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

That is second question, 
how we plan maintain a stable version to build with older kernels from RHEL,
SuSe and may be Debian.
how WC have plan to sync changes between stable tree and development in kernel?
what they will planed in case huge API changes? as example that change - splice
API was added in 2.6.20 kernel and sendfile API was killed - so we should lost
all kernels with sendfile support (RHEL5, SLES9/10).
Same for page fault API - they have 3 versions between 2.6.21 and 2.6.26 - so
old kernels will be also killed from support.
or we will have never tested code in ''stable'' tree.

reason to use autoconf macroses - we can''t trust a RHEL/SuSe kernel
version/


On Dec 19, 2012, at 04:12, Prakash Surya wrote:>>> ]unning.
>>> That won''t be an issue with Lustre, but you get the idea.
>> 
>> So let me just confirm I understand your issue here.  For example, the
>> Linux kernel is at 3.12.1 (version numbers are purely hypothetical and
>> far into the future) which has Lustre 2.6.3 in it.  But RHEL
9''s kernel
>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
>> upstream Linux 3.8.1 kernel) in it.
>> 
>> The complaint is that the Lustre (2.4.2) that''s in the RHEL 9
kernel
>> that you are using is too old for some reason?
>> 
>> So what''s the remedy?  You can either (a) download the
stand-alone
>> Lustre module source for the version of Lustre you need and build it
>> against the RHEL 9 kernel you want to use (assuming they are near
enough
>> to be compatible -- which is the same situation that exists today) or
> 
> That is a big assumption, and is most likely false. Currently autoconf
> maintains the compatibility of Lustre with the different kernel versions.
> If the client is in tree, are we really going to maintain the same level
> of autoconf "foo" to be compatible with older kernels? And if
not, we
> would have to maintain a second package, outside of the kernel, which did
> have this autoconf compatibility layer.
> 
> I don''t think option (a) is the same as what we have now.
> 
> -- 
> Cheers, Prakash
> 
>> (b) you download the entire upstream kernel that has the version of
>> Lustre that you want to use and build that.
>> 
>> It seems to me that (a) is what everyone already has to do all of the
>> time today anyway (assuming the binary RPM that we ship for the given
>> vendor kernel is not new enough) and (b) just isn''t even
possible so
>> that''s gravy.
>> 
>> There might be something obvious that I''m missing, but
I''m failing to
>> see how having Lustre (client) in the kernel is any worse than the
>> status quo (basically scenario (a) above) and indeed should provide
>> Lustre compatibility to the current stable Linux sooner than we can
>> typically turn it around.  
>> 
>> Cheers,
>> b.
>> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Liu, Xuezhao

2012-Dec-19 15:06 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

"Nice idea, what about conflicts with names between linux and freebsd?
both have a atomic but with different arguments atomic_set, atomic_add as
example."

For atomic_xxx in freebsd, we may refer the ofed solution. It defines atomic
operations in "sys/ofed/include/asm/atomic.h".
For ofed, there are also some other header files to provide similar semantics of
linux primitives.

"did you research before make that patch to be sure no conflicts
exist?"

I made those initial cleanup patches, I did worry the possible conflicts as you
said.
But I think the hard-to-resolve confliction cases should be very rare, we can
deal with it case by case(the possible spin_lock_free for example as previous
email).

Thanks,
Xuezhao

On Dec 17, 2012, at 22:34, Andreas Dilger wrote:
> On 2012-12-17, at 10:09 AM, John Hammond wrote:
>> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>>   I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>>   You can see actual change here: 
>>> http://review.whamcloud.com/#change,2829
>>> 
>>>   It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>   build/libcfs_cleanup.sed script included, you can run it on all
your .c and .h files to make necessary replacements:
>>>   sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name
>>> "*.c"`
>>> 
>>>   Please be also advised that there are more changes like this are
coming (timeline is not very clear ATM, we might be able to wait with the rest
until
>>>   after feature freeze) and the sed script will be updated
accordingly.
>> 
>> I have been wondering about wrappers and typedefs not affected by this
change, for example cfs_get_cpu(), cfs_atomic_read() and cfs_proc_dir_entry_t. 
In new code and patches should we use the cfs names or their Linux equivalents,
get_cpu(), atomic_read(), and struct proc_dir_entry?
> 
> Ideally, new patches would use the Linux primitives.  However, if they are
in client-side code that is compiled for liblustre, then the liblustre builds
would fail until the wrappers are renamed to their Linux equivalents (i.e.
removing "cfs_" prefix).
> 
> For server-side code and/or llite it should be fine to use the native Linux
functions.
> 
> Cheers, Andreas
> --
> Andreas Dilger                       Whamcloud, Inc.
> Principal Lustre Engineer            http://www.whamcloud.com/
> 
> 
> 
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com


_______________________________________________
Lustre-devel mailing list
Lustre-devel at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-devel

Alexey Lyahkov

2012-Dec-19 18:55 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

If you don''t make a research before create a patches - why you create
it?
why we need to kill current code? just for put in kernel mainline?
That is very bad idea before you will create a plan - how stable branches will
supported.
because stable branch need to support older kernel and i don''t like a
OFED way - because that is way to introduce a strange errors.
because isn''t all functions may be simulated via old API, as example a
deadlock with SLES10 kernel and external OFED due bug in OFED backports.


On Dec 19, 2012, at 19:06, Liu, Xuezhao wrote:
> "Nice idea, what about conflicts with names between linux and freebsd?
> both have a atomic but with different arguments atomic_set, atomic_add as
example."
> 
> For atomic_xxx in freebsd, we may refer the ofed solution. It defines
atomic operations in "sys/ofed/include/asm/atomic.h".
> For ofed, there are also some other header files to provide similar
semantics of linux primitives.
> 
> "did you research before make that patch to be sure no conflicts
exist?"
> 
> I made those initial cleanup patches, I did worry the possible conflicts as
you said.
> But I think the hard-to-resolve confliction cases should be very rare, we
can deal with it case by case(the possible spin_lock_free for example as
previous email).


> 
> Thanks,
> Xuezhao
> 
> On Dec 17, 2012, at 22:34, Andreas Dilger wrote:
> 
>> On 2012-12-17, at 10:09 AM, John Hammond wrote:
>>> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>>>  I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>>>  You can see actual change here: 
>>>> http://review.whamcloud.com/#change,2829
>>>> 
>>>>  It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>>  build/libcfs_cleanup.sed script included, you can run it on
all your .c and .h files to make necessary replacements:
>>>>  sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name
>>>> "*.c"`
>>>> 
>>>>  Please be also advised that there are more changes like this
are coming (timeline is not very clear ATM, we might be able to wait with the
rest until
>>>>  after feature freeze) and the sed script will be updated
accordingly.
>>> 
>>> I have been wondering about wrappers and typedefs not affected by
this change, for example cfs_get_cpu(), cfs_atomic_read() and
cfs_proc_dir_entry_t.  In new code and patches should we use the cfs names or
their Linux equivalents, get_cpu(), atomic_read(), and struct proc_dir_entry?
>> 
>> Ideally, new patches would use the Linux primitives.  However, if they
are in client-side code that is compiled for liblustre, then the liblustre
builds would fail until the wrappers are renamed to their Linux equivalents
(i.e. removing "cfs_" prefix).
>> 
>> For server-side code and/or llite it should be fine to use the native
Linux functions.
>> 
>> Cheers, Andreas
>> --
>> Andreas Dilger                       Whamcloud, Inc.
>> Principal Lustre Engineer            http://www.whamcloud.com/
>> 
>> 
>> 
>> 
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Dilger, Andreas

2012-Dec-19 21:36 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 2012-12-19, at 5:27, "Alexey Lyahkov" <alexey_lyashkov at
xyratex.com> wrote:> Nice idea, what about conflicts with names between linux and freebsd?
> both have a atomic but with different arguments
> atomic_set, atomic_add as example.
> 
> did you research before make that patch to be sure no conflicts exist?
Yes, it is possible that such conflicts exist. However, it is impossible to know
of such conflicts if there is not any code in the Lustre tree that breaks, and
no way to test it.

I''m aware that you have been working on a FreeBSD FUSE port at times.
Have you ever released the patches somewhere? There never was any code in the
Lustre tree for it and I''ve never heard of any users. Virtually (?) all
of the Lustre users are on Linux, and making it easier for Linux users to use
Lustre is most important.

I don''t want to make it impossible to port to FreeBSD, but if there is
a workaround as Xuezhao has proposed, and there are no real users of this code,
then the benefits of broader Lustre acceptance outweigh the inconvenience to one
or two people developing the FreeBSD FUSE port.

Cheers, Andreas
> On Dec 17, 2012, at 22:34, Andreas Dilger wrote:
> 
>> On 2012-12-17, at 10:09 AM, John Hammond wrote:
>>> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>>>  I just landed first patch of the series to reduce usage of our
libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>>>  You can see actual change here:
http://review.whamcloud.com/#change,2829
>>>> 
>>>>  It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>>  build/libcfs_cleanup.sed script included, you can run it on
all your .c and .h files to make necessary replacements:
>>>>  sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name "*.c"`
>>>> 
>>>>  Please be also advised that there are more changes like this
are coming (timeline is not very clear ATM, we might be able to wait with the
rest until
>>>>  after feature freeze) and the sed script will be updated
accordingly.
>>> 
>>> I have been wondering about wrappers and typedefs not affected by
this change, for example cfs_get_cpu(), cfs_atomic_read() and
cfs_proc_dir_entry_t.  In new code and patches should we use the cfs names or
their Linux equivalents, get_cpu(), atomic_read(), and struct proc_dir_entry?
>> 
>> Ideally, new patches would use the Linux primitives.  However, if they
are in client-side code that is compiled for liblustre, then the liblustre
builds would fail until the wrappers are renamed to their Linux equivalents
(i.e. removing "cfs_" prefix).
>> 
>> For server-side code and/or llite it should be fine to use the native
Linux functions.
>> 
>> Cheers, Andreas
>> --
>> Andreas Dilger                       Whamcloud, Inc.
>> Principal Lustre Engineer            http://www.whamcloud.com/
>> 
>> 
>> 
>> 
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel

Dilger, Andreas

2012-Dec-19 22:59 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On 12/19/12 5:32 AM, "Alexey Lyahkov" <alexey_lyashkov at
xyratex.com> wrote:
>That is second question,
>how we plan maintain a stable version to build with older kernels from
>RHEL, SuSe and may be Debian.
>how WC have plan to sync changes between stable tree and development in
>kernel?
This will need some extra effort, but needs to happen.  That is why we are
now
working to clean up the current Lustre code to be acceptable to the
upstream
kernel.  If the out-of-kernel Lustre code is wildly different from the
in-kernel
Lustre code, then it will be far too much effort to keep the two in sync.

By changing the compatibility macros to match the upstream kernel, cfs_*
wrappers,
split the client/server code, etc, we can make the out-of-tree code match
the
in-kernel code very closely, then porting patches back and forth will be
much
less work.
>what they will planed in case huge API changes? as example that change -
>splice API was added in 2.6.20 kernel and sendfile API was killed - so we
>should lost all kernels with sendfile support (RHEL5, SLES9/10).
>Same for page fault API - they have 3 versions between 2.6.21 and 2.6.26
>- so old kernels will be also killed from support.
>or we will have never tested code in ''stable'' tree.
This is no different than the situation today.  There are large API
changes in
the upstream kernel (e.g. VFS changes in 2.6.37), and we are always playing
catch-up with them.  Through the work of EMC we are close to being in sync
with
the upstream kernel (3.6 at least), and being part of the upstream kernel
will
definitely help avoid this problem.  Then, patches to change the API in the
kernel will also be applied to the Lustre code immediately.

Once Lustre is in the upstream kernel, it will eventually get into the
vendor
kernels.  It may also be possible to get a backport (i.e. the current
version)
included into a vendor kernel since they are less reluctant to do so for
code
that is already in the upstream kernel.
>reason to use autoconf macroses - we can''t trust a RHEL/SuSe kernel
>version/
Sure, we will need this for the out-of-tree Lustre code, as we do today,
but not
inside the kernel.  The benefit is that the in-tree code will be the
"right"
code for that particular kernel version, and no macros will be needed.

Cheers, Andreas
>On Dec 19, 2012, at 04:12, Prakash Surya wrote:
>>>> ]unning.
>>>> That won''t be an issue with Lustre, but you get the
idea.
>>> 
>>> So let me just confirm I understand your issue here.  For example,
the
>>> Linux kernel is at 3.12.1 (version numbers are purely hypothetical
and
>>> far into the future) which has Lustre 2.6.3 in it.  But RHEL
9''s kernel
>>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
>>> upstream Linux 3.8.1 kernel) in it.
>>> 
>>> The complaint is that the Lustre (2.4.2) that''s in the
RHEL 9 kernel
>>> that you are using is too old for some reason?
>>> 
>>> So what''s the remedy?  You can either (a) download the
stand-alone
>>> Lustre module source for the version of Lustre you need and build
it
>>> against the RHEL 9 kernel you want to use (assuming they are near
>>>enough
>>> to be compatible -- which is the same situation that exists today)
or
>> 
>> That is a big assumption, and is most likely false. Currently autoconf
>> maintains the compatibility of Lustre with the different kernel
>>versions.
>> If the client is in tree, are we really going to maintain the same
level
>> of autoconf "foo" to be compatible with older kernels? And if
not, we
>> would have to maintain a second package, outside of the kernel, which
>>did
>> have this autoconf compatibility layer.
>> 
>> I don''t think option (a) is the same as what we have now.
>> 
>> -- 
>> Cheers, Prakash
>> 
>>> (b) you download the entire upstream kernel that has the version of
>>> Lustre that you want to use and build that.
>>> 
>>> It seems to me that (a) is what everyone already has to do all of
the
>>> time today anyway (assuming the binary RPM that we ship for the
given
>>> vendor kernel is not new enough) and (b) just isn''t even
possible so
>>> that''s gravy.
>>> 
>>> There might be something obvious that I''m missing, but
I''m failing to
>>> see how having Lustre (client) in the kernel is any worse than the
>>> status quo (basically scenario (a) above) and indeed should provide
>>> Lustre compatibility to the current stable Linux sooner than we can
>>> typically turn it around.
>>> 
>>> Cheers,
>>> b.
>>> 
>
>----------------------------------------------
>Alexey Lyahkov
>alexey_lyashkov at xyratex.com
>
>
>_______________________________________________
>Lustre-devel mailing list
>Lustre-devel at lists.lustre.org
>http://lists.lustre.org/mailman/listinfo/lustre-devel
>

Cheers, Andreas
-- 
Andreas Dilger

Lustre Software Architect
Intel High Performance Data Division

Peng, Tao

2012-Dec-20 02:32 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces
at lists.lustre.org] On Behalf
> Of Dilger, Andreas
> Sent: Thursday, December 20, 2012 6:59 AM
> To: Alexey Lyahkov; Prakash Surya
> Cc: hpdd-discuss at lists.01.org; WC-Discuss; <lustre-devel at
lists.lustre.org>
> Subject: Re: [Lustre-devel] [wc-discuss] Important changes to libcfs
primitives usage.
> 
> On 12/19/12 5:32 AM, "Alexey Lyahkov" <alexey_lyashkov at
xyratex.com> wrote:
> 
> >That is second question,
> >how we plan maintain a stable version to build with older kernels from
> >RHEL, SuSe and may be Debian.
> >how WC have plan to sync changes between stable tree and development in
> >kernel?
> 
> This will need some extra effort, but needs to happen.  That is why we are
> now
> working to clean up the current Lustre code to be acceptable to the
> upstream
> kernel.  If the out-of-kernel Lustre code is wildly different from the
> in-kernel
> Lustre code, then it will be far too much effort to keep the two in sync.
> 
> By changing the compatibility macros to match the upstream kernel, cfs_*
> wrappers,
> split the client/server code, etc, we can make the out-of-tree code match
> the
> in-kernel code very closely, then porting patches back and forth will be
> much
> less work.
> 
> >what they will planed in case huge API changes? as example that change
-
> >splice API was added in 2.6.20 kernel and sendfile API was killed - so
we
> >should lost all kernels with sendfile support (RHEL5, SLES9/10).
> >Same for page fault API - they have 3 versions between 2.6.21 and
2.6.26
> >- so old kernels will be also killed from support.
> >or we will have never tested code in ''stable'' tree.
> 
> This is no different than the situation today.  There are large API
> changes in
> the upstream kernel (e.g. VFS changes in 2.6.37), and we are always playing
> catch-up with them.  Through the work of EMC we are close to being in sync
> with
> the upstream kernel (3.6 at least), and being part of the upstream kernel
> will
> definitely help avoid this problem.  Then, patches to change the API in the
> kernel will also be applied to the Lustre code immediately.
> 
> Once Lustre is in the upstream kernel, it will eventually get into the
> vendor
> kernels.  It may also be possible to get a backport (i.e. the current
> version)
> included into a vendor kernel since they are less reluctant to do so for
> code
> that is already in the upstream kernel.
> 
> >reason to use autoconf macroses - we can''t trust a RHEL/SuSe
kernel
> >version/Well, there are situations that autoconf cannot handle properly, such as the
open_flag change by upstream commit (8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
The change is inline so we cannot tell it without grepping the source file
(fs/namei.c) which is likely unavailable when users build Lustre client from
source. We currently solved it by putting a LINUX_VERSION_CODE check there but
as you see it is not trustworthy when vendors backport patches.

And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.

Cheers,
Tao> 
> Sure, we will need this for the out-of-tree Lustre code, as we do today,
> but not
> inside the kernel.  The benefit is that the in-tree code will be the
> "right"
> code for that particular kernel version, and no macros will be needed.
> 
> Cheers, Andreas
> 
> >On Dec 19, 2012, at 04:12, Prakash Surya wrote:
> >>>> ]unning.
> >>>> That won''t be an issue with Lustre, but you get
the idea.
> >>>
> >>> So let me just confirm I understand your issue here.  For
example, the
> >>> Linux kernel is at 3.12.1 (version numbers are purely
hypothetical and
> >>> far into the future) which has Lustre 2.6.3 in it.  But RHEL
9''s kernel
> >>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
> >>> upstream Linux 3.8.1 kernel) in it.
> >>>
> >>> The complaint is that the Lustre (2.4.2) that''s in
the RHEL 9 kernel
> >>> that you are using is too old for some reason?
> >>>
> >>> So what''s the remedy?  You can either (a) download
the stand-alone
> >>> Lustre module source for the version of Lustre you need and
build it
> >>> against the RHEL 9 kernel you want to use (assuming they are
near
> >>>enough
> >>> to be compatible -- which is the same situation that exists
today) or
> >>
> >> That is a big assumption, and is most likely false. Currently
autoconf
> >> maintains the compatibility of Lustre with the different kernel
> >>versions.
> >> If the client is in tree, are we really going to maintain the same
level
> >> of autoconf "foo" to be compatible with older kernels?
And if not, we
> >> would have to maintain a second package, outside of the kernel,
which
> >>did
> >> have this autoconf compatibility layer.
> >>
> >> I don''t think option (a) is the same as what we have now.
> >>
> >> --
> >> Cheers, Prakash
> >>
> >>> (b) you download the entire upstream kernel that has the
version of
> >>> Lustre that you want to use and build that.
> >>>
> >>> It seems to me that (a) is what everyone already has to do all
of the
> >>> time today anyway (assuming the binary RPM that we ship for
the given
> >>> vendor kernel is not new enough) and (b) just isn''t
even possible so
> >>> that''s gravy.
> >>>
> >>> There might be something obvious that I''m missing,
but I''m failing to
> >>> see how having Lustre (client) in the kernel is any worse than
the
> >>> status quo (basically scenario (a) above) and indeed should
provide
> >>> Lustre compatibility to the current stable Linux sooner than
we can
> >>> typically turn it around.
> >>>
> >>> Cheers,
> >>> b.
> >>>
> >
> >----------------------------------------------
> >Alexey Lyahkov
> >alexey_lyashkov at xyratex.com
> >
> >
> >_______________________________________________
> >Lustre-devel mailing list
> >Lustre-devel at lists.lustre.org
> >http://lists.lustre.org/mailman/listinfo/lustre-devel
> >
> 
> 
> Cheers, Andreas
> --
> Andreas Dilger
> 
> Lustre Software Architect
> Intel High Performance Data Division
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel

Alexey Lyahkov

2012-Dec-20 10:08 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

I don''t understand a point to broke something which work now.
as about workaround - did you remember bug with cancel for dirty pages in OFED
when Lustre hit a deadlock due bad dirty pages accounting?

On Dec 20, 2012, at 01:36, Dilger, Andreas wrote:
> On 2012-12-19, at 5:27, "Alexey Lyahkov" <alexey_lyashkov at
xyratex.com> wrote:
>> Nice idea, what about conflicts with names between linux and freebsd?
>> both have a atomic but with different arguments
>> atomic_set, atomic_add as example.
>> 
>> did you research before make that patch to be sure no conflicts exist?
> 
> Yes, it is possible that such conflicts exist. However, it is impossible to
know of such conflicts if there is not any code in the Lustre tree that breaks,
and no way to test it.
> 
> I''m aware that you have been working on a FreeBSD FUSE port at
times. Have you ever released the patches somewhere? There never was any code in
the Lustre tree for it and I''ve never heard of any users. Virtually (?)
all of the Lustre users are on Linux, and making it easier for Linux users to
use Lustre is most important.
> 
> I don''t want to make it impossible to port to FreeBSD, but if
there is a workaround as Xuezhao has proposed, and there are no real users of
this code, then the benefits of broader Lustre acceptance outweigh the
inconvenience to one or two people developing the FreeBSD FUSE port.
> 
> Cheers, Andreas
> 
>> On Dec 17, 2012, at 22:34, Andreas Dilger wrote:
>> 
>>> On 2012-12-17, at 10:09 AM, John Hammond wrote:
>>>> On 12/05/2012 07:54 AM, Oleg Drokin wrote:
>>>>> I just landed first patch of the series to reduce usage of
our libcfs_ wrappers for kernel primitives like libcfs_spin_lock/unlock...
>>>>> You can see actual change here:
http://review.whamcloud.com/#change,2829
>>>>> 
>>>>> It''s highly likely that plenty of patches will be
affected. To make our job easier, there is a
>>>>> build/libcfs_cleanup.sed script included, you can run it on
all your .c and .h files to make necessary replacements:
>>>>> sed -i -f build/libcfs_cleanup.sed3  `find . -name
"*.h" -or -name "*.c"`
>>>>> 
>>>>> Please be also advised that there are more changes like
this are coming (timeline is not very clear ATM, we might be able to wait with
the rest until
>>>>> after feature freeze) and the sed script will be updated
accordingly.
>>>> 
>>>> I have been wondering about wrappers and typedefs not affected
by this change, for example cfs_get_cpu(), cfs_atomic_read() and
cfs_proc_dir_entry_t.  In new code and patches should we use the cfs names or
their Linux equivalents, get_cpu(), atomic_read(), and struct proc_dir_entry?
>>> 
>>> Ideally, new patches would use the Linux primitives.  However, if
they are in client-side code that is compiled for liblustre, then the liblustre
builds would fail until the wrappers are renamed to their Linux equivalents
(i.e. removing "cfs_" prefix).
>>> 
>>> For server-side code and/or llite it should be fine to use the
native Linux functions.
>>> 
>>> Cheers, Andreas
>>> --
>>> Andreas Dilger                       Whamcloud, Inc.
>>> Principal Lustre Engineer            http://www.whamcloud.com/
>>> 
>>> 
>>> 
>>> 
>> 
>> ----------------------------------------------
>> Alexey Lyahkov
>> alexey_lyashkov at xyratex.com
>> 
>> 
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Alexey Lyahkov

2012-Dec-20 10:24 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

> 
> 
>> what they will planed in case huge API changes? as example that change
-
>> splice API was added in 2.6.20 kernel and sendfile API was killed - so
we
>> should lost all kernels with sendfile support (RHEL5, SLES9/10).
>> Same for page fault API - they have 3 versions between 2.6.21 and
2.6.26
>> - so old kernels will be also killed from support.
>> or we will have never tested code in ''stable'' tree.
> 
> This is no different than the situation today.Really? currently we have a stable code for some stable kernel and
don''t have a big kernel API changes until that is will be tested.
but if client will in kernel - you will lost source control, because any
developer will change a kernel API and will change a lustre code without make a
special testing for lustre, mostly test a building sources.

>  There are large API
> changes in
> the upstream kernel (e.g. VFS changes in 2.6.37), and we are always playing
> catch-up with them.  sure. we will catch - but when we will be stable kernel sources without huge
kernel API changes between versions.
> Through the work of EMC we are close to being in sync
> with
> the upstream kernel (3.6 at least), and being part of the upstream kernel
> will
> definitely help avoid this problem.Sorry, i see only troubles with stable sources, but don''t find a real
benefits in that way. May be you forget about problems with porting lustre
between kernels and how much bugs we hit after it? but you select way when we
don''t have a stable tree.

>  Then, patches to change the API in the
> kernel will also be applied to the Lustre code immediately.
> yes, but you really think that is will be stable code? did you think API changes
will be tested with lustre fine?

> Once Lustre is in the upstream kernel, it will eventually get into the
> vendor
> kernels.  It may also be possible to get a backport (i.e. the current
> version)
> included into a vendor kernel since they are less reluctant to do so for
> code
> that is already in the upstream kernel.
So vendor will have older lustre client version in kernel, and we will be need
to support that version, and additional ''stable'' version and
version in last kernel. I think that is too much efforts - in comparing single
tree with supporting different kernels.

> 
>> reason to use autoconf macroses - we can''t trust a RHEL/SuSe
kernel
>> version/
> 
> Sure, we will need this for the out-of-tree Lustre code, as we do today,
> but not
> inside the kernel.  The benefit is that the in-tree code will be the
> "right"
> code for that particular kernel version, and no macros will be needed.
> Sorry. I don''t see a benefits now. if EMC want - they may maintain a
own tree - but i don''t like to see removing existent functionality
named as ''cleanup''.



> Cheers, Andreas
> 
>> On Dec 19, 2012, at 04:12, Prakash Surya wrote:
>>>>> ]unning.
>>>>> That won''t be an issue with Lustre, but you get
the idea.
>>>> 
>>>> So let me just confirm I understand your issue here.  For
example, the
>>>> Linux kernel is at 3.12.1 (version numbers are purely
hypothetical and
>>>> far into the future) which has Lustre 2.6.3 in it.  But RHEL
9''s kernel
>>>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
>>>> upstream Linux 3.8.1 kernel) in it.
>>>> 
>>>> The complaint is that the Lustre (2.4.2) that''s in the
RHEL 9 kernel
>>>> that you are using is too old for some reason?
>>>> 
>>>> So what''s the remedy?  You can either (a) download the
stand-alone
>>>> Lustre module source for the version of Lustre you need and
build it
>>>> against the RHEL 9 kernel you want to use (assuming they are
near
>>>> enough
>>>> to be compatible -- which is the same situation that exists
today) or
>>> 
>>> That is a big assumption, and is most likely false. Currently
autoconf
>>> maintains the compatibility of Lustre with the different kernel
>>> versions.
>>> If the client is in tree, are we really going to maintain the same
level
>>> of autoconf "foo" to be compatible with older kernels?
And if not, we
>>> would have to maintain a second package, outside of the kernel,
which
>>> did
>>> have this autoconf compatibility layer.
>>> 
>>> I don''t think option (a) is the same as what we have now.
>>> 
>>> -- 
>>> Cheers, Prakash
>>> 
>>>> (b) you download the entire upstream kernel that has the
version of
>>>> Lustre that you want to use and build that.
>>>> 
>>>> It seems to me that (a) is what everyone already has to do all
of the
>>>> time today anyway (assuming the binary RPM that we ship for the
given
>>>> vendor kernel is not new enough) and (b) just isn''t
even possible so
>>>> that''s gravy.
>>>> 
>>>> There might be something obvious that I''m missing, but
I''m failing to
>>>> see how having Lustre (client) in the kernel is any worse than
the
>>>> status quo (basically scenario (a) above) and indeed should
provide
>>>> Lustre compatibility to the current stable Linux sooner than we
can
>>>> typically turn it around.
>>>> 
>>>> Cheers,
>>>> b.
>>>> 
>> 
>> ----------------------------------------------
>> Alexey Lyahkov
>> alexey_lyashkov at xyratex.com
>> 
>> 
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
>> 
> 
> 
> Cheers, Andreas
> -- 
> Andreas Dilger
> 
> Lustre Software Architect
> Intel High Performance Data Division
> 
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Alexey Lyahkov

2012-Dec-20 10:45 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

On Dec 20, 2012, at 06:32, Peng, Tao wrote:>> 
>> 
>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>> version/
> Well, there are situations that autoconf cannot handle properly, such as
the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).i don''t understand how it is affect to lustre, as that is internal
change for VFS, but if you provide more details - i try to help you. But in
different thread please.

hm.. some minutes latter - you talk about ''static inline int
ll_namei_to_lookup_intent_flag(int flag)'' change? can you point lustre
commit in next time.
> The change is inline so we cannot tell it without grepping the source file
(fs/namei.c) which is likely unavailable when users build Lustre client from
source. We currently solved it by putting a LINUX_VERSION_CODE check there but
as you see it is not trustworthy when vendors backport patches.
> But what you will do if similar change will be backported by RH? they will have
outdated client in own tree and ''stable'' client in external
repository. that is need to be resolved in portable way anyway.


> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.
> Vendor will freeze lustre source version and never update it, but RHEL have ~5-7
years to support - so you will be need support that clients in own servers and
maintain stable lustre client for that kernel. So none difference with current
situation except a two additional versions to support
1) from stock vendor kernel
2) from stock vanila kernel.

and that is solve very few problems with taking a much more resources for
testing.

Put lustre code into kernel mainline good to have support a last kernel and for
development without stable tree. But i don''t like a support for very
older clients in server code - RHEL4 isn''t canceled to support (as i
know) that is lustre 1.4 time. RHEL5 that is 1.6 time.
but today we have talk about dropping lustre 1.8 clients support...
> Cheers,
> Tao
>> 
>> Sure, we will need this for the out-of-tree Lustre code, as we do
today,
>> but not
>> inside the kernel.  The benefit is that the in-tree code will be the
>> "right"
>> code for that particular kernel version, and no macros will be needed.
>> 
>> Cheers, Andreas
>> 
>>> On Dec 19, 2012, at 04:12, Prakash Surya wrote:
>>>>>> ]unning.
>>>>>> That won''t be an issue with Lustre, but you
get the idea.
>>>>> 
>>>>> So let me just confirm I understand your issue here.  For
example, the
>>>>> Linux kernel is at 3.12.1 (version numbers are purely
hypothetical and
>>>>> far into the future) which has Lustre 2.6.3 in it.  But
RHEL 9''s kernel
>>>>> is only 3.8.1"ish" and has Lustre 2.4.2 (because
that''s what was in the
>>>>> upstream Linux 3.8.1 kernel) in it.
>>>>> 
>>>>> The complaint is that the Lustre (2.4.2) that''s in
the RHEL 9 kernel
>>>>> that you are using is too old for some reason?
>>>>> 
>>>>> So what''s the remedy?  You can either (a) download
the stand-alone
>>>>> Lustre module source for the version of Lustre you need and
build it
>>>>> against the RHEL 9 kernel you want to use (assuming they
are near
>>>>> enough
>>>>> to be compatible -- which is the same situation that exists
today) or
>>>> 
>>>> That is a big assumption, and is most likely false. Currently
autoconf
>>>> maintains the compatibility of Lustre with the different kernel
>>>> versions.
>>>> If the client is in tree, are we really going to maintain the
same level
>>>> of autoconf "foo" to be compatible with older
kernels? And if not, we
>>>> would have to maintain a second package, outside of the kernel,
which
>>>> did
>>>> have this autoconf compatibility layer.
>>>> 
>>>> I don''t think option (a) is the same as what we have
now.
>>>> 
>>>> --
>>>> Cheers, Prakash
>>>> 
>>>>> (b) you download the entire upstream kernel that has the
version of
>>>>> Lustre that you want to use and build that.
>>>>> 
>>>>> It seems to me that (a) is what everyone already has to do
all of the
>>>>> time today anyway (assuming the binary RPM that we ship for
the given
>>>>> vendor kernel is not new enough) and (b) just
isn''t even possible so
>>>>> that''s gravy.
>>>>> 
>>>>> There might be something obvious that I''m missing,
but I''m failing to
>>>>> see how having Lustre (client) in the kernel is any worse
than the
>>>>> status quo (basically scenario (a) above) and indeed should
provide
>>>>> Lustre compatibility to the current stable Linux sooner
than we can
>>>>> typically turn it around.
>>>>> 
>>>>> Cheers,
>>>>> b.
>>>>> 
>>> 
>>> ----------------------------------------------
>>> Alexey Lyahkov
>>> alexey_lyashkov at xyratex.com
>>> 
>>> 
>>> _______________________________________________
>>> Lustre-devel mailing list
>>> Lustre-devel at lists.lustre.org
>>> http://lists.lustre.org/mailman/listinfo/lustre-devel
>>> 
>> 
>> 
>> Cheers, Andreas
>> --
>> Andreas Dilger
>> 
>> Lustre Software Architect
>> Intel High Performance Data Division
>> 
>> 
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Alexey Lyahkov

2012-Dec-20 11:06 UTC

head link

[Lustre-devel] OFLAGS change.

On Dec 20, 2012, at 06:32, Peng, Tao wrote:>> 
>> 
>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>> version/
> Well, there are situations that autoconf cannot handle properly, such as
the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a). The change is inline so we cannot
tell it without grepping the source file (fs/namei.c) which is likely
unavailable when users build Lustre client from source. We currently solved it
by putting a LINUX_VERSION_CODE check there but as you see it is not trustworthy
when vendors backport patches.
> 
> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.

static inline int ll_namei_to_lookup_intent_flag(int flag)
{
#if LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0)
<------>flag = (flag & ~O_ACCMODE) | OPEN_FMODE(flag);
#endif
<------>return flag;
}

if i correctly understand - that change have a dependence to the OPEN_FMODE
macro - not a kernel version.
so for any kernels have a OPEN_FMODE - we need to use that function, otherwise
use flags directly.

----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Liu, Xuezhao

2012-Dec-20 12:08 UTC

head link

[Lustre-devel] OFLAGS change.

"if i correctly understand - that change have a dependence to the
OPEN_FMODE macro - not a kernel version.
so for any kernels have a OPEN_FMODE - we need to use that function, otherwise
use flags directly."

OPEN_FMODE was introduced at kernel 2.6.34
(5300990c0370e804e49d9a59d928c5d53fb73487).
open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
So we cannot only depend on OPEN_FMODE for that.

Thanks,
Xuezhao
-----Original Message-----
From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces at
lists.lustre.org] On Behalf Of Alexey Lyahkov
Sent: 2012?12?20? 19:07
To: Peng, Tao
Cc: hpdd-discuss at lists.01.org; WC-Discuss; <lustre-devel at
lists.lustre.org>
Subject: [Lustre-devel] OFLAGS change.


On Dec 20, 2012, at 06:32, Peng, Tao wrote:>> 
>> 
>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>> version/
> Well, there are situations that autoconf cannot handle properly, such as
the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a). The change is inline so we cannot
tell it without grepping the source file (fs/namei.c) which is likely
unavailable when users build Lustre client from source. We currently solved it
by putting a LINUX_VERSION_CODE check there but as you see it is not trustworthy
when vendors backport patches.
> 
> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.

static inline int ll_namei_to_lookup_intent_flag(int flag) { #if
LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0) <------>flag = (flag
& ~O_ACCMODE) | OPEN_FMODE(flag); #endif <------>return flag; }

if i correctly understand - that change have a dependence to the OPEN_FMODE
macro - not a kernel version.
so for any kernels have a OPEN_FMODE - we need to use that function, otherwise
use flags directly.

----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com


_______________________________________________
Lustre-devel mailing list
Lustre-devel at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-devel

Alexey Lyahkov

2012-Dec-20 16:18 UTC

head link

[Lustre-devel] OFLAGS change.

I have checked on last kernel from kernel.org, so may be for other kernels that
is may wrong..
comments is appreciated. 

some assumption 
- intent.open.file->f_flags filled before call a vfs code for kernels RHEL6
and up (checked for -131 kernel is first), so ANY kernels with OPEN_FMODE macro
will have a f_flags filled correctly (checked by __dentry_open for all available
kernels).
- for kernels without OPEN_FMODE flag we may trust a flags from intent - because
they already have a conversion to FMODE for lover bits.

so we need - 
1) check if OPEN_FMODE exist, use a to conversion via
OPEN_FMODE(intent.open.file->f_flags) | ~O_ACCMODE;
2) if that macro don''t exist we trust oit->flags as that
don''t have a FMODE -> O_ flags conversion.

PS.
that is need to pass additional parameter to ll_convert_intent
nd->intent.open.file.
may be need better check - when f_flag filled before call a vfs function..





On Dec 20, 2012, at 16:08, Liu, Xuezhao wrote:
> "if i correctly understand - that change have a dependence to the
OPEN_FMODE macro - not a kernel version.
> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly."
> 
> OPEN_FMODE was introduced at kernel 2.6.34
(5300990c0370e804e49d9a59d928c5d53fb73487).but exists in RHEL6 kernel - 2.6.32-343 (last 6.4 beta) - patchset for that
kernel don''t available on RHN, so not able to see which patch introduce
it.
> open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
> So we cannot only depend on OPEN_FMODE for that.
> if you look to tree - changes was applied to code what to restore a O_ flags
from FMODE_ part of open flags.
FS (like xfs) want to use a FMODE continue to use a OPEN_FMODE and FMODE_
macroses

> Thanks,
> Xuezhao
> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org [mailto:lustre-devel-bounces
at lists.lustre.org] On Behalf Of Alexey Lyahkov
> Sent: 2012?12?20? 19:07
> To: Peng, Tao
> Cc: hpdd-discuss at lists.01.org; WC-Discuss; <lustre-devel at
lists.lustre.org>
> Subject: [Lustre-devel] OFLAGS change.
> 
> 
> On Dec 20, 2012, at 06:32, Peng, Tao wrote:
>>> 
>>> 
>>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>>> version/
>> Well, there are situations that autoconf cannot handle properly, such
as the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a). The change is inline so we cannot
tell it without grepping the source file (fs/namei.c) which is likely
unavailable when users build Lustre client from source. We currently solved it
by putting a LINUX_VERSION_CODE check there but as you see it is not trustworthy
when vendors backport patches.
>> 
>> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.
> 
> 
> static inline int ll_namei_to_lookup_intent_flag(int flag) { #if
LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0) <------>flag = (flag
& ~O_ACCMODE) | OPEN_FMODE(flag); #endif <------>return flag; }
> 
> if i correctly understand - that change have a dependence to the OPEN_FMODE
macro - not a kernel version.
> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly.
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Peng, Tao

2012-Dec-21 04:44 UTC

head link

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

> -----Original Message-----
> From: Alexey Lyahkov [mailto:alexey_lyashkov at xyratex.com]
> Sent: Thursday, December 20, 2012 6:45 PM
> To: Peng, Tao
> Cc: Dilger, Andreas; Prakash Surya; hpdd-discuss at lists.01.org;
WC-Discuss; <lustre-
> devel at lists.lustre.org>
> Subject: Re: [Lustre-devel] [wc-discuss] Important changes to libcfs
primitives usage.
> 
> 
> On Dec 20, 2012, at 06:32, Peng, Tao wrote:
> >>
> >>
> >>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
> >>> version/
> > Well, there are situations that autoconf cannot handle properly, such
as the open_flag change by
> upstream commit (8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
> i don''t understand how it is affect to lustre, as that is internal
change for VFS, but if you provide
> more details - i try to help you. But in different thread please.
> 
> hm.. some minutes latter - you talk about ''static inline int
ll_namei_to_lookup_intent_flag(int flag)''
> change? can you point lustre commit in next time.
> 
> > The change is inline so we cannot tell it without grepping the source
file (fs/namei.c) which is
> likely unavailable when users build Lustre client from source. We currently
solved it by putting a
> LINUX_VERSION_CODE check there but as you see it is not trustworthy when
vendors backport patches.
> >
> But what you will do if similar change will be backported by RH? they will
have outdated client in own
> tree and ''stable'' client in external repository. that is
need to be resolved in portable way anyway.
> If Lustre is in kernel, RH will need to backport relative changes to Lustre as
part of the process of backporting kernel commit. So we won''t face such
problems. It only remains a problem if Lustre is maintained separately out of RH
kernel tree as is today.
> 
> 
> > And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let
> vendors ship it so that they match in the first place.
> >
> Vendor will freeze lustre source version and never update it, 
> but RHEL have ~5-7 years to support - so
> you will be need support that clients in own servers and maintain stable
lustre client for that kernel.
> So none difference with current situation except a two additional versions
to support
> 1) from stock vendor kernel
> 2) from stock vanila kernel.
> It is the same case for any in-tree kernel modules. Besides, vendor kernels are
supported by vendors as well. So I don''t think it is such a burden for
Lustre. Other file systems all handle it pretty well.
> and that is solve very few problems with taking a much more resources for
testing.
> 
> Put lustre code into kernel mainline good to have support a last kernel and
for development without
> stable tree. But i don''t like a support for very older clients in
server code - RHEL4 isn''t canceled
> to support (as i know) that is lustre 1.4 time. RHEL5 that is 1.6 time.
> but today we have talk about dropping lustre 1.8 clients support...
> It is already answered by Andreas. We can reach a more stable point for Lustre
client and network protocol.
http://lists.lustre.org/pipermail/lustre-devel/2012-December/004187.html

While I don''t want to option about dropping 1.8 clients support, I
think it is better to have a more stable support matrix than today. And if you
are really serious about long term support as enterprise storage providers, most
NAS vendors still support nfsv2 today. Forcing everyone to upgrade
doesn''t seem practical to me...

Cheers,
Tao
> > Cheers,
> > Tao
> >>
> >> Sure, we will need this for the out-of-tree Lustre code, as we do
today,
> >> but not
> >> inside the kernel.  The benefit is that the in-tree code will be
the
> >> "right"
> >> code for that particular kernel version, and no macros will be
needed.
> >>
> >> Cheers, Andreas
> >>
> >>> On Dec 19, 2012, at 04:12, Prakash Surya wrote:
> >>>>>> ]unning.
> >>>>>> That won''t be an issue with Lustre, but
you get the idea.
> >>>>>
> >>>>> So let me just confirm I understand your issue here. 
For example, the
> >>>>> Linux kernel is at 3.12.1 (version numbers are purely
hypothetical and
> >>>>> far into the future) which has Lustre 2.6.3 in it. 
But RHEL 9''s kernel
> >>>>> is only 3.8.1"ish" and has Lustre 2.4.2
(because that''s what was in the
> >>>>> upstream Linux 3.8.1 kernel) in it.
> >>>>>
> >>>>> The complaint is that the Lustre (2.4.2)
that''s in the RHEL 9 kernel
> >>>>> that you are using is too old for some reason?
> >>>>>
> >>>>> So what''s the remedy?  You can either (a)
download the stand-alone
> >>>>> Lustre module source for the version of Lustre you
need and build it
> >>>>> against the RHEL 9 kernel you want to use (assuming
they are near
> >>>>> enough
> >>>>> to be compatible -- which is the same situation that
exists today) or
> >>>>
> >>>> That is a big assumption, and is most likely false.
Currently autoconf
> >>>> maintains the compatibility of Lustre with the different
kernel
> >>>> versions.
> >>>> If the client is in tree, are we really going to maintain
the same level
> >>>> of autoconf "foo" to be compatible with older
kernels? And if not, we
> >>>> would have to maintain a second package, outside of the
kernel, which
> >>>> did
> >>>> have this autoconf compatibility layer.
> >>>>
> >>>> I don''t think option (a) is the same as what we
have now.
> >>>>
> >>>> --
> >>>> Cheers, Prakash
> >>>>
> >>>>> (b) you download the entire upstream kernel that has
the version of
> >>>>> Lustre that you want to use and build that.
> >>>>>
> >>>>> It seems to me that (a) is what everyone already has
to do all of the
> >>>>> time today anyway (assuming the binary RPM that we
ship for the given
> >>>>> vendor kernel is not new enough) and (b) just
isn''t even possible so
> >>>>> that''s gravy.
> >>>>>
> >>>>> There might be something obvious that I''m
missing, but I''m failing to
> >>>>> see how having Lustre (client) in the kernel is any
worse than the
> >>>>> status quo (basically scenario (a) above) and indeed
should provide
> >>>>> Lustre compatibility to the current stable Linux
sooner than we can
> >>>>> typically turn it around.
> >>>>>
> >>>>> Cheers,
> >>>>> b.
> >>>>>
> >>>
> >>> ----------------------------------------------
> >>> Alexey Lyahkov
> >>> alexey_lyashkov at xyratex.com
> >>>
> >>>
> >>> _______________________________________________
> >>> Lustre-devel mailing list
> >>> Lustre-devel at lists.lustre.org
> >>> http://lists.lustre.org/mailman/listinfo/lustre-devel
> >>>
> >>
> >>
> >> Cheers, Andreas
> >> --
> >> Andreas Dilger
> >>
> >> Lustre Software Architect
> >> Intel High Performance Data Division
> >>
> >>
> >> _______________________________________________
> >> Lustre-devel mailing list
> >> Lustre-devel at lists.lustre.org
> >> http://lists.lustre.org/mailman/listinfo/lustre-devel
> >
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
>

Liu, Xuezhao

2012-Dec-21 07:47 UTC

head link

[Lustre-devel] OFLAGS change.

Hi Alex,

Thanks for looking into this issue.

Three related commits:
1) intent.open.file->f_flags is filled at kernel
2.6.34(482928d59db668b8d82a48717f78986d8cea72e9).
2) OPEN_FMODE is just the next commit(5300990c0370e804e49d9a59d928c5d53fb73487)
followed the above.
3) open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).

The purpose is we want to check 3) to make lustre take the lookup intent flag
same as before.
We resolved it by putting a LINUX_VERSION_CODE check, it has potential defect
when vendor kernel(RHEL/SUSE) backport 3) to an older version.

You proposed method that check 2), bypass open_to_namei_flags'' result
"intent.open.flags" and take "intent.open.file->f_flags"
instead.
It can work and has an assumption that if vendor kernel backport 2) then must
also backport 1), it is possibly not a problem.
I think the method of taking "intent.open.file->f_flags" and
calculating the lookup intent flag is not so legitimate, as VFS wants FS use the
passed in "intent.open.flags". Some inline handling possibly will add
by VFS to open_to_namei_flags although current it only removes the special open
flag 3.
And OPEN_FMODE also possibly be changed/renamed/removed in future.
But I don''t oppose you to make another patch for that. 

This issue is only an example to say that autoconf is not always so powerful to
keep compatibility, get into mainline can help us release such pain. Be mainline
possibly is not so terrible as current the plan is only for client-side. But
this is discussed in another thread.

Thanks,
Xuezhao
-----Original Message-----
From: Alexey Lyahkov [mailto:alexey_lyashkov at xyratex.com] 
Sent: 2012?12?21? 0:18
To: Liu, Xuezhao
Cc: Peng, Tao; hpdd-discuss at lists.01.org; WC-Discuss; <lustre-devel at
lists.lustre.org>
Subject: Re: [Lustre-devel] OFLAGS change.

I have checked on last kernel from kernel.org, so may be for other kernels that
is may wrong..
comments is appreciated. 

some assumption
- intent.open.file->f_flags filled before call a vfs code for kernels RHEL6
and up (checked for -131 kernel is first), so ANY kernels with OPEN_FMODE macro
will have a f_flags filled correctly (checked by __dentry_open for all available
kernels).
- for kernels without OPEN_FMODE flag we may trust a flags from intent - because
they already have a conversion to FMODE for lover bits.

so we need -
1) check if OPEN_FMODE exist, use a to conversion via
OPEN_FMODE(intent.open.file->f_flags) | ~O_ACCMODE;
2) if that macro don''t exist we trust oit->flags as that
don''t have a FMODE -> O_ flags conversion.

PS.
that is need to pass additional parameter to ll_convert_intent
nd->intent.open.file.
may be need better check - when f_flag filled before call a vfs function..

On Dec 20, 2012, at 16:08, Liu, Xuezhao wrote:
> "if i correctly understand - that change have a dependence to the
OPEN_FMODE macro - not a kernel version.
> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly."
> 
> OPEN_FMODE was introduced at kernel 2.6.34
(5300990c0370e804e49d9a59d928c5d53fb73487).but exists in RHEL6 kernel - 2.6.32-343 (last 6.4 beta) - patchset for that
kernel don''t available on RHN, so not able to see which patch introduce
it.
> open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
> So we cannot only depend on OPEN_FMODE for that.
> if you look to tree - changes was applied to code what to restore a O_ flags
from FMODE_ part of open flags.
FS (like xfs) want to use a FMODE continue to use a OPEN_FMODE and FMODE_
macroses

> Thanks,
> Xuezhao
> -----Original Message-----
> From: lustre-devel-bounces at lists.lustre.org 
> [mailto:lustre-devel-bounces at lists.lustre.org] On Behalf Of Alexey 
> Lyahkov
> Sent: 2012?12?20? 19:07
> To: Peng, Tao
> Cc: hpdd-discuss at lists.01.org; WC-Discuss; 
> <lustre-devel at lists.lustre.org>
> Subject: [Lustre-devel] OFLAGS change.
> 
> 
> On Dec 20, 2012, at 06:32, Peng, Tao wrote:
>>> 
>>> 
>>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>>> version/
>> Well, there are situations that autoconf cannot handle properly, such
as the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a). The change is inline so we cannot
tell it without grepping the source file (fs/namei.c) which is likely
unavailable when users build Lustre client from source. We currently solved it
by putting a LINUX_VERSION_CODE check there but as you see it is not trustworthy
when vendors backport patches.
>> 
>> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.
> 
> 
> static inline int ll_namei_to_lookup_intent_flag(int flag) { #if 
> LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0) <------>flag = (flag
&
> ~O_ACCMODE) | OPEN_FMODE(flag); #endif <------>return flag; }
> 
> if i correctly understand - that change have a dependence to the OPEN_FMODE
macro - not a kernel version.
> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly.
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Alexey Lyahkov

2012-Dec-21 09:22 UTC

head link

[Lustre-devel] OFLAGS change.

Hi Liu,

some comments.
Most bigger Lustre customers don''t use a vanila kernel, so kernel
version check - is unusefull in that case.
Cray, HP, Dell, Bull (may be use a vanila - don''t remember), .... most
of them uses a RHEL, or SuSe kernels - so that check a wrong assumption and
don''t help for these distro`s, but will add a troubles with supporting
these clients.

as about other - cifs, and NFS uses a intent.open.file->f_flags to check
actual flags passed from VFS and don''t trust a intent flags
flags.>>int
cifs_create(struct inode *inode, struct dentry *direntry, int mode,
                struct nameidata *nd)
..
        if (nd && (nd->flags & LOOKUP_OPEN))
                oflags = nd->intent.open.file->f_flags;
        else
                oflags = O_RDONLY | O_CREAT;
>>>so that is acceptable way in kernel.

I agree about that is change is good example - but focusing on vanila kernel
development should be don''t break something else, a specially
add''s problems with supporting a RHEL, SuSe.

from my side.>>        # the implementation of cancel_dirty_page in OFED 1.4.1''s
SLES10 SP2
        # backport is broken, so ignore it
        if test -f $OFED_BACKPORT_PATH/linux/mm.h &&
           test "$(sed -ne ''/^static inline void
cancel_dirty_page(struct page \*page, unsigned int
account_size)$/,/^}$/p'' $OFED_BACKPORT_PATH/linux/mm.h | md5sum)"
= "c518cb32d6394760c5bc>>
that is from way you suggest.
that backport produce a live lock in lustre due impossible to have correct
emulation for a cancel_dirty_pages in OFED.
and sometimes be easy create a implementation for old API via new when rewrite a
lustre code to use new API and emulation for older kernels.

 

On Dec 21, 2012, at 11:47, Liu, Xuezhao wrote:
> Hi Alex,
> 
> Thanks for looking into this issue.
> 
> Three related commits:
> 1) intent.open.file->f_flags is filled at kernel
2.6.34(482928d59db668b8d82a48717f78986d8cea72e9).
> 2) OPEN_FMODE is just the next
commit(5300990c0370e804e49d9a59d928c5d53fb73487) followed the above.
> 3) open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
> 
> The purpose is we want to check 3) to make lustre take the lookup intent
flag same as before.
> We resolved it by putting a LINUX_VERSION_CODE check, it has potential
defect when vendor kernel(RHEL/SUSE) backport 3) to an older version.
> 
> You proposed method that check 2), bypass open_to_namei_flags''
result "intent.open.flags" and take
"intent.open.file->f_flags" instead.
> It can work and has an assumption that if vendor kernel backport 2) then
must also backport 1), it is possibly not a problem.
> I think the method of taking "intent.open.file->f_flags" and
calculating the lookup intent flag is not so legitimate, as VFS wants FS use the
passed in "intent.open.flags". Some inline handling possibly will add
by VFS to open_to_namei_flags although current it only removes the special open
flag 3.
> And OPEN_FMODE also possibly be changed/renamed/removed in future.
> But I don''t oppose you to make another patch for that. 
> 
> This issue is only an example to say that autoconf is not always so
powerful to keep compatibility, get into mainline can help us release such pain.
Be mainline possibly is not so terrible as current the plan is only for
client-side. But this is discussed in another thread.
> 
> Thanks,
> Xuezhao
> -----Original Message-----
> From: Alexey Lyahkov [mailto:alexey_lyashkov at xyratex.com] 
> Sent: 2012?12?21? 0:18
> To: Liu, Xuezhao
> Cc: Peng, Tao; hpdd-discuss at lists.01.org; WC-Discuss; <lustre-devel
at lists.lustre.org>
> Subject: Re: [Lustre-devel] OFLAGS change.
> 
> I have checked on last kernel from kernel.org, so may be for other kernels
that is may wrong..
> comments is appreciated. 
> 
> some assumption
> - intent.open.file->f_flags filled before call a vfs code for kernels
RHEL6 and up (checked for -131 kernel is first), so ANY kernels with OPEN_FMODE
macro will have a f_flags filled correctly (checked by __dentry_open for all
available kernels).
> - for kernels without OPEN_FMODE flag we may trust a flags from intent -
because they already have a conversion to FMODE for lover bits.
> 
> so we need -
> 1) check if OPEN_FMODE exist, use a to conversion via
OPEN_FMODE(intent.open.file->f_flags) | ~O_ACCMODE;
> 2) if that macro don''t exist we trust oit->flags as that
don''t have a FMODE -> O_ flags conversion.
> 
> PS.
> that is need to pass additional parameter to ll_convert_intent
nd->intent.open.file.
> may be need better check - when f_flag filled before call a vfs function..
> 
> 
> 
> 
> 
> On Dec 20, 2012, at 16:08, Liu, Xuezhao wrote:
> 
>> "if i correctly understand - that change have a dependence to the
OPEN_FMODE macro - not a kernel version.
>> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly."
>> 
>> OPEN_FMODE was introduced at kernel 2.6.34
(5300990c0370e804e49d9a59d928c5d53fb73487).
> but exists in RHEL6 kernel - 2.6.32-343 (last 6.4 beta) - patchset for that
kernel don''t available on RHN, so not able to see which patch introduce
it.
> 
>> open_to_namei_flags was changed at kernel 3.1
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a).
>> So we cannot only depend on OPEN_FMODE for that.
>> 
> if you look to tree - changes was applied to code what to restore a O_
flags from FMODE_ part of open flags.
> FS (like xfs) want to use a FMODE continue to use a OPEN_FMODE and FMODE_
macroses
> 
> 
>> Thanks,
>> Xuezhao
>> -----Original Message-----
>> From: lustre-devel-bounces at lists.lustre.org 
>> [mailto:lustre-devel-bounces at lists.lustre.org] On Behalf Of Alexey 
>> Lyahkov
>> Sent: 2012?12?20? 19:07
>> To: Peng, Tao
>> Cc: hpdd-discuss at lists.01.org; WC-Discuss; 
>> <lustre-devel at lists.lustre.org>
>> Subject: [Lustre-devel] OFLAGS change.
>> 
>> 
>> On Dec 20, 2012, at 06:32, Peng, Tao wrote:
>>>> 
>>>> 
>>>>> reason to use autoconf macroses - we can''t trust a
RHEL/SuSe kernel
>>>>> version/
>>> Well, there are situations that autoconf cannot handle properly,
such as the open_flag change by upstream commit
(8a5e929dd2e05ab4d3d89f58c5e8fca596af8f3a). The change is inline so we cannot
tell it without grepping the source file (fs/namei.c) which is likely
unavailable when users build Lustre client from source. We currently solved it
by putting a LINUX_VERSION_CODE check there but as you see it is not trustworthy
when vendors backport patches.
>>> 
>>> And IMO, the right way to solve such problems is to put Lustre in
upstream/stable kernels and let vendors ship it so that they match in the first
place.
>> 
>> 
>> static inline int ll_namei_to_lookup_intent_flag(int flag) { #if 
>> LINUX_VERSION_CODE >= KERNEL_VERSION(3, 1, 0) <------>flag =
(flag &
>> ~O_ACCMODE) | OPEN_FMODE(flag); #endif <------>return flag; }
>> 
>> if i correctly understand - that change have a dependence to the
OPEN_FMODE macro - not a kernel version.
>> so for any kernels have a OPEN_FMODE - we need to use that function,
otherwise use flags directly.
>> 
>> ----------------------------------------------
>> Alexey Lyahkov
>> alexey_lyashkov at xyratex.com
>> 
>> 
>> _______________________________________________
>> Lustre-devel mailing list
>> Lustre-devel at lists.lustre.org
>> http://lists.lustre.org/mailman/listinfo/lustre-devel
>> 
> 
> ----------------------------------------------
> Alexey Lyahkov
> alexey_lyashkov at xyratex.com
> 
> 
> 
----------------------------------------------
Alexey Lyahkov
alexey_lyashkov at xyratex.com

Liu, Xuezhao

2012-Dec-21 13:44 UTC

head link

[Lustre-devel] OFLAGS change.

I agree about that is change is good example - but focusing on vanila kernel
development should be don''t break something else, a specially
add''s problems with supporting a RHEL, SuSe.

[Xuezhao] Sure, we must ensure the added code will not break any supporting or
to-be-supported vendor kernels, I will make a fixing patch if the break really
happens.
>>        # the implementation of cancel_dirty_page in OFED 1.4.1''s
SLES10 SP2
        # backport is broken, so ignore it
        if test -f $OFED_BACKPORT_PATH/linux/mm.h &&
           test "$(sed -ne ''/^static inline void
cancel_dirty_page(struct page \*page, unsigned int
account_size)$/,/^}$/p'' $OFED_BACKPORT_PATH/linux/mm.h | md5sum)"
= "c518cb32d6394760c5bc>>
that is from way you suggest.
that backport produce a live lock in lustre due impossible to have correct
emulation for a cancel_dirty_pages in OFED.
and sometimes be easy create a implementation for old API via new when rewrite a
lustre code to use new API and emulation for older kernels.

[Xuezhao] I may don''t know the full story of the above case, but I
understand your point.
          OFED emulates new API based on old API on old kernel, and emulates
linux API based on other OS'' primitives on other OS.
          But I think that in the above case it does not matter whether or not
have the "cfs_" prefix, even if OFED has
"ofd_cancel_dirty_page" it also cannot avoid the above.

Thanks,          
Xuezhao

Lustre devel - Dec 2012 - Important changes to libcfs primitives usage.

[Lustre-devel] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] OFLAGS change.

[Lustre-devel] OFLAGS change.

[Lustre-devel] OFLAGS change.

[Lustre-devel] [wc-discuss] Important changes to libcfs primitives usage.

[Lustre-devel] OFLAGS change.

[Lustre-devel] OFLAGS change.

[Lustre-devel] OFLAGS change.