Erik Froese
2010-Aug-20 13:22 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Is it safe to enable async journals on the OSSes while the filesystem is active? I'd like to see how it works for us.

Thanks,
Erik
Kevin Van Maren
2010-Aug-20 14:02 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Yes, but depending on the Lustre version there are several bugs in the async journal code.

Kevin

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
Erik Froese
2010-Aug-20 20:09 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Thanks, Kevin. This is 1.8.3.
Peter Jones
2010-Aug-21 00:15 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Really? I had heard of a serious bug before async journal commits were included in a release, but I had understood that things were working well at the few sites running it.
Erik Froese
2010-Aug-21 00:34 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
I remember something about a bug during recovery after a crash in 1.8.1.1.

Erik
Peter Jones
2010-Aug-21 00:36 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Quite. And async journal commits were officially added in 1.8.2.
David Dillow
2010-Aug-21 01:35 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
We've been quite happy with it. I recall a bug being reported against it, but IIRC that was actually a longstanding bug that predated async journals. My recollection is foggy, though, so perhaps Oleg can set the record straight.

--
Dave Dillow
National Center for Computational Science
Oak Ridge National Laboratory
(865) 241-6602 office
Erik Froese
2010-Aug-21 14:06 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
I think we're going to benefit from it.

Last question: my clients and routers are still running Lustre 1.8.1.1. The MDS/OSSes are on 1.8.3. Would you recommend a client upgrade before enabling async journals?

Erik
Oleg Drokin
2010-Aug-21 20:46 UTC
[Lustre-discuss] Enabling async journals while the filesystem is active
Hello!

You would need to upgrade your clients to at least 1.8.2; otherwise you might hit bug 19128 during replay, which would lead to losing some of the data being replayed. The version on the routers is not important for the async journals feature.

Bye,
Oleg
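For anyone wanting to check a client list against the 1.8.2 minimum mentioned above, here is a minimal sketch in Python. It assumes you have already collected each client's Lustre version as a plain dotted string; the hostnames, the dictionary layout, and the function names are hypothetical and not part of any Lustre tool. Routers are left out because their version does not matter for this feature.

```python
# Hypothetical helper: flag clients older than the minimum version
# recommended in this thread (1.8.2) before enabling async journals.

def parse_version(text):
    """Turn a dotted version string such as '1.8.1.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum client version, taken from the advice in the thread.
MIN_CLIENT_VERSION = parse_version("1.8.2")


def clients_needing_upgrade(client_versions):
    """Return the sorted hostnames whose version is below the minimum.

    client_versions maps a hostname to a dotted version string
    (an assumed input format, gathered however suits your site).
    """
    return sorted(
        host
        for host, version in client_versions.items()
        if parse_version(version) < MIN_CLIENT_VERSION
    )


if __name__ == "__main__":
    # Example inventory with made-up hostnames.
    inventory = {"client01": "1.8.1.1", "client02": "1.8.3"}
    for host in clients_needing_upgrade(inventory):
        print(host, "should be upgraded before enabling async journals")
```

Comparing tuples of integers rather than raw strings keeps a four-part version such as 1.8.1.1 correctly ordered below 1.8.2.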