nathan@clusterfs.com
2006-Dec-15 12:58 UTC
[Lustre-devel] [Bug 11089] lustre client statistics
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11089

Created an attachment (id=9151)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9151&action=view)
add per-export nid and uuid proc entries

This patch adds "nid" and "uuid" entries to the per-export proc directories, so
that users can at least determine what the nid is.

Nobody from ORNL seems to be subscribed to this bug - mjmac, do you know if they
are still interested? If we can get someone to say the additional "nid" file
within the per-export dir is acceptable, I'll go ahead and backport this to b1_4.
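For readers without access to the attachment, here is a minimal sketch of what such per-export "nid" and "uuid" proc files could look like, written against the plain 2.6-era procfs API rather than Lustre's lprocfs helpers. Every hypo_-prefixed name and field is invented for illustration; nothing below is taken from attachment 9151.

/*
 * Hypothetical sketch: read-only "nid" and "uuid" files under an existing
 * per-export /proc directory, using the 2.6-era procfs read_proc interface.
 */
#include <linux/kernel.h>
#include <linux/proc_fs.h>

struct hypo_export {                    /* stand-in for struct obd_export */
        char nid_str[64];               /* e.g. "192.168.0.7@tcp" */
        char uuid_str[40];              /* client UUID string */
        struct proc_dir_entry *proc_dir;
};

static int hypo_nid_read(char *page, char **start, off_t off,
                         int count, int *eof, void *data)
{
        struct hypo_export *exp = data;

        *eof = 1;
        return snprintf(page, count, "%s\n", exp->nid_str);
}

static int hypo_uuid_read(char *page, char **start, off_t off,
                          int count, int *eof, void *data)
{
        struct hypo_export *exp = data;

        *eof = 1;
        return snprintf(page, count, "%s\n", exp->uuid_str);
}

/* called once the per-export proc directory itself exists */
static int hypo_export_add_entries(struct hypo_export *exp)
{
        if (!create_proc_read_entry("nid", 0444, exp->proc_dir,
                                    hypo_nid_read, exp))
                return -ENOMEM;
        if (!create_proc_read_entry("uuid", 0444, exp->proc_dir,
                                    hypo_uuid_read, exp))
                return -ENOMEM;
        return 0;
}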
Yes, we are interested. I have not been able to track the bug since it was
created, and I contacted mjmac earlier about this as well. Let me know if you
could create a new account or correct the permissions so that I can view the
bug entry.

Weikuan

nathan@clusterfs.com wrote:
> This patch adds "nid" and "uuid" entries to the per-export proc directories,
> so that users can at least determine what the nid is.
>
> Nobody from ORNL seems to be subscribed to this bug - mjmac, do you know if
> they are still interested?
nathan@clusterfs.com
2006-Dec-18 17:33 UTC
[Lustre-devel] [Bug 11089] lustre client statistics
Created an attachment (id=9168)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9168&action=view)
prevent race during eviction

While testing the per-export nid patches I found a race in the export proc
create/delete path. The attached patch fixes this; it has been checked in to
b1_5.
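Without seeing attachment 9168, a rough illustration of the kind of create-vs-destroy race an eviction can trigger, and the usual lock-plus-flag pattern for closing it, is below. All names are invented for the sketch; the real fix on b1_5 may look quite different.

/*
 * Hypothetical race: an eviction tears the export down while the connect
 * path is still registering its proc directory, leaving a dangling
 * proc_dir_entry.  A per-export lock and a "destroyed" flag serialize the
 * two paths.
 */
#include <linux/proc_fs.h>
#include <linux/spinlock.h>

struct hypo_export {
        spinlock_t             lock;
        int                    destroyed;   /* set once eviction begins */
        struct proc_dir_entry *proc_dir;
        struct proc_dir_entry *proc_parent;
        char                   name[64];
};

static void hypo_export_proc_create(struct hypo_export *exp)
{
        struct proc_dir_entry *dir = proc_mkdir(exp->name, exp->proc_parent);

        spin_lock(&exp->lock);
        if (exp->destroyed) {
                /* eviction won the race: drop the entry we just created */
                spin_unlock(&exp->lock);
                if (dir)
                        remove_proc_entry(exp->name, exp->proc_parent);
                return;
        }
        exp->proc_dir = dir;
        spin_unlock(&exp->lock);
}

static void hypo_export_proc_destroy(struct hypo_export *exp)
{
        struct proc_dir_entry *dir;

        spin_lock(&exp->lock);
        exp->destroyed = 1;
        dir = exp->proc_dir;
        exp->proc_dir = NULL;
        spin_unlock(&exp->lock);

        if (dir)
                remove_proc_entry(exp->name, exp->proc_parent);
}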
(In reply to comment #4)
> This patch adds "nid" and "uuid" entries to the per-export proc directories,
> so that users can at least determine what the nid is.

I just got permission to view this thread. This patch appears to defeat the
purpose of creating a NID entry in the proc tree. The reason we need the NID is
to preserve all stats even after an export is gone, i.e., after a Linux client
does an umount or a liblustre client exits.
Created an attachment (id=9169)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9169&action=view)
bug fix for kernel panic in export cleanup; zero out initial brw struct

This fixes a kernel panic in export cleanup and also zeroes out the initial brw
struct so that it reads correctly from the beginning; the two changes should
have been separated, though.
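The brw-stats zeroing is easy to picture; a minimal sketch of why init has to clear the counters before the first /proc read is below. The struct layout is an invented simplification, not the real struct brw_stats.

/*
 * If the histogram counters are not cleared before the first reader
 * arrives, /proc shows whatever the allocator left behind.
 */
#include <linux/spinlock.h>
#include <linux/string.h>

#define HYPO_BRW_HIST_MAX 32

struct hypo_brw_stats {
        spinlock_t    lock;
        unsigned long read_pages[HYPO_BRW_HIST_MAX];
        unsigned long write_pages[HYPO_BRW_HIST_MAX];
};

static void hypo_init_brw_stats(struct hypo_brw_stats *stats)
{
        memset(stats, 0, sizeof(*stats));   /* clear every counter up front */
        spin_lock_init(&stats->lock);
}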
(In reply to comment #4)

Some comments after going through the history of this bug entry:

-- Agree on organizing the per-client stat structs in an OBD-neutral way.
-- The brw stats need to be zeroed in init_brw_stats.
-- There is a bug in the proc create/delete path. Our fix is different; not
   sure if the CFS fix as posted in Comment #4 is based on the patch posted
   in Comment?
-- Fixes for the above two are posted as attachment id=9169.

Overall: we need to sync up our efforts. Our fixes have been tested and have
been running on an 80-node Linux system for about 2 months. It would be great
if we could converge these into a final patch for the Cray machines at ORNL.
(In reply to comment #9)
> > I just got permission to view this thread. This patch appears to defeat the
> > purpose of creating a NID entry in the proc tree. The reason we need the NID
> > is to preserve all stats even after an export is gone, i.e., after a Linux
> > client does an umount or a liblustre client exits.
>
> If you save the stats after the client disappears, you are effectively losing
> about 8k every time a client restarts. This will rapidly cause problems on
> large systems. Or are you talking about re-using the same stats based on the
> nid? If so, then this fails if you have two clients mounted on the same node
> (a fully-supported situation), where they will overwrite each other.
> I think I need to understand your goals before trying to work out any
> implementation.

The proc tree is organized like this:

+- nid --> stats...
   |
   + exp#1 --> stats...
   |
   + exp#2 --> stats...

So it would avoid both situations you mentioned earlier, at the cost of a
little more memory for each NID. We consider that worthwhile unless there are
concrete numbers showing it scales poorly.
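A hedged sketch of the data structure that layout implies: one long-lived record per NID holding the running totals, refcounted by however many exports currently share that NID. All hypo_-prefixed names are invented for illustration; this is not the ORNL patch.

/*
 * Per-NID stats record: shared by every export from that NID, kept on a
 * server-lifetime list so the totals survive client unmount/exit.
 */
#include <linux/list.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/string.h>

struct hypo_nid_stat {
        struct list_head link;         /* on the server's per-NID list */
        char             nid[64];      /* client NID as a string */
        int              refcount;     /* exports currently using it */
        unsigned long    totals[16];   /* running totals, never freed */
};

static LIST_HEAD(hypo_nid_list);
static DEFINE_SPINLOCK(hypo_nid_lock);

/* find or create the per-NID record when an export connects */
static struct hypo_nid_stat *hypo_nid_stat_get(const char *nid)
{
        struct hypo_nid_stat *ns, *new;

        new = kzalloc(sizeof(*new), GFP_KERNEL);
        if (!new)
                return NULL;
        strlcpy(new->nid, nid, sizeof(new->nid));
        new->refcount = 1;

        spin_lock(&hypo_nid_lock);
        list_for_each_entry(ns, &hypo_nid_list, link) {
                if (!strcmp(ns->nid, nid)) {
                        ns->refcount++;             /* NID already known */
                        spin_unlock(&hypo_nid_lock);
                        kfree(new);
                        return ns;
                }
        }
        list_add(&new->link, &hypo_nid_list);       /* first export for NID */
        spin_unlock(&hypo_nid_lock);
        return new;
}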
nathan@clusterfs.com
2006-Dec-19 09:46 UTC
[Lustre-devel] [Bug 11089] lustre client statistics
(In reply to comment #17)
> The proc tree is organized like this:
>
> +- nid --> stats...
>    |
>    + exp#1 --> stats...
>    |
>    + exp#2 --> stats...

Aha, so you want to destroy the export proc stats when the export disappears,
but _also_ keep a running total of per-nid stats, which lives for the life of
the server. Is that right? (I'm happier about that because it doesn't burn
memory every time a client restarts.)

Also, FYI, we're putting proc through a fairly major overhaul to deal with bug
10866, so we should wait for the dust to settle there before trying to land
this.
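If that reading is right, the disconnect path would look roughly like the sketch below: fold the export's counters into the per-NID totals, free the per-export stats, and leave the per-NID record alive for the life of the server. Again, every name here is hypothetical, not taken from either patch.

/*
 * Hypothetical disconnect path: merge per-export counters into the
 * long-lived per-NID totals, then free the per-export allocation so a
 * restarting client costs no extra memory.
 */
#include <linux/slab.h>
#include <linux/spinlock.h>

#define HYPO_NSTATS 16

struct hypo_nid_stat {
        spinlock_t    lock;
        int           refcount;               /* live exports for this NID */
        unsigned long totals[HYPO_NSTATS];    /* never freed */
};

struct hypo_export {
        struct hypo_nid_stat *nid_stat;       /* shared per-NID record */
        unsigned long        *stats;          /* per-export counters */
};

static void hypo_export_disconnect(struct hypo_export *exp)
{
        struct hypo_nid_stat *ns = exp->nid_stat;
        int i;

        spin_lock(&ns->lock);
        for (i = 0; i < HYPO_NSTATS; i++)
                ns->totals[i] += exp->stats[i];
        ns->refcount--;                /* record stays even at refcount 0 */
        spin_unlock(&ns->lock);

        kfree(exp->stats);             /* per-export stats die with the export */
        exp->stats = NULL;
        exp->nid_stat = NULL;
}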
(In reply to comment #12)
> Aha, so you want to destroy the export proc stats when the export disappears,
> but _also_ keep a running total of per-nid stats, which lives for the life of
> the server. Is that right?

Right.

> (I'm happier about that because it doesn't burn memory every time a client
> restarts.)
>
> Also, FYI, we're putting proc through a fairly major overhaul to deal with
> bug 10866, so we should wait for the dust to settle there before trying to
> land this.

Sounds good. Looking forward to that.