Simply, I would like to be able to access Lustre client data statistics for each filesystem that exclude statistics for cached reads.

It is my understanding that for the Lustre client (at least on version 1.6.4.2) the /proc Lustre stats for the client for each fs (/proc/fs/lustre/llite/FSNAME/stats) report the total number of bytes read (reported in the line read_bytes), and that this number also includes the total number of bytes read using client-cached data.

First of all, is my understanding correct for this version of Lustre? And does this also apply to newer versions?

It has been suggested to subtract the Lustre IO from the network IO to get this data, but this is only applicable if the network is dedicated to Lustre IO, which is not the case.

For the moment it seems only the number of cached reads is being reported in /proc, and not the actual sizes, so this -seems- difficult or impossible.

Is there another way (perhaps even in a newer version of Lustre) to find the true read rate for the Lustre client that excludes cached reads?

--
John Parhizgari
After doing a few experimental read tests while examining /proc/fs/lustre/osc/OSTID/stats on the Lustre client:

It seems that the 7th column of the stats file, on the ost_read line, represents the total bytes read for all the RPC calls involving reading from that OST up until now (2773483520 in the example below).

    ost_read 91655 samples [usec] 4096 1048576 2773483520 1759238200229888

Is this an accurate description of ost_read?

If so, does it make sense to use this information to track and analyze the true read stats against OSTs and filesystems (since the llite stats include client-cached data reads)? And then also similarly for ost_write, since we essentially want Lustre's network activity to the filesystem; if the above holds, this will be representative of how much data is really being transferred to/from the Lustre filesystems for each client.

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss at lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss
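As an illustration of the column layout described in the message above, here is a minimal Python sketch that splits the quoted ost_read line into named fields. The field names (`samples`, `minimum`, `maximum`, `total`, `sum_squares`) are labels chosen for this sketch, not names used by Lustre, and reading the 8th column as a sum of squares is an assumption. Only the 7th column (total bytes) is the interpretation discussed in this thread.

```python
from typing import NamedTuple, Optional


class StatsLine(NamedTuple):
    name: str
    samples: int                 # 2nd column: number of samples (RPCs)
    unit: str                    # 4th column, e.g. "usec"
    minimum: int                 # 5th column
    maximum: int                 # 6th column
    total: int                   # 7th column: cumulative sum (bytes for ost_read here)
    sum_squares: Optional[int]   # 8th column, if present (assumed meaning)


def parse_stats_line(line: str) -> StatsLine:
    """Split one 'NAME N samples [unit] min max sum [sumsq]' line into fields."""
    fields = line.split()
    if len(fields) < 7 or fields[2] != "samples":
        raise ValueError(f"unexpected stats line: {line!r}")
    return StatsLine(
        name=fields[0],
        samples=int(fields[1]),
        unit=fields[3].strip("[]"),
        minimum=int(fields[4]),
        maximum=int(fields[5]),
        total=int(fields[6]),
        sum_squares=int(fields[7]) if len(fields) > 7 else None,
    )


# The example line quoted in the message above.
example = "ost_read 91655 samples [usec] 4096 1048576 2773483520 1759238200229888"
parsed = parse_stats_line(example)
print("total bytes read:", parsed.total)
print("average bytes per read RPC:", parsed.total / parsed.samples)
```

The minimum and maximum in the example (4096 and 1048576) look like RPC sizes in bytes rather than latencies, which is consistent with reading the 7th column as a byte total despite the [usec] label.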
On Aug 01, 2008 14:29 -0400, John Parhizgari wrote:
> After doing a few experimental read tests while examining
> /proc/fs/lustre/osc/OSTID/stats on the lustre client:
>
> It seems to be that the 7th column of the stats file, on the ost_read
> line, represents the total bytes read for all the RPC calls involving
> reading from that OST up until now (2773483520 in the example below).
> ost_read 91655 samples [usec] 4096 1048576 2773483520
> 1759238200229888
>
> Is this an accurate description of ost_read?

Yes, though unfortunately this statistic was "hijacked" from being the per-RPC latency (hence [usec] units). I've attached a patch to bug 16573 that fixes this, though it is still just a prototype.

> If so, does it make sense to use information to track and analyze the
> true read stats against OSTs and filesystems (since the llite stats
> includes client-cached data reads)? And then also similarly for
> ost_write, since we want essentially Lustre's network activity to the
> filesystem; if the above holds this will be representative of how much
> data is really being transferred to/from the lustre filesystems for each
> client.

The above patch also tracks these stats on the OST. See also "brw_stats" and "rpc_stats" on the obdfilter devices.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
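Given the confirmation above that the 7th column of ost_read is a cumulative byte count, one possible way to estimate per-client transfer rates is to snapshot each OSC stats file twice and divide the difference by the interval. The following is a sketch only: it assumes ost_write uses the same column layout, the OSC names and stats contents below are made up for illustration, and the layout may differ once the patch in bug 16573 is applied.

```python
def ost_byte_totals(stats_text: str) -> dict:
    """Return the cumulative 7th-column value for ost_read and ost_write."""
    totals = {"ost_read": 0, "ost_write": 0}
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) >= 7 and fields[0] in totals:
            totals[fields[0]] = int(fields[6])
    return totals


def transfer_rates(before: dict, after: dict, interval_s: float) -> dict:
    """Bytes/second over all OSCs; before/after map an OSC name to stats text."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    deltas = {"ost_read": 0, "ost_write": 0}
    for osc, text_after in after.items():
        new = ost_byte_totals(text_after)
        old = ost_byte_totals(before.get(osc, ""))
        for key in deltas:
            delta = new[key] - old[key]
            # Assumption: a negative delta means the counter was reset,
            # so only the new value is counted.
            deltas[key] += delta if delta >= 0 else new[key]
    return {key: value / interval_s for key, value in deltas.items()}


# Synthetic snapshots taken 8 seconds apart (hypothetical OSC names and values).
before = {
    "osc-a": "ost_read 10 samples [usec] 4096 1048576 1000 0\n"
             "ost_write 5 samples [usec] 4096 4096 500 0\n",
    "osc-b": "ost_read 1 samples [usec] 4096 4096 4096 0\n",
}
after = {
    "osc-a": "ost_read 20 samples [usec] 4096 1048576 3000 0\n"
             "ost_write 9 samples [usec] 4096 4096 1500 0\n",
    "osc-b": "ost_read 2 samples [usec] 4096 4096 8192 0\n",
}
rates = transfer_rates(before, after, interval_s=8.0)
print(rates)
```

On a real client the snapshot text would presumably come from reading each /proc/fs/lustre/osc/OSTID/stats file; that step is left out so the sketch runs without a Lustre mount.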