Satoshi Isono
2011-Feb-11 03:16 UTC
[Lustre-discuss] How to detect process owner on client
Dear members,

I am looking for a way to detect the userid or jobid of processes on a Lustre client. Assume the following conditions:

1) Users run jobs through a scheduler such as PBS Pro, LSF or SGE.
2) One user's processes dominate Lustre I/O.
3) Some Lustre servers (MDS?/OSS?) can detect the high I/O stress on each server.
4) But the Lustre servers cannot map a jobid/userid to the Lustre I/O processes causing the heavy load, because there is no userid on the Lustre servers.
5) I would like Lustre to monitor this and make that mapping.
6) If (5) is possible, we can write a script which launches a scheduler command such as qdel.
7) The heavy user's job is then killed by the job scheduler.

I want (5) as a Lustre capability, but I guess the current Lustre 1.8 cannot do (5). On the other hand, in order to map a Lustre process to a userid/jobid, are there any ways using something like rpctrace or nid stats? Could you please give your advice or comments?

Regards,
Satoshi Isono
Michael Kluge
2011-Feb-11 06:18 UTC
[Lustre-discuss] How to detect process owner on client
Hi Satoshi,

I am not aware of any possibility to map the current statistics in /proc to UIDs, but I might be wrong. A while ago we had a script which did not kill the I/O-intensive processes but told us their PIDs. What we did was collect, for ~30 seconds, the number of I/O operations per node via /proc on all nodes. Then we attached an strace process to each process on the nodes with heavy I/O load. This strace intercepted only the I/O calls and wrote one log file per process. If strace runs for the same amount of time for each process on a host, you just need to sort the log files by size.

Regards, Michael

--
Michael Kluge, M.Sc.
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
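A rough shell sketch of this two-stage approach (sample the client-side /proc counters, then strace the processes on the busy node) might look like the following. The stats path under /proc/fs/lustre/llite/*/stats, the 30-second windows, and the use of coreutils "timeout" are assumptions for illustration, not the original script:

#!/bin/sh
# Sketch: find heavy I/O processes on one client node.
# Assumptions: Lustre 1.8 client counters under /proc/fs/lustre/llite/*/stats,
# coreutils "timeout" available, run as root.

# Stage 1: sample the client-side read/write call counters for ~30 seconds.
count_io() {
    awk '$1 == "read_bytes" || $1 == "write_bytes" {n += $2} END {print n+0}' \
        /proc/fs/lustre/llite/*/stats
}
before=$(count_io); sleep 30; after=$(count_io)
echo "I/O calls in the last 30s: $((after - before))"
# If the delta is above a site-defined threshold, continue with stage 2.

# Stage 2: attach strace to every process for the same interval, intercepting
# only I/O syscalls, one log per PID; the largest logs mark the heaviest
# I/O processes.
for pid in $(ls /proc | grep '^[0-9][0-9]*$'); do
    timeout 30 strace -e trace=read,write,readv,writev \
        -o /tmp/iotrace.$pid -p "$pid" 2>/dev/null &
done
wait
ls -lS /tmp/iotrace.* | head        # biggest log files = heaviest I/O processes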
Andreas Dilger
2011-Feb-11 16:34 UTC
[Lustre-discuss] How to detect process owner on client
On 2011-02-10, at 23:18, Michael Kluge wrote:
> I am not aware of any possibility to map the current statistics in /proc to UIDs. But I might be wrong.

On the OSS and MDS nodes there are per-client statistics that allow this kind of tracking. They can be seen in /proc/fs/lustre/obdfilter/*/exports/*/stats for detailed information (e.g. broken down by RPC type, bytes read/written), or in /proc/fs/lustre/ost/OSS/*/req_history to get a dump of the recent RPCs sent by each client.

A little script was discussed in the thread "How to determine which lustre clients are loading filesystem" (2010-07-08):

> Another way that I heard some sites were doing this is to use the "rpc history". They may already have a script to do this, but the basics are below:
>
> oss# lctl set_param ost.OSS.*.req_buffer_history_max=10240
> {wait a few seconds to collect some history}
> oss# lctl get_param ost.OSS.*.req_history
>
> This will give you a list of the past (up to) 10240 RPCs for the "ost_io" RPC service, which is what you are observing the high load on:
>
> 3436037:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957534353:448:Complete:1278612656:0s(-6s) opc 3
> 3436038:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957536190:448:Complete:1278615489:1s(-41s) opc 3
> 3436039:192.168.20.1@tcp:12345-192.168.20.159@tcp:x1340648957536193:448:Complete:1278615490:0s(-6s) opc 3
>
> This output is in the format:
>
> identifier:target_nid:source_nid:rpc_xid:rpc_size:rpc_status:arrival_time:service_time(deadline) opcode
>
> Using some shell scripting, one can find the clients sending the most RPC requests:
>
> oss# lctl get_param ost.OSS.*.req_history | tr ":" " " | cut -d" " -f3,9,10 | sort | uniq -c | sort -nr | head -20
>
> 3443 12345-192.168.20.159@tcp opc 3
> 1215 12345-192.168.20.157@tcp opc 3
>  121 12345-192.168.20.157@tcp opc 4
>
> This will give you a sorted list of the top 20 clients that are sending the most RPCs to the ost and ost_io services, along with the operation being done (3 = OST_READ, 4 = OST_WRITE, etc.; see lustre/include/lustre/lustre_idl.h).

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
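The req_history sorting identifies the busiest client NIDs; mapping a NID back to a hostname, and from there to a job, still needs the scheduler's node-to-job information. A minimal sketch under stated assumptions (TCP NIDs whose IP part has a reverse DNS entry, and a PBS/TORQUE-style scheduler where "pbsnodes <hostname>" lists the jobs running on a node; adapt for LSF/SGE):

# Sketch: resolve the busiest client NID to a hostname and its scheduler jobs.
top_nid=$(lctl get_param ost.OSS.*.req_history | tr ":" " " | cut -d" " -f3 |
          sort | uniq -c | sort -nr | head -1 | awk '{print $2}')
ip=${top_nid#*-}; ip=${ip%@*}                   # "12345-192.168.20.159@tcp" -> "192.168.20.159"
host=$(getent hosts "$ip" | awk '{print $2}')   # reverse-map the IP to a hostname
echo "busiest client: $host ($top_nid)"
pbsnodes "$host" | grep -i jobs                 # then qstat/qdel the offending jobid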
Michael Kluge
2011-Feb-11 18:09 UTC
[Lustre-discuss] How to detect process owner on client
But it does not give you PIDs or user names, does it? Or is there a way to find these with standard Lustre tools?

Michael
Andreas Dilger
2011-Feb-11 20:59 UTC
[Lustre-discuss] How to detect process owner on client
On 2011-02-11, at 11:09, Michael Kluge wrote:
> But it does not give you PIDs or user names? Or is there a way to find these with standard lustre tools?

I think for most purposes the req_history should be enough to identify the problem node, and then a simple "ps" or a look at the job scheduler for that node would identify the problem.

However, if process-level tracking is needed, it is also possible to track this on either the client or the server, using the RPCTRACE functionality in the Lustre kernel debug logs:

client# lctl set_param debug=+rpctrace
{wait to collect some logs}
client# lctl dk /tmp/debug.cli
client# less /tmp/debug.cli
:
00000100:00100000:1:1297449409.192077:0:32392:0:(client.c:2095:ptlrpc_queue_wait()) Sending RPC pname:cluuid:pid:xid:nid:opc ls:028fd87f-1865-3915-a864-fc829a4d7a4c:32392:x1359928498575499:0@lo:37
:

The "pname:cluuid:pid:xid:nid:opc" string lists the names of the fields being printed in the RPCTRACE message. We are particularly interested in the "pname" and "pid" fields, and maybe "opc" (opcode). This shows that "ls", pid 32392, is sending an opcode 37 request (MDS_READPAGE, per lustre/include/lustre/lustre_idl.h). This RPC is identified by xid "x1359928498575499" on client UUID "028fd87f-1865-3915-a864-fc829a4d7a4c". The xid is not guaranteed to be unique between clients, but is relatively unique in most debug logs.

On the server we can see the same RPC in the debug logs:

server# lctl dk /tmp/debug.mds
server# grep ":x1359928498575499:" /tmp/debug.mds
00000100:00100000:1:1297449409.192178:0:5174:0:(service.c:1276:ptlrpc_server_log_handling_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_mdt_rdpg_00:028fd87f-1865-3915-a864-fc829a4d7a4c+6:32392:x1359928498575499:12345-0@lo:37

Here we see that the RPC for this xid and client pid was processed by a service thread; the server-side debug log does not carry the client process name, but rather the process name of the thread handling the RPC. In any case, it is definitely possible to track down this information just from the server in a variety of ways.

Adding a job identifier, and possibly a rank number, to the Lustre RPC messages is definitely something that we've thought about, but it would need help from userspace (MPI, job scheduler, etc.) in order to be useful, so it hasn't been done yet.

Cheers, Andreas
--
Andreas Dilger
Principal Engineer
Whamcloud, Inc.
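To summarize a client-side rpctrace log per process, a short pipeline along the following lines may be enough. It is a sketch that assumes the "Sending RPC ... pname:cluuid:pid:xid:nid:opc" line layout shown above, which can differ between Lustre versions:

# Sketch: count client RPCs per (process name, pid, opcode) from an rpctrace log.
lctl dk /tmp/debug.cli |
  awk '/Sending RPC/ {print $NF}' |                  # last field, e.g. ls:<cluuid>:32392:x...:0@lo:37
  awk -F: '{printf "%s pid %s opc %s\n", $1, $3, $6}' |
  sort | uniq -c | sort -nr | head -20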
Satoshi Isono
2011-Feb-14 10:04 UTC
[Lustre-discuss] How to detect process owner on client
Dear Andreas, Michael,

Thanks for your messages. They are very useful to me. I will try this out.

Regards,
Satoshi Isono
John Hammond
2011-Feb-15 22:17 UTC
[Lustre-discuss] How to detect process owner on client
I've written a utility called lltop which gathers I/O statistics from Lustre servers, along with job assignment data from cluster batch schedulers, to give a job-by-job accounting of filesystem load. Here is its output, with names changed to protect the innocent:

$ sudo tacc_lltop work
  JOBID  WR_MB  RD_MB  REQS  OWNER  WORKDIR
1823815   2101      0  4176  al     /work/000/al/job1
1823060    774      0  1570  bob    /work/000/bob/fftw
1823634    323      3  3244  chas   /work/000/chas/boltzeq
1823768    289      0  5108  deb    /work/000/deb/mesh-08
1823085     55      0   110  ed     /work/000/ed/jumble
 login3     18      3  2961

We use it on several systems, only with SGE so far, but it is hookable to other schedulers. See https://github.com/jhammond/lltop for source and documentation.

Best,
John
--
John L. Hammond, Ph.D.
TACC, The University of Texas at Austin
Ashley Pittman
2011-Feb-16 08:56 UTC
[Lustre-discuss] How to detect process owner on client
On 15 Feb 2011, at 22:17, John Hammond wrote:
> I've written a utility called lltop which gathers I/O statistics from Lustre servers, along with job assignment data from cluster batch schedulers, to give a job-by-job accounting of filesystem load.
> See https://github.com/jhammond/lltop for source and documentation.

That looks very useful! We won't be able to use this directly at DDN because we don't integrate with the right bits of the stack, but I'll make sure our HPC customers hear about it if they are looking for this kind of data. I also have some code which would work with other schedulers, if people are interested.

Ashley
Sebastien Piechurski
2011-Mar-01 11:40 UTC
[Lustre-discuss] How to detect process owner on client
Hi Satoshi,

I don't have a complete solution to your problem, but I have written a script which lets me find at least the Lustre client responsible for the bad I/Os. We are using PBS Pro with nodes set as job-exclusive, so determining the job and user is then much easier.

The script does the following:

1) Dump the attributes in /proc/fs/lustre/obdfilter/*/exports/*/stats.
2) Sleep a few seconds (tunable).
3) Dump all the attributes again, and use diff to see which clients changed their I/O count.

These changes are then sorted numerically. The final result is a list of IP addresses with the number of I/Os done during the sleep period for each. The last one in the list (because it is sorted) points to the responsible client(s).

Hope the method helps.
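A minimal sketch of that sample/sleep/compare approach, run on an OSS, might look like this. The handling of the stats format is an assumption (it sums the per-RPC request counts in each per-export stats file and keys them by the export's NID directory name), so adjust it to the output of your Lustre version:

#!/bin/sh
# Sketch: rank client NIDs by how much their per-export request counts grew.
# Assumption: one directory per client NID under obdfilter/*/exports, each
# containing a "stats" file whose second column is a request/sample count.
INTERVAL=${1:-10}

snapshot() {
    for f in /proc/fs/lustre/obdfilter/*/exports/*/stats; do
        nid=$(basename $(dirname "$f"))
        # sum all per-RPC counters for this export into one number
        awk -v nid="$nid" '$1 != "snapshot_time" {n += $2} END {print nid, n+0}' "$f"
    done
}

snapshot > /tmp/lustre_io.before
sleep "$INTERVAL"
snapshot > /tmp/lustre_io.after

# print "delta nid", sorted numerically so the busiest client comes last
awk 'NR == FNR {before[$1] = $2; next} {print $2 - before[$1], $1}' \
    /tmp/lustre_io.before /tmp/lustre_io.after | sort -n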