Hello Michael,

There is a logfile parser script in the attachment, which was written by Eric.

The script is very simple, but it should help you understand how we retrieve timestamp information from the Lustre debug log. On the server side, if you enable the rpc_trace log, a record is written to the debug log whenever a request arrives, starts being processed, or finishes processing. You can get all the timestamp information from these records (actually, only two of these record types are enough).

a. 00000100:00100000:0.0:1272313858.472660:0:31581:0:(service.c:1625:ptlrpc_server_handle_request()) Handling RPC pname:cluuid+ref:pid:xid:nid:opc ll_mgs_00:7d4fb15c-1b1c-295f-e466-ea7d77089b52+10:4055:x1334115493386242:12345-0 at lo:400

This record means the request is about to be handled, so you can get the start timestamp (1272313858.472660), the operation type (opc 400, ping), the xid (1334115493386242), the client nid (12345-0 at lo) and so on.

b. 00000100:00100000:0.0:1272313858.472687:0:31581:0:(service.c:1672:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_mgs_00:7d4fb15c-1b1c-295f-e466-ea7d77089b52+9:4055:x1334115493386242:12345-0 at lo:400 Request procesed in 45us (77us total) trans 0 rc 0/0

This record means the request has been handled, so you can get the end timestamp (1272313858.472687), the operation type (opc 400, ping), the xid (1334115493386242), the client nid (12345-0 at lo) and so on.

Note: "(77us total)" is the time from the request arriving to the end of processing, so you can also recover the request arrival timestamp here as the end timestamp minus the total time: 1272313858.472687 - 77us = 1272313858.472610.

With this information you can draw the graph Eric mentioned in his email. If you have any questions, please let me know.

Thanks
WangDi

Eric Barton wrote:
> Subject: RE: Visualising Lustre RPCs
> To: "Michael Kluge" <Michael.Kluge at tu-dresden.de>, "wangdi" <Tom.Wang at Sun.COM>
> CC: <lustre-devel at lists.lustre.org>, "Galen M. Shipman" <gshipman at ornl.gov>
>
> Michael,
>
> The previous Lustre RPC visualisation effort I mentioned at the LUG used the
> Lustre debug log entries of type D_RPCTRACE. We disabled all but these
> log messages and then used the Lustre debug daemon to collect them while
> we ran I/O tests. We then ran a simple logfile parser which used just the
> log entries for request arrival, start of processing and end of processing
> to graph request queue depth (arrival->end) and the number of requests
> being serviced by type over time, e.g.
>
> [inline image: read3d]
>
> ...which shows request queue depth (vertical) over time (axis labelled 20-25)
> by server (axis labelled 0-80).
>
> *From:* Michael Kluge [mailto:Michael.Kluge at tu-dresden.de]
> *Sent:* 17 April 2010 6:26 AM
> *To:* Galen M. Shipman; Eric Barton
> *Subject:* Visualising Lustre RPCs
>
> Hi Galen, Eric,
>
> in order to get this little project started, I think what I need first in
> order to write a prototype for a converter is the following:
>
> A set of test traces collected on maybe a handful of clients and some servers
> is probably a good point to start with. It would be even better if we know
> what is in these traces, so that we have an expectation of what kind of things
> we want to see on the Vampir displays. The little program that Eric mentioned
> that can read the trace files would be very helpful as well. And the last
> thing I can think of right now is a technical contact. I might come up with a
> couple of questions after I have taken a first look at the original trace data
> and before I start writing code.
>
> Regards, Michael
>
> --
> Michael Kluge, M.Sc.
>
> Technische Universität Dresden
> Center for Information Services and High Performance Computing (ZIH)
> D-01062 Dresden
> Germany
>
> Contact:
> Willersbau, Room WIL A 208
> Phone: (+49) 351 463-34217
> Fax: (+49) 351 463-37773
> e-mail: michael.kluge at tu-dresden.de
> WWW: http://www.tu-dresden.de/zih

[Attachment: rpc_parse.tgz, 30287 bytes]
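For illustration, here is a minimal sketch in Python of how the two record types above could be parsed. This is not the attached rpc_parse script; the field positions and regular expressions are assumptions based only on the sample lines quoted in this thread, and a real parser would need to handle nid and opcode extraction more carefully.

    import re

    # Matches the tail of a "Handled RPC" record; the spelling "procesed"
    # is copied verbatim from the log lines quoted above.
    HANDLED_RE = re.compile(r"Request procesed in (\d+)us \((\d+)us total\)")

    def parse_rpctrace(line):
        """Parse a 'Handling RPC' / 'Handled RPC' line into a dict, else None."""
        if "Handling RPC" not in line and "Handled RPC" not in line:
            return None
        header = line.split(":", 8)      # 00000100:00100000:0.0:<timestamp>:...
        ts = float(header[3])            # e.g. 1272313858.472660
        xid = re.search(r"x(\d+)", line).group(1)
        rec = {"timestamp": ts, "xid": xid}
        if "Handled RPC" in line:
            rec["event"] = "handled"
            m = HANDLED_RE.search(line)
            if m:
                rec["processing_us"] = int(m.group(1))
                rec["total_us"] = int(m.group(2))
                # arrival time = end-of-processing timestamp minus the total time
                rec["arrival"] = ts - rec["total_us"] / 1e6
        else:
            rec["event"] = "handling"
        return rec

Feeding record b from WangDi's message through this function would yield an arrival timestamp of 1272313858.472610, matching the note above.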
AFAIK the "Handling" time is when request processing starts, not when the request first arrived at the server. We would need to check if there is an RPCTRACE message from ptlrpc_handle_req_in() that we could use to determine the initial request arrival time.

Cheers, Andreas

On 2010-04-28, at 19:25, "di.wang" <di.wang at oracle.com> wrote:
> [...]
Andreas Dilger wrote:
> AFAIK the "Handling" time is when request processing starts, not when the
> request first arrived at the server. We would need to check if there is an
> RPCTRACE message from ptlrpc_handle_req_in() that we could use to determine
> the initial request arrival time.

Oh, you can also get the arrival time from the "Handled RPC" msg, as the "Handled RPC" timestamp minus the total time.

Thanks
WangDi
Hi WangDi,

OK, thanks for the input. I'll go ahead and try to write a converter. Could you please collect a set of test traces that belong together, from a couple of servers and clients, and put them somewhere so that I can download them?

Thanks, Michael

Am 29.04.2010 03:25, schrieb di.wang:
> [...]
Hi WangDi,

I have a small problem understanding the logical order of the events. Here is one example from the logs where the client reports an RPC as completed before the server says that it has finished the RPC handling. The difference is about 1.5 ms.

Is that due to the fact that the server waits until the client has acked the result of the RPC? Or are the clocks between the servers not well synchronized? Or are the timestamps in the logfile sometimes not correct (log message could not be flushed, or whatever)? In the first case I don't have a chance to figure out when the server has finished working on the RPC and is going to send out the result, correct?

../debug_log/rpc_client.debug:00000100:00100000:1.0:1272565278.263430:0:3479:0:(client.c:1532:ptlrpc_check_set()) Completed RPC pname:cluuid:pid:xid:nid:opc ptlrpcd-brw:3f7723dc-06a3-0b5e-9e96-f5b86e8d1b1c:3479:1334380768266050:10.8.0.126 at tcp:4

../debug_log/rpc_oss.debug:00000100:00100000:1.0:1272565278.265039:0:7385:0:(service.c:1672:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_io_24:3f7723dc-06a3-0b5e-9e96-f5b86e8d1b1c+27:3479:x1334380768266050:12345-10.8.0.104 at tcp:4 Request procesed in 238391us (239698us total) trans 266 rc 0/0

For this RPC I see in the OSS log:

1272565278.025337: incoming x1334380768266050
1272565278.026649: Handling RPC x1334380768266050
1272565278.265039: Handled RPC x1334380768266050: Request procesed in 238391us (239698us total)

So, by doing the math I get: 265039 - 026649 = 238390 (just 1 off target, might be whatever) and 265039 - 025337 = 239702 (4 off target). So I guess the times reported by the last message are calculated from a different data source than the log timestamps?

Regards, Michael

Am 29.04.2010 20:27, schrieb di.wang:
> Hello, Michael
>
> Here is a small debug log example you can use for your development.
>
> Thanks
> WangDi
> Michael Kluge wrote:
>> [...]
On 2010-05-03, at 04:41, Michael Kluge wrote:
> I have a small problem understanding the logical order of the events.
> Here is one example from the logs where the client reports an RPC as
> completed before the server says that it has finished the RPC handling.
> The difference is about 1.5 ms.
>
> Is that due to the fact that the server waits until the client has
> acked the result of the RPC? Or are the clocks between the servers not
> well synchronized? Or are the timestamps in the logfile sometimes not
> correct (log message could not be flushed, or whatever)?

The timestamps on the log messages are only as accurate as the clocks on the nodes. Lustre can run without synchronized clocks, using the client clock as the basis for the timestamps, but Lustre makes no effort to synchronize the client clocks. For that you need to use NTP (easiest) or adjust the timestamps in the logs based on messages like this.

> So, by doing the math I get: 265039 - 026649 = 238390 (just 1 off target,
> might be whatever) and 265039 - 025337 = 239702 (4 off target). So I guess
> the times reported by the last message are calculated from a different
> data source than the log timestamps?

The log timestamps are generated at the time the message is processed by the log system. The processing times are themselves generated by accessing the clock and computing the processing time before the message is printed:

        do_gettimeofday(&work_start);
        :
        {handle RPC}
        :
        do_gettimeofday(&work_end);
        timediff = cfs_timeval_sub(&work_end, &work_start, NULL);
        CDEBUG(D_RPCTRACE, "request x"LPU64" opc %u from %s processed in "
               "%ldus (%ldus total) trans "LPU64" rc %d/%d\n",

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
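As an illustration of the "adjust the timestamps in the logs" option Andreas mentions, one possible approach (not part of any Lustre tool) is an NTP-style offset estimate per client: pair the client's send and completion timestamps with the server's handling and handled timestamps for the same xid. The input dictionaries below are hypothetical structures produced by a parser such as the one sketched earlier.

    from statistics import median

    def estimate_clock_offset(client_events, server_events):
        """client_events / server_events: dicts mapping xid -> (start_ts, end_ts),
        i.e. ("Sending RPC", "Completed RPC") on the client and
        ("Handling RPC", "Handled RPC") on the server, in seconds.
        Returns the estimated server-clock-minus-client-clock offset."""
        samples = []
        for xid, (t0, t3) in client_events.items():
            if xid not in server_events:
                continue                      # incomplete record, skip it
            t1, t2 = server_events[xid]
            # standard NTP offset estimate for one request/reply pair
            samples.append(((t1 - t0) + (t2 - t3)) / 2.0)
        return median(samples) if samples else 0.0

Subtracting this offset from the server timestamps (or adding it to the client timestamps) before merging the logs should restore the expected ordering when the clocks are not NTP-synchronized.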
Hi Andreas,

Am 03.05.2010 15:20, schrieb Andreas Dilger:
> The timestamps on the log messages are only as accurate as the clocks on
> the nodes. Lustre can run without synchronized clocks, using the client
> clock as the basis for the timestamps, but Lustre makes no effort to
> synchronize the client clocks. For that you need to use NTP (easiest)
> or adjust the timestamps in the logs based on messages like this.

OK, so logically the "Completed RPC" on the client side is supposed to show up after the server has written its "Handled RPC" to its log (if the clocks are synced perfectly).

One more question: RPC 1334380768266400 (in the log WangDi sent me) has only a "Sending RPC" message on the client side, so the "Completed RPC" is missing. The server has all three records (received, start work, done work). Has this RPC vanished on the way back to the client? There is no further indication of what happened. The last timestamp in the client log is 1272565368.228628, and the server says it finished processing the request at 1272565281.379471. So the client log was recorded long enough to contain the "Completed RPC" message for this RPC, if it ever arrived...

Regards, Michael
Hi Michael,

On May 3, 2010, at 11:10, Michael Kluge wrote:
> One more question: RPC 1334380768266400 (in the log WangDi sent me) has
> only a "Sending RPC" message on the client side, so the "Completed RPC"
> is missing. The server has all three records (received, start work, done
> work). Has this RPC vanished on the way back to the client? There is no
> further indication of what happened. The last timestamp in the client
> log is 1272565368.228628, and the server says it finished processing the
> request at 1272565281.379471. So the client log was recorded long enough
> to contain the "Completed RPC" message for this RPC, if it ever arrived...

If the reply was never received, then you should expect to see a timeout for that RPC in the log. It's also possible, though I'm not sure how likely in this case, that the trace buffer on the CPU that processed the completion has overflowed. You can check whether this is possible by finding the earliest message for each CPU and seeing if any of them is later than the expected time for this message.

cheers,
robert
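A small sketch of the check Robert describes, assuming the per-CPU identifier is the third colon-separated header field of each debug-log line (e.g. "1.0" in the sample lines above):

    def earliest_timestamp_per_cpu(lines):
        """Map each CPU field to the earliest timestamp seen for it.
        If a CPU's earliest surviving timestamp is later than the time the
        missing reply was expected, its trace buffer may have wrapped."""
        earliest = {}
        for line in lines:
            fields = line.split(":", 4)
            if len(fields) < 5:
                continue                  # not a debug-log record
            cpu = fields[2]
            try:
                ts = float(fields[3])
            except ValueError:
                continue
            if cpu not in earliest or ts < earliest[cpu]:
                earliest[cpu] = ts
        return earliest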
Michael Kluge wrote:
> OK, so logically the "Completed RPC" on the client side is supposed to
> show up after the server has written its "Handled RPC" to its log (if
> the clocks are synced perfectly).

Yes.

> One more question: RPC 1334380768266400 (in the log WangDi sent me) has
> only a "Sending RPC" message on the client side, so the "Completed RPC"
> is missing. The server has all three records (received, start work, done
> work). Has this RPC vanished on the way back to the client? [...]
> So the client log was recorded long enough to contain the "Completed RPC"
> message for this RPC, if it ever arrived...

Logically, yes. But in some cases debug log records are dropped for various reasons (actually, this is not rare), so you probably need to maintain an average time from the server's "Handled RPC" to the client's "Completed RPC" and then just estimate the client's "Completed RPC" time in such cases.

Thanks
WangDi
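A sketch of WangDi's suggestion, with hypothetical inputs: keep the average delay from the server's "Handled RPC" to the client's "Completed RPC" for RPCs where both records exist, and use it to estimate the completion time when the client-side record was lost.

    def estimate_missing_completions(handled_ts, completed_ts):
        """handled_ts / completed_ts: dicts mapping xid -> timestamp (server
        "Handled RPC" / client "Completed RPC", in seconds).  Returns estimated
        completion times for xids whose client-side record is missing."""
        deltas = [completed_ts[x] - handled_ts[x]
                  for x in handled_ts if x in completed_ts]
        if not deltas:
            return {}
        avg = sum(deltas) / len(deltas)       # mean handled-to-completed delay
        return {x: ts + avg
                for x, ts in handled_ts.items() if x not in completed_ts}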
Hi WangDi,

>> OK, so logically the "Completed RPC" on the client side is supposed to
>> show up after the server has written its "Handled RPC" to its log (if
>> the clocks are synced perfectly).
> Yes.

Thanks.

>> One more question: RPC 1334380768266400 (in the log WangDi sent me) has
>> only a "Sending RPC" message on the client side, so the "Completed RPC"
>> is missing. [...]
> Logically, yes. But in some cases debug log records are dropped for
> various reasons (actually, this is not rare), so you probably need to
> maintain an average time from the server's "Handled RPC" to the client's
> "Completed RPC" and then just estimate the client's "Completed RPC" time
> in such cases.

Oh my gosh ;) I don't want to start speculations about the helpfulness of incomplete debug logs. Anyway, what can get lost? Any kind of message, on both the servers and the clients? I think I'd like to know which cases have to be handled while I try to track individual RPCs on their way.

Regards, Michael
Michael Kluge wrote:
>>> One more question: RPC 1334380768266400 (in the log WangDi sent me)
>>> has on the client side only a "Sending RPC" message, thus missing the
>>> "Completed RPC". The server has all three (received, start work, done
>>> work). Has this RPC vanished on the way back to the client? There is
>>> no further indication what happened. The last timestamp in the client
>>> log is:
>>> 1272565368.228628
>>> and the server says it finished the processing of the request at:
>>> 1272565281.379471
>>> So the client log has been recorded long enough to contain the
>>> "Completed RPC" message for this RPC if it ever arrived ...
>>
>> Logically, yes. But in some cases debug log records are dropped for various
>> reasons (actually, this happens not rarely), so you probably need to maintain
>> an average time from the server "Handled RPC" to the client "Completed RPC"
>> and use it to estimate the client "Completed RPC" time in such cases.
>
> Oh my gosh ;) I don't want to start speculating about the helpfulness
> of incomplete debug logs. Anyway, what can get lost? Any kind of message
> on the servers and clients? I think I'd like to know what cases have to
> be handled while I try to track individual RPCs on their way.

Any records can get lost here. Unfortunately, there is no message that
indicates a loss has happened. :( (Usually I would check the timestamps in
the log, i.e. look for stretches with no records for a "long" time, for
example several seconds, but this is not an accurate method.)

I guess you can just ignore these incomplete records in your first step?
Let's see how the incomplete logs impact the profiling result, and then we
will decide how to deal with this.

Thanks
Wangdi

> Regards, Michael
> _______________________________________________
> Lustre-devel mailing list
> Lustre-devel at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-devel
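WangDi's heuristic (a long stretch without any records hints at dropped log pages) is easy to automate when scanning a dumped debug log. A small sketch, assuming the timestamp sits in the fourth colon-separated field of each record, as in the rpctrace lines quoted in this thread; the five-second threshold is only an example value:

    # Sketch: flag suspicious gaps in a Lustre debug log, where "no records
    # for several seconds" may mean trace records were dropped.
    # Assumes the timestamp is the 4th ':'-separated field of each record.

    import sys

    GAP_THRESHOLD = 5.0   # seconds without any record; example value only

    def find_gaps(path, threshold=GAP_THRESHOLD):
        gaps = []
        last = None
        with open(path) as log:
            for line in log:
                fields = line.split(':')
                if len(fields) < 7:
                    continue            # not a normal debug-log record
                try:
                    ts = float(fields[3])
                except ValueError:
                    continue
                if last is not None and ts - last > threshold:
                    gaps.append((last, ts))
                last = ts
        return gaps

    if __name__ == '__main__':
        for start, end in find_gaps(sys.argv[1]):
            print("possible lost records between %.6f and %.6f (%.1fs gap)"
                  % (start, end, end - start))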
OK, sounds reasonable. I can put the number of ignored messages into an
additional counter. That should make the generated trace file more
descriptive.

Regards, Michael

Am 03.05.2010 21:52, schrieb di.wang:
>> Anyway, what can get lost? Any kind of message on the servers and
>> clients? I think I'd like to know what cases have to be handled while
>> I try to track individual RPCs on their way.
> Any records can get lost here. Unfortunately, there is no message that
> indicates a loss has happened. :( (Usually I would check the timestamps
> in the log, i.e. look for stretches with no records for a "long" time,
> for example several seconds, but this is not an accurate method.)
>
> I guess you can just ignore these incomplete records in your first step?
> Let's see how the incomplete logs impact the profiling result, and then
> we will decide how to deal with this.
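The counter of ignored messages can fall out of the same pass that groups records by xid: emit an RPC only when all expected client and server events are present, and count everything else. A rough sketch, assuming the log lines have already been parsed into (xid, event, timestamp) tuples; the event names are placeholders that mirror the record types discussed above:

    # Sketch: group pre-parsed events by xid and count incomplete RPCs
    # instead of silently dropping them.  'events' is assumed to be an
    # iterable of (xid, event_name, timestamp) tuples from the log parser.

    from collections import defaultdict

    # One client-side pair plus the three server-side records per RPC.
    EXPECTED = {"client_send", "server_recv", "server_start",
                "server_done", "client_completed"}

    def split_complete(events):
        by_xid = defaultdict(dict)
        for xid, name, ts in events:
            by_xid[xid][name] = ts

        complete, ignored = {}, 0
        for xid, recs in by_xid.items():
            if EXPECTED.issubset(recs):
                complete[xid] = recs
            else:
                ignored += 1          # feeds the "ignored messages" counter
        return complete, ignored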
Hi WangDi, the first version works. Screenshot is attached. I have a couple of counter realized: RPC''s in flight and RPC''s completed in total on the client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total on the server. All these counter can be broken down by the type of RPC (op code). The picture has not yet the lines that show each single RPC, I still have to do counter like "avg. time to complete an RPC over the last second" and there are some more TODO''s. Like the timer synchronization. (In the screenshot the first and the last counter show total values while the one in the middle shows a rate.) What I like to have is a complete set of traces from a small cluster (<100 nodes) including the servers. Would that be possible? Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and like to talk about what would be useful for the next steps. Regards, Michael Am 03.05.2010 21:52, schrieb di.wang:> Michael Kluge wrote: >>>> One more question: RPC 1334380768266400 (in the log WangDi sent me) >>>> has on the client side only a "Sending RPC" message, thus missing the >>>> "Completed RPC". The server has all three (received,start work, done >>>> work). Has this RPC vanished on the way back to the client? There is >>>> no further indication what happend. The last timestamp in the client >>>> log is: >>>> 1272565368.228628 >>>> and the server says it finished the processing of the request at: >>>> 1272565281.379471 >>>> So the client log has been recorded long enough to contain the >>>> "Completed RPC" message for this RPC if it arrived ever ... >>> Logically, yes. But in some cases, some debug logs might be abandoned >>> for some reasons(actually, it happens not rarely), and probably you need >>> maintain an average time from server "Handled RPC" to client "Completed >>> RPC", then you just guess the client "Completed RPC" time in this case. >> >> Oh my gosh ;) I don''t want to start speculations about the helpfulness >> of incomplete debug logs. Anyway, what can get lost? Any kind of >> message on the servers and clients? I think I''d like to know what >> cases have to be handled while I try to track individual RPC''s on >> their way. > Any records can get lost here. Unfortunately, there are not any messages > indicate the missing happened. :( > (Usually, I would check the time stamp in the log, i.e. no records for a > "long" time, for example several seconds, but this is not the accurate > way). > > I guess you can just ignore these uncompleted records in your first > step? Let''s see how these incomplete log will > impact the profiling result, then we will decide how to deal with this? > > Thanks > Wangdi >> >> Regards, Michael >> _______________________________________________ >> Lustre-devel mailing list >> Lustre-devel at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-devel > >-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room WIL A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- A non-text attachment was scrubbed... Name: Vampir - [Trace View - lustre_rpcs.otf].png Type: image/png Size: 97181 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100516/f4cd3a83/attachment-0001.png
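All of the counters described above (RPCs in flight and completed on the client, enqueued, in processing and completed on the server, broken down by op code) can be derived by replaying the matched timestamps in time order. A condensed sketch, assuming one dict per matched RPC with the five timestamps and the opcode; the names and structure are illustrative, not the actual converter:

    # Sketch: turn matched RPC records into counter samples over time,
    # split by opcode.  Each record is assumed to be a dict with the five
    # timestamps and the opcode, e.g. produced by an xid-matching step.

    from collections import defaultdict

    def counter_samples(rpcs):
        # Each event contributes +1/-1 to one counter at one point in time.
        deltas = defaultdict(list)   # (counter_name, opcode) -> [(t, +/-1)]
        for r in rpcs:
            opc = r["opc"]
            deltas[("client_rpcs_in_flight", opc)] += [
                (r["client_send"], +1), (r["client_completed"], -1)]
            deltas[("server_rpcs_enqueued", opc)] += [
                (r["server_recv"], +1), (r["server_start"], -1)]
            deltas[("server_rpcs_in_processing", opc)] += [
                (r["server_start"], +1), (r["server_done"], -1)]
            deltas[("client_rpcs_completed_total", opc)] += [
                (r["client_completed"], +1)]

        samples = {}
        for key, changes in deltas.items():
            changes.sort()
            value, series = 0, []
            for t, d in changes:
                value += d
                series.append((t, value))   # counter value right after t
            samples[key] = series
        return samples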
Excellent :) How do you think measurements taken from 1000 servers with 100,000 clients can be visualised? We''ve used heat maps to visualise 10s-100s of concurrent measurements (y) over time (x) but I wonder if that will scale. Does vampire support heat maps? Cheers, Eric> -----Original Message----- > From: Michael Kluge [mailto:Michael.Kluge at tu-dresden.de] > Sent: 16 May 2010 10:30 AM > To: di.wang > Cc: Eric Barton; Andreas Dilger; Robert Read; Galen M. Shipman; lustre-devel > Subject: Re: [Lustre-devel] Lustre RPC visualization > > Hi WangDi, > > the first version works. Screenshot is attached. I have a couple of > counter realized: RPC''s in flight and RPC''s completed in total on the > client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total > on the server. All these counter can be broken down by the type of RPC > (op code). The picture has not yet the lines that show each single RPC, > I still have to do counter like "avg. time to complete an RPC over the > last second" and there are some more TODO''s. Like the timer > synchronization. (In the screenshot the first and the last counter show > total values while the one in the middle shows a rate.) > > What I like to have is a complete set of traces from a small cluster > (<100 nodes) including the servers. Would that be possible? > > Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and > like to talk about what would be useful for the next steps. > > > Regards, Michael > > Am 03.05.2010 21:52, schrieb di.wang: > > Michael Kluge wrote: > >>>> One more question: RPC 1334380768266400 (in the log WangDi sent me) > >>>> has on the client side only a "Sending RPC" message, thus missing the > >>>> "Completed RPC". The server has all three (received,start work, done > >>>> work). Has this RPC vanished on the way back to the client? There is > >>>> no further indication what happend. The last timestamp in the client > >>>> log is: > >>>> 1272565368.228628 > >>>> and the server says it finished the processing of the request at: > >>>> 1272565281.379471 > >>>> So the client log has been recorded long enough to contain the > >>>> "Completed RPC" message for this RPC if it arrived ever ... > >>> Logically, yes. But in some cases, some debug logs might be abandoned > >>> for some reasons(actually, it happens not rarely), and probably you need > >>> maintain an average time from server "Handled RPC" to client "Completed > >>> RPC", then you just guess the client "Completed RPC" time in this case. > >> > >> Oh my gosh ;) I don''t want to start speculations about the helpfulness > >> of incomplete debug logs. Anyway, what can get lost? Any kind of > >> message on the servers and clients? I think I''d like to know what > >> cases have to be handled while I try to track individual RPC''s on > >> their way. > > Any records can get lost here. Unfortunately, there are not any messages > > indicate the missing happened. :( > > (Usually, I would check the time stamp in the log, i.e. no records for a > > "long" time, for example several seconds, but this is not the accurate > > way). > > > > I guess you can just ignore these uncompleted records in your first > > step? Let''s see how these incomplete log will > > impact the profiling result, then we will decide how to deal with this? 
> > > > Thanks > > Wangdi > >> > >> Regards, Michael > >> _______________________________________________ > >> Lustre-devel mailing list > >> Lustre-devel at lists.lustre.org > >> http://lists.lustre.org/mailman/listinfo/lustre-devel > > > > > > > -- > Michael Kluge, M.Sc. > > Technische Universit?t Dresden > Center for Information Services and > High Performance Computing (ZIH) > D-01062 Dresden > Germany > > Contact: > Willersbau, Room WIL A 208 > Phone: (+49) 351 463-34217 > Fax: (+49) 351 463-37773 > e-mail: michael.kluge at tu-dresden.de > WWW: http://www.tu-dresden.de/zih
I think this work is very interesting. Will anyone be at CUG 2010 next week to discuss? Cheers, Andrew 2010/5/16 Michael Kluge <Michael.Kluge at tu-dresden.de>> Hi WangDi, > > the first version works. Screenshot is attached. I have a couple of counter > realized: RPC''s in flight and RPC''s completed in total on the client, RPC''s > enqueued, RPC''s in processing and RPC''c completed in total on the server. > All these counter can be broken down by the type of RPC (op code). The > picture has not yet the lines that show each single RPC, I still have to do > counter like "avg. time to complete an RPC over the last second" and there > are some more TODO''s. Like the timer synchronization. (In the screenshot the > first and the last counter show total values while the one in the middle > shows a rate.) > > What I like to have is a complete set of traces from a small cluster (<100 > nodes) including the servers. Would that be possible? > > Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and > like to talk about what would be useful for the next steps. > > > > Regards, Michael > > Am 03.05.2010 21:52, schrieb di.wang: > >> Michael Kluge wrote: >> >> One more question: RPC 1334380768266400 (in the log WangDi sent me) >>>>> has on the client side only a "Sending RPC" message, thus missing the >>>>> "Completed RPC". The server has all three (received,start work, done >>>>> work). Has this RPC vanished on the way back to the client? There is >>>>> no further indication what happend. The last timestamp in the client >>>>> log is: >>>>> 1272565368.228628 >>>>> and the server says it finished the processing of the request at: >>>>> 1272565281.379471 >>>>> So the client log has been recorded long enough to contain the >>>>> "Completed RPC" message for this RPC if it arrived ever ... >>>>> >>>> Logically, yes. But in some cases, some debug logs might be abandoned >>>> for some reasons(actually, it happens not rarely), and probably you need >>>> maintain an average time from server "Handled RPC" to client "Completed >>>> RPC", then you just guess the client "Completed RPC" time in this case. >>>> >>> >>> Oh my gosh ;) I don''t want to start speculations about the helpfulness >>> of incomplete debug logs. Anyway, what can get lost? Any kind of >>> message on the servers and clients? I think I''d like to know what >>> cases have to be handled while I try to track individual RPC''s on >>> their way. >>> >> Any records can get lost here. Unfortunately, there are not any messages >> indicate the missing happened. :( >> (Usually, I would check the time stamp in the log, i.e. no records for a >> "long" time, for example several seconds, but this is not the accurate >> way). >> >> I guess you can just ignore these uncompleted records in your first >> step? Let''s see how these incomplete log will >> impact the profiling result, then we will decide how to deal with this? >> >> Thanks >> Wangdi >> >>> >>> Regards, Michael >>> _______________________________________________ >>> Lustre-devel mailing list >>> Lustre-devel at lists.lustre.org >>> http://lists.lustre.org/mailman/listinfo/lustre-devel >>> >> >> >> > > -- > Michael Kluge, M.Sc. 
> > Technische Universit?t Dresden > Center for Information Services and > High Performance Computing (ZIH) > D-01062 Dresden > Germany > > Contact: > Willersbau, Room WIL A 208 > Phone: (+49) 351 463-34217 > Fax: (+49) 351 463-37773 > e-mail: michael.kluge at tu-dresden.de > WWW: http://www.tu-dresden.de/zih > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel > >-------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-devel/attachments/20100516/b1d7074f/attachment.html
I would be very interested in this, particularly at scale. We should look into collecting some large scale traces on Jaguar for experimentation of various visualization techniques. Thanks, Galen On May 16, 2010, at 9:12 AM, Eric Barton wrote: Excellent :) How do you think measurements taken from 1000 servers with 100,000 clients can be visualised? We''ve used heat maps to visualise 10s-100s of concurrent measurements (y) over time (x) but I wonder if that will scale. Does vampire support heat maps? Cheers, Eric -----Original Message----- From: Michael Kluge [mailto:Michael.Kluge at tu-dresden.de] Sent: 16 May 2010 10:30 AM To: di.wang Cc: Eric Barton; Andreas Dilger; Robert Read; Galen M. Shipman; lustre-devel Subject: Re: [Lustre-devel] Lustre RPC visualization Hi WangDi, the first version works. Screenshot is attached. I have a couple of counter realized: RPC''s in flight and RPC''s completed in total on the client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total on the server. All these counter can be broken down by the type of RPC (op code). The picture has not yet the lines that show each single RPC, I still have to do counter like "avg. time to complete an RPC over the last second" and there are some more TODO''s. Like the timer synchronization. (In the screenshot the first and the last counter show total values while the one in the middle shows a rate.) What I like to have is a complete set of traces from a small cluster (<100 nodes) including the servers. Would that be possible? Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and like to talk about what would be useful for the next steps. Regards, Michael Am 03.05.2010 21:52, schrieb di.wang: Michael Kluge wrote: One more question: RPC 1334380768266400 (in the log WangDi sent me) has on the client side only a "Sending RPC" message, thus missing the "Completed RPC". The server has all three (received,start work, done work). Has this RPC vanished on the way back to the client? There is no further indication what happend. The last timestamp in the client log is: 1272565368.228628 and the server says it finished the processing of the request at: 1272565281.379471 So the client log has been recorded long enough to contain the "Completed RPC" message for this RPC if it arrived ever ... Logically, yes. But in some cases, some debug logs might be abandoned for some reasons(actually, it happens not rarely), and probably you need maintain an average time from server "Handled RPC" to client "Completed RPC", then you just guess the client "Completed RPC" time in this case. Oh my gosh ;) I don''t want to start speculations about the helpfulness of incomplete debug logs. Anyway, what can get lost? Any kind of message on the servers and clients? I think I''d like to know what cases have to be handled while I try to track individual RPC''s on their way. Any records can get lost here. Unfortunately, there are not any messages indicate the missing happened. :( (Usually, I would check the time stamp in the log, i.e. no records for a "long" time, for example several seconds, but this is not the accurate way). I guess you can just ignore these uncompleted records in your first step? Let''s see how these incomplete log will impact the profiling result, then we will decide how to deal with this? 
Thanks Wangdi Regards, Michael _______________________________________________ Lustre-devel mailing list Lustre-devel at lists.lustre.org<mailto:Lustre-devel at lists.lustre.org> http://lists.lustre.org/mailman/listinfo/lustre-devel -- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room WIL A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de<mailto:michael.kluge at tu-dresden.de> WWW: http://www.tu-dresden.de/zih Galen M. Shipman Group Leader - Technology Integration National Center for Computational Sciences Oak Ridge National Laboratory Office: 865.576.2672 Cell: 865.307.1209 Email: gshipman at ornl.gov<mailto:gshipman at ornl.gov>
Eric, Vampir does scale up to this point. Almost, if I remember well, but thing might have changed recently). If it does not, well'' make it ;) We have visualized real trace files with up to 20.000 processes and artificial trace files with many more. Heat maps can be emulated. Once we have had something like a color-coded counter timeline that procduces a heat map. Right now it can be done by on the timeline by placing artificial enter/leave events for function called "5%", "10%", and so on. I have done this already in the past, so it should not be too difficult. Michael Am 16.05.2010 15:12, schrieb Eric Barton:> Excellent :) > > How do you think measurements taken from 1000 servers with 100,000 > clients can be visualised? We''ve used heat maps to visualise > 10s-100s of concurrent measurements (y) over time (x) but I wonder > if that will scale. Does vampire support heat maps? > > Cheers, > Eric > >> -----Original Message----- >> From: Michael Kluge [mailto:Michael.Kluge at tu-dresden.de] >> Sent: 16 May 2010 10:30 AM >> To: di.wang >> Cc: Eric Barton; Andreas Dilger; Robert Read; Galen M. Shipman; lustre-devel >> Subject: Re: [Lustre-devel] Lustre RPC visualization >> >> Hi WangDi, >> >> the first version works. Screenshot is attached. I have a couple of >> counter realized: RPC''s in flight and RPC''s completed in total on the >> client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total >> on the server. All these counter can be broken down by the type of RPC >> (op code). The picture has not yet the lines that show each single RPC, >> I still have to do counter like "avg. time to complete an RPC over the >> last second" and there are some more TODO''s. Like the timer >> synchronization. (In the screenshot the first and the last counter show >> total values while the one in the middle shows a rate.) >> >> What I like to have is a complete set of traces from a small cluster >> (<100 nodes) including the servers. Would that be possible? >> >> Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and >> like to talk about what would be useful for the next steps. >> >> >> Regards, Michael >> >> Am 03.05.2010 21:52, schrieb di.wang: >>> Michael Kluge wrote: >>>>>> One more question: RPC 1334380768266400 (in the log WangDi sent me) >>>>>> has on the client side only a "Sending RPC" message, thus missing the >>>>>> "Completed RPC". The server has all three (received,start work, done >>>>>> work). Has this RPC vanished on the way back to the client? There is >>>>>> no further indication what happend. The last timestamp in the client >>>>>> log is: >>>>>> 1272565368.228628 >>>>>> and the server says it finished the processing of the request at: >>>>>> 1272565281.379471 >>>>>> So the client log has been recorded long enough to contain the >>>>>> "Completed RPC" message for this RPC if it arrived ever ... >>>>> Logically, yes. But in some cases, some debug logs might be abandoned >>>>> for some reasons(actually, it happens not rarely), and probably you need >>>>> maintain an average time from server "Handled RPC" to client "Completed >>>>> RPC", then you just guess the client "Completed RPC" time in this case. >>>> >>>> Oh my gosh ;) I don''t want to start speculations about the helpfulness >>>> of incomplete debug logs. Anyway, what can get lost? Any kind of >>>> message on the servers and clients? I think I''d like to know what >>>> cases have to be handled while I try to track individual RPC''s on >>>> their way. >>> Any records can get lost here. 
Unfortunately, there are not any messages >>> indicate the missing happened. :( >>> (Usually, I would check the time stamp in the log, i.e. no records for a >>> "long" time, for example several seconds, but this is not the accurate >>> way). >>> >>> I guess you can just ignore these uncompleted records in your first >>> step? Let''s see how these incomplete log will >>> impact the profiling result, then we will decide how to deal with this? >>> >>> Thanks >>> Wangdi >>>> >>>> Regards, Michael >>>> _______________________________________________ >>>> Lustre-devel mailing list >>>> Lustre-devel at lists.lustre.org >>>> http://lists.lustre.org/mailman/listinfo/lustre-devel >>> >>> >> >> >> -- >> Michael Kluge, M.Sc. >> >> Technische Universit?t Dresden >> Center for Information Services and >> High Performance Computing (ZIH) >> D-01062 Dresden >> Germany >> >> Contact: >> Willersbau, Room WIL A 208 >> Phone: (+49) 351 463-34217 >> Fax: (+49) 351 463-37773 >> e-mail: michael.kluge at tu-dresden.de >> WWW: http://www.tu-dresden.de/zih > >
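Michael's heat-map emulation, placing artificial enter/leave events for pseudo-functions named "5%", "10%" and so on, amounts to quantising a counter series into bands. An illustrative sketch; the band width and the event tuple format are invented for the example and are not the OTF converter's real output:

    # Sketch: quantise a per-server counter series into percentage bands so
    # a timeline display can colour them like a heat map.  'series' is a
    # list of (timestamp, value) samples; 'peak' is the value taken as 100%.

    def band_events(series, peak, step=5):
        # Yields (timestamp, "enter"/"leave", "<n>%") pseudo-events whenever
        # the value crosses into a different band.
        current = None
        for t, value in series:
            pct = 0 if peak <= 0 else min(100, int(100.0 * value / peak))
            band = "%d%%" % (((pct + step - 1) // step) * step)
            if band != current:
                if current is not None:
                    yield (t, "leave", current)
                yield (t, "enter", band)
                current = band
        # A real converter would also close the last open band at the end
        # of the trace.

    # Example: list(band_events([(0.0, 3), (1.0, 12), (2.0, 7)], peak=20))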
Hi Andrew, unfortunately no. We don''t own a Cray :( Regards, Michael Am Sonntag, den 16.05.2010, 20:24 -0700 schrieb Andrew Uselton:> I think this work is very interesting. Will anyone be at CUG 2010 > next week to discuss? > Cheers, > Andrew > > > 2010/5/16 Michael Kluge <Michael.Kluge at tu-dresden.de> > Hi WangDi, > > the first version works. Screenshot is attached. I have a > couple of counter realized: RPC''s in flight and RPC''s > completed in total on the client, RPC''s enqueued, RPC''s in > processing and RPC''c completed in total on the server. All > these counter can be broken down by the type of RPC (op code). > The picture has not yet the lines that show each single RPC, I > still have to do counter like "avg. time to complete an RPC > over the last second" and there are some more TODO''s. Like the > timer synchronization. (In the screenshot the first and the > last counter show total values while the one in the middle > shows a rate.) > > What I like to have is a complete set of traces from a small > cluster (<100 nodes) including the servers. Would that be > possible? > > Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be > there and like to talk about what would be useful for the next > steps. > > > > Regards, Michael > > Am 03.05.2010 21:52, schrieb di.wang: > > Michael Kluge wrote: > > > One more question: RPC > 1334380768266400 (in the log > WangDi sent me) > has on the client side only a > "Sending RPC" message, thus > missing the > "Completed RPC". The server > has all three (received,start > work, done > work). Has this RPC vanished > on the way back to the client? > There is > no further indication what > happend. The last timestamp in > the client > log is: > 1272565368.228628 > and the server says it > finished the processing of the > request at: > 1272565281.379471 > So the client log has been > recorded long enough to > contain the > "Completed RPC" message for > this RPC if it arrived > ever ... > Logically, yes. But in some cases, > some debug logs might be abandoned > for some reasons(actually, it happens > not rarely), and probably you need > maintain an average time from server > "Handled RPC" to client "Completed > RPC", then you just guess the client > "Completed RPC" time in this case. > > Oh my gosh ;) I don''t want to start > speculations about the helpfulness > of incomplete debug logs. Anyway, what can get > lost? Any kind of > message on the servers and clients? I think > I''d like to know what > cases have to be handled while I try to track > individual RPC''s on > their way. > Any records can get lost here. Unfortunately, there > are not any messages > indicate the missing happened. :( > (Usually, I would check the time stamp in the log, > i.e. no records for a > "long" time, for example several seconds, but this is > not the accurate > way). > > I guess you can just ignore these uncompleted records > in your first > step? Let''s see how these incomplete log will > impact the profiling result, then we will decide how > to deal with this? > > Thanks > Wangdi > > Regards, Michael > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel > > > > > > -- > Michael Kluge, M.Sc. 
> > Technische Universit?t Dresden > Center for Information Services and > High Performance Computing (ZIH) > D-01062 Dresden > Germany > > Contact: > Willersbau, Room WIL A 208 > Phone: (+49) 351 463-34217 > Fax: (+49) 351 463-37773 > e-mail: michael.kluge at tu-dresden.de > > > WWW: http://www.tu-dresden.de/zih > > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel > > > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5997 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100517/92de1f31/attachment-0001.bin
Hi Galen, that''s why we started this :) I guess I need some more input data to be able to play with the visualization. Vampir also supports markers that can point to specific (process,timestamp) combinations. If we know that some kind of condition happened while we collected the trace, it is probably possible to detect this in the trace converter and to put a mark into the trace file. That would help finding "interesting points" faster. Regards, Michael Am Sonntag, den 16.05.2010, 23:39 -0400 schrieb Shipman, Galen M.:> I would be very interested in this, particularly at scale. We should look into collecting some large scale traces on Jaguar for experimentation of various visualization techniques. > > Thanks, > > Galen > > > On May 16, 2010, at 9:12 AM, Eric Barton wrote: > > Excellent :) > > How do you think measurements taken from 1000 servers with 100,000 > clients can be visualised? We''ve used heat maps to visualise > 10s-100s of concurrent measurements (y) over time (x) but I wonder > if that will scale. Does vampire support heat maps? > > Cheers, > Eric > > -----Original Message----- > From: Michael Kluge [mailto:Michael.Kluge at tu-dresden.de] > Sent: 16 May 2010 10:30 AM > To: di.wang > Cc: Eric Barton; Andreas Dilger; Robert Read; Galen M. Shipman; lustre-devel > Subject: Re: [Lustre-devel] Lustre RPC visualization > > Hi WangDi, > > the first version works. Screenshot is attached. I have a couple of > counter realized: RPC''s in flight and RPC''s completed in total on the > client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total > on the server. All these counter can be broken down by the type of RPC > (op code). The picture has not yet the lines that show each single RPC, > I still have to do counter like "avg. time to complete an RPC over the > last second" and there are some more TODO''s. Like the timer > synchronization. (In the screenshot the first and the last counter show > total values while the one in the middle shows a rate.) > > What I like to have is a complete set of traces from a small cluster > (<100 nodes) including the servers. Would that be possible? > > Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and > like to talk about what would be useful for the next steps. > > > Regards, Michael > > Am 03.05.2010 21:52, schrieb di.wang: > Michael Kluge wrote: > One more question: RPC 1334380768266400 (in the log WangDi sent me) > has on the client side only a "Sending RPC" message, thus missing the > "Completed RPC". The server has all three (received,start work, done > work). Has this RPC vanished on the way back to the client? There is > no further indication what happend. The last timestamp in the client > log is: > 1272565368.228628 > and the server says it finished the processing of the request at: > 1272565281.379471 > So the client log has been recorded long enough to contain the > "Completed RPC" message for this RPC if it arrived ever ... > Logically, yes. But in some cases, some debug logs might be abandoned > for some reasons(actually, it happens not rarely), and probably you need > maintain an average time from server "Handled RPC" to client "Completed > RPC", then you just guess the client "Completed RPC" time in this case. > > Oh my gosh ;) I don''t want to start speculations about the helpfulness > of incomplete debug logs. Anyway, what can get lost? Any kind of > message on the servers and clients? I think I''d like to know what > cases have to be handled while I try to track individual RPC''s on > their way. 
> Any records can get lost here. Unfortunately, there are not any messages > indicate the missing happened. :( > (Usually, I would check the time stamp in the log, i.e. no records for a > "long" time, for example several seconds, but this is not the accurate > way). > > I guess you can just ignore these uncompleted records in your first > step? Let''s see how these incomplete log will > impact the profiling result, then we will decide how to deal with this? > > Thanks > Wangdi > > Regards, Michael > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org<mailto:Lustre-devel at lists.lustre.org> > http://lists.lustre.org/mailman/listinfo/lustre-devel > > > > > -- > Michael Kluge, M.Sc. > > Technische Universit?t Dresden > Center for Information Services and > High Performance Computing (ZIH) > D-01062 Dresden > Germany > > Contact: > Willersbau, Room WIL A 208 > Phone: (+49) 351 463-34217 > Fax: (+49) 351 463-37773 > e-mail: michael.kluge at tu-dresden.de<mailto:michael.kluge at tu-dresden.de> > WWW: http://www.tu-dresden.de/zih > > > Galen M. Shipman > Group Leader - Technology Integration > National Center for Computational Sciences > Oak Ridge National Laboratory > Office: 865.576.2672 > Cell: 865.307.1209 > Email: gshipman at ornl.gov<mailto:gshipman at ornl.gov> > > > >-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5997 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100517/c15ea9a9/attachment.bin
Hi WangDi, so, for the moment I am done with what I promised. The work to be done is mainly debugging with more input data sets. Screenshot of Vampir showing the derived counter values for the RPC processing/queue times on the server and the client is attached. Units for the values are either microseconds or just a number. Regards, Michael Am Sonntag, den 16.05.2010, 11:29 +0200 schrieb Michael Kluge:> Hi WangDi, > > the first version works. Screenshot is attached. I have a couple of > counter realized: RPC''s in flight and RPC''s completed in total on the > client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total > on the server. All these counter can be broken down by the type of RPC > (op code). The picture has not yet the lines that show each single RPC, > I still have to do counter like "avg. time to complete an RPC over the > last second" and there are some more TODO''s. Like the timer > synchronization. (In the screenshot the first and the last counter show > total values while the one in the middle shows a rate.) > > What I like to have is a complete set of traces from a small cluster > (<100 nodes) including the servers. Would that be possible? > > Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and > like to talk about what would be useful for the next steps. > > > Regards, Michael > > Am 03.05.2010 21:52, schrieb di.wang: > > Michael Kluge wrote: > >>>> One more question: RPC 1334380768266400 (in the log WangDi sent me) > >>>> has on the client side only a "Sending RPC" message, thus missing the > >>>> "Completed RPC". The server has all three (received,start work, done > >>>> work). Has this RPC vanished on the way back to the client? There is > >>>> no further indication what happend. The last timestamp in the client > >>>> log is: > >>>> 1272565368.228628 > >>>> and the server says it finished the processing of the request at: > >>>> 1272565281.379471 > >>>> So the client log has been recorded long enough to contain the > >>>> "Completed RPC" message for this RPC if it arrived ever ... > >>> Logically, yes. But in some cases, some debug logs might be abandoned > >>> for some reasons(actually, it happens not rarely), and probably you need > >>> maintain an average time from server "Handled RPC" to client "Completed > >>> RPC", then you just guess the client "Completed RPC" time in this case. > >> > >> Oh my gosh ;) I don''t want to start speculations about the helpfulness > >> of incomplete debug logs. Anyway, what can get lost? Any kind of > >> message on the servers and clients? I think I''d like to know what > >> cases have to be handled while I try to track individual RPC''s on > >> their way. > > Any records can get lost here. Unfortunately, there are not any messages > > indicate the missing happened. :( > > (Usually, I would check the time stamp in the log, i.e. no records for a > > "long" time, for example several seconds, but this is not the accurate > > way). > > > > I guess you can just ignore these uncompleted records in your first > > step? Let''s see how these incomplete log will > > impact the profiling result, then we will decide how to deal with this? 
> > > > Thanks > > Wangdi > >> > >> Regards, Michael > >> _______________________________________________ > >> Lustre-devel mailing list > >> Lustre-devel at lists.lustre.org > >> http://lists.lustre.org/mailman/listinfo/lustre-devel > > > > > > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- A non-text attachment was scrubbed... Name: lustre_rpc_trace.png Type: image/png Size: 146793 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100525/66fe7ab6/attachment-0001.png -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5997 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100525/66fe7ab6/attachment-0001.bin
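The processing and queue times behind that screenshot are plain differences of the matched timestamps: on the server, queue time runs from arrival to the start of handling and processing time from the start to "Handled RPC"; on the client, the round trip runs from "Sending RPC" to "Completed RPC". A minimal sketch under the same assumed record layout as above, with results in microseconds to match the units used in the log:

    # Sketch: per-RPC latency figures derived from one matched record
    # (a dict holding the five timestamps in seconds).

    def latencies_us(r):
        def us(dt):
            return dt * 1e6
        return {
            "server_queue_time": us(r["server_start"] - r["server_recv"]),
            "server_processing": us(r["server_done"] - r["server_start"]),
            "client_round_trip": us(r["client_completed"] - r["client_send"]),
            # Time spent outside the server, i.e. network plus client overhead.
            "wire_and_client":   us((r["client_completed"] - r["client_send"]) -
                                    (r["server_done"] - r["server_recv"])),
        }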
Hi WangDi,> Looks great! Just query, as you said, "All these counter can be broken > down by the type of RPC (op code)" , you actually implemented that, but > not shown in the attached picture?Yes.> And could you please also add "Server queued RPCs" over time ?Already done. One good news: The Feature that Vampir can show something like a heat map (Eric asked about this) comes back with the release at ISC. It is now called "performance radar". It can produce a heat map for a counter and does some other things as well. I could send a picture around, but need at first an bigger trace (more hosts generating traces in parallel). Regards, Michael> Thanks > WangDi > > Michael Kluge wrote: > > Hi WangDi, > > > > so, for the moment I am done with what I promised. The work to be done > > is mainly debugging with more input data sets. Screenshot of Vampir > > showing the derived counter values for the RPC processing/queue times on > > the server and the client is attached. Units for the values are either > > microseconds or just a number. > > > > > > Regards, Michael > > > > Am Sonntag, den 16.05.2010, 11:29 +0200 schrieb Michael Kluge: > > > >> Hi WangDi, > >> > >> the first version works. Screenshot is attached. I have a couple of > >> counter realized: RPC''s in flight and RPC''s completed in total on the > >> client, RPC''s enqueued, RPC''s in processing and RPC''c completed in total > >> on the server. All these counter can be broken down by the type of RPC > >> (op code). The picture has not yet the lines that show each single RPC, > >> I still have to do counter like "avg. time to complete an RPC over the > >> last second" and there are some more TODO''s. Like the timer > >> synchronization. (In the screenshot the first and the last counter show > >> total values while the one in the middle shows a rate.) > >> > >> What I like to have is a complete set of traces from a small cluster > >> (<100 nodes) including the servers. Would that be possible? > >> > >> Is one of you in Hamburg May, 31-June, 3 for ISC''2010? I''ll be there and > >> like to talk about what would be useful for the next steps. > >> > >> > >> Regards, Michael > >> > >> Am 03.05.2010 21:52, schrieb di.wang: > >> > >>> Michael Kluge wrote: > >>> > >>>>>> One more question: RPC 1334380768266400 (in the log WangDi sent me) > >>>>>> has on the client side only a "Sending RPC" message, thus missing the > >>>>>> "Completed RPC". The server has all three (received,start work, done > >>>>>> work). Has this RPC vanished on the way back to the client? There is > >>>>>> no further indication what happend. The last timestamp in the client > >>>>>> log is: > >>>>>> 1272565368.228628 > >>>>>> and the server says it finished the processing of the request at: > >>>>>> 1272565281.379471 > >>>>>> So the client log has been recorded long enough to contain the > >>>>>> "Completed RPC" message for this RPC if it arrived ever ... > >>>>>> > >>>>> Logically, yes. But in some cases, some debug logs might be abandoned > >>>>> for some reasons(actually, it happens not rarely), and probably you need > >>>>> maintain an average time from server "Handled RPC" to client "Completed > >>>>> RPC", then you just guess the client "Completed RPC" time in this case. > >>>>> > >>>> Oh my gosh ;) I don''t want to start speculations about the helpfulness > >>>> of incomplete debug logs. Anyway, what can get lost? Any kind of > >>>> message on the servers and clients? 
I think I''d like to know what > >>>> cases have to be handled while I try to track individual RPC''s on > >>>> their way. > >>>> > >>> Any records can get lost here. Unfortunately, there are not any messages > >>> indicate the missing happened. :( > >>> (Usually, I would check the time stamp in the log, i.e. no records for a > >>> "long" time, for example several seconds, but this is not the accurate > >>> way). > >>> > >>> I guess you can just ignore these uncompleted records in your first > >>> step? Let''s see how these incomplete log will > >>> impact the profiling result, then we will decide how to deal with this? > >>> > >>> Thanks > >>> Wangdi > >>> > >>>> Regards, Michael > >>>> _______________________________________________ > >>>> Lustre-devel mailing list > >>>> Lustre-devel at lists.lustre.org > >>>> http://lists.lustre.org/mailman/listinfo/lustre-devel > >>>> > >>> > >> _______________________________________________ > >> Lustre-devel mailing list > >> Lustre-devel at lists.lustre.org > >> http://lists.lustre.org/mailman/listinfo/lustre-devel > >> > > > > > > > > ------------------------------------------------------------------------ > > > >-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5997 bytes Desc: not available Url : http://lists.lustre.org/pipermail/lustre-devel/attachments/20100528/3897d501/attachment.bin
Eric Barton wrote:> I''d really like to see how vampire handles _all_ the trace data > we can throw at it. If 600 clients is a pain, how bad will it > be at 60,000? > > What in particular makes collecting traces from all the clients > + servers hard? How can/should we automate it or otherwise make > it easier? >Aha, I do not mean it is difficult to get all trace logs. I had thought the current target is to make sure vampire can handle the trace from "big" enough clusters. Actually, I am not sure whether vampire(with this trace analyse) should work in this way as we do now? i.e. running job exclusively on the cluster, get rpctrace log, then get graph by vampire? Or I miss sth here? Michale, could you please explain a bit? What is your idea how vampire could help the end users? or the target here is just helping the developers and sysadmin to understand the system? Thanks WangDi> Cheers, > Eric > > >> -----Original Message----- >> From: di.wang [mailto:di.wang at oracle.com] >> Sent: 01 June 2010 12:50 PM >> To: Robert Read >> Cc: Michael Kluge; Eric Barton; Galen M. Shipman >> Subject: Re: [Lustre-devel] Lustre RPC visualization >> >> Hello, >> >> IMHO, just run IOR with whatever parameters, and get rpctrace >> log(probably only enable rpctrace) from 1 OST and some of clients >> (probably 2 is enough). >> Note: please make sure these 2 clients did communicate the OST during >> the IOR. >> >> Michael, I do not think you need all the trace logs from the clients. right? >> >> Robert >> >> If there are available time slots for this test on Hyperion, who can >> help to get these logs? >> >> Thanks >> Wangdi >> >> Robert Read wrote: >> >>> What should I run then? Do have scripts to capture this? >>> >>> robert >>> >>> On May 31, 2010, at 2:39 , Michael Kluge wrote: >>> >>> >>>> Hi Robert, >>>> >>>> 600 is a nice number. Plus the traces from the server an I am happy. >>>> >>>> >>>> Michael >>>> >>>> Am 28.05.2010 um 17:53 schrieb Robert Read: >>>> >>>> >>>>> On May 28, 2010, at 4:09 , di.wang wrote: >>>>> >>>>> >>>>>> Hello, Michael >>>>>> >>>>>> >>>>>>> One good news: The Feature that Vampir can show something like a heat >>>>>>> map (Eric asked about this) comes back with the release at ISC. It is >>>>>>> now called "performance radar". It can produce a heat map for a >>>>>>> counter >>>>>>> and does some other things as well. I could send a picture around, but >>>>>>> need at first an bigger trace (more hosts generating traces in >>>>>>> parallel). >>>>>>> >>>>>>> >>>>>> Right now I do not have big clusters available to generate the trace. >>>>>> I will see what I can do here. >>>>>> >>>>> If ~600 clients is big enough we could generate that on Hyperion. >>>>> >>>>> robert >>>>> >>>>> >>>>>> Thanks >>>>>> WangDi >>>>>> >>>>>>> >>>>>>> >>>>>>>>> >>>>>>> ------------------------------------------------------------------------ >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Lustre-devel mailing list >>>>>>> Lustre-devel at lists.lustre.org <mailto:Lustre-devel at lists.lustre.org> >>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-devel >>>>>>> >>>>>>> >>>>> >>>> -- >>>> >>>> Michael Kluge, M.Sc. 
>>>> >>>> Technische Universit?t Dresden >>>> Center for Information Services and >>>> High Performance Computing (ZIH) >>>> D-01062 Dresden >>>> Germany >>>> >>>> Contact: >>>> Willersbau, Room WIL A 208 >>>> Phone: (+49) 351 463-34217 >>>> Fax: (+49) 351 463-37773 >>>> e-mail: michael.kluge at tu-dresden.de <mailto:michael.kluge at tu-dresden.de> >>>> WWW: http://www.tu-dresden.de/zih >>>> >>>> > > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-devel >
I''d really like to see how vampire handles _all_ the trace data we can throw at it. If 600 clients is a pain, how bad will it be at 60,000? What in particular makes collecting traces from all the clients + servers hard? How can/should we automate it or otherwise make it easier? Cheers, Eric> -----Original Message----- > From: di.wang [mailto:di.wang at oracle.com] > Sent: 01 June 2010 12:50 PM > To: Robert Read > Cc: Michael Kluge; Eric Barton; Galen M. Shipman > Subject: Re: [Lustre-devel] Lustre RPC visualization > > Hello, > > IMHO, just run IOR with whatever parameters, and get rpctrace > log(probably only enable rpctrace) from 1 OST and some of clients > (probably 2 is enough). > Note: please make sure these 2 clients did communicate the OST during > the IOR. > > Michael, I do not think you need all the trace logs from the clients. right? > > Robert > > If there are available time slots for this test on Hyperion, who can > help to get these logs? > > Thanks > Wangdi > > Robert Read wrote: > > What should I run then? Do have scripts to capture this? > > > > robert > > > > On May 31, 2010, at 2:39 , Michael Kluge wrote: > > > >> Hi Robert, > >> > >> 600 is a nice number. Plus the traces from the server an I am happy. > >> > >> > >> Michael > >> > >> Am 28.05.2010 um 17:53 schrieb Robert Read: > >> > >>> > >>> On May 28, 2010, at 4:09 , di.wang wrote: > >>> > >>>> Hello, Michael > >>>> > >>>>> One good news: The Feature that Vampir can show something like a heat > >>>>> map (Eric asked about this) comes back with the release at ISC. It is > >>>>> now called "performance radar". It can produce a heat map for a > >>>>> counter > >>>>> and does some other things as well. I could send a picture around, but > >>>>> need at first an bigger trace (more hosts generating traces in > >>>>> parallel). > >>>>> > >>>> Right now I do not have big clusters available to generate the trace. > >>>> I will see what I can do here. > >>> > >>> If ~600 clients is big enough we could generate that on Hyperion. > >>> > >>> robert > >>> > >>>> > >>>> Thanks > >>>> WangDi > >>>>> > >>>>> > >>>>> > >>>>>>> > >>>>>>> > >>>>>> > >>>>> > >>>>> ------------------------------------------------------------------------ > >>>>> > >>>>> _______________________________________________ > >>>>> Lustre-devel mailing list > >>>>> Lustre-devel at lists.lustre.org <mailto:Lustre-devel at lists.lustre.org> > >>>>> http://lists.lustre.org/mailman/listinfo/lustre-devel > >>>>> > >>>> > >>> > >>> > >> > >> > >> -- > >> > >> Michael Kluge, M.Sc. > >> > >> Technische Universit?t Dresden > >> Center for Information Services and > >> High Performance Computing (ZIH) > >> D-01062 Dresden > >> Germany > >> > >> Contact: > >> Willersbau, Room WIL A 208 > >> Phone: (+49) 351 463-34217 > >> Fax: (+49) 351 463-37773 > >> e-mail: michael.kluge at tu-dresden.de <mailto:michael.kluge at tu-dresden.de> > >> WWW: http://www.tu-dresden.de/zih > >> > >
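One way to take the manual work out of collection is a small driver that fans the same two steps out to every node: switch on rpctrace before the run and dump the kernel debug buffer into a per-node file afterwards. A rough sketch using pdsh; the host ranges, paths and the exact lctl parameters are placeholders that would need to be checked against the installed Lustre version:

    # Sketch of a collection helper: enable rpctrace everywhere, run the
    # job, then dump each node's debug buffer to a per-node file for the
    # converter.  Hostnames, paths and lctl arguments are illustrative.

    import subprocess

    CLIENTS = "client[001-600]"        # pdsh-style host ranges (example only)
    SERVERS = "oss[01-04],mds01"
    DUMP_DIR = "/tmp/rpctrace"         # must exist on every node

    def pdsh(hosts, command):
        subprocess.check_call(["pdsh", "-w", hosts, command])

    def enable_rpctrace(hosts):
        # Exact syntax varies between Lustre releases; verify before use.
        pdsh(hosts, "lctl set_param debug=+rpctrace")

    def dump_logs(hosts):
        # 'lctl dk' dumps the kernel debug buffer on each node to a file.
        pdsh(hosts, "lctl dk %s/$(hostname).rpctrace" % DUMP_DIR)

    if __name__ == "__main__":
        for hosts in (CLIENTS, SERVERS):
            enable_rpctrace(hosts)
        # ... run the I/O test (e.g. IOR) here ...
        for hosts in (CLIENTS, SERVERS):
            dump_logs(hosts)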
On 2010-06-01, at 06:12, di.wang wrote:> Michael, I do not think you need all the trace logs from the clients. right?Actually, I think he does want trace logs from all of the clients.> I had thought the current target is to make sure vampire can handle > the trace from "big" enough clusters. > > Actually, I am not sure whether vampire(with this trace analyse) should > work in this way as we do now? i.e. running job exclusively on the cluster, get rpctrace log, then get graph by vampire? > > Or I miss sth here? Michale, could you please explain a bit? What is > your idea how vampire could help the end users? or the target here is > just helping the developers and sysadmin to understand the system?My thoughts on this in the past were that it should be possible to use ONLY the client logs to plot syscall latencies (using client VFSTRACE lines) and the RPC latencies (using client RPCTRACE lines) and locking (using DLMTRACE lines). Some code rework would be needed to allow regular users to run "lctl dk", and it would filter out the lines specific to their own processes. This would allow regular users to collect and analyze their application without assistance or coordination from the sysadmin. In order to allow regular users to trace such information, the VFSTRACE calls should report the UID of the process doing the system call, which would immediately map to a PID. The PID can be used to track the majority of the debug entries, but not always those done in another thread (e.g. ptlrpcd). The RPCTRACE messages also contain the PID, so that would be helpful, but it would mean that there has to be a parser/filter in the kernel to ensure users cannot access trace information that is not their own. That would be a significant undertaking, I think. In the meantime, for testing purposes and initial usage (with sysadmin assistance) the full debug log can be extracted from the kernel and filtered in userspace as needed.>>> -----Original Message----- >>> From: di.wang [mailto:di.wang at oracle.com] >>> Sent: 01 June 2010 12:50 PM >>> To: Robert Read >>> Cc: Michael Kluge; Eric Barton; Galen M. Shipman >>> Subject: Re: [Lustre-devel] Lustre RPC visualization >>> >>> Hello, >>> >>> IMHO, just run IOR with whatever parameters, and get rpctrace >>> log(probably only enable rpctrace) from 1 OST and some of clients >>> (probably 2 is enough). >>> Note: please make sure these 2 clients did communicate the OST during >>> the IOR. >>> >>> Michael, I do not think you need all the trace logs from the clients. right? >>> >>> Robert >>> >>> If there are available time slots for this test on Hyperion, who can >>> help to get these logs? >>> >>> Thanks >>> Wangdi >>> >>> Robert Read wrote: >>> >>>> What should I run then? Do have scripts to capture this? >>>> >>>> robert >>>> >>>> On May 31, 2010, at 2:39 , Michael Kluge wrote: >>>> >>>> >>>>> Hi Robert, >>>>> >>>>> 600 is a nice number. Plus the traces from the server an I am happy. >>>>> >>>>> >>>>> Michael >>>>> >>>>> Am 28.05.2010 um 17:53 schrieb Robert Read: >>>>> >>>>> >>>>>> On May 28, 2010, at 4:09 , di.wang wrote: >>>>>> >>>>>> >>>>>>> Hello, Michael >>>>>>> >>>>>>> >>>>>>>> One good news: The Feature that Vampir can show something like a heat >>>>>>>> map (Eric asked about this) comes back with the release at ISC. It is >>>>>>>> now called "performance radar". It can produce a heat map for a >>>>>>>> counter >>>>>>>> and does some other things as well. 
I could send a picture around, but >>>>>>>> need at first an bigger trace (more hosts generating traces in >>>>>>>> parallel). >>>>>>>> >>>>>>>> >>>>>>> Right now I do not have big clusters available to generate the trace. >>>>>>> I will see what I can do here. >>>>>>> >>>>>> If ~600 clients is big enough we could generate that on Hyperion. >>>>>> >>>>>> robert >>>>>> >>>>>> >>>>>>> Thanks >>>>>>> WangDi >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>> ------------------------------------------------------------------------ >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Lustre-devel mailing list >>>>>>>> Lustre-devel at lists.lustre.org <mailto:Lustre-devel at lists.lustre.org> >>>>>>>> http://lists.lustre.org/mailman/listinfo/lustre-devel >>>>>>>> >>>>>>>> >>>>>> >>>>> -- >>>>> >>>>> Michael Kluge, M.Sc. >>>>> >>>>> Technische Universit?t Dresden >>>>> Center for Information Services and >>>>> High Performance Computing (ZIH) >>>>> D-01062 Dresden >>>>> Germany >>>>> >>>>> Contact: >>>>> Willersbau, Room WIL A 208 >>>>> Phone: (+49) 351 463-34217 >>>>> Fax: (+49) 351 463-37773 >>>>> e-mail: michael.kluge at tu-dresden.de <mailto:michael.kluge at tu-dresden.de> >>>>> WWW: http://www.tu-dresden.de/zih >>>>> >>>>> >> >> >> _______________________________________________ >> Lustre-devel mailing list >> Lustre-devel at lists.lustre.org >> http://lists.lustre.org/mailman/listinfo/lustre-devel >> > > _______________________________________________ > Lustre-devel mailing list > Lustre-devel at lists.lustre.org > http://lists.lustre.org/mailman/listinfo/lustre-develCheers, Andreas -- Andreas Dilger Lustre Technical Lead Oracle Corporation Canada Inc.
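Andreas' filtering idea can at least be prototyped in user space on an already-extracted log, since the PID appears as the sixth colon-separated field of every record in the examples quoted in this thread. A hedged sketch for offline filtering, not the in-kernel filter he describes; the file names and PID in the usage note are hypothetical:

    # Sketch: keep only the debug-log records belonging to a given set of
    # PIDs.  Assumes the PID is the 6th ':'-separated field of a record; an
    # in-kernel filter would still be needed for the unprivileged-user case.

    def filter_by_pid(lines, wanted_pids):
        wanted = {str(p) for p in wanted_pids}
        for line in lines:
            fields = line.split(':', 7)
            if len(fields) >= 7 and fields[5] in wanted:
                yield line

    # Example (hypothetical file names and PID):
    #   with open("client.rpctrace") as src, open("user.rpctrace", "w") as dst:
    #       dst.writelines(filter_by_pid(src, {7924}))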
> On 2010-06-01, at 06:12, di.wang wrote:
>> Michael, I do not think you need all the trace logs from the clients, right?
>
> Actually, I think he does want trace logs from all of the clients.

Yes, 600 is a good number. Vampir can easily handle this. If possible, I'd like to have all server traces as well to include this information. Right now I am only putting those RPCs in the trace where all 5 events (client send, server recv/start/done, client done) are present. I see the "client send/done" events in the OSS log as well; probably it is talking to the MDS or MGS. Events that are incomplete are ignored, but counted in the "dumped events" counter. That could be changed, though.

Michael

>> I had thought the current target is to make sure Vampir can handle the trace from "big" enough clusters.
>>
>> Actually, I am not sure whether Vampir (with this trace analysis) should work the way we do it now, i.e. running a job exclusively on the cluster, getting the rpctrace log, then getting the graph from Vampir. Or do I miss something here? Michael, could you please explain a bit? What is your idea of how Vampir could help end users? Or is the target here just to help the developers and sysadmins understand the system?
>
> My thoughts on this in the past were that it should be possible to use ONLY the client logs to plot syscall latencies (using client VFSTRACE lines), RPC latencies (using client RPCTRACE lines) and locking (using DLMTRACE lines). Some code rework would be needed to allow regular users to run "lctl dk", and it would filter out the lines specific to their own processes. This would allow regular users to collect and analyze their application without assistance or coordination from the sysadmin.
>
> In order to allow regular users to trace such information, the VFSTRACE calls should report the UID of the process doing the system call, which would immediately map to a PID. The PID can be used to track the majority of the debug entries, but not always those done in another thread (e.g. ptlrpcd). The RPCTRACE messages also contain the PID, so that would be helpful, but it would mean that there has to be a parser/filter in the kernel to ensure users cannot access trace information that is not their own. That would be a significant undertaking, I think.
>
> In the meantime, for testing purposes and initial usage (with sysadmin assistance) the full debug log can be extracted from the kernel and filtered in userspace as needed.
>
>>>> -----Original Message-----
>>>> From: di.wang [mailto:di.wang at oracle.com]
>>>> Sent: 01 June 2010 12:50 PM
>>>> To: Robert Read
>>>> Cc: Michael Kluge; Eric Barton; Galen M. Shipman
>>>> Subject: Re: [Lustre-devel] Lustre RPC visualization
>>>>
>>>> Hello,
>>>>
>>>> IMHO, just run IOR with whatever parameters, and get the rpctrace log (probably only enable rpctrace) from 1 OST and some of the clients (probably 2 is enough). Note: please make sure these 2 clients did communicate with the OST during the IOR.
>>>>
>>>> Michael, I do not think you need all the trace logs from the clients, right?
>>>>
>>>> Robert, if there are available time slots for this test on Hyperion, who can help to get these logs?
>>>>
>>>> Thanks
>>>> Wangdi
>>>>
>>>> Robert Read wrote:
>>>>> What should I run then? Do you have scripts to capture this?
>>>>>
>>>>> robert
>>>>>
>>>>> On May 31, 2010, at 2:39 , Michael Kluge wrote:
>>>>>> Hi Robert,
>>>>>>
>>>>>> 600 is a nice number. Plus the traces from the servers and I am happy.
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> Am 28.05.2010 um 17:53 schrieb Robert Read:
>>>>>>> On May 28, 2010, at 4:09 , di.wang wrote:
>>>>>>>> Hello, Michael
>>>>>>>>
>>>>>>>>> One good news: the feature that Vampir can show something like a heat map (Eric asked about this) comes back with the release at ISC. It is now called "performance radar". It can produce a heat map for a counter and does some other things as well. I could send a picture around, but need a bigger trace first (more hosts generating traces in parallel).
>>>>>>>>
>>>>>>>> Right now I do not have big clusters available to generate the trace. I will see what I can do here.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> WangDi
>>>>>>>
>>>>>>> If ~600 clients is big enough we could generate that on Hyperion.
>>>>>>>
>>>>>>> robert
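As a rough illustration of the matching described above (not the converter's actual code), the sketch below groups parsed events by xid and keeps only those RPCs for which all five events are present; the event names and data shapes are invented for the example.

```python
from collections import defaultdict

# The five per-RPC events mentioned above; an RPC only goes into the trace
# when all of them were found for the same xid.
REQUIRED = {"client_send", "server_recv", "server_start", "server_done", "client_done"}

def match_rpcs(events):
    """Group parsed log events by xid and keep only complete RPCs.

    `events` is an iterable of dicts like {"xid": ..., "event": ..., "ts": ...}.
    Incomplete RPCs are not emitted; their events are only counted, similar
    to the "dumped events" counter mentioned in the mail above.
    """
    by_xid = defaultdict(dict)
    for ev in events:
        by_xid[ev["xid"]][ev["event"]] = ev["ts"]

    complete, dumped = [], 0
    for xid, ts_by_event in by_xid.items():
        if REQUIRED.issubset(ts_by_event):
            complete.append({"xid": xid, **ts_by_event})
        else:
            dumped += len(ts_by_event)
    return complete, dumped
```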
Hi Robert,

thanks for the trace files. Every single file contains log entries like:

line: 00000100:00100000:2.0F:1275433209.964416:0:7924:0:(events.c:285:request_in_callback()) incoming req at ffff8101b32f5400 x1337373369608379 msgsize 296

Are there 684 server logs in the tarball? Or am I wrong with the assumption that lines with an "incoming req@" can only show up on the servers?

Michael
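For reference, a small parser for the "incoming req" lines quoted above might look like the sketch below (field names chosen for the example). Note that the list archive rewrites '@' as ' at ', so the real log lines read "req@ffff...".

```python
import re

# Matches lines such as:
#   ...:(events.c:285:request_in_callback()) incoming req@ffff8101b32f5400 x1337373369608379 msgsize 296
# The leading colon-separated fields are the generic debug-log prefix; only
# the timestamp and (assumed) pid positions are captured here, and they match
# the example lines quoted in this thread.
INCOMING_RE = re.compile(
    r"^[0-9a-f]+:[0-9a-f]+:[\d.F]+:(?P<ts>\d+\.\d+):\d+:(?P<pid>\d+):\d+:"
    r"\(events\.c:\d+:request_in_callback\(\)\)\s+incoming req(?:@| at )(?P<addr>\w+)"
    r"\s+x(?P<xid>\d+)\s+msgsize\s+(?P<size>\d+)"
)

def parse_incoming(line):
    """Return arrival timestamp, logging pid, xid and message size, or None."""
    m = INCOMING_RE.match(line)
    if not m:
        return None
    return {
        "ts": float(m.group("ts")),    # request arrival time on the receiving node
        "pid": int(m.group("pid")),    # pid of the thread that logged the message (assumed)
        "xid": int(m.group("xid")),    # same xid as in the RPCTRACE handling lines
        "msgsize": int(m.group("size")),
    }
```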
On 2010-06-16, at 02:46, Michael Kluge wrote:
> Are there 684 server logs in the tarball? Or am I wrong with the assumption that lines with an "incoming req@" can only show up on the servers?

Clients can receive requests from the server for things like lock cancellations, so this line will appear in both client and server logs. We don't have a test system with 684 servers...

Cheers, Andreas -- Andreas Dilger, Lustre Technical Lead, Oracle Corporation Canada Inc.
Hmm. Ok. Is there a way to figure out what type of file I have if I have nothing but the (most probably incomplete) file itself? Whether it is a client or a server log, and which NID the node had? Right now the code executes a not-so-easy mapping algorithm over all files to figure this out. The way to distinguish between client and server was to look at those lines which I expected to see only on the server (the "incoming req at .....").

Michael

Am Mittwoch, den 16.06.2010, 08:50 -0600 schrieb Andreas Dilger:
> Clients can receive requests from the server for things like lock cancellations, so this line will appear in both client and server logs. We don't have a test system with 684 servers...
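Since the thread does not settle on a definitive way to classify a log, here is one possible heuristic; it is an assumption on my part, not the mapping algorithm discussed above. It looks at which service threads handle RPCs in the file; the thread-name prefixes are assumed to belong only to OSS/MDS/MGS service threads (e.g. "ll_ost_26", as decoded later in this thread) and should be adjusted to whatever names actually appear in your logs.

```python
# Prefixes of server-side ptlrpc service threads (assumed naming; adjust to
# the thread names seen in your own logs).
SERVER_THREAD_PREFIXES = ("ll_ost", "ll_mdt", "ll_mgs", "mdt")

def looks_like_server_log(path, min_hits=10):
    """Heuristically decide whether a debug log was taken on a server.

    Counts "Handling/Handled RPC" lines whose handler thread name carries a
    server service prefix; a handful of hits is taken as 'server'. This is a
    guess, not a definitive classification.
    """
    header = "pname:cluuid+ref:pid:xid:nid:opc"
    hits = 0
    with open(path, errors="replace") as f:
        for line in f:
            if header not in line:
                continue
            rest = line.split(header, 1)[1].split()
            if not rest:
                continue
            pname = rest[0].split(":", 1)[0]
            if pname.startswith(SERVER_THREAD_PREFIXES):
                hits += 1
                if hits >= min_hits:
                    return True
    return False
```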
OK, I found time to do programming again. There are still some issues, but I got a first screenshot with the data from Robert's log.

Michael

Am 17.06.2010 16:35, schrieb Robert Read:
> I don't have the nids handy (I'll try to get that later today), but these are the server logs:
>
> ior-634/hyperion-sun-ost6.ior.log
> ior-634/hyperion-sun-ost7.ior.log
> ior-634/hyperion-sun-ost5.ior.log
> ior-634/hyperion-mds7.ior.log
>
> robert
The remaining problems of the counter calculations have been fixed. There is a screenshot attached showing some values. The code is on a gforge server that we operate here in Dresden (gforge.zih.tu-dresden.de). The converter runs on Linux and on Mac OS X, and you need Vampir to take a look at the OTF trace file.

Eric, Galen, for the moment I think I am done with the stuff I promised you at LUG this year. Are there any more ideas?

Michael
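The "requests in flight" counters discussed in this thread can be derived from the matched RPCs. A minimal sketch, assuming each matched RPC carries a client identifier plus send and completion timestamps (the tuple layout is an assumption for the example):

```python
from collections import defaultdict

def rpcs_in_flight(rpcs):
    """Turn matched RPCs into per-client 'requests in flight' step counters.

    `rpcs` is an iterable of (client, send_ts, done_ts) tuples; the result maps
    each client to a time-sorted list of (timestamp, value) samples, which is
    the shape a trace counter record needs.
    """
    deltas = defaultdict(list)
    for client, send_ts, done_ts in rpcs:
        deltas[client].append((send_ts, +1))   # request leaves the client
        deltas[client].append((done_ts, -1))   # reply completes the RPC

    series = {}
    for client, changes in deltas.items():
        changes.sort()
        value, samples = 0, []
        for ts, delta in changes:
            value += delta
            samples.append((ts, value))
        series[client] = samples
    return series
```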
I think it is better to use the client UUID instead of the NID as the client identification, because in your case you cannot separate info from two clients which run on the same node.

On Jun 22, 2010, at 19:12, Michael Kluge wrote:
> Eric, Galen, for the moment I think I am done with the stuff I promised you at LUG this year. Are there any more ideas?

--------------------------------------
Alexey Lyashkov
alexey.lyashkov at clusterstor.com
Hi Alexey,

thanks for the note. How do I do this if I only have a debug log? I see a field "cluuid+ref" in those lines of the log file that we are interested in. cluuid is probably "client uuid". Is this UUID definitely associated with the node that also owns the NID that appears in the same log line? And what does "+ref" stand for?

Michael

Am 23.06.2010 um 12:29 schrieb Alexey Lyashkov:
> I think it is better to use the client UUID instead of the NID as the client identification, because in your case you cannot separate info from two clients which run on the same node.
Hi Michael,

that is easy to get from the logs:

00000100:00100000:0:1197446488.301126:0:28197:0:(service.c:721:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pid:xid:nid:opc ll_ost_26:lustre-MDT0000-mdtlov_UUID+5:27309:266:12345-0 at lo:400

so:
ll_ost_26 - thread name which handled the request,
lustre-MDT0000-mdtlov_UUID - client which sent the request,
+5 - number of references to that request (not needed for you),
27309 - pid which handled the request (or which sent it, on the client),
266 - xid, the internal request id,
12345-0 - sender NID,
@lo - LND used to deliver the request: @lo is loopback, @tcp is TCP, @o2ib is IB hardware, @ptl is Cray portals,
400 - command; you can look at the "*_cmd_t" enums to decode that command identifier (in this example, 400 is OBD_PING).

As another suggestion, fix the scaling: in your example only a few clients have 100 requests in flight, but the legend at the bottom shows up to 510.

On Jun 23, 2010, at 15:50, Michael Kluge wrote:
> thanks for the note. How do I do this if I only have a debug log? I see a field "cluuid+ref" in those lines of the log file that we are interested in.
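Following Alexey's field breakdown, a parser for the Handling/Handled RPC lines could look roughly like the sketch below. The regex assumes the "@" form of the NID found in real logs, whereas the archive above shows it as " at "; the function and group names are chosen for the example.

```python
import re

HANDLED_RE = re.compile(
    r"^[0-9a-f]+:[0-9a-f]+:[\d.F]+:(?P<ts>\d+\.\d+):\d+:\d+:\d+:"
    r"\(service\.c:\d+:ptlrpc_server_handle_request\(\)\)\s+"
    r"(?P<state>Handling|Handled) RPC pname:cluuid\+ref:pid:xid:nid:opc\s+"
    r"(?P<pname>[^:]+):(?P<cluuid>.+?)\+(?P<ref>\d+):(?P<pid>\d+):"
    r"x?(?P<xid>\d+):(?P<nid>\S+?)@(?P<lnd>\w+):(?P<opc>\d+)"
    r"(?:\s+Request proces+ed in (?P<svc_us>\d+)us \((?P<total_us>\d+)us total\))?"
)

def parse_handled(line):
    """Extract the fields Alexey describes: thread name, client UUID, ref
    count, pid, xid, NID, LND and opcode, plus the service times that appear
    on the 'Handled' line."""
    m = HANDLED_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    for key in ("ref", "pid", "xid", "opc"):
        d[key] = int(d[key])
    d["ts"] = float(d["ts"])
    if d["total_us"] is not None:            # only present on the "Handled" line
        d["svc_us"], d["total_us"] = int(d["svc_us"]), int(d["total_us"])
    return d
```

Applied to the example line above (with the "@" restored), this would yield pname "ll_ost_26", xid 266, nid "12345-0", lnd "lo" and opc 400, which the *_cmd_t enums decode to OBD_PING.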
> 00000100:00100000:0:1197446488.301126:0:28197:0:(service.c:721:ptlrpc_server_handle_request()) Handled RPC pname:cluuid+ref:pi > d:xid:nid:opc ll_ost_26:lustre-MDT0000-mdtlov_UUID+5:27309:266:12345-0 at lo:400 > >> > so.. > ll_ost_26 - thread name, which handle request, > lustre-MDT0000-mdtlov_UUID - client which send request. > +5 - number references to that request (unneed for you). > 27309 - pid which handle request (or which send on client) > 266 - xid - internal request id. > 12345-0 - sender NID > @lo - LND used to deliver request = @lo -> loopback, tcp -> tcp, @o2ib, - IB hardware, @ptl - cray portals. > 400 - command, you can look to enums "*_cmd_t " to decode that command identifier (in that example - 400 OBD_PING)OK, thanks. I''ll need to think about this as it might make sense to have one line per thread and not per UUID? Or is there already a 1:1 mapping?> as other suggestions fix scaling, at you example only few clients has 100 requests in flight, but legend on bottom show up to 510.Yes, did that while you were writing the mail ;) Definitely looks much better. Michael> > > On Jun 23, 2010, at 15:50, Michael Kluge wrote: > >> Hi Alexey, >> >> thanks for the note. How do I do this if I only have a debug log? I see a field "cluuid+ref" in those lines of the log file that we are interested in. cluuid is probably "client uuid". This UUID is definitely associated to the node, that also owns the NID that appears in the same log line? What does "+ref" stands for? >> >> >> Michael >> >> Am 23.06.2010 um 12:29 schrieb Alexey Lyashkov: >> >>> I think better to use client UUID instead of NID as client identification. Because in your''s case - you can''t separate info from two clients which run on same node. >>> >>> >>> On Jun 22, 2010, at 19:12, Michael Kluge wrote: >>> >>>> The remaining problems of the counter calculations have been fixed. There is a screenshot attached showing some values. The code is in a gforge server that we operate here in Dresden (gforge.zih.tu-dresden.de). The converter runs on Linux and on MAC OS X and you need Vampir to take a look at the OTF trace file. >>>> >>>> Eric, Galen, for the moment I think I am done with the stuff I promised you at LUG this year? Are there any more ideas? >>>> >>>> >>>> Michael >>>> >>>> >>>> >>>> <lustre3.png> >>>> >>>> >>>> >>>> >>>> http://lists.lustre.org/mailman/listinfo/lustre-devel >>> >>> >>> >>> -------------------------------------- >>> Alexey Lyashkov >>> alexey.lyashkov at clusterstor.com >>> >>> >>> >>> >> >> >> -- >> >> Michael Kluge, M.Sc. >> >> Technische Universit?t Dresden >> Center for Information Services and >> High Performance Computing (ZIH) >> D-01062 Dresden >> Germany >> >> Contact: >> Willersbau, Room WIL A 208 >> Phone: (+49) 351 463-34217 >> Fax: (+49) 351 463-37773 >> e-mail: michael.kluge at tu-dresden.de >> WWW: http://www.tu-dresden.de/zih >> >-- Michael Kluge, M.Sc. Technische Universit?t Dresden Center for Information Services and High Performance Computing (ZIH) D-01062 Dresden Germany Contact: Willersbau, Room WIL A 208 Phone: (+49) 351 463-34217 Fax: (+49) 351 463-37773 e-mail: michael.kluge at tu-dresden.de WWW: http://www.tu-dresden.de/zih -------------- next part -------------- An HTML attachment was scrubbed... URL: http://lists.lustre.org/pipermail/lustre-devel/attachments/20100623/c0c14636/attachment-0001.html -------------- next part -------------- A non-text attachment was scrubbed... 
On 2010-06-23, at 05:50, Michael Kluge wrote:
> thanks for the note. How do I do this if I only have a debug log? I see a field "cluuid+ref" in those lines of the log file that we are interested in. cluuid is probably the "client uuid". Is this UUID definitely associated with the node that also owns the NID that appears in the same log line? What does "+ref" stand for?

While Alexey's comment is correct, in that there can be multiple Lustre client mounts on a single node, this is generally only used for testing. For client connections the UUID is just a random value like "f4726b1e-f7eb-479f-b163-784ea45adf32", so using the NID is much more useful to the viewer in the vast majority of cases.

If you wanted to distinguish the multiple mounts from a single client, it would be best to do this internally by tracking both the NID and the UUID, but only printing the NID in the output. For the rare case where requests have the same NID but a different UUID, you could show this as "NID:2", "NID:3" or similar. That preserves the distinction between client connections, while not making the output completely useless.

Even better than plain numeric NIDs would be to do IP->hostname conversion for the case of the TCP and IB LNDs, if this works.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
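A small sketch of the naming scheme suggested here, in Python; the helper name and data structures are illustrative, not part of rpc_parse or the converter.

# Track (NID, client UUID) pairs internally, but label clients by NID alone,
# appending ":2", ":3", ... only when a second mount with a different UUID
# turns up on the same NID.
_labels = {}          # (nid, cluuid) -> display label
_mounts_per_nid = {}  # nid -> number of distinct UUIDs seen so far

def client_label(nid, cluuid):
    key = (nid, cluuid)
    if key not in _labels:
        count = _mounts_per_nid.get(nid, 0) + 1
        _mounts_per_nid[nid] = count
        _labels[key] = nid if count == 1 else "%s:%d" % (nid, count)
    return _labels[key]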
> While Alexey's comment is correct, in that there can be multiple Lustre client mounts on a single node, this is generally only used for testing. For client connections the UUID is just a random value like "f4726b1e-f7eb-479f-b163-784ea45adf32", so using the NID is much more useful to the viewer in the vast majority of cases.

Well, what I can do with OTF and Vampir is to use a process hierarchy: for each NID I can have a couple of UUIDs as 'child processes', the way one looks at hybrid MPI & OpenMP programs, where we have MPI processes and treat OpenMP threads as child processes.

> If you wanted to distinguish the multiple mounts from a single client, it would be best to do this internally by tracking both the NID and the UUID, but only printing the NID in the output. For the rare case where requests have the same NID but a different UUID, you could show this as "NID:2", "NID:3" or similar. That preserves the distinction between client connections, while not making the output completely useless.

Yes, that fits into the statement above.

> Even better than plain numeric NIDs would be to do IP->hostname conversion for the case of the TCP and IB LNDs, if this works.

Well, typically I only have the debug log, which might be incomplete. I don't think I have anything that can do this conversion reliably.

Michael
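A similarly small sketch of the two-level grouping described here (NID as parent, client UUIDs as children). The actual OTF/Vampir process definitions are left out, and the function names are only illustrative.

from collections import OrderedDict

# nid -> list of client UUIDs, in order of first appearance
_tree = OrderedDict()

def note_client(nid, cluuid):
    kids = _tree.setdefault(nid, [])
    if cluuid not in kids:
        kids.append(cluuid)

def define_processes():
    # Return the (parent, child) pairs a converter would register:
    # one top-level process per NID, one child per mount/UUID.
    pairs = []
    for nid, kids in _tree.items():
        pairs.append((None, nid))
        pairs.extend((nid, uuid) for uuid in kids)
    return pairs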
Hi Robert,

could you please send me the command for generating such a log?

lctl debug_daemon start /tmp/$HOSTNAME
sysctl -w lnet.debug=+rpctrace

still produces some kind of binary format (at least with my 1.8.3) which looks different from what you sent me.

Regards, Michael

>> Right now I do not have big clusters available to generate the trace.
>> I will see what I can do here.
>
> If ~600 clients is big enough, we could generate that on Hyperion.
>
> robert
>
>> Thanks
>> WangDi
Rahul Deshmukh
2010-Sep-22 13:57 UTC
[Lustre-discuss] [Lustre-devel] Lustre RPC visualization
Michael Kluge wrote:
> Hi Robert,
>
> could you please send me the command for generating such a log?
>
> lctl debug_daemon start /tmp/$HOSTNAME
> sysctl -w lnet.debug=+rpctrace

I think this command will help you:

lctl debug_file /tmp/$HOSTNAME > /tmp/${HOSTNAME}_readable

> Still produces some kind of binary format (at least with my 1.8.3) which
> looks different from what you sent me.
>
> Regards, Michael

Regards,
Rahul.
On 2010-09-22, at 06:46, Michael Kluge wrote:
> could you please send me the command for generating such a log?
>
> lctl debug_daemon start /tmp/$HOSTNAME
> sysctl -w lnet.debug=+rpctrace
>
> Still produces some kind of binary format (at least with my 1.8.3) which looks different from what you sent me.

There are a few different ways to generate a Lustre debug log:
- "lctl dk [filename]" (short for "lctl debug_kernel [filename]") dumps the current debug buffer to [filename], or to stdout if unspecified. It decodes the binary debug log into ASCII at that time.
- "lctl debug_daemon start {filename}" starts writing the binary kernel debug log to {filename} as the debug pages are filled.
- In case of an LBUG/LASSERT, the binary debug log is also dumped to the file /tmp/lustre.log.{timestamp} (unless specified otherwise).

In the last two cases it is necessary to run "lctl debug_file {infile} [outfile]", which decodes the binary input file into the ASCII output file. This should preferably be run on the system where the binary file was generated, though generally the binary format only differs between systems of different endianness.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
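A rough sketch of that collect-and-decode cycle, wrapped in Python only for scripting convenience; the file paths and the "debug_daemon stop" line are assumptions based on this thread rather than verified syntax.

import socket
import subprocess

host = socket.gethostname()
raw = "/tmp/%s.dbg" % host    # binary debug log written by the debug daemon (assumed path)
txt = "/tmp/%s.txt" % host    # decoded ASCII log for the parser/converter (assumed path)

def run(cmd):
    subprocess.check_call(cmd, shell=True)

run("sysctl -w lnet.debug=+rpctrace")       # add rpctrace records to the debug mask
run("lctl debug_daemon start %s" % raw)     # stream debug pages to the binary file
# ... run the I/O test to be visualised here ...
run("lctl debug_daemon stop")               # assumed syntax for stopping the daemon
run("lctl debug_file %s %s" % (raw, txt))   # decode the binary log into ASCII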