I've been working off and on at making a USDT provider for the MPI
Peruse hooks in the Open MPI implementation of the MPI library. I've
included the probe definitions below. One thing that bothers me is that I am
passing "op" as an argument to all the probes even though it is already
contained in the comm_spec_t structure. I did this so I could easily write
predicates on op to filter on Send- and Recv-based events. Other than that, the
probes match the MPI Peruse hooks.

probe name                                          args[0]       args[1]  args[2]  args[3]
mpi_peruse*:::PERUSE_COMM_REQ_ACTIVATE              conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_MATCH_UNEX            conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_INSERT_IN_POSTED_Q    conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q  conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_BEGIN            conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_CONTINUE         conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_END              conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_COMPLETE              conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_NOTIFY                conninfo_t *  id       op       comm_spec_t *

mpi_peruse*:::PERUSE_COMM_MSG_ARRIVED               conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_INSERT_IN_UNEX_Q      conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q    conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_MATCH_POSTED_REQ      conninfo_t *  id       op       comm_spec_t *

mpi_peruse*:::PERUSE_COMM_SEARCH_POSTED_Q_BEGIN     conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_POSTED_Q_END       conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_UNEX_Q_BEGIN       conninfo_t *  id       op       comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_UNEX_Q_END         conninfo_t *  id       op       comm_spec_t *

typedef struct conninfo {
    string ci_local;      /* local host address */
    string ci_remote;     /* remote host address */
    string ci_protocol;   /* protocol (ipv4, ipv6, etc) */
} conninfo_t;

int id;   /* unique communications request id */
int op;   /* operation identifier SEND=0, RECV=1 */

typedef struct comm_spec {
    uintptr_t comm;
    uintptr_t buf;
    int count;
    uintptr_t datatype;
    int peer;
    int tag;
    int operation;
} comm_spec_t;

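
One way to keep the separate op argument from drifting out of sync with comm_spec_t is to derive it from the structure at the point where the probe fires. The C sketch below is only an illustration: fire_comm_probe() and probe_stub() are hypothetical names, with probe_stub() standing in for whatever macro the generated USDT header would provide, and the conninfo_t argument is left out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C-side mirror of the comm_spec_t layout listed above. */
typedef struct comm_spec {
    uintptr_t comm;
    uintptr_t buf;
    int count;
    uintptr_t datatype;
    int peer;
    int tag;
    int operation;   /* SEND=0, RECV=1 */
} comm_spec_t;

/* Hypothetical record of the last firing, so the sketch can be checked. */
static struct {
    int id;
    int op;
    const comm_spec_t *spec;
} last_fired;

/* Stand-in for a generated probe macro (an assumption, not a real API). */
static void probe_stub(int id, int op, const comm_spec_t *spec)
{
    last_fired.id = id;
    last_fired.op = op;
    last_fired.spec = spec;
}

/* Take op from the structure so the two values cannot disagree. */
static void fire_comm_probe(int id, const comm_spec_t *spec)
{
    assert(spec != NULL);
    probe_stub(id, spec->operation, spec);
}
```

With this shape, a predicate on the op argument and a dereference of the operation member always see the same value.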
In addition to the above, I am looking at making some Open MPI specific probes
that dump out communicator group information during creation (to determine
where each rank resides), along with some datatype information.
I would also like to add a message sequence number to the above probes so I can
match messages from sender to receiver exactly. However, that would make the
probes incompatible with other MPI implementations, since the sequence numbers
would be Open MPI specific. On top of that, I am running into a snag
getting the sequence numbers without breaking the Peruse macros Open MPI uses.
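
To illustrate why a sequence number would help: without one, two messages sent between the same pair of ranks with the same tag on the same communicator cannot be told apart in the trace output. The C sketch below assumes a hypothetical msg_event_t record and events_match() helper for post-processing trace output; these names, the seq field, and the comm_id field are illustrative and are not part of Peruse or Open MPI.

```c
#include <stdbool.h>
#include <stdint.h>

enum { OP_SEND = 0, OP_RECV = 1 };

/* Hypothetical record built from one probe firing. */
typedef struct msg_event {
    int rank;        /* rank on which the probe fired */
    int peer;        /* rank of the other side */
    int tag;
    int comm_id;     /* assumed id comparable across ranks; raw comm
                        pointers are only meaningful within one process */
    int operation;   /* OP_SEND or OP_RECV */
    uint32_t seq;    /* hypothetical per-message sequence number */
} msg_event_t;

/* A send pairs with a receive only if the endpoints, tag, communicator
 * and sequence number all agree. Wildcard receives are not handled. */
static bool events_match(const msg_event_t *send, const msg_event_t *recv)
{
    return send->operation == OP_SEND && recv->operation == OP_RECV &&
           send->peer == recv->rank && recv->peer == send->rank &&
           send->tag == recv->tag && send->comm_id == recv->comm_id &&
           send->seq == recv->seq;
}
```

Dropping the seq comparison would make both of two back-to-back sends with the same tag match the same receive, which is the ambiguity the sequence number is meant to remove.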
--td
--
This message posted from opensolaris.org
Hey Terry,

At some point we really need to get our collective act together and write up a
DTrace provider naming guide, but absent that, here are a few comments:

We tend to prefer dashes to underscores, so I think the name of the provider
should be mpi-peruse (mpi__peruse in your D file). It seems redundant to have
the PERUSE_COMM prefix; would it make sense to name probes like this:
mpi-peruse:::req-activate?

On Wed, Aug 15, 2007 at 05:23:59AM -0700, Terry Dontje wrote:
> struct comm_spec {
>     uintptr_t comm;
>     uintptr_t buf;
>     int count;
>     uintptr_t datatype;
>     int peer;
>     int tag;
>     int operation;
> } comm_spec_t;

Can you explain what these members represent, and could you explain the
semantics of the various probes?

Also, I'm doing some work with the NFS team on an nfsv4 provider that should
make it easier to use the common conninfo_t that we're using in the iSCSI
target provider today.

Thanks.

Adam

--
Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl

Hi Adam,
Thanks for the comments. Here are my answers:
1. I'll rename mpi_peruse to mpi-peruse. I forget why I did away
with the "-".
2. Removing the PERUSE_COMM prefix from the probes is going to be next to
impossible. The reason is that I am building on top of the MPI Peruse
framework within Open MPI, which uses macros. I am placing the probes inside
the macros and using one of the macro parameters to form the probe names.
3. The definitions of the members of the comm_spec structure follow. Note
that this was adopted from the MPI Peruse spec.
> struct comm_spec {
>     uintptr_t comm;      /* pointer to the MPI communicator the
>                             communication is on */
>     uintptr_t buf;       /* pointer to the buffer being sent */
>     int count;           /* number of datatypes in the buffer to be
>                             sent/recv'd */
>     uintptr_t datatype;  /* pointer to the MPI datatype handle, which
>                             describes the data layout */
>     int peer;            /* rank of the peer being sent to or
>                             received from */
>     int tag;             /* tag of the message being sent */
>     int operation;       /* type of operation in progress when the
>                             event was triggered (Send=0, Recv=1) */
> } comm_spec_t;
4. The semantics of the probes are basically the same as described in the MPI
Peruse Spec (http://www.mpi-peruse.org/current_peruse_spec.pdf) in chapter 4,
pages 15 through 19. Each event that can be triggered by Peruse is probed.
The difference is that we use the DTrace mechanism to turn the probes on and
off instead of Peruse's way of registering callbacks for events.
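
The constraint described in point 2 can be sketched in C. Assuming the Peruse macros receive the event name as a bare token, the preprocessor can paste that token onto a fixed prefix to form the name of the probe macro, which is why the probe names end up carrying the full PERUSE_COMM_ prefix. PERUSE_TRACE_EVENT, last_probe, probe_fired() and the MPI_PERUSE_... stubs below are hypothetical stand-ins, not the actual Open MPI macros or generated USDT header.

```c
#include <stdbool.h>
#include <string.h>

/* Name of the last probe fired; stands in for a real probe site. */
static const char *last_probe = "";

/* Stubs in place of the macros a USDT header would provide (assumed names). */
#define MPI_PERUSE_PERUSE_COMM_REQ_ACTIVATE(id, op) \
    (last_probe = "PERUSE_COMM_REQ_ACTIVATE", (void)(id), (void)(op))
#define MPI_PERUSE_PERUSE_COMM_REQ_COMPLETE(id, op) \
    (last_probe = "PERUSE_COMM_REQ_COMPLETE", (void)(id), (void)(op))

/* Hypothetical Peruse-style macro: the event token is pasted into the
 * probe macro name, so the probe name is fixed by the token the existing
 * framework already passes around. */
#define PERUSE_TRACE_EVENT(event, id, op) MPI_PERUSE_##event((id), (op))

/* Helper for checking which probe fired last. */
static bool probe_fired(const char *name)
{
    return strcmp(last_probe, name) == 0;
}
```

Dropping the prefix would require either changing the tokens used throughout the Peruse macros or adding a second mapping layer from event token to probe name.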
Adam, I am interested in anything you come up with alongside the NFS engineers
regarding a common conninfo_t. One silly question: I am assuming the conninfo_t
protocol member should be set to "mpi" in my probes, right?
--td

On Wed, Aug 22, 2007 at 05:10:34AM -0700, Terry Dontje wrote:
> Adam I am interested in anything you come up with the NFS engineers
> regarding a common conninfo_t. One silly question I have is I am
> assuming the conninfo_t protocol member should be set to "mpi" in my
> probes, right?

The protocol should be the actual transport (e.g. "ipv4"). "mpi" is implied by
the provider (should it be the 'mpi' provider?).

Adam

--
Adam Leventhal, Solaris Kernel Development http://blogs.sun.com/ahl

> The protocol should be the actual transport (e.g. "ipv4"). "mpi" is
> implied by the provider (should it be the 'mpi' provider?).
>
> Adam

I was thinking that might have been what you meant. However, my probes might
sit too high up to determine which protocol the message will be going over,
and in certain cases a message can be striped over multiple interfaces and
protocols (uDAPL and tcp). Anyway, I'll see if I can cook something up that
allows me to specify which protocol will be used.

--td