I've been playing around off and on with making a USDT provider for the MPI Peruse hooks in the Open MPI implementation of the MPI library. I've included the probe definitions below. One thing that bothers me is that I am including "op" as a parameter to all the probes even though it is contained in the comm_spec_t structure. I did this so I could easily write predicates on op to filter on Send- and Recv-based events. Other than that, the probes match the MPI Peruse hooks.

probe name
    args[0]        args[1]   args[2]   args[3]

mpi_peruse*:::PERUSE_COMM_REQ_ACTIVATE
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_MATCH_UNEX
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_INSERT_IN_POSTED_Q
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_BEGIN
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_CONTINUE
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_XFER_END
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_COMPLETE
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_REQ_NOTIFY
    conninfo_t *   id        op        comm_spec_t *

mpi_peruse*:::PERUSE_COMM_MSG_ARRIVED
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_INSERT_IN_UNEX_Q
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_MSG_MATCH_POSTED_REQ
    conninfo_t *   id        op        comm_spec_t *

mpi_peruse*:::PERUSE_COMM_SEARCH_POSTED_Q_BEGIN
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_POSTED_Q_END
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_UNEX_Q_BEGIN
    conninfo_t *   id        op        comm_spec_t *
mpi_peruse*:::PERUSE_COMM_SEARCH_UNEX_Q_END
    conninfo_t *   id        op        comm_spec_t *

typedef struct conninfo {
    string ci_local;       /* local host address */
    string ci_remote;      /* remote host address */
    string ci_protocol;    /* protocol (ipv4, ipv6, etc) */
} conninfo_t;

int id;    /* unique communications request id */

int op;    /* operation identifier SEND=0, RECV=1 */

typedef struct comm_spec {
    uintptr_t comm;
    uintptr_t buf;
    int count;
    uintptr_t datatype;
    int peer;
    int tag;
    int operation;
} comm_spec_t;

In addition to the above, I am looking at making some Open MPI-specific probes to dump out communicator group information (to determine where each rank resides) during creation, plus some datatype information.

I would also like to add a message sequence number to the above probes so I can match messages from sender to receiver exactly. However, that would make the probes incompatible with other MPI implementations, since the sequence numbers would be Open MPI specific. On top of that, I am running into a snag getting the sequence numbers without breaking the Peruse macros Open MPI uses.

--td

--
This message posted from opensolaris.org
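The argument layout described in the post above can be sketched in C. This is only an illustration under stated assumptions: `probe_args_t`, `make_probe_args()` and `is_send_event()` are hypothetical names that do not exist in Open MPI, Peruse or DTrace, and `conninfo_t` uses plain C strings in place of D's `string` type. The point it shows is that `op` is simply a copy of `comm_spec_t.operation` hoisted to `args[2]`, so a filter can test it without dereferencing the structure.

```c
#include <stdint.h>
#include <stddef.h>

/* Values taken from the post: SEND=0, RECV=1. */
enum { OP_SEND = 0, OP_RECV = 1 };

/* C stand-in for the D-side conninfo_t (assumption: C strings). */
typedef struct conninfo {
    const char *ci_local;     /* local host address */
    const char *ci_remote;    /* remote host address */
    const char *ci_protocol;  /* protocol (ipv4, ipv6, etc) */
} conninfo_t;

typedef struct comm_spec {
    uintptr_t comm;
    uintptr_t buf;
    int count;
    uintptr_t datatype;
    int peer;
    int tag;
    int operation;
} comm_spec_t;

/* Hypothetical holder mirroring args[0]..args[3] of every probe. */
typedef struct probe_args {
    const conninfo_t *conn;   /* args[0] */
    int id;                   /* args[1] */
    int op;                   /* args[2]: duplicate of spec->operation */
    const comm_spec_t *spec;  /* args[3] */
} probe_args_t;

/* Build the four arguments; op is copied out of the spec. */
probe_args_t make_probe_args(const conninfo_t *conn, int id,
                             const comm_spec_t *spec)
{
    probe_args_t a;
    a.conn = conn;
    a.id = id;
    a.op = spec->operation;
    a.spec = spec;
    return a;
}

/* C analogue of filtering on args[2] for send events. */
int is_send_event(const probe_args_t *a)
{
    return a->op == OP_SEND;
}
```

The duplication costs one extra integer per probe firing; what it buys is a filter that never has to read the structure behind `args[3]`.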
Hey Terry,

At some point we really need to get our collective act together and write up a DTrace provider naming guide, but absent that here are a few comments:

We tend to prefer dashes to underscores, so I think the name of the provider should be mpi-peruse (mpi__peruse in your D file). It seems redundant to have the PERUSE_COMM prefix; would it make sense to name probes like this: mpi-peruse:::req-activate?

On Wed, Aug 15, 2007 at 05:23:59AM -0700, Terry Dontje wrote:
> typedef struct comm_spec {
>     uintptr_t comm;
>     uintptr_t buf;
>     int count;
>     uintptr_t datatype;
>     int peer;
>     int tag;
>     int operation;
> } comm_spec_t;

Can you explain what these members represent, and could you explain the semantics of the various probes?

Also, I'm doing some work with the NFS team on an nfsv4 provider that should make it easier to use the common conninfo_t that we're using in the iSCSI target provider today.

Thanks.

Adam

--
Adam Leventhal, Solaris Kernel Development       http://blogs.sun.com/ahl
_______________________________________________
dtrace-discuss mailing list
dtrace-discuss at opensolaris.org
Hi Adam,

Thanks for the comments. Here are the answers to them:

1. I'll rename mpi_peruse to mpi-peruse. I forget why I did away with the "-".

2. Removing the PERUSE_COMM prefix from the probes is going to be next to impossible. The reason is that I am building on top of the MPI Peruse framework within Open MPI, which uses macros. I am placing the probes inside the macros and using one of the macro parameters for the probe names.

3. The definitions of the members of the comm_spec structure follow. Note that this was adopted from the MPI Peruse spec.

> typedef struct comm_spec {
>     uintptr_t comm;      /* pointer to the MPI communicator the communication is on */
>     uintptr_t buf;       /* pointer to the buffer being sent */
>     int count;           /* number of datatypes in the buffer to be sent/received */
>     uintptr_t datatype;  /* pointer to the MPI datatype handle, i.e. how the data layout looks */
>     int peer;            /* rank of the peer being sent to or received from */
>     int tag;             /* tag of the message being sent */
>     int operation;       /* type of operation being done when the event was triggered (Send=0, Recv=1) */
> } comm_spec_t;

4. The semantics of the probes are basically the same as described in the MPI Peruse spec (http://www.mpi-peruse.org/current_peruse_spec.pdf), chapter 4, pages 15 through 19. Each event that can be triggered by Peruse is probed. The difference is that we use the DTrace mechanism to turn the probes on and off instead of Peruse's way of registering callbacks for events.

Adam, I am interested in anything you come up with together with the NFS engineers regarding a common conninfo_t. One silly question: I am assuming the conninfo_t protocol member should be set to "mpi" in my probes, right?

--td
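Point 2 above can be illustrated with a minimal C sketch of a macro that derives the probe name from one of its own parameters. Everything here is an assumption for illustration: `PERUSE_TRACE_SKETCH`, the `probe_*` stub functions, the bookkeeping variables and the event values are hypothetical, and the stubs stand in for real USDT probe sites. It is not Open MPI's actual Peruse macro. What it shows is that the event token is pasted into the probe name at compile time, so the probe names follow whatever the framework's callers already pass in.

```c
#include <stddef.h>

/* Hypothetical event ids; the real ones come from the Peruse headers. */
enum {
    PERUSE_COMM_REQ_ACTIVATE,
    PERUSE_COMM_REQ_COMPLETE,
    SKETCH_EVENT_COUNT
};

/* Bookkeeping so the effect of the macro can be observed. */
const char *last_probe = NULL;
int last_op = -1;
int events_seen[SKETCH_EVENT_COUNT];

/* Stubs standing in for USDT probe sites, one per event name. */
void probe_PERUSE_COMM_REQ_ACTIVATE(int id, int op)
{
    (void)id;
    last_probe = "PERUSE_COMM_REQ_ACTIVATE";
    last_op = op;
}

void probe_PERUSE_COMM_REQ_COMPLETE(int id, int op)
{
    (void)id;
    last_probe = "PERUSE_COMM_REQ_COMPLETE";
    last_op = op;
}

/*
 * The event parameter is used twice: as a value (the framework's own
 * event handling, reduced here to a counter) and as a token pasted
 * onto "probe_" to select the probe site.
 */
#define PERUSE_TRACE_SKETCH(event, id, op)   \
    do {                                     \
        events_seen[(event)]++;              \
        probe_##event((id), (op));           \
    } while (0)
```

A call such as `PERUSE_TRACE_SKETCH(PERUSE_COMM_REQ_ACTIVATE, 1, 0);` expands to a call to `probe_PERUSE_COMM_REQ_ACTIVATE(1, 0)`. In this sketch, dropping the PERUSE_COMM prefix from the probe name would require either a second parameter or a per-event mapping, since the preprocessor cannot strip part of a token.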
On Wed, Aug 22, 2007 at 05:10:34AM -0700, Terry Dontje wrote:
> Adam I am interested in anything you come up with the NFS engineers
> regarding a common conninfo_t. One silly question I have is I am
> assuming the conninfo_t protocol member should be set to "mpi" in my
> probes, right?

The protocol should be the actual transport (e.g. "ipv4"). "mpi" is implied by the provider (should it be the 'mpi' provider?).

Adam
> The protocol should be the actual transport (e.g. "ipv4"). "mpi" is
> implied by the provider (should it be the 'mpi' provider?).
>
> Adam

I was thinking that might have been what you meant. However, my probes might sit at too high a level to determine which protocol the message will be going over, and in certain cases a message can be striped over multiple interfaces and protocols (uDAPL and TCP). Anyway, I'll see if I can cook something up that allows me to specify which protocol will be used.

--td