Gordon Marler
2008-Feb-27 23:19 UTC
[dtrace-discuss] Mapping device major, minor numbers to the related link under /dev/[r]dsk/
When using DTrace to track I/Os to disks under the sd/ssd driver, it's
easy enough to get the major/minor device number of the disk being targeted for
I/O:
fbt:ssd:ssdstrategy:entry,
fbt:sd:sdstrategy:entry
{
    io_start[(struct buf *)arg0] = timestamp;
}

fbt:ssd:ssdintr:entry,
fbt:sd:sdintr:entry
/ io_start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
    this->un = ((struct sd_xbuf *) this->buf->b_private)->xb_un;
    this->major = getmajor(this->buf->b_edev);
    this->minor = getminor(this->buf->b_edev);

    /* convert ns to ms */
    @max_ms[this->major, this->minor] =
        max((timestamp - io_start[this->buf]) / 1000000);

    io_start[this->buf] = 0;
}
We can even get device names in the form of 'ssdXXX', which
unfortunately aren't that helpful.
How can we map major/minor numbers or device names to their /dev/[r]dsk/...
symlinks in DTrace? That would make the output more intuitive, at least to us.
Gordon Marler
gmarler at gmarler.com
--
This message posted from opensolaris.org
Roch - PAE
2008-Mar-05 00:04 UTC
[dtrace-discuss] Mapping device major, minor numbers to the related link under /dev/[r]dsk/
Gordon Marler writes:
> How can we map major/minor numbers or device names to their
> /dev/[r]dsk/... symlinks in DTrace? That would make the output more
> intuitive, at least to us.
Here is an example script where I got this working:
#!/usr/bin/ksh
#
# iolat - print avg latency and distribution of each disk
#
# This script prints the average latency of each disk over the length of the
# run. With -d, the results include an lquantize() distribution of the
# latency. The script runs until interrupted. Results are in usec (actually
# nsec >> 10, i.e. nsec / 1024).
#
# If you get "DIF program exceeds maximum program size"
# Try : echo dtrace_dof_maxsize/Z1000000 | mdb -kw
#
opt_dist=0; min=0; max=50000; bin=5000;
### process options
while getopts dh name
do
    case $name in
    d)   opt_dist=1 ;;
    h|?) cat <<-END >&2
USAGE: iolat [-d [min max bucket]]
       -d               # report per disk latency distribution.
       min max bucket   # arguments to dtrace lquantize in usec.
                        # Default: 0, 50000, 5000.
END
         exit 1
    esac
done
shift $(( $OPTIND - 1 ))
### option logic
if [[ ! "$1" == "" ]]; then
    min=$1; shift
fi
if [[ ! "$1" == "" ]]; then
    max=$1; shift
fi
if [[ ! "$1" == "" ]]; then
    bin=$1; shift
fi
# Build a string to be parsed by dtrace.
# It contains entries of the form
#     disks[major, minor] = "c0t0d0s0";
#
disks_arr_string=`/bin/ls -1lL /dev/dsk/* 2>/dev/null |
    /usr/bin/nawk -F'[ ,\t/]*' '{print "disks[" $5 ", " $6 "] = \"" $12 "\";"}'`
if [[ $opt_dist == 1 ]]; then
    quantize_gather="@d[this->disk] = lquantize(this->lat, $min, $max, $bin);"
    quantize_print="printf(\"\n\nLatency Distribution:\n\n\");
        printa(\"%20s %@10d\n\", @d);"
else
    quantize_gather=""
    quantize_print=""
fi
script_str='
BEGIN {
    '$disks_arr_string';
    printf("Ctrl-C to terminate and get results\n");
}

io:::start
{
    started[args[0]->b_edev, args[0]->b_blkno] = timestamp;
}

io:::done
/started[args[0]->b_edev, args[0]->b_blkno] != 0/
{
    this->major = args[1]->dev_major;
    this->minor = args[1]->dev_minor;
    this->disk = (disks[this->major, this->minor] == "" ?
        "unknown" : disks[this->major, this->minor]);
    this->lat = (timestamp - started[args[0]->b_edev, args[0]->b_blkno]) >> 10;

    @a[this->disk] = avg(this->lat);    /* First is sort key */
    @b[this->disk] = count();
    @c[this->disk] = sum(args[0]->b_bcount);
    '$quantize_gather'

    started[args[0]->b_edev, args[0]->b_blkno] = 0;
}

END {
    printf("%20s %10s %10s %10s\n", "Disk", "Avg Lat(us)", "IO cnt", "Bytes");
    printf("%20s %10s %10s %10s\n", "----", "----------", "------", "-----");
    printa("%20s %@10d %@10d %@10d\n", @a, @b, @c);
    '$quantize_print'
}
'
/usr/bin/pfexec /usr/sbin/dtrace -qn "$script_str"
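
The key step in the script above is turning each line of "ls -lL /dev/dsk/*" into one D assignment. The sketch below runs the same print statement over a single sample line so the field numbering is visible. The sample line, including the major/minor pair 32, 0 and the device name, is made up for illustration. It also swaps in plain awk with a "+" quantifier instead of nawk with "*", an assumption made only so it runs on systems without nawk.

```sh
#!/bin/sh
# Hypothetical line of `ls -lL /dev/dsk/*` output; the major/minor pair
# (32, 0) and the device name are invented for this illustration.
sample='brw-r-----   1 root     sys       32,  0 Feb 27 23:19 /dev/dsk/c0t0d0s0'

# Split on runs of space, comma, tab and slash. With that separator:
#   $5  = major number
#   $6  = minor number
#   $12 = last path component (after "dev" and "dsk")
entry=$(printf '%s\n' "$sample" |
    awk -F'[ ,\t/]+' '{print "disks[" $5 ", " $6 "] = \"" $12 "\";"}')

# Print the generated D statement that would be spliced into the BEGIN clause.
printf '%s\n' "$entry"
```

Because the slash is part of the field separator, the device name is always the last field only as long as the date columns keep the same shape; a listing that shows a year instead of a time still has three date fields, so $12 should still be the name, but that is worth checking on the target system.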
Gordon Marler
2008-Mar-12 18:44 UTC
[dtrace-discuss] Mapping device major, minor numbers to the related link under /dev/[r]dsk/
Very nice example. Thanks for the follow-up. It seems to work well; I just had to bump dtrace_dof_maxsize up to 512K with mdb to get it to work on Solaris 10.
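
For anyone repeating the 512K bump, here is a minimal sketch that only builds and prints the mdb command line, following the pattern from the script's header comment. It assumes mdb reads numbers as hexadecimal by default (hence the explicit 0x prefix) and that /Z, which writes 8 bytes, matches the size of the variable on a 64-bit kernel. The variable names are hypothetical. Nothing here is executed against the kernel.

```sh
#!/bin/sh
# Desired DOF size limit: 512K, expressed in bytes.
size_bytes=$((512 * 1024))

# Build the command in the same form as the script's header comment.
# mdb is assumed to parse the value as hex, so format it that way.
cmd=$(printf 'echo dtrace_dof_maxsize/Z0x%x | mdb -kw' "$size_bytes")

# Show the command; run it by hand as root only after reviewing it.
printf '%s\n' "$cmd"
```

A value written this way with mdb -kw changes the running kernel only, so it would need to be reapplied after a reboot.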