Hi Gurus,

I am using the following DTrace script to collect disk I/O information:

#!/usr/sbin/dtrace -qCs

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

fbt:sd:sdstrategy:entry
{
	start[(struct buf *)arg0] = timestamp;
}

fbt:sd:sdintr:entry
/ start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
	this->un = ((struct sd_xbuf *)this->buf->b_private)->xb_un;
	this->scsi_device = ((struct sd_lun *)this->un)->un_sd;
	this->scsi_address = ((struct scsi_device *)this->scsi_device)->sd_address;
	this->tgt_id = ((struct scsi_address)this->scsi_address).a_target;
	this->lun_id = ((struct scsi_address)this->scsi_address).a_lun;
	this->sd_dev = ((struct scsi_device *)this->scsi_device)->sd_dev;
	this->devi_addr = ((struct dev_info *)this->sd_dev)->devi_addr;

	@[stringof(this->devi_addr), this->tgt_id, this->lun_id, "<TGT,LUN>"] =
	    lquantize((timestamp - start[this->buf]) / 1000000, 60000, 600000, 60000);
	@q[stringof(this->devi_addr), this->tgt_id, this->lun_id, "<TGT,LUN>"] =
	    quantize((timestamp - start[this->buf]) / 1000000);

	start[this->buf] = 0;
}

dtrace:::END
{
}

I am happy with the results I collected:
  w21000011c68108b1,0  1284  0  <TGT,LUN>
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@            70706
               1 |@@@@@@@@@@@                              25665
               2 |                                         58
               4 |                                         10
               8 |                                         5
              16 |                                         1
              32 |                                         0

  w21000011c6810a19,0  2050  0  <TGT,LUN>
           value  ------------- Distribution ------------- count
              -1 |                                         0
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@              65479
               1 |@@@@@@@@@@@@@                            32115
               2 |                                         38
               4 |                                         9
               8 |                                         2
              16 |                                         0

interval            i/o    MB/sec  bytes    read    resp     resp    resp    cpu%  cpu%
                   rate   1024**2    i/o     pct    time      max  stddev sys+usr   sys
20:28:59.029  82   112600.30  54.98  512  100.00  1.137   11.273  0.230  72.8  56.0
20:29:00.034  83   111777.97  54.58  512  100.00  1.136   17.561  0.263  72.1  55.8
20:29:01.039  84   112644.53  55.00  512  100.00  1.136   13.126  0.221  73.0  56.2
20:29:02.026  85   111823.39  54.60  512  100.00  1.136   17.770  0.271  80.0  57.8
20:29:03.027  86   111960.96  54.67  512  100.00  1.137   18.749  0.350  83.0  57.6
20:29:04.099  87   110560.03  53.98  512  100.00  1.136   32.826  0.688  91.0  68.7
20:29:05.019  88   109541.66  53.49  512  100.00  1.140  105.035  1.263  99.2  80.9
20:29:06.041  89   108109.20  52.79  512  100.00  1.143   34.500  1.023  99.3  80.7
20:29:07.066  90   107981.97  52.73  512  100.00  1.141   40.364  0.920  99.5  80.8
20:29:08.062  91   109250.32  53.34  512  100.00  1.144   72.684  1.098  99.0  79.9
20:29:09.101  92   108673.08  53.06  512  100.00  1.147   83.059  1.358  99.3  80.0
20:29:10.081  93   109205.82  53.32  512  100.00  1.139  111.060  1.198  99.5  81.2
20:29:11.028  94   108347.70  52.90  512  100.00  1.141   85.047  1.049  99.0  80.2
20:29:12.030  95   109196.89  53.32  512  100.00  1.148   57.354  1.248  99.5  80.8
20:29:13.057  96   108357.71  52.91  512  100.00  1.142   46.475  1.095  99.5  80.6
20:29:14.018  97   109040.55  53.24  512  100.00  1.145   45.807  1.106  99.0  80.5
20:29:15.030  98   109622.74  53.53  512  100.00  1.141   51.471  0.965  99.0  81.3
20:29:16.041  99   109202.76  53.32  512  100.00  1.141   52.890  1.103  99.0  80.4
20:29:17.030  100  108316.06  52.89  512  100.00  1.139   70.290  1.135  99.5  80.1
20:29:18.035  101  108316.66  52.89  512  100.00  1.141   51.215  1.048  99.0  80.1
20:29:19.029  102  110732.81  54.07  512  100.00  1.153   94.411  1.082  74.8  57.7
20:29:20.020  103  112189.29  54.78  512  100.00  1.135   13.923  0.218  72.7  55.6
20:29:21.026  104  111925.33  54.65  512  100.00  1.136   16.001  0.237  72.6  56.0
20:29:22.032  105  112443.53  54.90  512  100.00  1.136   18.047  0.247  73.0  55.8
20:29:23.027  106  112051.83  54.71  512  100.00  1.135   11.832  0.188  73.6  56.3
20:29:24.030  107  112146.90  54.76  512  100.00  1.136   13.474  0.226  74.1  56.4

The only thing I am dissatisfied with is that CPU utilization increases from 73% to 99% during the probing period. Is this something I have to tolerate? Does it mean that DTrace is not a good tool for performance monitoring?

Thanks for any comments,
John
--
This message posted from opensolaris.org
My test machine is a SunFire X4600 M2 with 4 CPUs running snv_87, driven by Sun's vdbench test suite.
--
This message posted from opensolaris.org
As I observe, your script is basically doing all of its work inside the probe. Since you already have a fairly busy system with a LOT of I/O, your script creates significant overhead. Can you try limiting the script to just collecting the data, and post-process it either in the END probe or via another script?

I am not sure whether writing a translator would give you better performance than so many casts, but you can try.
--
This message posted from opensolaris.org
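One hedged reading of the "just collect the data" suggestion (not code from this thread): key the aggregation on a cheap scalar inside the probe, such as the raw sd_lun pointer, and resolve pointers to device names afterwards, so the stringof() and name-derivation casts never run in the hot path. A sketch, assuming the same sd probes and struct layout as the original script:

```
#!/usr/sbin/dtrace -qCs

fbt:sd:sdstrategy:entry
{
	start[(struct buf *)arg0] = timestamp;
}

fbt:sd:sdintr:entry
/ start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
	/*
	 * Key on the raw sd_lun pointer instead of building a name
	 * string per I/O; map pointers to device names offline.
	 */
	@lat[(uintptr_t)((struct sd_xbuf *)this->buf->b_private)->xb_un] =
	    quantize((timestamp - start[this->buf]) / 1000000);
	start[this->buf] = 0;
}
```

Whether this actually reduces overhead on a given system is an open question; the pointer-to-name mapping step is left to a separate tool after tracing stops.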
> As I observe, your script is basically doing all work inside the
> probe. Since you already have a fairly busy system with LOT of IO,
> your script creates a significant overhead.

That may be, but the overhead of the probes is probably far greater than the work done in the probes. You might try just enabling the probes without any data gathering as a test (you'll need some trivial operation to suppress the default action).

> Can you try to limit the script to just collect the data and post-
> process it either in END probe or via another script ?

Post-processing the data will certainly consume more system resources, as there will be much more data moving through the system. One of the key innovations of DTrace is the ability to prune and aggregate data in situ (rather than processing it a posteriori).

> I am not sure whether writing a translator would give you better
> performance than so many casts, but you can try.

First, casts are for the compiler's type checking and don't actually generate any code; this is true in D as it is in C. There are a bunch of loads in this D script, but a translator isn't going to do anything about that.

Adam

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
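A minimal sketch of the baseline test Adam suggests, assuming the same sd probe names as the original script: enable the probes with only a trivial action, so any remaining CPU increase is the cost of probe enabling itself rather than of the clause bodies.

```
#!/usr/sbin/dtrace -qs

/*
 * Baseline-overhead test: enable the same fbt probes as the
 * original script but do no per-I/O work.  The count()
 * aggregation is the trivial operation that suppresses
 * dtrace's default trace action.
 */
fbt:sd:sdstrategy:entry,
fbt:sd:sdintr:entry
{
	@fires[probefunc] = count();
}
```

Comparing vdbench's cpu% column with this script running against the full script isolates how much of the increase is fixed probe-effect versus per-probe work.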
> > Can you try to limit the script to just collect the data and post-
> > process it either in END probe or via another script ?
>
> Post-processing the data will certainly consume more system resources as
> there will be much more data moving through the system. One of the key
> innovations of DTrace is the ability to prune and aggregate data in situ
> (rather than processing it a posteriori).

I'd say it will depend upon the situation. In this case, all of the collected data is being displayed in the output along with some post-processing (viz. lquantize). On a CPU-starved system, it may pay to store the data and process it offline (or in the END action) rather than in real time, since the system is CPU starved but may still have a good amount of I/O bandwidth remaining.
--
This message posted from opensolaris.org
> I'd say it will depend upon the situation. In this case all the data
> collected is being displayed in output along with some post-
> processing (viz. lquantize). On a CPU starved system, it may pay to
> store the data and process it offline (or in END action) than in
> realtime as system is CPU starved but may still have good amount of
> IO bandwidth remaining.

It's quite unlikely that you'll see better performance by post-processing the data, but if you've seen that in practice, I'd love to hear about it.

Adam

--
Adam Leventhal, Fishworks            http://blogs.sun.com/ahl
After removing lquantize and quantize:

#!/usr/sbin/dtrace -qCs

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

fbt:sd:sdstrategy:entry
{
	start[(struct buf *)arg0] = timestamp;
}

fbt:sd:sdintr:entry
/ start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
	this->un = ((struct sd_xbuf *)this->buf->b_private)->xb_un;
	this->scsi_device = ((struct sd_lun *)this->un)->un_sd;
	this->scsi_address = ((struct scsi_device *)this->scsi_device)->sd_address;
	this->tgt_id = ((struct scsi_address)this->scsi_address).a_target;
	this->lun_id = ((struct scsi_address)this->scsi_address).a_lun;
	this->sd_dev = ((struct scsi_device *)this->scsi_device)->sd_dev;
	this->devi_addr = ((struct dev_info *)this->sd_dev)->devi_addr;

	start[this->buf] = 0;
}

the CPU utilization goes from 73% to 91%.

After removing all the casts as well:

#!/usr/sbin/dtrace -qCs

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

fbt:sd:sdstrategy:entry
{
	start[(struct buf *)arg0] = timestamp;
}

fbt:sd:sdintr:entry
/ start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
	start[this->buf] = 0;
}

the CPU utilization goes from 73% to 83%.

With empty probe clauses:

#!/usr/sbin/dtrace -qCs

dtrace:::BEGIN
{
	printf("Tracing... Hit Ctrl-C to end.\n");
}

fbt:sd:sdstrategy:entry
{
}

fbt:sd:sdintr:entry
{
}

CPU utilization goes from 73% to 76%.

And with the casts but without the devinfo lookup:

fbt:sd:sdintr:entry
/ start[(this->buf = (struct buf *)((struct scsi_pkt *)arg0)->pkt_private)] != 0 /
{
	this->un = ((struct sd_xbuf *)this->buf->b_private)->xb_un;
	this->scsi_device = ((struct sd_lun *)this->un)->un_sd;
	this->scsi_address = ((struct scsi_device *)this->scsi_device)->sd_address;
	this->tgt_id = ((struct scsi_address)this->scsi_address).a_target;
	this->lun_id = ((struct scsi_address)this->scsi_address).a_lun;

	start[this->buf] = 0;
}

CPU utilization goes from 73% to 90%.

Thanks.
--
This message posted from opensolaris.org
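Not from this thread, but a related alternative worth noting: the stable io provider exposes per-device information through translated arguments, so per-I/O latency can be aggregated by device name without any of the raw sd-struct casts above. A sketch, assuming the io provider's documented arguments (args[0] is a bufinfo_t, args[1] a devinfo_t):

```
#!/usr/sbin/dtrace -qs

/*
 * Per-device I/O latency via the stable io provider.
 * arg0 (the raw buf pointer) keys the start-time array;
 * args[1]->dev_statname names the device with no manual
 * struct walking.
 */
io:::start
{
	iostart[arg0] = timestamp;
}

io:::done
/ iostart[arg0] != 0 /
{
	@lat[args[1]->dev_statname] =
	    quantize((timestamp - iostart[arg0]) / 1000000);
	iostart[arg0] = 0;
}
```

This traces all block I/O rather than only the sd driver, and it does not expose the SCSI target/LUN pair directly, so it is a lighter-weight approximation of the original script rather than a drop-in replacement.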