How can I tell if this is normal behaviour? Oracle imports are horribly slow, an order of magnitude slower than on the same hardware with a slower disk array and Solaris 9. What can I look for to see where the problem lies?

The server is 99% idle right now, with one database running. Each sample is about 5 seconds. I've tried setting kernel parameters despite the docs saying that they are obsolete or set now via resource controls. syslogging for the resource controls shows no problems.

# ./errinfo -cn oracle
Sampling... Hit Ctrl-C to end.
^C
    EXEC  SYSCALL  ERR  COUNT  DESC
  oracle    ioctl   12      1  Not enough core
  oracle   semsys   11      4  Resource temporarily unavailable

# ./procsystime -can oracle
Hit Ctrl-C to stop sampling...
^C

Elapsed Times for processes oracle,

         SYSCALL          TIME (ns)
       sysconfig               4400
            pset               7700
        lwp_park              34600
           times              39400
            kill              46200
           close              67900
            open             177100
            read             326500
            kaio           16935000
        pwrite64           31855600
          semsys         6015496000

CPU Times for processes oracle,

         SYSCALL          TIME (ns)
       sysconfig                700
            pset               4200
           times              11200
            kill              21700
        lwp_park              23400
            kaio              40800
           close              42700
          semsys              96000
            open             150200
            read             300900
        pwrite64             435000

Syscall Counts for processes oracle,

         SYSCALL              COUNT
            pset                  1
       sysconfig                  1
         pollsys                  1
        pwrite64                  2
        lwp_park                  4
            kaio                  4
          semsys                  6
            kill                  6
           close                  6
            read                  6
            open                  6
           times                  8
Alan,

You should be able to use the same /etc/system parameters, even though it's recommended to use resource controls.

Is the file system type the same? Same mount options and other tuning parameters? With imports, this can make a big difference.

Check i/o performance with iostat. Also use 'io' provider scripts to observe physical i/o sizes and timings.

Jim

Alan Owens wrote:
> How can I tell if this is normal behaviour? Oracle imports are horribly slow, an order of magnitude slower than on the same hardware with a slower disk array and Solaris 9. What can I look for to see where the problem lies?
[...]
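As a first step down that path, a one-liner along these lines (an untested sketch; run as root, Ctrl-C to end) shows the distribution of physical I/O sizes and start-to-done times from the io provider:

# dtrace -n '
io:::start { iots[args[0]->b_edev, args[0]->b_blkno] = timestamp; }
io:::done /iots[args[0]->b_edev, args[0]->b_blkno]/ {
        @sz["physical I/O size (bytes)"] = quantize(args[0]->b_bcount);
        @tm["physical I/O time (ns)"] =
            quantize(timestamp - iots[args[0]->b_edev, args[0]->b_blkno]);
        iots[args[0]->b_edev, args[0]->b_blkno] = 0;
}'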
Thanks Jim, I was pretty sure that the problem isn't i/o related. The server is a 4-way v440 with 16GB mem and a 6120 disk array. I never see any i/o wait in sar or vmstat.

Here's the /etc/vfstab entry. Logging is on by default in Solaris 10, but I always turned it on for Solaris 9 ufs filesystems so I set it explicitly here too.

/dev/dsk/c3t1d0s3  /dev/rdsk/c3t1d0s3  /data1  ufs  1  yes  logging

I ran this to test write speed:

# dd if=/dev/zero of=/data1/alan.data bs=8k count=128000
128000+0 records in
128000+0 records out

# ./iopattern -f /data1/alan.data
            FILE  %RAN  %SEQ  COUNT    MIN      MAX     AVG   KR     KW
/data1/alan.data     0     0      0      0        0       0    0      0
/data1/alan.data    84    16     58   8192  1048576  659314    0  37344
/data1/alan.data    83    17     55   8192  1048576  706597    0  37952
/data1/alan.data    92     8     56   8192  1048576  686665    0  37552
/data1/alan.data    85    15     74   8192  1048576  432072    0  31224
/data1/alan.data    78    22     56   8192  1048576  731721    0  40016
/data1/alan.data    63    37     44   8192  1048576  712331    0  30608
/data1/alan.data    78    22     57   8192  1048576  699625    0  38944
/data1/alan.data    60    40     43   8192  1048576  806245    0  33856
/data1/alan.data    77    23     53   8192  1048576  682254    0  35312
/data1/alan.data    85    15     64   8192  1048576  660736    0  41296
/data1/alan.data    76    24     56   8192  1048576  710509    0  38856
/data1/alan.data    70    30     47   8192  1048576  664074    0  30480
/data1/alan.data    94     6     38   8192  1048576  755819    0  28048
/data1/alan.data    66    34     45   8192  1048576  815559    0  35840
/data1/alan.data    66    34     51   8192  1048576  760731    0  37888
/data1/alan.data    69    31     52   8192  1048576  648428    0  32928
/data1/alan.data    71    29     32   8192  1048576  669440    0  20920
            FILE  %RAN  %SEQ  COUNT    MIN      MAX     AVG   KR     KW
/data1/alan.data    73    27     42   8192  1048576  807107    0  33104
/data1/alan.data    71    29     53   8192  1048576  650568    0  33672
/data1/alan.data    92     8     63   8192  1048576  696970    0  42880
/data1/alan.data    65    35     40   8192  1048576  722739    0  28232
/data1/alan.data    80    20     51   8192  1048576  719129    0  35816
/data1/alan.data    77    23     54   8192  1048576  800843    0  42232
/data1/alan.data    81    19     53   8192  1048576  762165    0  39448
/data1/alan.data    40    60     30   8192  1048576  876544    0  25680
/data1/alan.data    71    29     52   8192  1048576  628893    0  31936
/data1/alan.data    64    36     39   8192  1048576  896708    0  34152
/data1/alan.data    66    34     45   8192  1048576  817743    0  35936
/data1/alan.data    80    20     45   8192  1048576  892381    0  39216
/data1/alan.data    78    22     14   8192  1048576  873033    0  11936
/data1/alan.data   100     0      2  16384    65536   40960    0     80

Oracle isn't getting that type of throughput however.
There are 20+ databases running and typically I see this:

# iostat -dnx 2 | grep c3t1d0
    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
   61.4   91.2  3461.7   729.1  0.0  2.2    0.0   14.7   0  99 c3t1d0
  212.3  101.7  2620.2   783.0  0.0  2.9    0.0    9.4   0 100 c3t1d0
    2.0   87.5    16.0   689.5  0.0  2.2    0.0   24.7   0 100 c3t1d0
    0.5   95.5     4.0   761.3  0.0  1.7    0.0   17.7   0  99 c3t1d0
  364.5   72.5  9099.9   558.5  0.0  2.1    0.0    4.8   0 100 c3t1d0
    0.5   99.5     4.0   792.0  0.0  2.5    0.0   25.1   0  99 c3t1d0
    2.0   82.5    16.0   653.0  0.0  1.2    0.0   13.9   0 100 c3t1d0
   17.0  100.5   136.0   797.0  0.0  3.3    0.0   27.7   0  99 c3t1d0
  473.0   76.0 11963.6   598.0  0.0  2.9    0.0    5.3   0 100 c3t1d0
   67.0  105.0  1048.0   833.0  0.0  3.3    0.0   19.0   0 100 c3t1d0
   16.5   84.0   140.0   665.5  0.0  1.3    0.0   12.8   0 100 c3t1d0
  128.5   99.5  1084.0   785.5  0.0  3.1    0.0   13.6   0 100 c3t1d0
  692.5   76.0  7631.7   561.5  0.0  2.7    0.0    3.5   1 100 c3t1d0
   20.5   80.0   216.0   633.0  0.0  1.4    0.0   13.8   0 100 c3t1d0
    0.0  108.0     0.0   858.0  0.0  3.7    0.0   34.1   0  99 c3t1d0

Although while running dd from /dev/zero I see this:
    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    7.5   51.0    60.0 37007.6  0.0 16.2    0.0  276.5   0 100 c3t1d0
   12.0   59.0    96.0 46459.3  0.0 15.5    0.0  218.4   0 100 c3t1d0
   16.5   49.5   132.0 34877.3  0.0 14.2    0.0  215.8   0  95 c3t1d0
    0.0   54.5     0.0 37069.7  0.0 15.3    0.0  281.5   0 100 c3t1d0
    0.0   74.5     0.0 30912.9  0.0 25.7    0.0  345.0   0 100 c3t1d0

And while copying a large file in the same directory I see this:

   56.5   43.5 25328.0 24692.0  0.0  9.6    0.0   96.0   0 100 c3t1d0
   54.5   46.0 26326.1 26311.4  0.0  8.2    0.0   81.9   0 100 c3t1d0
   28.0   42.0 25601.6 26221.7  0.0  8.5    0.0  120.9   0 100 c3t1d0
   30.0   63.0 26163.3 27571.2  0.0 10.3    0.0  110.5   0 100 c3t1d0
   29.0   62.0 24333.0 23433.7  0.0 10.0    0.0  110.0   0 100 c3t1d0

%sys is high, and gets worse when the server is busy. Here's a snippet from sar:

00:00:00  %usr  %sys  %wio  %idle
15:00:00    12    10     0     78
15:10:01    11     9     0     80
15:20:00    14     9     0     77
15:30:00    10     9     0     81
15:40:00    19    12     0     69
15:50:01    56    27     0     17
16:00:00    61    37     0      2
16:10:00    61    38     0      1
16:20:00    61    38     0      1
16:30:00    60    39     0      1
16:40:00    61    39     0      1
16:50:00    61    38     0      1
17:00:00    61    38     0      1
17:10:00    64    36     0      1
17:20:00    55    32     0     12
17:30:00    10     7     0     83
17:40:00     9     7     0     84

I was thinking perhaps a shared memory or semaphore issue (hence the kernel parameter experiment). Free buffer wait is high in the database and, as I understand it, that's when Oracle tries to find contiguous free memory space. We have about 70 servers, but this is the first (and only) Solaris 10 database server and it is the only one that we have a problem with. I talked up Solaris 10 until they let me install it and now I can't get it to perform! Database tuning has been given lots of attention with no difference in performance.

It seems like semsys has an awfully high elapsed time. Write time is high too, but I'd expect it to be the most expensive operation, so I'm kinda lost as to where to delve further and what to change.

# ./procsystime -can oracle
Hit Ctrl-C to stop sampling...
^C

Elapsed Times for processes oracle,

         SYSCALL          TIME (ns)
          getgid              17500
      sigpending              24000
         setpgrp              25400
     getpeername              28300
           umask              33400
      systeminfo              39200
          getuid              52500
     getsockname              53000
        fstatvfs              53000
        mprotect              55700
             dup              57300
           uname              75200
           lseek              83400
     getrlimit64              93800
         fstat64             108500
          getcwd             160400
       setitimer             167200
       getrlimit             199500
       setrlimit             205400
          getpid             217700
           chdir             223200
          open64             249200
      setcontext             250900
            pset             252200
       sysconfig             330300
        schedctl             435400
     lwp_sigmask             454600
      setsockopt             476200
       sigaction             921600
          access             942700
            kill            1100500
           fcntl            1183400
          doorfs            1236700
     resolvepath            1466100
          munmap            1656100
          shmsys            1757500
             brk            1928400
           ioctl            2714100
            stat            2835900
           close            3427600
           times            3576400
            mmap            4415100
           write            5307800
         memcntl            6275700
            open            8990900
         pread64           12105100
            read          139717500
           readv          273008400
        lwp_park         5139631300
         pollsys        15039226700
            kaio        35598591600
        pwrite64        57911988400
          semsys        62986506600

CPU Times for processes oracle,

         SYSCALL          TIME (ns)
          getgid               2900
           umask               4500
      sigpending               7400
         setpgrp               8200
          getuid               8500
     getpeername              13700
      systeminfo              22600
        mprotect              25000
             dup              27000
           lseek              34200
     getsockname              34500
        fstatvfs              37700
     getrlimit64              38400
          getpid              41500
           uname              43000
       sysconfig              58300
         fstat64              75900
     lwp_sigmask             101100
       getrlimit             105100
       setitimer             105900
          getcwd             142700
       setrlimit             162500
            pset             169000
      setcontext             183900
           chdir             205800
          open64             227100
       sigaction             269200
          doorfs             300500
         pollsys             350600
        schedctl             416600
      setsockopt             428100
         pread64             506000
            kill             602700
        lwp_park             663200
          access             763700
           fcntl            1080300
           times            1108200
     resolvepath            1166900
             brk            1242600
          semsys            1346300
          munmap            1418700
            kaio            1467700
          shmsys            1622500
            stat            2333900
           close            2528000
           ioctl            2606100
            mmap            3514300
           write            4888500
         memcntl            6062600
            open            8034000
           readv           15252300
            read           17064400
        pwrite64           25607800

Syscall Counts for processes oracle,

         SYSCALL              COUNT
        fstatvfs                  1
          getgid                  5
      sigpending                  5
          getcwd                  5
           rexit                  5
        schedctl                  5
      systeminfo                  5
     getpeername                  5
           chdir                  5
         setpgrp                  5
     getsockname                  5
          open64                  6
         pread64                  7
             dup                 10
        mprotect                 10
           umask                 10
           uname                 10
         fstat64                 10
          getuid                 15
      setsockopt                 15
       setrlimit                 15
           lseek                 19
          doorfs                 20
     getrlimit64                 20
      setcontext                 21
       setitimer                 22
         pollsys                 27
            pset                 28
       getrlimit                 31
           ioctl                 33
           fcntl                 37
          shmsys                 45
        pwrite64                 59
          access                 60
          getpid                 65
         memcntl                 65
           readv                 66
          munmap                 75
       sysconfig                103
     resolvepath                105
        lwp_park                107
           write                120
            kaio                121
          semsys                126
     lwp_sigmask                127
            kill                168
            stat                180
             brk                240
       sigaction                260
            open                320
            mmap                320
           close                327
            read                390
           times                931

# ps -fu oracle | grep pmon | wc
      21     188    1449

Sorry for the super long posting...

Alan
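If the semaphore suspicion pans out, the Solaris 10 replacements for the old semsys/shmsys /etc/system tunables can be inspected with prctl; a quick sketch (the pgrep match is an assumption, point it at any running oracle process):

# pid=`pgrep -u oracle -n oracle`
# prctl -n project.max-sem-ids    -i process $pid
# prctl -n process.max-sem-nsems  -i process $pid
# prctl -n project.max-shm-memory -i process $pid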
G'Day Alan,

On Mon, 29 Aug 2005, Jim Fiori wrote:
[...]
> Check i/o performance with iostat. Also use 'io' provider
> scripts to observe physical i/o sizes and timings.

You could try "iopattern 5", "iotop -oC 5" and "rwtop -C 5" to examine I/O behaviour.

The pwrite64s could indicate some random disk I/O, although the elapsed times suggest not very much.

Those semsys calls are interesting; might have to do some more digging with locks. "lockbyproc.d"? "mpstat"?

Brendan
[Sydney, Australia]
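One way to dig a little further into those semsys calls is to quantize their per-call latency; a minimal sketch (untested; assumes the processes of interest run as execname "oracle"):

# dtrace -n '
syscall::semsys:entry /execname == "oracle"/ { self->ts = timestamp; }
syscall::semsys:return /self->ts/ {
        @["semsys latency (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}'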
Hi Alan,

Please note that waitio was never a very good statistic and it now (as of S10) is always set to 0. So the reason you see waitio at 0 is that it is hard-coded that way.

However, your iostat data shows you have definite I/O issues. Your single device (c3t1d0) is at 100% busy and the service times (asvc_t) are moving up, meaning that's about all you can expect from that LUN.

Your test for write speed assumes sequential writes. From the iopattern data you're at 80-90% random. With a random pattern you are limited by seek time.

This is (most likely) an I/O problem. Add more spindles to your databases.

Dave Miller
Sun

Alan Owens wrote, On 08/29/05 05:25 PM:
> Thanks Jim, I was pretty sure that the problem isn't i/o related. The server is a 4-way v440 with 16GB mem and a 6120 disk array. I never see any i/o wait in sar or vmstat.
[...]
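A rough way to see how far the heads are being asked to move between requests (a sketch only; "ssd1" is a placeholder for whatever statistics name the c3t1d0 LUN actually reports, and the cast keeps backwards seeks negative):

# dtrace -n '
io:::start /args[1]->dev_statname == "ssd1" && last != 0/ {
        @["seek distance (512-byte blocks)"] =
            quantize((long long)args[0]->b_blkno - (long long)last);
}
io:::start /args[1]->dev_statname == "ssd1"/ { last = args[0]->b_blkno; }'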
Hi Alan,

Oracle imports running an order of magnitude slower than on S9 is a bug.

> What can I look for to see where the problem lies?

Can you post a statspack for this database?

There is a Solaris 10 and Oracle 9 bug
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6251659
but I don't think you are hitting it.

Rgds

Robin
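For reference, bracketing an import with statspack snapshots looks roughly like this (a sketch; it assumes the PERFSTAT schema is installed, and the password is a placeholder):

$ sqlplus perfstat/<password>
SQL> execute statspack.snap;        -- snapshot before the import
SQL> execute statspack.snap;        -- snapshot again after the import
SQL> @?/rdbms/admin/spreport        -- report between the two snapshots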
You may be seeing 6238533, which is fixed and being backported to S10 as we speak. Try disabling logging to see if this clears the regression...

[ I can provide you with more info if you need ]

Also, I would grab the latencies at the pread/pwrite syscalls with DTrace, inclusive and exclusive of the disk I/Os (latency from the io provider). This will help show whether the file system itself is contributing to the latency...

Thanks,

Richard.

On Mon, Aug 29, 2005 at 03:25:35PM -0700, Alan Owens wrote:
> Thanks Jim, I was pretty sure that the problem isn't i/o related. The server is a 4-way v440 with 16GB mem and a 6120 disk array. I never see any i/o wait in sar or vmstat.
[...]
--
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
Richard Mc Dougall            : Richard.McDougall at Sun.COM
Performance and Availability  : x31542
Engineering                   : http://blogs.sun.com/rmc
Sun Microsystems Inc          : +1 650 352 6438
:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:
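A sketch of what that could look like (untested; the execname match is an assumption): the syscall distributions are inclusive of everything the file system does, while the io provider distribution is just the physical disk time, so the gap between them is the file system's contribution.

# dtrace -n '
syscall::pread64:entry,
syscall::pwrite64:entry
/execname == "oracle"/ { self->ts = timestamp; }

syscall::pread64:return,
syscall::pwrite64:return
/self->ts/ {
        @[probefunc, "syscall latency (ns)"] = quantize(timestamp - self->ts);
        self->ts = 0;
}

io:::start { iots[args[0]->b_edev, args[0]->b_blkno] = timestamp; }

io:::done /iots[args[0]->b_edev, args[0]->b_blkno]/ {
        @["disk", "physical I/O latency (ns)"] =
            quantize(timestamp - iots[args[0]->b_edev, args[0]->b_blkno]);
        iots[args[0]->b_edev, args[0]->b_blkno] = 0;
}'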
Also, replacing the /data1 file system buffering with a bigger Oracle SGA and using the forcedirectio option (via the vfstab, for example) will give you immediate benefits. (Verify that the Oracle parameter disk_asynch_io is set to true.)

benoit

Richard McDougall wrote:
> You may be seeing 6238533, which is fixed and being backported to S10 as we speak. Try disabling logging to see if this clears the regression...
[...]
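For reference, the directio option can simply be added to the existing /data1 vfstab line, and disk_asynch_io checked from sqlplus; a sketch based on the entry quoted earlier in the thread:

# /etc/vfstab (sketch)
/dev/dsk/c3t1d0s3  /dev/rdsk/c3t1d0s3  /data1  ufs  1  yes  forcedirectio,logging

# from sqlplus, connected as sysdba:
SQL> show parameter disk_asynch_io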
Thank you all for your replies. Unfortunately I can't just unmount and remount the filesystem at will, so I'll have to wait to try the nologging,forcedirectio mount options. I should be able to arrange a downtime later in the week.

I have another identical server, a 4-way v440, 16GB, with 14x72GB disks in a 6120 array (RAID5 + hot spare), but it's running Solaris 9 and it flies, so the hardware and the manner in which it's configured can't explain the performance.

I ran a copy of a large table to test Oracle's ability to read/write large volumes of data and I got decent throughput. Here are a few snippets of iostat and iopattern (not sync'd):

create table aowens as select * from WORK_DETAIL;

    r/s    w/s    kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    6.0   82.0  3071.5  4583.3  0.0  2.5    0.0   28.2   0  99 c3t1d0
   17.0   82.0  3032.4  8467.2  0.0  3.1    0.0   31.7   0  99 c3t1d0
    7.0  117.0  3111.8  6878.0  0.0  4.5    0.0   36.2   0  99 c3t1d0
    5.0   82.0  3072.0  6710.9  0.0  3.1    0.0   35.6   0  99 c3t1d0
   10.0   85.0  4224.6  6577.4  0.0  2.8    0.0   29.6   0  99 c3t1d0
    5.0  114.0  2056.0  2864.5  0.0  3.8    0.0   31.9   0  99 c3t1d0
    5.0   79.0    40.0  1541.3  0.0  1.4    0.0   16.7   0  99 c3t1d0
    4.0   78.0  1880.3  3611.5  0.0  1.7    0.0   21.0   0  99 c3t1d0
    5.0  113.0  3071.9  4845.3  0.0  4.5    0.0   37.7   0  99 c3t1d0
    8.0   79.0  4095.6  6544.3  0.0  2.8    0.0   31.6   0  99 c3t1d0
   10.0   89.0  8310.0 16862.0  0.0  5.3    0.0   53.7   0 100 c3t1d0
   10.0   84.0  9099.5 25338.3  0.0 11.0    0.0  117.1   0 100 c3t1d0
    9.0  104.0  8191.9 19369.4  0.0  6.0    0.0   53.4   0  99 c3t1d0
   28.0   78.0 13317.9 20726.2  0.0  6.0    0.0   57.0   0 100 c3t1d0
   14.0   83.0 11257.5 23754.7  0.0  7.8    0.0   80.4   0 100 c3t1d0
    9.0  122.0  8190.8 22078.3  0.0  7.3    0.0   56.0   0 100 c3t1d0
   16.0   79.0 11261.0 18693.1  0.0  4.9    0.0   51.7   0 100 c3t1d0
   13.0   72.0 12386.9 19827.2  0.0  7.2    0.0   84.4   0 100 c3t1d0
   10.0  123.0  9120.5 17876.0  0.0 12.0    0.0   90.1   0 100 c3t1d0
    9.0   98.0  8192.7 20664.3  0.0 13.8    0.0  128.6   0 100 c3t1d0
    8.0  137.0  8192.4 21702.0  0.0 11.8    0.0   81.2   0 100 c3t1d0
    0.0  162.0     0.0  2150.7  0.0  5.3    0.0   32.9   0 100 c3t1d0
    0.0  116.0     0.0   940.4  0.0  3.1    0.0   27.1   0  99 c3t1d0
    0.0   80.0     0.0   633.1  0.0  1.1    0.0   14.4   0  99 c3t1d0
    0.0   82.0     0.0   648.9  0.0  1.1    0.0   13.5   0  99 c3t1d0
   10.0   80.0  8200.9 12298.9  0.0  4.3    0.0   47.5   0 100 c3t1d0
    0.0  120.0     0.0  5420.2  0.0  4.0    0.0   33.3   0 100 c3t1d0
    0.0   91.0     0.0   703.4  0.0  1.6    0.0   17.6   0  99 c3t1d0
    0.0   78.0     0.0   617.1  0.0  1.1    0.0   14.2   0 100 c3t1d0
    0.0   75.0     0.0   593.0  0.0  1.1    0.0   14.3   0 100 c3t1d0
    8.0   75.0  8190.8  7695.9  0.0  1.7    0.0   20.3   0 100 c3t1d0
    0.0  110.0     0.0  1873.3  0.0  3.3    0.0   30.0   0 100 c3t1d0
    1.0   98.0     8.0  8772.7  0.0  4.5    0.0   45.8   0 100 c3t1d0
    0.0   80.0     0.0   652.6  0.0  1.0    0.0   12.9   0  99 c3t1d0
    0.0   75.0     0.0   592.6  0.0  1.1    0.0   14.8   0 100 c3t1d0
    0.0   77.0     0.0   609.0  0.0  1.1    0.0   14.2   0 100 c3t1d0
    0.0  111.0     0.0   874.4  0.0  3.5    0.0   31.4   0 100 c3t1d0
   10.0  121.0  8191.6 17007.2  0.0  8.3    0.0   63.5   0 100 c3t1d0
    0.0   84.0     0.0   677.0  0.0  1.1    0.0   13.0   0  99 c3t1d0
    0.0   81.0     0.0   633.5  0.0  1.2    0.0   15.0   0  99 c3t1d0
    0.0  113.0     0.0   897.0  0.0  3.1    0.0   27.2   0  99 c3t1d0
    9.0   76.0  8190.8  6693.0  0.0  2.2    0.0   26.4   0 100 c3t1d0
    0.0   86.0     0.0 10777.7  0.0  3.6    0.0   41.8   0 100 c3t1d0
    5.0   84.0  1840.1   676.5  0.0  1.2    0.0   13.4   0  99 c3t1d0
    0.0  116.0     0.0   921.1  0.0  2.8    0.0   24.0   0 100 c3t1d0
    0.0   81.0     0.0   645.9  0.0  1.1    0.0   13.4   0  99 c3t1d0
    9.0   73.0  8191.7 15744.4  0.0  4.8    0.0   59.1   0 100 c3t1d0

# ./iopattern -m /data1
 MOUNT  %RAN  %SEQ  COUNT    MIN      MAX     AVG   KR     KW
/data1    33    67     60   8192     8192    8192    0    480
/data1    64    36    104   8192     8192    8192    0    832
/data1    30    70     66   8192   851968   20976  840    512
/data1    42    58     59   8192     8192    8192   24    448
/data1    64    36     97   8192   188416   10049  200    752
/data1    37    63     74   8192     8192    8192    8    584
/data1    24    76     74   8192   335872   12620  328    584
/data1    20    80     81   8192     8192    8192    0    648
/data1    47    53    121   8192     8192    8192    0    968
/data1    35    65     85   8192     8192    8192    0    680
/data1    27    73     86   8192   516096   18479    8   1544
/data1    99     1    141   8192   647168  148850   40  20456
/data1    91     9     86   8192   638976  262715    0  22064
/data1    94     6     90   8192   663552  261142    0  22952
/data1    97     3    119   8192  1048576  221184    0  25704
/data1    99     1    106   8192  1048576  260521    0  26968
/data1   100     0     85   8192  1048576  318331   16  26408
/data1    97     3    127   8192  1048576  193447    8  23984
/data1    93     7     94   8192  1048576  259180    0  23792
/data1    90    10     60   8192  1048576  440046    8  25776
 MOUNT  %RAN  %SEQ  COUNT    MIN      MAX     AVG   KR     KW
/data1    98     2     77   8192  1048576  286720    8  21552
/data1    94     6    116   8192   172032   10381    0   1176
/data1    85    15    137   8192     8192    8192    0   1096

Random data can be written by Oracle with good throughput under the present configuration. An import however doesn't even register a blip on the disk. The bottleneck appears to be elsewhere. I'll talk to the DBAs again to understand the difference between an import and 'create table aowens as select * from WORK_DETAIL;'.

thanks again
Alan
On Tue, Aug 30, 2005 at 02:18:36PM -0700, Alan Owens wrote:
> Thank you all for your replies. Unfortunately I can't just unmount and remount the filesystem at will, so I'll have to wait to try the nologging,forcedirectio mount options. I should be able to arrange a downtime later in the week.

Hi Alan,

It's also possible since Solaris 8 to change those options online, without unmounting:

mount -oremount,nologging /fs
mount -oremount,forcedirectio /fs

Regards

Richard.
[...]
G'Day Folks,

On Tue, 30 Aug 2005, Alan Owens wrote:
[...]
> # ./iopattern -m /data1
>  MOUNT  %RAN  %SEQ  COUNT    MIN      MAX     AVG   KR     KW
> /data1    33    67     60   8192     8192    8192    0    480
> /data1    64    36    104   8192     8192    8192    0    832
> /data1    30    70     66   8192   851968   20976  840    512
[...]
> /data1    99     1    141   8192   647168  148850   40  20456
> /data1    91     9     86   8192   638976  262715    0  22064
> /data1    94     6     90   8192   663552  261142    0  22952
> /data1    97     3    119   8192  1048576  221184    0  25704
> /data1    99     1    106   8192  1048576  260521    0  26968
> /data1   100     0     85   8192  1048576  318331   16  26408
> /data1    97     3    127   8192  1048576  193447    8  23984

iopattern on Oracle - very interesting (I don't have Oracle DBs on hand to test).

iopattern is a new script. It considers an event random if the disk heads seek (or, at least, we think the disk heads seek - based on block address and size). It doesn't measure the size of the seek (this is in "NOTES" in the script itself).

I haven't seen so many random events with such a large size - 256 Kb events, up to 1 Mb events. Would it be fair to describe scattered 1 Mb events as random? They seem pretty big to me. And if this is UFS, they may just be scattered over sequential cylinder groups (and skipping cylinder group blocks / backup super blocks) - so not a huge amount of seeking.

Perhaps, ideally, I should measure the time the disk heads spend seeking and rotating vs the time spent actually reading/writing... anyway, iopattern is a new tool - so I'm curious to see how useful it is. I think at the moment it would be best to keep an eye on the size columns MIN, MAX and AVG as well.

cheers,

Brendan
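The heart of that classification is roughly the following (a simplified, untested sketch of the idea, not the actual script): an event only counts as sequential if it starts exactly where the previous one finished.

# dtrace -n '
io:::start /last != 0 && args[0]->b_blkno == last/ { @["sequential"] = count(); }
io:::start /last != 0 && args[0]->b_blkno != last/ { @["random"]     = count(); }
io:::start { last = args[0]->b_blkno + args[0]->b_bcount / 512; }'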
That's cool, I didn't know that! Thanks Richard.

I've remounted it and will get the DBAs to run an import tomorrow morning.

# mount -v | grep data1
/dev/dsk/c3t1d0s3 on /data1 type ufs read/write/setuid/devices/intr/largefiles/logging/xattr/onerror=panic/dev=1d80003 on Thu Aug 25 08:31:41 2005
# mount -oremount,nologging,forcedirectio /data1
# mount -v | grep data1
/dev/dsk/c3t1d0s3 on /data1 type ufs read/write/setuid/devices/intr/forcedirectio/largefiles/xattr/onerror=panic/dev=1d80003 on Tue Aug 30 20:11:11 2005

Alan

-----Original Message-----
From: Richard McDougall [mailto:Richard.McDougall at Sun.COM]
Sent: Tuesday, August 30, 2005 5:58 PM
To: Alan Owens
Cc: dtrace-discuss at opensolaris.org
Subject: Re: [dtrace-discuss] Re: Oracle 9.2.0.6 on Solaris 10

It's also possible since Solaris 8 to change those options online, without unmount:

mount -oremount,nologging /fs
mount -oremount,forcedirectio /fs
[...]
Alan -

Just remember that whilst it's mounted nologging, if you have a hard crash (power outage / Dstop / fatal error, etc.), you could be in for a long, long fsck...

Might be better to do it just before the import, and change it back when it's done...

My 2c... :)

Nathan.

On Wed, 2005-08-31 at 10:13, Alan Owens wrote:
> That's cool, I didn't know that! Thanks Richard.
>
> I've remounted it and will get the DBAs to run an import tomorrow morning.
>
> # mount -oremount,nologging,forcedirectio /data1
> [...]
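To make Nathan's suggestion concrete, a minimal sketch of wrapping only the import window in the remounts, assuming the same /data1 filesystem as above (the exact sequence was not posted in the thread):

# mount -oremount,nologging,forcedirectio /data1
  ... run the import ...
# mount -oremount,logging,noforcedirectio /data1
# mount -v | grep data1

The second remount puts UFS logging back (and drops forcedirectio again) as soon as the import finishes, so the window in which a hard crash would force a full fsck stays as short as possible.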
Hi Brendan,

Actually this data makes perfect sense to me. It was from a "create table as select * from ..." construct. With Oracle that will do a parallel full-table scan (FTS) to copy the data to the new table. The FTS reads in chunks of db_block_size * db_file_multiblock_read_count bytes, which in this case looks to be 1M. With the filesystem involved, since each FTS process is reading sequentially, the filesystem will do the necessary readahead and get up to the larger sizes specified.

Since there are multiple processes scanning the table at the same time from different places in the table, and all are going to the same device/LUN, at the device layer it's going to look like large (i.e. 1M) random reads. On top of that, the output is also going to different places on the same LUN, which likewise requires the heads to move.

Dave Miller

Brendan Gregg wrote, On 08/30/05 06:07 PM:
> G'Day Folks,
>
> On Tue, 30 Aug 2005, Alan Owens wrote:
> [...]
>
>> # ./iopattern -m /data1
>> MOUNT %RAN %SEQ COUNT MIN MAX AVG KR KW
>> [...]
>
> iopattern on Oracle - very interesting (I don't have Oracle DBs on hand to test).
>
> iopattern is a new script. It considers an event random if the disk heads seek (or, at least, we think the disk heads seek - based on block address and size). It doesn't measure the size of the seek (this is in "NOTES" in the script itself).
>
> I haven't seen so many random events with such a large size - 256 KB events, up to 1 MB events. Would it be fair to describe scattered 1 MB events as random? They seem pretty big to me. And if this is UFS, they may just be scattered over sequential cylinder groups (and skipping cylinder group blocks / backup superblocks) - no, not a huge amount of seeking.
>
> Perhaps, ideally, I should be measuring the time the disk heads spend seeking and rotating vs the time spent actually reading/writing...
>
> Anyway, iopattern is a new tool - so I'm curious to see how useful it is. I think at the moment it would be best to keep an eye on the size columns MIN, MAX and AVG as well.
>
> cheers,
>
> Brendan
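For anyone wanting to check the two sizes David is multiplying, they are ordinary init parameters and can be read from SQL*Plus. The values in the comment below are only an example of a combination that would produce 1 MB multiblock reads, not figures confirmed from Alan's database:

SQL> show parameter db_block_size
SQL> show parameter db_file_multiblock_read_count
-- e.g. 8192 bytes/block * 128 blocks = 1048576 bytes, which would match
-- the 1 MB MAX sizes showing up in the iopattern output above.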
David Miller was right on the money - it was purely an i/o issue.

I had a downtime last weekend where I was able to back up the 300+GB we have on the array, blow away the volume, reconfigure the array, recreate the volume, and copy all the data back. I changed the block size (stripe unit size) on the array from 16k to 64k, which changes the cache segment size from 2k to 8k. There were a couple of other things I put right as well - write-behind caching, multipath i/o config, mirrored cache, etc.

We ran a db create and import of a 100GB database in 27 hours, including index creation. Previously a 20GB import was taking a week to complete.

Thanks David and everyone else for your help on this.

Alan
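For what it's worth, the larger stripe unit also lines up much better with the I/O sizes discussed earlier in the thread. A rough back-of-the-envelope view, using the illustrative 8 KB Oracle block size and 1 MB multiblock I/O from David's explanation rather than values confirmed from this array:

  1048576-byte multiblock I/O / 65536-byte stripe unit = 16 stripe units touched (was 1048576 / 16384 = 64)
  8192-byte Oracle block / 8192-byte cache segment = 1 segment per block (was 8192 / 2048 = 4)

Fewer, larger pieces per Oracle I/O generally means less per-segment overhead in the array cache and fewer stripe boundaries crossed per write.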