Hi Experts,

Here's a performance-related question; please help me review what I can do to get the issue fixed.

I have a customer with one M5000 running Solaris 10 10/08 (KJP: 138888-01) with 16GB RAM configured, running Sybase ASE 12.5 and a JBoss application. Recently they felt the OS got very slow after it had been running for some time. The vmstat data collected points to a memory shortage:

# vmstat 5
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m4 m5 in sy cs us sy id
0 0 153 6953672 254552 228 228 1843 1218 1687 0 685 3 2 0 0 2334 32431 3143 1 1 97
0 0 153 6953672 259888 115 115 928 917 917 0 264 0 35 0 2 2208 62355 3332 7 3 90
0 0 153 6953672 255688 145 145 1168 1625 1625 0 1482 0 6 1 0 2088 40113 3070 2 1 96
0 0 153 6953640 256144 111 111 894 1371 1624 0 1124 0 6 0 0 2080 55278 3106 3 3 94
0 0 153 6953640 256048 241 241 1935 2585 3035 0 1009 0 18 0 0 2392 40643 3164 2 2 96
0 0 153 6953648 257112 236 235 1916 1710 1710 0 1223 0 7 0 0 2672 62582 3628 3 4 93

As above, the "w" column is very high all the time, and the "sr" column also stays very high, which indicates the page scanner is active and busy paging out, yet the CPU is mostly idle. I checked "/etc/system" and found one improper entry:

set shmsys:shminfo_shmmax = 0xffffffffffff

So I think the improper shared memory setting caused too much physical RAM to be reserved by the application, and I suggested adjusting the shared memory limit to 8GB (0x200000000). But according to the customer's feedback, the result seems to have got worse, based on the new vmstat output:

kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m4 m5 in sy cs us sy id
0 6 762 3941344 515848 18 29 4544 0 0 0 0 4 562 0 1 2448 25687 3623 1 2 97
0 6 762 4235016 749616 66 21 4251 2 2 0 0 0 528 0 0 2508 50540 3733 2 5 93
0 6 762 4428080 889864 106 299 4694 0 0 0 0 1 573 0 7 2741 182274 3907 10 4 86
0 5 762 4136400 664888 19 174 4126 0 0 0 0 6 511 0 0 2968 241186 4417 18 9 73
0 7 762 3454280 193776 103 651 2526 3949 4860 0 121549 11 543 0 5 2808 149820 4164 10 12 78
0 9 762 3160424 186016 61 440 1803 7362 15047 0 189720 12 567 0 5 3101 119895 4125 6 13 81
0 6 762 3647456 403056 44 279 4260 331 331 0 243 10 540 0 3 2552 38374 3847 5 3 92

The "w" and "sr" values increased instead. Why?

I have also attached the "prstat" output, a snapshot taken after the shared memory adjustment; please have a look. What can I do next to get the issue solved? What are the possible factors that cause the memory shortage again and again, even though they have 16GB RAM + 16GB swap? Is physical RAM really short?

Or is there any useful DTrace script to trace the problem?

Thanks very much!

Best Regards,
Simon

-------------- next part --------------
A non-text attachment was scrubbed...
Name: prstat.JPG
Type: image/jpeg
Size: 137960 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/dtrace-discuss/attachments/20091120/75950f7c/attachment-0001.jpe>
Simon wrote:
> Hi Experts,
>
> Here's the performance related question, please help to review what can I
> do to get the issue fixed ?

Simon,

Solaris 10 is NOT OpenSolaris; moreover, Sun sells maintenance contracts for Solaris 10, which you should buy if you don't have one and need support. Please contact your local support organisation; they are trained to help you with this kind of issue.

Also, this question looks suspiciously familiar - did you ask it yesterday and by chance get the same answer I gave above?

Having said all that, I'll give you an answer to one of the questions, which may help you figure out some of this: the shmsys:shminfo_shmmax setting in /etc/system is just an upper limit on the size shared memory (or a single segment - I forget which) can be. In and of itself, it has no effect. I think it's also deprecated in favour of using projects, but I haven't played around with those enough to be of any help there.

regards
Michael

--
Michael Schuster http://blogs.sun.com/recursion
Recursion, n.: see 'Recursion'
Simon,

For a 16GB box, the page scanner kicks in when freemem drops below 1/64th of memory, or about 256MB. It doesn't matter whether the system is idle or not.

The 'w' column numbers mean that threads were swapped out at some point in the past because of a severe memory shortage and never swapped back in (because they've not been awoken yet). So it's normal for that column to stay high even if much of the memory was released.

It looks to me like you're just oversubscribing memory. If you look at the prstat output, I see easily 13-14GB of physical memory in use, plus you have the kernel memory. As for virtual memory, about 23GB shows up at least.

Did you check for additional virtual space usage in /tmp?

Are you using ZFS (ARC space needed for that)?

You can also try using the "::memstat" mdb dcmd to break out kernel memory further.

Jim
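The 1/64th rule above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes an 8 KB base page size (typical for SPARC) and untuned Solaris defaults, where lotsfree is 1/64 of memory, desfree is half of lotsfree, and minfree is half of desfree. The function names are made up for this example.

```python
PAGE_SIZE_KB = 8  # assumption: 8 KB base pages, as on SPARC systems


def scanner_thresholds(total_pages):
    """Return default paging thresholds (in pages) for a given memory size.

    Assumes untuned defaults: lotsfree is 1/64 of memory, desfree is half
    of lotsfree, and minfree is half of desfree.
    """
    lotsfree = total_pages // 64
    desfree = lotsfree // 2
    minfree = desfree // 2
    return {"lotsfree": lotsfree, "desfree": desfree, "minfree": minfree}


def pages_to_mb(pages):
    """Convert a page count to megabytes using the assumed page size."""
    return pages * PAGE_SIZE_KB / 1024


# A nominal 16 GB box: 16 GB / 8 KB = 2,097,152 pages.
nominal = scanner_thresholds(16 * 1024 * 1024 // PAGE_SIZE_KB)
print("lotsfree: %d pages = %.0f MB"
      % (nominal["lotsfree"], pages_to_mb(nominal["lotsfree"])))

# The first vmstat sample in the thread reported free = 254552 KB.
free_kb = 254552
print("below lotsfree:", free_kb < nominal["lotsfree"] * PAGE_SIZE_KB)
```

Applied to the pagestotal value of 2037012 that Simon reports further down the thread, the same rule gives lotsfree 31828, desfree 15914 and minfree 7957, which matches his kstat output.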
If you're running out of memory, which it appears you are, you need to profile the memory consumers and determine whether you have a memory leak somewhere or an under-configured system. Note 16GB is really tiny by today's standards, especially for an M5000-class server. It's like putting an engine from a Ford sedan into an 18-wheel truck - the capacity to do work is severely limited by a lack of towing power. Laptops ship with 8GB these days...

Back to memory consumers. We have:
- The kernel
- User processes
- The file system cache (which is technically part of the kernel, but significant enough that it should be measured separately).

Is the database on a file system, and if so, which one (UFS? ZFS? VxFS?). How much shared memory is really being used (ipcs -a)?

If the system starts off well and degrades over time, then you need to capture memory data over time and see what area is growing. Based on that data, we can determine if something is leaking memory or you have an underconfigured machine.

I would start with:
    echo "::memstat" | mdb -k
    ipcs -a
    ps -eo pid,vsz,rss,class,pri,fname,args
    prstat -c 1 30
    kstat -n system_pages

You need to collect that data at some regular interval, with timestamps. The interval depends on how long it takes the machine to degrade. If the system goes from fresh boot to degraded state in 1 hour, I'd collect the data every second. If the machine goes from fresh boot to degraded state in 1 week, I'd grab the data every 2 hours or so.

/jim
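The collection advice above could be scripted roughly as follows. This is only a sketch, not anything posted in the thread: the command strings are the ones listed above and only exist on Solaris, while the helper names (stamp, sample, collect) and the log path mentioned afterwards are hypothetical.

```python
import subprocess
import time
from datetime import datetime

# Commands taken from the list above; they are Solaris tools.
COMMANDS = [
    'echo "::memstat" | mdb -k',
    "ipcs -a",
    "ps -eo pid,vsz,rss,class,pri,fname,args",
    "prstat -c 1 30",
    "kstat -n system_pages",
]


def stamp(command, output, when):
    """Format one command's output under a timestamped header line."""
    header = "=== %s === %s" % (when.strftime("%Y-%m-%d %H:%M:%S"), command)
    return header + "\n" + output.rstrip("\n") + "\n"


def sample(commands, run=None):
    """Run every command once and return the stamped outputs as one string."""
    if run is None:
        def run(cmd):
            # shell=True because the first command uses a pipe.
            done = subprocess.run(cmd, shell=True, capture_output=True, text=True)
            return done.stdout + done.stderr
    return "".join(stamp(cmd, run(cmd), datetime.now()) for cmd in commands)


def collect(path, interval_seconds, samples):
    """Append `samples` snapshots to `path`, sleeping between them."""
    with open(path, "a") as log:
        for _ in range(samples):
            log.write(sample(COMMANDS))
            log.flush()
            time.sleep(interval_seconds)
```

For a machine that degrades over a week, a call such as collect("/var/tmp/memwatch.log", 2 * 60 * 60, 84) would take one snapshot every two hours for seven days. Note that "prstat -c 1 30" alone runs for about 30 seconds, so for per-second sampling that command would have to be shortened or dropped.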
max at bruningsystems.com
2009-Nov-20 19:06 UTC
[dtrace-discuss] [sysadmin-discuss] Who're stealing memory ?
Hi Jim,

Jim Mauro wrote:
> Back to memory consumers. We have;
> - The kernel
> - User processes
> - The file system cache (which is technically part of the kernel,
>   but significant enough such that it should be measured
>   separately.

tmpfs (i.e., /tmp, /var/run, and /etc/svc/volatile on my system) also uses memory, and I don't believe it is counted in the above.

max
Hi Jim,

Thanks for your reply; here's my update:

On Fri, Nov 20, 2009 at 10:02 PM, Jim Fiori <Jim.Fiori at sun.com> wrote:
> Did you check for additional virtual space usage in /tmp?

"df -k" shows only 1% used in the "/tmp" filesystem:

swap 10943720 968 10942752 1% /tmp
swap 10942832 80 10942752 1% /var/run

> Are you using ZFS (ARC space needed for that)?

No ZFS is used; all filesystems are UFS.

> You can also try using the "::memstat" mdb dcmd to break out kernel memory
> further.

> ::memstat
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     111925               874    5%
Anon                      1715077             13399   83%
Exec and libs               64697               505    3%
Page cache                  71828               561    3%
Free (cachelist)            51148               399    2%
Free (freelist)             43872               342    2%

Total                     2058547             16082
Physical                  2037012             15914

As above, anonymous memory is very high. I think some user threads are using memory in an abnormal way. I checked one of the processes with "pmap -x" and found many stack/heap segments:

# ps -ef |grep bea |grep -v grep
kplustp 28447 1 0 07:01:37 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28447 1 0 07:01:37 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28443 1 0 07:01:37 ? 2:29 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28445 1 0 07:01:37 ? 1:24 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28457 1 0 07:01:38 ? 0:50 /export/home1/bea/jdk160_05//bin/java -Xms512m -Xmx1024m -Djava.awt.headless=tr
kplustp 28453 1 0 07:01:37 ? 1:55 /export/home1/bea/jdk160_05//bin/java -Xms512m -Xmx1024m -Xbootclasspath/p:./..
kplustp 28449 1 0 07:01:37 ? 0:25 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28508 1 0 07:01:44 ? 1:15 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -classpath ./../
kplustp 28451 1 0 07:01:37 ? 1:25 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28455 1 0 07:01:37 ? 1:27 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28439 1 0 07:01:36 ? 0:28 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28441 1 0 07:01:36 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
kplustp 28459 1 0 07:01:38 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djdbc.drivers=com.sybase.jdbc3.jdbc.SybD

# pmap -x 28447
28447: /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootc
Address Kbytes RSS Anon Locked Mode Mapped File
00010000 48 48 - - r-x-- java
0002A000 8 8 - - rwx-- java
0002C000 3920 272 264 - rwx-- [ heap ]
00400000 4096 - - - rwx-- [ heap ]
B62F8000 32 32 32 - rwx-R [ stack tid=24 ]
B647A000 8 8 8 - rwx-R [ stack tid=23 ]
B6678000 16 16 16 - rwx-R [ stack tid=21 ]
B677A000 8 8 8 - rwx-R [ stack tid=20 ]
B6878000 16 16 16 - rwx-R [ stack tid=19 ]
B68FE000 8 8 8 - rwx-R [ stack tid=18 ]
B6FFE000 8 8 8 - rwx-R [ stack tid=11 ]
B7070000 1584 1504 - - r--s- dev:85,50 ino:129269
B77C0000 32 32 - - r-x-- libaio.so.1
B77D8000 8 8 - - rwx-- libaio.so.1
B77E0000 24 24 - - r-x-- librt.so.1
B77F6000 8 8 - - rwx-- librt.so.1
B7800000 16384 12288 12288 - rwx-- [ anon ]
BB800000 176128 - - - rwx-- [ anon ]
E6400000 90112 8192 8192 - rwx-- [ anon ]
FBC10000 336 336 - - r-x-- libtibrv.so
FBC72000 24 24 24 - rwx-- libtibrv.so
FBD10000 24 24 - - r-x-- libnio.so
FBD24000 16 16 8 - rwx-- libnio.so
FBD30000 8 8 - - r-x-- libkstat.so.1
FBD42000 8 8 - - rwx-- libkstat.so.1
FBD50000 88 88 - - r-x-- libtibrvcm.so
FBD74000 16 16 8 - rwx-- libtibrvcm.so
FBE10000 24 24 - - r-x-- libtibrvft.so
FBE24000 8 8 - - rwx-- libtibrvft.so
FBE30000 48 48 - - r-x-- libtibrvcmq.so
FBE4A000 8 8 - - rwx-- libtibrvcmq.so
FBE50000 72 56 - - r-x-- libnet.so
FBE70000 8 8 - - rwx-- libnet.so
FBE80000 344 - - - rwx-- [ anon ]
FBFE0000 32 32 - - r-x-- libtibrvj.so
FBFF0000 16 16 - - r--s- dev:85,50 ino:128526
FBFF6000 16 8 - - rwx-- libtibrvj.so
FC000000 4096 4096 4096 - rwx-- [ anon ]
FE010000 32 32 - - r--s- dev:85,50 ino:129270
FE020000 16 16 - - r-x-- libpthread.so.1
FE030000 16 16 - - r-x-- libKtpcrypt.so
FE042000 16 8 - - rwx-- libKtpcrypt.so
FE04C000 160 160 - - r--s- dev:85,60 ino:274206
FE080000 32 - - - rwx-- [ anon ]
FE0A0000 344 - - - rwx-- [ anon ]
FE1F6000 176 8 8 - rwx-- [ anon ]
FE2A2000 8 - - - rwx-- [ anon ]
FE2B0000 32 32 - - r--s- dev:85,50 ino:128648
FE2C0000 32 24 - - r--s- dev:85,60 ino:314837
FE2D2000 152 144 - - r--s- dev:85,60 ino:314730
FE300000 24 - - - rwx-- [ anon ]
FE390000 32 32 - - r--s- dev:85,60 ino:314841
FE3A0000 32 - - - rwx-- [ anon ]
FE3D0000 64 64 - - r-x-- libzip.so
FE3E0000 8 - - - rwx-- libzip.so
FE3E8000 16 - - - r--s- dev:85,60 ino:274380
FE3F0000 152 136 - - r-x-- libjava.so
FE418000 24 - - - r--s- dev:85,60 ino:274288
FE420000 8 - - - r--s- dev:85,60 ino:134780
FE426000 8 - - - rwx-- libjava.so
FE430000 56 56 - - r-x-- libverify.so
FE440000 40 40 - - r--s- dev:85,5 ino:29959
FE44E000 8 - - - rwx-- libverify.so
FE460000 64 - - - rwx-- [ anon ]
FE510000 32 32 - - r-x-- libhpi.so
FE520000 8 8 8 - rwx-- [ anon ]
FE528000 8 - - - rwx-- libhpi.so
FE52A000 8 - - - rwx-- libhpi.so
FE530000 64 56 56 - rwx-- [ anon ]
FE550000 64 - - - rw--- [ anon ]
FE570000 64 32 32 - rw--- [ anon ]
FE590000 16 16 - - r-x-- libmp.so.2
FE5A0000 8 8 8 - rwx-- [ anon ]
FE5A4000 8 - - - rwx-- libmp.so.2
FE5B0000 80 80 - - r-x-- libmd.so.1
FE5CC000 16 16 - - r--s- dev:85,60 ino:314729
FE5D4000 8 8 - - rwx-- libmd.so.1
FE5E0000 24 24 - - r-x-- libgen.so.1
FE5E8000 32 32 - - r--s- dev:85,60 ino:314782
FE5F6000 8 8 - - rwx-- libgen.so.1
FE600000 680 680 - - r-x-- libm.so.2
FE6B0000 8 - - - r--s- dev:85,60 ino:134781
FE6B8000 32 32 - - rwx-- libm.so.2
FE6C2000 96 96 - - r--s- dev:85,60 ino:314727
FE6E0000 32 32 - - r-x-- libuutil.so.1
FE6F0000 16 16 - - r--s- dev:85,60 ino:314707
FE6F8000 8 8 - - rwx-- libuutil.so.1
FE700000 584 584 - - r-x-- libnsl.so.1
FE7A2000 40 40 - - rwx-- libnsl.so.1
FE7AC000 24 - - - rwx-- libnsl.so.1
FE7C0000 16 16 - - r--s- dev:85,60 ino:314725
FE7D0000 96 96 - - r-x-- libscf.so.1
FE7F0000 8 8 8 - rwx-- [ anon ]
FE7F8000 8 8 - - rwx-- libscf.so.1
FE800000 9064 8616 - - r-x-- libjvm.so
FF0E0000 32 16 - - rw-s- dev:324,2 ino:106631901
FF0EA000 280 80 80 - rwx-- libjvm.so
FF130000 88 56 56 - rwx-- libjvm.so
FF150000 8 8 8 - rwx-- [ anon ]
FF160000 8 8 - - r-x-- libdoor.so.1
FF172000 8 8 - - rwx-- libdoor.so.1
FF180000 56 56 - - r-x-- libCrun.so.1
FF190000 8 8 8 - rwx-- [ anon ]
FF19C000 8 8 - - rwx-- libCrun.so.1
FF19E000 24 - - - rwx-- libCrun.so.1
FF1B0000 16 8 - - r-x-- libm.so.1
FF1C2000 8 - - - rwx-- libm.so.1
FF1D0000 48 48 - - r-x-- libsocket.so.1
FF1E0000 8 8 - - r---- [ anon ]
FF1EC000 8 8 8 - rwx-- libsocket.so.1
FF1F0000 8 8 - - r-x-- libsched.so.1
FF200000 1208 1208 - - r-x-- libc.so.1
FF330000 24 8 8 - rwx-- [ anon ]
FF33E000 40 40 32 - rwx-- libc.so.1
FF348000 8 8 8 - rwx-- libc.so.1
FF350000 8 8 - - r-x-- libdl.so.1
FF35C000 16 16 - - r--s- dev:85,60 ino:314813
FF362000 8 8 - - rwx-- libdl.so.1
FF370000 32 24 - - r-x-- libjli.so
FF380000 8 8 8 - rwx-- [ anon ]
FF386000 16 8 - - rwx-- libjli.so
FF390000 8 8 - - r-x-- libc_psr.so.1
FF3A0000 16 16 - - r-x-- libthread.so.1
FF3B0000 208 208 - - r-x-- ld.so.1
FF3E8000 8 - - - r--s- dev:85,60 ino:274115
FF3F0000 8 8 8 - rwx-- [ anon ]
FF3F4000 8 8 8 - rwx-- ld.so.1
FF3F6000 8 8 8 - rwx-- ld.so.1
FF3FA000 8 8 - - rwxs- [ anon ]
FFBFA000 24 8 8 - rwx-- [ stack ]
-------- ------- ------- ------- -------
total Kb 312632 40560 25344 -

Other processes started by user "kplustp" have similar memory usage to the above.

Thanks.

Best Regards,
Simon
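The ::memstat figures above can be cross-checked by converting pages to megabytes and ranking the consumers. This is a rough sketch only: it assumes 8 KB pages (typical for SPARC), the rows are pasted from Simon's output, and parse_memstat is a name invented for this example.

```python
import re

PAGE_SIZE_KB = 8  # assumption: 8 KB base pages, as on SPARC systems

# Rows copied from the ::memstat output in the message above.
MEMSTAT = """\
Kernel 111925 874 5%
Anon 1715077 13399 83%
Exec and libs 64697 505 3%
Page cache 71828 561 3%
Free (cachelist) 51148 399 2%
Free (freelist) 43872 342 2%
Total 2058547 16082
"""


def parse_memstat(text):
    """Map each ::memstat category name to its page count."""
    pages = {}
    for line in text.splitlines():
        # name, pages, MB, and an optional percentage column
        m = re.match(r"^(.*?)\s+(\d+)\s+(\d+)(?:\s+\d+%)?\s*$", line)
        if m:
            pages[m.group(1)] = int(m.group(2))
    return pages


pages = parse_memstat(MEMSTAT)
total = pages.pop("Total")

# Largest consumers first.
for name, count in sorted(pages.items(), key=lambda kv: kv[1], reverse=True):
    mb = count * PAGE_SIZE_KB // 1024
    print("%-18s %6d MB %5.1f%%" % (name, mb, 100.0 * count / total))
```

For comparison, the pmap total for PID 28447 shows 25344 KB of anonymous memory, so that single process appears to account for only a small part of the 13399 MB reported under Anon.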
Hi Jim, Thank you. see my update inline. Thanks. Best Regards, Simon On Fri, Nov 20, 2009 at 11:51 PM, Jim Mauro <James.Mauro at sun.com> wrote:> If you''re running out of memory, which it appears you are, > you need to profile the memory consumers, and determine if > you have either a memory leak somewhere, or an under-configured > system. Note 16GB is really tiny by todays standards, especially for > an M5000-class server. It''s like putting an engine from a Ford sedan > into an 18-wheel truck - the capacity to do work is severely limited > by a lack of towing power. Laptops ship with 8GB these days... > > Back to memory consumers. We have; > - The kernel > - User processes > - The file system cache (which is technically part of the kernel, > but significant enough such that it should be measured > seperately. > > If the database on a file system, and if so, which one (UFS? ZFS, > VxFS?). How much shared memory is really being used > (ipcs -a)? >Just UFS used.here''s the ouput of "ipcs -a":> If the system starts off well, and degrades over time, then you need > to capture memory data over time and see what area is growing. > Based on that data, we can determine if something is leaking memory, > or you have an underconfigured machine. 
> > I would start with; > echo "::memstat" | mdb -k > ipcs -a ># ipcs -a IPC status from <running system> as of Thu Nov 12 12:05:28 HKT 2009 T ID KEY MODE OWNER GROUP CREATOR CGROUP CBYTES QNUM QBYTES LSPID LRPID STIME RTIME CTIME Message Queues: T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME Shared Memory: m 3 0xe9032d40 --rw------- sybase staff sybase staff 3 738803712 1314 2125 20:47:22 no-entry 20:47:14 m 2 0x51 --rw-rw-r-- root root root root 1 2000196 2122 8553 15:15:18 15:15:23 13:14:38 m 1 0x50 --rw-rw-r-- root root root root 1 600196 2121 2121 13:14:38 no-entry 13:14:38 m 0 0xe9032d32 --rw------- sybase staff sybase staff 3 7851147264 1314 2125 13:14:40 13:14:40 13:13:42 T ID KEY MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME CTIME Semaphores: s 1 0x51 --ra-ra-ra- root root root root 6 12:05:28 13:14:38 s 0 0x50 --ra-ra-ra- root root root root 6 12:05:28 13:14:38 # ipcs -mb (after adjust the share memory define in "/etc/system" from 0xfffffffff to 0x20000000) IPC status from <running system> as of Thu Nov 19 16:38:17 HKT 2009 T ID KEY MODE OWNER GROUP SEGSZ Shared Memory: m 2 0x51 --rw-rw-r-- root root 2000196 m 1 0x50 --rw-rw-r-- root root 600196 m 0 0xe9032d32 --rw------- sybase staff 8548687872> ps -eo pid,vsz,rss,class,pri,fname,args > prstat -c 1 30 >>From the "prstat" output,we found 3 sybase process,and each process derived12 threads,the java process(launched by customer application) derived total 370 threads, I think it''s too many threads(especially of "java" program) that generate excessive stack/heaps,and finally used up the RAM ? 
So I think decreasing the shared memory used by Sybase (defined at the Sybase configuration layer, not in the "/etc/system" file) would be helpful?

> kstat -n system_pages

I captured the system_pages usage for about half an hour. One sample looks like this:

Mon Nov 16 17:24:25 2009
module: unix                            instance: 0
name:   system_pages                    class:    pages
        availrmem                       857798
        crtime                          89.53186
        desfree                         15914
        desscan                         8972
        econtig                         188874752
        fastscan                        1002870
        freemem                         30730
        kernelbase                      16777216
        lotsfree                        31828
        minfree                         7957
        nalloc                          66478696
        nalloc_calls                    19381
        nfree                           55736969
        nfree_calls                     14546
        nscan                           5520
        pagesfree                       30730
        pageslocked                     1169036
        pagestotal                      2037012
        physmem                         2058547
        pp_kernel                       189372
        slowscan                        100
        snaptime                        359704.2493636

> You need to collect that data at some regular interval
> with timestamps. The interval depends on how long it takes
> the machine to degrade. If the system goes from fresh boot to
> degraded state in 1 hour, I'd collect the data every second.
> If the machine goes from fresh boot to degraded state in 1 week,
> I'd grab the data every 2 hours or so.
>
> /jim
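As a rough reading aid for the kstat sample above, the sketch below compares freemem with the paging thresholds reported in the same sample. The numbers are copied from the message; the 8 KB page size is an assumption inferred from the ::memstat output elsewhere in this thread (2058547 pages shown as 16082 MB), and the helper names are made up for illustration.

```python
# Values copied from the kstat sample in the message above (units: pages).
# PAGE_KB is an assumption: 8 KB pages, inferred from the ::memstat output
# elsewhere in the thread (2058547 pages reported as 16082 MB).
PAGE_KB = 8

kstat = {
    "pagestotal": 2037012,
    "freemem": 30730,
    "lotsfree": 31828,
    "desfree": 15914,
    "minfree": 7957,
}


def pages_to_mb(pages):
    """Convert a page count to megabytes using the assumed page size."""
    return pages * PAGE_KB / 1024.0


def memory_state(k):
    """Report which paging threshold freemem has dropped below, if any."""
    free = k["freemem"]
    if free < k["minfree"]:
        return "below minfree"
    if free < k["desfree"]:
        return "below desfree"
    if free < k["lotsfree"]:
        return "below lotsfree"
    return "above lotsfree"


for name in ("freemem", "lotsfree", "desfree", "minfree"):
    print("%-9s %8d pages  %7.1f MB" % (name, kstat[name], pages_to_mb(kstat[name])))
print("state:", memory_state(kstat))
```

In this sample freemem sits just under lotsfree, which is consistent with the page scanner being active (nscan is non-zero). The thresholds also line up with the 1/64th rule mentioned earlier in the thread: pagestotal / 64 gives lotsfree, and desfree and minfree are each half of the previous value.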
Hi Michael,

Now that the system has been reset, that process seems to have disappeared, but I found some similar processes launched by the same user "kplus", such as:

kplus 20905 0.1 1.1 464984 180288 ? S 08:18:38 2:17 /usr/java/bin/java -Dprogram.name=run.sh -server -Xms128m -Xmx512m -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.rmi.dgc.server.gcInterval=3600000 -Djava.awt.headless=true -Djava.endorsed.dirs=/export/home1/jboss/jboss-4.0.5.GA/lib/endorsed -classpath /export/home1/jboss/jboss-4.0.5.GA/bin/run.jar:/usr/java/lib/tools.jar org.jboss.Main

Thanks.
Best Regards,
Simon

On Sat, Nov 21, 2009 at 4:22 PM, Michael Schulte <mschulte at sunspezialist.de> wrote:

> Hey Simon,
>
> ># pmap -x 28447
> >28447: /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootc
>
> Can you give the full argument list of this java?
>
> Memory de-allocation in Java is fully asynchronous in Garbage Collection
> and can be tuned by command line options when starting the application.
> Just Google for the exact syntax.
>
> Michael
>
> Simon wrote:
>
>> Hi Jim,
>>
>> Thanks for your reply, here's my update:
>>
>>     Did you check for additional virtual space usage in /tmp?
>>
>> "df -k" shows only 1% used in the "/tmp" filesystem:
>> swap 10943720 968 10942752 1% /tmp
>> swap 10942832 80 10942752 1% /var/run
>>
>>     Are you using ZFS (ARC space needed for that)?
>>
>> No ZFS is used; all filesystems are UFS.
>>
>>     You can also try using the "::memstat" mdb dcmd to break out kernel
>>     memory further.
>>
>> > ::memstat
>>
>> Page Summary                Pages                MB  %Tot
>> ------------     ----------------  ----------------  ----
>> Kernel                     111925               874    5%
>> Anon                      1715077             13399   83%
>> Exec and libs               64697               505    3%
>> Page cache                  71828               561    3%
>> Free (cachelist)            51148               399    2%
>> Free (freelist)             43872               342    2%
>>
>> Total                     2058547             16082
>> Physical                  2037012             15914
>>
>> As above, the anonymous memory is very high. I think some user threads are
>> using memory in an abnormal way. I checked one of the processes with
>> "pmap -x" and found many stack/heap segments:
>>
>> # ps -ef |grep bea |grep -v grep
>> kplustp 28447 1 0 07:01:37 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28447 1 0 07:01:37 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28443 1 0 07:01:37 ? 2:29 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28445 1 0 07:01:37 ? 1:24 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28457 1 0 07:01:38 ? 0:50 /export/home1/bea/jdk160_05//bin/java -Xms512m -Xmx1024m -Djava.awt.headless=tr
>> kplustp 28453 1 0 07:01:37 ? 1:55 /export/home1/bea/jdk160_05//bin/java -Xms512m -Xmx1024m -Xbootclasspath/p:./..
>> kplustp 28449 1 0 07:01:37 ? 0:25 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28508 1 0 07:01:44 ? 1:15 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -classpath ./../
>> kplustp 28451 1 0 07:01:37 ? 1:25 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28455 1 0 07:01:37 ? 1:27 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28439 1 0 07:01:36 ? 0:28 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28441 1 0 07:01:36 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootclasspath/
>> kplustp 28459 1 0 07:01:38 ? 0:26 /export/home1/bea/jdk160_05//bin/java -Djdbc.drivers=com.sybase.jdbc3.jdbc.SybD
>>
>> # pmap -x 28447
>> 28447: /export/home1/bea/jdk160_05//bin/java -Djava.awt.headless=true -Xbootc
>> Address   Kbytes     RSS    Anon  Locked Mode   Mapped File
>> 00010000      48      48       -       - r-x--  java
>> 0002A000       8       8       -       - rwx--  java
>> 0002C000    3920     272     264       - rwx--  [ heap ]
>> 00400000    4096       -       -       - rwx--  [ heap ]
>> B62F8000      32      32      32       - rwx-R  [ stack tid=24 ]
>> B647A000       8       8       8       - rwx-R  [ stack tid=23 ]
>> B6678000      16      16      16       - rwx-R  [ stack tid=21 ]
>> B677A000       8       8       8       - rwx-R  [ stack tid=20 ]
>> B6878000      16      16      16       - rwx-R  [ stack tid=19 ]
>> B68FE000       8       8       8       - rwx-R  [ stack tid=18 ]
>> B6FFE000       8       8       8       - rwx-R  [ stack tid=11 ]
>> B7070000    1584    1504       -       - r--s-  dev:85,50 ino:129269
>> B77C0000      32      32       -       - r-x--  libaio.so.1
>> B77D8000       8       8       -       - rwx--  libaio.so.1
>> B77E0000      24      24       -       - r-x--  librt.so.1
>> B77F6000       8       8       -       - rwx--  librt.so.1
>> B7800000   16384   12288   12288       - rwx--  [ anon ]
>> BB800000  176128       -       -       - rwx--  [ anon ]
>> E6400000   90112    8192    8192       - rwx--  [ anon ]
>> FBC10000     336     336       -       - r-x--  libtibrv.so
>> FBC72000      24      24      24       - rwx--  libtibrv.so
>> FBD10000      24      24       -       - r-x--  libnio.so
>> FBD24000      16      16       8       - rwx--  libnio.so
>> FBD30000       8       8       -       - r-x--  libkstat.so.1
>> FBD42000       8       8       -       - rwx--  libkstat.so.1
>> FBD50000      88      88       -       - r-x--  libtibrvcm.so
>> FBD74000      16      16       8       - rwx--  libtibrvcm.so
>> FBE10000      24      24       -       - r-x--  libtibrvft.so
>> FBE24000       8       8       -       - rwx--  libtibrvft.so
>> FBE30000      48      48       -       - r-x--  libtibrvcmq.so
>> FBE4A000       8       8       -       - rwx--  libtibrvcmq.so
>> FBE50000      72      56       -       - r-x--  libnet.so
>> FBE70000       8       8       -       - rwx--  libnet.so
>> FBE80000     344       -       -       - rwx--  [ anon ]
>> FBFE0000      32      32       -       - r-x--  libtibrvj.so
>> FBFF0000      16      16       -       - r--s-  dev:85,50 ino:128526
>> FBFF6000      16       8       -       - rwx--  libtibrvj.so
>> FC000000    4096    4096    4096       - rwx--  [ anon ]
>> FE010000      32      32       -       - r--s-  dev:85,50 ino:129270
>> FE020000      16      16       -       - r-x--  libpthread.so.1
>> FE030000      16      16       -       - r-x--  libKtpcrypt.so
>> FE042000      16       8       -       - rwx--  libKtpcrypt.so
>> FE04C000     160     160       -       - r--s-  dev:85,60 ino:274206
>> FE080000      32       -       -       - rwx--  [ anon ]
>> FE0A0000     344       -       -       - rwx--  [ anon ]
>> FE1F6000     176       8       8       - rwx--  [ anon ]
>> FE2A2000       8       -       -       - rwx--  [ anon ]
>> FE2B0000      32      32       -       - r--s-  dev:85,50 ino:128648
>> FE2C0000      32      24       -       - r--s-  dev:85,60 ino:314837
>> FE2D2000     152     144       -       - r--s-  dev:85,60 ino:314730
>> FE300000      24       -       -       - rwx--  [ anon ]
>> FE390000      32      32       -       - r--s-  dev:85,60 ino:314841
>> FE3A0000      32       -       -       - rwx--  [ anon ]
>> FE3D0000      64      64       -       - r-x--  libzip.so
>> FE3E0000       8       -       -       - rwx--  libzip.so
>> FE3E8000      16       -       -       - r--s-  dev:85,60 ino:274380
>> FE3F0000     152     136       -       - r-x--  libjava.so
>> FE418000      24       -       -       - r--s-  dev:85,60 ino:274288
>> FE420000       8       -       -       - r--s-  dev:85,60 ino:134780
>> FE426000       8       -       -       - rwx--  libjava.so
>> FE430000      56      56       -       - r-x--  libverify.so
>> FE440000      40      40       -       - r--s-  dev:85,5 ino:29959
>> FE44E000       8       -       -       - rwx--  libverify.so
>> FE460000      64       -       -       - rwx--  [ anon ]
>> FE510000      32      32       -       - r-x--  libhpi.so
>> FE520000       8       8       8       - rwx--  [ anon ]
>> FE528000       8       -       -       - rwx--  libhpi.so
>> FE52A000       8       -       -       - rwx--  libhpi.so
>> FE530000      64      56      56       - rwx--  [ anon ]
>> FE550000      64       -       -       - rw---  [ anon ]
>> FE570000      64      32      32       - rw---  [ anon ]
>> FE590000      16      16       -       - r-x--  libmp.so.2
>> FE5A0000       8       8       8       - rwx--  [ anon ]
>> FE5A4000       8       -       -       - rwx--  libmp.so.2
>> FE5B0000      80      80       -       - r-x--  libmd.so.1
>> FE5CC000      16      16       -       - r--s-  dev:85,60 ino:314729
>> FE5D4000       8       8       -       - rwx--  libmd.so.1
>> FE5E0000      24      24       -       - r-x--  libgen.so.1
>> FE5E8000      32      32       -       - r--s-  dev:85,60 ino:314782
>> FE5F6000       8       8       -       - rwx--  libgen.so.1
>> FE600000     680     680       -       - r-x--  libm.so.2
>> FE6B0000       8       -       -       - r--s-  dev:85,60 ino:134781
>> FE6B8000      32      32       -       - rwx--  libm.so.2
>> FE6C2000      96      96       -       - r--s-  dev:85,60 ino:314727
>> FE6E0000      32      32       -       - r-x--  libuutil.so.1
>> FE6F0000      16      16       -       - r--s-  dev:85,60 ino:314707
>> FE6F8000       8       8       -       - rwx--  libuutil.so.1
>> FE700000     584     584       -       - r-x--  libnsl.so.1
>> FE7A2000      40      40       -       - rwx--  libnsl.so.1
>> FE7AC000      24       -       -       - rwx--  libnsl.so.1
>> FE7C0000      16      16       -       - r--s-  dev:85,60 ino:314725
>> FE7D0000      96      96       -       - r-x--  libscf.so.1
>> FE7F0000       8       8       8       - rwx--  [ anon ]
>> FE7F8000       8       8       -       - rwx--  libscf.so.1
>> FE800000    9064    8616       -       - r-x--  libjvm.so
>> FF0E0000      32      16       -       - rw-s-  dev:324,2 ino:106631901
>> FF0EA000     280      80      80       - rwx--  libjvm.so
>> FF130000      88      56      56       - rwx--  libjvm.so
>> FF150000       8       8       8       - rwx--  [ anon ]
>> FF160000       8       8       -       - r-x--  libdoor.so.1
>> FF172000       8       8       -       - rwx--  libdoor.so.1
>> FF180000      56      56       -       - r-x--  libCrun.so.1
>> FF190000       8       8       8       - rwx--  [ anon ]
>> FF19C000       8       8       -       - rwx--  libCrun.so.1
>> FF19E000      24       -       -       - rwx--  libCrun.so.1
>> FF1B0000      16       8       -       - r-x--  libm.so.1
>> FF1C2000       8       -       -       - rwx--  libm.so.1
>> FF1D0000      48      48       -       - r-x--  libsocket.so.1
>> FF1E0000       8       8       -       - r----  [ anon ]
>> FF1EC000       8       8       8       - rwx--  libsocket.so.1
>> FF1F0000       8       8       -       - r-x--  libsched.so.1
>> FF200000    1208    1208       -       - r-x--  libc.so.1
>> FF330000      24       8       8       - rwx--  [ anon ]
>> FF33E000      40      40      32       - rwx--  libc.so.1
>> FF348000       8       8       8       - rwx--  libc.so.1
>> FF350000       8       8       -       - r-x--  libdl.so.1
>> FF35C000      16      16       -       - r--s-  dev:85,60 ino:314813
>> FF362000       8       8       -       - rwx--  libdl.so.1
>> FF370000      32      24       -       - r-x--  libjli.so
>> FF380000       8       8       8       - rwx--  [ anon ]
>> FF386000      16       8       -       - rwx--  libjli.so
>> FF390000       8       8       -       - r-x--  libc_psr.so.1
>> FF3A0000      16      16       -       - r-x--  libthread.so.1
>> FF3B0000     208     208       -       - r-x--  ld.so.1
>> FF3E8000       8       -       -       - r--s-  dev:85,60 ino:274115
>> FF3F0000       8       8       8       - rwx--  [ anon ]
>> FF3F4000       8       8       8       - rwx--  ld.so.1
>> FF3F6000       8       8       8       - rwx--  ld.so.1
>> FF3FA000       8       8       -       - rwxs-  [ anon ]
>> FFBFA000      24       8       8       - rwx--  [ stack ]
>> -------- ------- ------- ------- -------
>> total Kb  312632   40560   25344       -
>>
>> Other processes started by user "kplustp" have similar memory usage to the
>> above.
>>
>> Thanks.
>> Best Regards,
>> Simon
>>
>> On Fri, Nov 20, 2009 at 10:02 PM, Jim Fiori <Jim.Fiori at sun.com> wrote:
>>
>>     Simon,
>>
>>     For a 16GB box, the page scanner kicks in when freemem drops below
>>     1/64th of memory, or about 256MB. Doesn't matter if the system is
>>     idle or not.
>>
>>     The 'w' column numbers mean that threads were swapped out at some
>>     point in the past because of a severe memory shortage and never
>>     swapped back in (because they've not been awoken yet). So it's
>>     normal for that column to stay high even if much of the memory was
>>     released.
>>
>>     It looks to me like you're just oversubscribing memory. If you look
>>     at the prstat output I see easily 13-14GB of physical memory in use,
>>     plus you have the kernel memory. As for virtual memory, about 23GB
>>     shows up at least.
>>
>>     Did you check for additional virtual space usage in /tmp?
>>
>>     Are you using ZFS (ARC space needed for that)?
>>
>>     You can also try using the "::memstat" mdb dcmd to break out kernel
>>     memory further.
>>
>>     Jim
>
> --
> Michael Schulte
> mschulte at sunspezialist.de
> OpenSolaris Kernel Development
> http://opensolaris.org/
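To see where the resident memory in a "pmap -x" listing like the one quoted above actually sits, the rows can be totalled per mapping type. The following is only a minimal sketch: the function names and category labels are invented for illustration, and SAMPLE holds a handful of rows copied from the quoted listing rather than the full output.

```python
from collections import defaultdict

# A few rows copied from the quoted "pmap -x 28447" listing.
SAMPLE = """\
Address Kbytes RSS Anon Locked Mode Mapped File
0002C000 3920 272 264 - rwx-- [ heap ]
00400000 4096 - - - rwx-- [ heap ]
B62F8000 32 32 32 - rwx-R [ stack tid=24 ]
B7800000 16384 12288 12288 - rwx-- [ anon ]
BB800000 176128 - - - rwx-- [ anon ]
E6400000 90112 8192 8192 - rwx-- [ anon ]
FC000000 4096 4096 4096 - rwx-- [ anon ]
FE800000 9064 8616 - - r-x-- libjvm.so
"""


def classify(mapped):
    """Bucket the 'Mapped File' column into a coarse category."""
    for kind in ("heap", "stack", "anon"):
        if mapped.startswith("[ " + kind):
            return kind
    return "file"


def to_kb(field):
    """pmap prints '-' when a column is zero or not applicable."""
    return 0 if field == "-" else int(field)


def summarize(pmap_text):
    """Sum the Kbytes, RSS and Anon columns per category."""
    totals = defaultdict(lambda: {"kbytes": 0, "rss": 0, "anon": 0})
    for line in pmap_text.splitlines():
        fields = line.split(None, 6)
        # Skip the header, separator and total lines.
        if len(fields) < 7 or not fields[1].isdigit():
            continue
        _addr, kbytes, rss, anon, _locked, _mode, mapped = fields
        bucket = totals[classify(mapped)]
        bucket["kbytes"] += to_kb(kbytes)
        bucket["rss"] += to_kb(rss)
        bucket["anon"] += to_kb(anon)
    return dict(totals)


for kind, t in sorted(summarize(SAMPLE).items()):
    print("%-6s kbytes=%7d rss=%6d anon=%6d" % (kind, t["kbytes"], t["rss"], t["anon"]))
```

In the quoted listing the three large [ anon ] mappings account for about 24MB of the roughly 40MB RSS, while the thread stacks add up to around 100KB, so for this particular process the stacks do not look like the main consumer.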
You have about 9GB of shared memory (on a 16GB machine).

> From the "prstat" output, we found 3 sybase processes, each with
> 12 threads; the java processes (launched by the customer application)
> have 370 threads in total. I think it's too many threads (especially in
> the "java" programs) generating excessive stacks/heaps that finally used
> up the RAM?

Java can consume a lot of memory. Need to see the memory sizes,
but it's certainly a possibility.

> So I think decreasing the shared memory used by Sybase (defined at the
> Sybase configuration layer, not in the "/etc/system" file) would be
> helpful?

Sure. If you take memory away from one consumer, it leaves
more for the others. Whether or not it actually solves your
problem, meaning after such a change the system has sufficient
memory to run without paging, remains to be seen.

In order to be sure, you need to do some additional memory
accounting and determine how much RAM you need to support
the shared segments for Sybase, and the JVMs.

Thanks,
/jim
Right. All your memory appears to be anon segments - 13.4GB worth.
About 9GB of that is the shared memory segments. That leaves 4.4GB.

I see 13 Java processes listed. Assuming they have a similar memory
footprint to the one pmap example, which shows about 40MB of RSS,
that's (40MB x 13) about 500MB. It's of course possible that the other
JVMs are using more memory than the one pmap example. The ps output or
prstat would be helpful here.

You still need to account for about 3.9GB of anon usage...

Thanks,
/jim
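The accounting in the reply above can be written out step by step. This is only a sketch of that arithmetic using the rounded figures from the message; like the message, it loosely treats 1GB as 1000MB, and the variable names are invented here.

```python
# Rounded figures taken from the message above.
anon_gb = 13.4        # anon memory reported by ::memstat
shm_gb = 9.0          # shared memory segments ("about 9GB")
jvm_count = 13        # Java processes listed in the ps output
jvm_rss_mb = 40       # RSS of the single pmap example

after_shm_gb = anon_gb - shm_gb              # anon left once shared memory is removed
jvm_total_gb = jvm_count * jvm_rss_mb / 1000.0
unaccounted_gb = after_shm_gb - jvm_total_gb

print("anon minus shared memory: %.1f GB" % after_shm_gb)
print("estimated JVM RSS total:  %.2f GB" % jvm_total_gb)
print("still unaccounted:        %.1f GB" % unaccounted_gb)
```

The estimate depends entirely on the assumption that all 13 JVMs look like the one sampled with pmap, which is why per-process RSS from ps or prstat is the next thing to collect.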
Hi Jim,

> In order to be sure, you need to do some additional memory
> accounting and determine how much RAM you need to support
> the shared segments for Sybase, and the JVMs.

That's difficult for me right now, since I don't know what the real troublemaker is. I suspect the JVMs. I will also suggest reducing the shared memory allocation for Sybase, which is controlled in the Sybase configuration file, not in "/etc/system".

Thanks.
Best Regards,
Simon
Have you grabbed periodic snapshots with ps or prstat?
These will give you a sense of which processes have large
physical memory footprints, and you can pmap from there...

Thanks,
/jim
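One way to act on this suggestion is to rank each ps snapshot by resident set size and run pmap on whatever stays at the top. The sketch below is an illustration only: it assumes output in the layout produced by "ps -eo pid,rss,vsz,args", needs Python 3.7 or later for snapshot(), and the function names are invented. In SAMPLE, the first row uses the SZ and RSS figures from the "kplus" ps line earlier in the thread (as reconstructed from its run-together columns) with a shortened command line; the other rows are hypothetical.

```python
import subprocess
import time

# First row: figures from the "kplus" ps line in this thread (reconstructed).
# Remaining rows are hypothetical, for illustration only.
SAMPLE = """\
  PID    RSS    VSZ COMMAND
20905 180288 464984 /usr/java/bin/java -Dprogram.name=run.sh -server
  321   2048   6144 /usr/sbin/cron
  654  40560 312632 /export/home1/bea/jdk160_05//bin/java
"""


def parse_ps(text):
    """Turn 'pid rss vsz args' lines into (pid, rss_kb, vsz_kb, args) tuples."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 3)
        # Skip the header and any line without three numeric columns.
        if len(parts) < 4 or not all(p.isdigit() for p in parts[:3]):
            continue
        pid, rss, vsz, args = parts
        rows.append((int(pid), int(rss), int(vsz), args))
    return rows


def top_by_rss(rows, n=10):
    """Return the n rows with the largest resident set size."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]


def snapshot(n=10):
    """Take one timestamped snapshot from the live system (not run here)."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,vsz,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return time.strftime("%Y-%m-%d %H:%M:%S"), top_by_rss(parse_ps(out), n)


for pid, rss, vsz, args in top_by_rss(parse_ps(SAMPLE), 3):
    print("%6d rss=%7d KB vsz=%7d KB %s" % (pid, rss, vsz, args))
```

Calling snapshot() from a loop or a cron job at the interval suggested earlier in the thread would give a timestamped series to compare against the vmstat and kstat data.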