I've been testing ZFS since it came out in b27, and this week I BFUed to
b30.  I've seen two problems, one I'll call minor and the other major.
The hardware is a Dell PowerEdge 2600 with 2 3.2GHz Xeons, 2GB memory and
a perc3 controller.  I have created a filesystem for each of over 1000
users on it and take hourly snapshots; each run destroys the snapshot from
24 hours ago, except the one at 11pm, which destroys the snapshot from one
week ago.  This all works fine except that it takes a long time to reboot
(the minor problem), and the whole system will hang and need to be power
cycled after a while (the major problem).

First, here is the zfs setup:

# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
pool1                   169G   23.5G    146G    13%  ONLINE     -
pool2                   408G    123G    285G    30%  ONLINE     -
# zpool status
  pool: pool1
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        pool1         ONLINE       0     0     0
          raidz       ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
            c0t2d0s0  ONLINE       0     0     0
            c0t3d0s0  ONLINE       0     0     0
            c0t4d0s0  ONLINE       0     0     0
            c0t5d0s0  ONLINE       0     0     0

  pool: pool2
 state: ONLINE
 scrub: none requested
config:

        NAME           STATE     READ WRITE CKSUM
        pool2          ONLINE       0     0     0
          raidz        ONLINE       0     0     0
            c0t6d0s0   ONLINE       0     0     0
            c0t7d0s0   ONLINE       0     0     0
            c0t8d0s0   ONLINE       0     0     0
            c0t9d0s0   ONLINE       0     0     0
            c0t10d0s0  ONLINE       0     0     0
            c0t11d0s0  ONLINE       0     0     0

The disks in pool1 are 36GB internal and the disks in pool2 are 73GB in an
external box.  There are 370 user zfs filesystems on pool1 and 660 on
pool2.  All filesystems have quotas set, are shared to some netgroups, and
have the suid and device mount options turned off.

I assume the long reboot time is to mount and share so many filesystems.
Does on the order of 45 minutes to reboot seem right, though?

The major problem is the hanging after a while.  Sometimes it can go maybe
5 days, other times it fails in a day or two.  The load is very low
(updating changes from the production server once a day or so).  When the
hang happens, a df will go through the ufs filesystems and stop before
printing the first zfs one.  Then the system is unresponsive and must be
rebooted.  Any ideas on this?

I'm going to try redoing it with hardware raid5 next instead of raidz and
see if that stops the hang.

Ben
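P.S. For anyone curious, the rotation can be expressed as a cron script
along these lines.  This is a simplified sketch rather than the actual
script, and the per-user filesystem layout (users directly under pool1 and
pool2) is just an assumption:

#!/bin/sh
# Simplified sketch of the hourly rotation (assumes user filesystems live
# directly under pool1 and pool2).  Naming hourly snapshots by hour of day
# means creating this hour's snapshot replaces the one taken 24 hours ago;
# the 11pm run instead keeps one snapshot per weekday, replacing the one
# taken a week ago.
hour=`date +%H`
day=`date +%a`

for fs in `zfs list -H -o name -r pool1 pool2 | grep /`; do
        if [ "$hour" = "23" ]; then
                snap="$fs@daily-$day"
        else
                snap="$fs@hourly-$hour"
        fi
        zfs destroy "$snap" 2>/dev/null    # ignore "snapshot does not exist"
        zfs snapshot "$snap"
done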
On Fri, Jan 13, 2006 at 08:40:43AM -0800, Ben Miller wrote:
> The disks in pool1 are 36GB internal and the disks in pool2 are 73GB
> in an external box.  There are 370 user zfs filesystems on pool1 and
> 660 on pool2.  All filesystems have quotas set, are shared to some
> netgroups, and have the suid and device mount options turned off.
>
> I assume the long reboot time is to mount and share so many
> filesystems.  Does on the order of 45 minutes to reboot seem right,
> though?

No, this is definitely not "right".  I haven't had the chance to do much
userland optimization, but it's on the list.  My guess is that there is
some low-hanging fruit combined with some bugs (i.e. memory leaks from
libzfs).  I'll try to reproduce this, but once the system is up, can you
try running:

# ptime zfs unmount -a
# ptime zfs mount -a
# ptime zfs share -a

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
> > I assume the long reboot time is to mount and share so many
> > filesystems.  Does on the order of 45 minutes to reboot seem right,
> > though?
>
> No, this is definitely not "right".  I haven't had the chance to do much
> userland optimization, but it's on the list.  My guess is that there is
> some low-hanging fruit combined with some bugs (i.e. memory leaks from
> libzfs).  I'll try to reproduce this, but once the system is up, can you
> try running:
>
> # ptime zfs unmount -a
> # ptime zfs mount -a
> # ptime zfs share -a
>
> - Eric

I'll post the time numbers next week.

thanks,
Ben
> The major problem is the hanging after a while.  Sometimes it can go
> maybe 5 days, other times it fails in a day or two.  The load is very
> low (updating changes from the production server once a day or so).
> When the hang happens, a df will go through the ufs filesystems and stop
> before printing the first zfs one.  Then the system is unresponsive and
> must be rebooted.  Any ideas on this?
> I'm going to try redoing it with hardware raid5 next instead of raidz
> and see if that stops the hang.

I think it's unlikely that raid-z is the issue.  I don't think that
switching from raid-z to either mirroring or hardware raid will improve
your situation.

Just to be clear, you have one filesystem *each* for over 1000 users,
right?  So every hour you are taking 1000+ snapshots, one for each user's
filesystem?

There is a known performance issue with taking many snapshots at once.  It
shouldn't be too difficult to fix, so hopefully I will get to it soon.  If
possible, it would help to get a crash dump from the system while it is
hung.  (You should be able to get a dump on x86 even if the kernel is
wedged by running under kmdb.)

thanks,
--matt
On 1/13/06, Matthew A. Ahrens <Matthew.Ahrens at sun.com> wrote:
>
> There is a known performance issue with taking many snapshots at once.
> It shouldn't be too difficult to fix, so hopefully I will get to it
> soon.  If possible, it would help to get a crash dump from the system
> while it is hung.  (You should be able to get a dump on x86 even if the
> kernel is wedged by running under kmdb.)

Can someone explain a little about "get a dump on x86 by running under
kmdb", or point me to some documentation?

Several days ago when I ftp'd a 1GB+ file to a ZFS filesystem (striped on
two SCSI disks) on a remote system, it went into a state where no new
process could be forked: the ftp finished, the machine was pingable, but
no new connection could be established.  On the pre-existing ssh session I
could execute shell built-in commands (echo, etc.), but could not fork any
new executable/process.  I waited for some time, thinking some kernel
resource was in shortage, but it was still in the same state after 20
minutes.

It's not necessarily a ZFS problem, although the ftp was the only user
activity running at that point.  I wish there were a way to force a dump
in a situation like that; then I'd know for sure.  The system is a W2100z
running b27.

I wonder how I can force a dump when it's running under kmdb.  Push the
power button?  Remember, I can't run any new command in that state,
although the system is not completely dead.

TIA.

Tao
On Fri, Jan 13, 2006 at 04:38:11PM -0600, Tao Chen wrote:
> Can someone explain a little about "get a dump on x86 by running under
> kmdb", or point me to some documentation?

I'm not sure where such documentation is.  Generally:

        /usr/sbin/reboot -d

will immediately induce a crash dump, assuming you can actually fork and
exec it.  If that's not working, you might be able to do:

        exec /usr/sbin/reboot -d

but if there's trouble in the exec path, you could be screwed.

If you've booted under kmdb(1) (i.e. with a -k flag), or after kmdb(1) is
loaded using 'mdb -K', then:

        1. on sparc, typing STOP-A (hold down stop, type 'a') should drop
           you to a kmdb prompt, or
        2. on i386/amd64, the key sequence F1-A (hold down F1, hit A)
           should drop you to a kmdb prompt, or
        3. on many platforms, hitting the power button three times in
           less than a second should drop you to a kmdb prompt.

Once you've hit the kmdb(1) prompt, entering:

        $<systemdump

will induce a crash dump.  If instead you want to continue running the
system, do:

        :c

Cheers,
- jonathan

--
Jonathan Adams, Solaris Kernel Development
On 1/13/06, Jonathan Adams <jonathan.adams at sun.com> wrote:
>
>         exec /usr/sbin/reboot -d
>
> but if there's trouble in the exec path, you could be screwed.
>
> If you've booted under kmdb(1) (i.e. with a -k flag), or after kmdb(1)
> is loaded using 'mdb -K', then:
>
>         1. on sparc, typing STOP-A (hold down stop, type 'a') should
>            drop you to a kmdb prompt, or
>         2. on i386/amd64, the key sequence F1-A (hold down F1, hit A)
>            should drop you to a kmdb prompt, or
>         3. on many platforms, hitting the power button three times in
>            less than a second should drop you to a kmdb prompt.

Thank you!  I'll give those a try later.

Sounds like if "exec /usr/sbin/reboot -d" fails, I need to hook up a
keyboard and a monitor in order to get the kmdb prompt, that is, if the
kernel can still detect keyboard hotplug at that time.  I wish the x86
power button could be programmed to initiate a dump and reboot
automatically.

Didn't mean to hijack this thread.  I was wondering if the OP's system was
in a similar state as mine.  The OP described it as "the system is
unresponsive and must be rebooted".

Tao
I can almost assert that RAID-Z is not the issue.  Given your config, I
would guess that 32-bit VA exhaustion in the kernel is your issue.  It's
a known problem with ZFS on 32-bit machines, and we're working on it.
In the meantime, you might consider lowering kernelbase to get some more
breathing room.  Also, cranking up the frequency of the kmem reaper
helps.  Try adding these two lines to /etc/system and then rebooting:

        set kmem_reap_interval = 0x64
        set kernelbase = 0xa0000000

That should help to alleviate most of your problems until we can get it
fixed for real.  If not, let us know and we can dig some more.

--Bill

On Fri, Jan 13, 2006 at 08:40:43AM -0800, Ben Miller wrote:
> I've been testing ZFS since it came out in b27, and this week I BFUed to
> b30.  I've seen two problems, one I'll call minor and the other major.
> The hardware is a Dell PowerEdge 2600 with 2 3.2GHz Xeons, 2GB memory
> and a perc3 controller.  I have created a filesystem for each of over
> 1000 users on it and take hourly snapshots; each run destroys the
> snapshot from 24 hours ago, except the one at 11pm, which destroys the
> snapshot from one week ago.  This all works fine except that it takes a
> long time to reboot (the minor problem), and the whole system will hang
> and need to be power cycled after a while (the major problem).
>
> [...]
>
> The major problem is the hanging after a while.  Sometimes it can go
> maybe 5 days, other times it fails in a day or two.  The load is very
> low (updating changes from the production server once a day or so).
> When the hang happens, a df will go through the ufs filesystems and stop
> before printing the first zfs one.  Then the system is unresponsive and
> must be rebooted.  Any ideas on this?
> I'm going to try redoing it with hardware raid5 next instead of raidz
> and see if that stops the hang.
>
> Ben
> I wonder how I can force a dump when it's running under kmdb.  Push the
> power button?
> Remember, I can't run any new command in that state, although the system
> is not completely dead.

If you install frkit's "acpidrv" and set the kernel variable to allow it
to use its power button handling (set acpidrv:acpidrv_nobutton = 0), you
should be able to hit the power button three times in quick succession and
have the system enter the debugger if it is running, or panic.

Casper
> Didn't mean to hijack this thread.  I was wondering if the OP's system
> was in a similar state as mine.  The OP described it as "the system is
> unresponsive and must be rebooted".

No problem, your question generated some good info.  It does sound like my
system is in a similar state, and I'll need to do the same to generate a
crash dump.

Ben
> I can almost assert that RAID-Z is not the issue.  Given your config, I
> would guess that 32-bit VA exhaustion in the kernel is your issue.  It's
> a known problem with ZFS on 32-bit machines, and we're working on it.
> In the meantime, you might consider lowering kernelbase to get some more
> breathing room.  Also, cranking up the frequency of the kmem reaper
> helps.  Try adding these two lines to /etc/system and then rebooting:
>
>         set kmem_reap_interval = 0x64
>         set kernelbase = 0xa0000000
>
> That should help to alleviate most of your problems until we can get it
> fixed for real.  If not, let us know and we can dig some more.

I'll try these out on Tuesday.

Thanks,
Ben
> I think it's unlikely that raid-z is the issue.  I don't think that
> switching from raid-z to either mirroring or hardware raid will improve
> your situation.
>
> Just to be clear, you have one filesystem *each* for over 1000 users,
> right?  So every hour you are taking 1000+ snapshots, one for each
> user's filesystem?

Yes.  The snapshot script for this takes around 10 minutes to complete.

> There is a known performance issue with taking many snapshots at once.
> It shouldn't be too difficult to fix, so hopefully I will get to it
> soon.  If possible, it would help to get a crash dump from the system
> while it is hung.  (You should be able to get a dump on x86 even if the
> kernel is wedged by running under kmdb.)

I'll try that if I get the problem again now that I've set the two values
in /etc/system.

Thanks,
Ben
> Yes.  The snapshot script for this takes around 10 minutes to complete.

Hm, this might be an interesting feature:

        - a snapshot which is only taken if any I/O was done since the
          last snapshot.

That way, you can have a "sparse" snapshot tree containing only
interesting snapshots.

Casper
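P.S. Until something like that exists in ZFS itself, a cron job could
approximate it.  A rough, untested sketch; the dataset name is just an
example, and the "did anything change" test is only a heuristic based on
the snapshot space properties:

#!/bin/sh
# Take an hourly snapshot, but keep it only if the filesystem appears to
# have changed since the most recent existing snapshot.
fs=pool1/user1                          # example dataset
new="$fs@auto-`date +%Y%m%d%H`"

# Most recent existing snapshot, by creation time (empty if none exist).
prev=`zfs list -H -o name -t snapshot -s creation -r $fs | tail -1`

zfs snapshot "$new"

if [ -n "$prev" ]; then
        # Heuristic: if the previous snapshot holds no unique blocks and
        # the filesystem still references exactly as much data as it did
        # then, assume nothing was written and drop the snapshot we just
        # took.
        used=`zfs get -Hp -o value used "$prev"`
        ref_prev=`zfs get -Hp -o value referenced "$prev"`
        ref_new=`zfs get -Hp -o value referenced "$new"`
        if [ "$used" = "0" -a "$ref_prev" = "$ref_new" ]; then
                zfs destroy "$new"
        fi
fi

It can be fooled by changes that leave the space accounting identical, but
it would avoid piling up completely idle snapshots.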
> No, this is definitely not "right".  I haven't had the chance to do much
> userland optimization, but it's on the list.  My guess is that there is
> some low-hanging fruit combined with some bugs (i.e. memory leaks from
> libzfs).  I'll try to reproduce this, but once the system is up, can you
> try running:
>
> # ptime zfs unmount -a
> # ptime zfs mount -a
> # ptime zfs share -a

I destroyed all the filesystems and recreated them.  The times with 1000+
empty filesystems and no snapshots:

# ptime zfs umount -a
real       25.537
user        3.865
sys        20.104
# ptime zfs mount -a
real     3:41.023
user        0.559
sys        49.410
# ptime zfs share -a
real     2:55.866
user        7.473
sys      2:40.937

I copied over some of the data and took a bunch of snapshots (25 per
filesystem).  It took about 30 minutes to reboot, and the times are:

# ptime zfs umount -a
real       17.574
user        0.191
sys        16.652
# ptime zfs mount -a
real    14:41.830
user        2.975
sys     13:07.226
# ptime zfs share -a
real    10:52.053
user        2.546
sys     10:48.945

I then destroyed most of the snapshots, leaving two per filesystem, and
the times are:

# ptime zfs umount -a
real       37.508
user        3.978
sys        21.004
# ptime zfs mount -a
real     5:35.256
user        0.698
sys      1:39.457
# ptime zfs share -a
real     3:36.673
user        7.677
sys      3:23.729

So it looks like having too many snapshots is what makes the reboot take a
long time...

Ben

> - Eric
>
> --
> Eric Schrock, Solaris Kernel Development
> http://blogs.sun.com/eschrock
> I can almost assert that RAID-Z is not the issue.  Given your config, I
> would guess that 32-bit VA exhaustion in the kernel is your issue.  It's
> a known problem with ZFS on 32-bit machines, and we're working on it.
> In the meantime, you might consider lowering kernelbase to get some more
> breathing room.  Also, cranking up the frequency of the kmem reaper
> helps.  Try adding these two lines to /etc/system and then rebooting:
>
>         set kmem_reap_interval = 0x64

I have this set now and will test to see if it helps.

>         set kernelbase = 0xa0000000

Having this one set causes a kernel panic...

Ben

> That should help to alleviate most of your problems until we can get it
> fixed for real.  If not, let us know and we can dig some more.
>
> --Bill
Thanks.  There are some known issues with snapshot scalability, some in
the kernel and some in userland.  I have a guess as to what's happening
here - I'll make sure to tackle this next.  Thanks for the report and
extra data.

- Eric

On Tue, Jan 17, 2006 at 08:09:25AM -0800, Ben Miller wrote:
> I destroyed all the filesystems and recreated them.  The times with
> 1000+ empty filesystems and no snapshots:
>
> [...]
>
> So it looks like having too many snapshots is what makes the reboot take
> a long time...
>
> Ben

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
> > helps.  Try adding these two lines to /etc/system and then rebooting:
> >
> >         set kmem_reap_interval = 0x64
>
> I have this set now and will test to see if it helps.

The system just hung up again while copying data to it.  I'll see if I can
set up for a crash dump and get it to duplicate again.

> >         set kernelbase = 0xa0000000
>
> Having this one set causes a kernel panic...

Any thoughts on this one?

thanks,
Ben

> > That should help to alleviate most of your problems until we can get
> > it fixed for real.  If not, let us know and we can dig some more.
> >
> > --Bill
On Tue, Jan 17, 2006 at 10:31:59AM -0800, Ben Miller wrote:
> >         set kernelbase = 0xa0000000
> >
> > Having this one set causes a kernel panic...
>
> Any thoughts on this one?

Try this (as root):

        echo 'kernelbase/X' | mdb -k

And see what it reports.  It could be that the number I gave you was too
high.  If you don't run many user apps on this box (mainly just file
service), you can try making it even lower, say, 0x80000000.

--Bill
Ben Miller wrote:
....
>>>         set kernelbase = 0xa0000000
>> Having this one set causes a kernel panic...
> Any thoughts on this one?

Standard questions: what is the output of

        ::status
        panicstr::print
        *panic_thread::findstack -v

and (not-so-standard questions):

        ::vmem
        ::kmastat

thanks,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
> Try this (as root):
>
>         echo 'kernelbase/X' | mdb -k

kernelbase:     c2800000

> And see what it reports.  It could be that the number I gave you was too
> high.  If you don't run many user apps on this box (mainly just file
> service), you can try making it even lower, say, 0x80000000.

This value also gives a kernel panic.

Ben

> --Bill
> Ben Miller wrote:
> ....
> >>>         set kernelbase = 0xa0000000
> >> Having this one set causes a kernel panic...
> > Any thoughts on this one?

Here's the panic console output with kernelbase = 0x80000000:

panic[cpu3]/thread=d6cd6de0:
assertion failed: hat == kas.a_hat || va <= kernelbase, file: ../../i86pc/vm/hat_i86.c, line: 1557

d6cb58c0 genunix:assfail+5c (fe8939d8, fe8939bc,)
d6cb58f8 unix:hat_memload+136 (cc945ee8, c27c7000,)
d6cb596c genunix:segvn_faultpage+463 (cc945ee8, d6cf6f00,)
d6cb5a64 genunix:segvn_fault+cf5 (cc945ee8, d6cf6f00,)
d6cb5af0 genunix:as_fault+4b8 (cc945ee8, d65fef80,)
d6cb5b88 genunix:execmap+165 (d6cedf40, c27c7000,)
d6cb5c2c elfexec:mapelfexec+317 (d6cedf40, d610f180,)
d6cb5d04 elfexec:elfexec+712 (d660f240, d6cb5ed4,)
d6cb5dec genunix:gexec+2e7 (d6cb5ef4, d6cb5ed4,)
d6cb5f00 genunix:exec_common+304 (8047ff5, 8047fec, 0)
d6cb5f58 genunix:exec_init+1be (fec4e46c, 1, 0)
d6cb5f8c genunix:icode+d3 (0, 0)
d6cb5f9c unix:thread_start+8 ()

> Standard questions: what is the output of
>
>         ::status

::status
debugging live kernel (32-bit) on (not set)
operating system: 5.11 opensol-20060102 (i86pc)
CPU-specific support: Intel Pentium 4 (pre-Prescott)
DTrace state: inactive
stopped on: debugger entry trap

>         panicstr::print

panicstr::print
0xfea42180 "assertion failed: %s, file: %s, line: %d"

>         *panic_thread::findstack -v

*panic_thread::findstack -v
stack pointer for thread d6cd6de0: d6cb5890
  d6cb58a8 panic+0x12(fea42180, fe8939d8, fe8939bc, 615)
  d6cb58c0 assfail+0x5c(fe8939d8, fe8939bc, 615)
  d6cb58f8 hat_memload+0x136(cc945ee8, c27c7000, fbd1d730, d, 400)
  d6cb596c segvn_faultpage+0x463(cc945ee8, d6cf6f00, c27c7000, 0, 0, 0)
  d6cb5a64 segvn_fault+0xcf5(cc945ee8, d6cf6f00, c27c7000, 24000, 0, 1)
  d6cb5af0 as_fault+0x4b8(cc945ee8, d65fef80, c27c7000, 23346, 0, 1)
  d6cb5b88 execmap+0x165(d6cedf40, c27c7000, 23346, 0, 0, d)
  d6cb5c2c elfexec`mapelfexec+0x317(d6cedf40, d610f180, 5, d6097030, d6cb5cd8, d6cb5cd8)
  d6cb5d04 elfexec`elfexec+0x712(d660f240, d6cb5ed4, d6cb5e68, 0, 0, d6cb5efc)
  d6cb5dec gexec+0x2e7(d6cb5ef4, d6cb5ed4, d6cb5e68, 0, 0, d6cb5efc)
  d6cb5f00 exec_common+0x304(8047ff5, 8047fec, 0)
  d6cb5f58 exec_init+0x1be(fec4e46c, 1, 0)
  d6cb5f8c icode+0xd3(0, 0)
  d6cb5f9c thread_start+8()

> and (not-so-standard questions):
>
>         ::vmem

::vmem
ADDR NAME INUSE TOTAL SUCCEED FAIL
fec70188 heap 203927552 1992286208 965 0
fec70818 vmem_metadata 1122304 1179648 148 0
fec70ea8 vmem_seg 1032192 1032192 126 0
fec71538 vmem_hash 20736 24576 16 0
fec71bc8 vmem_vmem 68880 82336 42 0
fec72258 heaptext 4993024 67108864 63 0
cc690690 module_text 5795688 5910528 102 0
fec728e8 static 0 0 0 0
fec72f78 static_alloc 0 0 0 0
fec73608 hat_memload 1114112 1114112 272 0
fec73c98 kstat 117248 122880 419 0
ca801000 kmem_metadata 4063232 4063232 992 0
ca801690 kmem_msb 3600384 3600384 881 0
ca802000 kmem_cache 337680 413696 245 0
ca802690 kmem_hash 47872 49152 219 0
ca803000 kmem_log 31316704 31318016 12 0
ca803690 kmem_firewall_va 0 0 3 0
ca804000 kmem_firewall 0 0 3 0
ca80d000 mod_sysfile 416 4096 15 0
ca804690 kmem_oversize 17092338 17203200 243 0
ca80d690 kmem_va 4718592 4718592 36 0
ca810000 kmem_default 4251648 4251648 671 0
ca81d690 kmem_io_2G 139264 139264 34 0
ca820000 kmem_io_16M 4096 4096 1 0
ca820690 id32 0 0 0 0
cc689000 bp_map 0 0 0 0
cc689690 ksyms 1028959 1077248 98 0
cc690000 ctf 1142190 1155072 108 0
cc69f000 module_data 1754627 1781760 151 0
cc69f690 segkp 2818048 2818048 83 0
cc6a6000 umem_np 0 0 0 0
ca810690 logminor_space 16 262137 16 0
ca81d000 taskq_id_arena 11 2147483647 11 0
cc6a6690 rctl_ids 27 32767 27 0
cce9d000 zoneid_space 0 9998 0 0
cce9d690 taskid_space 2 999999 2 0
ccea0000 pool_ids 0 999998 0 0
ccea0690 contracts 1 2147483646 1 0
d6070000 ip_minor_arena 64 262140 1 0
d6070690 dld_ctl 0 262143 0 0
d6079000 dld_minor_arena 0 262143 0 0

>         ::kmastat

::kmastat
cache                       buf    buf    buf    memory     alloc alloc
name                       size in use  total    in use   succeed  fail
------------------------- ------ ------ ------ --------- --------- -----
kmem_magazine_1 8 22 169 4096 22 0
kmem_magazine_3 16 52 127 4096 52 0
kmem_magazine_7 32 142 189 12288 142 0
kmem_magazine_15 64 178 186 24576 178 0
kmem_magazine_31 128 0 0 0 0 0
kmem_magazine_47 192 0 0 0 0 0
kmem_magazine_63 256 0 0 0 0 0
kmem_magazine_95 384 0 0 0 0 0
kmem_magazine_143 576 0 0 0 0 0
kmem_slab_cache 28 786 840 40960 827 0
kmem_bufctl_cache 12 0 0 0 0 0
kmem_bufctl_audit_cache 100 28298 28314 3514368 28434 0
kmem_va_4096 4096 495 512 2097152 495 0
kmem_va_8192 8192 35 48 393216 35 0
kmem_va_12288 12288 110 110 1441792 110 0
kmem_va_16384 16384 12 16 262144 12 0
kmem_va_20480 20480 19 24 524288 19 0
kmem_va_24576 24576 0 0 0 0 0
kmem_va_28672 28672 0 0 0 0 0
kmem_va_32768 32768 0 0 0 0 0
kmem_alloc_8 8 6626 6970 167936 16397 0
kmem_alloc_16 16 5506 5632 180224 8352 0
kmem_alloc_24 24 3368 3468 139264 6169 0
kmem_alloc_32 32 1296 1360 65536 2719 0
kmem_alloc_40 40 721 1022 57344 1962 0
kmem_alloc_48 48 92 128 8192 342 0
kmem_alloc_56 56 239 448 32768 2303 0
kmem_alloc_64 64 183 256 32768 638 0
kmem_alloc_80 80 473 504 49152 1042 0
kmem_alloc_96 96 987 1008 114688 1501 0
kmem_alloc_112 112 25 32 4096 167 0
kmem_alloc_128 128 48 63 12288 267 0
kmem_alloc_160 160 296 322 57344 935 0
kmem_alloc_192 192 123 128 32768 291 0
kmem_alloc_224 224 101 119 28672 256 0
kmem_alloc_256 256 9 24 8192 271 0
kmem_alloc_320 320 29 40 16384 1208 0
kmem_alloc_384 384 6 18 8192 146 0
kmem_alloc_448 448 5 24 12288 437 0
kmem_alloc_512 512 125 133 77824 329 0
kmem_alloc_640 640 18 33 24576 1152 0
kmem_alloc_768 768 3 9 8192 62 0
kmem_alloc_896 896 8 12 12288 26 0
kmem_alloc_1152 1152 22 30 36864 855 0
kmem_alloc_1344 1344 11 16 24576 375 0
kmem_alloc_1600 1600 14 21 36864 38 0
kmem_alloc_2048 2048 19 27 61440 46 0
kmem_alloc_2688 2688 29 35 102400 70 0
kmem_alloc_4096 4096 25 30 245760 110 0
kmem_alloc_8192 8192 95 99 1216512 508 0
kmem_alloc_12288 12288 8 11 180224 761 0
kmem_alloc_16384 16384 7 9 184320 29 0
streams_mblk 32 532 576 36864 692 0
streams_dblk_64 128 267 273 53248 411 0
streams_dblk_128 192 0 16 4096 5 0
streams_dblk_320 384 263 270 122880 274 0
streams_dblk_576 640 0 0 0 0 0
streams_dblk_1088 1152 0 0 0 0 0
streams_dblk_1536 1600 2 7 12288 2 0
streams_dblk_1984 2048 0 0 0 0 0
streams_dblk_2624 2688 0 0 0 0 0
streams_dblk_3968 4032 0 0 0 0 0
streams_dblk_8192 64 0 0 0 0 0
streams_dblk_12160 12224 0 0 0 0 0
streams_dblk_16384 64 0 0 0 0 0
streams_dblk_20352 20416 0 0 0 0 0
streams_dblk_24576 64 0 0 0 0 0
streams_dblk_28544 28608 0 0 0 0 0
streams_dblk_32768 64 0 0 0 0 0
streams_dblk_36736 36800 0 0 0 0 0
streams_dblk_40960 64 0 0 0 0 0
streams_dblk_44928 44992 0 0 0 0 0
streams_dblk_49152 64 0 0 0 0 0
streams_dblk_53120 53184 0 0 0 0 0
streams_dblk_57344 64 0 0 0 0 0
streams_dblk_61312 61376 0 0 0 0 0
streams_dblk_65536 64 0 0 0 0 0
streams_dblk_69504 69568 0 0 0 0 0
streams_dblk_73728 64 0 0 0 0 0
streams_dblk_esb 64 0 0 0 0 0
streams_fthdr 168 0 0 0 0 0
streams_ftblk 152 0 0 0 0 0
multidata 136 0 0 0 0 0
multidata_pdslab 3560 0 0 0 0 0
multidata_pattbl 20 0 0 0 0 0
taskq_ent_cache 28 1025 1105 53248 1025 0
taskq_cache 160 39 46 8192 39 0
kmem_io_2G_128 128 1 16 4096 1 0
kmem_io_2G_256 256 255 256 131072 255 0
kmem_io_2G_512 512 0 0 0 0 0
kmem_io_2G_1024 1024 0 0 0 0 0
kmem_io_2G_2048 2048 0 1 4096 2 0
kmem_io_2G_4096 4096 0 0 0 0 0
kmem_io_16M_128 128 1 16 4096 3 0
kmem_io_16M_256 256 0 0 0 0 0
kmem_io_16M_512 512 0 0 0 0 0
kmem_io_16M_1024 1024 0 0 0 0 0
kmem_io_16M_2048 2048 0 0 0 0 0
kmem_io_16M_4096 4096 0 0 0 0 0
id32_cache 32 0 0 0 0 0
bp_map_4096 4096 0 0 0 0 0
bp_map_8192 8192 0 0 0 0 0
bp_map_12288 12288 0 0 0 0 0
bp_map_16384 16384 0 0 0 0 0
bp_map_20480 20480 0 0 0 0 0
bp_map_24576 24576 0 0 0 0 0
bp_map_28672 28672 0 0 0 0 0
bp_map_32768 32768 0 0 0 0 0
mod_hash_entries 12 87 128 4096 89 0
ipp_mod 284 0 0 0 0 0
ipp_action 328 0 0 0 0 0
ipp_packet 40 0 0 0 0 0
htable_t 44 887 924 45056 889 0
hment_t 20 44044 44109 1069056 44044 0
hat_t 96 2 36 4096 3 0
HatHash 1024 2 11 12288 3 0
seg_cache 44 4 64 4096 5 0
snode_cache 88 23 39 4096 66 0
dv_node_cache 68 19 46 4096 19 0
dev_info_node_cache 288 155 156 49152 293 0
segkp_4096 4096 0 0 0 0 0
segkp_8192 8192 0 0 0 0 0
segkp_12288 12288 196 200 2621440 196 0
segkp_16384 16384 0 0 0 0 0
segkp_20480 20480 8 9 196608 128 0
umem_np_4096 4096 0 0 0 0 0
umem_np_8192 8192 0 0 0 0 0
umem_np_12288 12288 0 0 0 0 0
umem_np_16384 16384 0 0 0 0 0
umem_np_20480 20480 0 0 0 0 0
umem_np_24576 24576 0 0 0 0 0
umem_np_28672 28672 0 0 0 0 0
umem_np_32768 32768 0 0 0 0 0
thread_cache 500 3 15 8192 3 0
lwp_cache 1080 3 11 12288 3 0
turnstile_cache 36 195 365 20480 324 0
cred_cache 140 1 25 4096 1 0
rctl_cache 20 92 102 4096 114 0
rctl_val_cache 44 146 192 12288 191 0
task_cache 72 2 46 4096 2 0
rootnex_dmahdl 1268 260 270 368640 748 0
cyclic_id_cache 32 4 85 4096 4 0
dnlc_space_cache 16 0 0 0 0 0
vn_cache 148 169 189 36864 212 0
file_cache 36 3 73 4096 4 0
stream_head_cache 232 6 16 4096 18 0
queue_cache 356 13 20 8192 42 0
syncq_cache 100 5 34 4096 15 0
qband_cache 32 0 0 0 0 0
linkinfo_cache 24 3 102 4096 4 0
ciputctrl_cache 1024 0 0 0 0 0
serializer_cache 40 0 0 0 0 0
as_cache 112 1 32 4096 2 0
marker_cache 80 0 0 0 0 0
anon_cache 24 8 85 4096 9 0
anonmap_cache 28 2 85 4096 3 0
segvn_cache 64 4 51 4096 5 0
flk_edges 24 0 0 0 0 0
fdb_cache 64 0 0 0 0 0
timer_cache 84 0 0 0 0 0
physio_buf_cache 136 0 0 0 0 0
ufs_inode_cache 276 119 130 40960 119 0
directio_buf_cache 148 0 0 0 0 0
lufs_save 12 0 128 4096 1 0
lufs_bufs 140 0 25 4096 1 0
lufs_mapentry_cache 64 0 0 0 0 0
kcf_sreq_cache 36 0 0 0 0 0
kcf_areq_cache 168 0 0 0 0 0
kcf_context_cache 52 0 0 0 0 0
ipsec_actions 60 0 0 0 0 0
ipsec_selectors 64 0 0 0 0 0
ipsec_policy 48 0 0 0 0 0
ipsec_info 196 0 0 0 0 0
ip_minor_arena_1 1 2 64 64 3 0
ipcl_conn_cache 956 1 4 4096 2 0
ipcl_tcpconn_cache 1748 1 9 16384 1 0
ire_cache 508 0 0 0 0 0
tcp_timercache 44 1 64 4096 1 0
tcp_sack_info_cache 72 0 0 0 0 0
tcp_iphc_cache 120 1 30 4096 1 0
squeue_cache 180 4 16 4096 4 0
sctp_conn_cache 2100 1 9 20480 1 0
sctp_faddr_cache 136 0 0 0 0 0
sctp_set_cache 16 0 0 0 0 0
sctp_ftsn_set_cache 8 0 0 0 0 0
sctpsock 428 0 0 0 0 0
sctp_assoc 44 0 0 0 0 0
socktpi_cache 304 0 0 0 0 0
socktpi_unix_cache 304 0 0 0 0 0
ncafs_cache 328 0 0 0 0 0
mac_impl_cache 656 0 0 0 0 0
dls_cache 92 0 0 0 0 0
dls_vlan_cache 32 0 0 0 0 0
dls_link_cache 584 0 0 0 0 0
dld_ctl_1 1 0 0 0 0 0
dld_str_cache 168 0 0 0 0 0
udp_cache 288 0 0 0 0 0
process_cache 2144 3 9 20480 3 0
exacct_object_cache 28 0 0 0 0 0
fctl_cache 68 0 0 0 0 0
kssl_cache 1320 0 0 0 0 0
------------------------- ------ ------ ------ --------- --------- -----
Total [hat_memload] 1114112 44933 0
Total [kmem_msb] 3600384 29655 0
Total [kmem_va] 4718592 671 0
Total [kmem_default] 4251648 54528 0
Total [kmem_io_2G] 139264 258 0
Total [kmem_io_16M] 4096 3 0
Total [segkp] 2818048 324 0
Total [ip_minor_arena] 64 3 0
------------------------- ------ ------ ------ --------- --------- -----

vmem                         memory     memory    memory     alloc alloc
name                         in use      total    import   succeed  fail
------------------------- --------- ---------- --------- --------- -----
heap 203927552 1992286208 0 965 0
vmem_metadata 1122304 1179648 1179648 148 0
vmem_seg 1032192 1032192 1032192 126 0
vmem_hash 20736 24576 24576 16 0
vmem_vmem 68880 82336 65536 42 0
heaptext 4993024 67108864 0 63 0
module_text 5795688 5910528 4993024 102 0
static 0 0 0 0 0
static_alloc 0 0 0 0 0
hat_memload 1114112 1114112 1114112 272 0
kstat 117248 122880 57344 419 0
kmem_metadata 4063232 4063232 4063232 992 0
kmem_msb 3600384 3600384 3600384 881 0
kmem_cache 337680 413696 413696 245 0
kmem_hash 47872 49152 49152 219 0
kmem_log 31316704 31318016 31318016 12 0
kmem_firewall_va 0 0 0 3 0
kmem_firewall 0 0 0 3 0
mod_sysfile 416 4096 4096 15 0
kmem_oversize 17092338 17203200 17203200 243 0
kmem_va 4718592 4718592 4718592 36 0
kmem_default 4251648 4251648 4251648 671 0
kmem_io_2G 139264 139264 139264 34 0
kmem_io_16M 4096 4096 4096 1 0
id32 0 0 0 0 0
bp_map 0 0 0 0 0
ksyms 1028959 1077248 1077248 98 0
ctf 1142190 1155072 1155072 108 0
module_data 1754627 1781760 1474560 151 0
segkp 2818048 2818048 2818048 83 0
umem_np 0 0 0 0 0
logminor_space 16 262137 0 16 0
taskq_id_arena 11 2147483647 0 11 0
rctl_ids 27 32767 0 27 0
zoneid_space 0 9998 0 0 0
taskid_space 2 999999 0 2 0
pool_ids 0 999998 0 0 0
contracts 1 2147483646 0 1 0
ip_minor_arena 64 262140 0 1 0
dld_ctl 0 262143 0 0 0
dld_minor_arena 0 262143 0 0 0
------------------------- --------- ---------- --------- --------- -----

thanks,
Ben

> thanks,
> James C. McPherson
> --
> Solaris Datapath Engineering
> Data Management Group
> Sun Microsystems
> There is a known performance issue with taking many snapshots at once.
> It shouldn't be too difficult to fix, so hopefully I will get to it
> soon.  If possible, it would help to get a crash dump from the system
> while it is hung.  (You should be able to get a dump on x86 even if the
> kernel is wedged by running under kmdb.)

I have a crash dump from when the system hung now.  What should I do with
it?

thanks,
Ben

> thanks,
> --matt
>>>>>         set kernelbase = 0xa0000000

Just a silly question, but isn't kernelbase supposed to be set in
bootenv.rc?

So could you remove that /etc/system setting and try:

        eeprom kernelbase=0xa0000000

Casper
> >>>>>         set kernelbase = 0xa0000000
>
> Just a silly question, but isn't kernelbase supposed to be set in
> bootenv.rc?
>
> So could you remove that /etc/system setting and try:
>
>         eeprom kernelbase=0xa0000000
>
> Casper

Casper rules!  It booted up fine that way.

# echo 'kernelbase/X' | mdb -k
kernelbase:
kernelbase:     a0000000

I'll restart my data copy test now and see if I get the problem again.

thanks!
Ben
> Casper rules!  It booted up fine that way.
>
> # echo 'kernelbase/X' | mdb -k
> kernelbase:
> kernelbase:     a0000000
>
> I'll restart my data copy test now and see if I get the problem again.

Try with lower values too....

Casper
> > Casper rules!  It booted up fine that way.
> >
> > # echo 'kernelbase/X' | mdb -k
> > kernelbase:
> > kernelbase:     a0000000
> >
> > I'll restart my data copy test now and see if I get the problem again.
>
> Try with lower values too....

The system has stayed up during the data copy and through the night.  I'll
leave it running with this value through the weekend and try out some
lower values next week.

thanks,
Ben

> Casper
> > > Casper rules!  It booted up fine that way.
> > >
> > > # echo 'kernelbase/X' | mdb -k
> > > kernelbase:
> > > kernelbase:     a0000000
> > >
> > > I'll restart my data copy test now and see if I get the problem
> > > again.
> >
> > Try with lower values too....
>
> The system has stayed up during the data copy and through the night.
> I'll leave it running with this value through the weekend and try out
> some lower values next week.

The system has been working perfectly with kernelbase=0xa0000000.  I tried
kernelbase=0x80000000 today and it also seems to work fine.  The reboot
time for 8 snapshots on each of the 1000+ filesystems is around 20
minutes, which should be acceptable for now.

Ben