Kyle McDonald
2010-Jul-09 22:08 UTC
[zfs-discuss] ZFS, IPS (IBM ServeRAID) driver, and a kernel panic...
Hi,

I have been trying out the latest NexentaCore and NexentaStor Community ed. builds (they have the driver I need built in) on the hardware I have with this controller.

The only difference between the 2 machines is that the 'Core' machine has 16GB of RAM and the 'Stor' one has 12GB.

On both machines I did the following:

1) Created a zpool consisting of a single RaidZ from 5 300GB U320 10K drives.
2) Created 4 filesystems in the pool.
3) On the 4 filesystems, set the dedup and compression properties to cover all the combinations (off/off, off/on, on/off, and on/on).

On the 'Stor' machine I elected to disable the ZIL and cache flushes through the web GUI. I didn't do this on the 'Core' machine.

On the 'Core' machine I mounted the 4 filesystems from the 'Stor' machine via NFSv4.

Now for a bit of history.

I tried out the 'Stor' machine in this exact config (but with the ZIL and cache flushes on) about a month ago with version 3.0.2. At that time I used a Linux NFS client to time untarring the GCC sources to each of the 4 filesystems. This test repeatedly failed on the first filesystem by bringing the machine to its knees, to the point that I had to power cycle it.

This time around I decided to use the 'Core' machine as the client so I could also time the same test against its local ZFS filesystems.

At first I got my hopes up, because the test ran to completion (and rather quickly) locally on the 'Core' machine. I then added running it over NFS to the 'Stor' machine to the testing. In the beginning I was untarring it once on each filesystem, and even over NFS this worked (though slower than I'd hoped for, given that the ZIL and cache flushes were disabled).

So I thought I'd push the dedup a little harder, and I expanded the test to untar the sources 4 times per filesystem. This ran fine until the 4th NFS filesystem, where the 'Stor' machine panicked. The client waited while it rebooted, and then resumed the test, causing it to panic a second time.
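For reference, the pool and filesystem layout described in steps 1-3 above can be sketched roughly as follows. The device and dataset names here are hypothetical (the actual controller/target names weren't given); the real pool used five 300GB U320 10K drives:

```shell
# Hypothetical device names -- substitute your own from 'format' or 'cfgadm'.
zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0

# Four filesystems covering every dedup/compression combination.
zfs create -o dedup=off -o compression=off tank/fs-off-off
zfs create -o dedup=off -o compression=on  tank/fs-off-on
zfs create -o dedup=on  -o compression=off tank/fs-on-off
zfs create -o dedup=on  -o compression=on  tank/fs-on-on
```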
For some reason it hung so badly the second time that it didn't even reboot - I'll have to power cycle it Monday when I get to work.

The 2 stack traces are identical:

> panic[cpu3]/thread=ffffff001782fc60: BAD TRAP: type=e (#pf Page fault) rp=ffffff001782f9c0 addr=18 occurred in module "unix" due to a NULL pointer dereference
>
> sched: #pf Page fault
> Bad kernel fault at addr=0x18
> pid=0, pc=0xfffffffffb863374, sp=0xffffff001782fab8, eflags=0x10286
> cr0: 8005003b<pg,wp,ne,et,ts,mp,pe>  cr4: 6f8<xmme,fxsr,pge,mce,pae,pse,de>
> cr2: 18  cr3: 5000000  cr8: c
>
> rdi: ffffff03dc84fcfc  rsi: ffffff03e1d03d98  rdx:                2
> rcx:                2   r8:                0   r9: ffffff0017a51c60
> rax: ffffff001782fc60  rbx:                2  rbp: ffffff001782fb10
> r10:       e10377c748  r11: ffffff0000000000  r12: ffffff03dc84fcfc
> r13: ffffff0000000000  r14: ffffff0000000000  r15:               10
> fsb:                0  gsb: ffffff03e1d03ac0   ds:               4b
>  es:               4b   fs:                0   gs:              1c3
> trp:                e  err:                0  rip: fffffffffb863374
>  cs:               30  rfl:            10286  rsp: ffffff001782fab8
>  ss:               38
>
> ffffff001782f8a0 unix:die+dd ()
> ffffff001782f9b0 unix:trap+177b ()
> ffffff001782f9c0 unix:cmntrap+e6 ()
> ffffff001782fb10 unix:mutex_owner_running+14 ()
> ffffff001782fb40 ips:ips_remove_busy_command+27 ()
> ffffff001782fb80 ips:ips_finish_io_request+a8 ()
> ffffff001782fbb0 ips:ips_intr+7b ()
> ffffff001782fc00 unix:av_dispatch_autovect+7c ()
> ffffff001782fc40 unix:dispatch_hardint+33 ()
> ffffff0018517580 unix:switch_sp_and_call+13 ()
> ffffff00185175d0 unix:do_interrupt+b8 ()
> ffffff00185175e0 unix:_interrupt+b8 ()
> ffffff00185176e0 genunix:kmem_free+34 ()
> ffffff0018517710 zfs:zio_pop_transforms+86 ()
> ffffff0018517780 zfs:zio_done+152 ()
> ffffff00185177b0 zfs:zio_execute+8d ()
> ffffff0018517810 zfs:zio_notify_parent+a6 ()
> ffffff0018517880 zfs:zio_done+3e2 ()
> ffffff00185178b0 zfs:zio_execute+8d ()
> ffffff0018517910 zfs:zio_notify_parent+a6 ()
> ffffff0018517980 zfs:zio_done+3e2 ()
> ffffff00185179b0 zfs:zio_execute+8d ()
> ffffff0018517a10 zfs:zio_notify_parent+a6 ()
> ffffff0018517a80 zfs:zio_done+3e2 ()
> ffffff0018517ab0 zfs:zio_execute+8d ()
> ffffff0018517b50 genunix:taskq_thread+248 ()
> ffffff0018517b60 unix:thread_start+8 ()
>
> syncing file systems... done
> dumping to /dev/zvol/dsk/syspool/dump, offset 65536, content: kernel + curproc
> 0% done: 0 pages dumped, dump failed: error 5
> rebooting...

As I read this, it's probably a bug in the IPS driver. But I really don't know anything about kernel panics.

This seems 100% reproducible, so I'm happy to run more tests in KDB if it will help. As I've mentioned before, I'd be happy to try to work on the code myself if it were available.

Anyone have any ideas?

-Kyle

On 7/7/2010 3:12 PM, Kyle McDonald wrote:
> On 6/24/2010 6:31 PM, James C. McPherson wrote:
>
>> hi Kyle,
>> the serveraid driver was only ever a community effort; the
>> fact that it was done by a Sun engineer is actually irrelevant :-)
>
> Hi James,
>
>> That engineer has since left Sun, and I've obtained the source
>> for it from him. I just don't have any time to work on it at
>> the moment.
>
> One of the reasons that I asked originally was because of these
> messages I see during boot:
>
> WARNING: mutex_init: ffffff04e4724d06 is not 8 byte aligned; caller
> ips_attach+22b in module ips. This is unsupported and may cause a panic.
> Please report this to the kernel module supplier.
> WARNING: mutex_init: ffffff04e4724d0e is not 8 byte aligned; caller
> ips_attach+264 in module ips. This is unsupported and may cause a panic.
> Please report this to the kernel module supplier.
> WARNING: mutex_init: ffffff04e4724cec is not 8 byte aligned; caller
> ips_attach+27e in module ips. This is unsupported and may cause a panic.
> Please report this to the kernel module supplier.
> WARNING: mutex_init: ffffff04e4724e12 is not 8 byte aligned; caller
> ips_attach+29e in module ips. This is unsupported and may cause a panic.
> Please report this to the kernel module supplier.
> WARNING: mutex_init: ffffff04e4724cfc is not 8 byte aligned; caller
> ips_attach+2b8 in module ips. This is unsupported and may cause a panic.
> Please report this to the kernel module supplier.
>
> I'm sure it's likely that this isn't as straightforward as I'm hoping,
> but I think fixing these messages might be something I can actually do
> to help out. :)
>
> At the same time, I've been playing with NexentaStor on another machine
> with the same hardware, and when I write to the ZFS filesystem over NFS
> the machine quickly becomes unresponsive and eventually needs to be
> rebooted.
>
> I was watching some of the DTrace graphs the Web interface can show, and
> it seems that there may be a memory leak. Originally, when I asked about
> these problems on the ZFS list, someone from Nexenta suggested that it
> very likely might be the 'ips' driver that's causing the problems.
>
> So I'd like to investigate that too if I could, but it's nearly
> impossible if you can't release the code.
>
>> Apart from lack of time, I don't have any hardware to test
>> it on, so I cannot work on getting it integrated into ON - all
>> drivers require testability within our test suite framework.
>
> I understand it can't be integrated into ON, which was why I was hoping
> to get it into the 'contrib' repository (outside of ON). I'm also
> currently trying out the NexentaCore distribution, but what I'd like to
> do is use the Distribution Constructor to pull 'ips' from the
> contrib repository and make my own install image.
>
> I'm pretty sure I can't (easily) get DC to put the SVR4 packages from
> the website into the image - so getting this into some PKG repository
> would also be a big help.
>
>> I will investigate the license on the code and see whether it
>> is possible to make it available.
>
> I can relate to being busy, that's for sure. So I greatly appreciate any
> time you can find to put into this. Thanks so much!
>
> -Kyle
>
>> James C. McPherson
>> --
>> Senior Software Engineer, Solaris
>> Oracle
>> http://www.jmcp.homeunix.com/blog
Garrett D''Amore
2010-Jul-09 22:24 UTC
[zfs-discuss] ZFS, IPS (IBM ServeRAID) driver, and a kernel panic...
First off, you need to test 3.0.3 if you're using dedup. Earlier versions had an unduly large number of issues when used with dedup. Hopefully with 3.0.3 we've got the bulk of the problems resolved. ;-)

Secondly, from your stack backtrace, yes, it appears ips is implicated. If I had source for ips, I might be better able to help you out.

- Garrett

On Fri, 2010-07-09 at 18:08 -0400, Kyle McDonald wrote:
> [Kyle's message quoted in full above - trimmed]

_______________________________________________
zfs-discuss mailing list
zfs-discuss at opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
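[Editorial note: as a next step for the KDB/mdb offer earlier in the thread, the saved crash dump can be inspected with mdb - when a dump is actually captured; note the "dump failed: error 5" (EIO) above, possibly because the dump device sits behind the same controller. A minimal session sketch, assuming the default savecore location:]

```shell
# Hypothetical paths; the actual location depends on dumpadm/savecore config.
cd /var/crash/`hostname`
mdb unix.0 vmcore.0
# Inside mdb, useful dcmds include:
#   ::status      - panic string and dump summary
#   ::msgbuf      - console messages leading up to the panic
#   ::stack       - stack of the panicking thread
#   ::panicinfo   - register state at the time of the panic
```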