thr3ads.net - zfs discuss - [zfs-discuss] Bizzare problem with ZFS filesystem [Sep 2006]

If this information is useful, please help other people find it:
Share via:

Anantha N. Srirama

2006-Sep-12 13:21 UTC

[zfs-discuss] Bizzare problem with ZFS filesystem

I''m experiencing a bizzare write performance problem while using a ZFS
filesystem. Here are the relevant facts:

[b]# zpool list[/b]
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
mtdc                   3.27T    502G   2.78T    14%  ONLINE     -
zfspool                68.5G   30.8G   37.7G    44%  ONLINE     -

[b]# zfs list[/b]
NAME                   USED  AVAIL  REFER  MOUNTPOINT
mtdc                   503G  2.73T  24.5K  /mtdc
mtdc/sasmeta           397M   627M   397M  /sasmeta
mtdc/u001             30.5G   226G  30.5G  /u001
mtdc/u002             29.5G   227G  29.5G  /u002
mtdc/u003             29.5G   226G  29.5G  /u003
mtdc/u004             28.4G   228G  28.4G  /u004
mtdc/u005             28.3G   228G  28.3G  /u005
mtdc/u006             29.8G   226G  29.8G  /u006
mtdc/u007             30.1G   226G  30.1G  /u007
mtdc/u008             30.6G   225G  30.6G  /u008
mtdc/u099              266G   502G   266G  /u099
zfspool               30.8G  36.6G  24.5K  /zfspool
zfspool/apps          30.8G  33.2G  28.5G  /apps
zfspool/apps at presqldriver  2.28G      -  29.8G  -
zfspool/home          15.4M  2.98G  15.4M  /home

[b]# zfs list mtdc/u099[/b]
NAME             PROPERTY       VALUE                      SOURCE
mtdc/u099        type           filesystem                 -
mtdc/u099        creation       Thu Aug 17 10:21 2006      -
mtdc/u099        used           267G                       -
mtdc/u099        available      501G                       -
mtdc/u099        referenced     267G                       -
mtdc/u099        compressratio  3.10x                      -
mtdc/u099        mounted        yes                        -
mtdc/u099        quota          768G                       local
mtdc/u099        reservation    none                       default
mtdc/u099        recordsize     128K                       default
mtdc/u099        mountpoint     /u099                      local
mtdc/u099        sharenfs       off                        default
mtdc/u099        checksum       on                         default
mtdc/u099        compression    on                         local
mtdc/u099        atime          off                        local
mtdc/u099        devices        on                         default
mtdc/u099        exec           on                         default
mtdc/u099        setuid         on                         default
mtdc/u099        readonly       off                        default
mtdc/u099        zoned          off                        default
mtdc/u099        snapdir        hidden                     default
mtdc/u099        aclmode        groupmask                  default
mtdc/u099        aclinherit     secure                     default

[b]No error messages listed by zpool or /var/opt/messages.[/b] When I try to
save a file the operation takes an inordinate amount of time, in the 30+ second
range!!! I truss''d the vi session to see the hangup and it waits at the
write system call.

# truss -p <pid>
read(0, 0xFFBFD0AF, 1)          (sleeping...)
read(0, " w", 1)                                = 1
write(1, " w", 1)                               = 1
read(0, " q", 1)                                = 1
write(1, " q", 1)                               = 1
read(0, 0xFFBFD00F, 1)          (sleeping...)
read(0, "\r", 1)                                = 1
ioctl(0, I_STR, 0x000579F8)                     Err#22 EINVAL
write(1, "\r", 1)                               = 1
write(1, " " d e l e t e m e "", 10)            = 10
stat64("deleteme", 0xFFBFCFA0)                  = 0
creat("deleteme", 0666)                         = 4
ioctl(2, TCSETSW, 0x00060C10)                   = 0
[b]write(4, " l f f j d\n", 6)                     = 6[/b] <----
still waiting while I type this message!!

This problem manifests itself only on this filesystem and not on the other ZFS
filesystems on the same server built from the same ZFS pool. While I was
awaiting completion of the above write I was able to start a new vi session in
another window and saved a file to the /u001 filesystem without any problem.
System loads are very low. Can anybody comment on this bizzare behavior?
 
 
This message posted from opensolaris.org

Matthew Ahrens

2006-Sep-12 18:05 UTC

head link

[zfs-discuss] Bizzare problem with ZFS filesystem

Anantha N. Srirama wrote:> I''m experiencing a bizzare write performance problem while using a
> ZFS filesystem. Here are the relevant facts:
> 
> [b]No error messages listed by zpool or /var/opt/messages.[/b] When I
> try to save a file the operation takes an inordinate amount of time,
> in the 30+ second range!!! I truss''d the vi session to see the
hangup
> and it waits at the write system call.
> 
> This problem manifests itself only on this filesystem and not on the
> other ZFS filesystems on the same server built from the same ZFS
> pool. While I was awaiting completion of the above write I was able
> to start a new vi session in another window and saved a file to the
> /u001 filesystem without any problem. System loads are very low. Can
> anybody comment on this bizzare behavior?
Wow, you''re right, that is really bizarre!  If you''re familiar
with mdb,
could you get a kernel stack trace of that process?  Or if not, take a 
crash dump while it is hung and send it to us?  Are there any 
interesting messages printed to the console (eg. SCSI reset timeouts)? 
If you do an administrative command (eg. ''zfs set'') while the
write is
"hung", does it take a long time too?  Can you send us the output of 
''zpool status'', so we can see if there are any I/O errors?

--matt

Anantha N. Srirama

2006-Sep-13 01:08 UTC

head link

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

Here''s the information you requested.

Script started on Tue Sep 12 16:46:46 2006
# uname -a
SunOS umt1a-bio-srv2 5.10 Generic_118833-18 sun4u sparc SUNW,Netra-T12
# prtdiag
System Configuration: Sun Microsystems  sun4u Sun Fire E2900
System clock frequency: 150 MHZ
Memory size: 96GB       

======================================= CPUs
======================================               E$          CPU            
CPU
CPU      Freq      Size        Implementation       Mask    Status      Location
-------  --------  ----------  -------------------  -----   ------      --------
  0,512  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB0/P0
  1,513  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB0/P1
  2,514  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB0/P2
  3,515  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB0/P3
  8,520  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB2/P0
  9,521  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB2/P1
 10,522  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB2/P2
 11,523  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB2/P3
 16,528  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB4/P0
 17,529  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB4/P1
 18,530  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB4/P2
 19,531  1500 MHz  32MB        SUNW,UltraSPARC-IV+   2.1    on-line     SB4/P3

# md    mdb -k
(B)0Loading modules: [ unix krtld genunix dtrace specfs ufs sd sgsbbc md sgenv
ip sctp usba fcp fctl qlc nca ssd lofs zfs random crypto ptm nfs ipc logindmux
cpc sppp fcip wrsmd ]> arc::stat    print{
    anon = ARC_anon
    mru = ARC_mru
    mru_ghost = ARC_mru_ghost
    mfu = ARC_mfu
    mfu_ghost = ARC_mfu_ghost
    size = 0x11917e1200
    p = 0x116e8a1a40
    c = 0x11917cf428
    c_min = 0xbf77c800
    c_max = 0x17aef90000
    hits = 0x489737a8
    misses = 0x8869917
    deleted = 0xc832650
    skipped = 0x15b29b2
    hash_elements = 0x1273d0
    hash_elements_max = 0x17576f
    hash_collisions = 0x4e0ceee
    hash_chains = 0x3a9b2
Segmentation Fault - core dumped
# mdb -k
(B)0Loading modules: [ unix krtld genunix dtrace specfs ufs sd sgsbbc md sgenv
ip sctp usba fcp fctl qlc nca ssd lofs zfs random crypto ptm nfs ipc logindmux
cpc sppp fcip wrsmd ]> ::kmastat
> ::pgrep vi | ::walk thread
3086600f660> :[K> 3086600f660::findstackstack pointer for thread 3086600f660: 2a104598d91
[ 000002a104598d91 cv_wait_sig+0x114() ]
  000002a104598e41 str_cv_wait+0x28()
  000002a104598f01 strwaitq+0x238()
  000002a104598fc1 strread+0x174()
  000002a1045990a1 fop_read+0x20()
  000002a104599161 read+0x274()
  000002a1045992e1 syscall_trap32+0xcc()> 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61> 3086600f660::findstackstack pointer for thread 3086600f660: 2a104598e61
  000002a104598f61 zil_lwb_commit+0x1ac()
  000002a104599011 zil_commit+0x1b0()
  000002a1045990c1 zfs_fsync+0xa8()
  000002a104599171 fop_fsync+0x14()
  000002a104599231 fdsync+0x20()
  000002a1045992e1 syscall_trap32+0xcc()> 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598c71> 3086600f660::findstackstack pointer for thread 3086600f660: 2a104598e61
  000002a104598f61 zil_lwb_commit+0x1ac()
  000002a104599011 zil_commit+0x1b0()
  000002a1045990c1 zfs_fsync+0xa8()
  000002a104599171 fop_fsync+0x14()
  000002a104599231 fdsync+0x20()
  000002a1045992e1 syscall_trap32+0xcc()> 3086600f660::findstackstack pointer for thread 3086600f660: 2a104598e61
  000002a104598f61 zil_lwb_commit+0x1ac()
  000002a104599011 zil_commit+0x1b0()
  000002a1045990c1 zfs_fsync+0xa8()
  000002a104599171 fop_fsync+0x14()
  000002a104599231 fdsync+0x20()
  000002a1045992e1 syscall_trap32+0xcc()> 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598e61> 3086600f660::findstack
stack pointer for thread 3086600f660: 2a104598bb1> 3086600f660::findstackstack pointer for thread 3086600f660: 2a104598e61
  000002a104598f61 zil_lwb_commit+0x1ac()
  000002a104599011 zil_commit+0x1b0()
  000002a1045990c1 zfs_fsync+0xa8()
  000002a104599171 fop_fsync+0x14()
  000002a104599231 fdsync+0x20()
  000002a1045992e1 syscall_trap32+0xcc()> 3086600f660::findstackstack pointer for thread 3086600f660 (TS_FREE): 2a104598ba1
  000002a104598fe1 segvn_unmap+0x1b8()
  000002a1045990d1 as_free+0xf4()
  000002a104599181 proc_exit+0x46c()
  000002a104599231 exit+8()
  000002a1045992e1 syscall_trap32+0xcc()
# df -h
Filesystem             size   used  avail capacity  Mounted on
/dev/md/dsk/d10         32G   6.7G    25G    22%    /
/devices                 0K     0K     0K     0%    /devices
ctfs                     0K     0K     0K     0%    /system/contract
proc                     0K     0K     0K     0%    /proc
mnttab                   0K     0K     0K     0%    /etc/mnttab
swap                    13G  1000K    13G     1%    /etc/svc/volatile
objfs                    0K     0K     0K     0%    /system/object
/platform/sun4u-us3/lib/libc_psr/libc_psr_hwcap2.so.1
                        32G   6.7G    25G    22%   
/platform/sun4u-us3/lib/libc_psr.so.1
/platform/sun4u-us3/lib/sparcv9/libc_psr/libc_psr_hwcap2.so.1
                        32G   6.7G    25G    22%   
/platform/sun4u-us3/lib/sparcv9/libc_psr.so.1
fd                       0K     0K     0K     0%    /dev/fd
/dev/md/dsk/d30         20G   956M    19G     5%    /var
swap                   512M   952K   511M     1%    /tmp
swap                    13G    40K    13G     1%    /var/run
zfspool/apps            64G    29G    33G    47%    /apps
zfspool/home           3.0G    15M   3.0G     1%    /home
zfspool                 67G    24K    37G     1%    /zfspool
mtdc                   3.2T    24K   2.8T     1%    /mtdc
mtdc/u001              256G    30G   226G    12%    /u001
mtdc/u002              256G    29G   227G    12%    /u002
mtdc/u003              256G    29G   227G    12%    /u003
mtdc/u004              256G    28G   228G    12%    /u004
mtdc/u005              256G    28G   228G    12%    /u005
mtdc/u006              256G    30G   226G    12%    /u006
mtdc/u007              256G    30G   226G    12%    /u007
mtdc/u008              256G    31G   225G    12%    /u008
mtdc/u099              768G   196G   572G    26%    /u099
mtdc/sasmeta           1.0G   398M   626M    39%    /sasmeta
# zfs list
NAME                   USED  AVAIL  REFER  MOUNTPOINT
mtdc                   434G  2.80T  24.5K  /mtdc
mtdc/sasmeta           398M   626M   398M  /sasmeta
mtdc/u001             30.5G   226G  30.5G  /u001
mtdc/u002             29.5G   227G  29.5G  /u002
mtdc/u003             29.5G   227G  29.5G  /u003
mtdc/u004             28.3G   228G  28.3G  /u004
mtdc/u005             28.4G   228G  28.4G  /u005
mtdc/u006             29.9G   226G  29.9G  /u006
mtdc/u007             30.2G   226G  30.2G  /u007
mtdc/u008             30.6G   225G  30.6G  /u008
mtdc/u099              196G   572G   196G  /u099
zfspool               30.8G  36.6G  24.5K  /zfspool
zfspool/apps          30.8G  33.2G  28.5G  /apps
zfspool/apps at presqldriver  2.28G      -  29.8G  -
zfspool/home          15.4M  2.98G  15.4M  /home
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
mtdc                   3.27T    434G   2.85T    12%  ONLINE     -
zfspool                68.5G   30.8G   37.7G    44%  ONLINE     -
# tail -10 /var/adm/messages
Sep 12 16:01:21 umt1a-bio-srv2 su: [ID 366847 auth.info] ''su
root'' succeeded for erw4 on /dev/pts/11
Sep 12 16:02:26 umt1a-bio-srv2 su: [ID 366847 auth.info] ''su
adm'' succeeded for erw4 on /dev/pts/11
Sep 12 16:19:33 umt1a-bio-srv2 sshd[3210]: [ID 800047 auth.info] Connection
closed by 158.111.74.113
Sep 12 16:36:38 umt1a-bio-srv2 genunix: [ID 603404 kern.notice] NOTICE:
core_log: mdb[3641] core dumped:
/var/core/core_umt1a-bio-srv2_mdb_0_0_1158093395_3641
Sep 12 16:36:55 umt1a-bio-srv2 login: [ID 644210 auth.notice] ROOT LOGIN
/dev/console
Sep 12 16:38:59 umt1a-bio-srv2 sshd[3710]: [ID 800047 auth.info] Accepted
keyboard-interactive for cix4 from 158.111.30.168 port 1943 ssh2
Sep 12 16:38:59 umt1a-bio-srv2 sshd[3714]: [ID 800047 auth.info] subsystem
request for sftp
Sep 12 16:41:49 umt1a-bio-srv2 genunix: [ID 603404 kern.notice] NOTICE:
core_log: mdb[3770] core dumped:
/var/core/core_umt1a-bio-srv2_mdb_0_0_1158093706_3770
Sep 12 16:42:10 umt1a-bio-srv2 genunix: [ID 603404 kern.notice] NOTICE:
core_log: mdb[3778] core dumped:
/var/core/core_umt1a-bio-srv2_mdb_0_0_1158093727_3778
Sep 12 16:47:28 umt1a-bio-srv2 genunix: [ID 603404 kern.notice] NOTICE:
core_log: mdb[3964] core dumped:
/var/core/core_umt1a-bio-srv2_mdb_0_0_1158094046_3964
# rm /var/c         ^D
 
 
This message posted from opensolaris.org

Wee Yeh Tan

2006-Sep-13 03:26 UTC

head link

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

Anantha,

How''s the output of:
dtrace -F -n ''fbt:zfs::/pid==<pid of stuck
process>/{trace(timestamp)}''


-- 
Just me,
Wire ...


On 9/13/06, Anantha N. Srirama <anantha.srirama at cdc.hhs.gov>
wrote:> Here''s the information you requested.

Anantha N. Srirama

2006-Sep-13 15:42 UTC

head link

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem

I ran the DTrace script and the resulting output is rather large (1 million
lines and 65MB), so I won''t burden this forum with that much data. Here
are the top 100 lines from the DTrace output. Let me know if you need the full
output and I''ll figure out a way for the group to get it.

dtrace: description ''fbt:zfs::'' matched 2404 probes
CPU FUNCTION
520  -> zfs_lookup                             2929705866442880
520    -> zfs_zaccess                          2929705866448160
520      -> zfs_zaccess_common                 2929705866451840
520        -> zfs_acl_node_read                2929705866455040
520          -> zfs_acl_node_read_internal     2929705866458400
520            -> zfs_acl_alloc                2929705866461040
520            <- zfs_acl_alloc                2929705866462880
520          <- zfs_acl_node_read_internal     2929705866464080
520        <- zfs_acl_node_read                2929705866465600
520        -> zfs_ace_access                   2929705866467760
520        <- zfs_ace_access                   2929705866468880
520        -> zfs_ace_access                   2929705866469520
520        <- zfs_ace_access                   2929705866470320
520        -> zfs_acl_free                     2929705866471920
520        <- zfs_acl_free                     2929705866472960
520      <- zfs_zaccess_common                 2929705866474720
520    <- zfs_zaccess                          2929705866476320
520    -> zfs_dirlook                          2929705866478320
520      -> zfs_dirent_lock                    2929705866480880
520      <- zfs_dirent_lock                    2929705866486560
520      -> zfs_dirent_unlock                  2929705866489840
520      <- zfs_dirent_unlock                  2929705866491600
520    <- zfs_dirlook                          2929705866492560
520  <- zfs_lookup                             2929705866494080
520  -> zfs_getattr                            2929705866499360
520    -> dmu_object_size_from_db              2929705866503520
520    <- dmu_object_size_from_db              2929705866507920
520  <- zfs_getattr                            2929705866509280
520  -> zfs_lookup                             2929705866520400
520    -> zfs_zaccess                          2929705866521200
520      -> zfs_zaccess_common                 2929705866521920
520        -> zfs_acl_node_read                2929705866523280
520          -> zfs_acl_node_read_internal     2929705866524800
520            -> zfs_acl_alloc                2929705866526000
520            <- zfs_acl_alloc                2929705866526800
520          <- zfs_acl_node_read_internal     2929705866527280
520        <- zfs_acl_node_read                2929705866528160
520        -> zfs_ace_access                   2929705866528720
520        <- zfs_ace_access                   2929705866529280
520        -> zfs_ace_access                   2929705866529920
520        <- zfs_ace_access                   2929705866530800
520        -> zfs_acl_free                     2929705866531360
520        <- zfs_acl_free                     2929705866531920
520      <- zfs_zaccess_common                 2929705866532560
520    <- zfs_zaccess                          2929705866533440
520    -> zfs_dirlook                          2929705866534000
520      -> zfs_dirent_lock                    2929705866534640
520      <- zfs_dirent_lock                    2929705866535600
520      -> zfs_dirent_unlock                  2929705866536480
520      <- zfs_dirent_unlock                  2929705866537120
520    <- zfs_dirlook                          2929705866537760
520  <- zfs_lookup                             2929705866538400
520  -> zfs_getsecattr                         2929705866543600
520    -> zfs_getacl                           2929705866546240
520      -> zfs_zaccess                        2929705866546960
520        -> zfs_zaccess_common               2929705866547680
520          -> zfs_acl_node_read              2929705866548720
520            -> zfs_acl_node_read_internal   2929705866549440
520              -> zfs_acl_alloc              2929705866550080
520              <- zfs_acl_alloc              2929705866550720
520            <- zfs_acl_node_read_internal   2929705866551600
520          <- zfs_acl_node_read              2929705866552160
520          -> zfs_ace_access                 2929705866552720
520          <- zfs_ace_access                 2929705866553280
520          -> zfs_ace_access                 2929705866554160
520          <- zfs_ace_access                 2929705866554720
520          -> zfs_ace_access                 2929705866555600
520          <- zfs_ace_access                 2929705866556160
520          -> zfs_ace_access                 2929705866557040
520          <- zfs_ace_access                 2929705866557600
520          -> zfs_ace_access                 2929705866558160
520          <- zfs_ace_access                 2929705866558720
520          -> zfs_ace_access                 2929705866559760
520          <- zfs_ace_access                 2929705866560320
520          -> zfs_acl_free                   2929705866560880
520          <- zfs_acl_free                   2929705866561440
520        <- zfs_zaccess_common               2929705866562400
520      <- zfs_zaccess                        2929705866562960
520    <- zfs_getacl                           2929705866564560
520  <- zfs_getsecattr                         2929705866566000
520  -> zfs_getattr                            2929705866568880
520    -> dmu_object_size_from_db              2929705866569680
520    <- dmu_object_size_from_db              2929705866570240
520  <- zfs_getattr                            2929705866570880
520  -> zfs_create                             2929705866573120
520    -> zfs_dirent_lock                      2929705866574320
520    <- zfs_dirent_lock                      2929705866575200
520    -> zfs_zaccess_rwx                      2929705866576960
 
 
This message posted from opensolaris.org

Anantha N. Srirama

2006-Sep-13 21:58 UTC

head link

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

One more piece of information. I was able to ascertain the slowdown happens only
when ZFS is used heavily; meaning lots of inflight I/O. This morning when the
system was quiet my writes to the /u099 filesystem was excellent and it has gone
south like I reported earlier.

I am currently awaiting the completion of a write to /u099, well over 60
seconds. At the same time I was able create/save files in /u001 without any
problems. The only difference between the /u001 and /u099 is the size of the
filesystem (256GB vs 768GB).

Per your suggestion I ran a ''zfs set'' command and it completed
after a wait of around 20 seconds while my file save from vi against /u099 is
still pending!!!
 
 
This message posted from opensolaris.org

Neil Perrin

2006-Sep-15 17:35 UTC

head link

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

It is highly likely you are seeing a duplicate of:

6413510 zfs: writing to ZFS filesystem slows down fsync() on
         other files in the same FS

which was fixed recently in build 48 on Nevada.
The symptoms are very similar. That is a fsync from the vi would, prior
to the bug being fixed, have to force out all other data through the
intent log.

Neil.


Anantha N. Srirama wrote On 09/13/06 15:58,:> One more piece of information. I was able to ascertain the slowdown happens
only when ZFS is used heavily; meaning lots of inflight I/O. This morning when
the system was quiet my writes to the /u099 filesystem was excellent and it has
gone south like I reported earlier.
> 
> I am currently awaiting the completion of a write to /u099, well over 60
seconds. At the same time I was able create/save files in /u001 without any
problems. The only difference between the /u001 and /u099 is the size of the
filesystem (256GB vs 768GB).
> 
> Per your suggestion I ran a ''zfs set'' command and it
completed after a wait of around 20 seconds while my file save from vi against
/u099 is still pending!!!
>  
>  
> This message posted from opensolaris.org
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

Wee Yeh Tan

2006-Sep-16 17:38 UTC

head link

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem

Anantha,

I was hoping to see a lot less trace records than that.  Was DTrace
running the whole time or did you start it just before you saw the
problem?

Can you sieve thru the trace to see if you can see any subsequent
firings whose timestamp differences are big? (e.g. > 1s).

You can try this awk script:
  awk ''{ if ($4 - lastts > 1000000000) { print last,($4 - lastts);
print $0 } last=$0; lastts = $4}'' < trace_file

This should zoom down on the exact function that is stalling the write.

It does seem from your mdb traces that the issue is something to do
with the intent log.

On 9/13/06, Anantha N. Srirama <anantha.srirama at cdc.hhs.gov>
wrote:> I ran the DTrace script and the resulting output is rather large (1 million
lines and 65MB), so I won''t burden this forum with that much data. Here
are the top 100 lines from the DTrace output. Let me know if you need the full
output and I''ll figure out a way for the group to get it.
>
> dtrace: description ''fbt:zfs::'' matched 2404 probes
> CPU FUNCTION
> 520  -> zfs_lookup                             2929705866442880
> 520    -> zfs_zaccess                          2929705866448160
> 520      -> zfs_zaccess_common                 2929705866451840
> 520        -> zfs_acl_node_read                2929705866455040...

-- 
Just me,
Wire ...

Anantha N. Srirama

2006-Sep-18 15:43 UTC

head link

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem

I don''t see a patch for this on the SunSolve website. I''ve
opened a service request to get this patch for Sol10 06/06. Stay tuned.
 
 
This message posted from opensolaris.org

zfs discuss - Sep 2006 - Bizzare problem with ZFS filesystem

[zfs-discuss] Bizzare problem with ZFS filesystem

[zfs-discuss] Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem

[zfs-discuss] Re: Re: Bizzare problem with ZFS filesystem