We have an IMAP server with ZFS for mailbox storage that has recently
become extremely slow on most weekday mornings and afternoons. When one
of these incidents happens, the number of processes increases, the load
average increases, but ZFS I/O bandwidth decreases. Users notice very
slow response to IMAP requests. On the server, even `ps' becomes slow.

We've tried a number of things, each of which made an improvement, but
the problem still occurs. The ZFS ARC size was about 10 GB, but was
diminishing to 1 GB when the server was busy. In fact, it was unusable
when that happened. Upgrading memory from 16 GB to 64 GB certainly made
a difference. The ARC size is always over 30 GB now. Next, we limited
the number of `lmtpd' (local delivery) processes to 64. With those two
changes, the server still became very slow at busy times, but no longer
became unresponsive. The final change was to disable ZFS prefetch.
It's not clear if this made an improvement.

The server is a T2000 running Solaris 10. It's a Cyrus murder back-end,
essentially only an IMAP server. We did recently upgrade the front-end,
from a 4-CPU SPARC box to a 16-core Intel box with more memory, also
running Solaris 10. The front-end runs sendmail and proxies IMAP and
POP connections to the back-end, and also forwards SMTP for local
deliveries to the back-end, using LMTP.

Cyrus runs thousands of `imapd' processes, with many `pop3d', and
`lmtpd' processes as well. This should be an ideal workload for a
Niagara box. All of these memory-map several moderate-sized databases,
both Berkeley DB and skiplist types, and occasionally update those
databases. Our EMC Networker client also often runs during the day,
doing backups. All of the IMAP mailboxes reside on six ZFS filesystems,
using a single 2-TB pool. It's only 51% occupied at the moment.

Many other layers are involved in this server. We use scsi_vhci for
redundant I/O paths and Sun's Iscsi initiator to connect to the storage
on our Netapp filer. The kernel plays a part as well. How do we
determine which layer is responsible for the slow performance?

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
Richard Elling
2009-Apr-19 01:06 UTC
[zfs-discuss] What causes slow performance under load?
[CC''ed to perf-discuss] Gary Mills wrote:> We have an IMAP server with ZFS for mailbox storage that has recently > become extremely slow on most weekday mornings and afternoons. When > one of these incidents happens, the number of processes increases, the > load average increases, but ZFS I/O bandwidth decreases. Users notice > very slow response to IMAP requests. On the server, even `ps'' becomes > slow. > > We''ve tried a number of things, each of which made an improvement, but > the problem still occurs. The ZFS ARC size was about 10 GB, but was > diminishing to 1 GB when the server was busy. In fact, it was > unusable when that happened. Upgrading memory from 16 GB to 64 GB > certainly made a difference. The ARC size is always over 30 GB now. > Next, we limited the number of `lmtpd'' (local delivery) processes to > 64. With those two changes, the server still became very slow at busy > times, but no longer became unresponsive. The final change was to > disable ZFS prefetch. It''s not clear if this made an improvement. >If memory is being stolen from the ARC, then the consumer must be outside of ZFS. I think this is a case for a traditional performance assessment.> The server is a T2000 running Solaris 10. It''s a Cyrus murder back- > end, essentially only an IMAP server. We did recently upgrade the > front-end, from a 4-CPU SPARC box to a 16-core Intel box with more > memory, also running Solaris 10. The front-end runs sendmail and > proxies IMAP and POP connections to the back-end, and also forwards > SMTP for local deliveries to the back-end, using LMTP. > > Cyrus runs thousands of `imapd'' processes, with many `pop3d'', and > `lmtpd'' processes as well. This should be an ideal workload for a > Niagara box. All of these memory-map several moderate-sized > databases, both Berkeley DB and skiplist types, and occasionally > update those databases. Our EMC Networker client also often runs > during the day, doing backups. All of the IMAP mailboxes reside on > six ZFS filesystems, using a single 2-TB pool. It''s only 51% occupied > at the moment. > > Many other layers are involved in this server. We use scsi_vhci for > redundant I/O paths and Sun''s Iscsi initiator to connect to the > storage on our Netapp filer. The kernel plays a part as well. How > do we determine which layer is responsible for the slow performance? > >prstat is your friend. Find out who is consuming the resources and work from there. I''ve found that it often makes sense to create processor sets and segregate dissimilar apps into different processor sets. mpstat can then clearly show how each processor set consumes its processors. IMAP workloads can be very tricky, because of the sort of I/O generated and because IMAP allows searching to be done on the server, rather than the client (eg POP) -- richard
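For anyone trying this at home, a minimal sketch of that approach on
Solaris 10 (the CPU numbers and the `save' process name below are only
illustrative guesses, not a recommendation):

    # Per-thread microstate accounting; LAT is time spent waiting for a
    # CPU, SLP is time sleeping (e.g. blocked on I/O or locks):
    prstat -mL -n 20 10

    # Largest resident sets, to see who is really holding memory:
    prstat -s rss -n 20 10

    # Segregate dissimilar work: put 8 strands into a new processor set
    # and bind, say, the backup processes to it (set id printed by -c):
    psrset -c 24 25 26 27 28 29 30 31
    psrset -b 1 `pgrep save`

    # Then watch utilization per CPU (and per set) under load:
    mpstat 10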
On Sat, Apr 18, 2009 at 05:25:17PM -0500, Bob Friesenhahn wrote:
> On Sat, 18 Apr 2009, Gary Mills wrote:
> > How do we determine which layer is responsible for the slow
> > performance?
>
> If the ARC size is diminishing under heavy load then there must be
> excessive pressure for memory from the kernel or applications. A 30GB
> ARC is quite large. The slowdown likely increases the amount of RAM
> needed since more simultaneous requests are taking place at once and
> not completing as quickly as they should. Once the problem starts, it
> makes itself worse.

It was diminishing under load when the server had only 16 GB of memory.
There certainly was pressure then, so much so that the server became
unresponsive. Once we upgraded that to 64 GB, the ARC size stayed high.
I gather then that there's no longer pressure for memory by any of the
components that might need it.

> It is good to make sure that the backup software is not the initial
> cause of the cascade effect.

The backup is also very slow, often running for 24 hours. Since it's
spending most of its time reading files, I assume that it must be
cycling a cache someplace. I don't know if it's suffering from the same
performance problem or if it's interfering with the IMAP service.
Certainly, killing the backup doesn't seem to provide any relief. I
don't like the idea of backups running in the daytime, but I get
overruled on that one.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
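For the archives, a quick way to keep an eye on that rather than
guessing (standard Solaris 10 tools; ::memstat can take a while on a
64 GB box):

    # ARC current size and target, plus hit/miss counters:
    kstat -p zfs:0:arcstats:size zfs:0:arcstats:c
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses

    # Rough breakdown of where kernel, anon and cache pages are going:
    echo "::memstat" | mdb -k

    # A sustained non-zero scan rate (sr column) means the page daemon
    # is running, i.e. real memory pressure:
    vmstat 10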
On Sat, Apr 18, 2009 at 06:53:30PM -0400, Ellis, Mike wrote:
> In case the writes are a problem: When zfs sends a sync-command to
> the iscsi luns, does the netapp just ack it, or does it wait till it
> fully destages? Might make sense to disable write/sync in
> /etc/system to be sure.

So far I haven't been able to get an answer to that question from
Netapp. I'm assuming that it acks it as soon as it's in the Netapp's
non-volatile write cache.

> In case there is lots of pain on the read side, it could be
> interesting to take 2x 15k rpm drives, and stick them in a
> (hopefully) set of empty slots on the t2000. You don't have to
> mirror those for safety, just use them both as an l2arc. That could
> potentially help out a lot if data is re-read from those iscsi luns
> a lot. (Essentially helping out the arc with another say 300gb of
> cache, keeping those iscsi luns lightly loaded...)

I don't think I can do that with Solaris 10. However, there doesn't
seem to be much ARC activity during a busy period. Here's an example.
The load average was around 6, with about 2800 imapd processes:

    # /usr/local/src/zfs/arcstat.pl
        Time    read   miss  miss%  dmis  dm%  pmis  pm%  mmis  mm%  arcsz     c
    10:06:15      1G   109M     10   73M    9   35M   17   52M    6    44G   44G
    10:06:16     995     23      2    23    2     0    0     8    0    44G   44G
    10:06:17      1K     37      3    37    3     0    0    13    1    44G   44G
    10:06:18      1K     81      6    81    6     0    0    22    2    44G   44G
    10:06:19      2K     82      3    82    3     0    0    36    3    44G   44G
    10:06:20      1K    128      7   128    7     0    0    36    3    44G   44G

    # zpool iostat 5 5
                   capacity     operations    bandwidth
    pool         used  avail   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    space       1.04T   974G     87     70  4.62M  2.82M
    space       1.04T   974G     87    404  5.01M  5.58M
    space       1.04T   974G     89    329  5.12M  7.14M
    space       1.04T   974G     93    212  5.17M  3.41M
    space       1.04T   974G    119    178  4.85M  3.23M

The I/O bandwidth is quite low in this instance. I've seen zero writes
at times when it was really slow.

> Speaking of iscsi, how does the network look? Does it get slammed?
> Is something like jumbo packets interesting here?

The Iscsi network is lightly utilized. It can't be limited by network
bandwidth. There could be some other problem, of course.

> Get some data out of fsstat, that could be helpful...

What do I look for with fsstat? There are so many different statistics
available that I don't know what's normal and what's not.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
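On the fsstat question, what I would compare between a quiet period and
a slow one is just the per-fstype rollup (the mount point in the last
example is only a placeholder for one of your mailbox filesystems):

    # Default counters (lookups, creates, reads, writes, ...) per 10 s:
    fsstat zfs 10

    # Just the read/write ops, bytes and average transfer sizes:
    fsstat -i zfs 10

    # The same, but for one filesystem instead of the whole fstype:
    fsstat -i /space/imap1 10

If the ops counts climb while the bytes stay flat during an incident,
the time is going somewhere other than data transfer.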
On Sat, Apr 18, 2009 at 09:58:19PM -0400, Ellis, Mike wrote:
> I've found that (depending on the backup software) the backup agents
> tend to run a single thread per filesystem. While that can back up
> several filesystems concurrently, the single filesystem backup is
> single-threaded...

Yes, they do that. There are two of them running right now, but
together they're only using 0.6% CPU. They're sleeping most of the
time.

> I assume you're using zfs snapshots so you don't get fuzzy backups
> (over the 20-hour period...)

That's what I've been recommending. We do have 14 daily snapshots
available. I named them by Julian date, but our backups person doesn't
like them because the names keep changing.

> Can you take a snapshot, and then have your backup software instead
> of backing up 1 entire "fs/tree" backup a bunch of the high-level
> filesystems concurrently? That could make a big difference on
> something like a t2000.

Wouldn't there be one recent snapshot for each ZFS filesystem? We've
certainly discussed backing up snapshots, but I wouldn't expect it to
be much different. Wouldn't it still read all of the same files,
except for ones that were added after the snapshot was taken?

> (You're not by chance using any type of ssh-transfers etc as part of
> the backups are you)

No, Networker uses RPC to connect to the backup server, but there's no
encryption or compression on the client side.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
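For what it's worth, a rough sketch of the snapshot-based variant being
discussed (the snapshot name and the mount point are arbitrary
examples; `space' is the pool mentioned earlier):

    # One consistent snapshot across all filesystems in the pool:
    zfs snapshot -r space@nightly

    # Each filesystem then exposes it read-only under .zfs, e.g.:
    ls /space/imap1/.zfs/snapshot/nightly

    # Point one backup stream per filesystem at its .zfs path, then:
    zfs destroy -r space@nightly

A fixed snapshot name like `nightly' would also sidestep the objection
about names that keep changing.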
On Sat, Apr 18, 2009 at 9:01 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:
> On Sat, Apr 18, 2009 at 06:53:30PM -0400, Ellis, Mike wrote:
> > In case the writes are a problem: When zfs sends a sync-command to
> > the iscsi luns, does the netapp just ack it, or does it wait till it
> > fully destages? Might make sense to disable write/sync in
> > /etc/system to be sure.
>
> So far I haven't been able to get an answer to that question from
> Netapp. I'm assuming that it acks it as soon as it's in the Netapp's
> non-volatile write cache.

IIRC, it should just ack it. What version of ONTAP are you running?

--Tim
On Sat, Apr 18, 2009 at 06:06:49PM -0700, Richard Elling wrote:> [CC''ed to perf-discuss] > > Gary Mills wrote: > >We have an IMAP server with ZFS for mailbox storage that has recently > >become extremely slow on most weekday mornings and afternoons. When > >one of these incidents happens, the number of processes increases, the > >load average increases, but ZFS I/O bandwidth decreases. Users notice > >very slow response to IMAP requests. On the server, even `ps'' becomes > >slow. > > If memory is being stolen from the ARC, then the consumer must be outside > of ZFS. I think this is a case for a traditional performance assessment.It was being stolen from the ARC, but once we added memory, that was no longer the case. ZFS is still one of the suspects.> >The server is a T2000 running Solaris 10. It''s a Cyrus murder back- > >end, essentially only an IMAP server. We did recently upgrade the > >front-end, from a 4-CPU SPARC box to a 16-core Intel box with more > >memory, also running Solaris 10. The front-end runs sendmail and > >proxies IMAP and POP connections to the back-end, and also forwards > >SMTP for local deliveries to the back-end, using LMTP. > > > >Cyrus runs thousands of `imapd'' processes, with many `pop3d'', and > >`lmtpd'' processes as well. This should be an ideal workload for a > >Niagara box. All of these memory-map several moderate-sized > >databases, both Berkeley DB and skiplist types, and occasionally > >update those databases. Our EMC Networker client also often runs > >during the day, doing backups. All of the IMAP mailboxes reside on > >six ZFS filesystems, using a single 2-TB pool. It''s only 51% occupied > >at the moment. > > > >Many other layers are involved in this server. We use scsi_vhci for > >redundant I/O paths and Sun''s Iscsi initiator to connect to the > >storage on our Netapp filer. The kernel plays a part as well. How > >do we determine which layer is responsible for the slow performance? > > prstat is your friend. Find out who is consuming the resources and work > from there.What resources are visible with prstat, other than CPU and memory? Even at the busiest times, all of the processes only add up to about 6% of the CPU. The load average does rise alarmingly. Nothing is using large amounts of memory, although with thousands of processes, it would add up.> I''ve found that it often makes sense to create processor sets and segregate > dissimilar apps into different processor sets. mpstat can then clearly show > how each processor set consumes its processors. IMAP workloads can > be very tricky, because of the sort of I/O generated and because IMAP > allows searching to be done on the server, rather than the client (eg POP)What would I look for with mpstat? -- -Gary Mills- -Unix Support- -U of M Academic Computing and Networking-
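One addition to the prstat suggestion: with microstate accounting
turned on, prstat shows per-thread wait states, which is usually more
revealing than %CPU when the load average is high but the CPUs are
mostly idle:

    # -m = microstates, -L = one line per LWP. Watch LAT (waiting for a
    # CPU), SLP (sleeping, e.g. blocked in the kernel), and DFL/TFL
    # (data/text page faults) rather than the CPU column:
    prstat -mL -n 30 10

Thousands of imapd processes all showing their time in SLP or LAT
points at a very different problem than time in USR/SYS.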
[perf-discuss cc''d] On Sat, Apr 18, 2009 at 4:27 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:> Many other layers are involved in this server. ?We use scsi_vhci for > redundant I/O paths and Sun''s Iscsi initiator to connect to the > storage on our Netapp filer. ?The kernel plays a part as well. ?How > do we determine which layer is responsible for the slow performance?Have you disabled the nagle algorithm for the iscsi initiator? http://bugs.opensolaris.org/view_bug.do?bug_id=6772828 Also, you may want to consider doing backups from the NetApp rather than from the Solaris box. Assuming all of your LUNs are in the same volume on the filer, a snapshot should be a crash-consistent image of the zpool. You could verify this by making the snapshot rw and trying to import the snapshotted LUNs on another host. Anyway, this would remove the backup-related stress on the T2000. You can still do snapshots at the ZFS layer to give you file level restores. If the NetApp caught on fire, you would simply need to restore the volume containing the LUNs (presumably a small collection of large files) which would go a lot quicker than a large collection of small files. Since iSCSI is in the mix, you should also be sure that your network is appropriately tuned. Assuming that you are using the onboard e1000g NICs, be sure that none of the "bad" counters are incrementing: $ kstat -p e1000g | nawk ''$0 ~ /err|drop|fail|no/ && $NF != 0'' If this gives any output, there is likely something amiss with your network. The output from "iostat -xCn 10" could be interesting as well. If asvc_t is high (>30?), it means the filer is being slow to respond. If wsvc_t is frequently non-zero, there is some sort of a bottleneck that prevents the server from sending requests to the filer. Perhaps you have tuned ssd_max_throttle or Solaris has backed off because the filer said to slow down. (Assuming that ssd is used with iSCSI LUNs). What else is happening on the filer when mail gets slow? That is, are you experiencing slowness due to a mail peak or due to some research project that happens to be on the same spindles? What does the network look like from the NetApp side? Are the mail server and the NetApp attached to the same switch, or are they at opposite ends of the campus? Is there something between them that is misbehaving? -- Mike Gerdts http://mgerdts.blogspot.com/
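If anyone wants to test the Nagle theory without rebuilding anything,
there is a blunt, global knob (it affects every TCP connection on the
box, so treat it as a temporary experiment only):

    # Current value; the default is 4095 bytes:
    ndd -get /dev/tcp tcp_naglim_def

    # Effectively disable Nagle system-wide until the next reboot:
    ndd -set /dev/tcp tcp_naglim_def 1

If latency during an incident is unchanged with this set to 1, Nagle
can probably be crossed off the list.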
On Sat, Apr 18, 2009 at 09:41:39PM -0500, Tim wrote:
> On Sat, Apr 18, 2009 at 9:01 PM, Gary Mills <mills at cc.umanitoba.ca> wrote:
> > On Sat, Apr 18, 2009 at 06:53:30PM -0400, Ellis, Mike wrote:
> > > In case the writes are a problem: When zfs sends a sync-command to
> > > the iscsi luns, does the netapp just ack it, or does it wait till
> > > it fully destages? Might make sense to disable write/sync in
> > > /etc/system to be sure.
> >
> > So far I haven't been able to get an answer to that question from
> > Netapp. I'm assuming that it acks it as soon as it's in the Netapp's
> > non-volatile write cache.
>
> IIRC, it should just ack it. What version of ONTAP are you running?

It seems to be this one:

    MODEL:      FAS3020-R5
    SW VERSION: 7.2.3

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
On Sat, Apr 18, 2009 at 11:45:54PM -0500, Mike Gerdts wrote:> [perf-discuss cc''d] > > On Sat, Apr 18, 2009 at 4:27 PM, Gary Mills <mills at cc.umanitoba.ca> wrote: > > Many other layers are involved in this server. ?We use scsi_vhci for > > redundant I/O paths and Sun''s Iscsi initiator to connect to the > > storage on our Netapp filer. ?The kernel plays a part as well. ?How > > do we determine which layer is responsible for the slow performance? > > Have you disabled the nagle algorithm for the iscsi initiator? > > http://bugs.opensolaris.org/view_bug.do?bug_id=6772828I tried that on our test IMAP backend the other day. It made no significant difference to read or write times or to ZFS I/O bandwidth. I conclude that the Iscsi initiator has already sized its TCP packets to avoid Nagle delays.> Also, you may want to consider doing backups from the NetApp rather > than from the Solaris box.I''ve certainly recommended finding a different way to perform backups.> Assuming all of your LUNs are in the same > volume on the filer, a snapshot should be a crash-consistent image of > the zpool. You could verify this by making the snapshot rw and trying > to import the snapshotted LUNs on another host.That part sounds scary! The filer exports four LUNs that are combined into one ZFS pool on the IMAP server. These LUNs are volumes on the filer. How can we safely import them on another host?> Anyway, this would > remove the backup-related stress on the T2000. You can still do > snapshots at the ZFS layer to give you file level restores. If the > NetApp caught on fire, you would simply need to restore the volume > containing the LUNs (presumably a small collection of large files) > which would go a lot quicker than a large collection of small files.Yes, a disaster recovery would be much quicker in this case.> Since iSCSI is in the mix, you should also be sure that your network > is appropriately tuned. Assuming that you are using the onboard > e1000g NICs, be sure that none of the "bad" counters are incrementing: > > $ kstat -p e1000g | nawk ''$0 ~ /err|drop|fail|no/ && $NF != 0'' > > If this gives any output, there is likely something amiss with your network.Only this: e1000g:0:e1000g0:unknowns 1764449 I don''t know what those are, but it''s e1000g1 and e1000g2 that are used for the Iscsi network.> The output from "iostat -xCn 10" could be interesting as well. If > asvc_t is high (>30?), it means the filer is being slow to respond. > If wsvc_t is frequently non-zero, there is some sort of a bottleneck > that prevents the server from sending requests to the filer. Perhaps > you have tuned ssd_max_throttle or Solaris has backed off because the > filer said to slow down. (Assuming that ssd is used with iSCSI LUNs).Here''s an example, taken from one of the busy periods: extended device statistics r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device 0.0 5.0 0.0 7.7 0.0 0.1 4.1 24.8 1 1 c1t2d0 27.0 13.8 1523.4 172.9 0.0 0.5 0.0 11.8 0 38 c4t60A98000433469764E4A2D456A644A74d0 42.0 21.4 2027.3 350.0 0.0 0.9 0.0 13.9 0 60 c4t60A98000433469764E4A2D456A696579d0 40.8 25.0 1993.5 339.1 0.0 0.8 0.0 11.8 0 52 c4t60A98000433469764E4A476D2F664E4Fd0 42.0 26.6 1968.4 319.1 0.0 0.8 0.0 11.8 0 56 c4t60A98000433469764E4A476D2F6B385Ad0 The service times seem okay to me. There''s no `throttle'' setting in any of the relevant driver conf files.> What else is happening on the filer when mail gets slow? 
That is, are > you experiencing slowness due to a mail peak or due to some research > project that happens to be on the same spindles? What does the > network look like from the NetApp side?Our Netapp guy tells me that the filer is operating normally when the problem occurs. The Iscsi network is less than 10% utilized.> Are the mail server and the NetApp attached to the same switch, or are > they at opposite ends of the campus? Is there something between them > that is misbehaving?I don''t think so. We have dedicated ethernet ports on both the IMAP server and the filer for Iscsi, along with a pair of dedicated switches. -- -Gary Mills- -Unix Support- -U of M Academic Computing and Networking-
On Sun, Apr 19, 2009 at 10:58 AM, Gary Mills <mills at cc.umanitoba.ca> wrote:> On Sat, Apr 18, 2009 at 11:45:54PM -0500, Mike Gerdts wrote: >> Also, you may want to consider doing backups from the NetApp rather >> than from the Solaris box. > > I''ve certainly recommended finding a different way to perform backups. > >> Assuming all of your LUNs are in the same >> volume on the filer, a snapshot should be a crash-consistent image of >> the zpool. ?You could verify this by making the snapshot rw and trying >> to import the snapshotted LUNs on another host. > > That part sounds scary! ?The filer exports four LUNs that are combined > into one ZFS pool on the IMAP server. ?These LUNs are volumes on the > filer. ?How can we safely import them on another host?This is just like operating on ZFS clones - operations on the clones do not change the contents of the original. Again, you are presenting the snapshots to another host, not the original LUNs. It is a bit scary only because you will have to do "zpool import -f". If you have presented the real LUN and not the rw snapshot to your test host, you will almost certainly corrupt the active copy. If you do it correctly, there should be no danger. Proving out the process on something other than your important data is highly recommended. In any case, this is probably something to think about outside of the scope of this performance issue.>> Since iSCSI is in the mix, you should also be sure that your network >> is appropriately tuned. ?Assuming that you are using the onboard >> e1000g NICs, be sure that none of the "bad" counters are incrementing: >> >> $ kstat -p e1000g | nawk ''$0 ~ /err|drop|fail|no/ && $NF != 0'' >> >> If this gives any output, there is likely something amiss with your network. > > Only this: > ? ?e1000g:0:e1000g0:unknowns ? ? ? 1764449I first saw this statistic a few weeks back. I''m not sure of the importance of it. A cluestick would be most appreciated.> > I don''t know what those are, but it''s e1000g1 and e1000g2 that are > used for the Iscsi network. > >> The output from "iostat -xCn 10" could be interesting as well. ?If >> asvc_t is high (>30?), it means the filer is being slow to respond. >> If wsvc_t is frequently non-zero, there is some sort of a bottleneck >> that prevents the server from sending requests to the filer. ?Perhaps >> you have tuned ssd_max_throttle or Solaris has backed off because the >> filer said to slow down. ?(Assuming that ssd is used with iSCSI LUNs). > > Here''s an example, taken from one of the busy periods: > > ? ? ? ? ? ? ? ? ? ?extended device statistics > ? ?r/s ? ?w/s ? kr/s ? kw/s wait actv wsvc_t asvc_t ?%w ?%b device > ? ?0.0 ? ?5.0 ? ?0.0 ? ?7.7 ?0.0 ?0.1 ? ?4.1 ? 24.8 ? 1 ? 1 c1t2d0 > ? 27.0 ? 13.8 1523.4 ?172.9 ?0.0 ?0.5 ? ?0.0 ? 11.8 ? 0 ?38 c4t60A98000433469764E4A2D456A644A74d0 > ? 42.0 ? 21.4 2027.3 ?350.0 ?0.0 ?0.9 ? ?0.0 ? 13.9 ? 0 ?60 c4t60A98000433469764E4A2D456A696579d0 > ? 40.8 ? 25.0 1993.5 ?339.1 ?0.0 ?0.8 ? ?0.0 ? 11.8 ? 0 ?52 c4t60A98000433469764E4A476D2F664E4Fd0 > ? 42.0 ? 26.6 1968.4 ?319.1 ?0.0 ?0.8 ? ?0.0 ? 11.8 ? 0 ?56 c4t60A98000433469764E4A476D2F6B385Ad0Surely this has been investigated already, but just in case... I''m not sure of how long that interval was. It it wasn''t extremely short, it looks like you could be bumping up against the throughput constraints of a 100 Mbit connection. Have you verified that everything is running at 1000 Mbit/s, full duplex? 
In a hardware and OS configuration similar to yours I can drive 10x the throughput you are seeing - and I am certain that all of my links are 1000 full.> > The service times seem okay to me. ?There''s no `throttle'' setting in > any of the relevant driver conf files. > >> What else is happening on the filer when mail gets slow? ?That is, are >> you experiencing slowness due to a mail peak or due to some research >> project that happens to be on the same spindles? ?What does the >> network look like from the NetApp side? > > Our Netapp guy tells me that the filer is operating normally when the > problem occurs. ?The Iscsi network is less than 10% utilized.If something is running at 100 Mbit, this would be "the iSCSI network is less than 100% utilized." But... as hard as you have looked at this, I am not optimistic that something like this would have been overlooked. Is the ipfilter service running? If so, does it need to be? If so, is your first rule one that starts with "pass in quick" to ensure that iSCSI packets are subjected to the fewest number of rules possible? -- Mike Gerdts http://mgerdts.blogspot.com/
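For the link-speed and ipfilter checks, something like this should be
enough on Solaris 10 (the interface names are the ones mentioned
earlier in the thread):

    # Negotiated state, speed and duplex per interface:
    dladm show-dev e1000g1 e1000g2

    # The driver also exposes link statistics as kstats:
    kstat -p e1000g | grep -i link

    # Is ipfilter even enabled, and if so, what rules is it applying?
    svcs ipfilter
    ipfstat -io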
Marion Hakanson
2009-Apr-19 19:23 UTC
[zfs-discuss] What causes slow performance under load?
mills at cc.umanitoba.ca said:
> What would I look for with mpstat?

Look for a CPU (thread) that might be 100% utilized; also look to see
if that CPU (or CPU's) has a larger number in the "ithr" column than
all other CPU's. The idea here is that you aren't getting much out of
the T2000 if only one (or a few) of its 32 CPU's is working hard.

On our T2000's running Solaris-10 (Update 4, I believe), the default
kernel settings do not enable interrupt-fanout for the network
interfaces. So you can end up with all four of your e1000g's being
serviced by the same CPU. You can't get even one interface to handle
more than 35-45% of a gigabit if that's the case, but proper tuning
has allowed us to see 90MByte/sec each, on multiple interfaces
simultaneously.

Note I'm not suggesting this explains your situation. But even if
you've addressed this particular issue, you could still have some
other piece of your stack which ends up bottlenecked on a single CPU,
and mpstat can show if that's happening.

Oh yes, "intrstat" can also show if hardware device interrupts are
being spread among multiple CPU's. On the T2000, it's recommended that
you set things up so only one thread per core is allowed to handle
interrupts, freeing the others for application-only work.

Regards,
Marion
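To make that concrete (the CPU list below is only an example of the
idea; pick strands that are not already doing the interrupt work):

    # Per-CPU view; look for one CPU pegged in %sys or taking the bulk
    # of the interrupts (intr/ithr columns):
    mpstat 10

    # Which device's interrupts land on which CPU:
    intrstat 10

    # Mark some strands no-intr so application threads own them;
    # reversible with psradm -n:
    psradm -i 4 5 6 7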
Richard Elling
2009-Apr-19 23:47 UTC
[zfs-discuss] What causes slow performance under load?
iostat measurements comment below... Gary Mills wrote:> On Sat, Apr 18, 2009 at 11:45:54PM -0500, Mike Gerdts wrote: > >> [perf-discuss cc''d] >> >> On Sat, Apr 18, 2009 at 4:27 PM, Gary Mills <mills at cc.umanitoba.ca> wrote: >> >>> Many other layers are involved in this server. We use scsi_vhci for >>> redundant I/O paths and Sun''s Iscsi initiator to connect to the >>> storage on our Netapp filer. The kernel plays a part as well. How >>> do we determine which layer is responsible for the slow performance? >>> >> Have you disabled the nagle algorithm for the iscsi initiator? >> >> http://bugs.opensolaris.org/view_bug.do?bug_id=6772828 >> > > I tried that on our test IMAP backend the other day. It made no > significant difference to read or write times or to ZFS I/O bandwidth. > I conclude that the Iscsi initiator has already sized its TCP packets > to avoid Nagle delays. > > >> Also, you may want to consider doing backups from the NetApp rather >> than from the Solaris box. >> > > I''ve certainly recommended finding a different way to perform backups. > > >> Assuming all of your LUNs are in the same >> volume on the filer, a snapshot should be a crash-consistent image of >> the zpool. You could verify this by making the snapshot rw and trying >> to import the snapshotted LUNs on another host. >> > > That part sounds scary! The filer exports four LUNs that are combined > into one ZFS pool on the IMAP server. These LUNs are volumes on the > filer. How can we safely import them on another host? > > >> Anyway, this would >> remove the backup-related stress on the T2000. You can still do >> snapshots at the ZFS layer to give you file level restores. If the >> NetApp caught on fire, you would simply need to restore the volume >> containing the LUNs (presumably a small collection of large files) >> which would go a lot quicker than a large collection of small files. >> > > Yes, a disaster recovery would be much quicker in this case. > > >> Since iSCSI is in the mix, you should also be sure that your network >> is appropriately tuned. Assuming that you are using the onboard >> e1000g NICs, be sure that none of the "bad" counters are incrementing: >> >> $ kstat -p e1000g | nawk ''$0 ~ /err|drop|fail|no/ && $NF != 0'' >> >> If this gives any output, there is likely something amiss with your network. >> > > Only this: > e1000g:0:e1000g0:unknowns 1764449 > > I don''t know what those are, but it''s e1000g1 and e1000g2 that are > used for the Iscsi network. > > >> The output from "iostat -xCn 10" could be interesting as well. If >> asvc_t is high (>30?), it means the filer is being slow to respond. >> If wsvc_t is frequently non-zero, there is some sort of a bottleneck >> that prevents the server from sending requests to the filer. Perhaps >> you have tuned ssd_max_throttle or Solaris has backed off because the >> filer said to slow down. (Assuming that ssd is used with iSCSI LUNs). >> > > Here''s an example, taken from one of the busy periods: > > extended device statistics > r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device > 0.0 5.0 0.0 7.7 0.0 0.1 4.1 24.8 1 1 c1t2d0 > 27.0 13.8 1523.4 172.9 0.0 0.5 0.0 11.8 0 38 c4t60A98000433469764E4A2D456A644A74d0 > 42.0 21.4 2027.3 350.0 0.0 0.9 0.0 13.9 0 60 c4t60A98000433469764E4A2D456A696579d0 > 40.8 25.0 1993.5 339.1 0.0 0.8 0.0 11.8 0 52 c4t60A98000433469764E4A476D2F664E4Fd0 > 42.0 26.6 1968.4 319.1 0.0 0.8 0.0 11.8 0 56 c4t60A98000433469764E4A476D2F6B385Ad0 >I see no evidence of an I/O or file system bottleneck here. 
While the service times are a little higher than I expect, I don't get
worried until the %busy is high and actv is high and asvc_t is
high(er). I think your problem is elsewhere.

NB when looking at ZFS, a 1 second interval for iostat is too small to
be useful. 10 seconds is generally better, especially for older
releases of ZFS (anything on Solaris 10).

<shameless plug>
ZFS consulting available at http://www.richardelling.com
</shameless plug>

-- richard
On Sun, Apr 19, 2009 at 6:47 PM, Richard Elling <richard.elling at gmail.com>wrote:> I see no evidence of an I/O or file system bottleneck here. While the > service times are a little higher than I expect, I don''t get worried until > the %busy is high and actv is high and asvc_t is high(er). I think your > problem is elsewhere. > > NB when looking at ZFS, a 1 second interval for iostat is too small > to be useful. 10 seconds is generally better, especially for older > releases of ZFS (anything on Solaris 10). > > <shameless plug> > ZFS consulting available at http://www.richardelling.com > </shamelss plug> > > -- richard > >So does that mean you don''t work for Sun anymore...? --Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090419/e98d6179/attachment.html>
Richard Elling
2009-Apr-20 16:35 UTC
[zfs-discuss] What causes slow performance under load?
Tim wrote:> > > On Sun, Apr 19, 2009 at 6:47 PM, Richard Elling > <richard.elling at gmail.com <mailto:richard.elling at gmail.com>> wrote: > > I see no evidence of an I/O or file system bottleneck here. While the > service times are a little higher than I expect, I don''t get > worried until > the %busy is high and actv is high and asvc_t is high(er). I > think your > problem is elsewhere. > > NB when looking at ZFS, a 1 second interval for iostat is too small > to be useful. 10 seconds is generally better, especially for older > releases of ZFS (anything on Solaris 10). > > <shameless plug> > ZFS consulting available at http://www.richardelling.com > </shamelss plug> > > -- richard > > So does that mean you don''t work for Sun anymore...?I describe it as "free of the shackles of the corporate jail, I can now recognize and act upon any opportunity I find interesting." With Sun being bought by Oracle, I have a feeling there will be plenty of opportunity... -- richard
On Mon, Apr 20, 2009 at 11:35 AM, Richard Elling <richard.elling at gmail.com>wrote:> Tim wrote: > >> >> >> On Sun, Apr 19, 2009 at 6:47 PM, Richard Elling <richard.elling at gmail.com<mailto: >> richard.elling at gmail.com>> wrote: >> >> I see no evidence of an I/O or file system bottleneck here. While the >> service times are a little higher than I expect, I don''t get >> worried until >> the %busy is high and actv is high and asvc_t is high(er). I >> think your >> problem is elsewhere. >> >> NB when looking at ZFS, a 1 second interval for iostat is too small >> to be useful. 10 seconds is generally better, especially for older >> releases of ZFS (anything on Solaris 10). >> >> <shameless plug> >> ZFS consulting available at http://www.richardelling.com >> </shamelss plug> >> >> -- richard >> >> So does that mean you don''t work for Sun anymore...? >> > > I describe it as "free of the shackles of the corporate jail, I can > now recognize and act upon any opportunity I find interesting." > With Sun being bought by Oracle, I have a feeling there will > be plenty of opportunity... > -- richard >LOL, fair enough :) Sorry for the "intrusion" if you will. I just noticed the @gmail instead of the @sun (perhaps I''m a bit slow) and was a bit taken aback to see someone so involved in zfs was no longer with Sun. I guess whatever it takes to make the books look good. Oracle: It should be an interesting ride to say the least. I guess we''ll see just how much they love linux... either zfs et. all will become GPL, or we''ll see their true colors. I''m secretly hoping for the latter (as long as they keep it open sourced). --Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090420/ba10f3f7/attachment.html>
Bob Friesenhahn
2009-Apr-20 18:22 UTC
[zfs-discuss] What causes slow performance under load?
On Mon, 20 Apr 2009, Richard Elling wrote:> > I describe it as "free of the shackles of the corporate jail, I can > now recognize and act upon any opportunity I find interesting." > With Sun being bought by Oracle, I have a feeling there will > be plenty of opportunity...Is this a forward-looking statement? Are you planning on taking your consulting company public soon so we can all invest in it? Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Bob Friesenhahn
2009-Apr-20 19:19 UTC
[zfs-discuss] What causes slow performance under load?
On Mon, 20 Apr 2009, Tim wrote:> > Oracle: It should be an interesting ride to say the least. I guess we''ll > see just how much they love linux... either zfs et. all will become GPL, or > we''ll see their true colors. I''m secretly hoping for the latter (as long as > they keep it open sourced).I don''t think that GPL would be very wise, although a dual-license may be ok. Linux would need GPLv2, which is now out of date. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Eric D. Mudama
2009-Apr-20 20:04 UTC
[zfs-discuss] What causes slow performance under load?
On Mon, Apr 20 at 14:19, Bob Friesenhahn wrote:> On Mon, 20 Apr 2009, Tim wrote: >> >> Oracle: It should be an interesting ride to say the least. I guess we''ll >> see just how much they love linux... either zfs et. all will become GPL, or >> we''ll see their true colors. I''m secretly hoping for the latter (as long as >> they keep it open sourced). > > I don''t think that GPL would be very wise, although a dual-license may be > ok. Linux would need GPLv2, which is now out of date. >GPL v2 may not be the most recent version, but a lot of people prefer GPLv2 to GPLv3, in the same way that some people might prefer Solaris 8 to Solaris 10, or Linux 2.4 kernels to the 2.6 series. I don''t know who they are, but they certainly exist. -- Eric D. Mudama edmudama at mail.bounceswoosh.org
Ray Van Dolson
2009-Apr-20 20:34 UTC
[zfs-discuss] What causes slow performance under load?
On Mon, Apr 20, 2009 at 01:04:59PM -0700, Eric D. Mudama wrote:> On Mon, Apr 20 at 14:19, Bob Friesenhahn wrote: > > On Mon, 20 Apr 2009, Tim wrote: > >> > >> Oracle: It should be an interesting ride to say the least. I guess we''ll > >> see just how much they love linux... either zfs et. all will become GPL, or > >> we''ll see their true colors. I''m secretly hoping for the latter (as long as > >> they keep it open sourced). > > > > I don''t think that GPL would be very wise, although a dual-license may be > > ok. Linux would need GPLv2, which is now out of date. > > > > GPL v2 may not be the most recent version, but a lot of people prefer > GPLv2 to GPLv3, in the same way that some people might prefer Solaris > 8 to Solaris 10, or Linux 2.4 kernels to the 2.6 series. > > I don''t know who they are, but they certainly exist.I wouldn''t say GPLv2 is out of date. In fact, I don''t think it''ll ever go away as a lot of people see it as being more "free" than GPLv3. So, yes, GPLv3 has a higher version number, but it hardly obsoletes GPLv2 :-) (I think I''m basically agreeing with what you said here) Ray
Bob Friesenhahn
2009-Apr-20 20:36 UTC
[zfs-discuss] What causes slow performance under load?
On Mon, 20 Apr 2009, Eric D. Mudama wrote:> GPL v2 may not be the most recent version, but a lot of people prefer > GPLv2 to GPLv3, in the same way that some people might prefer Solaris > 8 to Solaris 10, or Linux 2.4 kernels to the 2.6 series.To be more clear, standard GPL provides the option for the user to use any later version. The Linux kernel uses a modified verison of GPLv2 which removes that option since they could not control the future of GPL. GPLv2 and GPLv3 are only similar in general intent. One prints in a page or two while the other requires a book. Due to this, ZFS would need to be licensed using GPLv2 in order to be included in the the Linux kernel. Bob -- Bob Friesenhahn bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Sat, Apr 18, 2009 at 04:27:55PM -0500, Gary Mills wrote:
> We have an IMAP server with ZFS for mailbox storage that has recently
> become extremely slow on most weekday mornings and afternoons. When
> one of these incidents happens, the number of processes increases,
> the load average increases, but ZFS I/O bandwidth decreases. Users
> notice very slow response to IMAP requests. On the server, even `ps'
> becomes slow.

After I moved a couple of Cyrus databases from ZFS to UFS on Sunday
morning, the server seemed to run quite nicely. One of these databases
is memory-mapped by all of the lmtpd and pop3d processes. The other is
opened by all the lmtpd processes. Both were quite active, with many
small writes, so I assumed they'd be better on UFS. All of the IMAP
mailboxes were still on ZFS.

However, this morning, things went from bad to worse. All writes to
the ZFS filesystems stopped completely. Look at this:

    $ zpool iostat 5 5
                   capacity     operations    bandwidth
    pool         used  avail   read  write   read  write
    ----------  -----  -----  -----  -----  -----  -----
    space       1.04T   975G     86     67  4.53M  2.57M
    space       1.04T   975G      5      0   159K      0
    space       1.04T   975G      7      0   337K      0
    space       1.04T   975G      3      0   179K      0
    space       1.04T   975G      4      0   167K      0

`fsstat' told me that there were both writes and memory-mapped I/O to
UFS, but nothing to ZFS. At the same time, the `ps' command would hang
and could not be interrupted. `truss' on `ps' looked like this, but it
eventually also stopped and could not be interrupted:

    open("/proc/6359/psinfo", O_RDONLY)            = 4
    read(4, "02\0\0\0\0\0\001\0\018D7".., 416)     = 416
    close(4)                                       = 0
    open("/proc/12782/psinfo", O_RDONLY)           = 4
    read(4, "02\0\0\0\0\0\001\0\0 1EE".., 416)     = 416
    close(4)                                       = 0

What could cause this sort of behavior? It happened three times today!

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
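When it wedges like that, the kernel stacks usually tell you more than
anything at user level. On Solaris 10 something along these lines
should work (12345 is a placeholder for the pid of the hung ps; pstack
on it may just hang as well, which is why mdb -k is used):

    # Kernel stack of the hung process:
    echo "0t12345::pid2proc | ::walk thread | ::findstack -v" | mdb -k

    # Or dump every kernel thread and look for a pile-up in one
    # zfs/zio/txg function or on one lock:
    echo "::threadlist -v" | mdb -k > /var/tmp/threads.out

A ps that hangs uninterruptibly while reading /proc often means it is
blocked on a process that is itself stuck in the kernel.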
Joerg Schilling
2009-Apr-21 08:35 UTC
[zfs-discuss] What causes slow performance under load?
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:> On Mon, 20 Apr 2009, Tim wrote: > > > > Oracle: It should be an interesting ride to say the least. I guess we''ll > > see just how much they love linux... either zfs et. all will become GPL, or > > we''ll see their true colors. I''m secretly hoping for the latter (as long as > > they keep it open sourced). > > I don''t think that GPL would be very wise, although a dual-license > may be ok. Linux would need GPLv2, which is now out of date.Dual licensing is a general problem as you might see e.g. GPL-only patches that we cannot use for OpenSolaris. Do you really like Sun to be forced to verify that the kind of such a patch is below the interlectual creation level to be able to claim a copyright? BTW: GPLv2 is still more open for license combinations than GPLv3 is. While both GPLv2 and GPLv3 distinct bewteen "the work" and "the complete source", GPLv2 does not say anything about the license of the rest of the code. GPLv3 requires "the complete source" to be under GPLv3. This is a real problem as: - The term complete source is explained in a way that may include compilers, editors, revision control systems. You will have to include them under GPLv3 if a you meet an unhappy author. - The term "system library" is limited in GPLv3 and does not even match all libraries that are usually part of the OS installation. - While GPLv2 results in a general compatibility with any independently developed library (because it is not part of "the work"), GPLv3 list only a few exceptions (note that the so called GPLv2 system exception only allows you to exclude code from "the complete source" but does not affect license compatibility). - All other licenses are explicitely excluded from GPLv3 compatibility. A LGPL-2.1 library that is part of the installation but does not match the "system library" criteria of the GPLv3 cannot be used from a GPLv3 program. Conclusion: GPLv3 creates more problems than it solved. If you like to think about other licenses, you better stay with GPLv2. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Joerg Schilling
2009-Apr-21 09:06 UTC
[zfs-discuss] What causes slow performance under load?
Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:> To be more clear, standard GPL provides the option for the user to use > any later version. The Linux kernel uses a modified verison of GPLv2Such an option is illegal in Europe anyway - you cannot agree with a contract that you don''t know.> Due to this, ZFS would need to be licensed using GPLv2 in order to be > included in the the Linux kernel.I see no need to relicense ZFS as ZFS is doubtlessly a separate work and not part of the linux kernel. J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
Patrick Skerrett
2009-Apr-21 14:34 UTC
[zfs-discuss] What causes slow performance under load?
I''m fighting with an identical problem here & am very interested in this thread. Solaris 10 127112-11 boxes running ZFS on a fiberchannel raid5 device (hardware raid). Randomly one lun on a machine will stop writing for about 10-15 minutes (during a busy time of day), and then all of a sudden become active with a burst of activity. Reads will continue to happen. I just captured this today (problem volume is sd3): extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 122.6 0.0 7519.4 0.0 0.0 1.2 9.9 0 94 sd19 19.2 42.4 1121.5 284.2 0.0 0.3 4.5 0 17 extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 140.2 0.0 7387.9 0.0 0.0 1.4 9.9 0 93 sd19 13.6 37.6 870.3 303.5 0.0 0.2 3.9 0 13 extended device statistics tty cpu Then after a few minutes... device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 32.0 1375.3 1988.6 10631.2 0.0 8.4 5.9 1 63 sd19 13.0 41.6 701.9 246.7 0.0 0.2 3.8 0 12 extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 13.8 2844.3 883.3 26842.2 0.0 29.9 10.5 2 100 sd19 19.4 52.2 1229.8 408.4 0.0 0.3 4.3 0 17 extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 1.6 889.5 55.6 8856.7 0.0 35.0 39.3 1 100 sd19 22.8 45.6 1459.1 344.3 0.0 0.3 5.0 0 21 Then back to ''normal''... extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 62.0 179.4 3546.5 1086.0 0.0 1.5 6.3 0 48 sd19 15.4 38.8 927.1 223.9 0.0 0.2 3.8 0 14 extended device statistics tty cpu device r/s w/s kr/s kw/s wait actv svc_t %w %b tin tout us sy wt id sd3 26.2 128.6 1476.7 994.8 0.0 0.7 4.3 0 23 sd19 15.8 52.2 998.8 357.7 0.0 0.3 4.0 0 16 During the write problem, all my app servers were hung stuck in write threads. The zfs machines are Apache/webDav boxes. I''m in the process of trying to migrate these luns from hardware raid5 to "Enhanced JBOD" and then create raidz2 devices to see if that helps, but that is going to take months. Any ideas? Pat S. Gary Mills wrote:> On Sat, Apr 18, 2009 at 04:27:55PM -0500, Gary Mills wrote: > >> We have an IMAP server with ZFS for mailbox storage that has recently >> become extremely slow on most weekday mornings and afternoons. When >> one of these incidents happens, the number of processes increases, the >> load average increases, but ZFS I/O bandwidth decreases. Users notice >> very slow response to IMAP requests. On the server, even `ps'' becomes >> slow. >> > > After I moved a couple of Cyrus databases from ZFS to UFS on Sunday > morning, the server seemed to run quite nicely. One of these > databases is memory-mapped by all of the lmtpd and pop3d processes. > The other is opened by all the lmtpd processes. Both were quite > active, with many small writes, so I assumed they''d be better on UFS. > All of the IMAP mailboxes were still on ZFS. > > However, this morning, things went from bad to worse. All writes to > the ZFS filesystems stopped completely. Look at this: > > $ zpool iostat 5 5 > capacity operations bandwidth > pool used avail read write read write > ---------- ----- ----- ----- ----- ----- ----- > space 1.04T 975G 86 67 4.53M 2.57M > space 1.04T 975G 5 0 159K 0 > space 1.04T 975G 7 0 337K 0 > space 1.04T 975G 3 0 179K 0 > space 1.04T 975G 4 0 167K 0 > > `fsstat'' told me that there was both writes and memory-mapped I/O > to UFS, but nothing to ZFS. 
At the same time, the `ps'' command > would hang and could not be interrupted. `truss'' on `ps'' looked > like this, but it eventually also stopped and not be interrupted. > > open("/proc/6359/psinfo", O_RDONLY) = 4 > read(4, "02\0\0\0\0\0\001\0\018D7".., 416) = 416 > close(4) = 0 > open("/proc/12782/psinfo", O_RDONLY) = 4 > read(4, "02\0\0\0\0\0\001\0\0 1EE".., 416) = 416 > close(4) = 0 > > What could cause this sort of behavior? It happened three times today! > >
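One way to narrow down stalls like this: run the pool-level and the
LUN-level views side by side during an incident, so you can tell
whether ZFS has stopped issuing writes or the array has stopped
completing them:

    zpool iostat -v 10     # what ZFS thinks it is sending to each vdev
    iostat -xnz 10         # what the sd/ssd driver sees per LUN

Zero writes in both, with actv near zero, points back up the stack at
ZFS; writes queued with high actv/asvc_t points at the LUN or the path
to it.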
>>>>> "js" == Joerg Schilling <Joerg.Schilling at fokus.fraunhofer.de> writes:js> Do you really like Sun to be forced to verify that the kind of js> such a patch is below the interlectual creation level to be js> able to claim a copyright? the common and IMHO correct practice, and the practice Sun actually uses, is to assume all patches deserve the protection of copyright and get contributor agreements from anyone who submits patches so that Sun chooses the license, _and_ can change the license later (which Linux cannot, which sucks for mostly everyone). Please stop spreading FUD. Especially since you brought us through this exact same thing before the last time someone brought up dual-licensing. js> BTW: GPLv2 is still more open for license combinations than js> GPLv3 is. [...] js> If you like to think about other licenses, you better stay js> with GPLv2. First, there is plain-GPLv2, Linux-modified-GPLv2 with the ``or any later version'''' clause deleted and the suspect ``interpretation'''' of kernel modules, and plain-GPLv3: there are three GPL licenses to worry about. Second, with whatever respect is due to you, I will rather take advice about license compatibility from someone who actually _does_ like to think about other licenses, not from the one whose license actions, whether you personally accept the result as correct or not, led in practice to the fork of cdrkit. http://www.dwheeler.com/essays/gpl-compatible.html Either that, or I''ll just sign a contributor agreement if I want to participate in the ZFS community hosted by Sun, which is really as far as you have to think about licenses w.r.t. this list---are you willing to sign one, or not? Even if you did object to the contributor agreement, which I don''t, there''s not a big enough community around ZFS for a fork, so for now you either sign the agreement or don''t participate, and the license is chosen by Sun and can be dual or single or whatever. It works perfectly, as it was planned to, and there is simply no cause for your FUD. My own view is, I might one day become inclined to avoid ZFS entirely because the license incompatibility with GPL is too constraining and hassling, but once I''ve chosen to use it I won''t mind signing an assignment agreement to cover any patches I made. The quirks of individual licenses often look to me less harmful than Linux and BSD''s lack of clear assignment so they can''t adapt to changing thinking, license compatibility needs, and market conditions (anti-TiVoizing clauses, patent sandboxes, 4-clause -> 2 or 3 clause BSD, Apache 2.0 compatibility). I _like_ assignment and flexibility, and I''m not too worried about someone changing the license out from underneath me, because the OSI won''t approve and community won''t accept any of those old greedy licenses like SCSL and the early Apple licenses that prevents the community from forking to evade an undemocratic license change (like happened to you with cdrkit)---the possibility of this kind of fork is a GOOD thing because it gives us safe flexibility that really is rooted in a community, not just a bunch of blogs and wallpaper. I guess my flexible position depends on how much I''m likely to contribute, though, probably pretty small for me---a lot of larger contributors are stating outright they''re focusing on btrfs, because of license, or license incompatibility. :/ -------------- next part -------------- A non-text attachment was scrubbed... 
Joerg Schilling
2009-Apr-21 19:20 UTC
[zfs-discuss] What causes slow performance under load?
Miles Nordin <carton at Ivy.NET> wrote: ...> chooses the license, _and_ can change the license later (which Linux > cannot, which sucks for mostly everyone). Please stop spreading FUD. > Especially since you brought us through this exact same thing before > the last time someone brought up dual-licensing.Please stop spreading FUD! If you don''t understand the problems from dual licensing, please first try to inform yourself about what happened with OpenOffice and Redhat a few years ago.> First, there is plain-GPLv2, Linux-modified-GPLv2 with the ``or any > later version'''' clause deleted and the suspect ``interpretation'''' of > kernel modules, and plain-GPLv3: there are three GPL licenses to > worry about.You just verified that you don''t understand what you are talking about - sorry. The clause "or any later version" is _not_ part of the GPL. The Linux Kernel of course uses a plain vanilla GPLv2. The clause "or any later version" is even illegal in many juristrictions as these juristrictions forbid to sign a contract that you don''t know at the time you sign. The rest of your text contains a lot more problematic claims, let me delete it because it does not look like you like to discuss things. I am in special very disappointed because you quote a person (Mr. Wheeler) who seems to know few to nothing about licensing and who spreads a lot of FUD :-( The license combination used by cdrtools was verified by several lawywers including Sun Legal and Eben Moglen and no lawyer did find a problem. Finally, with help from Simon Phipps, Debian agreed on March 6th to go back to the original cdrtools. Note: the cdrtools fork "cdrkit" is violating both Copyright law and GPL and cannot be legally distributed. So what is your point? I am a person that tries to bring different license camps together and it seems that I am successful with it - I convinced the *BSD people that there is no problem with adding CDDL code (e.g. Dtrace) to their kernel. I am talking with many people from the Linux camp and it is nice to see that the Linux people who create code do not spread FUD but are interested in a discussion and in exchange of ideas and code. I am attending many Linux events and I am giving talks at these events...You are quoting people who do not contribute to OSS. What code did you write? Where did you try to connect people?>From my investigations, it seems that you did nothing like that.....J?rg -- EMail:joerg at schily.isdn.cs.tu-berlin.de (home) J?rg Schilling D-13353 Berlin js at cs.tu-berlin.de (uni) joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/ URL: http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
>>>>> "js" == Joerg Schilling <Joerg.Schilling at fokus.fraunhofer.de> writes:js> So what is your point? It was nothing to do with combinations of licenses within cdrkit, nor within cdrtools. It was that your changing your project''s license to one incompatible with the GPL led to the forking of a project, so it would be better for someone who cares about maximizing license compatibility so that projects can work together, to seek advice on this elsewhere. js> The clause "or any later version" is even illegal in many js> juristrictions as these juristrictions forbid to sign a js> contract that you don''t know at the time you sign. Has Eben Moglen reviewed this statement, too? I doubt it. There''s a difference between non-lawyers doing their best to repeat the advice of lawyers, and non-lawyers just making stuff up. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090421/1a15c465/attachment.bin>
On Tue, Apr 21, 2009 at 09:34:57AM -0500, Patrick Skerrett wrote:
> I'm fighting with an identical problem here & am very interested in
> this thread.
>
> Solaris 10 127112-11 boxes running ZFS on a fiberchannel raid5 device
> (hardware raid).

You are about a year behind in kernel patches. There is one patch that
addresses similar problems. I'd recommend installing all of the new
patches. This bug seems to be relevant:

    http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6535160

> Randomly one lun on a machine will stop writing for about 10-15
> minutes (during a busy time of day), and then all of a sudden become
> active with a burst of activity. Reads will continue to happen.

One thing that seems to have solved our hang and stall problems is to
set `pg_contig_disable=1' in the kernel. I believe that only systems
with Niagara CPUs are affected. It has to do with kernel code for
handling two different sizes of memory pages. You can find more
information here:

    http://forums.sun.com/thread.jspa?threadID=5257060

Also, open a support case with Sun if you haven't already.

--
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
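For the archives, the pg_contig_disable change amounts to this (the mdb
poke takes effect immediately on the running kernel, so only do it if
you are comfortable with mdb -kw; the /etc/system line makes it
permanent at the next boot):

    # Temporary, on the live system:
    echo "pg_contig_disable/W 1" | mdb -kw

    # Permanent, in /etc/system:
    set pg_contig_disable=1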
David Dyer-Bennet
2009-Apr-22 17:52 UTC
[zfs-discuss] What causes slow performance under load?
On Tue, April 21, 2009 14:20, Joerg Schilling wrote:

> Miles Nordin <carton at Ivy.NET> wrote:
>
>> First, there is plain-GPLv2, Linux-modified-GPLv2 with the "or any
>> later version" clause deleted and the suspect "interpretation" of
>> kernel modules, and plain-GPLv3: there are three GPL licenses to
>> worry about.
>
> You just verified that you don't understand what you are talking
> about - sorry.  The clause "or any later version" is _not_ part of the
> GPL.  The Linux kernel of course uses a plain vanilla GPLv2.
>
> The clause "or any later version" is even illegal in many
> jurisdictions, as these jurisdictions forbid signing a contract that
> you don't know at the time you sign.

So are you saying you've never previously noticed section 14 of the GPL
as displayed at <http://www.gnu.org/copyleft/gpl.html>?  It contains:

    If the Program specifies that a certain numbered version of the GNU
    General Public License "or any later version" applies to it, you
    have the option of following the terms and conditions either of
    that numbered version or of any later version published by the Free
    Software Foundation.  If the Program does not specify a version
    number of the GNU General Public License, you may choose any
    version ever published by the Free Software Foundation.

So you're just plain wrong.  The GPL contains the exact clause you say
it doesn't contain.  Furthermore, it does in fact say that (unless
otherwise restricted by the license grant) one may use "any version ever
published by the Free Software Foundation".  That's not limited to
versions published after the license grant.

-- 
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
On Tue, Apr 21, 2009 at 04:09:03PM -0400, Oscar del Rio wrote:

> There's a similar thread on HIED-EMAILADMIN at LISTSERV.ND.EDU that
> might help, or at least can get you in touch with other university
> admins in a similar situation.
>
> https://listserv.nd.edu/cgi-bin/wa?A1=ind0904&L=HIED-EMAILADMIN
> Thread: mail systems using ZFS filesystems?

Thanks.  Those problems do sound similar.  I also see positive
experiences with T2000 servers, ZFS, and Cyrus IMAP from UC Davis.  None
of the people involved seem to be active on either the ZFS mailing list
or the Cyrus list.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
On Sat, Apr 18, 2009 at 04:27:55PM -0500, Gary Mills wrote:

> We have an IMAP server with ZFS for mailbox storage that has recently
> become extremely slow on most weekday mornings and afternoons.  When
> one of these incidents happens, the number of processes increases, the
> load average increases, but ZFS I/O bandwidth decreases.  Users notice
> very slow response to IMAP requests.  On the server, even `ps' becomes
> slow.

The cause turned out to be this ZFS bug:

    6596237: Stop looking and start ganging

Apparently, the ZFS code was searching the free list looking for the
perfect fit for each write.  With a fragmented pool, this search took a
very long time, delaying the write.  Eventually, the requests arrived
faster than writes could be sent to the devices, causing the server to
be unresponsive.

There isn't a patch for this one yet, but Sun will supply an IDR if you
open a support case.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
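[For anyone trying to confirm they are hitting the same allocator
behaviour before opening a case: a rough way is to time the ZFS
block-allocation path with DTrace during a slow period.  This is only a
sketch; it assumes the fbt probes for metaslab_alloc() are available on
your kernel build (the function may be inlined on some releases).  A
long tail in the histogram while writes are stalled is consistent with
the free-list search described above.]

    #!/usr/sbin/dtrace -s
    /* Latency histogram for ZFS block allocation; run for a minute
     * during a slow period, then press Ctrl-C to see the results. */

    fbt::metaslab_alloc:entry
    {
            self->ts = timestamp;
    }

    fbt::metaslab_alloc:return
    /self->ts/
    {
            @["metaslab_alloc (ns)"] = quantize(timestamp - self->ts);
            self->ts = 0;
    }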
On Tue, Apr 21, 2009 at 3:20 PM, Joerg Schilling
<Joerg.Schilling at fokus.fraunhofer.de> wrote:

> The license combination used by cdrtools was verified by several
> lawyers including Sun Legal and Eben Moglen and no lawyer found a
> problem.

[citation needed]

https://lists.ubuntu.com/archives/ubuntu-news-team/2009-February/000413.html
states that Eben Moglen claimed Ubuntu cannot ship cdrtools, which
certainly seems like "a problem".

https://lists.ubuntu.com/archives/ubuntu-devel-discuss/2009-January/006688.html
appears to be a reasonable statement of Ubuntu's position prior to that
decision.

> Finally, with help from Simon Phipps, Debian agreed on March 6th to go
> back to the original cdrtools.

[citation needed]

According to Debian's packaging database
(http://packages.qa.debian.org/c/cdrtools.html), cdrtools still hasn't
been touched since 2006 in Debian.  Neither cdrkit.org, nor debian-legal
in March (http://osdir.com/ml/debian-legal/2009-03/threads.html), nor
debian-devel in March
(http://osdir.com/ml/debian-devel/2009-03/threads.html), nor even a
Google search for cdrtools Debian
(http://www.google.com/search?q=cdrtools+Debian) has any mention of
Debian agreeing to go back to the original cdrtools.  Where and when was
this discussed?

> Note: the cdrtools fork "cdrkit" is violating both copyright law and
> the GPL and cannot be legally distributed.

[citation needed]

According to whom is this the case?  A quick search turns up no claims
but your own.

- Rich
Joerg Schilling
2009-May-13 11:53 UTC
[zfs-discuss] What causes slow performance under load?
Rince <rincebrain at gmail.com> wrote:

> On Tue, Apr 21, 2009 at 3:20 PM, Joerg Schilling
> <Joerg.Schilling at fokus.fraunhofer.de> wrote:
> > The license combination used by cdrtools was verified by several
> > lawyers including Sun Legal and Eben Moglen and no lawyer found a
> > problem.
>
> [citation needed]

What is the reason for restarting your FUD campaign?  Please stop this;
it is completely off topic.

Jörg

-- 
EMail: joerg at schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js at cs.tu-berlin.de (uni)
       joerg.schilling at fokus.fraunhofer.de (work) Blog: http://schily.blogspot.com/
URL:   http://cdrecord.berlios.de/private/ ftp://ftp.berlios.de/pub/schily
On Mon, Apr 27, 2009 at 04:47:27PM -0500, Gary Mills wrote:

> On Sat, Apr 18, 2009 at 04:27:55PM -0500, Gary Mills wrote:
> > We have an IMAP server with ZFS for mailbox storage that has
> > recently become extremely slow on most weekday mornings and
> > afternoons.  When one of these incidents happens, the number of
> > processes increases, the load average increases, but ZFS I/O
> > bandwidth decreases.  Users notice very slow response to IMAP
> > requests.  On the server, even `ps' becomes slow.
>
> The cause turned out to be this ZFS bug:
>
>     6596237: Stop looking and start ganging
>
> Apparently, the ZFS code was searching the free list looking for the
> perfect fit for each write.  With a fragmented pool, this search took
> a very long time, delaying the write.  Eventually, the requests
> arrived faster than writes could be sent to the devices, causing the
> server to be unresponsive.

We also had another problem, due to this ZFS bug:

    6591646: Hang while trying to enter a txg while holding a txg open

This was a deadlock, with one thread blocking hundreds of other threads.
Our symptom was that all zpool I/O would stop and the `ps' command would
hang.  A reboot was the only way out.

If you have a support contract, Sun will supply an IDR that fixes both
problems.

-- 
-Gary Mills-    -Unix Support-    -U of M Academic Computing and Networking-
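[A rough way to tell the deadlock (6591646) apart from the slow
allocator (6596237) while the box is wedged is to look at kernel thread
stacks: the deadlock shows many threads parked in cv_wait() behind the
transaction-group machinery (e.g. txg_wait_open()), all blocked on one
holder.  A sketch, assuming you can still get a root shell and that your
mdb build includes the ::stacks dcmd (older Solaris 10 mdb may not, in
which case only the ::threadlist output applies):]

    # Dump all kernel thread stacks to a file for later inspection:
    echo "::threadlist -v" | mdb -k > /var/tmp/threads.out

    # If ::stacks is available, get a deduplicated summary of threads
    # blocked in ZFS code:
    echo "::stacks -m zfs" | mdb -k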