Hi! We're trying to pinpoint our performance issues and we could use all the help the community can provide. We're running the latest version of Nexenta on a pretty powerful machine (4x Xeon 7550, 256GB RAM, 12x 100GB Samsung SSDs for the cache, 50GB Samsung SSD for the ZIL, 10GbE on a dedicated switch, 11 pairs of 15K HDDs for the pool). We're connecting a single Linux box to it using iSCSI and, using "top", we see it spends all its time in iowait. Using "zpool iostat" on the Nexenta box I can see that the disks are barely working, typically reading <500K/sec and doing about 50 IOPS each - they can obviously do much better. We see the same pattern whether we're doing SQL queries (MySQL) or simply doing file copies. What's particularly curious is that with file copies, there is a long pause (can be a few seconds) between each file that is copied. Even reading a folder (ls) is slow.

Where should we look? What more information should I provide?

Thanks a lot!
Ian
Roy Sigurd Karlsbakk
2010-Oct-08 17:06 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> Where should we look at? What more information should I provide?

Start with 'iostat -xdn 1'. That'll provide info about the actual device I/O.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
Edward Ned Harvey
2010-Oct-09 01:45 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Ian D
>
> the help the community can provide. We're running the latest version of
> Nexenta on a pretty powerful machine (4x Xeon 7550, 256GB RAM, 12x
> 100GB Samsung SSDs for the cache, 50GB Samsung SSD for the ZIL, 10GbE
> on a dedicated switch, 11 pairs of 15K HDDs for the pool). We're

If you have a single SSD for dedicated log, that will surely be a bottleneck for you. All sync writes (which are all writes in the case of iSCSI) will hit the log device before the main pool. But you should still be able to read fast...

Also, with so much cache & RAM, it wouldn't surprise me a lot to see really low disk usage just because it's already cached. But that doesn't explain the ridiculously slow performance...

I'll suggest trying something completely different, like, dd if=/dev/zero bs=1024k | pv | ssh othermachine 'cat > /dev/null' ... just to verify there isn't something horribly wrong with your hardware (network).

In Linux, run "ifconfig" ... you should see "errors:0"

Make sure each machine has an entry for the other in the hosts file. I haven't seen that cause a problem for iSCSI, but certainly for ssh.
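For reference, here is roughly what that end-to-end sanity check looks like (a sketch only: the interface name eth0 is a placeholder, and pv needs to be installed on the sending box):

  # raw network throughput, no iSCSI/ZFS involved - pv prints the transfer rate
  dd if=/dev/zero bs=1024k | pv | ssh othermachine 'cat > /dev/null'

  # on the Linux client, look for interface errors or drops
  ifconfig eth0 | grep -i errors

  # on the Nexenta/Solaris side, check link state and error counters
  dladm show-link
  netstat -i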
To see if it is iscsi related or zfs, have you tried to test performance over nfs to a zfs filesystem instead of a zvol?

SR
> I'll suggest trying something completely different, like, dd if=/dev/zero
> bs=1024k | pv | ssh othermachine 'cat > /dev/null' ... Just to verify there
> isn't something horribly wrong with your hardware (network).
>
> In Linux, run "ifconfig" ... You should see "errors:0"
>
> Make sure each machine has an entry for the other in the hosts file. I
> haven't seen that cause a problem for iscsi, but certainly for ssh.

We'll try that, thanks!

Here are some numbers... This is a pastebin from iostat running on the Linux box:
http://pastebin.com/8mN8mchH

This is from the Nexenta box:
http://pastebin.com/J1E4V1b3

A couple of notes: we know the "Pool_sata" pool is resilvering, but we're concerned about the performance of the other pool ("Pool_sas"). We also know that we're not using jumbo frames because for some reason they make the Linux box crash. Could that explain it all?

Thanks for helping out!
Ian
Roy Sigurd Karlsbakk
2010-Oct-09 13:42 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> A couple of notes: we know the "Pool_sata" pool is resilvering, but we're concerned about the performance of the other pool ("Pool_sas"). We also know that we're not using jumbo frames because for some reason they make the Linux box crash. Could that explain it all?

What sort of drives are these? It looks like iSCSI or FC device names, and not local drives.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
> A couple of notes: we know the "Pool_sata" is resilvering, but we're concerned about the performance of the other pool ("Pool_sas"). We also know that we're not using jumbo frames as for some reason it makes the Linux box crash. Could that explain it all?

> What sort of drives are these? It looks like iSCSI or FC device names, and not local drives

The "Pool_sas" pool is made of 15K SAS drives on external JBOD arrays (Dell MD1000) connected on mirrored LSI 9200-8e SAS HBAs.

The "Pool_sata" pool is made of SATA drives on other JBODs.

The shorter device names are from the onboard Dell H700 RAID adapter (the SSDs and system pool are local) while the longer ones are from the LSI cards.

Does that make sense?
Roy Sigurd Karlsbakk
2010-Oct-09 14:12 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
>> What sort of drives are these? It looks like iSCSI or FC device names, and not local drives
>
> The "Pool_sas" is made of 15K SAS drives on external JBOD arrays (Dell MD1000) connected on mirrored LSI 9200-8e SAS HBAs.
>
> The "Pool_sata" is made of SATA drives on other JBODs.
>
> The shorter device names are from the onboard Dell H700 RAID adapter (the SSDs and system pool are local) while the longer ones are from the LSI cards.
>
> Does that make sense?

It does, somehow. Can you try to run some local benchmarking on those pools to see if they behave well locally?

And please don't use HTML mail - it messes up formatting and leads to top-posting and worse.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
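As a crude local baseline, something along these lines would do (just a sketch, assuming the pool is mounted at /Pool_sas and has room for a test file larger than RAM so the ARC doesn't hide the disks):

  # local sequential write, ~300GB of zeros
  dd if=/dev/zero of=/Pool_sas/ddtest bs=1024k count=300000

  # local sequential read of the same file
  dd if=/Pool_sas/ddtest of=/dev/null bs=1024k

  # in another terminal, watch per-device activity while it runs
  iostat -xdn 1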
> If you have a single SSD for dedicated log, that will surely be a bottleneck for you.

We're aware of that. The original plan was to use mirrored DDRDrive X1s but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we figure out our most pressing performance problems.

> Also, with so much cache & ram, it wouldn't surprise me a lot to see really low disk usage just because it's already cached. But that doesn't explain the ridiculously slow performance...

We do see the load of those drives get much higher when we have several servers (all Linux boxes) connected to it at once. The problem really seems to be at the individual Linux box level - something to do with iSCSI/networking, even though the 10GbE network can certainly handle a lot more than that.

> I'll suggest trying something completely different, like, dd if=/dev/zero bs=1024k | pv | ssh othermachine 'cat > /dev/null' ... Just to verify there isn't something horribly wrong with your hardware (network).
>
> In Linux, run "ifconfig" ... You should see "errors:0"

We'll do that. Thanks!
Ian
> We're aware of that. The original plan was to use mirrored DDRDrive X1s but we're experiencing stability issues. Chris George is being very responsive and will help us investigate that once we figure out our most pressing performance problems.

I feel I need to add to my comment that our experience with the X1s has been stellar. The thing is amazingly fast and I can't wait to put it back into production. We've been having issues making our system stable simply because it mixes many, many devices that fight each other for resources - it is in no way the X1's fault and I sure don't want to give anybody that impression. Chris George and Richard Elling have been very responsive helping us with this and I'll be happy to share our success with the community once we figure it out.

Ian
Marty Scholes
2010-Oct-10 01:47 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
Ok, let's think about this for a minute.

The log drive is c1t11d0 and it appears to be almost completely unused, so we probably can rule out a ZIL bottleneck. I run Ubuntu booting iSCSI against OSol 128a and the writes do not appear to be synchronous. So, writes aren't the issue.

From the Linux side, it appears the drive in question is either sdb or dm-3, and both appear to be the same drive. Since switching to zfs, my Linux-disk-fu has become a bit rusty. Is one an alias for the other?

The Linux disk appears to top out at around 50MB/s or so. That looks suspiciously like it is running on a gigabit connection with some problems.

I agree that the zfs side looks like it has plenty of bandwidth and iops to spare.

From what I can see, you can narrow the search down to a few things:
1. Linux network stack
2. Linux iSCSI issues
3. Network cabling/switch between the devices
4. Nexenta CPU constraints (unlikely, I know, but let's be thorough)
5. Nexenta network stack
6. COMSTAR problems

As another poster pointed out, testing some NFS and ssh traffic can eliminate 1, 3 and 5 above. I recommend going down the list and testing every piece in isolation as much as possible to narrow the list.

Good luck and let us know what you learn.

Cheers,
Marty
> From the Linux side, it appears the drive in question is either sdb or dm-3, and both appear to be the same drive. Since switching to zfs, my Linux-disk-fu has become a bit rusty. Is one an alias for the other?

Yes, dm-3 is the alias created by LVM while sdb is the "physical" (or raw) device.

> The Linux disk appears to top out at around 50MB/s or so. That looks suspiciously like it is running on a gigabit connection with some problems.

That's what I believe too. It's a 10GbE connection...

> From what I can see, you can narrow the search down to a few things:
> 1. Linux network stack
> 2. Linux iSCSI issues
> 3. Network cabling/switch between the devices
> 4. Nexenta CPU constraints (unlikely, I know, but let's be thorough)
> 5. Nexenta network stack
> 6. COMSTAR problems

We'll run your tests and share the results. It is unlikely to be the CPUs on either side; they are the latest from Intel (Nexenta box) and AMD (Linux box) and are barely used.

Thanks for helping!
Ian
Here are some more findings...

The Nexenta box has 3 pools:
syspool: made of 2 mirrored (hardware RAID) local SAS disks
pool_sas: made of 22 15K SAS disks in ZFS mirrors on 2 JBODs on 2 controllers
pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller

When we copy data from any Linux box to either pool_sas or pool_sata, it is painfully slow.

When we copy data from any Linux box directly to the syspool, it is plenty fast.

When we copy data locally on the Nexenta box from the syspool to either pool_sas or pool_sata, it is crazy fast.

We also see the same pattern whether we use iSCSI or NFS. We've also tested using different NICs (some at 1GbE, some at 10GbE) and even tried bypassing the switch by directly connecting the two boxes with a cable - and it didn't make any difference. We've also tried not using the SSD for the ZIL.

So... We've ruled out iSCSI, the networking, the ZIL device, even the HBAs, as it is fast when it is done locally.

Where should we look next?

Thank you all for your help!
Ian
Marty Scholes
2010-Oct-13 16:37 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> Here are some more findings...
>
> The Nexenta box has 3 pools:
> syspool: made of 2 mirrored (hardware RAID) local SAS disks
> pool_sas: made of 22 15K SAS disks in ZFS mirrors on 2 JBODs on 2 controllers
> pool_sata: made of 42 SATA disks in 6 RAIDZ2 vdevs on a single controller
>
> When we copy data from any Linux box to either pool_sas or pool_sata, it is painfully slow.
>
> When we copy data from any Linux box directly to the syspool, it is plenty fast.
>
> When we copy data locally on the Nexenta box from the syspool to either pool_sas or pool_sata, it is crazy fast.
>
> We also see the same pattern whether we use iSCSI or NFS. We've also tested using different NICs (some at 1GbE, some at 10GbE) and even tried bypassing the switch by directly connecting the two boxes with a cable - and it didn't make any difference. We've also tried not using the SSD for the ZIL.
>
> So... We've ruled out iSCSI, the networking, the ZIL device, even the HBAs, as it is fast when it is done locally.
>
> Where should we look next?
>
> Thank you all for your help!
> Ian

Looking at the list suggested earlier:
1. Linux network stack
2. Linux iSCSI issues
3. Network cabling/switch between the devices
4. Nexenta CPU constraints (unlikely, I know, but let's be thorough)
5. Nexenta network stack
6. COMSTAR problems

It looks like you have ruled out everything. The only thing that still stands out is that network operations (iSCSI and NFS) to external drives are slow, correct?

Just for completeness, what happens if you scp a file to the three different pools? If the results are the same as NFS and iSCSI, then I think the network can be ruled out.

I would be leaning toward thinking there is some mismatch between the network protocols and the external controllers/cables/arrays. Are the controllers the same hardware/firmware/driver for the internal vs. external drives?

Keep digging. I think you are getting close.

Cheers,
Marty
> The only thing that still stands out is that network operations (iSCSI and NFS) to external drives are slow, correct?

Yes, that pretty much sums it up.

> Just for completeness, what happens if you scp a file to the three different pools? If the results are the same as NFS and iSCSI, then I think the network can be ruled out.

This is where it gets interesting... When we use the external disks as a local file system of the Nexenta box then it is fast. We can scp files through the network from the Linux box to the external drives without problem, as long as we address the local file system (wherever the disks are). BUT, whenever iSCSI or NFS are involved it all goes sour.

> I would be leaning toward thinking there is some mismatch between the network protocols and the external controllers/cables/arrays.

Sounds like it. The arrays are plenty fast on the same machine/controllers/cables/arrays when we're not using a network protocol.

> Are the controllers the same hardware/firmware/driver for the internal vs. external drives?

No. The internal drives (the syspool and 13 SSDs for the cache) are on a Dell H700 RAID controller. The external drives are on 3x LSI 9200-8e SAS HBAs connected to 7x Dell MD1000 and MD1200 JBODs.

Thanks a lot!
Orvar Korvar
2010-Oct-13 19:45 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
Would it be possible to install OpenSolaris to a USB disk and boot from it and try? That would take 1-2h and could maybe help you narrow things down further?
> Would it be possible to install OpenSolaris to a USB
> disk and boot from it and try? That would take 1-2h
> and could maybe help you narrow things down further?

I'm a little afraid to lose my data. It wouldn't be the end of the world, but I'd rather avoid that. I'll do it as a last resort.

Ian
More stuff... We ran the same tests on another Nexenta box with fairly similar hardware and had the exact same issues. The two boxes have the same models of HBAs, NICs and JBODs, but different CPUs and motherboards.

Our next test is to try with a different kind of HBA; we have a Dell H800 lying around.

Ian
erik.ableson
2010-Oct-14 07:37 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
On 13 oct. 2010, at 18:37, Marty Scholes wrote:

> The only thing that still stands out is that network operations (iSCSI and NFS) to external drives are slow, correct?
>
> Just for completeness, what happens if you scp a file to the three different pools? If the results are the same as NFS and iSCSI, then I think the network can be ruled out.
>
> I would be leaning toward thinking there is some mismatch between the network protocols and the external controllers/cables/arrays.

Sounding more and more like a networking issue - are the network cards set up in an aggregate? I had some similar issues on GbE where there was a mismatch between the aggregate settings on the switches and the LACP settings on the server. Basically the network was wasting a ton of time trying to renegotiate the LACP settings and slowing everything down.

Ditto for the Linux networking - single port or aggregated dual port?

Erik
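A quick way to check for that kind of mismatch on both ends (a sketch; the bond name below is an example, and these only apply if aggregation is configured at all):

  # Nexenta/Solaris: list aggregates, their ports and LACP state
  dladm show-aggr
  dladm show-aggr -L

  # Linux: if bonding is in use, inspect the negotiated state
  cat /proc/net/bonding/bond0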
> Sounding more and more like a networking issue - are the network cards set up in an aggregate? I had some similar issues on GbE where there was a mismatch between the aggregate settings on the switches and the LACP settings on the server. Basically the network was wasting a ton of time trying to renegotiate the LACP settings and slowing everything down.
>
> Ditto for the Linux networking - single port or aggregated dual port?

We're only using one port on each box (we've never been able to saturate them yet), but maybe they are somehow set up wrong. We'll investigate.
I've had a few people sending emails directly suggesting it might have something to do with the ZIL/SLOG. I guess I should have said that the issue happens both ways, whether we copy TO or FROM the Nexenta box.
> Our next test is to try with a different kind of HBA,
> we have a Dell H800 lying around.

OK... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need to reboot the server every time we replace a failed drive. That's not good...

What can we do with the LSI HBA? Would you call LSI's support? Is there anything we should try besides the obvious (using the latest firmware/driver)?

To sum up the issue: when we copy files from/to the JBODs connected to that HBA using NFS/iSCSI, we get a slow transfer rate (<20MB/s) and a 1-2 second pause between each file. When we do the same experiment locally, using the external drives as a local volume (no NFS/iSCSI involved), then it goes upward of 350MB/sec with no delay between files.

Ian
Marion Hakanson
2010-Oct-14 22:15 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
reward72 at hotmail.com said:
> ok... we're making progress. After swapping the LSI HBA for a Dell H800 the
> issue disappeared. Now, I'd rather not use those controllers because they
> don't have a JBOD mode. We have no choice but to make individual RAID0
> volumes for each disk which means we need to reboot the server every time we
> replace a failed drive. That's not good...

Earlier you said you had eliminated the ZIL as an issue, but one difference between the Dell H800 and the LSI HBA is that the H800 has an NV cache (if you have the battery backup present).

A very simple test would be, when things are running slow, to try disabling the ZIL temporarily, to see if that makes things go fast.

Regards,
Marion
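For reference, on builds of that era the ZIL was commonly disabled for this kind of test with the zil_disable tunable; newer builds expose a per-dataset sync property instead. This weakens synchronous-write guarantees, so it is only for a controlled experiment. A sketch (the dataset name is a placeholder):

  # disable the ZIL globally (takes effect for datasets mounted/shared after the change)
  echo zil_disable/W0t1 | mdb -kw
  # ... remount/re-share the dataset, rerun the slow workload, compare ...
  # re-enable afterwards
  echo zil_disable/W0t0 | mdb -kw

  # on builds that have the sync property, the per-dataset equivalent would be:
  # zfs set sync=disabled Pool_sas/somedataset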
> Earlier you said you had eliminated the ZIL as an issue, but one difference
> between the Dell H800 and the LSI HBA is that the H800 has an NV cache (if
> you have the battery backup present).
>
> A very simple test would be when things are running slow, try disabling
> the ZIL temporarily, to see if that makes things go fast.

We'll try that, but keep in mind that we're having the issue even when we READ from the JBODs, not just during WRITES.
Edward Ned Harvey
2010-Oct-15 01:54 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Ian D
>
> ok... we're making progress. After swapping the LSI HBA for a Dell
> H800 the issue disappeared. Now, I'd rather not use those controllers
> because they don't have a JBOD mode. We have no choice but to make
> individual RAID0 volumes for each disk which means we need to reboot
> the server every time we replace a failed drive. That's not good...

I believe those are rebranded LSI controllers. I know the PERC controllers are. I use MegaCLI on PERC systems for this purpose. You should be able to find a utility which allows you to do this sort of thing while the OS is running.

If you happen to find that MegaCLI is the right tool for your hardware, let me know, and I'll paste a few commands here, which will simplify your life. When I first started using it, I found it terribly cumbersome. But now I've gotten used to it, and MegaCLI commands just roll off the tongue.

> To sum up the issue, when we copy files from/to the JBODs connected to
> that HBA using NFS/iSCSI, we get slow transfer rates <20MB/s and a 1-2
> second pause between each file. When we do the same experiment
> locally using the external drives as a local volume (no NFS/iSCSI
> involved) then it goes upward of 350MB/sec with no delay between files.

Baffling.
Wilkinson, Alex
2010-Oct-15 02:05 UTC
[zfs-discuss] Performance issues with iSCSI under Linux [SEC=UNCLASSIFIED]
On Thu, Oct 14, 2010 at 09:54:09PM -0400, Edward Ned Harvey wrote:

> If you happen to find that MegaCLI is the right tool for your hardware, let
> me know, and I'll paste a few commands here, which will simplify your life.
> When I first started using it, I found it terribly cumbersome. But now I've
> gotten used to it, and MegaCLI commands just roll off the tongue.

Can you paste them anyway?

-Alex

IMPORTANT: This email remains the property of the Department of Defence and is subject to the jurisdiction of section 70 of the Crimes Act 1914. If you have received this email in error, you are requested to contact the sender and delete the email.
Edward Ned Harvey
2010-Oct-15 04:04 UTC
[zfs-discuss] Performance issues with iSCSI under Linux [SEC=UNCLASSIFIED]
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Wilkinson, Alex
>
> can you paste them anyway ?

Note: If you have more than one adapter, I believe you can specify -aALL in the commands below, instead of -a0.

I have 2 disks (slots 4 & 5) that are removable and rotate offsite for backups.

To remove disks safely:
  zpool export removable-pool
  export EnclosureID=`MegaCli -PDList -a0 | grep 'Enclosure Device ID' | uniq | sed 's/.* //'`
  for DriveNum in 4 5 ; do MegaCli -PDOffline PhysDrv[${EnclosureID}:${DriveNum}] -a0 ; done
Disks blink alternate orange & green. Safe to remove.

To insert disks safely:
  Insert disks.
  MegaCli -CfgForeign -Clear -a0
  MegaCli -CfgEachDskRaid0 -a0
  devfsadm -Cv
  zpool import -a

To clear foreign config off drives:
  MegaCli -CfgForeign -Clear -a0

To create a one-disk raid0 for each disk that's not currently part of another group:
  MegaCli -CfgEachDskRaid0 -a0

To configure all drives WriteThrough:
  MegaCli -LDSetProp WT Lall -aALL

To configure all drives WriteBack:
  MegaCli -LDSetProp WB Lall -aALL
Ian,

It would help to have some config detail (e.g. what options are you using? zpool status output; property lists for specific filesystems and zvols; etc.)

Some basic Solaris stats can be very helpful too (e.g. peak flow samples of vmstat 1, mpstat 1, iostat -xnz 1, etc.)

It would also be great to know how you are running your tests. I'd also like to know what version of NFS and mount options. A network trace down to NFS RPC or iSCSI operation level with timings would be great too.

I'm wondering whether your HBA has a write-through or write-back cache enabled? The latter might make things very fast, but could put data at risk if not sufficiently non-volatile.

Cheers,
Phil

On 14 Oct 2010, at 22:02, Ian D <reward72 at hotmail.com> wrote:

>> Our next test is to try with a different kind of HBA,
>> we have a Dell H800 lying around.
>
> ok... we're making progress. After swapping the LSI HBA for a Dell H800 the issue disappeared. Now, I'd rather not use those controllers because they don't have a JBOD mode. We have no choice but to make individual RAID0 volumes for each disk, which means we need to reboot the server every time we replace a failed drive. That's not good...
>
> What can we do with the LSI HBA? Would you call LSI's support? Is there anything we should try besides the obvious (using the latest firmware/driver)?
>
> To sum up the issue, when we copy files from/to the JBODs connected to that HBA using NFS/iSCSI, we get slow transfer rates <20MB/s and a 1-2 second pause between each file. When we do the same experiment locally using the external drives as a local volume (no NFS/iSCSI involved) then it goes upward of 350MB/sec with no delay between files.
>
> Ian
Edward Ned Harvey
2010-Oct-15 12:36 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Phil Harman
>
> I'm wondering whether your HBA has a write-through or write-back cache
> enabled? The latter might make things very fast, but could put data at
> risk if not sufficiently non-volatile.

He already said he has SSDs for dedicated log. This means the best solution is to disable WriteBack and just use WriteThrough. Not only is it more reliable than WriteBack, it's faster.

And I know I've said this many times before, but I don't mind repeating: if you have slog devices, then surprisingly, it actually hurts performance to enable the WriteBack on the HBA. Think of it like this:

Speed of a "naked" disk: 1.0
Speed of a disk with WriteBack: 2.2
Speed of a disk with slog and WB: 2.8
Speed of a disk with slog and no WB: 3.0

Of course those are really rough numbers that vary by architecture and usage patterns. But you get the idea. The consistent result is that disk with slog is the fastest, with WB disabled.
As I have mentioned already, we have the same performance issues whether we READ or we WRITE to the array; shouldn't that rule out caching issues?

Also, we can get great performance with the LSI HBA if we use the JBODs as a local file system. The issues only arise when it is done through iSCSI and NFS.

I'm opening tickets with LSI to see if they can help.

Thanks all!
Ian
> He already said he has SSDs for dedicated log. This means the best
> solution is to disable WriteBack and just use WriteThrough. Not only is it
> more reliable than WriteBack, it's faster.
>
> And I know I've said this many times before, but I don't mind repeating: if
> you have slog devices, then surprisingly, it actually hurts performance to
> enable the WriteBack on the HBA.

The HBA that gives us problems is an LSI 9200-16e, which has no cache whatsoever. We do get great performance with a Dell H800 that has cache. We'll use H800s if we have to, but I really would like to find a way to make the LSIs work.

Thanks!
Marty Scholes
2010-Oct-15 16:37 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> I've had a few people sending emails directly
> suggesting it might have something to do with the
> ZIL/SLOG. I guess I should have said that the issue
> happens both ways, whether we copy TO or FROM the
> Nexenta box.

You mentioned a second Nexenta box earlier. To rule out client-side issues, have you considered testing with Nexenta as the iSCSI/NFS client?
As I have mentioned already, it would be useful to know more about the config, how the tests are being done, and to see some basic system performance stats.

On 15/10/2010 15:58, Ian D wrote:
> As I have mentioned already, we have the same performance issues whether we READ or we WRITE to the array; shouldn't that rule out caching issues?
>
> Also, we can get great performance with the LSI HBA if we use the JBODs as a local file system. The issues only arise when it is done through iSCSI and NFS.
>
> I'm opening tickets with LSI to see if they can help.
>
> Thanks all!
> Ian
> You mentioned a second Nexenta box earlier. To rule
> out client-side issues, have you considered testing
> with Nexenta as the iSCSI/NFS client?

If you mean running the NFS client AND server on the same box, then yes, and it doesn't show the same performance issues. It's only when a Linux box SENDS/RECEIVES data to the NFS/iSCSI shares that we have problems. But if the Linux box sends/receives files through scp to the external disks mounted by the Nexenta box as a local filesystem, then there is no problem.

Ian
> As I have mentioned already, it would be useful to
> know more about the config, how the tests are being
> done, and to see some basic system performance stats.

I will shortly. Thanks!
Darren J Moffat
2010-Oct-15 18:14 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
On 15/10/2010 19:09, Ian D wrote:
> It's only when a Linux box SENDS/RECEIVES data to the NFS/iSCSI shares that we have problems. But if the Linux box sends/receives files through scp to the external disks mounted by the Nexenta box as a local filesystem then there is no problem.

Does the Linux box have the same issue to any other server?

What if the client box isn't Linux but Solaris or Windows or MacOS X?

--
Darren J Moffat
> Does the Linux box have the same issue to any other server?
> What if the client box isn't Linux but Solaris or Windows or MacOS X?

That would be a good test. We'll try that.
After contacting LSI they say that the 9200-16e HBA is not supported in OpenSolaris, just Solaris. Aren't Solaris drivers the same as OpenSolaris?

Is there anyone here using 9200-16e HBAs? What about the 9200-8e? We have a couple lying around and we'll test one shortly.

Ian
A little setback... We found out that we also have the issue with the Dell H800 controllers, not just the LSI 9200-16e. With the Dell it's initially faster as we benefit from the cache, but after a little while it goes sour - from 350MB/sec down to less than 40MB/sec. We've also tried with an LSI 9200-8e, with the same results.

So to recap... No matter what HBA we use, copying through the network to/from the external drives is painfully slow when access is done through either NFS or iSCSI. HOWEVER, it is plenty fast when we do an scp where the data is written to the external drives (or internal ones for that matter) when they are seen by the Nexenta box as local drives - i.e. when neither NFS nor iSCSI are involved.

What now? :)
erik.ableson
2010-Oct-15 20:41 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
On 15 oct. 2010, at 22:19, Ian D wrote:

> A little setback... We found out that we also have the issue with the Dell H800 controllers, not just the LSI 9200-16e. With the Dell it's initially faster as we benefit from the cache, but after a little while it goes sour - from 350MB/sec down to less than 40MB/sec. We've also tried with an LSI 9200-8e, with the same results.
>
> So to recap... No matter what HBA we use, copying through the network to/from the external drives is painfully slow when access is done through either NFS or iSCSI. HOWEVER, it is plenty fast when we do an scp where the data is written to the external drives (or internal ones for that matter) when they are seen by the Nexenta box as local drives - i.e. when neither NFS nor iSCSI are involved.

Sounds an awful lot like client-side issues, coupled possibly with networking problems.

Have you looked into disabling the Nagle algorithm on the client side? That's something that can impact both iSCSI and NFS badly, but ssh is usually not as affected... I vaguely remember that being a real performance killer on some Linux versions.

Another thing to check would be to ensure that noatime is set so that your reads aren't triggering writes across the network as well.

Cheers,
Erik
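For illustration, the noatime part might look like this on the Linux client (a sketch only; server names, export paths and device names are placeholders, and the rsize/wsize values are just examples):

  # NFS mount with noatime so reads don't generate access-time writes
  mount -t nfs -o rw,hard,intr,noatime,rsize=32768,wsize=32768 nexenta:/volumes/pool_sas/share /mnt/sas

  # likewise for a filesystem sitting on top of an iSCSI LUN
  mount -o noatime /dev/sdb1 /mnt/iscsi

Nagle is a per-socket setting (TCP_NODELAY), so for iSCSI it depends on what the initiator exposes; check whether your open-iscsi build has a no-delay option rather than assuming one exists.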
> -----Original Message-----
> From: zfs-discuss-bounces at opensolaris.org
> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Ian D
> Sent: Friday, October 15, 2010 4:19 PM
> To: zfs-discuss at opensolaris.org
> Subject: Re: [zfs-discuss] Performance issues with iSCSI under Linux
>
> A little setback... We found out that we also have the
> issue with the Dell H800 controllers, not just the LSI
> 9200-16e. With the Dell it's initially faster as we benefit
> from the cache, but after a little while it goes sour - from
> 350MB/sec down to less than 40MB/sec. We've also tried with
> an LSI 9200-8e, with the same results.
>
> So to recap... No matter what HBA we use, copying through
> the network to/from the external drives is painfully slow
> when access is done through either NFS or iSCSI. HOWEVER, it
> is plenty fast when we do an scp where the data is written to
> the external drives (or internal ones for that matter) when
> they are seen by the Nexenta box as local drives - i.e. when
> neither NFS nor iSCSI are involved.

Has anyone suggested either removing L2ARC/SLOG entirely or relocating them so that all devices are coming off the same controller? You've swapped the external controller but the H700 with the internal drives could be the real culprit. Could there be issues with cross-controller IO in this case? Does the H700 use the same chipset/driver as the other controllers you've tried?

I don't have a good understanding of where the various software components here fit together, but it seems like the problem is not with the controller(s) but with whatever is queueing network IO requests to the storage subsystem (or controlling queues/buffers/etc for this). Do NFS and iSCSI share a code path for this?

-Will
> Has anyone suggested either removing L2ARC/SLOG
> entirely or relocating them so that all devices are
> coming off the same controller? You've swapped the
> external controller but the H700 with the internal
> drives could be the real culprit. Could there be
> issues with cross-controller IO in this case? Does
> the H700 use the same chipset/driver as the other
> controllers you've tried?

We'll try that. We have a couple of other devices we could use for the SLOG, like a DDRDrive X1 and an OCZ Z-Drive, which are both PCIe cards and don't use the local controller.

Thanks
On Oct 15, 2010, at 5:34 PM, Ian D <reward72 at hotmail.com> wrote:

>> Has anyone suggested either removing L2ARC/SLOG
>> entirely or relocating them so that all devices are
>> coming off the same controller? You've swapped the
>> external controller but the H700 with the internal
>> drives could be the real culprit. Could there be
>> issues with cross-controller IO in this case? Does
>> the H700 use the same chipset/driver as the other
>> controllers you've tried?
>
> We'll try that. We have a couple of other devices we could use for the SLOG, like a DDRDrive X1 and an OCZ Z-Drive, which are both PCIe cards and don't use the local controller.

What mount options are you using on the Linux client for the NFS share?

-Ross
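For reference, the effective options are easy to read back on the client (a quick sketch):

  # NFS mounts and the options actually negotiated
  nfsstat -m

  # all mounts, including filesystems sitting on iSCSI LUNs
  cat /proc/mounts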
Some numbers...

zpool status

  pool: Pool_sas
 state: ONLINE
  scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        Pool_sas                   ONLINE       0     0     0
          c4t5000C5000006A6D3d0    ONLINE       0     0     0
          c4t5000C5000006A777d0    ONLINE       0     0     0
          c4t5000C5000006AA43d0    ONLINE       0     0     0
          c4t5000C5000006AC4Fd0    ONLINE       0     0     0
          c4t5000C5000006AEF7d0    ONLINE       0     0     0
          c4t5000C5000006B27Fd0    ONLINE       0     0     0
          c4t5000C5000006B28Bd0    ONLINE       0     0     0
          c4t5000C5000006B46Bd0    ONLINE       0     0     0
          c4t5000C5000006B563d0    ONLINE       0     0     0
          c4t5000C5000006B643d0    ONLINE       0     0     0
          c4t5000C5000006B6D3d0    ONLINE       0     0     0
          c4t5000C5000006BBE7d0    ONLINE       0     0     0
          c4t5000C5000006C407d0    ONLINE       0     0     0
          c4t5000C5000006C657d0    ONLINE       0     0     0

errors: No known data errors

  pool: Pool_test
 state: ONLINE
  scan: none requested
config:

        NAME                       STATE     READ WRITE CKSUM
        Pool_test                  ONLINE       0     0     0
          c4t5000C5002103F093d0    ONLINE       0     0     0
          c4t5000C50021101683d0    ONLINE       0     0     0
          c4t5000C50021102AA7d0    ONLINE       0     0     0
          c4t5000C500211034D3d0    ONLINE       0     0     0
          c4t5000C500211035DFd0    ONLINE       0     0     0
          c4t5000C5002110480Fd0    ONLINE       0     0     0
          c4t5000C50021104F0Fd0    ONLINE       0     0     0
          c4t5000C50021119A43d0    ONLINE       0     0     0
          c4t5000C5002112392Fd0    ONLINE       0     0     0

errors: No known data errors

  pool: syspool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        syspool     ONLINE       0     0     0
          c0t0d0s0  ONLINE       0     0     0

errors: No known data errors

========================================================

Pool_sas is made of 14x 146G 15K SAS drives in a big stripe. For this test there is no log device or cache. Connected to it is a RedHat box using iSCSI through an Intel X520 10GbE NIC. It runs several large MySQL queries at once, each taking minutes to compute.

Pool_test is a stripe of 2TB SATA drives, and a terabyte of files is being copied to it for another box during this test.

Here's the pastebin of "iostat -xdn 10" on the Linux box:
http://pastebin.com/431ESYaz

Here's the pastebin of "iostat -xdn 10" on the Nexenta box:
http://pastebin.com/9g7KD3Ku

Here's the pastebin of "zpool iostat -v 10" on the Nexenta box:
http://pastebin.com/05fJL5sw

From these numbers it looks like the Linux box is waiting for data all the time while the Nexenta box isn't pulling nearly as much throughput and IOPS as it could. Where is the bottleneck?

One thing suspicious is that we notice a slowdown of one pool when the other is under load. How can that be?

Ian
Haudy Kazemi
2010-Oct-23 03:40 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> One thing suspicious is that we notice a slowdown of one pool when the other is under load. How can that be?
>
> Ian

A network switch that is being maxed out? Some switches cannot switch at rated line speed on all their ports all at the same time. Their internal buses simply don't have the bandwidth needed for that. Maybe you are running into that limit? (I know you mentioned bypassing the switch completely in some other tests and not noticing any difference.)

Any other hardware in common?
On Fri, Oct 22, 2010 at 10:40 PM, Haudy Kazemi <kaze0010 at umn.edu> wrote:

>> One thing suspicious is that we notice a slowdown of one pool when the
>> other is under load. How can that be?
>>
>> Ian
>
> A network switch that is being maxed out? Some switches cannot switch at
> rated line speed on all their ports all at the same time. Their internal
> buses simply don't have the bandwidth needed for that. Maybe you are
> running into that limit? (I know you mentioned bypassing the switch
> completely in some other tests and not noticing any difference.)
>
> Any other hardware in common?

There's almost 0 chance a switch is being overrun by a single GigE connection. The worst switch I've seen is roughly 8:1 oversubscribed. You'd have to be maxing out many, many ports for a switch to be a problem.

Likely you don't have enough RAM or CPU in the box.

--Tim
What more info could you provide? Quite a lot more, actually, like:

How many streams of SQL and copy are you running? How are the filesystems/zvols configured (recordsize, etc.)? Some CPU, VM and network stats would also be nice.

Based on the Nexenta iostats you've provided (a tiny window on what's happening), it appears that you have an 8k recordsize for SQL. If you add up all the IOPS for the SQL, it's roughly 2000 reads at around 3ms each, which might indicate at least 6 reads outstanding at any time. So how many queries do you have running in parallel? If you add more, I'd expect the service times to increase. 3ms isn't much for spinning rust, but isn't this why you are planning to use lots of L2ARC?

Could be a similar story on writes. How many parallel streams? How many files? What's the average file size? What's the client filesystem? How much does it sync to the server?

Could it be that your client apps are always waiting for the spinning rust? Does an SSD log make any difference on this pool?

Sent from my iPhone

On 22 Oct 2010, at 19:57, Ian D <reward72 at hotmail.com> wrote:

> Pool_sas is made of 14x 146G 15K SAS drives in a big stripe. For this test there is no log device or cache. Connected to it is a RedHat box using iSCSI through an Intel X520 10GbE NIC. It runs several large MySQL queries at once, each taking minutes to compute.
>
> Pool_test is a stripe of 2TB SATA drives, and a terabyte of files is being copied to it for another box during this test.
>
> Here's the pastebin of "iostat -xdn 10" on the Linux box:
> http://pastebin.com/431ESYaz
>
> Here's the pastebin of "iostat -xdn 10" on the Nexenta box:
> http://pastebin.com/9g7KD3Ku
>
> Here's the pastebin of "zpool iostat -v 10" on the Nexenta box:
> http://pastebin.com/05fJL5sw
>
> From these numbers it looks like the Linux box is waiting for data all the time while the Nexenta box isn't pulling nearly as much throughput and IOPS as it could. Where is the bottleneck?
>
> One thing suspicious is that we notice a slowdown of one pool when the other is under load. How can that be?
>
> Ian
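Most of that config detail can be pulled straight off the Nexenta box; a rough sketch (the zvol name below is a placeholder for whatever COMSTAR is actually exporting):

  # recordsize/compression/caching for the filesystems backing the NFS shares
  zfs get recordsize,compression,checksum,primarycache,secondarycache Pool_sas

  # block size of the zvol backing the iSCSI LUN
  zfs get volblocksize,compression Pool_sas/mysql-vol

  # a few seconds of system-wide stats while the queries run
  vmstat 1 10
  mpstat 1 10
  iostat -xnz 1 10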
Haudy Kazemi
2010-Oct-23 05:06 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
Tim Cook wrote:
> On Fri, Oct 22, 2010 at 10:40 PM, Haudy Kazemi <kaze0010 at umn.edu> wrote:
>
>>> One thing suspicious is that we notice a slowdown of one pool
>>> when the other is under load. How can that be?
>>>
>>> Ian
>>
>> A network switch that is being maxed out? Some switches cannot switch
>> at rated line speed on all their ports all at the same time. Their
>> internal buses simply don't have the bandwidth needed for that. Maybe
>> you are running into that limit? (I know you mentioned bypassing the
>> switch completely in some other tests and not noticing any difference.)
>>
>> Any other hardware in common?
>
> There's almost 0 chance a switch is being overrun by a single GigE
> connection. The worst switch I've seen is roughly 8:1 oversubscribed.
> You'd have to be maxing out many, many ports for a switch to be a problem.
>
> Likely you don't have enough RAM or CPU in the box.
>
> --Tim

I agree, but also trying not to assume anything. Looking back, Ian's first email said '10GbE on a dedicated switch'. I don't think the switch model was ever identified... perhaps it is a 1 GbE switch with a few 10 GbE ports? (Grasping at straws.)

What happens when Windows is the iSCSI initiator connecting to an iSCSI target on ZFS? If that is also slow, the issue is likely not in Windows or in Linux.

Do CIFS shares (connected to from Linux and from Windows) show the same performance problems as iSCSI and NFS? If yes, this would suggest a common-cause item on the ZFS side.
> A network switch that is being maxed out? Some switches cannot switch
> at rated line speed on all their ports all at the same time. Their
> internal buses simply don't have the bandwidth needed for that. Maybe
> you are running into that limit? (I know you mentioned bypassing the
> switch completely in some other tests and not noticing any difference.)

We're using a 10GbE switch and NICs and they have their own separate network - we're not even close to the limit.

Thanks
> Likely you don't have enough RAM or CPU in the box.

The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That said, the load does get crazy high (like 35+) very quickly. We can't figure out what's taking so much CPU. It happens even when checksum/compression/dedup are off.

Ian
> I don't think the switch model was ever identified... perhaps it is a 1 GbE switch with a few 10 GbE ports? (Grasping at straws.)

It is a Dell 8024F. It has 24 SFP+ 10GbE ports and every NIC we connect to it is an Intel X520. One issue we do have with it is that when we turn jumbo frames on, the Linux boxes crash. So all our tests are done without jumbo frames, but that alone cannot make that much of a difference. We know the hardware of the Nexenta box can do several times better - it does when we run Linux on it.

> What happens when Windows is the iSCSI initiator connecting to an iSCSI
> target on ZFS? If that is also slow, the issue is likely not in Windows or in Linux.
>
> Do CIFS shares (connected to from Linux and from Windows) show the same
> performance problems as iSCSI and NFS? If yes, this would suggest a
> common-cause item on the ZFS side.

We need to try that. We did try two versions of Linux (RedHat and SuSE) and ended up with the same problem, but we haven't tried with Windows/Mac yet.

Thanks all!
Ian
Richard Elling
2010-Oct-23 16:45 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
On Oct 23, 2010, at 4:31 AM, Ian D wrote:

>> Likely you don't have enough RAM or CPU in the box.
>
> The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That said, the load does get crazy high (like 35+) very quickly. We can't figure out what's taking so much CPU. It happens even when checksum/compression/dedup are off.

To see how many threads are running for a process, use prstat.
 -- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com
USENIX LISA '10 Conference, November 7-12, San Jose, CA
ZFS and performance consulting
http://www.RichardElling.com
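For example (a sketch; the -m and -L flags give per-thread microstate accounting, which usually shows quickly whether the time is going to user code, system calls or lock waits):

  # all processes, broken out per thread (LWP), refreshed every 5 seconds
  prstat -mL 5

  # or a single process once you know its pid
  prstat -mL -p <pid> 5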
I owe you all an update...

We found out a clear pattern we can now recreate at will. Whenever we read/write the pool, it gives the expected throughput and IOPS for a while, but at some point it slows down to a crawl, nothing responds and it pretty much hangs for a few seconds, and then things go back to normal for a little while longer. Sometimes the problem is barely noticeable and only happens once every few minutes; at other times it is every few seconds. We could be doing the exact same operation and sometimes it is fast and sometimes it is slow. The more clients are connected the worse the problem typically gets - and no, it's not happening every 30 seconds when things are committed to disk.

Now... every time that slowdown occurs, the load on the Nexenta box gets crazy high - it can reach 35 and more and the console doesn't even respond anymore. The rest of the time the load barely reaches 3. The box has four 7500 series Intel Xeon CPUs and 256G of RAM and uses 15K SAS HDDs in mirrored stripes on LSI 9200-8e HBAs - so we're certainly not underpowered. We also have the same issue when using a box with two of the latest AMD Opteron CPUs (the Magny-Cours) and 128G of RAM.

We are able to reach 800MB/sec and more over the network when things go well, but the average gets destroyed by the slowdowns when there is zero throughput.

These tests are run without any L2ARC or SLOG, but past tests have shown the same issue when using them. We've tried with 12x 100G Samsung SLC SSDs and DDRDrive X1s among other things - and while they make the whole thing much faster, they don't prevent those intermittent slowdowns from happening.

Our next step is to isolate the process that takes all that CPU...

Ian
So maybe a next step is to run zilstat, arcstat, iostat -xe?? (I forget what people like to use for these params), and zpool iostat -v in 4 terminal windows while running the same test, and try to see what is spiking when that high load period occurs.

Not sure if there is a better version than this:
http://www.solarisinternals.com/wiki/index.php/Arcstat

Richard's zilstat:
http://blog.richardelling.com/2009/02/zilstat-improved.html

Other arc tools:
http://vt-100.blogspot.com/2010/03/top-with-zfs-arc.html
http://www.cuddletech.com/blog/pivot/entry.php?id=979

On 10/30/10 5:48 AM, Ian D wrote:
> I owe you all an update...
>
> We found out a clear pattern we can now recreate at will. Whenever we read/write the pool, it gives the expected throughput and IOPS for a while, but at some point it slows down to a crawl, nothing responds and it pretty much hangs for a few seconds, and then things go back to normal for a little while longer. Sometimes the problem is barely noticeable and only happens once every few minutes; at other times it is every few seconds.
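Run side by side while reproducing a stall, that might look something like this (a sketch; zilstat and arcstat here are the downloaded scripts from the links above, so filenames and paths will vary):

  # terminal 1: ZIL commit activity
  ./zilstat.ksh 10
  # terminal 2: ARC hits/misses
  ./arcstat.pl 10
  # terminal 3: per-device latency and error counters
  iostat -xnze 10
  # terminal 4: per-vdev pool throughput
  zpool iostat -v 10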
Here is a total guess - but what if it has to do with zfs processing running on one CPU having to talk to the memory "owned" by a different CPU? I don't know if many people are running fully populated boxes like you are, so maybe it is something people are not seeing due to not having huge amounts of RAM on these new multichip transport systems.

Maybe you could test it by going to 1 Mangy-Cours CPU and just the memory populated for that one on your AMD box and see if you get the same periodic high loads.

On 10/30/10 5:48 AM, Ian D wrote:
> I owe you all an update...
>
> We found out a clear pattern we can now recreate at will. Whenever we read/write the pool, it gives the expected throughput and IOPS for a while, but at some point it slows down to a crawl, nothing responds and it pretty much hangs for a few seconds, and then things go back to normal for a little while longer. Sometimes the problem is barely noticeable and only happens once every few minutes; at other times it is every few seconds.
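If physically pulling a CPU isn't practical, one rough way to approximate that test is to confine the workload to a single socket with a processor set (a sketch only; the CPU IDs and set number below are examples, so check lgrpinfo for the real layout first):

  # show locality groups (NUMA nodes) and which CPUs belong to each
  lgrpinfo

  # create a processor set out of one socket's CPUs (example IDs 0-7)
  psrset -c 0 1 2 3 4 5 6 7

  # bind a process (e.g. the test workload) to the new set, here set 1
  psrset -b 1 <pid>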
We had the same issue with a 24 core box a while ago. Check your L2 cache hits and misses. Sometimes more cores does not mean more performance. dtrace is your friend!

On 30 Oct 2010 14:12, "zfs user" <zfsml at itsbeen.sent.com> wrote:

> Here is a total guess - but what if it has to do with zfs processing running
> on one CPU having to talk to the memory "owned" by a different CPU? I don't
> know if many people are running fully populated boxes like you are, so maybe
> it is something people are not seeing due to not having huge amounts of RAM
> on these new multichip transport systems.
>
> Maybe you could test it by going to 1 Mangy-Cours CPU and just the memory
> populated for that one on your AMD box and see if you get the same periodic
> high loads.
>
> On 10/30/10 5:48 AM, Ian D wrote:
>> I owe you all an update...
>> We found out a clear pattern we can now recreate at will. Whenev...
On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:

> 1 Mangy-Cours CPU
    ^^^^^

Dunno whether deliberate, or malapropism, but I love it.
If you take a look at http://www.brendangregg.com/cachekit.html you will see some DTrace yumminess which should let you tell...

---
W. A. Khushil Dep - khushil.dep at gmail.com - 07905374843
Visit my blog at http://www.khushil.com/

On 30 October 2010 15:49, Eugen Leitl <eugen at leitl.org> wrote:
> On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:
>
>> 1 Mangy-Cours CPU
>     ^^^^^
>
> Dunno whether deliberate, or malapropism, but I love it.
I did it deliberately - how dumb are these product managers that they name products with weird names and not expect them to be abused? On the other hand, if you do a search for mangy cours you'll find a bunch of hits where it is clearly a misspelling in serious tech articles, postings, etc.

"I am seeing some spotty performance with my new Mangy Cours CPU"... It is like they are asking for it. I think they'd be better off doing something like Intel's core arch names, using city names like "Santa Rosa", etc.

On 10/30/10 3:49 PM, Eugen Leitl wrote:
> On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:
>
>> 1 Mangy-Cours CPU
>     ^^^^^
>
> Dunno whether deliberate, or malapropism, but I love it.
Erik Trimble
2010-Oct-31 03:02 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
On 10/30/2010 7:07 PM, zfs user wrote:
> I did it deliberately - how dumb are these product managers that they
> name products with weird names and not expect them to be abused? On
> the other hand, if you do a search for mangy cours you'll find a bunch
> of hits where it is clearly a misspelling in serious tech articles,
> postings, etc.
>
> "I am seeing some spotty performance with my new Mangy Cours CPU"...
> It is like they are asking for it. I think they'd be better off doing
> something like Intel's core arch names, using city names like "Santa Rosa", etc.
>
> On 10/30/10 3:49 PM, Eugen Leitl wrote:
>> On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:
>>
>>> 1 Mangy-Cours CPU
>>     ^^^^^
>>
>> Dunno whether deliberate, or malapropism, but I love it.

Well,

First off, it's "Magny Cours", not "Mangy Cours". Though, to an English-speaker, the letter reversal (and mispronunciation) come easily.

AMD does use city names. Just not American ones. Magny-Cours is a place in France (near a well-known Formula One circuit). So far as I can tell, their (recent) platform names seem to be English nouns, and the CPU architectures European ones.

But who knows how people think of codenames. Frankly, I get really annoyed with Intel's penchant for putting a codename on every little last thing, and changing them at the drop of a hat. I get Intel pre-production hardware here, and there's a half-dozen codenames on each server. Makes keeping track of what's what a nightmare.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Right, I realized it was Magny not Mangy, but I thought it was related to the race track or racing, not a town.

I completely agree with you on codenames; the Linux distro codenames irk me. Hey guys, it might be easy for you to keep track of which release is "Bushy Beaver" or "Itchy Ibis" or "Mayonnaise", but if you deal with many products like I do, it would be a blessing to get just a sane pattern of increasing numbers, or number-2_or_3_letters, or something that gives a clue to where this X falls within the spectrum of the Y product lifetime - and, if I am wishing, a _dedicated_ web page that lists each release with a freaking date (I have a long separate rant about how tech articles don't seem to include dates and production version numbers of the products they are writing about) and a short blurb about what changed.

On 10/30/10 8:02 PM, Erik Trimble wrote:
> On 10/30/2010 7:07 PM, zfs user wrote:
>> I did it deliberately - how dumb are these product managers that they name
>> products with weird names and not expect them to be abused? On the other
>> hand, if you do a search for mangy cours you'll find a bunch of hits where
>> it is clearly a misspelling in serious tech articles, postings, etc.
>>
>> "I am seeing some spotty performance with my new Mangy Cours CPU"...
>> It is like they are asking for it. I think they'd be better off doing
>> something like Intel's core arch names, using city names like "Santa Rosa", etc.
>>
>> On 10/30/10 3:49 PM, Eugen Leitl wrote:
>>> On Sat, Oct 30, 2010 at 02:10:49PM -0700, zfs user wrote:
>>>
>>>> 1 Mangy-Cours CPU
>>>     ^^^^^
>>>
>>> Dunno whether deliberate, or malapropism, but I love it.
> Well,
>
> First off, it's "Magny Cours", not "Mangy Cours". Though, to an
> English-speaker, the letter reversal (and mispronunciation) come easily.
>
> AMD does use city names. Just not American ones. Magny-Cours is a place in
> France (near a well-known Formula One circuit). So far as I can tell, their
> (recent) platform names seem to be English nouns, and the CPU architectures
> European ones.
>
> But who knows how people think of codenames. Frankly, I get really annoyed
> with Intel's penchant for putting a codename on every little last thing, and
> changing them at the drop of a hat. I get Intel pre-production hardware here,
> and there's a half-dozen codenames on each server. Makes keeping track of
> what's what a nightmare.
I get that more cores don't necessarily mean better performance, but I doubt that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs (the Beckton) suffer from incredibly bad cache management. Our two test systems have 2 and 4 of them respectively. The thing is that the performance is not consistently slow: it is great for a while, then *stops* for a few seconds before it gets great again. It's like the boxes are choking on something...

Thanks
Ian
--
This message posted from opensolaris.org
Edward Ned Harvey
2010-Oct-31 21:55 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Ian D
>
> I get that more cores don't necessarily mean better performance, but I
> doubt that both the latest AMD CPUs (the Magny-Cours) and the latest
> Intel CPUs (the Beckton) suffer from incredibly bad cache management.

You "doubt" AMD or Intel CPUs suffer from bad cache management? I can't say whether that problem affects you or not, but you shouldn't dismiss it just like that.

Back when the Pentium 4 was the latest and greatest, I had a system that supported a max of 8G RAM. It was originally built with 4G, and worked fine. We later upgraded to 8G, and as long as it used 8G of RAM, it performed at about 1% of normal capacity. The root cause was something like insufficient cache on the CPU. The solution was to run with 4G of RAM, and no more.

Probably not anything like what you're experiencing, but the point is, yes, that sort of thing can happen in real life. And if you think that's bad, imagine what it must have been like back when the Pentium-1 chip was first discovered to have the floating-point division error...
If you do a dd to the storage from the heads, do you still get the same issues?

On 31 Oct 2010 12:40, "Ian D" <reward72 at hotmail.com> wrote:

I get that more cores don't necessarily mean better performance, but I doubt that both the latest AMD CPUs (the Magny-Cours) and the latest Intel CPUs (the Beckton) suffer from incredibly bad cache management. Our two test systems have 2 and 4 of them respectively. The thing is that the performance is not consistently slow: it is great for a while, then *stops* for a few seconds before it gets great again. It's like the boxes are choking on something...

Thanks
Ian
--
This message posted from opensolaris.org
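(For example - a sketch only, run directly on the Nexenta head; "tank" and the file path are hypothetical, point them at whichever pool you are testing:

  dd if=/dev/zero of=/tank/ddtest bs=1024k count=16384     # ~16GB sequential write, no network involved
  dd if=/tank/ddtest of=/dev/null bs=1024k                 # read it back (with 256G of RAM much of this will come from ARC)

If these run flat out with no pauses while the iSCSI/NFS clients stall, the disks and HBAs look healthy and the problem sits somewhere in the network/protocol path.)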
> If you do a dd to the storage from the heads do
> you still get the same issues?

No, local reads/writes are great, they never choke. It's only when NFS or iSCSI are involved and the reads/writes are done from a remote box that we experience the problem. Local operations barely affect the CPU load.

Ian
--
This message posted from opensolaris.org
What if you connect locally via NFS or iSCSI?

SR
--
This message posted from opensolaris.org
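(One way to try that - a rough sketch with hypothetical dataset and mount-point names - is to loop an NFS mount back onto the head itself and rerun the same copy test:

  zfs set sharenfs=on tank/test                       # share a test filesystem from the pool
  mkdir -p /mnt/looptest
  mount -F nfs localhost:/tank/test /mnt/looptest     # the exported path is the dataset's mountpoint

If the stalls still show up over a loopback mount, the 10GbE network and the Linux client are off the hook and the problem is in the NFS/iSCSI serving path on the head itself.)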
Check your TXG settings - it could be a timing issue, a Nagle's algorithm issue, or a TCP buffer issue. Check the system properties in your setup.

On 1 Nov 2010 19:36, "SR" <rraja05 at gmail.com> wrote:

What if you connect locally via NFS or iSCSI?

SR
--
This message posted from opensolaris.org
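(A few read-only checks to start with - a sketch, assuming the stock Solaris/Nexenta tunable names; nothing here changes any settings:

  echo zfs_txg_timeout/D | mdb -k        # current TXG sync interval, in seconds
  ndd -get /dev/tcp tcp_naglim_def       # Nagle threshold on the head
  ndd -get /dev/tcp tcp_xmit_hiwat       # default TCP send buffer
  ndd -get /dev/tcp tcp_recv_hiwat       # default TCP receive buffer

and on the Linux client:

  sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem

Compare these against what your 10GbE link needs before tuning anything.)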
Roy Sigurd Karlsbakk
2010-Nov-01 20:27 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
----- Original Message -----
> > Likely you don't have enough ram or CPU in the box.
>
> The Nexenta box has 256G of RAM and the latest X7500 series CPUs. That
> said, the load does get crazy high (like 35+) very quickly. We can't
> figure out what's taking so much CPU. It happens even when
> checksum/compression/dedup are off.

Is your data deduplicated? It won't get un-deduplicated if you merely turn off dedup - data written while dedup was on stays deduplicated. Dedup is not really stable atm, so that might be the problem.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases adequate and relevant synonyms exist in Norwegian.
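(A quick way to check - a sketch with a hypothetical pool name - whether any deduplicated data is still on disk:

  zpool list            # the DEDUP column; anything above 1.00x means deduped blocks exist
  zdb -DD tank          # dedup table (DDT) histogram and its in-core/on-disk size

A DDT that has outgrown the ARC is a well-known cause of exactly this kind of intermittent stalling, and it persists until the deduped data is rewritten or the pool is recreated.)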
Maybe you are experiencing this: http://opensolaris.org/jive/thread.jspa?threadID=119421 -- This message posted from opensolaris.org
> You "doubt" AMD or Intel cpu's suffer from bad cache
> mgmt?

To rule that out, we've tried using an older server (about 4 years old) as the head, and we see the same pattern. There it's actually even more obvious how many CPU cycles it consumes. Using the same box as a Linux-based NFS server and running the same tests barely has an impact on the CPUs.

Ian
--
This message posted from opensolaris.org
> Maybe you are experiencing this:
> http://opensolaris.org/jive/thread.jspa?threadID=11942

It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and fails copies and SQL queries...

Ian
--
This message posted from opensolaris.org
On Nov 1, 2010, at 5:09 PM, Ian D <reward72 at hotmail.com> wrote:
>> Maybe you are experiencing this:
>> http://opensolaris.org/jive/thread.jspa?threadID=11942
>
> It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and fails copies and SQL queries...

Then set zfs_write_limit_override to a reasonable value.

Depending on the speed of your ZIL and/or backing store (for async IO), you will need to limit the write size so that one TXG is fully committed before the next one fills.

Myself, with a RAID controller with a 512MB BBU write-back cache, I set the write limit to 512MB, which allows my setup to commit before it fills.

It also prevents the ARC from discarding good read-cache data in favor of write cache.

Others may have a good calculation based on ARC execution plan timings, disk seek times and sustained throughput to give an accurate figure for a given setup; otherwise start with a reasonable value, say 1GB, and decrease it until the pauses stop.

-Ross
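(For reference - a sketch of the two usual ways to set it on Solaris-derived systems; the 512MB figure is just Ross's example, size it to your own hardware:

  # persistent, takes effect at next boot: add this line to /etc/system
  set zfs:zfs_write_limit_override = 0x20000000

  # live, on the running system (0x20000000 = 512MB)
  echo zfs_write_limit_override/Z0x20000000 | mdb -kw

Setting it back to 0 restores the default behaviour where ZFS sizes the write limit itself.)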
Haudy Kazemi
2010-Nov-01 23:52 UTC
[zfs-discuss] Performance issues with iSCSI under Linux
Ross Walker wrote:
> On Nov 1, 2010, at 5:09 PM, Ian D <reward72 at hotmail.com> wrote:
>
>>> Maybe you are experiencing this:
>>> http://opensolaris.org/jive/thread.jspa?threadID=11942
>>>
>> It does look like this... Is this really the expected behaviour? That's just unacceptable. It is so bad it sometimes drops connections and fails copies and SQL queries...
>>
>
> Then set zfs_write_limit_override to a reasonable value.
>
> Depending on the speed of your ZIL and/or backing store (for async IO), you will need to limit the write size so that one TXG is fully committed before the next one fills.
>
> Myself, with a RAID controller with a 512MB BBU write-back cache, I set the write limit to 512MB, which allows my setup to commit before it fills.
>
> It also prevents the ARC from discarding good read-cache data in favor of write cache.
>
> Others may have a good calculation based on ARC execution plan timings, disk seek times and sustained throughput to give an accurate figure for a given setup; otherwise start with a reasonable value, say 1GB, and decrease it until the pauses stop.
>
> -Ross
>

If this is the root cause, it sounds like some default configuration parameters need to be calculated and adjusted differently from how they are now. It is highly preferable that the default parameters do not exhibit severe problems: defaults should offer stability first, and performance to the extent that stability is not compromised (i.e., no dedup by default in its current state). Manually tweaked parameters are a different story, in that they should allow a knowledgeable user to get a little more performance even if that is riskier.
> Then set the zfs_write_limit_override to a reasonable
> value.

Our first experiments are showing progress. We'll play with it some more and let you know. Thanks!

Ian
--
This message posted from opensolaris.org
Here's a long-overdue update for you all... After updating countless drivers, BIOSes and Nexenta itself, it seems that our issue has disappeared. We're slowly moving our production to our three appliances and things are going well so far. Sadly we don't know exactly which update fixed the issue. I wish I knew, but we tried so many different things that we lost count.

We've also updated our DDRDrive X1s and they're giving us stellar performance. Thanks to the people at Nexenta and DDRDrive! It's been more challenging than we expected, but we're now optimistic about the future of all this.

Ian
--
This message posted from opensolaris.org