Happy $holidays,

I have a pool of 8x ST31000340AS on an LSI 8-port adapter, configured as a
raidz3 (no compression or dedup), with reasonable bonnie++ 1.03 values,
e.g. 145 MByte/s sequential write @ 48% CPU and 291 MByte/s sequential read
@ 53% CPU. It scrubs at 230+ MByte/s with reasonable system load. No hybrid
pools yet. This is the latest beta napp-it on an OpenIndiana 151a5 server,
living on a dedicated 64 GByte SSD.

The system is an MSI E350DM-E33 with 8 GByte PC1333 DDR3 memory, no ECC.
All the systems have Intel NICs with mtu 9000 enabled, including all
switches in the path.

My problem is pretty poor network throughput. Both an NFS mount on 12.04
64 bit Ubuntu (mtu 9000) and CIFS read at about 23 MByte/s. Windows 7
64 bit (also jumbo frames) reads at about 65 MByte/s. The highest transfer
speed on Windows just touches 90 MByte/s before falling back to the usual
60-70 MByte/s.

I can live with the above values, but I have a feeling the setup should be
able to saturate GBit Ethernet with large file transfers, especially on
Linux (20 MByte/s is nothing to write home about).

Does anyone have any suggestions on how to debug/optimize throughput?

Thanks, and happy 2013.

P.S. Not sure whether this is pathological, but the system does produce
occasional soft errors, e.g. in dmesg:

Dec 30 17:45:00 oizfs scsi: [ID 107833 kern.notice]  Requested Block: 0  Error Block: 0
Dec 30 17:45:00 oizfs scsi: [ID 107833 kern.notice]  Vendor: ATA  Serial Number:
Dec 30 17:45:00 oizfs scsi: [ID 107833 kern.notice]  Sense Key: Soft_Error
Dec 30 17:45:00 oizfs scsi: [ID 107833 kern.notice]  ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009c72c48 (sd9):
Dec 30 17:45:01 oizfs  Error for Command: <undecoded cmd 0xa1>  Error Level: Recovered
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Requested Block: 0  Error Block: 0
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Vendor: ATA  Serial Number:
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Sense Key: Soft_Error
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Dec 30 17:45:01 oizfs pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata) instance 0 irq 0xe vector 0x45 ioapic 0x3 intin 0xe is bound to cpu 0
Dec 30 17:45:01 oizfs pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata) instance 0 irq 0xe vector 0x45 ioapic 0x3 intin 0xe is bound to cpu 1
Dec 30 17:45:01 oizfs pcplusmp: [ID 805372 kern.info] pcplusmp: ide (ata) instance 0 irq 0xe vector 0x45 ioapic 0x3 intin 0xe is bound to cpu 0
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c50009c73968 (sd4):
Dec 30 17:45:01 oizfs  Error for Command: <undecoded cmd 0xa1>  Error Level: Recovered
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Requested Block: 0  Error Block: 0
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Vendor: ATA  Serial Number:
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  Sense Key: Soft_Error
Dec 30 17:45:01 oizfs scsi: [ID 107833 kern.notice]  ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Dec 30 17:45:03 oizfs scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/disk@g5000c500098be9dd (sd10):
Dec 30 17:45:03 oizfs  Error for Command: <undecoded cmd 0xa1>  Error Level: Recovered
Dec 30 17:45:03 oizfs scsi: [ID 107833 kern.notice]  Requested Block: 0  Error Block: 0
Dec 30 17:45:03 oizfs scsi: [ID 107833 kern.notice]  Vendor: ATA  Serial Number:
Dec 30 17:45:03 oizfs scsi: [ID 107833 kern.notice]  Sense Key: Soft_Error
Dec 30 17:45:03 oizfs scsi: [ID 107833 kern.notice]  ASC: 0x0 (<vendor unique code 0x0>), ASCQ: 0x1d, FRU: 0x0
Dec 30 17:45:04 oizfs scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci1462,7720@11/disk@3,0 (sd8):
Dec 30 17:45:04 oizfs  Error for Command: <undecoded cmd 0xa1>  Error Level: Recovered
Dec 30 17:45:04 oizfs scsi: [ID 107833 kern.notice]  Requested Block: 0  Error Block: 0
Dec 30 17:45:04 oizfs scsi: [ID 107833 kern.notice]  Vendor: ATA  Serial Number:
Dec 30 17:45:04 oizfs scsi: [ID 107833 kern.notice]  Sense Key: Soft_Error
Dec 30 17:45:04 oizfs scsi: [ID 107833 kern.notice]  ASC: 0x0 (no additional sense info), ASCQ: 0x0, FRU: 0x0
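[Since the pool benchmarks locally at well above GBit wire speed, a quick way
to narrow the problem down is to measure the raw TCP path and the local read
path separately. A minimal sketch; the hostname, the file name, and the
availability of iperf on OpenIndiana are assumptions:

# iperf -s                                    (on the OpenIndiana server, if iperf is installed)
$ iperf -c oizfs -t 30                        (on the Ubuntu client; a healthy GBit link shows ~940 Mbit/s)
# dd if=/tank0/bigfile of=/dev/null bs=1024k  (local sequential read on the server, bypassing the network)

If iperf already falls well short of wire speed, the problem is in the
network path rather than in ZFS or the NFS/CIFS servers.]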
On Dec 30, 2012, at 9:02 AM, Eugen Leitl <eugen at leitl.org> wrote:

> The system is a MSI E350DM-E33 with 8 GByte PC1333 DDR3
> memory, no ECC. All the systems have Intel NICs with mtu 9000
> enabled, including all switches in the path.

Does it work faster with the default MTU? Also check for retrans and
errors, using the usual network performance debugging checks.

> P.S. Not sure whether this is pathological, but the system
> does produce occasional soft errors like e.g. dmesg

More likely these are due to SMART commands not being properly handled
for SATA devices. They are harmless.
 -- richard

--
Richard.Elling at RichardElling.com
+1-760-896-4422
Edward Ned Harvey (opensolarisisdeadlongliveopensolaris)
2012-Dec-31 12:27 UTC
[zfs-discuss] poor CIFS and NFS performance
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Eugen Leitl
>
> I have a pool of 8x ST31000340AS on an LSI 8-port adapter as
> a raidz3 (no compression nor dedup) with reasonable bonnie++
> 1.03 values, e.g. 145 MByte/s Seq-Write @ 48% CPU and 291 MByte/s
> Seq-Read @ 53% CPU.

For an 8-disk raidz3 (effectively 5 data disks) I would expect roughly
640 MB/s for both sequential read and sequential write. The first halving
(from 640 down to 291) could maybe be explained by bottlenecking through a
single HBA or something like that, so I wouldn't be too concerned about
that. But the second halving, from 291 down to 145... A single disk should
do 128 MB/s no problem, so the whole pool writing at only 145 MB/s sounds
wrong to me. But as you said, this isn't the area of complaint. Moving on;
you can start a new discussion about this later if you want.

> My problem is pretty poor network throughput. An NFS
> mount on 12.04 64 bit Ubuntu (mtu 9000) or CIFS are
> read at about 23 MBytes/s. Windows 7 64 bit (also jumbo
> frames) reads at about 65 MBytes/s. The highest transfer
> speed on Windows just touches 90 MByte/s, before falling
> back to the usual 60-70 MBytes/s.
>
> Does anyone have any suggestions on how to debug/optimize
> throughput?

The first thing I would do is build another OpenIndiana box and try
NFS / CIFS to and from it, and see how it behaves. Whenever I've seen this
sort of problem before, it was a version incompatibility requiring tweaks
between the client and server. I don't know which version of samba /
Solaris CIFS is being used, but at some point in history (Win7), Windows
transitioned from NTLM v1 to v2, and at that point all the older servers
became 4x slower with the new clients; but if you built a new server for
the new clients, then the old version was 4x slower than the new. Not to
mention, I've had times when I couldn't even get Linux and Solaris to
*talk* to each other over NFS, due to version differences, never mind
tweaking all the little performance knobs.

So my advice is to first eliminate any question about version /
implementation differences, and see where that takes you.
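[One concrete way to rule out protocol-version mismatches before rebuilding
anything is to look at what the server is offering and to pin the client to
a known version. A sketch, assuming the in-kernel NFS and SMB servers on
OpenIndiana and a hypothetical share path:

# sharectl get nfs      (on the server; note server_versmin/server_versmax)
# sharectl get smb      (SMB server properties, including lmauth_level)
$ sudo mount -t nfs -o vers=3,rsize=131072,wsize=131072 oizfs:/tank0/share /mnt/tank0

Forcing vers=3 on the Ubuntu mount, or adjusting lmauth_level for the NTLM
behavior described above, makes it clear whether the slowdown is a
negotiation issue rather than a raw throughput one.]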
On Sun, Dec 30, 2012 at 10:40:39AM -0800, Richard Elling wrote:
> On Dec 30, 2012, at 9:02 AM, Eugen Leitl <eugen at leitl.org> wrote:
>
>> The system is a MSI E350DM-E33 with 8 GByte PC1333 DDR3
>> memory, no ECC. All the systems have Intel NICs with mtu 9000
>> enabled, including all switches in the path.
>
> Does it work faster with the default MTU?

No, it was even slower; that's why I went from 1500 to 9000. I estimate it
brought ~20 MByte/s more peak on Windows 7 64 bit CIFS.

> Also check for retrans and errors, using the usual network performance
> debugging checks.

Wireshark or tcpdump on Linux/Windows? What would you suggest for OI?

>> P.S. Not sure whether this is pathological, but the system
>> does produce occasional soft errors like e.g. dmesg
>
> More likely these are due to SMART commands not being properly handled
> for SATA devices. They are harmless.

Otherwise napp-it attests full SMART support.
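[On the OpenIndiana side, the native equivalent of tcpdump is snoop; the
capture can be opened later in Wireshark, which reads the snoop format.
A sketch; the interface name and client address are assumptions:

# snoop -d e1000g0 -o /var/tmp/cifs.snoop host 192.168.1.50

Capturing a minute of a slow transfer and looking at retransmissions and
TCP window behavior usually narrows things down quickly.]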
On Jan 2, 2013, at 2:03 AM, Eugen Leitl <eugen at leitl.org> wrote:

> On Sun, Dec 30, 2012 at 10:40:39AM -0800, Richard Elling wrote:
>> Does it work faster with the default MTU?
>
> No, it was even slower, that's why I went from 1500 to 9000.
> I estimate it brought ~20 MByte/s more peak on Windows 7 64 bit CIFS.

OK, then you have something else very wrong in your network.

>> Also check for retrans and errors, using the usual network performance
>> debugging checks.
>
> Wireshark or tcpdump on Linux/Windows? What would
> you suggest for OI?

Look at all of the stats for all NICs and switches on both ends of each
wire. Look for collisions (should be 0), drops (should be 0), dups (should
be 0), retrans (should be near 0), flow control (the server shouldn't see
flow control activity), etc. There is considerable written material on how
to diagnose network flakiness.

>>> P.S. Not sure whether this is pathological, but the system
>>> does produce occasional soft errors like e.g. dmesg
>>
>> More likely these are due to SMART commands not being properly handled
>> for SATA devices. They are harmless.
>
> Otherwise napp-it attests full SMART support.

Yep, this is a SATA/SAS/SMART interaction where assumptions are made that
might not be true. Usually it means that the SMART probes are using SCSI
commands on SATA disks.
 -- richard

--
Richard.Elling at RichardElling.com
+1-760-896-4422
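[On the OpenIndiana box those counters can be pulled from the standard
tools; a sketch, assuming the Intel NIC sits behind the e1000g driver:

# netstat -s -P tcp | grep -i retrans       (TCP retransmission counters)
# dladm show-link -s                        (per-link packet/error counters)
# kstat -p -m e1000g | egrep -i 'err|drop'  (driver-level error and drop statistics)

On the Ubuntu client, "ethtool -S eth0" and "netstat -s" give the
equivalent view, and the switch port counters are worth checking as well.]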
On Sun, Dec 30, 2012 at 06:02:40PM +0100, Eugen Leitl wrote:
>
> Happy $holidays,
>
> I have a pool of 8x ST31000340AS on an LSI 8-port adapter as

Just a little update on the home NAS project.

I've set the pool sync to disabled, and added a couple of

  8. c4t1d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
     /pci@0,0/pci1462,7720@11/disk@1,0
  9. c4t2d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
     /pci@0,0/pci1462,7720@11/disk@2,0

I had no clue what the partition names (created with the napp-it web
interface, a la 5% log and 95% cache, of 80 GByte) were, so I did an
iostat -xnp:

    1.4    0.3    5.5    0.0  0.0  0.0    0.0    0.0   0   0 c4t1d0
    0.1    0.0    3.7    0.0  0.0  0.0    0.0    0.5   0   0 c4t1d0s2
    0.1    0.0    2.6    0.0  0.0  0.0    0.0    0.5   0   0 c4t1d0s8
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.2   0   0 c4t1d0p0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t1d0p1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t1d0p2
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t1d0p3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t1d0p4
    1.2    0.3    1.4    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0
    0.0    0.0    0.6    0.0  0.0  0.0    0.0    0.4   0   0 c4t2d0s2
    0.0    0.0    0.7    0.0  0.0  0.0    0.0    0.4   0   0 c4t2d0s8
    0.1    0.0    0.0    0.0  0.0  0.0    0.0    0.2   0   0 c4t2d0p0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0p1
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t2d0p2

then issued

# zpool add tank0 cache /dev/dsk/c4t1d0p1 /dev/dsk/c4t2d0p1
# zpool add tank0 log mirror /dev/dsk/c4t1d0p0 /dev/dsk/c4t2d0p0

which resulted in

root@oizfs:~# zpool status
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h1m with 0 errors on Wed Jan  2 21:09:23 2013
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c4t3d0s0  ONLINE       0     0     0

errors: No known data errors

  pool: tank0
 state: ONLINE
  scan: scrub repaired 0 in 5h17m with 0 errors on Wed Jan  2 17:53:20 2013
config:

        NAME                       STATE     READ WRITE CKSUM
        tank0                      ONLINE       0     0     0
          raidz3-0                 ONLINE       0     0     0
            c3t5000C500098BE9DDd0  ONLINE       0     0     0
            c3t5000C50009C72C48d0  ONLINE       0     0     0
            c3t5000C50009C73968d0  ONLINE       0     0     0
            c3t5000C5000FD2E794d0  ONLINE       0     0     0
            c3t5000C5000FD37075d0  ONLINE       0     0     0
            c3t5000C5000FD39D53d0  ONLINE       0     0     0
            c3t5000C5000FD3BC10d0  ONLINE       0     0     0
            c3t5000C5000FD3E8A7d0  ONLINE       0     0     0
        logs
          mirror-1                 ONLINE       0     0     0
            c4t1d0p0               ONLINE       0     0     0
            c4t2d0p0               ONLINE       0     0     0
        cache
          c4t1d0p1                 ONLINE       0     0     0
          c4t2d0p1                 ONLINE       0     0     0

errors: No known data errors

and in the following bonnie++ figures.

Before:

NAME   SIZE   Bonnie  Date(y.m.d)  File    Seq-Wr-Chr %CPU  Seq-Write %CPU  Seq-Rewr %CPU  Seq-Rd-Chr %CPU  Seq-Read %CPU  Rnd Seeks %CPU  Files  Seq-Create  Rnd-Create
rpool  59.5G  start   2012.12.28   15576M  24 MB/s    61    47 MB/s   18    40 MB/s  19    26 MB/s    98    273 MB/s 48    2657.2/s  25    16     12984/s     12058/s
tank0  7.25T  start   2012.12.29   15576M  35 MB/s    86    145 MB/s  48    109 MB/s 50    25 MB/s    97    291 MB/s 53    819.9/s   12    16     12634/s     9194/s

After:

NAME   SIZE   Bonnie  Date(y.m.d)  File    Seq-Wr-Chr %CPU  Seq-Write %CPU  Seq-Rewr %CPU  Seq-Rd-Chr %CPU  Seq-Read %CPU  Rnd Seeks %CPU  Files  Seq-Create  Rnd-Create
rpool  59.5G  start   2012.12.28   15576M  24 MB/s    61    47 MB/s   18    40 MB/s  19    26 MB/s    98    273 MB/s 48    2657.2/s  25    16     12984/s     12058/s
tank0  7.25T  start   2013.01.03   15576M  35 MB/s    86    149 MB/s  48    111 MB/s 50    26 MB/s    98    404 MB/s 76    1094.3/s  12    16     12601/s     9937/s

Does the layout make sense? Do the stats make sense, or is there still
something very wrong with that pool?

Thanks.
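[Whether the new log and cache devices are actually being exercised can be
watched directly while a workload runs; a sketch (the 5-second interval is
arbitrary):

# zpool iostat -v tank0 5                                   (per-vdev I/O, including the logs and cache rows)
# kstat -p zfs:0:arcstats | egrep 'l2_(hits|misses|size)'   (L2ARC hit/miss/size counters)

Idle logs/cache rows under an NFS or CIFS load would indicate that the
workload is not generating synchronous writes or repeated random reads.]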
On Jan 3, 2013, at 12:33 PM, Eugen Leitl <eugen at leitl.org> wrote:

> Just a little update on the home NAS project.
>
> I've set the pool sync to disabled, and added a couple of
>
>   8. c4t1d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
>      /pci@0,0/pci1462,7720@11/disk@1,0
>   9. c4t2d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
>      /pci@0,0/pci1462,7720@11/disk@2,0

Setting sync=disabled means your log SSDs (slogs) will not be used.
 -- richard

--
Richard.Elling at RichardElling.com
+1-760-896-4422
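[The setting is easy to confirm and revert; sync is a per-dataset property,
so it can also be left at standard on the pool and disabled only for
datasets where losing the last few seconds of writes is acceptable.
A minimal sketch:

# zfs get sync tank0
# zfs set sync=standard tank0

With sync=standard, synchronous writes (NFS COMMITs, for example) land on
the mirrored slog; with sync=disabled they are acknowledged from RAM and
the slog sits idle.]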
On Thu, Jan 03, 2013 at 12:44:26PM -0800, Richard Elling wrote:
>
> On Jan 3, 2013, at 12:33 PM, Eugen Leitl <eugen at leitl.org> wrote:
>
>> I've set the pool sync to disabled, and added a couple of
>>
>>   8. c4t1d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
>>      /pci@0,0/pci1462,7720@11/disk@1,0
>>   9. c4t2d0 <ATA-INTELSSDSA2M080-02G9 cyl 11710 alt 2 hd 224 sec 56>
>>      /pci@0,0/pci1462,7720@11/disk@2,0
>
> Setting sync=disabled means your log SSDs (slogs) will not be used.
> -- richard

Whoops. Set it back to sync=standard. Will rerun the bonnie++ once the
scrub finishes, and post the results.
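[For the rerun, bonnie++ wants a file size of at least twice RAM so the ARC
cannot hide the disks; the 15576M used earlier already satisfies that for
8 GByte of memory. A sketch, with the target directory as an assumption:

# bonnie++ -d /tank0/bench -s 15576 -u root

Running it once over NFS from the Ubuntu client as well would separate the
pool's behavior from the network path.]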
Eugen,

Be aware that p0 corresponds to the entire disk, regardless of how it is
partitioned with fdisk. The fdisk partitions are p1 - p4. By using p0 for
the log and p1 for the cache, you could very well be writing to the same
location on the SSD and corrupting things.

Personally, I'd recommend putting a standard Solaris fdisk partition on the
drive and creating the two slices under that.

 -Phil

On Jan 3, 2013, at 2:33 PM, Eugen Leitl wrote:

> I had no clue what the partition names (created with the napp-it web
> interface, a la 5% log and 95% cache, of 80 GByte) were [...]
>
> # zpool add tank0 cache /dev/dsk/c4t1d0p1 /dev/dsk/c4t2d0p1
> # zpool add tank0 log mirror /dev/dsk/c4t1d0p0 /dev/dsk/c4t2d0p0
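[Whether p0 and p1 overlap the area already in use can be checked by
printing the label; a sketch (s2, the backup slice, conventionally covers
the whole Solaris partition):

# prtvtoc /dev/rdsk/c4t1d0s2

Comparing the slice start/end sectors with what the pool was given as p0
and p1 shows immediately whether two vdevs share the same blocks.]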
On Thu, Jan 03, 2013 at 03:21:33PM -0600, Phillip Wagstrom wrote:
> Eugen,
>
> Be aware that p0 corresponds to the entire disk, regardless of how it is
> partitioned with fdisk. The fdisk partitions are 1 - 4. By using p0 for
> log and p1 for cache, you could very well be writing to same location on
> the SSD and corrupting things.

My partitions are like this:

partition> print
Current partition table (original):
Total disk cylinders available: 496 + 2 (reserved cylinders)

Part      Tag    Flag     Cylinders        Size            Blocks
  0 unassigned    wm       0               0         (0/0/0)             0
  1 unassigned    wm       0               0         (0/0/0)             0
  2     backup    wu       0 - 11709      70.04GB    (11710/0/0) 146890240
  3 unassigned    wm       0               0         (0/0/0)             0
  4 unassigned    wm       0               0         (0/0/0)             0
  5 unassigned    wm       0               0         (0/0/0)             0
  6 unassigned    wm       0               0         (0/0/0)             0
  7 unassigned    wm       0               0         (0/0/0)             0
  8       boot    wu       0 -     0       6.12MB    (1/0/0)         12544
  9 unassigned    wm       0               0         (0/0/0)             0

Am I writing to the same location?

> Personally, I'd recommend putting a standard Solaris fdisk partition on
> the drive and creating the two slices under that.

Which command invocations would you use to do that, under OpenIndiana?
On Jan 3, 2013, at 3:33 PM, Eugen Leitl wrote:

> My partitions are like this:
>
> partition> print
> [...]
>
> Am I writing to the same location?

Okay. The above are the slices within the Solaris fdisk partition. These
would be the "s0" part of "c0t0d0s0", and they are modified via format
under "partition". p1 through p4 refer to the x86 fdisk partitions, which
are administered with the fdisk command, or reached from format via
"fdisk".

>> Personally, I'd recommend putting a standard Solaris fdisk partition on
>> the drive and creating the two slices under that.
>
> Which command invocations would you use to do that, under OpenIndiana?

format -> partition, then set the size of each slice there.

 -Phil
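[Once the slices exist, the log and cache can be attached by slice rather
than by fdisk partition; a sketch, assuming slice 0 was sized for the slog
and slice 1 for the cache on both SSDs (the slice numbers are assumptions,
not what napp-it created):

# zpool add tank0 log mirror c4t1d0s0 c4t2d0s0
# zpool add tank0 cache c4t1d0s1 c4t2d0s1

Because slices have distinct start/end cylinders in the VTOC, this avoids
the p0 (whole disk) versus p1 overlap described above.]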
On Thu, Jan 03, 2013 at 03:44:54PM -0600, Phillip Wagstrom wrote:
>
> Okay. The above are the slices within the Solaris fdisk partition.
> These would be the "s0" part of "c0t0d0s0", and they are modified via
> format under "partition". p1 through p4 refer to the x86 fdisk
> partitions, which are administered with the fdisk command, or reached
> from format via "fdisk".
>
>> Which command invocations would you use to do that, under OpenIndiana?
>
> format -> partition, then set the size of each slice there.

Thanks. Apparently, the napp-it web interface did not do what I asked it to
do. I'll try to remove the cache and the log devices from the pool and redo
it from the command line.
Free advice is cheap...

I personally don't see the advantage of caching reads and logging writes to
the same devices. (Is this recommended?)

If this pool is serving CIFS/NFS, I would recommend testing for best
performance with a mirrored log device first, without a separate cache
device:

# zpool add tank0 log mirror c4t1d0 c4t2d0

Thanks,

Cindy

On 01/03/13 14:21, Phillip Wagstrom wrote:
> Be aware that p0 corresponds to the entire disk, regardless of how it is
> partitioned with fdisk. The fdisk partitions are 1 - 4. By using p0 for
> log and p1 for cache, you could very well be writing to same location on
> the SSD and corrupting things.
> Personally, I'd recommend putting a standard Solaris fdisk partition on
> the drive and creating the two slices under that.
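[Whether a mirrored slog actually helps the NFS workload can be checked
from the Ubuntu client with a small synchronous-write test, run once with
and once without the log devices attached; the mount point, file name and
size below are placeholders:

$ dd if=/dev/zero of=/mnt/tank0/syncwrite.bin bs=4k count=20000 oflag=dsync

Synchronous small writes are where a slog shows up; large sequential reads
do not benefit from it, which is why the read-throughput complaint in this
thread points at the network rather than at the pool layout.]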
> Thanks. Apparently, the napp-it web interface did not do what I asked it
> to do. I'll try to remove the cache and the log devices from the pool and
> redo it from the command line.

napp-it up to 0.8 does not support slices or partitions.

napp-it 0.9 supports partitions and offers partitioning with menu
disk-partitions.

You can reinitialize a disk with a missing or unwanted partition table with
menu disk-initialize.
On Thu, Jan 03, 2013 at 03:21:33PM -0600, Phillip Wagstrom wrote:
> Eugen,

Thanks, Phillip and others, most illuminating (pun intended).

> Be aware that p0 corresponds to the entire disk, regardless of how it is
> partitioned with fdisk. The fdisk partitions are 1 - 4. By using p0 for
> log and p1 for cache, you could very well be writing to same location on
> the SSD and corrupting things.

Does this mean that with

Part      Tag    Flag     Cylinders         Size            Blocks
  0 unassigned    wm       0 -   668        4.00GB    (669/0/0)     8391936
  1 unassigned    wm     669 - 12455       70.50GB    (11787/0/0) 147856128
  2     backup    wu       0 - 12456       74.51GB    (12457/0/0) 156260608
  3 unassigned    wm       0                0         (0/0/0)             0
  4 unassigned    wm       0                0         (0/0/0)             0
  5 unassigned    wm       0                0         (0/0/0)             0
  6 unassigned    wm       0                0         (0/0/0)             0
  7 unassigned    wm       0                0         (0/0/0)             0
  8       boot    wu       0 -     0        6.12MB    (1/0/0)         12544
  9 unassigned    wm       0                0         (0/0/0)             0

the names

/dev/dsk/c4t1d0p0
/dev/dsk/c4t2d0p0

mean the whole disk? I thought the backup partition would be that, and
that's p2?

> Personally, I'd recommend putting a standard Solaris fdisk partition on
> the drive and creating the two slices under that.

Can you please give me the rundown of commands for that? I seem to
partition a Solaris disk every decade or so, so I have no idea what I'm
doing.

I've undone the earlier setup with

# zpool remove tank0 /dev/dsk/c4t1d0p1 /dev/dsk/c4t2d0p1
# zpool remove tank0 mirror-1

so the pool is back to mice and pumpkins:

  pool: tank0
 state: ONLINE
  scan: scrub in progress since Fri Jan  4 16:55:12 2013
    773G scanned out of 3.49T at 187M/s, 4h15m to go
    0 repaired, 21.62% done
config:

        NAME                       STATE     READ WRITE CKSUM
        tank0                      ONLINE       0     0     0
          raidz3-0                 ONLINE       0     0     0
            c3t5000C500098BE9DDd0  ONLINE       0     0     0
            c3t5000C50009C72C48d0  ONLINE       0     0     0
            c3t5000C50009C73968d0  ONLINE       0     0     0
            c3t5000C5000FD2E794d0  ONLINE       0     0     0
            c3t5000C5000FD37075d0  ONLINE       0     0     0
            c3t5000C5000FD39D53d0  ONLINE       0     0     0
            c3t5000C5000FD3BC10d0  ONLINE       0     0     0
            c3t5000C5000FD3E8A7d0  ONLINE       0     0     0

errors: No known data errors
> Personally, I'd recommend putting a standard Solaris fdisk
> partition on the drive and creating the two slices under that.

Why? In most cases giving ZFS an entire disk is the best option.
I wouldn't bother with any manual partitioning.

--
Robert Milkowski
http://milek.blogspot.com
On Fri, Jan 04, 2013 at 06:57:44PM -0000, Robert Milkowski wrote:
>
>> Personally, I'd recommend putting a standard Solaris fdisk
>> partition on the drive and creating the two slices under that.
>
> Why? In most cases giving ZFS an entire disk is the best option.
> I wouldn't bother with any manual partitioning.

Caches are OK, but the log needs a mirror, and I only have two SSDs.
If you're dedicating the disk to a single task (data, SLOG, L2ARC), then
absolutely. If you're splitting tasks and want to make a drive do two
things, like SLOG and L2ARC, then you have to partition it.

Some of the confusion here is between what is a traditional fdisk partition
(p1, p2, p3, p4, etc.) and what is a Solaris slice (s0 - s9), which lives
inside an fdisk partition on x86.

 -Phil

On Jan 4, 2013, at 12:57 PM, Robert Milkowski wrote:

>> Personally, I'd recommend putting a standard Solaris fdisk
>> partition on the drive and creating the two slices under that.
>
> Why? In most cases giving ZFS an entire disk is the best option.
> I wouldn't bother with any manual partitioning.
>
> --
> Robert Milkowski
> http://milek.blogspot.com