Nathan Kroenert
2007-Aug-27 00:12 UTC
[zfs-discuss] NV_65 AMD64 - ZFS seems to write fast and slow to a single spindle
Hey all -
Just saw something really weird.
I have been playing with by box for a little while now, and just noticed
something whilst checking how fast / slow my IDE ports were on a newish
motherboard...
I had been copying around an image. Not a particularly large one - 500M
ISO...
I had been observing the read speed off disk, and write speed to disk.
When reading from one disk and writing to another, I was seeing about
60MB/s and all was as expected.
But, then, I thought I''d do one more run, and copied the *same* image
as
my last run...
Of course, the image was in memory, so I expected there would be no
reads and lots of writes. What I saw was lots of not very impressive
speed (cmdk1 is the target):
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
cmdk0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
cmdk1 0.0 35.3 0.0 4514.1 32.4 2.0 975.3 100 100
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
cmdk0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
cmdk1 0.0 36.7 0.0 4697.6 32.4 2.0 936.8 100 100
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
cmdk0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 0
cmdk1 0.0 36.8 0.0 4650.6 32.4 2.0 935.1 100 100
extended device statistics
device r/s w/s kr/s kw/s wait actv svc_t %w %b
cmdk0 0.0 0.0 0.0 0.0 0.1 0.0 0.0 0 0
cmdk1 0.0 37.5 0.0 4424.9 32.3 2.0 913.8 100 100
So my target disk, which is owned exclusively by ZFS, was apparently
flat out writing at 4.4MB/s.
At the time it goes bad, the svc_t jumps from about 125ms to 950ms.
Ouch!!
On closer inspection, I see that -
- The cp returns almost immediately. (somewhat expected)
- ZFS starts writing at about 60MB/s, but only for about 2 seconds
(This is changable. Sometimes, it writes the whole image at the
slower rate.)
- the write rate drops back to 4 - 5MB/s
- CPU usage is only 8%
- I still have 1.5GB of 4GB free memory (Though I *am* running Xen at
the moment. Not sure if that matters)
- If I kick off a second copy to a different filename whilst the first
is running it does not get any faster.
- If I kick off a write to a raw zvol on the same pool, the write rate
to the disk jumps back up to the expected 60MB/s, but drops again as
soon as it''s completed the write to the raw zvol... So, it seems
it''s
not the disk itself.
- The zpool *has* been exported and imported this boot. Not sure if
that matters either.
- I had a hunch that memory availability might be playing a part, so I
forced a whole heap to be freed up with a honking big malloc and walk of
the pages. I freed up 3GB (box has 4GB total) and it seems that I start
to see the problem much more frequently as I get to about 1.5GB free.
- It''s not entirely predictable. Sometimes, it''ll write at
50-60MB/s
for up to 8 or so seconds, and others, it''ll only write fast for a
burst
right at the start, then take quite some time to write out the rest.
It''s almost as if we are being throttled on the rate at which we can
push data through the ZFS in-memory cache when writing previously read
and written data. Or something equally bogus like me expecting that ZFS
would write as fast as it can all the time, which I guess might be an
invalid assumption?
Now: This is running NV_65 with the Xen bits from back then. Not sure if
that really matters.
Does not seem that the disk is having problem -
beaker:/disk2/crap # zpool status
pool: disk2
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
disk2 ONLINE 0 0 0
c3d0 ONLINE 0 0 0
errors: No known data errors
pool: zfs
state: ONLINE
scrub: none requested
config:
NAME STATE READ WRITE CKSUM
zfs ONLINE 0 0 0
c1d0s7 ONLINE 0 0 0
errors: No known data errors
c1d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST3320620AS Revision: Serial No: 6QF Size:
320.07GB <320070352896 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c3d0 Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Model: ST3320620AS Revision: Serial No: 6QF Size:
320.07GB <320070352896 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0
c0t1d0 Soft Errors: 4 Hard Errors: 302 Transport Errors: 0
If anyone within SWAN on the ZFS team wanted to take a look at this box
and see if it''s a new bug or just me being a bonehead and not
understanding what I''m seeing, please respond to me directly, and I can
provide access. (I''ll make an effort not to reboot the box just in case
it''s only this boot that sees the problems.)
Nathan. :)