I'm seeing some odd behaviour with ZFS and a reasonably heavy workload. I'm currently on contract to BBC R&D to build what is effectively a network-based personal video recorder. To that end, I have a rather large collection of discs, arranged very poorly as it's something of a hack at present, and a T1000 capturing the data.

The data is about 20 separate streams of multicast content (anyone in the UK multicast peering with the BBC should be able to pick it up; mail me for details if you're interested), ranging from a couple of hundred Kb/s for the radio stations, via 6Mb/s or so for the standard-definition channels, up to 17Mb/s for the HD stuff. This little lot totals about 75Mb/s into the machine, which equates to about 9MB/s (75/8 ~= 9.4) to the media. Not exactly a lot.

We're using Sun's Solaris 10 rather than OpenSolaris for historical reasons. Specifically:

  SunOS cr0.kw.bbc.co.uk 5.10 Generic_118833-17 sun4v sparc SUNW,Sun-Fire-T1000

My problem is this: with 20 processes recording data to a single pool, everything is fine. Copying that data off again, via a separate network interface connected to a separate switch (so we can rule out network hardware being the problem), the read operations appear to cripple the writes, causing the inbound buffers to fill and packets to be dropped. This makes for unwatchable telly.

The data being compressed video, there isn't much point in attempting to compress it further at the filesystem level, so we're not.

My pool, granted, is badly configured, as follows. This is by no means final, and is something of an error: it's like this because we needed to get more drives onto the thing (it's a staging post; the fileservers are elsewhere) before we had the content stores up and running.

bash-3.00# zpool status -v
  pool: content
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        content     ONLINE       0     0     0
          c1t0d0    ONLINE       0     0     0
          c1t0d1    ONLINE       0     0     0
          c1t0d2    ONLINE       0     0     0
          c1t0d3    ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c2t0d0  ONLINE       0     0     0
            c2t0d1  ONLINE       0     0     0
            c2t0d2  ONLINE       0     0     0
            c2t0d3  ONLINE       0     0     0
            c2t0d4  ONLINE       0     0     0
            c2t0d5  ONLINE       0     0     0
            c2t0d6  ONLINE       0     0     0
            c2t0d7  ONLINE       0     0     0
          raidz     ONLINE       0     0     0
            c2t0d8  ONLINE       0     0     0
            c2t0d9  ONLINE       0     0     0
            c2t0d10 ONLINE       0     0     0
            c2t0d11 ONLINE       0     0     0
            c2t0d12 ONLINE       0     0     0
            c2t0d13 ONLINE       0     0     0
            c2t0d14 ONLINE       0     0     0
            c2t0d15 ONLINE       0     0     0

errors: No known data errors
bash-3.00#

where each of the drives is a 500GB SATA disc in a SATA <-> SCSI RAID device set to JBOD.
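For reference, a pool in this shape would have been put together along roughly these lines. This is a reconstruction from the status output above rather than the exact history, but it shows why the layout is mismatched: four plain discs striped at the top level, with two raidz vdevs bolted on afterwards:

  # four bare discs, no redundancy at all:
  zpool create content c1t0d0 c1t0d1 c1t0d2 c1t0d3

  # zpool refuses to mix raidz vdevs into a pool of plain discs
  # without -f, complaining about a mismatched replication level:
  zpool add -f content raidz c2t0d0 c2t0d1 c2t0d2 c2t0d3 \
                       c2t0d4 c2t0d5 c2t0d6 c2t0d7
  zpool add -f content raidz c2t0d8 c2t0d9 c2t0d10 c2t0d11 \
                       c2t0d12 c2t0d13 c2t0d14 c2t0d15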
iostat -x:

                        extended device statistics
device      r/s    w/s    kr/s    kw/s  wait  actv  svc_t  %w  %b
md0         0.0    0.0     0.0     0.0   0.0   0.0    0.0   0   0
md1         0.0    0.0     0.0     0.0   0.0   0.0    0.0   0   0
md10        0.0    0.0     0.0     0.0   0.0   0.0    0.0   0   0
md11        0.0    0.0     0.0     0.0   0.0   0.0    0.0   0   0
sd1         0.0    0.0     0.0     0.0   0.0   0.0    0.0   0   0
sd2         0.0   25.6     0.0  1052.7   0.0   0.1    4.7   0   3
sd3         0.2   24.4    12.8   923.7   0.0   0.1    4.9   0   3
sd4         0.0   13.0     0.0   880.2   0.0   0.1    8.3   0   3
sd17        0.8   10.4    51.1  1303.1   0.0   0.1   10.7   0   4
sd30       79.6    6.8  2669.4   103.4   0.0   2.0   23.7   0  32
sd34       90.4    6.8  2564.3   103.0   0.0   1.9   19.8   0  30
sd36       77.1    6.8  2675.7   103.8   0.0   1.9   23.0   0  30
sd39       96.0    6.8  2632.2   103.0   0.0   2.1   20.3   0  31
sd42       76.9    6.8  2666.9   103.4   0.0   2.0   23.5   0  31
sd46       95.6    6.8  2608.7   102.6   0.0   2.0   19.8   0  30
sd48       77.7    6.8  2648.5   103.4   0.0   2.0   23.6   0  31
sd51       94.0    6.8  2588.9   102.6   0.0   1.9   19.0   0  30
sd53        0.0    4.0     0.0   143.0   0.0   0.1   13.7   0   2
sd56        0.0    4.2     0.0   142.0   0.0   0.1   15.4   0   2
sd58        0.0    4.0     0.0   143.0   0.0   0.1   13.4   0   2
sd60        0.0    4.2     0.0   142.1   0.0   0.1   14.8   0   2
sd62        0.0    4.0     0.0   143.1   0.0   0.1   12.7   0   2
sd65        0.0    4.2     0.0   142.1   0.0   0.1   14.4   0   2
sd67        0.0    3.8     0.0   143.0   0.0   0.1   14.3   0   2
sd70        0.0    4.2     0.0   142.0   0.0   0.1   15.2   0   2

Each recording process takes about 0.4% CPU according to prstat. This is an 8-core T1000. The reading process (Perl) chews 3.5% or so, which I make to be one thread plus a bit that can be parallelised automatically. The reader appears to be CPU-bound, which also concerns me; when we unbind it, I'm expecting this problem to get worse. (Exact monitoring invocations are in the postscript below.)

My questions are:

  * Is this what I should expect?
  * Why? I'd've thought the extensive caching the filesystem does would sort this out for me.
  * Is there any way around it that doesn't involve editing the code?

Thank you for your time.

-- 
Dickon Hood

Due to digital rights management, my .sig is temporarily unavailable.
Normal service will be resumed as soon as possible.  We apologise for
the inconvenience in the meantime.

No virus was found in this outgoing message as I didn't bother looking.
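P.S. For anyone wanting to watch the same thing on their own kit: the figures above came from plain iostat and prstat, but a rolling per-vdev view of the pool and a per-thread breakdown of the reader are easy enough to get as well. These are standard Solaris 10 invocations; substitute your own pool name and PID:

  # per-vdev bandwidth and operations for the pool, sampled every 5s:
  zpool iostat -v content 5

  # per-LWP microstate accounting for the reader, to see whether it's
  # burning CPU (USR/SYS columns) or waiting on I/O (SLP):
  prstat -mL -p <pid> 5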