I'm using an FC flash drive as a cache device for one of my pools:
          zpool  add  pool-name  cache  device-name
and I'm running random IO tests to assess performance on a
snv-78 x86 system.
I have a set of threads each doing random reads to about 25% of
its own, previously written, large file ... a test run will read in 
about 20GB on a server with 2GB of RAM
using zpool iostat, I can see that the SSD device is being used
aggressively, and each time I run my random read test I find
better performance than the previous execution ... I also see my
SSD drive filling up more and more between runs
this behavior is what I expect, and the performance improvements
I see are quite good (4X improvement over 5 runs), but I'm getting
hung from time to time
after several successful runs of my test application, some run of
my test will be running fine, but at some point before it finishes,
I see that all IO to the pool has stopped, and, while I still can use
the system for other things, most operations that involve the pool
will also hang (e.g. a wc on a pool-based file will hang)
any of these hung processes seem to sleep in the kernel 
at an uninterruptible level, and will not die on a  kill -9  attempt
any attempt to shutdown will hang, and the only way I can recover
is to use the reboot -qnd command (I think that the -d option
is the key since it keeps the system from trying to sync before
reboot)
when I reboot, everything is fine again and I can continue testing
until I run into this problem again ... does anyone have any thoughts
on this issue ? ... thanks, Bill
 
 
This message posted from opensolaris.org
bill at cs.uml.edu said:
> I have a set of threads each doing random reads to about 25% of its own,
> previously written, large file ... a test run will read in about 20GB on a
> server with 2GB of RAM
> . . .
> after several successful runs of my test application, some run of my test
> will be running fine, but at some point before it finishes, I see that all IO
> to the pool has stopped, and, while I still can use the system for other
> things, most operations that involve the pool will also hang (e.g. a
> wc on a pool based file will hang)

Bill,

Unencumbered by full knowledge of the history of your project, I'll say
that I think you need more RAM. I've seen this behavior on a system
with 16GB RAM (and no SSD for cache), if heavy I/O goes on long enough.
If larger RAM is not feasible, or you don't have a 64-bit CPU, you could
try limiting the size of the ARC as well.

That's not to say you're not seeing some other issue, but 2GB for heavy
ZFS I/O seems a little on the small side, given my experience.

Regards,

Marion
Marion Hakanson wrote:
> Bill,
>
> Unencumbered by full knowledge of the history of your project, I'll say
> that I think you need more RAM. I've seen this behavior on a system
> with 16GB RAM (and no SSD for cache), if heavy I/O goes on long enough.
> If larger RAM is not feasible, or you don't have a 64-bit CPU, you could
> try limiting the size of the ARC as well.
>
> That's not to say you're not seeing some other issue, but 2GB for heavy
> ZFS I/O seems a little on the small side, given my experience.

If this is the case, you might try using arcstat to view ARC usage.
http://blogs.sun.com/realneel/entry/zfs_arc_statistics

 -- richard
Thanks Marion and richard,
but I've run these tests with much larger data sets
and have never had this kind of problem when no
cache device was involved

In fact, if I remove the SSD cache device from my
pool and run the tests, they seem to run with no issues
(except for some reduced performance, as I would expect)

the same SSD disk works perfectly as a separate ZIL device,
providing improved IO with synchronous writes on large test
runs of > 100GB

... Bill
On 17 January, 2008 - Bill Moloney sent me these 0,7K bytes:
> Thanks Marion and richard,
> but I've run these tests with much larger data sets
> and have never had this kind of problem when no
> cache device was involved
>
> In fact, if I remove the SSD cache device from my
> pool and run the tests, they seem to run with no issues
> (except for some reduced performance as I would expect)

My uneducated guess is that without the SSD, the disk performance is low
enough that you don't need that much memory. With the SSD, performance
goes up and so does memory usage due to caches. Limiting the ARC or
lowering the flush timeout might help.

/Tomas
-- 
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
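
For anyone wanting to try the "limit the ARC" suggestion made above, here is a minimal sketch that formats the /etc/system line for a chosen cap. It assumes the zfs_arc_max tunable from the Solaris tuning documentation is honored on this build, which nobody in this thread has confirmed for snv-78. The 512MB figure and the helper name arc_max_setting are purely illustrative, not a recommendation.

```python
def arc_max_setting(limit_bytes):
    """Return an /etc/system line that caps the ZFS ARC at limit_bytes.

    Assumption: the kernel reads the zfs:zfs_arc_max tunable at boot.
    """
    if limit_bytes <= 0:
        raise ValueError("ARC limit must be a positive number of bytes")
    # /etc/system accepts hexadecimal values, which are easier to eyeball.
    return "set zfs:zfs_arc_max = {:#x}".format(limit_bytes)


if __name__ == "__main__":
    # Hypothetical choice: cap the ARC at 512MB on the 2GB test server.
    print(arc_max_setting(512 * 1024 * 1024))
```

The printed line would go into /etc/system, and the setting only takes effect after a reboot.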
bill at cs.uml.edu wrote:
> Hi richard,
> using kstat -m zfs as you recommended produces some
> interesting results in the L2 category
>
> I can see the l2_size field increase immediately
> after doing a:
>     zpool add pool cache cache_device
> and the l2_hits value increase with each test
> run as the cache becomes more populated with data
> from the test set (l2_size grows from run to run,
> verifying what zpool iostat tells me about the
> used capacity of the cache device)
>
> this is all very cool, along with a consistent
> increase in performance between each test run, but
> I'm still getting these hangs from time to time, and
> I don't see anything in the kstat output that looks
> unusual when I get into this state
>
> ... Bill

I'd be more interested in looking for a severe memory shortfall.
If the ARC grows large, then this may be what is happening.
You can, of course, limit the ARC size. Using a tool like fenxi
(https://fenxi.dev.java.net/) can help you see what is happening
on a system-wide basis during your tests, as well as managing a
series of performance experiments.

 -- richard
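
The kstat checks discussed above can be scripted so the numbers are easy to compare from run to run. The sketch below parses text in the `kstat -p` format (module:instance:name:statistic, a tab, then the value) and reports the L2 hit ratio and the ARC size as a fraction of RAM. The l2_hits and l2_size names come from this thread; l2_misses and size are assumed to be present in the same arcstats output, and the sample values are made up for illustration.

```python
def parse_kstat(text):
    """Parse `kstat -p` style lines into a {statistic: integer value} dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        key, value = parts
        stat = key.rsplit(":", 1)[-1]  # keep only the statistic name
        try:
            stats[stat] = int(value)
        except ValueError:
            continue  # skip non-integer entries such as class or crtime
    return stats


def l2_hit_ratio(stats):
    """Fraction of L2 cache lookups that were hits (0.0 if none recorded)."""
    hits = stats.get("l2_hits", 0)
    total = hits + stats.get("l2_misses", 0)
    return hits / total if total else 0.0


def arc_fraction_of_ram(stats, ram_bytes):
    """ARC size as a fraction of physical memory."""
    return stats.get("size", 0) / ram_bytes


# Hypothetical sample, not captured from a real system.
SAMPLE = (
    "zfs:0:arcstats:class\tmisc\n"
    "zfs:0:arcstats:l2_hits\t300\n"
    "zfs:0:arcstats:l2_misses\t100\n"
    "zfs:0:arcstats:l2_size\t8589934592\n"
    "zfs:0:arcstats:size\t1073741824\n"
)

if __name__ == "__main__":
    stats = parse_kstat(SAMPLE)
    print("L2 hit ratio:", l2_hit_ratio(stats))
    print("ARC / RAM:", arc_fraction_of_ram(stats, 2 * 1024 ** 3))
```

On a live system the input text could come from `kstat -p zfs:0:arcstats`, captured before and after each test run. An ARC fraction that keeps climbing toward 1.0 on the 2GB server would point toward the memory shortfall suggested above.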