Paul Kraus
2012-Jun-04 18:09 UTC
[zfs-discuss] Question about how resilver works (zpool version 22)
Gathered knowledge,

I have a moderately large version 22 zpool. `zpool list` reports 75 TB, and it is all raidZ2, made up of 22 vdevs of 5 x 750 GB drives each. We take snapshots hourly and keep them for 5 weeks for operational backups (Disaster Recovery backups are via zfs send | zfs recv to another physical system). `zpool list` reports 44 TB allocated.

We had a drive fail, the hot spare stepped in, and as soon as we had a replacement from Oracle we `zpool replace`d the failed drive (the layout ensures that each vdev has one drive in each of five J4400s, so we can lose up to 2 of the J4400s and not lose data). The resilver ran for days and hit 100% done but kept going. It has now been running for over two weeks and is still going. Each 750 GB drive involved reports over 4 TB resilvered! I have seen this before, but not to this extent.

Now for my questions:

1) I assume the percent done covers the resilver of the base zpool and datasets but does not include snapshots. This would mean that once we hit 100% the _current_ data has been resilvered and it is now working on the snapshots. Is that correct?

2) Is the resilver operation walking through all the data in all of the snapshots? If so, then I should be able to estimate total completion time by taking the time to get to 100% and multiplying by the number of snapshots (assuming all snapshots are about the same size).

I know there are "fixes" for this in later zpool versions, but we are stuck at 22 for right now.

NOTE: Since the snapshots are our backups we really can't disable them. If we do, we run into a different zpool 22 issue where the amount of RAM needed to destroy a large snapshot is more than we have. This is also fixed in zpool 26.
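For anyone who wants to check the numbers, here is a rough Python sketch of the arithmetic above. It is only a back-of-the-envelope calculation: the `estimate_total_days` helper and the 3-day figure are hypothetical placeholders (the real time to reach 100% was only "days"), and the multiply-by-snapshot-count rule is the unconfirmed assumption from question 2, not known resilver behavior.

```python
# Back-of-the-envelope arithmetic for the pool described in this post.
VDEVS = 22
DRIVES_PER_VDEV = 5
DRIVE_BYTES = 750 * 10**9  # 750 GB, assuming decimal (vendor) gigabytes

# Raw capacity across all drives, parity included.
raw_bytes = VDEVS * DRIVES_PER_VDEV * DRIVE_BYTES
raw_tib = raw_bytes / 2**40

# Hourly snapshots kept for 5 weeks.
SNAPSHOTS_PER_DAY = 24
RETENTION_DAYS = 5 * 7
snapshot_count = SNAPSHOTS_PER_DAY * RETENTION_DAYS


def estimate_total_days(days_to_100_percent, snapshots):
    """Estimate under the question 2 hypothesis (unconfirmed):
    total time = time to reach 100% multiplied by the number of snapshots."""
    return days_to_100_percent * snapshots


print(f"raw capacity: {raw_tib:.1f} TiB")
print(f"snapshots retained: {snapshot_count}")
# 3 days is a made-up placeholder, not a measured value.
print(f"estimated total: {estimate_total_days(3, snapshot_count)} days")
```

The raw figure comes out to about 75 when counted in binary units, which is consistent with the 75 TB that `zpool list` reports if that number includes parity. The snapshot count comes out to 840, so under the question 2 hypothesis the estimate scales very quickly with the time taken to reach 100%.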
--
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Assistant Technical Director, LoneStarCon 3 ( http://lonestarcon3.org/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, Troy Civic Theatre Company
-> Technical Advisor, RPI Players