Joe Little
2006-Aug-24 14:07 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
We finally flipped the switch on one of our ZFS-based servers, with approximately 1TB of 2.8TB in use (3 stripes of 950GB or so, each of which is a RAID5 volume on the Adaptec card). We have taken snapshots every 4 hours for the first few days. If you add up the snapshot references it appears somewhat high versus daily use (mostly mailboxes, spam, etc. changing), but say an aggregate of no more than 400+MB a day.

However, zfs list shows our pool as a whole growing by .01TB per day, or more specifically 80GB a day. That's a far cry from the 400MB we can account for. Is it possible that metadata, ditto blocks, or the like is truly growing that rapidly? By our calculations, we will triple our disk space (sitting still) in 6 months and use up the remaining 1.7TB. Of course, this is only with 2-3 days of churn, but it's an alarming rate; before, on the NetApp, we didn't see anything close to this.
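For illustration, a minimal sketch of capturing that day-over-day growth from cron; the pool name "tank" and the log path are assumptions:

#!/bin/sh
# Sketch: append a timestamped usage report once a day so day-over-day
# growth can be compared against the per-snapshot numbers later.
# Pool name "tank" and the log path are assumptions.
LOG=/var/adm/zfs-usage.log
{
    date
    zpool list tank
    zfs list -o name,used,referenced -r tank
    echo
} >> $LOG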
Matthew Ahrens
2006-Aug-24 17:07 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
On Thu, Aug 24, 2006 at 07:07:45AM -0700, Joe Little wrote:
> We finally flipped the switch on one of our ZFS-based servers, with
> approximately 1TB of 2.8TB in use (3 stripes of 950GB or so, each of
> which is a RAID5 volume on the Adaptec card). We have taken snapshots
> every 4 hours for the first few days. If you add up the snapshot
> references it appears somewhat high versus daily use (mostly mailboxes,
> spam, etc. changing), but say an aggregate of no more than 400+MB a day.
>
> However, zfs list shows our pool as a whole growing by .01TB per day,
> or more specifically 80GB a day. That's a far cry from the 400MB we can
> account for. Is it possible that metadata, ditto blocks, or the like is
> truly growing that rapidly? By our calculations, we will triple our
> disk space (sitting still) in 6 months and use up the remaining 1.7TB.
> Of course, this is only with 2-3 days of churn, but it's an alarming
> rate; before, on the NetApp, we didn't see anything close to this.

How are you calculating this 400MB/day figure? Keep in mind that the space "used" by each snapshot is the amount of space unique to that snapshot. Adding up the space "used" by all your snapshots is *not* the amount of space that they are all taking up cumulatively. For leaf filesystems (those with no descendants), you can calculate the space used by all snapshots as (fs's "used" - fs's "referenced").

How many filesystems do you have? Can you send me the output of 'zfs list' and 'zfs get -r all <pool>'?

How much space did you expect to be using, and what data is that based on? Are you sure you aren't writing 80GB/day to your pool?

--matt
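For a leaf filesystem, that (used - referenced) arithmetic can be scripted; a rough sketch, assuming the pool is named "tank" and that 'zfs get -p' (exact byte values) is available:

#!/bin/sh
# Sketch: print used - referenced for each filesystem in the pool.  For a
# leaf filesystem this difference is the total space held by its snapshots.
# Pool name "tank" is an assumption.
for fs in `zfs list -H -t filesystem -o name -r tank`
do
    used=`zfs get -Hp -o value used $fs`
    refer=`zfs get -Hp -o value referenced $fs`
    snap=`echo "$used $refer" | awk '{print $1 - $2}'`
    echo "$fs: $snap bytes held by snapshots"
done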
Joe Little
2006-Aug-24 21:21 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
On 8/24/06, Matthew Ahrens <ahrens at eng.sun.com> wrote:
> On Thu, Aug 24, 2006 at 07:07:45AM -0700, Joe Little wrote:
> > We finally flipped the switch on one of our ZFS-based servers, with
> > approximately 1TB of 2.8TB in use (3 stripes of 950GB or so, each of
> > which is a RAID5 volume on the Adaptec card). We have taken snapshots
> > every 4 hours for the first few days. If you add up the snapshot
> > references it appears somewhat high versus daily use (mostly
> > mailboxes, spam, etc. changing), but say an aggregate of no more than
> > 400+MB a day.
> >
> > However, zfs list shows our pool as a whole growing by .01TB per day,
> > or more specifically 80GB a day. That's a far cry from the 400MB we
> > can account for. Is it possible that metadata, ditto blocks, or the
> > like is truly growing that rapidly? By our calculations, we will
> > triple our disk space (sitting still) in 6 months and use up the
> > remaining 1.7TB. Of course, this is only with 2-3 days of churn, but
> > it's an alarming rate; before, on the NetApp, we didn't see anything
> > close to this.
>
> How are you calculating this 400MB/day figure? Keep in mind that the
> space "used" by each snapshot is the amount of space unique to that
> snapshot. Adding up the space "used" by all your snapshots is *not*
> the amount of space that they are all taking up cumulatively. For leaf
> filesystems (those with no descendants), you can calculate the space
> used by all snapshots as (fs's "used" - fs's "referenced").
>
> How many filesystems do you have? Can you send me the output of 'zfs
> list' and 'zfs get -r all <pool>'?
>
> How much space did you expect to be using, and what data is that based
> on? Are you sure you aren't writing 80GB/day to your pool?
>
> --matt
>

Well, by deleting my 4-hourlies I reclaimed most of the space. To answer some of the questions: it's about 15 filesystems (descendants included). I'm aware that the space used by snapshots overlaps. I was looking at the total space (as zpool iostat reports it) and taking the difference per day. The 400MB/day figure was by inspection and by looking at our nominal growth on a NetApp.

It would appear that if one takes many snapshots, there is an initial quick growth in disk usage, but once those snapshots reach their retention level (say 12), the growth would appear to match our typical 400MB/day. Time will prove this one way or the other. By simply getting rid of the hourly snapshots and collapsing to two days' worth of dailies, I reverted to only ~1-2GB of total growth, which is much more in line with expectations.

For various reasons, I can't post the zfs list type results as yet. I'll need to get the OK for that first. Sorry.
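One way to see which retention points are the expensive ones is to sort the snapshots by the space unique to each; a sketch, with the pool name "tank" assumed:

# Sketch: list every snapshot, sorted by the space unique to it (largest
# last).  Pool name "tank" is an assumption.
zfs list -r -t snapshot -o name,used,referenced -s used tank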
Matthew Ahrens
2006-Aug-24 21:28 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
On Thu, Aug 24, 2006 at 02:21:33PM -0700, Joe Little wrote:
> Well, by deleting my 4-hourlies I reclaimed most of the space. To
> answer some of the questions: it's about 15 filesystems (descendants
> included). I'm aware that the space used by snapshots overlaps. I was
> looking at the total space (as zpool iostat reports it) and taking the
> difference per day. The 400MB/day figure was by inspection and by
> looking at our nominal growth on a NetApp.
>
> It would appear that if one takes many snapshots, there is an initial
> quick growth in disk usage, but once those snapshots reach their
> retention level (say 12), the growth would appear to match our typical
> 400MB/day. Time will prove this one way or the other. By simply
> getting rid of the hourly snapshots and collapsing to two days' worth
> of dailies, I reverted to only ~1-2GB of total growth, which is much
> more in line with expectations.

OK, so it sounds like there is no problem here, right? You were taking snapshots every 4 hours, which took up no more space than was needed, but more than you would like (and more than daily snapshots would). With daily snapshots, the space usage is in line with daily snapshots on the NetApp.

> For various reasons, I can't post the zfs list type results as yet.
> I'll need to get the OK for that first. Sorry.

It sounds like there is no problem here, so no need to post the output.

--matt
Joe Little
2006-Aug-25 03:26 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
On 8/24/06, Matthew Ahrens <ahrens at eng.sun.com> wrote:
> On Thu, Aug 24, 2006 at 02:21:33PM -0700, Joe Little wrote:
> > Well, by deleting my 4-hourlies I reclaimed most of the space. To
> > answer some of the questions: it's about 15 filesystems (descendants
> > included). I'm aware that the space used by snapshots overlaps. I was
> > looking at the total space (as zpool iostat reports it) and taking
> > the difference per day. The 400MB/day figure was by inspection and by
> > looking at our nominal growth on a NetApp.
> >
> > It would appear that if one takes many snapshots, there is an initial
> > quick growth in disk usage, but once those snapshots reach their
> > retention level (say 12), the growth would appear to match our
> > typical 400MB/day. Time will prove this one way or the other. By
> > simply getting rid of the hourly snapshots and collapsing to two
> > days' worth of dailies, I reverted to only ~1-2GB of total growth,
> > which is much more in line with expectations.
>
> OK, so it sounds like there is no problem here, right? You were taking
> snapshots every 4 hours, which took up no more space than was needed,
> but more than you would like (and more than daily snapshots would).
> With daily snapshots, the space usage is in line with daily snapshots
> on the NetApp.
>
> > For various reasons, I can't post the zfs list type results as yet.
> > I'll need to get the OK for that first. Sorry.
>
> It sounds like there is no problem here, so no need to post the output.
>
> --matt
>

Hi. The NetApp had the same snapshot schedule and didn't show the same growth in storage beyond the base use. Again, I do suspect that ZFS snapshots (or perhaps the metadata) may be somewhat fatter overall than one would expect. Perhaps it's the difference in default block sizing. It was an alarming rate until I suspected it was all in the snapshot overhead and reduced the overall number.
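If block sizing is the suspect, the record size can be checked and, for newly written data, tuned per filesystem; a sketch, where the filesystem name "tank/mail" is hypothetical:

# Sketch: inspect the current record size and set a smaller one for a
# filesystem dominated by small, frequently rewritten files.  The name
# "tank/mail" is hypothetical; the change affects newly written blocks only.
zfs get recordsize tank/mail
zfs set recordsize=8K tank/mail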
Rob Logan
2006-Aug-26 19:20 UTC
[zfs-discuss] unaccounted for daily growth in ZFS disk space usage
> For various reasons, I can't post the zfs list type

Here is one, and it seems in line with expected NetApp(tm) type usage, considering the "cluster" size differences.

14 % cat snap_sched
#!/bin/sh
snaps=15
for fs in `echo Videos Movies Music users local`
do
    i=$snaps
    zfs destroy zfs/$fs@$i
    while [ $i -gt 1 ] ; do
        i=`expr $i - 1`
        zfs rename zfs/$fs@$i zfs/$fs@`expr $i + 1`
    done
    zfs snapshot zfs/$fs@$i
done
day=`date +%j`
nuke=`expr $day - 181`
if [ $nuke -lt 0 ] ; then
    nuke=`expr 365 + $nuke`
fi
zfs destroy zfs/backup@$nuke
zfs snapshot zfs/backup@$day
zfs list -H zfs

15 % zfs list
NAME                USED  AVAIL  REFER  MOUNTPOINT
zfs                1.71T   592G    75K  /zfs
zfs@root             57K      -    69K  -
zfs/Movies         1.36T   592G  1.36T  /zfs/Movies
zfs/Movies@15       238K      -  1.25T  -
zfs/Movies@14        62K      -  1.27T  -
zfs/Movies@13        66K      -  1.27T  -
zfs/Movies@12        56K      -  1.27T  -
zfs/Movies@11        48K      -  1.27T  -
zfs/Movies@10        48K      -  1.27T  -
zfs/Movies@9        132K      -  1.30T  -
zfs/Movies@8           0      -  1.33T  -
zfs/Movies@7           0      -  1.33T  -
zfs/Movies@6        188K      -  1.33T  -
zfs/Movies@5        178K      -  1.33T  -
zfs/Movies@4           0      -  1.35T  -
zfs/Movies@3           0      -  1.35T  -
zfs/Movies@2           0      -  1.35T  -
zfs/Movies@1        154K      -  1.36T  -
zfs/Music          6.96G   592G  6.96G  /zfs/Music
zfs/Music@15           0      -  6.96G  -
zfs/Music@14           0      -  6.96G  -
zfs/Music@13           0      -  6.96G  -
zfs/Music@12           0      -  6.96G  -
zfs/Music@11           0      -  6.96G  -
zfs/Music@10           0      -  6.96G  -
zfs/Music@9            0      -  6.96G  -
zfs/Music@8            0      -  6.96G  -
zfs/Music@7            0      -  6.96G  -
zfs/Music@6            0      -  6.96G  -
zfs/Music@5          45K      -  6.96G  -
zfs/Music@4            0      -  6.96G  -
zfs/Music@3            0      -  6.96G  -
zfs/Music@2            0      -  6.96G  -
zfs/Music@1            0      -  6.96G  -
zfs/Videos          157G   592G   157G  /zfs/Videos
zfs/Videos@15          0      -   156G  -
zfs/Videos@14          0      -   156G  -
zfs/Videos@13        50K      -   156G  -
zfs/Videos@12          0      -   156G  -
zfs/Videos@11          0      -   156G  -
zfs/Videos@10          0      -   156G  -
zfs/Videos@9        146K      -   157G  -
zfs/Videos@8           0      -   157G  -
zfs/Videos@7           0      -   157G  -
zfs/Videos@6           0      -   157G  -
zfs/Videos@5         54K      -   157G  -
zfs/Videos@4           0      -   157G  -
zfs/Videos@3           0      -   157G  -
zfs/Videos@2           0      -   157G  -
zfs/Videos@1           0      -   157G  -
zfs/backup          172G   592G   131G  /zfs/backup
zfs/backup@166      341M      -   140G  -
zfs/backup@167      295M      -   140G  -
zfs/backup@168      265M      -   140G  -
zfs/backup@169      236M      -   140G  -
zfs/backup@170      247M      -   140G  -
zfs/backup@171      288M      -   140G  -
zfs/backup@172      251M      -   140G  -
zfs/backup@173      268M      -   141G  -
zfs/backup@174      260M      -   141G  -
zfs/backup@175      201M      -   141G  -
zfs/backup@176      284M      -   141G  -
zfs/backup@177      316M      -   141G  -
zfs/backup@178      309M      -   141G  -
zfs/backup@179      289M      -   141G  -
zfs/backup@180      252M      -   141G  -
zfs/backup@181      269M      -   141G  -
zfs/backup@182      268M      -   141G  -
zfs/backup@183      220M      -   141G  -
zfs/backup@184      241M      -   141G  -
zfs/backup@185      242M      -   141G  -
zfs/backup@186     11.9M      -   141G  -
zfs/backup@187     9.59M      -   141G  -
zfs/backup@188      266M      -   142G  -
zfs/backup@189      241M      -   142G  -
zfs/backup@190      259M      -   142G  -
zfs/backup@191      274M      -   143G  -
zfs/backup@192      254M      -   141G  -
zfs/backup@193      257M      -   141G  -
zfs/backup@194      261M      -   141G  -
zfs/backup@195      274M      -   141G  -
zfs/backup@196      220M      -   141G  -
zfs/backup@197      188M      -   141G  -
zfs/backup@198      234M      -   141G  -
zfs/backup@205      294M      -   129G  -
zfs/backup@206      260M      -   129G  -
zfs/backup@207      244M      -   129G  -
zfs/backup@208      173M      -   129G  -
zfs/backup@209      145M      -   129G  -
zfs/backup@210      225M      -   129G  -
zfs/backup@211      235M      -   130G  -
zfs/backup@212      249M      -   130G  -
zfs/backup@213      260M      -   130G  -
zfs/backup@214      252M      -   130G  -
zfs/backup@215      280M      -   131G  -
zfs/backup@216      277M      -   131G  -
zfs/backup@217      259M      -   129G  -
zfs/backup@218      253M      -   129G  -
zfs/backup@219      239M      -   129G  -
zfs/backup@220      260M      -   130G  -
zfs/backup@221      311M      -   130G  -
zfs/backup@222      256M      -   130G  -
zfs/backup@223      255M      -   130G  -
zfs/backup@224      262M      -   130G  -
zfs/backup@225      213M      -   131G  -
zfs/backup@226      258M      -   131G  -
zfs/backup@227      303M      -   130G  -
zfs/backup@228      258M      -   130G  -
zfs/backup@229      261M      -   131G  -
zfs/backup@230      345M      -   131G  -
zfs/backup@231      266M      -   131G  -
zfs/backup@232      232M      -   131G  -
zfs/backup@233      248M      -   131G  -
zfs/backup@234      540M      -   131G  -
zfs/backup@235      240M      -   131G  -
zfs/backup@237      260M      -   131G  -
zfs/local          1.80G   592G  1.80G  /zfs/local
zfs/local@15        439K      -  1.80G  -
zfs/local@14        439K      -  1.80G  -
zfs/local@13        429K      -  1.80G  -
zfs/local@12        429K      -  1.80G  -
zfs/local@11        477K      -  1.80G  -
zfs/local@10        411K      -  1.80G  -
zfs/local@9         258K      -  1.80G  -
zfs/local@8         248K      -  1.80G  -
zfs/local@7         248K      -  1.80G  -
zfs/local@6         258K      -  1.80G  -
zfs/local@5         251K      -  1.80G  -
zfs/local@4         251K      -  1.80G  -
zfs/local@3         251K      -  1.80G  -
zfs/local@2         310K      -  1.80G  -
zfs/local@1         681K      -  1.80G  -
zfs/users          20.2G   592G  20.2G  /zfs/users
zfs/users@15        516K      -  20.2G  -
zfs/users@14        101K      -  20.2G  -
zfs/users@13        101K      -  20.2G  -
zfs/users@12        155K      -  20.2G  -
zfs/users@11        163K      -  20.2G  -
zfs/users@10         85K      -  20.2G  -
zfs/users@9          85K      -  20.2G  -
zfs/users@8          85K      -  20.2G  -
zfs/users@7          85K      -  20.2G  -
zfs/users@6          97K      -  20.2G  -
zfs/users@5         133K      -  20.2G  -
zfs/users@4          85K      -  20.2G  -
zfs/users@3          85K      -  20.2G  -
zfs/users@2          97K      -  20.2G  -
zfs/users@1          97K      -  20.2G  -