In the last few days my performance has gone to hell.  I'm running:

# uname -a
SunOS nissan 5.11 snv_150 i86pc i386 i86pc

(I'll upgrade as soon as the desktop hang bug is fixed.)

The performance problems seem to be due to excessive I/O on the main
disk/pool.

The only things I've changed recently are that I've created and destroyed
a snapshot, and I used "zpool upgrade".

Here's what I'm seeing:

# zpool iostat rpool 5
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
rpool       13.3G   807M      7     85  15.9K   548K
rpool       13.3G   807M      3     89  1.60K   723K
rpool       13.3G   810M      5     91  5.19K   741K
rpool       13.3G   810M      3     94  2.59K   756K

Using iofileb.d from the DTrace Toolkit shows:

# iofileb.d
Tracing... Hit Ctrl-C to end.
^C
   PID CMD               KB FILE
     0 sched              6 <none>
     5 zpool-rpool     7770 <none>

zpool status doesn't show any problems:

# zpool status rpool
  pool: rpool
 state: ONLINE
  scan: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c3d0s0    ONLINE       0     0     0

Perhaps related to this or perhaps not, I discovered recently that
time-sliderd was doing a ton of "close" requests.  I disabled time-sliderd
while trying to solve my performance problem.

I was also getting these error messages in the time-sliderd log file:

Warning: Cleanup failed to destroy:
rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01
Details:
['/usr/bin/pfexec', '/usr/sbin/zfs', 'destroy', '-d',
'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01'] failed with exit code 1
cannot destroy 'rpool/ROOT@zfs-auto-snap_hourly-2010-11-10-15h01':
unsupported version

That was the reason I did the zpool upgrade.

I discovered that I had a *ton* of snapshots from time-slider that hadn't
been destroyed, over 6500 of them, presumably all because of this version
problem?

I manually removed all the snapshots and my performance returned to normal.

I don't quite understand what the "-d" option to "zfs destroy" does.
Why does time-sliderd use it, and why does it prevent these snapshots
from being destroyed?

Shouldn't time-sliderd detect that it can't destroy any of the snapshots
it's created and stop creating snapshots?

And since I don't quite understand why time-sliderd was failing to begin
with, I'm nervous about re-enabling it.  Do I need to do a "zpool upgrade"
on all my pools to make it work?
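
(For reference, a minimal sketch of how that bulk snapshot removal could be
scripted, assuming every leftover snapshot has "zfs-auto-snap" in its name
and that none of them are still wanted:)

    #!/bin/sh
    # Sketch only: bulk-remove stale time-slider auto-snapshots.
    # "zfs list -H -t snapshot -o name" prints one snapshot name per line
    # with no header; grep keeps only the time-slider snapshots; xargs runs
    # "zfs destroy" on each one.  Review the grep output before destroying.
    zfs list -H -t snapshot -o name | grep zfs-auto-snap | xargs -n 1 zfs destroy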
Cindy Swearingen
2011-Feb-18 20:07 UTC
[zfs-discuss] time-sliderd doesn't remove snapshots
Hi Bill,

I think the root cause of this problem is that time-slider implemented the
"zfs destroy -d" feature, but that feature is only available in later pool
versions.  This means that the routine removal of time-slider-generated
snapshots fails on older pool versions.

The "zfs destroy -d" feature (snapshot user holds) was introduced in pool
version 18.

I think this bug describes some or all of the problem:

https://defect.opensolaris.org/bz/show_bug.cgi?id=16361

Thanks,

Cindy

On 02/18/11 12:34, Bill Shannon wrote:
> I don't quite understand what the "-d" option to "zfs destroy" does.
> Why does time-sliderd use it, and why does it prevent these snapshots
> from being destroyed?
> [...]
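
(A small illustration of the user-hold behavior described above; the dataset
and snapshot names are placeholders, and a pool at version 18 or later is
assumed:)

    #!/bin/sh
    # Sketch: how "zfs destroy -d" interacts with snapshot user holds.
    # Requires pool version 18 or later; rpool/export is a placeholder dataset.
    zfs snapshot rpool/export@demo        # create a throwaway snapshot
    zfs hold keep rpool/export@demo       # place a user hold named "keep"
    zfs destroy -d rpool/export@demo      # succeeds: the held snapshot is only
                                          # marked for deferred destruction
    zfs holds rpool/export@demo           # the "keep" hold is still listed
    zfs release keep rpool/export@demo    # releasing the last hold lets the
                                          # deferred destroy finally complete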
One of my old pools was version 10, another was version 13.  I guess that
explains the problem.

Seems like time-sliderd should refuse to run on pools that aren't of a
sufficient version.

Cindy Swearingen wrote on 02/18/11 12:07 PM:
> The "zfs destroy -d" feature (snapshot user holds) was introduced in pool
> version 18.
> [...]
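
(Checking whether every pool is new enough before re-enabling the service
might look like the sketch below; note that a pool upgrade is one-way, so
older software can no longer import an upgraded pool:)

    #!/bin/sh
    # Sketch: confirm every pool supports user holds (version 18 or later)
    # before turning time-slider back on.
    zpool upgrade         # with no arguments, lists pools below the current version
    zpool get version     # shows the exact version of each imported pool
    # zpool upgrade -a    # would upgrade every pool; this is one-way, so older
    #                     # releases can no longer import the upgraded pools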