I was checking with Sun support regarding this issue, and they say "The CR
currently has a high priority and the fix is understood. However, there is
no eta, workaround, nor IDR."

If it's a high priority, and it's known how to fix it, I was curious as to
why there has been no progress. As I understand, if a failure of the log
device occurs while the pool is active, it automatically switches back to
an embedded pool log. It seems removal would be as simple as following the
failure path to an embedded log, and then updating the pool metadata to
remove the log device. Is it more complicated than that? We're about to do
some testing with slogs, and it would make me a lot more comfortable to
deploy one in production if there was a backout plan :)...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Paul B. Henson wrote:
> I was checking with Sun support regarding this issue, and they say "The CR
> currently has a high priority and the fix is understood. However, there is
> no eta, workaround, nor IDR."
>
> If it's a high priority, and it's known how to fix it, I was curious as to
> why there has been no progress. As I understand, if a failure of the log
> device occurs while the pool is active, it automatically switches back to
> an embedded pool log. It seems removal would be as simple as following the
> failure path to an embedded log, and then updating the pool metadata to
> remove the log device. Is it more complicated than that? We're about to do
> some testing with slogs, and it would make me a lot more comfortable to
> deploy one in production if there was a backout plan :)...

If you don't have mirrored slogs and the slog fails, you may lose any data
that was in a txg group waiting to be committed to the main pool vdevs -
you will never know if you lost any data or not.

I think this thread is the latest discussion about slogs and their
behavior:

https://opensolaris.org/jive/thread.jspa?threadID=102392&tstart=0

--
Dave
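As an aside, protecting the slog with a mirror is a one-liner; a minimal
sketch with hypothetical pool and device names (standard zpool syntax of
this era):

-----
# add a mirrored log device to an existing pool
zpool add tank log mirror c4t0d0 c4t1d0

# or turn an existing single slog into a mirror by attaching a second device
zpool attach tank c4t0d0 c4t1d0
-----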
Paul B. Henson wrote:
> I was checking with Sun support regarding this issue, and they say "The CR
> currently has a high priority and the fix is understood. However, there is
> no eta, workaround, nor IDR."
>
> If it's a high priority, and it's known how to fix it, I was curious as to
> why there has been no progress. As I understand, if a failure of the log
> device occurs while the pool is active, it automatically switches back to
> an embedded pool log. It seems removal would be as simple as following the
> failure path to an embedded log, and then updating the pool metadata to
> remove the log device. Is it more complicated than that? We're about to do
> some testing with slogs, and it would make me a lot more comfortable to
> deploy one in production if there was a backout plan :)...

Removal of a slog is different than failure of a slog. Removal is an
administrative task, not a repair task. You can lump slog removal in with
the more general shrink or top-level vdev removal tasks.
 -- richard
Does Solaris flush a slog device before it powers down? If so, removal
during a shutdown cycle wouldn't lose any data.

On Wed, May 20, 2009 at 7:57 AM, Dave <dave-zfs at dubkat.com> wrote:
> If you don't have mirrored slogs and the slog fails, you may lose any data
> that was in a txg group waiting to be committed to the main pool vdevs -
> you will never know if you lost any data or not.
>
> I think this thread is the latest discussion about slogs and their
> behavior:
>
> https://opensolaris.org/jive/thread.jspa?threadID=102392&tstart=0
On Tue, 19 May 2009, Dave wrote:

> If you don't have mirrored slogs and the slog fails, you may lose any
> data that was in a txg group waiting to be committed to the main pool
> vdevs - you will never know if you lost any data or not.

True; but from what I understand the failure rate of SSDs is so low it's
not worth mirroring them. My main issue is that we're considering dropping
an SSD into an x4500, which is not an officially supported configuration,
and I'd like an easy way to back out of it if it becomes an issue.

> I think this thread is the latest discussion about slogs and their
> behavior:
>
> https://opensolaris.org/jive/thread.jspa?threadID=102392&tstart=0

Yah, that's my thread too :).

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
On Tue, 19 May 2009, Richard Elling wrote:

> Removal of a slog is different than failure of a slog. Removal is an
> administrative task, not a repair task. You can lump slog removal in
> with the more general shrink or top-level vdev removal tasks.

Granted; however, for shrinking or removing vdevs you first need to do the
technical step of getting data off of the devices you want to remove.
Presumably that code has not yet been implemented, so before you can
implement removal of data devices, you have to write that code. On the
other hand, clearly there already exists code that will switch from a
separate log device back to an embedded one. The only thing missing is a
simple "update the metadata" step. While they might be lumped together in
the same category, in terms of the amount of effort and complexity to
implement, unless there's something I'm not understanding, there should be
a couple of orders of magnitude difference.

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
On Tue, 19 May 2009, Nicholas Lee wrote:

> Does Solaris flush a slog device before it powers down? If so, removal
> during a shutdown cycle wouldn't lose any data.

Actually, if you remove a slog from a pool while the pool is exported, the
pool becomes completely inaccessible. I have an open support ticket
requesting details of any potential recovery method for such a situation.

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
On May 19, 2009, at 12:57 PM, Dave wrote:

> If you don't have mirrored slogs and the slog fails, you may lose any
> data that was in a txg group waiting to be committed to the main pool
> vdevs - you will never know if you lost any data or not.

None of the above is correct. First off, you only lose data if the slog
fails *and* the machine panics/reboots before the transaction group is
synced (5-30s by default depending on load, though there is a CR filed to
immediately sync on slog failure). You will not lose any data once the txg
is synced - syncing the transaction group does not require reading from
the slog, so failure of the log device does not impact normal operation.

The latter half of the above statement is also incorrect. Should you find
yourself in the double failure described above, you will get an FMA fault
that describes the nature of the problem and the implications. If the slog
is truly dead, you can 'zpool clear' (or 'fmadm repair') the fault and use
whatever data you still have in the pool. If the slog is just missing, you
can insert it and continue without losing data. In no case will ZFS
silently continue without committed data.

- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
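To make that recovery path concrete, the commands involved would look
roughly like the sketch below; the pool name and fault UUID are
hypothetical, and the exact fault class FMA reports may differ:

-----
# see what FMA has logged about the failed slog
fmdump -v
fmadm faulty

# if the slog is truly dead, acknowledge the fault and carry on with the pool
zpool clear tank

# ...or, equivalently, mark the FMA fault as repaired (UUID is a placeholder)
fmadm repair 7f88960c-5b44-4e9e-9dac-dca6f4bd9a22
-----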
So the txg is synced to the slog device but retained in memory, and then,
rather than reading it back from the slog into memory, it is copied to the
pool from the in-memory copy? With the txg being the working set of the
active commit, it might be a set of NFS iops?

On Wed, May 20, 2009 at 3:43 PM, Eric Schrock <Eric.Schrock at sun.com> wrote:
> On May 19, 2009, at 12:57 PM, Dave wrote:
>
>> If you don't have mirrored slogs and the slog fails, you may lose any
>> data that was in a txg group waiting to be committed to the main pool
>> vdevs - you will never know if you lost any data or not.
>
> None of the above is correct. First off, you only lose data if the slog
> fails *and* the machine panics/reboots before the transaction group is
> synced (5-30s by default depending on load, though there is a CR filed to
> immediately sync on slog failure). You will not lose any data once the txg
> is synced - syncing the transaction group does not require reading from
> the slog, so failure of the log device does not impact normal operation.
>
> The latter half of the above statement is also incorrect. Should you find
> yourself in the double failure described above, you will get an FMA fault
> that describes the nature of the problem and the implications. If the slog
> is truly dead, you can 'zpool clear' (or 'fmadm repair') the fault and use
> whatever data you still have in the pool. If the slog is just missing, you
> can insert it and continue without losing data. In no case will ZFS
> silently continue without committed data.
>
> - Eric
>
> --
> Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
On May 19, 2009, at 8:56 PM, Nicholas Lee wrote:

> So the txg is synced to the slog device but retained in memory, and
> then, rather than reading it back from the slog into memory, it is
> copied to the pool from the in-memory copy?

Yes, that is correct. It is best to think of the ZIL and the txg sync
process as orthogonal - data goes to both locations at different times.
The ZIL (technically "all ZILs", since they're per-filesystem) is *only*
read in the event of log replay (unclean shutdown). During normal
operation it is never read. Hence the benefit of write-biased SSDs - it
doesn't matter if reads are fast for slogs.

> With the txg being the working set of the active commit, it might be a
> set of NFS iops?

If the NFS ops are synchronous, then yes. Async operations do not use the
ZIL and therefore don't have anything to do with slogs.

- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
On Tue, 19 May 2009, Eric Schrock wrote:

> The latter half of the above statement is also incorrect. Should you
> find yourself in the double failure described above, you will get an FMA
> fault that describes the nature of the problem and the implications. If
> the slog is truly dead, you can 'zpool clear' (or 'fmadm repair') the
> fault and use whatever data you still have in the pool. If the slog is
> just missing, you can insert it and continue without losing data. In no
> case will ZFS silently continue without committed data.

How about the case where a slog device dies while a pool is not active? I
created a pool with one mirror pair and a slog, and then intentionally
corrupted the slog while the pool was exported (dd if=/dev/zero
of=/dev/dsk/<slog>), and the pool is now inaccessible:

-----
root at s10 ~ # zpool import
  pool: export
    id: 7254558150370674682
 state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing devices and try
        again.
   see: http://www.sun.com/msg/ZFS-8000-6X
config:

        export       UNAVAIL  missing device
          mirror     ONLINE
            c0t1d0   ONLINE
            c0t2d0   ONLINE

        Additional devices are known to be part of this pool, though their
        exact configuration cannot be determined.
-----

zpool clear doesn't help:

-----
root at s10 ~ # zpool clear export
cannot open 'export': no such pool
-----

and there's no fault logged:

-----
root at s10 ~ # fmdump
TIME                 UUID                                 SUNW-MSG-ID
fmdump: /var/fm/fmd/fltlog is empty
-----

How do you recover from this scenario?

BTW, you don't happen to have any insight on why slog removal hasn't been
implemented yet?

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
I guess this also means the relative value of a slog is also limited by
the amount of memory that can be allocated to the txg.

On Wed, May 20, 2009 at 4:03 PM, Eric Schrock <Eric.Schrock at sun.com> wrote:
> Yes, that is correct. It is best to think of the ZIL and the txg sync
> process as orthogonal - data goes to both locations at different times.
> The ZIL (technically "all ZILs", since they're per-filesystem) is *only*
> read in the event of log replay (unclean shutdown). During normal
> operation it is never read. Hence the benefit of write-biased SSDs - it
> doesn't matter if reads are fast for slogs.
Eric Schrock wrote:
> On May 19, 2009, at 12:57 PM, Dave wrote:
>
>> If you don't have mirrored slogs and the slog fails, you may lose any
>> data that was in a txg group waiting to be committed to the main pool
>> vdevs - you will never know if you lost any data or not.
>
> None of the above is correct. First off, you only lose data if the slog
> fails *and* the machine panics/reboots before the transaction group is
> synced (5-30s by default depending on load, though there is a CR filed
> to immediately sync on slog failure). You will not lose any data once
> the txg is synced - syncing the transaction group does not require
> reading from the slog, so failure of the log device does not impact
> normal operation.

Thanks for correcting my statement. There is still a potential window of
roughly 60 seconds for data loss if there are two transaction groups
waiting to sync with a 30 second txg commit timer, correct?

> The latter half of the above statement is also incorrect. Should you
> find yourself in the double failure described above, you will get an FMA
> fault that describes the nature of the problem and the implications. If
> the slog is truly dead, you can 'zpool clear' (or 'fmadm repair') the
> fault and use whatever data you still have in the pool. If the slog is
> just missing, you can insert it and continue without losing data. In no
> case will ZFS silently continue without committed data.

How will it know that data was actually lost? Or does it just alert you
that it's possible data was lost?

There's also the worry that the pool is not importable if you did have the
double failure scenario and the log really is gone. Re: bug ID 6733267,
e.g. if you had done a 'zpool import -o cachefile=none mypool'.

--
Dave
Eric Schrock wrote:
> On May 19, 2009, at 8:56 PM, Nicholas Lee wrote:
>
>> So the txg is synced to the slog device but retained in memory, and
>> then, rather than reading it back from the slog into memory, it is
>> copied to the pool from the in-memory copy?
>
> Yes, that is correct. It is best to think of the ZIL and the txg sync
> process as orthogonal - data goes to both locations at different times.
> The ZIL (technically "all ZILs", since they're per-filesystem) is *only*
> read in the event of log replay (unclean shutdown). During normal
> operation it is never read. Hence the benefit of write-biased SSDs - it
> doesn't matter if reads are fast for slogs.

Yes. OTOH we also know that if the pool is exported, then there is nothing
to be read from the slog. So in the case of a properly exported pool, we
should be allowed to import sans slog. No? I seem to recall an RFE for
this, but can't seem to locate it now.
 -- richard
On May 19, 2009, at 9:45 PM, Dave wrote:

> Thanks for correcting my statement. There is still a potential window of
> roughly 60 seconds for data loss if there are two transaction groups
> waiting to sync with a 30 second txg commit timer, correct?

No, only the syncing transaction group is affected. And that's only if
you're pushing a very small amount of data. Once you've reached a certain
amount of data we push the txg regardless of how long it's been. I believe
the ZFS team is also re-evaluating that timeout (it used to be much
smaller), so it's definitely not set in stone.

> How will it know that data was actually lost?

It attempts to replay the log records.

> Or does it just alert you that it's possible data was lost?

No, we know that there should be a log record but we couldn't read it.

> There's also the worry that the pool is not importable if you did have
> the double failure scenario and the log really is gone. Re: bug ID
> 6733267, e.g. if you had done a 'zpool import -o cachefile=none mypool'.

Yes, import relies on the vdev GUID sum matching, which is orthogonal to a
failed slog device during open. A failed slog device can prevent such a
pool from being imported.

- Eric

--
Eric Schrock, Fishworks                    http://blogs.sun.com/eschrock
Paul B. Henson wrote:
> I was checking with Sun support regarding this issue, and they say "The CR
> currently has a high priority and the fix is understood. However, there is
> no eta, workaround, nor IDR."
>
> If it's a high priority, and it's known how to fix it, I was curious as to
> why there has been no progress. As I understand, if a failure of the log

Why do you think there is no progress? The code for this is hard to
implement. People are working very hard on it, but that doesn't mean there
isn't any progress just because there is no eta or workaround available
via Sun Service.

--
Darren J Moffat
On Tue, 19 May 2009, Dave wrote:

> Paul B. Henson wrote:
>> I was checking with Sun support regarding this issue, and they say "The CR
>> currently has a high priority and the fix is understood. However, there is
>> no eta, workaround, nor IDR."
>>
>> If it's a high priority, and it's known how to fix it, I was curious as to
>> why there has been no progress. As I understand, if a failure of the log
>> device occurs while the pool is active, it automatically switches back to
>> an embedded pool log. It seems removal would be as simple as following the
>> failure path to an embedded log, and then updating the pool metadata to
>> remove the log device. Is it more complicated than that? We're about to do
>> some testing with slogs, and it would make me a lot more comfortable to
>> deploy one in production if there was a backout plan :)...
>
> If you don't have mirrored slogs and the slog fails, you may lose any data
> that was in a txg group waiting to be committed to the main pool vdevs -
> you will never know if you lost any data or not.

No, you won't lose any data. The issue is that if you export the pool,
currently you won't be able to import it with a broken log device. But
while a pool is up and its slog device fails, it won't lose any data.

--
Robert Milkowski
http://milek.blogspot.com
On Wed, 20 May 2009, Darren J Moffat wrote:

> Why do you think there is no progress?

Sorry if that's a wrong assumption, but I posted questions regarding it to
this list with no response from a Sun employee until yours, and the
engineer assigned to my support ticket was unable to provide any
information as to the current state, whether anyone was working on it, or
why it was so hard to do. Barring any data to the contrary, it appeared
from an external perspective that it was stalled.

> The code for this is hard to implement. People are working very hard on
> it, but that doesn't mean there isn't any progress just because there is
> no eta or workaround available via Sun Service.

Why is it so hard? I understand that removing a data vdev or shrinking the
size of a pool is complicated, but what makes it so difficult to remove a
slog? If the slog fails, the pool returns to an embedded log; it seems the
only difference between a pool with a failed slog and a pool with no slog
is that the former knows it used to have a slog. Why is it not as simple
as updating the metadata for a pool with a failed slog so it no longer
thinks it has a slog?

On another note, do you have any idea how one might recover from the case
where a slog device fails while the pool is inactive and renders it
inaccessible?

Thanks...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Paul B. Henson wrote:
> On Wed, 20 May 2009, Darren J Moffat wrote:
>
>> Why do you think there is no progress?
>
> Sorry if that's a wrong assumption, but I posted questions regarding it to
> this list with no response from a Sun employee until yours, and the
> engineer assigned to my support ticket was unable to provide any
> information as to the current state, whether anyone was working on it, or
> why it was so hard to do. Barring any data to the contrary, it appeared
> from an external perspective that it was stalled.

How Sun Services reports the status of escalations to customers under
contract is not a discussion for a public alias like this, so I won't
comment on this.

>> The code for this is hard to implement. People are working very hard on
>> it, but that doesn't mean there isn't any progress just because there is
>> no eta or workaround available via Sun Service.
>
> Why is it so hard? I understand that removing a data vdev or shrinking the
> size of a pool is complicated, but what makes it so difficult to remove a
> slog? If the slog fails, the pool returns to an embedded log; it seems the
> only difference between a pool with a failed slog and a pool with no slog
> is that the former knows it used to have a slog. Why is it not as simple
> as updating the metadata for a pool with a failed slog so it no longer
> thinks it has a slog?

If the engineers that are working on this wish to comment I'm sure they
will, but I know it really isn't that simple.

> On another note, do you have any idea how one might recover from the case
> where a slog device fails while the pool is inactive and renders it
> inaccessible?

I do; because I've done it to my own personal data pool. However it is not
a procedure I'm willing to tell anyone how to do - so please don't ask -
a) it was highly dangerous and involved using multiple different zfs
kernel modules as well as updating disk labels from inside kmdb, and b) it
was several months ago and I think I'd forget a critical step. I'm not
even sure I could repeat it on a pool with a non-trivial number of
datasets.

It did, however, give me an appreciation for the issues in implementing
this as a generic solution in a way that is safe to do and that works for
all types of pool and slog config (mine was a very simple configuration:
mirror + slog).

--
Darren J Moffat
On Wed, 20 May 2009, Darren J Moffat wrote:

> How Sun Services reports the status of escalations to customers under
> contract is not a discussion for a public alias like this, so I won't
> comment on this.

Heh, but maybe it should be a discussion for some internal forum; more
information = less anxious customers :)...

> If the engineers that are working on this wish to comment I'm sure they
> will, but I know it really isn't that simple.

I hope they do, as an information vacuum tends to result in false
assumptions.

> I do; because I've done it to my own personal data pool. However it is
> not a procedure I'm willing to tell anyone how to do - so please don't
[...]
> implementing this as a generic solution in a way that is safe to do and
> that works for all types of pool and slog config (mine was a very simple
> configuration: mirror + slog).

Hmm, well, it just seems horribly wrong for the failure of a slog to
result in complete data loss, *particularly* when all of the data is
perfectly valid and just sitting there beyond your reach.

One suggestion I received off-list was to dump your virgin slog right
after creation (I did a dd if=/dev/zero of=/dev/dsk/<slogtobe>, a zpool
add <pool> log <slog>, then a dd if=/dev/<slog> of=slog.dd count=<blocks
until everything is zeros>), and then you could restore it if you lost
your slog. I tested this, and sure enough if I wrote the slog dump back to
the corrupted device the pool was happy again. This only seemed to work if
I restored it to the exact same device it was on before; restoring it to a
different device didn't work (I thought zfs was device name agnostic?).

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
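Spelling that workaround out as a rough sketch (pool and device names are
hypothetical, the block count is a placeholder for "enough to capture the
labels", and it only appears to work when restored onto the identical
device):

-----
# zero the device that will become the slog, so its starting contents are known
dd if=/dev/zero of=/dev/dsk/c4t0d0s0 bs=1024k

# add it as a separate log device, then immediately dump the virgin slog
zpool add tank log c4t0d0s0
dd if=/dev/dsk/c4t0d0s0 of=/var/tmp/tank-slog.dd bs=512 count=16384

# later, if the slog is corrupted while the pool is exported, write the dump
# back to the *same* device and retry the import
dd if=/var/tmp/tank-slog.dd of=/dev/dsk/c4t0d0s0 bs=512
zpool import tank
-----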
Paul B. Henson wrote:
> On Wed, 20 May 2009, Darren J Moffat wrote:
>
>> How Sun Services reports the status of escalations to customers under
>> contract is not a discussion for a public alias like this, so I won't
>> comment on this.
>
> Heh, but maybe it should be a discussion for some internal forum; more
> information = less anxious customers :)...
>
>> If the engineers that are working on this wish to comment I'm sure they
>> will, but I know it really isn't that simple.
>
> I hope they do, as an information vacuum tends to result in false
> assumptions.

As does attempting to gain the same information via several routes. Since
you apparently have a Sun Service support contract, I highly recommend
that you pursue that route alone for further information on this bug's
status for your production needs.

Note that I don't represent the engineers working on this issue, and my
statements are purely my personal view on things based on my experience of
recovering from the situation. I was only able to do that recovery due to
my experience in the ZFS code base from the ZFS crypto project.

--
Darren J Moffat
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:re> in the case of a properly exported pool, we should be allowed re> to import sans slog. seems so, but the non properly exported case is still important. for example NFS HA clusters would stop working if slogs were always ignored on import---you have to ''import -f'' and roll the slog like it does now. And I think the desire for the removal feature is both these: * don''t want to lose whole pool if slog goes bad * want to test out slog, then remove it. or replace with smaller slog. so your case handles the second (though you''d have to export/import to remove, while you can add online) but it does not handle the first. I want to make sure the common conception of the scope of the slog user-whining doesn''t ``creep'''' narrower. I think the original slog work included a comment like: q. shouldn''t we be able to remove them? a. yes that would be extremely easy to do, but my deadline is already tight so I want to resist scope creep. now, here we are, a year later... -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090520/81553d99/attachment.bin>
>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:djm> I do; because I''ve done it to my own personal data pool. djm> However it is not a procedure I''m willing to tell anyone how djm> to do - so please don''t ask - k, fine, fair enough and noted. djm> a) it was highly dangerous and involved using multiple djm> different zfs kernel modules was well as however...utter hogwash! Nothing is ``highly dangerous'''' when your pool is completely unreadable. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090520/3cf428c9/attachment.bin>
On Wed, May 20, 2009 at 12:42, Miles Nordin <carton at ivy.net> wrote:
>>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:
>
>   djm> a) it was highly dangerous and involved using multiple
>   djm> different zfs kernel modules as well as
>
> however...utter hogwash! Nothing is ``highly dangerous'' when your
> pool is completely unreadable.

It is if you turn your "unreadable but fixable" pool into a "completely
unrecoverable" pool. If my pool loses its log disk, I'm waiting for an
official tool to fix it.

Will
Will Murnane wrote:
> On Wed, May 20, 2009 at 12:42, Miles Nordin <carton at ivy.net> wrote:
>>>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:
>>
>>   djm> a) it was highly dangerous and involved using multiple
>>   djm> different zfs kernel modules as well as
>>
>> however...utter hogwash! Nothing is ``highly dangerous'' when your
>> pool is completely unreadable.
>
> It is if you turn your "unreadable but fixable" pool into a
> "completely unrecoverable" pool. If my pool loses its log disk, I'm
> waiting for an official tool to fix it.

Whoa.

The slog is a top-level vdev like the others. The current situation is
that loss of a top-level vdev results in a pool that cannot be imported.
If you are concerned about the loss of a top-level vdev, then you need to
protect them. For slogs, mirrors work. For the main pool, mirrors and
raidz[12] work.

There was a conversation regarding whether it would be a best practice to
always mirror the slog. Since the recovery from slog failure modes is
better than that of the other top-level vdevs, the case for recommending a
mirrored slog is less clear. If you are paranoid, then mirror the slog.
 -- richard
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:re> Whoa. re> The slog is a top-level vdev like the others. The current re> situation is that loss of a top-level vdev results in a pool re> that cannot be imported. this taxonomy is wilfully ignorant of the touted way pools will keep working if their slog dies, reverting to normal inside-the-main-pool zil. and about ZFS''s supposed ability to tell from the other pool devices and report to fmdump if the slog was empty or full before it failed. also the difference between slogs failing on imported pools (okay) vs failed slogs on exported pools (entire pool lost). It''s not as intuitive as you imply, and I don''t think worrying about a corner case where you lose the whole pool is ``paranoid''''. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090520/56721da4/attachment.bin>
Richard Elling wrote:
> Will Murnane wrote:
>> On Wed, May 20, 2009 at 12:42, Miles Nordin <carton at ivy.net> wrote:
>>>>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:
>>>
>>>   djm> a) it was highly dangerous and involved using multiple
>>>   djm> different zfs kernel modules as well as
>>>
>>> however...utter hogwash! Nothing is ``highly dangerous'' when your
>>> pool is completely unreadable.
>>
>> It is if you turn your "unreadable but fixable" pool into a
>> "completely unrecoverable" pool. If my pool loses its log disk, I'm
>> waiting for an official tool to fix it.
>
> Whoa.
>
> The slog is a top-level vdev like the others. The current situation is
> that loss of a top-level vdev results in a pool that cannot be imported.
> If you are concerned about the loss of a top-level vdev, then you need
> to protect them. For slogs, mirrors work. For the main pool, mirrors and
> raidz[12] work.
>
> There was a conversation regarding whether it would be a best practice
> to always mirror the slog. Since the recovery from slog failure modes is
> better than that of the other top-level vdevs, the case for recommending
> a mirrored slog is less clear. If you are paranoid, then mirror the slog.
>  -- richard

I can't test this myself at the moment, but the reporter of Bug ID 6733267
says even one failed slog from a pair of mirrored slogs will prevent an
exported zpool from being imported. Has anyone tested this recently?

--
Dave
Not sure if this is a wacky question. Given that a slog device does not
really need much more than 10 GB: if I was to use a pair of X25-Es (or
STEC devices or whatever) in a mirror as the boot device, and then either
1. created a loopback file vdev, or 2. used a separate mirrored slice for
the slog, would this cause issues? Other than h/w management details.

Nicholas
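For what it's worth, option 2 amounts to handing zpool a pair of slices
rather than whole disks; a minimal sketch, assuming a small spare slice
(s3 here, purely hypothetical) was carved out on each boot SSD with
format(1M):

-----
# mirror a small slice from each boot SSD as the pool's log device
zpool add tank log mirror c0t0d0s3 c0t1d0s3
-----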
On Wed, May 20, 2009 at 9:35 AM, Paul B. Henson <henson at acm.org> wrote:
> On Wed, 20 May 2009, Darren J Moffat wrote:
>
>> Why do you think there is no progress?
>
> Sorry if that's a wrong assumption, but I posted questions regarding it to
> this list with no response from a Sun employee until yours, and the
> engineer assigned to my support ticket was unable to provide any
> information as to the current state, whether anyone was working on it, or
> why it was so hard to do. Barring any data to the contrary, it appeared
> from an external perspective that it was stalled.
>
>> The code for this is hard to implement. People are working very hard on
>> it, but that doesn't mean there isn't any progress just because there is
>> no eta or workaround available via Sun Service.
>
> Why is it so hard? I understand that removing a data vdev or shrinking
> the size of a pool is complicated, but what makes it so difficult to
> remove a slog? If the slog fails, the pool returns to an embedded log; it
> seems the only difference between a pool with a failed slog and a pool
> with no slog is that the former knows it used to have a slog. Why is it
> not as simple as updating the metadata for a pool with a failed slog so
> it no longer thinks it has a slog?
>
> On another note, do you have any idea how one might recover from the case
> where a slog device fails while the pool is inactive and renders it
> inaccessible?
>
> Thanks...

I stumbled across this just now while performing a search for something
else.

http://opensolaris.org/jive/thread.jspa?messageID=377018

I have no idea of the quality or correctness of this solution.

--
Mike Gerdts
http://mgerdts.blogspot.com/
Miles Nordin wrote:
>>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>
>    re> Whoa.
>
>    re> The slog is a top-level vdev like the others. The current
>    re> situation is that loss of a top-level vdev results in a pool
>    re> that cannot be imported.
>
> this taxonomy is wilfully ignorant of the touted way pools will keep
> working if their slog dies, reverting to the normal inside-the-main-pool
> zil. and about ZFS's supposed ability to tell from the other pool
> devices and report to fmdump whether the slog was empty or full before
> it failed.

What happens depends on what "it" is when "it failed", as well as the
nature of the failure. ZFS reads the slog on import, so there is no notion
of the slog being empty or full at any given instant other than at import
time, when we may also know whether the pool was exported or not.

Another way to look at this: there is no explicit flag set in the pool
that indicates whether the slog is empty or full. Adding one would be
silly because it would be inherently slow, and improving performance is
why the slog exists in the first place. We can't count on the in-RAM
status, since that won't survive an outage. At import time, the vdev trees
are reconstructed -- this is where the cleverness needs to be applied.

> also the difference between slogs failing on imported pools (okay) vs
> failed slogs on exported pools (entire pool lost). It's not as intuitive
> as you imply, and I don't think worrying about a corner case where you
> lose the whole pool is ``paranoid''.

In any case, if you lose a device which has unprotected data on it, then
the data is lost. If you do not protect the slog, then your system is
susceptible to data loss if you lose the slog device. This fact is not
under question. What is under question is the probability that you will
lose a slog device *and* lose data, which we know to be less than the
probability of losing a device -- we just can't say how much less.

Redundancy offers nothing more than insurance. The people who buy
insurance are either gullible or, to some extent, paranoid. I prefer to
believe that the folks on this forum are not gullible :-)
 -- richard
Well, it worked for me, at least. Note that this is a very limited
recovery case - it only works if you have the GUID of the slog device from
zpool.cache, which in the case of a fail-on-export and reimport might not
be available. The original author of the fix seems to imply that you can
use any size device as the replacement slog, but I had trouble doing that.
I didn't investigate enough to say conclusively that that is not possible,
though.

It's a very limited fix, but for the case Paul outlined, it will work,
assuming zpool.cache is available.

On Thu, May 21, 2009 at 9:21 AM, Mike Gerdts <mgerdts at gmail.com> wrote:
> On Wed, May 20, 2009 at 9:35 AM, Paul B. Henson <henson at acm.org> wrote:
> > [...]
> > On another note, do you have any idea how one might recover from the
> > case where a slog device fails while the pool is inactive and renders
> > it inaccessible?
>
> I stumbled across this just now while performing a search for something
> else.
>
> http://opensolaris.org/jive/thread.jspa?messageID=377018
>
> I have no idea of the quality or correctness of this solution.
>
> --
> Mike Gerdts
> http://mgerdts.blogspot.com/
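As an aside, while the pool is still healthy the slog GUID that this kind
of recovery depends on can be recorded ahead of time; a rough sketch, with
a hypothetical device name (zdb output formats vary by build):

-----
# dump the cached pool configuration (from /etc/zfs/zpool.cache) and note
# the guid shown for the log vdev
zdb -C

# the on-disk labels of the slog device also carry its guid
zdb -l /dev/dsk/c4t0d0s0
-----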
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes: >>>>> "es" == Eric Schrock <Eric.Schrock at Sun.COM> writes:re> Another way to look at this, there is no explicit flag set in re> the pool that indicates whether the slog is empty or re> full. Not that it makes a huge difference to me, but Eric seemed to say that actually there is just such a flag: dave> Or does it just alert you that it''s possible data was lost? es> No, we know that there should be a log record but we couldn''t es> read it. doesn''t make perfect sense to me, either, since keeping a slog-full bit synchronously updated seems like it''d have in many cases almost the same cost as not using the slog. Maybe it''s asynchronously updated and useful but not perfectly reliable, or maybe it''s a ``slog empty'''' but that''s only set on export or clean shutdown. Anyway, Richard I think your whole argument is ridiculous: you''re acting like losing 30 seconds of data and losing the entire pool are equivalent. Who is this line of reasoning supposed to serve? From here it looks like everyone loses the further you advance it. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090521/6092bc84/attachment.bin>
On Thu, 21 May 2009, Miles Nordin wrote:

> Anyway, Richard, I think your whole argument is ridiculous: you're
> acting like losing 30 seconds of data and losing the entire pool are
> equivalent. Who is this line of reasoning supposed to serve? From here
> it looks like everyone loses the further you advance it.

For some people losing 30 seconds of data and losing the entire pool could
be equivalent. In fact, it could be a billion dollar error.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Miles Nordin wrote:
>>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>>>>>> "es" == Eric Schrock <Eric.Schrock at Sun.COM> writes:
>
>    re> Another way to look at this: there is no explicit flag set in
>    re> the pool that indicates whether the slog is empty or full.
>
> Not that it makes a huge difference to me, but Eric seemed to say that
> actually there is just such a flag:
>
>  dave> Or does it just alert you that it's possible data was lost?
>
>    es> No, we know that there should be a log record but we couldn't
>    es> read it.
>
> doesn't make perfect sense to me, either, since keeping a slog-full bit
> synchronously updated seems like it'd have in many cases almost the same
> cost as not using the slog. Maybe it's asynchronously updated and useful
> but not perfectly reliable, or maybe it's a ``slog empty'' bit that's
> only set on export or clean shutdown.

In the case of an export, we know that the log has been flushed to the
pool and we know that an export has occurred. This is why there is an RFE
to use this info to permit the pool to be imported sans slog.

> Anyway, Richard, I think your whole argument is ridiculous: you're
> acting like losing 30 seconds of data and losing the entire pool are
> equivalent. Who is this line of reasoning supposed to serve? From here
> it looks like everyone loses the further you advance it.

I recommend protecting your data. Data retention is important. I don't see
where there is an argument here, just an engineering trade-off between
data retention, data availability, performance, and cost... nothing new
here.
 -- richard
On Thu, 21 May 2009, Peter Woodman wrote:

> Well, it worked for me, at least. Note that this is a very limited
> recovery case - it only works if you have the GUID of the slog device
> from zpool.cache, which in the case of a fail-on-export and reimport
> might not be available. The original author of the fix seems to imply
> that you can use any size device as the replacement slog, but I had
> trouble doing that. I didn't investigate enough to say conclusively that
> that is not possible, though.
>
> It's a very limited fix, but for the case Paul outlined, it will work,
> assuming zpool.cache is available.

Cool, thanks for the info. While this wouldn't be much help if you only
came across it after a critical failure, knowing about it beforehand I can
make a backup of the cache file stored someplace other than the pool with
the slog. I'll have to test it out some, but if it works out it will make
me more comfortable about deploying a slog in production.

Still haven't heard anything official from Sun support about recovering
from this situation. I'm also still really curious why it's so hard to
remove a slog, but no one in the know has replied, and that's not the type
of question Sun support tends to follow up on :(...

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
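A minimal sketch of that precaution; the cache file lives at
/etc/zfs/zpool.cache on Solaris, and the destination host and paths below
are hypothetical:

-----
# keep a copy of the cache file somewhere that does not depend on this pool
cp /etc/zfs/zpool.cache /var/tmp/zpool.cache.backup
scp /etc/zfs/zpool.cache admin@backuphost:/backups/$(hostname)-zpool.cache
-----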
On Thu, 21 May 2009, Bob Friesenhahn wrote:

> For some people losing 30 seconds of data and losing the entire pool
> could be equivalent. In fact, it could be a billion dollar error.

I don't think anybody's saying to just ignore a missing slog and continue
on like nothing's wrong. Let the pool fail to import, generate errors and
faults. But if I'm willing to lose whatever uncommitted transactions are
in the slog, why should my entire pool sit inaccessible? Some manual
option to force an import of a pool with a missing slog would at least
give you the option of getting to your data, which would be a lot better
than the current situation.

--
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  henson at csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
Miles Nordin wrote:
>>>>>> "djm" == Darren J Moffat <darrenm at opensolaris.org> writes:
>
>    djm> I do; because I've done it to my own personal data pool.
>    djm> However it is not a procedure I'm willing to tell anyone how
>    djm> to do - so please don't ask -
>
> k, fine, fair enough and noted.
>
>    djm> a) it was highly dangerous and involved using multiple
>    djm> different zfs kernel modules as well as
>
> however...utter hogwash! Nothing is ``highly dangerous'' when your pool
> is completely unreadable.

That's your opinion. However, I'm the one that did this on *my* data, and
I knew what I was doing because I develop in the ZFS code base. I
calculated the risk I was taking on *my* pool based on the hacks I did for
this.

I considered what I was doing "highly dangerous" because, if I made a
mistake, I could make the situation even worse for myself than it already
was - one that would be even harder to recover from. I had some files in
that pool that I didn't have a backup of yet and really didn't want to
lose, because if I did I'd be in trouble with my spouse. So for me this
was highly dangerous on my data.

I would expect others to make a similar risk judgement based on the value
of the data in the pool. If the only copy of the data is in the pool, what
I did is risky (and remember, you have no idea how I did this, because I'm
not going to explain it - I know I can't accurately reproduce it in
email).

--
Darren J Moffat
On Tue, May 19, 2009 at 2:16 PM, Paul B. Henson <henson at acm.org> wrote:
> I was checking with Sun support regarding this issue, and they say "The CR
> currently has a high priority and the fix is understood. However, there is
> no eta, workaround, nor IDR."
>
> If it's a high priority, and it's known how to fix it, I was curious as to
> why there has been no progress. As I understand, if a failure of the log
> device occurs while the pool is active, it automatically switches back to
> an embedded pool log. It seems removal would be as simple as following the
> failure path to an embedded log, and then updating the pool metadata to
> remove the log device. Is it more complicated than that? We're about to do
> some testing with slogs, and it would make me a lot more comfortable to
> deploy one in production if there was a backout plan :)...

A rather interesting putback just happened...

http://hg.genunix.org/onnv-gate.hg/rev/cc5b64682e64

6803605 should be able to offline log devices
6726045 vdev_deflate_ratio is not set when offlining a log device
6599442 zpool import has faults in the display

I love comments that tell you what is really going on...

     /*
      * If this device has the only valid copy of some data,
-     * don't allow it to be offlined.
+     * don't allow it to be offlined.  Log devices are always
+     * expendable.
      */

For some reason, the CRs listed above are not available through
bugs.opensolaris.org. However, at least 6803605 is available through
sunsolve if you have a support contract.

--
Mike Gerdts
http://mgerdts.blogspot.com/
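Assuming a build that includes this putback, taking a log device out of
service would presumably be just the ordinary offline/online syntax (pool
and device names hypothetical):

-----
# take the slog out of service; per the new comment, log devices are expendable
zpool offline tank c4t0d0

# bring it back into service later if desired
zpool online tank c4t0d0
-----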
>>>>> "mg" == Mike Gerdts <mgerdts at gmail.com> writes:mg> A rather interesting putback just happened... yeah, it is good when you can manually offline the same set of devices as the set of those which are allowed to fail without invoking the pool''s failmode. I guess the putback means one less such difference. However the set of devices you need available to (force-) import a pool should also be the same as the set of devices you need to run it. The goal should be to make all three set requirements (zpool offline, failmode, import) the same. The offline and online cases aren''t totally equivalent, so there will be corner cases like the quorum rules there were with SVM, but by following procedures, banging on things, invoking ``simon-sez'''' flags, the three cases should ultimately end up demanding the same sets of devices. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20090522/e0714999/attachment.bin>
Mike Gerdts wrote:
> On Tue, May 19, 2009 at 2:16 PM, Paul B. Henson <henson at acm.org> wrote:
> > [...]
> > We're about to do some testing with slogs, and it would make me a lot
> > more comfortable to deploy one in production if there was a backout
> > plan :)...
>
> A rather interesting putback just happened...
>
> http://hg.genunix.org/onnv-gate.hg/rev/cc5b64682e64
>
> 6803605 should be able to offline log devices
> 6726045 vdev_deflate_ratio is not set when offlining a log device
> 6599442 zpool import has faults in the display
>
> [...]

This putback is the precursor to slog device removal and the ability to
import pools with failed slogs. I'll provide more details as we get closer
to integrating the slog removal feature. We are working on it; it is one
of our top priorities.

Stay tuned for more details...

George