thr3ads.net - freebsd stable - Not-so stable if you take a CAM error.... [Jul 2016]

Ian Lepore

2016-Jul-11 17:39 UTC

Not-so stable if you take a CAM error....

On Mon, 2016-07-11 at 12:30 -0500, Karl Denninger wrote:
> On 7/11/2016 11:32, Ian Lepore wrote:
> > On Mon, 2016-07-11 at 09:50 -0400, Brandon Allbery wrote:
> > > On Mon, Jul 11, 2016 at 9:46 AM, Karl Denninger <
> > > karl at denninger.net>
> > > wrote:
> > > 
> > > > Here's the backtrace ... sounds like expected behavior, which is
> > > > not-so good all-in for a situation like this.  I guess the strategy
> > > > is to turn off softupdates before attempting such an update so as
> > > > not to crash the host machine if there's a problem with the card.
> > > > 
> > > I would tend to assume that removable media should not have
> > > softupdates enabled. Even with properly working media, it's
> > > practically begging for corruption.
> > >
> > Writing to an sdcard without softupdates enabled will be an exercise
> > in patience.  Like, come back next week and maybe it'll be done.
> >
> > The only thing that comes to mind with this is maybe some sort of
> > mount flag to say you're willing to live with any amount of
> > filesystem corruption in lieu of panicking.  I'm not sure how
> > easy/practical that would be to implement, though.
> > 
> > -- Ian
> Why not force-detach the volume that takes the error instead of a
> panic()?
> 
Patches welcome.

-- Ian
> That would lead to a panic if the detached volume was the system
> volume
> (obviously) but for a data volume it would simply result in it being
> forcibly unmounted (and dirty, so if it's corrupt it will get caught
> when reattached.)
> 
> It seems that the current paradigm of saying "screw you, panic the
> machine" violates the principle of least astonishment and is overly
> punitive vis-a-vis necessity.  Refusing further I/O because the
> volume
> may now have a corrupt filesystem appears to be facially reasonable,
> but
> that doesn't necessarily wind up being fatal to the system itself -- it
> is
> if that's the system volume and is not covered by some sort of
> redundancy, obviously, but it's not in all cases.
> 
> (Note that you can't just unmount the filesystem involved in the
> error;
> it has to be the volume that gets forcibly detached and whatever
> flows
> through from that you have to live with.  The reason is that on any
> sort
> of solid-state media the OS has zero control over zoning and write
> amplification means far more than the data you were actually modifying may
> have been lost -- it's entirely possible that *several megabytes* of
> data just got trashed by the write error, and it's even possible that
> the block(s) involved cross a filesystem boundary!)
>
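
The point above about a failed write reaching across a filesystem boundary can be illustrated with a little arithmetic. The following is only a rough sketch: the 4 MiB erase block size, the 512-byte sector size, and the partition layout are all made-up values, and it assumes a simple linear mapping from logical sectors to erase blocks, which a real card's flash translation layer does not guarantee.

```python
from dataclasses import dataclass

SECTOR_BYTES = 512  # assumed logical sector size


@dataclass(frozen=True)
class Partition:
    name: str
    start: int  # first sector of the partition
    size: int   # length in sectors

    @property
    def end(self) -> int:
        # One past the last sector (exclusive bound).
        return self.start + self.size


def erase_block_range(lba: int, erase_block_bytes: int) -> tuple[int, int]:
    """Return the [first, end) sector range of the erase block holding lba."""
    sectors_per_block = erase_block_bytes // SECTOR_BYTES
    first = (lba // sectors_per_block) * sectors_per_block
    return first, first + sectors_per_block


def partitions_at_risk(lba: int, erase_block_bytes: int,
                       layout: list[Partition]) -> list[str]:
    """Names of partitions overlapping the erase block that holds lba."""
    first, end = erase_block_range(lba, erase_block_bytes)
    return [p.name for p in layout if p.start < end and first < p.end]


# Hypothetical card layout whose partitions are not erase-block aligned.
layout = [
    Partition("msdos-boot", 63, 102400),
    Partition("ufs-root", 102463, 4000000),
]

# A failed write near the end of the first partition.
print(partitions_at_risk(102000, 4 * 1024 * 1024, layout))
```

With this layout, the erase block holding sector 102000 covers sectors 98304 through 106495, so it straddles both partitions. That is the case where unmounting only the filesystem that reported the error would not be enough.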

Karl Denninger

2016-Jul-11 17:44 UTC

Not-so stable if you take a CAM error....

On 7/11/2016 12:39, Ian Lepore wrote:
> On Mon, 2016-07-11 at 12:30 -0500, Karl Denninger wrote:
>> On 7/11/2016 11:32, Ian Lepore wrote:
>>> On Mon, 2016-07-11 at 09:50 -0400, Brandon Allbery wrote:
>>>> On Mon, Jul 11, 2016 at 9:46 AM, Karl Denninger <
>>>> karl at denninger.net>
>>>> wrote:
>>>>
>>>>> Here's the backtrace ... sounds like expected behavior, which is
>>>>> not-so good all-in for a situation like this.  I guess the strategy
>>>>> is to turn off softupdates before attempting such an update so as
>>>>> not to crash the host machine if there's a problem with the card.
>>>>>
>>>> I would tend to assume that removable media should not have
>>>> softupdates enabled. Even with properly working media, it's
>>>> practically begging for corruption.
>>>>
>>> Writing to an sdcard without softupdates enabled will be an exercise
>>> in patience.  Like, come back next week and maybe it'll be done.
>>>
>>> The only thing that comes to mind with this is maybe some sort of
>>> mount flag to say you're willing to live with any amount of
>>> filesystem corruption in lieu of panicking.  I'm not sure how
>>> easy/practical that would be to implement, though.
>>>
>>> -- Ian
>> Why not force-detach the volume that takes the error instead of a
>> panic()?
>>
> Patches welcome.
>
> -- Ian

Any hints on where the routine(s) live that would forcibly detach a
volume?  (I'll go digging as well but shortening the time would help :))

-- 
Karl Denninger
karl at denninger.net <mailto:karl at denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/
