Darren J Moffat
2007-Aug-15 17:06 UTC
[zfs-crypto-discuss] Setting zio->io_error during zio_write
For an encrypted dataset, it is possible that by the time we arrive in zio_write() [ zio_write_encrypt() ] and look up which key is needed to encrypt this data, that key isn't available to us.

Is there some value of zio->io_error I can set that will not result in a panic, but will put the write into some state where we can try again later? I guess not just this write but maybe the whole transaction group?

I can, and I think I have, fixed the mount-time case by returning false from zfs_is_mountable() if the dataset is encrypted and the key needed isn't present. However, the key could go away without us knowing in some cases (not really for the pure software case, but it can happen with hardware token keys).

-- Darren J Moffat
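The mount-time check described above can be sketched in plain C. This is only an illustration, not ZFS source: `struct dataset`, its two flags, and `dataset_is_mountable()` are hypothetical stand-ins for whatever zfs_is_mountable() actually consults.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dataset; not the real ZFS structure. */
struct dataset {
	bool encrypted;		/* dataset uses encryption */
	bool key_present;	/* the key it needs is currently available */
};

/*
 * Mount-time gate: an unencrypted dataset is always mountable, an
 * encrypted one only while its key is present.
 */
static bool
dataset_is_mountable(const struct dataset *ds)
{
	assert(ds != NULL);
	return (!ds->encrypted || ds->key_present);
}
```

A check like this only covers the moment of mounting. It says nothing about a key that disappears afterwards, which is the case the question is really about.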
Darren J Moffat wrote:
> For an encrypted dataset, it is possible that by the time we arrive in
> zio_write() [ zio_write_encrypt() ] and look up which key is needed to
> encrypt this data, that key isn't available to us.
>
> Is there some value of zio->io_error I can set that will not result in a
> panic, but will put the write into some state where we can try again
> later? I guess not just this write but maybe the whole transaction group?

No, we have no ability to do this. With George's fix for 6565042, we will introduce the ability to "hang" the pool on an IO failure... this may give you what you want.
Darren J Moffat
2007-Aug-16 09:58 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
Mark Maybee wrote:
> No, we have no ability to do this. With George's fix for 6565042, we
> will introduce the ability to "hang" the pool on an IO failure... this
> may give you what you want.

It might well do, but will it allow "unhanging" later? I couldn't tell much from that bug unfortunately.

-- Darren J Moffat
Darren J Moffat wrote:
> It might well do, but will it allow "unhanging" later? I couldn't tell
> much from that bug unfortunately.

Once the error is corrected you will be able to resume the IOs.

- George
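The "hang, then resume" behaviour can be pictured with a small user-space model. It is a sketch under stated assumptions, not the fix for 6565042: `struct pool`, `struct write_req`, `pool_write()` and `pool_resume()` are all hypothetical names, and a real implementation would block threads rather than keep a fixed-size array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define	MAX_PENDING	8

/* Hypothetical write request; only tracks whether it completed. */
struct write_req {
	int id;
	bool done;
};

/* Hypothetical pool state: a key flag plus a queue of parked writes. */
struct pool {
	bool key_available;
	struct write_req *pending[MAX_PENDING];
	size_t npending;
};

/*
 * Issue a write. Returns true if it completed or was parked for later,
 * false if the parking queue is full.
 */
static bool
pool_write(struct pool *p, struct write_req *w)
{
	assert(p != NULL && w != NULL);
	if (p->key_available) {
		w->done = true;
		return (true);
	}
	if (p->npending == MAX_PENDING)
		return (false);
	p->pending[p->npending++] = w;
	return (true);
}

/*
 * Called once the error is corrected: complete the parked writes in
 * order. Returns the number of writes resumed (0 if still blocked).
 */
static size_t
pool_resume(struct pool *p)
{
	size_t i, n;

	assert(p != NULL);
	if (!p->key_available)
		return (0);
	n = p->npending;
	for (i = 0; i < n; i++)
		p->pending[i]->done = true;
	p->npending = 0;
	return (n);
}
```

In this model, writes issued while the key is missing are neither failed nor dropped. They wait, and a later `pool_resume()` call completes them once `key_available` is set again.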
Darren J Moffat
2007-Aug-16 14:48 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
George Wilson wrote:
> Once the error is corrected you will be able to resume the IOs.

Great. In the missing encryption key case, I'm thinking I might even want to fire off a sysevent so that someone knows the key is missing. Does that sound reasonable?

-- Darren J Moffat
Darren J Moffat wrote:
> Great. In the missing encryption key case, I'm thinking I might even
> want to fire off a sysevent so that someone knows the key is missing.
> Does that sound reasonable?

BTW, you will need to ensure that if you set the io_error in zio_write_encrypt(), it does not get overwritten in a later stage. Would you send the sysevent in zio_write_encrypt() or wait till zio_done()?

Keep in mind that the user may decide to set the failure mode to one of three options:

1). wait - stop all IOs and block until the error is cleared
2). continue - new IOs will return EIO but in-flight IOs will block until the error is cleared
3). panic - same behavior, nicer message.

Thanks,
George
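One way to keep an early error from being overwritten by a later stage is a "first error wins" setter. The following is a minimal sketch, not the real zio pipeline: `struct io`, `io_set_error()`, `io_run()` and the example stages are hypothetical, and EPERM is only a placeholder errno for the missing-key case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical I/O record; io_error mirrors the field named in the thread. */
struct io {
	int io_error;	/* 0 means no error recorded yet */
};

typedef int (*stage_fn)(struct io *);

/* Record an error only if none has been recorded yet. */
static void
io_set_error(struct io *io, int err)
{
	assert(io != NULL);
	if (io->io_error == 0)
		io->io_error = err;
}

/* Run every stage in order and return the first error recorded. */
static int
io_run(struct io *io, const stage_fn *stages, size_t nstages)
{
	size_t i;

	for (i = 0; i < nstages; i++)
		io_set_error(io, stages[i](io));
	return (io->io_error);
}

/* Example stages (hypothetical). EPERM is a placeholder errno. */
static int stage_ok(struct io *io) { (void)io; return (0); }
static int stage_encrypt_no_key(struct io *io) { (void)io; return (EPERM); }
static int stage_device_fail(struct io *io) { (void)io; return (EIO); }
```

Running `stage_ok`, `stage_encrypt_no_key`, `stage_device_fail` in that order leaves `io_error` at the encrypt stage's value, because the later EIO is ignored once an error is already set.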
cindi
2007-Aug-16 16:34 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
Who is someone? A sysevent is only good if 'someone' is listening. If it's the admin you want to alert, an FMA ereport is more appropriate. ZFS already has a diagnosis engine that you could modify to understand the new ereport.

Cindi

Darren J Moffat wrote:
> Great. In the missing encryption key case, I'm thinking I might even
> want to fire off a sysevent so that someone knows the key is missing.
> Does that sound reasonable?
Darren J Moffat
2007-Aug-17 09:11 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
cindi wrote:
> Who is someone? A sysevent is only good if 'someone' is listening. If
> it's the admin you want to alert, an FMA ereport is more appropriate. ZFS
> already has a diagnosis engine that you could modify to understand the
> new ereport.

I hadn't considered using FMA, mainly because this isn't a fault but more of an async event that someone needs to respond to.

The reason for suggesting a sysevent was that it allows multiple different possibilities for "who" is listening. One possible case is that HAL picks up the sysevent and passes it to something in the user's GNOME session using DBUS.

I will have a look at FMA though, thanks for the suggestion.

-- Darren J Moffat
Darren J Moffat
2007-Aug-17 09:22 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
George Wilson wrote:
> BTW, you will need to ensure that if you set the io_error in
> zio_write_encrypt(), it does not get overwritten in a later stage.
> Would you send the sysevent in zio_write_encrypt() or wait till zio_done()?

I'd probably send it in zio_crypt_key_lookup(), which zio_write_encrypt() and zio_read_decrypt() both call.

Any opinion on what a suitable value for io_error would be? EIO feels wrong; EPERM is maybe closer but not quite correct.

> Keep in mind that the user may decide to set the failure mode to one of
> three options:
>
> 1). wait - stop all IOs and block until the error is cleared
> 2). continue - new IOs will return EIO but in-flight IOs will block
> until the error is cleared
> 3). panic - same behavior, nicer message.

This sounds like a new pool-level property; would that be the correct interpretation?

-- Darren J Moffat
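Putting the notification in the shared lookup means the read and write paths report a missing key the same way. A rough sketch of that shape follows. It does not reproduce zio_crypt_key_lookup(): `struct keystore`, `crypt_key_lookup()`, `write_encrypt()` and `read_decrypt()` are hypothetical, a counter stands in for the notification, and `KEY_UNAVAILABLE_ERR` is a placeholder because the choice of errno is still open at this point in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Placeholder errno for "key not available"; the thread leaves this open. */
#define	KEY_UNAVAILABLE_ERR	EPERM

/* Hypothetical key store for one dataset. */
struct keystore {
	const unsigned char *key;	/* NULL while the key is unavailable */
	int missing_events;		/* times a missing key was reported */
};

/*
 * Shared lookup used by both paths. On failure it records one event
 * (a stand-in for sending a notification) and returns an errno.
 */
static int
crypt_key_lookup(struct keystore *ks, const unsigned char **keyp)
{
	assert(ks != NULL && keyp != NULL);
	if (ks->key == NULL) {
		ks->missing_events++;
		*keyp = NULL;
		return (KEY_UNAVAILABLE_ERR);
	}
	*keyp = ks->key;
	return (0);
}

/* Write path: would encrypt with the key; here it only looks it up. */
static int
write_encrypt(struct keystore *ks)
{
	const unsigned char *key;

	return (crypt_key_lookup(ks, &key));
}

/* Read path: would decrypt with the key; here it only looks it up. */
static int
read_decrypt(struct keystore *ks)
{
	const unsigned char *key;

	return (crypt_key_lookup(ks, &key));
}
```

Because both callers go through one function, changing the errno or the notification mechanism later is a one-line edit rather than a change in every I/O path.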
Darren J Moffat wrote:
> Any opinion on what a suitable value for io_error would be? EIO feels
> wrong; EPERM is maybe closer but not quite correct.

What about EACCES?

> This sounds like a new pool-level property; would that be the correct
> interpretation?

Yep, that is the correct interpretation.

Thanks,
George
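The three failure modes can be read as a small decision table keyed on the pool-level setting and on whether an I/O was already in flight. The sketch below is one interpretation of the list in the thread, not the eventual implementation: the enum names and `failmode_action()` are hypothetical, and "panic" is modelled simply as a distinct outcome.

```c
#include <assert.h>
#include <stdbool.h>

/* The three modes listed in the thread; the names are hypothetical. */
enum failmode { FAILMODE_WAIT, FAILMODE_CONTINUE, FAILMODE_PANIC };

/* What happens to an I/O while the pool is in an error state. */
enum io_action { IO_BLOCK, IO_FAIL_EIO, IO_PANIC };

/*
 * Map (mode, in_flight) to an action:
 *   wait     - every I/O blocks until the error is cleared
 *   continue - new I/Os fail with EIO, in-flight I/Os block
 *   panic    - the system panics
 */
static enum io_action
failmode_action(enum failmode mode, bool in_flight)
{
	switch (mode) {
	case FAILMODE_WAIT:
		return (IO_BLOCK);
	case FAILMODE_CONTINUE:
		return (in_flight ? IO_BLOCK : IO_FAIL_EIO);
	case FAILMODE_PANIC:
		return (IO_PANIC);
	}
	assert(!"unknown failmode");
	return (IO_PANIC);
}
```

For the missing-key case, "wait" is the mode that matches the original request: nothing is lost, and the writes proceed once the key is supplied.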
Cynthia McGuire
2007-Aug-17 18:57 UTC
[zfs-crypto-discuss] [zfs-code] Setting zio->io_error during zio_write
Darren J Moffat wrote:
> I hadn't considered using FMA, mainly because this isn't a fault but more
> of an async event that someone needs to respond to.

I think we are talking about a problem about which you want to inform some human so that some action can be taken to correct it. This is what FMA is specifically designed to do. The only issue is that we may inform the wrong human (i.e. not the end user), as the FMA diagnosis goes to the ZFS agent, console, syslog and SNMP.

> The reason for suggesting a sysevent was that it allows multiple
> different possibilities for "who" is listening. One possible case is
> that HAL picks up the sysevent and passes it to something in the user's
> GNOME session using DBUS.

There's nothing preventing you from doing both: issue a sysevent and an ereport. The FMA team is also looking at how to integrate diagnoses into a desktop environment such that diagnoses are observable in the GNOME session.

Cindi
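Doing both amounts to delivering one "key missing" event to more than one channel. A minimal sketch of that fan-out is below. It uses no real sysevent or FMA calls: `struct notifier`, `notifier_add()`, `notifier_post()` and `count_sink()` are hypothetical, and each sink is just a callback standing in for a channel.

```c
#include <assert.h>
#include <stddef.h>

#define	MAX_SINKS	4

/* A sink receives the event name plus its own private argument. */
typedef void (*sink_fn)(const char *event, void *arg);

/* Hypothetical table of registered sinks. */
struct notifier {
	sink_fn fn[MAX_SINKS];
	void *arg[MAX_SINKS];
	size_t nsinks;
};

/* Register a sink; returns 0 on success, -1 if the table is full. */
static int
notifier_add(struct notifier *n, sink_fn fn, void *arg)
{
	assert(n != NULL && fn != NULL);
	if (n->nsinks == MAX_SINKS)
		return (-1);
	n->fn[n->nsinks] = fn;
	n->arg[n->nsinks] = arg;
	n->nsinks++;
	return (0);
}

/* Deliver one event to every sink; returns the number of sinks called. */
static size_t
notifier_post(const struct notifier *n, const char *event)
{
	size_t i;

	assert(n != NULL && event != NULL);
	for (i = 0; i < n->nsinks; i++)
		n->fn[i](event, n->arg[i]);
	return (n->nsinks);
}

/* Example sink: counts deliveries; stands in for one channel. */
static void
count_sink(const char *event, void *arg)
{
	(void)event;
	(*(int *)arg)++;
}
```

Registering one sink for the sysevent path and one for the ereport path means the code that detects the missing key posts once and does not need to know who is listening.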