Keir Fraser
2006-Nov-05 12:17 UTC
[Xen-devel] Getting rid of xenbus_suspend(): tpmfront driver impacted?
I'm planning to get rid of xenbus_suspend() as part of work to move all device reconnection work to the restore side of save/restore. This will make restarting a suspended guest much easier if save/restore/relocation fails for any reason.

Currently the only consumer of xenbus_suspend() is the tpmfront driver. So this is mainly a heads up that, whatever it is doing with that hook, it'll need to find some way round it. Perhaps it can use techniques employed in other frontend drivers to do all the necessary work on resume?

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Stefan Berger
2006-Nov-05 16:27 UTC
[Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
Keir Fraser <Keir.Fraser@cl.cam.ac.uk> wrote on 11/05/2006 07:17:12 AM:

> I'm planning to get rid of xenbus_suspend() as part of work to move
> device reconnection work all to the restore side of save/restore. This
> will make restarting a suspended guest much easier if
> save/restore/relocation fails for any reason.
>
> Currently the only consumer of xenbus_suspend() is the tpmfront driver.
> So this is mainly a heads up that, whatever it is doing with that hook,
> it'll need to find some way round it. Perhaps it can use techniques
> employed in other frontend drivers to do all the necessary work on
> resume?

Thanks for the heads up. The problem with the TPM is that I need a mechanism to wait for a request that has been sent to the TPM to finish and to get the response back, since any command can change the internal state of that device, for example one of its registers. So resending a command after the resume would not be correct, since this can change the internal state again, which would lead to an incorrect state.

I am not as familiar with how the other drivers handle the shutdown, but I know that any program using a networking protocol needs to recover from losses of UDP packets by itself, and TCP does this automatically anyway; therefore it is not necessary to wait for outstanding responses there. The TPM protocol, in contrast, assumes a reliable connection from the computer to the device, that all commands finish correctly and responses are received by the apps, *and* that requests are not resent. How does the block driver handle this? Will the frontend driver still receive explicit notification of a shutdown?

Stefan
Ian Pratt
2006-Nov-05 16:44 UTC
RE: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
> The TPM protocol in contrast assumes a reliable connection from the
> computer to the device and that all commands finish correctly and
> responses are received by the apps *and* that requests are not resent.
> How does the block driver handle this? Will the frontend driver still
> receive explicit notification of a shutdown?

Blkfront remembers un-acknowledged requests and reissues them after a re-connect. It doesn't matter whether the requests have previously been issued or not, as we know there will be no reordering hazards within the request stream, as there are no ordering guarantees. It currently doesn't use the suspend notification.

Ian
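The replay behaviour Ian describes can be illustrated with a small model. This is only a sketch in Python, not blkfront code: the class name `ReplayingFrontend`, the list standing in for the shared ring, and the request strings are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReplayingFrontend:
    """Toy frontend that replays unacknowledged requests after a reconnect."""
    backend: list = field(default_factory=list)  # stands in for the shared ring
    pending: dict = field(default_factory=dict)  # request id -> payload
    next_id: int = 0

    def submit(self, payload):
        req_id = self.next_id
        self.next_id += 1
        self.pending[req_id] = payload           # remember until acknowledged
        self.backend.append((req_id, payload))
        return req_id

    def ack(self, req_id):
        self.pending.pop(req_id, None)           # backend confirmed completion

    def reconnect(self, new_backend):
        # Reissue everything that was never acknowledged. This is only safe
        # because replaying such a request a second time is harmless.
        self.backend = new_backend
        for req_id, payload in self.pending.items():
            self.backend.append((req_id, payload))


fe = ReplayingFrontend()
a = fe.submit("write sector 7")
b = fe.submit("write sector 9")
fe.ack(a)                    # only the first request was acknowledged
fe.reconnect([])             # fresh ring after restore
replayed = list(fe.backend)  # only the unacknowledged request is reissued
```

The scheme depends on a replayed request being harmless, which is the property the rest of the thread argues the TPM does not have.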
Keir Fraser
2006-Nov-05 17:01 UTC
[Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
> The TPM protocol in contrast assumes a reliable connection from the
> computer to the device and that all commands finish correctly and
> responses are received by the apps *and* that requests are not resent.
> How does the block driver handle this? Will the frontend driver still
> receive explicit notification of a shutdown?

What's the story on save/restore/migration of TPM state so that the guest sees the expected state on restore? It's not like it's part of the save image format right now. Assuming you have some out-of-band mechanism, how about making a message counter (or something) a part of that save state and something that tpmfront can interrogate when it reconnects to find out exactly up to what point in its request stream processing ceased? The current tpmfront_suspend() method is a bit cheesy as far as I can see, so something more integrated in the tpmfront/back protocol would be nice.

-- Keir
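One way to picture the message-counter idea is the following Python sketch. It is not the tpmfront/tpmback protocol: `VtpmBackend`, `TpmFrontend`, the `processed` and `sent` counters, and the decision strings are hypothetical names chosen for the example.

```python
class VtpmBackend:
    """Toy backend whose saved state includes a count of executed requests."""

    def __init__(self):
        self.processed = 0

    def execute(self, command):
        self.processed += 1
        return f"response to {command}"

    def save_state(self):
        return {"processed": self.processed}

    @classmethod
    def restore(cls, state):
        backend = cls()
        backend.processed = state["processed"]
        return backend


class TpmFrontend:
    """Toy frontend that counts the requests it has placed on the ring."""

    def __init__(self):
        self.sent = 0

    def note_sent(self):
        self.sent += 1

    def on_reconnect(self, backend):
        # Compare the two counters to learn where processing stopped.
        if backend.processed == self.sent:
            return "already-executed"   # must not be resent
        if backend.processed == self.sent - 1:
            return "safe-to-send"       # last request never ran
        raise RuntimeError("request streams out of sync")


backend = VtpmBackend()
frontend = TpmFrontend()
frontend.note_sent()
backend.execute("cmd 1")    # first command fully processed
frontend.note_sent()        # second command sent, but never executed
restored = VtpmBackend.restore(backend.save_state())
decision = frontend.on_reconnect(restored)
```

In this run `decision` is `"safe-to-send"`, because the restored backend has executed one request fewer than the frontend has sent. The sketch does not cover how the response to an already executed command would be recovered.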
Stefan Berger
2006-Nov-05 17:37 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
xen-devel-bounces@lists.xensource.com wrote on 11/05/2006 12:01:55 PM:

> > The TPM protocol in contrast assumes a reliable connection from the
> > computer to the device and that all commands finish correctly and
> > responses are received by the apps *and* that requests are not
> > resent. How does the block driver handle this? Will the frontend
> > driver still receive explicit notification of a shutdown?
>
> What's the story on save/restore/migration of TPM state so that the
> guest sees the expected state on restore? It's not like it's part of

For that there is the external device migration facility in Xend, which is used to tell a virtual TPM to serialize its state so that the state can be transferred over the network. However, a virtual TPM cannot serialize its state as long as it's processing a command, for example if it's busy creating a key pair. So for that reason it's necessary to wait for an issued command to finish anyway. Using the suspend method I could so far catch the response for that command.

> the save image format right now. Assuming you have some out-of-band
> mechanism, how about making a message counter (or something) a part
> of that save state and something that tpmfront can interrogate when
> it reconnects to find out exactly up to what point in its request
> stream processing ceased? The current tpmfront_suspend() method is a

As I said above, the last command (and there's only one command being processed at a time for an OS) must have finished anyway. So a mechanism would be needed to tell the virtual TPM to catch that last response instead of sending it into /dev/vtpm on the backend side, and to have that last response serialized as part of the state *before* the VM is shut down. This might be much more complicated, though.

> bit cheesy as far as I can see, so something more integrated in the
> tpmfront/back protocol would be nice.

In terms of time needed for migration there won't be a difference. Is supporting that .suspend really so problematic?

Stefan
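The role the suspend hook plays here, waiting for the single in-flight command to complete before the state is saved, can be sketched as follows. This is an illustrative Python model, not the tpmfront implementation: the class `TpmFront`, its method names, the command string and the timeout values are assumptions.

```python
import threading


class TpmFront:
    """Toy frontend that allows at most one command in flight."""

    def __init__(self):
        self._idle = threading.Event()
        self._idle.set()                 # nothing outstanding initially
        self.last_response = None

    def send(self, command):
        self._idle.clear()               # a command is now in flight

    def on_response(self, response):
        self.last_response = response
        self._idle.set()                 # in-flight command has completed

    def suspend(self, timeout=5.0):
        # Block until the outstanding command has been answered, so the
        # device state can be serialized with nothing half-processed.
        return self._idle.wait(timeout)


front = TpmFront()
front.send("create-key-pair")
# Simulate the backend answering a little later, on another thread.
timer = threading.Timer(0.05, front.on_response, args=("key blob",))
timer.start()
quiesced = front.suspend(timeout=2.0)
timer.join()
```

`suspend()` returns `True` once the response has arrived, and `False` if the timeout expires first, in which case the caller would have to decide whether to abort the save.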
Stefan Berger
2006-Nov-05 17:45 UTC
RE: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
xen-devel-bounces@lists.xensource.com wrote on 11/05/2006 11:44:14 AM:

> > The TPM protocol in contrast assumes a reliable connection from the
> > computer to the device and that all commands finish correctly and
> > responses are received by the apps *and* that requests are not
> > resent. How does the block driver handle this? Will the frontend
> > driver still receive explicit notification of a shutdown?
>
> Blkfront remembers un-acknowledged requests and reissues them after a
> re-connect. It doesn't matter whether the requests have previously been

I cannot possibly reissue commands like a TPM_Extend(value). It performs a hashing operation on a register of the TPM using a formula like

    PCR_n = SHA1(PCR_n || value)

where || is the concatenation of two byte arrays. So issuing this command twice would put the TPM's PCR register into a state that it is not supposed to be in.

> issued or not, as we know there will be no reordering hazards within the
> request stream, as there are no ordering guarantees. It currently
> doesn't use the suspend notification.

Unfortunately the TPM is a device with different constraints.

Stefan

PS: Specs for the TPM are here:
https://www.trustedcomputinggroup.org/specs/TPM/tpmwg-mainrev62_Part3_Commands.pdf
Page 114 talks about the TPM_Extend command.
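The effect of replaying an extend operation can be checked with a few lines of Python. This is a sketch of the formula quoted above only, not of a real TPM: the helper name `extend`, the all-zero starting register and the sample measurement are assumptions made for the example.

```python
import hashlib


def extend(pcr: bytes, value: bytes) -> bytes:
    """Return SHA1(pcr || value), following the formula in the message."""
    return hashlib.sha1(pcr + value).digest()


pcr_initial = bytes(20)                            # assumed starting value
value = hashlib.sha1(b"measurement").digest()      # sample 20-byte value

once = extend(pcr_initial, value)   # the state the guest expects
twice = extend(once, value)         # the state after an accidental replay

replay_changes_state = once != twice
```

Because each new register value is derived from the previous one, the second application starts from a different input and ends in a different state, so the command cannot be treated as idempotent.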
Keir Fraser
2006-Nov-05 18:00 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
On 5/11/06 5:37 pm, "Stefan Berger" <stefanb@us.ibm.com> wrote:

> As I said above, the last command (and there's only one command being
> processed at a time for an OS) must have finished anyway. So a mechanism
> would be needed to tell the virtual TPM to catch that last response
> instead of sending it into /dev/vtpm on the backend side, and to have
> that last response serialized as part of the state *before* the VM is
> shut down. This might be much more complicated, though.

If we could do this then we could simply send the ring-buffer page after the VTPM has quiesced. Then the response would be sitting there for tpmfront to pick up on resume, which would be much nicer than the tpmfront_suspend() 'wait for a bit' loop. But I expect it's a bit of a pain to integrate into the current save/restore code.

>> bit cheesy as far as I can see, so something more integrated in the
>> tpmfront/back protocol would be nice.
>
> In terms of time needed for migration there won't be a difference. Is
> supporting that .suspend really so problematic?

Well, we are going to handle save/restore and migration failure by continuing execution of the original domain. In this case xenbus_resume() will not be executed, so tpmfront will (I think) hang. I suppose we could keep suspend() and also introduce a suspend_cancelled() hook...

-- Keir
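The hang and the proposed suspend_cancelled() hook can be modelled in a few lines. This Python sketch is not xenbus code: `FrontendDriver`, `save_domain`, `failing_write` and the state strings are hypothetical, and the real hook, if introduced, could look quite different.

```python
class FrontendDriver:
    """Toy driver with suspend, resume and suspend_cancelled hooks."""

    def __init__(self):
        self.state = "connected"
        self.events = []

    def suspend(self):
        self.state = "quiesced"          # stop issuing new requests
        self.events.append("suspend")

    def resume(self):
        self.state = "connected"         # reconnect in the restored guest
        self.events.append("resume")

    def suspend_cancelled(self):
        self.state = "connected"         # carry on in the original domain
        self.events.append("suspend_cancelled")


def save_domain(driver, write_image):
    """Suspend the driver, try to save, and undo the suspend on failure."""
    driver.suspend()
    try:
        write_image()
    except OSError:
        # Without this call the driver would stay quiesced forever,
        # since resume() only runs in a successfully restored guest.
        driver.suspend_cancelled()
        return False
    return True


def failing_write():
    raise OSError("save target unreachable")


driver = FrontendDriver()
saved = save_domain(driver, failing_write)
```

After the failed save the driver is back in the `"connected"` state, which is the outcome a plain suspend() and resume() pair cannot provide when the original domain keeps running.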
Jacob Gorm Hansen
2006-Nov-06 08:27 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
On 11/5/06, Stefan Berger <stefanb@us.ibm.com> wrote:

> In terms of time needed for migration there won't be a difference. Is
> supporting that .suspend really so problematic?

I would love to see .suspend go away. When doing self-checkpointing or self-migration, it is very annoying to have to shut down external state, at a point where things are supposed to be atomic, in the middle of trying to take a checkpoint.

Can't your problem be solved simply by retrying the last TPM transaction if it fails, similar to how syscalls can fail in UNIX/Linux? Surely the backend has to be able to deal with these kinds of failure modes, or the guest would be able to DoS it pretty easily.

Jacob
Stefan Berger
2006-Nov-06 15:35 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
xen-devel-bounces@lists.xensource.com wrote on 11/05/2006 01:00:26 PM:

> On 5/11/06 5:37 pm, "Stefan Berger" <stefanb@us.ibm.com> wrote:
>
> > As I said above, the last command (and there's only one command
> > being processed at a time for an OS) must have finished anyway. So a
> > mechanism would be needed to tell the virtual TPM to catch that
> > last response instead of sending it into /dev/vtpm on the backend
> > side, and to have that last response serialized as part of the state
> > *before* the VM is shut down. This might be much more complicated,
> > though.
>
> If we could do this then we could simply send the ring-buffer page
> after the VTPM has quiesced. Then the response would be sitting
> there for tpmfront to pick up on resume, which would be much nicer
> than the tpmfront_suspend() 'wait for a bit' loop. But I expect it's
> a bit of a pain to integrate into the current save/restore code.
>
> > > bit cheesy as far as I can see, so something more integrated in the
> > > tpmfront/back protocol would be nice.
> >
> > In terms of time needed for migration there won't be a difference.
> > Is supporting that .suspend really so problematic?
>
> Well, we are going to handle save/restore and migration failure by
> continuing execution of the original domain. In this case
> xenbus_resume() will not be executed, so tpmfront will (I think) hang.

Yes, it would hang. Some notification would be necessary to have it resume.

Stefan

> I suppose we could keep suspend() and also introduce a
> suspend_cancelled() hook...
>
> -- Keir
Kieran Mansley
2006-Nov-06 15:52 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
Keir Fraser wrote:

> I suppose we could keep suspend() and also introduce a
> suspend_cancelled() hook...
>
> -- Keir

This would certainly make our life easier. For our work on accelerated network drivers we're currently using suspend() to give the guest a chance to stop using any hardware resources that have been mapped up into it. Without suspend(), you could be in the situation where the guest was in the middle of accessing a hardware resource when it is migrated to another server, where it clearly won't be able to access that hardware resource. With suspend() and resume() we can stop it using the hardware resources it has, then renegotiate equivalents on the new server if they are available, before carrying on as before. Working around this would be tricky.

Kieran
Stefan Berger
2006-Nov-06 16:48 UTC
Re: [Xen-devel] Re: Getting rid of xenbus_suspend(): tpmfront driver impacted?
jacobgorm@gmail.com wrote on 11/06/2006 03:27:17 AM:

> On 11/5/06, Stefan Berger <stefanb@us.ibm.com> wrote:
>
> > In terms of time needed for migration there won't be a difference. Is
> > supporting that .suspend really so problematic?
>
> I would love to see .suspend go away, when doing self-checkpointing or
> self-migration it is very annoying to have to shut down external state
> at a point where things are supposed to be atomic in the middle of
> trying to take a checkpoint.
>
> Can't your problem be solved simply by retrying the last TPM
> transaction if it fails, similar to how syscalls can fail in
> UNIX/Linux? Surely the backend has to be able to deal with these kinds
> of failure modes, or the guest would be able to DoS it pretty easily.

Once a TPM command has started executing, it may change the internal state of the TPM, for example one of its registers. If that command were executed a second time, it could change that register again and lead to an unwanted state of the device. Registers are not simply loaded on that device; a hash operation is performed on their contents that, if performed twice, leads to a totally different result. Also, the 'connection' to the device is assumed to be reliable: it's usually an on-board chip. Here's another email regarding this:

http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00234.html

Another problem occurs when you migrate a domain from one platform to another without the possibility of replaying commands to a device. The device driver needs to allow the migration-supporting software to take a snapshot of the device's state after the n-th command has finished and while the domain has sent no command beyond the n-th. At that point the OS's state and the device's state are completely in sync, and that is when you have to snapshot, rather than snapshotting the domain after it has issued the n-th command and the device after the (n +/- x)-th command.

This holds true for the TPM and is rather strict there, but it may also hold true for operations of a block device driver on a filesystem image. A constructed scenario for the block device would be to snapshot and replicate a filesystem image, and only after that snapshot has been taken does the OS on the source system issue a block operation to remove an inode. When the OS appears on the target system, it thinks it has removed the inode, but in reality the removal is not reflected in that image.

Stefan