Hi.

I've been playing around with Xen a couple of months now and I am very much impressed. I am particularly fond of the "live migration" feature. I was wondering if it is possible to make an instance continuously replicate the state of another instance and then make the other instance run if the original instance fails.

This could mean we could deploy Linux in environments dominated by Tandem and IBM mainframes. :-)

--
Per Andreas Buer

-------------------------------------------------------
The SF.Net email is sponsored by: Beat the post-holiday blues
Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
Ian Pratt
2005-Jan-07 15:34 UTC
Re: [Xen-devel] Is continuous replication of state possible?
> I've been playing around with Xen a couple of months now and I am very
> much impressed. I am particularly fond of the "live migration" feature.
> I was wondering if it is possible to make an instance continuously
> replicate the state of another instance and then make the other instance
> run if the original instance fails.

Software-implemented hardware fault-tolerance is on the Xen research roadmap.

It basically just requires deterministic execution and event injection. Doing this for uniprocessor guests is fairly straightforward. Doing it for SMP guests (with decent performance) is going to be a huge challenge, as determinism is hard to achieve. We're looking into it...

Ian
Mark Williamson
2005-Jan-07 15:43 UTC
Re: [Xen-devel] Is continuous replication of state possible?
> I've been playing around with Xen a couple of months now and I am very
> much impressed. I am particularly fond of the "live migration" feature.
> I was wondering if it is possible to make an instance continuously
> replicate the state of another instance and then make the other instance
> run if the original instance fails.

Work on this is ongoing and should make things like failover and majority voting possible.

> This could mean we could deploy Linux in environments dominated by Tandem
> and IBM mainframes. :-)

Yup :-)

Mark
Avery Pennarun
2005-Jan-07 16:38 UTC
Re: [Xen-devel] Is continuous replication of state possible?
On Fri, Jan 07, 2005 at 03:34:09PM +0000, Ian Pratt wrote:
> > I've been playing around with Xen a couple of months now and I am very
> > much impressed. I am particularly fond of the "live migration" feature.
> > I was wondering if it is possible to make an instance continuously
> > replicate the state of another instance and then make the other instance
> > run if the original instance fails.
>
> Software-implemented hardware fault-tolerance is on the Xen
> research roadmap.
>
> It basically just requires deterministic execution and event
> injection. Doing this for uniprocessor guests is fairly
> straightforward. Doing it for SMP guests (with decent performance) is
> going to be a huge challenge, as determinism is hard to achieve. We're
> looking into it...

I gather that the request was not to keep both of them *executing* at once, but simply to keep the memory image on the clone system "mostly" in sync at all times, so the failover could happen faster. This should be easier than actually trying to execute the code deterministically.

Have fun,

Avery
Rik van Riel
2005-Jan-07 17:18 UTC
Re: [Xen-devel] Is continuous replication of state possible?
On Fri, 7 Jan 2005, Ian Pratt wrote:
> It basically just requires deterministic execution and event
> injection.

This will be hard when running crypto programs that get their random numbers directly from the CPU ...

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
Ian Pratt
2005-Jan-07 17:29 UTC
Re: [Xen-devel] Is continuous replication of state possible?
> On Fri, 7 Jan 2005, Ian Pratt wrote:
>
> > It basically just requires deterministic execution and event
> > injection.
>
> This will be hard when running crypto programs that get
> their random numbers directly from the CPU ...

Sure, when CPUs finally get embedded crypto features (and applications start using and _insisting_ on their presence) we'll have problems. For the moment, the technique should work.

Ian
Ian Pratt
2005-Jan-07 17:42 UTC
Re: [Xen-devel] Is continuous replication of state possible?
> > It basically just requires deterministic execution and event
> > injection. Doing this for uniprocessor guests is fairly
> > straightforward. Doing it for SMP guests (with decent performance) is
> > going to be a huge challenge, as determinism is hard to achieve. We're
> > looking into it...
>
> I gather that the request was not to keep both of them *executing* at once,
> but to simply keep the memory image on the clone system "mostly" in sync at
> all times, so the failover could happen faster. This should be easier than
> actually trying to execute the code deterministically.

It's not clear whether shipping checkpoints or just doing deterministic execution will actually work best; that's why it's on the research roadmap. There are implementations in progress of both CoW checkpoints and deterministic execution.

Ian
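As a rough illustration of the checkpoint-shipping alternative mentioned here, the sketch below tracks which pages a primary has written since the last checkpoint and copies only those to a standby image. It is not Xen code: the `Primary` class, the page-per-list-entry model and the `ship_checkpoint` method are all assumptions invented for this example.

```python
class Primary:
    """Toy guest memory: a list of pages plus a set of dirtied page numbers."""

    def __init__(self, num_pages):
        self.pages = [b""] * num_pages
        self.dirty = set()

    def write(self, page_no, data):
        self.pages[page_no] = data
        self.dirty.add(page_no)  # remember what changed since the last checkpoint

    def ship_checkpoint(self, standby_pages):
        """Copy only the dirty pages to the standby, then clear the dirty set."""
        shipped = sorted(self.dirty)
        for page_no in shipped:
            standby_pages[page_no] = self.pages[page_no]
        self.dirty.clear()
        return shipped


primary = Primary(num_pages=4)
standby = [b""] * 4  # hypothetical memory image held on the clone system

primary.write(1, b"A")
primary.write(3, b"B")
first = primary.ship_checkpoint(standby)  # ships pages 1 and 3 only
primary.write(1, b"C")                    # written after the checkpoint
```

At this point the standby is only "mostly" in sync: the second write to page 1 has not been shipped yet, so a failover now would resume from the state of the last checkpoint.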
George Washington Dunlap III
2005-Jan-07 17:51 UTC
[Xen-devel] Re: Is Continuous replication of state possible?
On Fri, 7 Jan 2005 xen-devel-request@lists.sourceforge.net wrote:
> This will be hard when running crypto programs that get
> their random numbers directly from the CPU ...

Yes, any sort of non-deterministic instruction must be either made to execute the same, or disallowed. :-)

I'm not familiar with the setup of those kinds of CPUs. Is this situation any different than that of RDTSC? We can't allow native execution of RDTSC on different platforms either. More annoying is the CPUID instruction, which is non-privileged (and so cannot be trapped & emulated like RDTSC), but returns different results on processors that are not identical...

-George Dunlap

+-------------------+----------------------------------------
| dunlapg@umich.edu | http://www-personal.umich.edu/~dunlapg
+-------------------+----------------------------------------
| Who could move a mountain, who could love their enemy?
| Who could rejoice in pain, and turn the other cheek?
| - Rich Mullins, "Surely God is With Us"
+------------------------------------------------------------
| Outlaw Junk Email! Support HR 1748 (www.cauce.org)
Rik van Riel
2005-Jan-07 18:02 UTC
Re: [Xen-devel] Is continuous replication of state possible?
On Fri, 7 Jan 2005, Ian Pratt wrote:
>> On Fri, 7 Jan 2005, Ian Pratt wrote:
>>
>>> It basically just requires deterministic execution and event
>>> injection.
>>
>> This will be hard when running crypto programs that get
>> their random numbers directly from the CPU ...
>
> Sure, when CPUs finally get embedded crypto features (and
> applications start using and _insisting_ on their presence) we'll
> have problems. For the moment, the technique should work.

Of course, it should be easy for CPU manufacturers to make sure that the crypto instructions are trappable, so Xen can make sure that both lockstepped virtual machines get the same random number ...

--
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." - Brian W. Kernighan
Ky Srinivasan
2005-Jan-07 19:22 UTC
Re: [Xen-devel] Is continuous replication of state possible?
Ian, as you noted earlier, on an MP machine getting reliable replay will be difficult: locks need to be acquired in the same order during the replay as they were acquired initially. In my previous life we had toyed with implementing a similar strategy on a microkernel-based (Chorus) Unix system.

K. Y

>>> Ian Pratt <Ian.Pratt@cl.cam.ac.uk> 1/7/2005 12:29:27 PM >>>
> On Fri, 7 Jan 2005, Ian Pratt wrote:
>
> > It basically just requires deterministic execution and event
> > injection.
>
> This will be hard when running crypto programs that get
> their random numbers directly from the CPU ...

Sure, when CPUs finally get embedded crypto features (and applications start using and _insisting_ on their presence) we'll have problems. For the moment, the technique should work.

Ian
Jacob Gorm Hansen
2005-Jan-09 02:21 UTC
Re: [Xen-devel] Is continuous replication of state possible?
Ian Pratt wrote:
>> I've been playing around with Xen a couple of months now and I am very
>> much impressed. I am particularly fond of the "live migration" feature.
>> I was wondering if it is possible to make an instance continuously
>> replicate the state of another instance and then make the other instance
>> run if the original instance fails.
>
> Software-implemented hardware fault-tolerance is on the Xen
> research roadmap.
>
> It basically just requires deterministic execution and event
> injection. Doing this for uniprocessor guests is fairly
> straightforward. Doing it for SMP guests (with decent performance) is
> going to be a huge challenge, as determinism is hard to achieve. We're
> looking into it...

I did a little reading on this subject a couple of years back, and it seems that on Pentiums getting deterministic execution is impossible even for UPs, as long as you allow preemptive multitasking. Because (according to the Intel manuals) the precision of the Pentium performance counters cannot be relied on, the timer and other interrupts will essentially act as a random generator. Naturally you can do periodic checkpointing, but there is no way correctness can be guaranteed unless you coordinate all outgoing traffic between replicas before making it visible to the outside world.

There is a paper by Bressoud and Schneider about hypervisor-based fault tolerance on the PA-RISC (which had precise performance counters) which is worth reading. I found a copy online at http://roc.cs.berkeley.edu/294fall01/readings/bressoud.pdf .

I think this is more likely to work at a higher level, when you know the semantics of your application. Dmitrii Zagorodnov did some work on that and reported good results; see for instance http://ieeexplore.ieee.org/iel5/8589/27228/01209950.pdf .

Best regards,
Jacob
Harry Butterworth
2005-Jan-09 14:43 UTC
Re: [Xen-devel] Is continuous replication of state possible?
On Sat, 2005-01-08 at 18:21 -0800, Jacob Gorm Hansen wrote:
> There is a paper by Bressoud and Schneider about hypervisor-based fault
> tolerance on the PA-RISC (which had precise performance counters) which
> is worth reading, I found a copy online at
> http://roc.cs.berkeley.edu/294fall01/readings/bressoud.pdf .

Thanks, I hadn't read this paper before and found it interesting.

I think there are a few good bits in it, like the use of the recovery register for epochs of instruction execution, but I think they made a mistake by choosing to design a primary-backup rather than an N-way active system. I also think the I/O architecture is fairly fundamentally flawed, as a result of confusion over whether the peripheral devices are inside or outside the boundary of the replicated system.

I think a better approach for a fail-stop fault-tolerant system would be to choose an N-way active approach and put peripherals very clearly outside the boundary of the replicated system. With this approach:

1) A distributed consensus protocol is used to reach agreement on a sequence of inputs to the system.

2) The same sequence of inputs is passed to all replicas. This is the input boundary of the replicated fault-tolerant context.

3) The replicas start with the same state and receive the same sequence of inputs, so they make the same sequence of responses. Every response of every replica is of the form "execute response X on node Y", so the output is the same for all replicas. This is the output boundary of the replicated fault-tolerant context.

4) The output of the replicas contains the information specifying which node must actually action the output. One node actions the output; the remaining nodes do nothing.

5) With peripherals outside the replication boundary, all communication from a peripheral to the virtual machine is passed through the distributed consensus protocol and committed to the sequence of inputs before being processed by all replicas. All communication from the fault-tolerant virtual machine to peripherals is made through a specific node chosen by the fault-tolerant virtual machine.

If a peripheral is accessible from multiple nodes, then the virtual machine will see multiple redundant paths to the peripheral and may use multi-pathing software to perform path failover when a node fails.

If a peripheral is only accessible to one node, then the fault-tolerant virtual machine may use RAID over several such peripherals attached to different nodes to create a fault-tolerant virtual peripheral.

This approach is also applicable to Byzantine fault-tolerant systems if you enhance it with a Byzantine fault-tolerant distributed consensus protocol and get each replica to digitally sign each output and forward the signature to the replica responsible for actioning it, so the actioner can prove that it is operating on behalf of a quorum of replicas. In this case a Byzantine failure of a node still looks like a path failure to the virtual machine, because communications from the node are dropped by the recipient when the digital signatures are found to be insufficient.

Here are a few other random comments on the paper:

With the recovery register approach you can obviously break out early if you encounter an idle instruction, since this will be deterministic across all replicas. If you emulate the CPU in software then this approach is very easy.

The time-of-day clock is a good example of solving the problem of non-deterministic operations by asking the hypervisor and having it pass the result through the consensus protocol to all replicas. In terms of the approach outlined above, this translates into a replica output to a specific node to ask that node the time. The node sends the time through the distributed consensus protocol to be received by all replicas.

This works for random number generation as well (as someone noted in another post on this topic), but in that case you'd want to ask a node for a big batch of random numbers in advance. The node would generate a batch and pass it through the distributed consensus protocol to the replicas, so that each replica had the same random number pool and didn't incur the consensus protocol overhead on every request. Using a consensus protocol like Paxos, which supports multiple concurrent ballots, would allow the random number pool to be replenished at a low-water mark without ever stalling the virtual machine operation.

"Fundamental to our design is communication among the hypervisors. This implies that the hypervisors must not be partitioned by communications failures."

Paxos does much better and, I think, shows the limit of what can be achieved for fail-stop fault-tolerant systems.

"I/O Accessibility Assumption: I/O operations possible by the processor executing the primary virtual machine are also possible by the processor executing a backup virtual machine."

See the discussion above.

"We assume that the channel linking the primary and backup processors is FIFO...that the processor executing the backup detects the primary's processor failure only after receiving the last message sent by the primary's hypervisor"

The Paxos protocol does significantly better than this.

For anyone interested in replication, "The Part-Time Parliament" by Leslie Lamport, which describes Paxos, is a great read. After that, I'd recommend reading "Practical Byzantine Fault Tolerance" by Miguel Castro and "Provably Secure Competitive Routing against Proactive Byzantine Adversaries via Reinforcement Learning" by Baruch Awerbuch, David Holmer and Herbert Rubens.
--
Harry Butterworth <harry@hebutterworth.freeserve.co.uk>
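The N-way active scheme in points 1) to 4) of the message above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Replica` class, the node names and the rule used to pick the actioning node are invented for the example, and the consensus protocol itself is not modelled (the already-agreed input sequence is simply a list).

```python
class Replica:
    """One copy of a deterministic state machine, hosted on `node`."""

    def __init__(self, node):
        self.node = node
        self.counter = 0    # the replicated state
        self.actioned = []  # outputs this node actually performed

    def step(self, event, nodes):
        """Apply one agreed input and return the output 'execute X on node Y'."""
        self.counter += event
        # Deterministic choice of the actioning node: it depends only on
        # replicated state, so every replica names the same node.
        target = nodes[self.counter % len(nodes)]
        response = f"send total={self.counter}"
        if target == self.node:  # only the named node acts on the output
            self.actioned.append(response)
        return (response, target)


nodes = ["n0", "n1", "n2"]
agreed_inputs = [5, 1, 3]  # stands in for the sequence a consensus protocol commits

replicas = [Replica(n) for n in nodes]
outputs = [[r.step(e, nodes) for e in agreed_inputs] for r in replicas]
```

Every replica computes the same output list, but each response is performed exactly once, by the node it names; the other replicas compute it and do nothing.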
Hello,

I am sorry for a possibly stupid question, but could anybody please explain to me what "deterministic" means? When studying papers concerning virtual machines, I encountered this terminology many times, like "deterministic event", "deterministic execution", ... and I don't understand its meaning. I did some googling, but to no avail, since Google returned too many results, and without the correct context I cannot tell which links are good to follow.

Any references to read on this problem would be very highly appreciated.

Thank you very much,
AQ
Mark Williamson
2005-Jan-09 16:43 UTC
Re: [Xen-devel] Is continuous replication of state possible?
http://foldoc.doc.ic.ac.uk/foldoc/foldoc.cgi?query=deterministic&action=Search

IOW, "deterministic" basically means that something will proceed predictably, with no randomness. This means that if we know the starting state of a deterministic piece of code then we can predict everything it's going to do perfectly.

A real machine has lots of non-determinism: for instance, interrupts arrive randomly and they affect the execution of the code on the machine. For deterministic replay, you'd have to sample all the random events so that you could deliver them at the right times when rerunning a virtual machine's execution.

HTH,
Mark

On Sunday 09 January 2005 16:21, aq wrote:
> Hello,
>
> I am sorry for a possible stupid question, but may anybody please
> explain to me what this "deterministic" means? When studying papers
> concerning virtual machine, I encountered many of this terminology,
> like "deterministic event", "deterministic execution",... and don't
> understand its meaning. I did some googles, but to no avail, since
> google returned too many results, and without the correct context, I
> cannot know what is the good links to follow.
>
> Any references to read on this problem is very highly appreciated.
>
> Thank you very much,
> AQ
Harry Butterworth
2005-Jan-09 16:56 UTC
Re: [Xen-devel] Is continuous replication of state possible?
On Mon, 2005-01-10 at 01:21 +0900, aq wrote:
> Hello,
>
> I am sorry for a possible stupid question, but may anybody please
> explain to me what this "deterministic" means?

Deterministic means that the outcome is a function of nothing more than the starting state and the input, so if you start with the same starting state and apply the same input you are guaranteed to end up with the same outcome every time.

So, deterministic code works with the replicated state machine approach: the replicas start off in the same state and have the same input applied, the same execution happens on all the replicas, and they end up in the same state and so remain replicas of one another.

If you try to execute non-deterministic code in the context of the replicated state machine approach, then the replicas might start off in the same state and receive the same input, but the different results from executing the non-deterministic code would cause them to diverge. Once the replicas have diverged they are no longer good as backups of one another.

Examples of operations that would ordinarily be non-deterministic:

Asking the CPU for a random number: all of the replicas start off in the same state, with the program counter pointing to an instruction that asks for a random number; all of the replicas receive the same input, a command instructing them to execute one instruction; all of the replicas ask the CPU for a random number. Each replica gets a different random number from its CPU, and the subsequent execution diverges.

Taking an interrupt from a peripheral device: typically this will only happen on one of the nodes. If it was fed directly to a replica then that replica would immediately diverge from the others.

--
Harry Butterworth <harry@hebutterworth.freeserve.co.uk>
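The definition above can be made concrete with a small sketch. The `run` function and its `local_source` parameter are made up for this example; the per-node iterators merely stand in for a CPU random number source that returns something different on each machine.

```python
def run(state, inputs, local_source=None):
    """Fold the inputs into the state; optionally mix in per-node values."""
    for value in inputs:
        state = state * 31 + value
        if local_source is not None:
            state += next(local_source)  # differs from node to node
    return state


inputs = [1, 2, 3]

# Deterministic: same starting state and same input give the same outcome.
a = run(7, inputs)
b = run(7, inputs)

# Non-deterministic: each replica reads its own local source and diverges.
c = run(7, inputs, iter([4, 4, 4]))
d = run(7, inputs, iter([9, 9, 9]))

# Turning the values into shared input makes the replicas agree again.
shared = [4, 4, 4]
e = run(7, inputs, iter(shared))
f = run(7, inputs, iter(shared))
```

Here `a == b` and `e == f`, while `c != d`: once the random values are delivered to every replica as part of the input, the computation is deterministic again.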
George Washington Dunlap III
2005-Jan-10 16:53 UTC
[Xen-devel] Re: Is continuous replication of state possible?
> I did a little reading on this subject a couple of years back, and it
> seems that on Pentiums getting deterministic execution is impossible
> even for UPs, as long as you allow preemptive multitasking. Because
> (according to the Intel manuals) the precision of the Pentium
> performance counters cannot be relied on, the timer and other interrupts
> will essentially act as a random generator. Naturally you can do periodic
> checkpointing, but there will be no way correctness can be guaranteed,
> unless you coordinate all outgoing traffic between replicas before
> making it visible to the outside world.
>
> There is a paper by Bressoud and Schneider about hypervisor-based fault
> tolerance on the PA-RISC (which had precise performance counters) which
> is worth reading, I found a copy online at
> http://roc.cs.berkeley.edu/294fall01/readings/bressoud.pdf .

There's another paper by Dunlap et al. (see the From field) about implementing deterministic execution for a uniprocessor on Athlons, and we have since gotten the same thing working for P4s. :-) It can be found here:

http://www.eecs.umich.edu/CoVirt/papers/revirt.pdf

The main trick is that there are several different counters you could use. The one spoken of mainly in the literature is the instruction counter (which, on both Athlons and P4s, is unusable). However, there are repeatable branch counters on both platforms. Logging the <eip, branch_count> tuple at every interrupt allows us to re-deliver the interrupts precisely. (See Mellor-Crummey89 for a software version of this same idea.)

Maybe sometime I'll post a whitepaper about the dirty details of doing deterministic replay on P4s and Athlons. If you're interested, I have an e-mail that I've already sent to several people (including the Xen team) with the details.

And we're currently working on extending deterministic replay for SMP. :-)

-George Dunlap

G. W. Dunlap, S. T. King, S. Cinar, M. Basrai, and P. M. Chen. ReVirt: Enabling Intrusion Analysis through Virtual-Machine Logging and Replay. In Proceedings of the 2002 Symposium on Operating Systems Design and Implementation, pages 211-224, December 2002.

J. M. Mellor-Crummey and T. J. LeBlanc. A Software Instruction Counter. In Proceedings of the 1989 International Conference on Architectural Support for Programming Languages and Operating Systems, pages 78-86, April 1989.
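To show why the <eip, branch_count> pair is enough to re-deliver an interrupt at exactly the same point, here is a toy sketch. It is not ReVirt's implementation: the three-instruction loop, the `run` function and its `arrivals` and `replay_log` parameters are assumptions made up for the example, and the branch counter is kept in software, whereas the message describes hardware counters.

```python
def run(loops, arrivals=None, replay_log=None):
    """Run a toy three-instruction loop; return (final acc, interrupt log).

    arrivals:   {step: irq_value}, interrupts as they happen to arrive (recording)
    replay_log: [(eip, branch_count, irq_value)], re-delivered at the logged point
    """
    pending = {(eip, bc): irq for eip, bc, irq in (replay_log or [])}
    acc = eip = branch_count = step = 0
    log = []
    while branch_count < loops:
        if replay_log is not None:
            irq = pending.get((eip, branch_count))
        else:
            irq = (arrivals or {}).get(step)
        if irq is not None:
            log.append((eip, branch_count, irq))
            acc += irq         # toy interrupt handler
        if eip == 0:
            acc += 1
            eip = 1
        elif eip == 1:
            acc *= 2
            eip = 2
        else:
            branch_count += 1  # taken backward branch closes one iteration
            eip = 0
        step += 1
    return acc, log


recorded_acc, log = run(3, arrivals={1: 10, 5: 7})
replayed_acc, _ = run(3, replay_log=log)
# Ignoring the branch count (treating eip alone as the position) delivers
# the second interrupt in the wrong loop iteration.
naive_acc, _ = run(3, replay_log=[(eip, 0, irq) for eip, _, irq in log])
```

Because the same `eip` is visited once per loop iteration, `eip` alone is ambiguous; adding the branch count pins down the iteration, which is why the replayed run matches the recorded one and the naive one does not.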
Jacob Gorm Hansen
2005-Jan-11 02:22 UTC
Re: [Xen-devel] Re: Is continuous replication of state possible?
George Washington Dunlap III wrote:
> The main trick is that there are several different counters you could
> use. The one spoken of mainly in the literature is the instruction
> counter (which, on both Athlons and P4's is unusable). However, there
> are repeatable branch counters on both platforms. Logging the
> <eip, branch_count> tuple at every interrupt allows us to re-deliver the
> interrupts precisely. (See Mellor-Crummey89 for a software version of
> this same idea.)
>
> Maybe sometime I'll post a whitepaper about the dirty details of doing
> deterministic replay on P4's and Athlons. If you're interested, I have
> an e-mail that I've already sent to several people (including the Xen
> team) with the details.

I'd like to see a copy of that email, thanks.

Jacob