Hi,

I am trying to understand and change the network buffering used by Remus, the HA solution in Xen. From reading the code, my understanding is this: after Remus suspends the domain, it calls the postsuspend method of BufferedNIC, which sends the TC_PLUG_CHECKPOINT message and starts the buffering. Once Remus gets the acknowledgement, it calls the commit method of BufferedNIC, which sends the TC_PLUG_RELEASE message and releases the buffered packets.

I have the following questions:

1) How does Remus ensure that packets are buffered for an entire epoch?

2) If I comment out the lines in the postsuspend and commit methods of the BufferedNIC class that send TC_PLUG_CHECKPOINT and TC_PLUG_RELEASE, I see that all network packets are being buffered and I cannot ping the Remus-protected VM at all. Where does the packet buffering happen if I comment out these two lines?

Thanking you,
Rahul.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
On Thu, Apr 19, 2012 at 9:15 AM, Rahul Singh <singh.rahul.1983@gmail.com> wrote:

> Hi,
> I am trying to understand and change the network buffering that is being
> used by Remus, the HA solution present in Xen. From what I understood from
> reading the code, Remus calls the postsuspend method of the BufferedNIC
> after it suspends the domain that sends TC_PLUG_CHECKPOINT message and
> start the buffering

for packets in the "next" epoch [the one that starts after the domain is resumed]

> and then calls the commit method of BufferedNIC after it gets the
> acknowledgement that sends the TC_PLUG_RELEASE message and releases the
> buffered packets.

of the previous epoch [the one whose checkpoint was just committed on the backup machine]

> I have the following doubts
> 1) How does remus ensure that the packets gets buffered for an entire
> epoch ?

I would suggest you go through the comments in the sch_plug module in upstream Linux:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=net/sched/sch_plug.c;h=89f8fcf73f18f6e091881cc829861285c0c8f8b7;hb=HEAD

> 2) If I comment out the lines in the "postsuspend" and "commit" lines of
> the BufferedNIC class that send the TC_PLUG_CHECKPOINT and TC_PLUG_RELEASE,
> I see that all the network packets are being buffered and I cannot ping the
> Remus-protected VM at all. Where is the packet buffering happen if I
> comment out these 2 lines ?

The module starts out in buffered mode and releases packets only upon receiving a RELEASE command.

Hope that helps.

cheers
shriram
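For readers following along, the behaviour described in this reply can be sketched as a toy Python model. This is not the kernel code: the class name PlugQueue and its methods are made up for illustration, and the counter names are borrowed from my reading of the comments in sch_plug.c, so treat them as assumptions. The upstream module, as far as I recall, names the two commands TCQ_PLUG_BUFFER and TCQ_PLUG_RELEASE_ONE rather than the TC_PLUG_CHECKPOINT and TC_PLUG_RELEASE names used in the Remus tools.

```python
from collections import deque


class PlugQueue:
    """Toy model of a plug qdisc (hypothetical class, not the kernel module)."""

    def __init__(self):
        self.queue = deque()         # packets held by the qdisc, oldest first
        self.pkts_current_epoch = 0  # output of the epoch still executing
        self.pkts_last_epoch = 0     # output of the epoch awaiting commit
        self.pkts_to_release = 0     # packets cleared for transmission
        self.sent = []               # packets that have left the qdisc

    def enqueue(self, pkt):
        # Every packet is held at first: the queue starts out buffering.
        self.queue.append(pkt)
        self.pkts_current_epoch += 1

    def checkpoint(self):
        # Models TC_PLUG_CHECKPOINT: close the current epoch, start a new one.
        self.pkts_last_epoch = self.pkts_current_epoch
        self.pkts_current_epoch = 0

    def release(self):
        # Models TC_PLUG_RELEASE: clear only the packets of the closed epoch.
        self.pkts_to_release += self.pkts_last_epoch
        self.pkts_last_epoch = 0
        while self.pkts_to_release > 0 and self.queue:
            self.sent.append(self.queue.popleft())
            self.pkts_to_release -= 1


q = PlugQueue()
q.enqueue("p1")
q.enqueue("p2")
print(q.sent)         # [] : nothing leaves before a release
q.checkpoint()        # epoch boundary after p2
q.enqueue("p3")       # belongs to the next epoch
q.release()
print(q.sent)         # ['p1', 'p2']
print(list(q.queue))  # ['p3'] : still buffered
```

In this model the release command carries no epoch number. The boundary was fixed earlier by the checkpoint command, which is why packets sent after the checkpoint (p3 here) stay in the queue.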
Thanks for pointing me to the comments, Shriram. That was very helpful.

What I am unable to figure out is how Remus tells the module to release only the packets belonging to a given epoch. I am looking at the BufferedNIC class in xen-unstable/tools/python/xen/remus/device.py: the postsuspend method calls self._sendqmsg(qdisc.TC_PLUG_CHECKPOINT), while the commit method calls self._sendqmsg(qdisc.TC_PLUG_RELEASE). Where does it say that only the packets of the last epoch should be released?

Also, you said that the module starts out buffering everything. So if I comment out self._sendqmsg(qdisc.TC_PLUG_CHECKPOINT) but keep self._sendqmsg(qdisc.TC_PLUG_RELEASE), there should be no network buffering at all, yet I still see that all packets are buffered.

Thanking you,
Rahul.

On Apr 19, 2012, at 5:24 PM, Shriram Rajagopalan wrote:

> the module starts out in the buffered mode and releases packets only upon receiving a RELEASE command.
On 2012-04-19, at 3:17 PM, Rahul Singh <singh.rahul.1983@gmail.com> wrote:

> Thanks for pointing me to the comments, Shriram. That was very helpful.
>
> What I am unable to figure out is how does Remus tell the module to only release packets belonging to a given epoch ? I am looking at the BufferedNIC class in xen-unstable/tools/python/xen/remus/device.py and the postsuspend method calls self._sendqmsg(qdisc.TC_PLUG_CHECKPOINT) while the commit method calls self._sendqmsg(qdisc.TC_PLUG_RELEASE). Where is it telling that only packets of the last epoch should be released ?

The release command implicitly tells the module that the packets of the last epoch should be released. It is called after the checkpoint command for the next epoch has been issued in the postsuspend phase. There is an ASCII diagram in the module's header that shows the sequence.

> Also, as you said that module starts out with buffering everything. So if I comment out the self._sendqmsg(qdisc.TC_PLUG_CHECKPOINT) but keep the self._sendqmsg(qdisc.TC_PLUG_RELEASE), there should be no network buffering at all, but I still see that all packets are buffered.

To understand this, you need to read the sch_plug.c source.
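The ordering described in this reply (checkpoint for the next epoch first, release of the previous epoch afterwards) can be replayed with a small event trace. The sketch below is only a mental model under stated assumptions: the function replay, the PLUG sentinel, and the string event names are all hypothetical, and marking epoch boundaries inside the queue is my simplification of the plug positions drawn in the module's header diagram, not how the kernel stores them. It assumes a release with no preceding checkpoint has nothing to release, which is consistent with the observation reported earlier in the thread.

```python
from collections import deque

PLUG = object()  # hypothetical sentinel marking an epoch boundary


def replay(events):
    """Replay packets and plug commands; return (released, still_buffered)."""
    queue = deque()
    released = []
    for ev in events:
        if ev == "CHECKPOINT":      # postsuspend(): close the current epoch
            queue.append(PLUG)
        elif ev == "RELEASE":       # commit(): free the oldest closed epoch
            if PLUG in queue:
                pkt = queue.popleft()
                while pkt is not PLUG:
                    released.append(pkt)
                    pkt = queue.popleft()
        else:                       # anything else is a guest packet
            queue.append(ev)
    return released, [p for p in queue if p is not PLUG]


# Normal operation: p3 is sent after the domain resumes, so it belongs to
# the next epoch and stays buffered when the first checkpoint is committed.
print(replay(["p1", "p2", "CHECKPOINT", "p3", "RELEASE"]))
# (['p1', 'p2'], ['p3'])

# CHECKPOINT commented out, RELEASE kept: no epoch is ever closed, so each
# release has nothing to free and every packet stays in the buffer.
print(replay(["p1", "p2", "p3", "RELEASE", "p4", "RELEASE"]))
# ([], ['p1', 'p2', 'p3', 'p4'])
```

The second trace suggests why removing only the checkpoint message does not turn buffering off: in this model release frees closed epochs only, and without a checkpoint no epoch is ever closed.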