Hi Folks,

Perhaps somebody can illuminate the status of Remus. It looks like a nice HA option (certainly simpler than cobbling together DRBD, Pacemaker, etc.) - but it also looks like it requires using an older kernel, and there have been no documentation updates since sometime in 2010.

So... is the project alive, is it catching up with the rest of Xen, or is it best to avoid it?

Thanks,

Miles Fidelman

--
In theory, there is no difference between theory and practice.
In<fnord> practice, there is. .... Yogi Berra

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
On Tue, Mar 29, 2011 at 04:36:08PM -0400, Miles Fidelman wrote:
> So... is the project alive, is it catching up with the rest of Xen, or
> is it best to avoid it?

Hello,

There's a new and active maintainer for Remus. He's been posting many patches recently to the xen-devel mailing list.

Just remember that Remus is an FT (Fault Tolerance) solution. It's often better to use software/service-based HA instead of VM-based FT. FT is way slower than "normal" HA.

-- Pasi
Pasi Kärkkäinen wrote:
> There's a new and active maintainer for Remus. He's been posting
> many patches recently to the xen-devel mailing list.

Good to hear!

> Just remember that Remus is an FT (Fault Tolerance) solution.
> It's often better to use software/service-based HA instead of VM-based FT.
> FT is way slower than "normal" HA.

Can you elaborate a bit, both on what you see as the difference between fault tolerance and HA, and on speed?

Currently, I'm running a collection of services on a single VM, with DRBD/heartbeat/pacemaker failover to a 2nd node. Failover takes a LONG time. Looks to me like Remus should provide instant failover.

Thanks,

Miles Fidelman
My guess would be that Pasi is talking about the overall performance/efficiency of the solution, not the failover time. Even though Remus might shorten the failover time significantly, it might be less efficient in utilizing your resources because of the way it works. What I mean is that with an imaginary software/service HA solution you might get 100 requests/sec (or whatever metric you want to use), while Remus could deliver less, say 80 requests/sec. I have no idea how fast or efficient Remus actually is, so don't take those numbers as any indication of the predicted performance of either solution; they are there just to clarify what I am trying to say.

Between H(igh) A(vailability) and F(ault) T(olerance) and all the other two-letter combinations, it seems to be quite hard to draw clear lines. My take is that while HA certainly has something to do with FT, they are not the same. HA is the more generic term for securing a service by keeping it available even in disaster conditions, and FT might be part of that agenda. Having dual power supplies in a server increases the fault tolerance of that server, but does it make the services running on it more highly available? I guess in a way it might, but the primary goal of adding a PSU was to increase fault tolerance, not to make some service more or less available. One could say that one is trying to make the service highly available by increasing its fault tolerance. Not sure if this makes any sense, but I couldn't find any gospel-like definition for the terms, so I'm making this up as I write.

In the case of Remus you are "running two instances of the same VM", and because you have a kind of backup VM ready to take over for the primary, it increases the fault tolerance of that VM. But because the VMs are (if I understand correctly) more or less the same, a software bug (for example in Apache or whatever service you are trying to make more highly available) could possibly take both VMs down, since they are the same and the bug could affect them equally.

Sorry for the long post and for putting my spoon in a soup that is not mine. I hope some of what I said made sense and helps you in some way.

-Henrik Andersson

On 11 April 2011 01:42, Miles Fidelman <mfidelman@meetinghouse.net> wrote:
> Can you elaborate a bit, both re. what you see as the difference between
> fault tolerance vs. HA, and re. speed?
On Sun, Apr 10, 2011 at 06:42:47PM -0400, Miles Fidelman wrote:
> Can you elaborate a bit, both re. what you see as the difference between
> fault tolerance vs. HA, and re. speed?
>
> Currently, I'm running a collection of services on a single VM, with
> DRBD/heartbeat/pacemaker failover to a 2nd node. Failover takes a LONG
> time. Looks to me like Remus should provide instant failover.

Fault Tolerance means syncing the VM's CPU state, memory, disk, and network I/O between two physical hosts. That is not very efficient and will create huge delays compared to just executing the VM locally.

So you might get much better performance by running an active/active HA cluster with a network/application-level load balancer in front.

-- Pasi
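
For anyone who wants a feel for the trade-off discussed in this thread, here is a rough toy model in Python. It is only a sketch under stated assumptions, not a description of how Remus is actually implemented: it assumes a checkpoint-style scheme in which the guest runs for a fixed epoch, is paused briefly while its state is captured, and has its outgoing network packets held back until the backup host acknowledges the checkpoint. The function names, parameters, and all numbers are made up for illustration and are not measurements of any real system.

```python
def guest_run_fraction(epoch_ms: float, pause_ms: float) -> float:
    """Fraction of wall-clock time the guest actually executes.

    Toy assumption: the guest runs for epoch_ms, then is paused for
    pause_ms while its state is captured, and the cycle repeats.
    """
    return epoch_ms / (epoch_ms + pause_ms)


def mean_added_latency_ms(epoch_ms: float, sync_ms: float) -> float:
    """Average extra delay on an outgoing packet.

    Toy assumption: a packet is produced at a uniformly random point in
    the epoch, so it waits half an epoch on average for the next
    checkpoint, plus sync_ms for that checkpoint to be copied to the
    backup host and acknowledged.
    """
    return epoch_ms / 2 + sync_ms


if __name__ == "__main__":
    # Illustrative numbers only, chosen to mirror the "100 vs. 80
    # requests/sec" example earlier in the thread.
    baseline_rps = 100
    fraction = guest_run_fraction(epoch_ms=40, pause_ms=10)
    print(f"guest runs {fraction:.0%} of the time")
    print(f"CPU-bound throughput: about {baseline_rps * fraction:.0f} req/s")
    print(f"mean added latency: {mean_added_latency_ms(40, 5):.1f} ms")
```

In this model, shortening the epoch lowers the added latency but leaves the guest paused more often, which is one way to picture why VM-level FT can cost more than an active/active setup behind a load balancer, where neither node waits on the other.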