Sebastian Kuttnig
2025-May-09 16:50 UTC
[Nut-upsdev] Questions about failover architecture
Hello Jim,

Many thanks for the detailed insights, and also for the clarification on
driver multiplexing versus failover; that was very helpful.

While I don't have much to contribute on the multiplexing front, I may take a
shot at exploring the failover and proxying drivers. At the very least, it
might be useful for my personal requirements; at best, perhaps to others as
well, or even in support of later multiplexing efforts.

Best regards,
Sebastian

On Fri, May 9, 2025 at 18:27, Jim Klimov <jimklimov+nut at gmail.com> wrote:

> Hello, and thanks for raising this discussion.
>
> FWIW, the idea has been (very) slowly brewing for at least a decade -
> backlogged since https://github.com/networkupstools/nut/issues/273 (which
> proposes to implement this over dummy-ups or clone drivers as they already
> have ways to talk to other NUT drivers to represent them for various
> purposes).
>
> There is a little collection of related tickets seen by GitHub tag:
> https://github.com/networkupstools/nut/issues?q=state%3Aopen%20label%3A%22Data%20multipathing%22
>
> General thoughts collected so far are that:
> * Failover differs from multiplexing :) (Using one driver at a time or
>   several at once)
> * Practical goals may include having best info about the device, with each
>   single driver/protocol only offering a partial aspect, and the redundancy
>   against loss of one of the connections (e.g. serial/USB + SNMP).
> * Different drivers (or even e.g. SNMP mappings to numerous served MIBs in
>   the same driver, see https://github.com/networkupstools/nut/issues/2036)
>   can offer different insights into the device - sets of named data points,
>   their precision, settable variables and instant commands. When
>   multiplexing, we need a way (automatic and/or user-tunable) to select
>   which variant we prefer (exclusively, or what to try first - in case of
>   commands or writeable settings).
> * The approach of `clone*` drivers (which talk to another driver directly
>   over local socket) and recent advances in `upsdrvquery.{c,h}` may be more
>   applicable than using a `dummy-ups` driver (which is a NUT networked
>   protocol client when going in relay/proxy mode) that I initially thought
>   could be useful here.
>
> But so far nothing has been fleshed out beyond that (AFAIK)...
>
> Hope this helps,
> Jim Klimov
>
> On Fri, May 9, 2025 at 12:37 PM Sebastian Kuttnig via Nut-upsdev
> <nut-upsdev at alioth-lists.debian.net> wrote:
>
>> Hi all,
>>
>> Yesterday's PR #2945 got me thinking about my basic failover setup and
>> raised some questions I didn't want to derail the PR with; I hope this is
>> the right place to ask. Apologies if not.
>>
>> The PR includes this note:
>>
>> > NOTE: There is currently no support for "multiplexing" information or
>> > commands for devices with multiple-media/multiple-driver access, e.g. to
>> > gain redundant connection or a better data collection than any one driver
>> > provides. For some devices it may be possible to run several drivers
>> > independently (probably for your monitoring system to cherry-pick data
>> > points from one driver or another), while others might only enable any one
>> > link exclusively.
>>
>> That raised a few questions:
>>
>> - Is there any recommended or emerging approach for UPS failover using
>>   multiple devices or drivers? From the note, it sounds like this isn't
>>   widely supported yet.
>>
>> - Would there be interest in developing a driver (or utility) specifically
>>   aimed at supporting failover logic within NUT?
>>
>> For context: I currently use a custom NUT driver that polls an HTTP(S)
>> endpoint which aggregates `upsc` dumps and handles failover logic outside
>> of NUT. The driver itself is simple: it just reports what it gets, much
>> like `upsc`.
>>
>> This setup works, but may be heavier than needed for basic failover use
>> cases.
>> I've been thinking about splitting the responsibilities into two drivers:
>>
>> - `http-ups` (or `clone-http`): polls an HTTP(S) endpoint and reports
>>   `upsc`-style data; maybe useful for development or external handling.
>>
>> - `failover`: monitors the sockets of other local drivers and promotes a
>>   primary according to some logic, if the current one fails; basic
>>   internal handling.
>>
>> Does this approach make sense in the broader NUT architecture? Would this
>> be something worth developing or upstreaming?
>>
>> Thanks!
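
For illustration only, the "promote a primary if the current one fails" logic
proposed for the `failover` driver could be sketched roughly as below. This is
not NUT code and not the design of any existing driver: `Source`,
`is_healthy` and `pick_primary` are hypothetical names, health is reduced to
"the source returned an `upsc`-style dump containing `ups.status`", and
talking to the local driver sockets is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def parse_upsc(dump: str) -> Dict[str, str]:
    """Parse upsc-style "name: value" lines into a dict."""
    data = {}
    for line in dump.splitlines():
        # Split on the first colon only, so values may contain colons.
        name, sep, value = line.partition(":")
        if sep:
            data[name.strip()] = value.strip()
    return data


@dataclass
class Source:
    name: str             # hypothetical label for one underlying driver
    priority: int         # lower number = preferred
    dump: Optional[str]   # latest upsc-style dump, None if unreachable


def is_healthy(source: Source) -> bool:
    """Assumption: a source is usable only if it reports ups.status."""
    if source.dump is None:
        return False
    return bool(parse_upsc(source.dump).get("ups.status"))


def pick_primary(sources: List[Source],
                 current: Optional[Source] = None) -> Optional[Source]:
    """Keep the current primary while healthy, else promote by priority."""
    if current is not None and is_healthy(current):
        return current
    healthy = [s for s in sources if is_healthy(s)]
    return min(healthy, key=lambda s: s.priority, default=None)


# Usage: the USB link is preferred, then lost, so SNMP gets promoted.
usb = Source("usb", 0, "ups.status: OL\nbattery.charge: 100")
snmp = Source("snmp", 1, "ups.status: OL")
first = pick_primary([usb, snmp])
usb.dump = None
second = pick_primary([usb, snmp], current=first)
```

Keeping the current primary while it stays healthy avoids flapping back and
forth when a preferred link recovers briefly; whether that is the desired
policy is one of the open questions in this thread.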
Perhaps it would be helpful to articulate what the problem is that needs
solving. It sort of sounds like:

- there are UPS units that one can access via multiple mechanisms (e.g.
  serial and USB, or USB and SNMP/ethernet)
- people are able to hook them up both ways at the same time
- people have problems where
  - the UPS keeps working
  - the communications via method A fail
  - at the same time, method B keeps working
- this happens often enough that spending time making a scheme to talk to
  both and integrate them, or pick one, is worthwhile

and in total I find this surprising. Some real examples of problems and
desired approaches would help.