Hi,

On Fri, Feb 14, 2025 at 11:51:47AM -0500, Chris Rapier wrote:
> Would there be any interest in an implementation for OpenSSH or should I
> develop it for HPN-SSH first and report back?

I do wonder if HE is really still needed in 2025. If one of the protocols
is not available at all, failover is already quick - and if one is broken,
it's useful to actually notice that it is so, and not hide it under
"ah, let's pretend nothing happens and quickly use the other one".

(The web people seem to be unwilling to take the risk that anything could
be slow, because, advertising revenue and such)

gert
--
"If was one thing all people took for granted, was conviction that if you
 feed honest figures into a computer, honest figures come out. Never doubted
 it myself till I met a computer with a sense of humor."
                             Robert A. Heinlein, The Moon is a Harsh Mistress

Gert Doering - Munich, Germany                     gert at greenie.muc.de
On 2/14/25 13:04, Gert Doering wrote:
> Hi,
>
> On Fri, Feb 14, 2025 at 11:51:47AM -0500, Chris Rapier wrote:
>> Would there be any interest in an implementation for OpenSSH or should I
>> develop it for HPN-SSH first and report back?
>
> I do wonder if HE is really still needed in 2025. If one of the protocols
> is not available at all, failover is already quick - and if one is broken,
> it's useful to actually notice that it is so, and not hide it under
> "ah, let's pretend nothing happens and quickly use the other one".
>
> (The web people seem to be unwilling to take the risk that anything could
> be slow, because, advertising revenue and such)

It's an issue in a number of environments aside from advertising. I'm
largely thinking of temporally constrained data transfers in HPC
environments. For example, in some distributed astronomy observations you
have a pretty small window to move the observational data to the central
collector. Since the data might be tens to hundreds of GB in size, dealing
with a timeout might cause cascading data losses. Having a faster failover
in the event of a transitory resolution or network issue is important.

This could be resolved by changing the connection timeout in OpenSSH, but
it might make more sense to implement RFC 8305.

Does this mean that some failures might not be as obvious anymore? Maybe,
but an organization should have other methods of fault detection to pick up
on issues like this. Not that user feedback isn't important, but it should
be the last layer of fault detection.

Now, the obvious alternative solution would be to just use Globus/GridFTP,
but that now comes with licensing constraints that can make it less
attractive for under-resourced institutions (to say nothing of the higher
administrative burden).

Note: I'm not saying that any of the OpenSSH maintainers should do the work
on this.

Chris