Peter Stuge
2021-Sep-09 22:57 UTC
ssh-agent: perform AGENTC_REMOVE_ALL_IDENTITIES on SIGUSR1
Hi Steffen, Steffen Nurpmeso wrote:> |I think the lack of feedback about removal success/failure inherent > |with signals is problematic. > > Bah. What hair-splitting is that? Sorry.To clarify, my comment is about using a signal to trigger the deletion; with a signal the sender can not know whether removal completed successfully, failed or is ongoing. They can merel hope for the best. That's a very weak security promise. I for one would expect suspend to wait for successful removal and that the suspend is cancelled with an error message should removal fail, to know the state of agents.> |> Since _i think_ the poll(2) will not immediately tick due to > |> SA_RESTART if the "timeout" parameter is not -1,..> |"The details vary across UNIX systems" suggests that you may need to > |do research on this. > > Wouldn't you agree that the approach that was chosen covers exactly that?Yes! Another fd can make poll() return reliably. I'd probably choose a pipe.> |There's no error handling after socket(), which is a problem in itself > |but also ties into the feedback problem. Since this is intended to be > > To put that straight you mean the opposite, that _if_ socket(2) > fails we will not be able to connect(2), and therefore the loop > will not tick.Sorry, I wasn't so clear - the patch did test socket() success, but neither connect() nor close() success, at a minimum connect() must succeed for poll() to return.> If it wakes up the next time it _will_ remove the keys.Without feedback (blocking removal) that could be post resume. Such an addition would merely provide a false sense of security.> |a security feature there really needs to be some feedback, making..> call the cleanup_handler() and then fatal_r()ly abort the program instead?I find that a good addition in error paths but it doesn't help with the race condition inherent to using a signal.> |direct interaction with the agents look a lot better, with the > > How do you do this, Mister, from an ACPI script running as root?for SSH_AUTH_SOCK in /tmp/ssh-*/agent.*; do ssh-add -D; done # maybe?> |significant additional advantage of needing zero new agent code. > > Oh nooo! Spoiler!!!When in doubt, keep the trusted codebase small. //Peter
Darren Tucker
2021-Sep-10 03:57 UTC
ssh-agent: perform AGENTC_REMOVE_ALL_IDENTITIES on SIGUSR1
On Fri, 10 Sept 2021 at 09:04, Peter Stuge <peter at stuge.se> wrote:> Hi Steffen, > > Steffen Nurpmeso wrote: > [...] > > |"The details vary across UNIX systems" suggests that you may need to > > |do research on this. > > > > Wouldn't you agree that the approach that was chosen covers exactly that? > > Yes! Another fd can make poll() return reliably. I'd probably choose a > pipe. >You could change the poll() to ppoll() and give it an appropriate signal mask, then the signal handler would only have to set a sig_atomic_t flag. Not every platform has ppoll(), but we already have compat code implementing both poll() and pselect() on top of select(), so we've already got most of what would be needed to do ppoll too. -- Darren Tucker (dtucker at dtucker.net) GPG key 11EAA6FA / A86E 3E07 5B19 5880 E860 37F4 9357 ECEF 11EA A6FA (new) Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.
Steffen Nurpmeso
2021-Sep-10 12:06 UTC
ssh-agent: perform AGENTC_REMOVE_ALL_IDENTITIES on SIGUSR1
Hello Peter. Peter Stuge wrote in <20210909225708.2121.qmail at stuge.se>: |Steffen Nurpmeso wrote: |>|I think the lack of feedback about removal success/failure inherent |>|with signals is problematic. |> |> Bah. What hair-splitting is that? Sorry. | |To clarify, my comment is about using a signal to trigger the deletion; Well in the end i am only def and blind and need some time for wrapping my head around a problem. I think in fact that de Raadt's initial comment pointed into the right direction, and that requires nothing such, but the patch would be diff --git a/ssh-agent.1 b/ssh-agent.1 index ed8c870969..dd2b0e8af1 100644 --- a/ssh-agent.1 +++ b/ssh-agent.1 @@ -178,6 +178,9 @@ will automatically use them if present. is also used to remove keys from .Nm and to query the keys that are held in one. +.Nm +will remove all keys when it receives the user signal +.Dv SIGUSR1 . .Pp Connections to .Nm diff --git a/ssh-agent.c b/ssh-agent.c index 48a47d45a9..014cf7b38c 100644 --- a/ssh-agent.c +++ b/ssh-agent.c @@ -171,6 +171,11 @@ static int fingerprint_hash = SSH_FP_HASH_DEFAULT; /* Refuse signing of non-SSH messages for web-origin FIDO keys */ static int restrict_websafe = 1; +/* Request to remove_all_identities() pending */ +static volatile sig_atomic_t remove_all_asap = 0; + +static enum run_state { RS_RUN, RS_POLL } run_state = RS_RUN; + static void close_socket(SocketEntry *e) { @@ -529,11 +534,10 @@ process_remove_identity(SocketEntry *e) } static void -process_remove_all_identities(SocketEntry *e) +remove_all_identities(void) { Identity *id; - debug2_f("entering"); /* Loop over all identities and clear the keys. */ for (id = TAILQ_FIRST(&idtab->idlist); id; id = TAILQ_FIRST(&idtab->idlist)) { @@ -543,6 +547,13 @@ process_remove_all_identities(SocketEntry *e) /* Mark that there are no identities. */ idtab->nentries = 0; +} + +static void +process_remove_all_identities(SocketEntry *e) +{ + debug2_f("entering"); + remove_all_identities(); /* Send success. */ send_status(e, 1); @@ -1342,6 +1353,16 @@ cleanup_handler(int sig) _exit(2); } +/*ARGSUSED*/ +static void +remove_all_handler(int sig) +{ + if (run_state == RS_POLL) + remove_all_identities(); + else + remove_all_asap = 1; +} + static void check_parent_exists(void) { @@ -1619,6 +1640,7 @@ skip: ssh_signal(SIGINT, (d_flag | D_flag) ? cleanup_handler : SIG_IGN); ssh_signal(SIGHUP, cleanup_handler); ssh_signal(SIGTERM, cleanup_handler); + ssh_signal(SIGUSR1, remove_all_handler); if (pledge("stdio rpath cpath unix id proc exec", NULL) == -1) fatal("%s: pledge: %s", __progname, strerror(errno)); @@ -1626,11 +1648,17 @@ skip: while (1) { prepare_poll(&pfd, &npfd, &timeout, maxfds); + run_state = RS_POLL; result = poll(pfd, npfd, timeout); + run_state = RS_RUN; saved_errno = errno; if (parent_alive_interval != 0) check_parent_exists(); - (void) reaper(); /* remove expired keys */ + if (remove_all_asap) { + remove_all_asap = 0; + remove_all_identities(); + } else + (void) reaper(); /* remove expired keys */ if (result == -1) { if (saved_errno == EINTR) continue; And if you really want to be super exact, one would add diff --git a/ssh-agent.c b/ssh-agent.c index 014cf7b38c..1ca8d3229e 100644 --- a/ssh-agent.c +++ b/ssh-agent.c @@ -1647,6 +1647,10 @@ skip: platform_pledge_agent(); while (1) { + if (remove_all_asap) { + remove_all_asap = 0; + remove_all_identities(); + } prepare_poll(&pfd, &npfd, &timeout, maxfds); run_state = RS_POLL; result = poll(pfd, npfd, timeout); So with these two there is the security guarantee you were asking for, that the removal happens instantly. It is also portable. Since all signals are blocked, all we need to know is whether we are hanging on poll(2), in which case it is "safe" to remove from within the signal handler in that no other code is currently running in user space. I'd say. (Of course on Cygwin poll(2) could be implemented like grazy, i have not looked in that for i would say really twenty years, select(2) by then, but then again i think they should block signals while working this tremendous piece of code in something that people think are system calls.) |with a signal the sender can not know whether removal completed successf\ |ully, |failed or is ongoing. They can merel hope for the best. That's a very weak |security promise. But why so Peter? Signals are used for communication on Unix since ever? If you send a signal to a process, and it has not ignored the signal (if it can), the action behind happens. You can see that easily on your Linux system, it kill aborts any program you have rights for, just specify a realtime signal. |I for one would expect suspend to wait for successful removal and |that the suspend is cancelled with an error message should removal |fail, to know the state of agents. What should fail when zeroing memory? There is no status for this operation anyhow, is it? It is merely you calling ssh-add, which then reports interaction with the agent was successful. ... |>|direct interaction with the agents look a lot better, with the |> |> How do you do this, Mister, from an ACPI script running as root? | |for SSH_AUTH_SOCK in /tmp/ssh-*/agent.*; do ssh-add -D; done # maybe? This is overly funny. |>|significant additional advantage of needing zero new agent code. |> |> Oh nooo! Spoiler!!! | |When in doubt, keep the trusted codebase small. Done. I always hate it to look in codebases that i do not know. There is so much work and time in such a codebase, done by people knowing the context exactly, and i just hate it. I mean this is effectively an easy thing here (but for me), but in general someone who knows a codebase can implement something _right_ (avoiding pitfalls) in a very short time, a state that third parties can achieve only with real effort. So the patch above should possibly have been it from the start. Maybe it will be considered, that would be nice. (Just 's/^ //.) Ciao from Germany, --steffen | |Der Kragenbaer, The moon bear, |der holt sich munter he cheerfully and one by one |einen nach dem anderen runter wa.ks himself off |(By Robert Gernhardt)