I?m trying to build a version of OpenSSH 9.0p1 with the GSSAPI patch at https://sources.debian.org/patches/openssh/1:9.0p1-1/gssapi.patch/. The patch applied just fine and I had no issues with the build or the tests (all of which passed). That said, I think I?ve discovered a bug in the patch. I know that?s not the responsibility of the OpenSSH maintainers, but I was hoping to get some help from this list on a higher-level question about the right way to work with sshbufs, as I haven?t had a lot of experience actually making changes to OpenSSH. The issue occurs when the GSS client kex code receives a KEXGSS_HOSTKEY message. There?s code to handle this message, but it runs into an issue where it reports "ssh_packet_read: read: internal error: buffer is read-only?. When I looked into this more closely, the issue is not actually that the sshbuf it is using is read-only. It?s that the sshbuf has a refcount greater than 1 at the time the code tries to reuse it. Here?s the code in question: struct sshbuf *server_host_key_blob = NULL; ...snip... /* If we've sent them data, they should reply */ do { type = ssh_packet_read(ssh); if (type == SSH2_MSG_KEXGSS_HOSTKEY) { debug("Received KEXGSS_HOSTKEY"); if (server_host_key_blob) fatal("Server host key received more than once"); if ((r = sshpkt_getb_froms(ssh, &server_host_key_blob)) != 0) fatal("Failed to read server host key: %s", ssh_err(r)); } } while (type == SSH2_MSG_KEXGSS_HOSTKEY); ...snip... out: sshbuf_free(server_host_key_blob); I believe the problem here is that the call to sshpkt_getb_froms() is returning an sshbuf in server_host_key_blob which is a reference to the string being consumed from the packet being read, setting that original packet as its parent. As a result, the ?ssh? buffer now has a refcount of 2, and when it returns to the top of the do {...} while and tries to read another packet into ?ssh?, it gets the error about the sshbuf being ?read-only? (for good reason). There?s already code in place to free server_host_key_blob, so I think the solution here is probably to make a copy of this blob into a new sshbuf rather than getting a reference to the existing memory. However, I?m not sure what the best function is to use in place of sshpkt_getb_froms() which will guarantee a new allocation is performed. Anyone have a suggestion on this? The sshbuf_get_stringb() seems close, but it doesn?t take into account that ?ssh? needs to be accessed as a packet. I see various sshpkt_get_string functions, but not specifically an sshpkt_get_stringb(). It wouldn?t be hard to add, but if it?s not already present there?s probably a good reason for that. Is there a better way to do this? As an aside, I?m guessing this refcounting capability was added after the patch was originally written, and the patch was never updated to consider this issue, meaning there might be other places in the patch where this occurs. Is there anything I can read that might provide more information about which functions began returning references rather than always making copies of buffers? Thanks in advance for any help you can provide! -- Ron Frederick ronf at timeheart.net
On Sat, 21 May 2022, Ron Frederick wrote:> I?m trying to build a version of OpenSSH 9.0p1 with the GSSAPI patch > at https://sources.debian.org/patches/openssh/1:9.0p1-1/gssapi.patch/. > The patch applied just fine and I had no issues with the build or the > tests (all of which passed). > > That said, I think I?ve discovered a bug in the patch. I know that?s > not the responsibility of the OpenSSH maintainers, but I was hoping > to get some help from this list on a higher-level question about the > right way to work with sshbufs, as I haven?t had a lot of experience > actually making changes to OpenSSH. > > The issue occurs when the GSS client kex code receives a > KEXGSS_HOSTKEY message. There?s code to handle this message, but it > runs into an issue where it reports "ssh_packet_read: read: internal > error: buffer is read-only?. When I looked into this more closely, the > issue is not actually that the sshbuf it is using is read-only. It?s > that the sshbuf has a refcount greater than 1 at the time the code > tries to reuse it. Here?s the code in question: > > struct sshbuf *server_host_key_blob = NULL; > > ...snip... > /* If we've sent them data, they should reply */ > do { > type = ssh_packet_read(ssh); > if (type == SSH2_MSG_KEXGSS_HOSTKEY) { > debug("Received KEXGSS_HOSTKEY"); > if (server_host_key_blob) > fatal("Server host key received more than once"); > if ((r = sshpkt_getb_froms(ssh, &server_host_key_blob)) != 0) > fatal("Failed to read server host key: %s", ssh_err(r)); > } > } while (type == SSH2_MSG_KEXGSS_HOSTKEY); > ...snip... > out: > sshbuf_free(server_host_key_blob); > > I believe the problem here is that the call to sshpkt_getb_froms() is > returning an sshbuf in server_host_key_blob which is a reference to > the string being consumed from the packet being read, setting that > original packet as its parent. As a result, the ?ssh? buffer now has > a refcount of 2, and when it returns to the top of the do {...} while > and tries to read another packet into ?ssh?, it gets the error about > the sshbuf being ?read-only? (for good reason).IMO the best fix for this would be to put a sshbuf_free() at the start of the do {} loop (it will handle a NULL argument fine).
On 5/22/22 02:17, Ron Frederick wrote:> I?m trying to build a version of OpenSSH 9.0p1 with the GSSAPI patch at https://sources.debian.org/patches/openssh/1:9.0p1-1/gssapi.patch/. The patch applied just fine and I had no issues with the build or the tests (all of which passed). > > That said, I think I?ve discovered a bug in the patch. I know that?s not the responsibility of the OpenSSH maintainers, but I was hoping to get some help from this list on a higher-level question about the right way to work with sshbufs, as I haven?t had a lot of experience actually making changes to OpenSSH. > > The issue occurs when the GSS client kex code receives a KEXGSS_HOSTKEY message. There?s code to handle this message, but it runs into an issue where it reports "ssh_packet_read: read: internal error: buffer is read-only?. When I looked into this more closely, the issue is not actually that the sshbuf it is using is read-only. It?s that the sshbuf has a refcount greater than 1 at the time the code tries to reuse it. Here?s the code in question: > > struct sshbuf *server_host_key_blob = NULL; > > ...snip... > /* If we've sent them data, they should reply */ > do { > type = ssh_packet_read(ssh); > if (type == SSH2_MSG_KEXGSS_HOSTKEY) { > debug("Received KEXGSS_HOSTKEY"); > if (server_host_key_blob) > fatal("Server host key received more than once"); > if ((r = sshpkt_getb_froms(ssh, &server_host_key_blob)) != 0) > fatal("Failed to read server host key: %s", ssh_err(r)); > } > } while (type == SSH2_MSG_KEXGSS_HOSTKEY); > ...snip... > out: > sshbuf_free(server_host_key_blob); > > I believe the problem here is that the call to sshpkt_getb_froms() is returning an sshbuf in server_host_key_blob which is a reference to the string being consumed from the packet being read, setting that original packet as its parent. As a result, the ?ssh? buffer now has a refcount of 2, and when it returns to the top of the do {...} while and tries to read another packet into ?ssh?, it gets the error about the sshbuf being ?read-only? (for good reason). > > There?s already code in place to free server_host_key_blob, so I think the solution here is probably to make a copy of this blob into a new sshbuf rather than getting a reference to the existing memory. However, I?m not sure what the best function is to use in place of sshpkt_getb_froms() which will guarantee a new allocation is performed. Anyone have a suggestion on this? The sshbuf_get_stringb() seems close, but it doesn?t take into account that ?ssh? needs to be accessed as a packet. I see various sshpkt_get_string functions, but not specifically an sshpkt_get_stringb(). It wouldn?t be hard to add, but if it?s not already present there?s probably a good reason for that. Is there a better way to do this? > > As an aside, I?m guessing this refcounting capability was added after the patch was originally written, and the patch was never updated to consider this issue, meaning there might be other places in the patch where this occurs. Is there anything I can read that might provide more information about which functions began returning references rather than always making copies of buffers? > > Thanks in advance for any help you can provide!We track the gsskex patches in the following github repository: https://github.com/openssh-gsskex/openssh-gsskex/ I did not read into details about that, but believe I already saw this issue and we were fixing it: https://github.com/openssh-gsskex/openssh-gsskex/pull/19 Unfortunately, the repository is not completely up to date, but both Colin and Dmitry should be able to help you around here. Regards, -- Jakub Jelen Crypto Team, Security Engineering Red Hat, Inc.