thr3ads.net - freebsd stable - Attention 7.x and 8.x ptmx/pts users (read if you set kern.pts.enable=1) [Dec 2007]


Robert Watson

2007-Dec-03 15:38 UTC

Attention 7.x and 8.x ptmx/pts users (read if you set kern.pts.enable=1)

(If you aren't interested in the details of our ptmx/pty/pts driver, skip to
  the paragraph that begins "Why the long-winded story?")

Dear all:

The current ptmx/pts implementation makes use of devfs(4) cloning: a user 
process wanting to allocate a pty/pts pair opens /dev/ptmx, which returns a 
reference to a new pty master.  An ioctl is then performed to query which pts 
number was returned, and the pts device is then opened.  Internally, the 
lookup of /dev/ptmx causes the driver to instantiate the pty, and then when 
the pty is opened, the pts is created.  The pty and pts nodes are both 
destroyed when the last close occurs, cleaning up the bits automatically when 
the last process attached to the pair exits.  Sounds good. :-)
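
For readers who want to see the userland side of this sequence, here is a
minimal sketch in C.  It uses the portable posix_openpt(3), grantpt(3),
unlockpt(3) and ptsname(3) calls rather than the FreeBSD-specific ioctl, which
this message does not name; how libc maps those calls onto /dev/ptmx is
implementation-specific.  The helper name open_pty_pair is an assumption made
for illustration only.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* expose posix_openpt() and friends */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original message): allocate a pty
 * master, look up the name of its slave, and open the slave.
 * Returns 0 on success and -1 on failure.
 */
int
open_pty_pair(int *master_fd, int *slave_fd)
{
	const char *name;
	int mfd, sfd;

	assert(master_fd != NULL && slave_fd != NULL);

	/* Step 1: obtain a new master; libc typically opens the ptmx device. */
	mfd = posix_openpt(O_RDWR | O_NOCTTY);
	if (mfd < 0)
		return (-1);

	/* Step 2: fix ownership/mode of the slave and unlock it. */
	if (grantpt(mfd) != 0 || unlockpt(mfd) != 0) {
		close(mfd);
		return (-1);
	}

	/* Step 3: ask which slave device belongs to this master. */
	name = ptsname(mfd);
	if (name == NULL) {
		close(mfd);
		return (-1);
	}

	/* Step 4: open the slave side by name. */
	sfd = open(name, O_RDWR | O_NOCTTY);
	if (sfd < 0) {
		close(mfd);
		return (-1);
	}

	*master_fd = mfd;
	*slave_fd = sfd;
	return (0);
}
```

A caller would invoke open_pty_pair(&m, &s) and close both descriptors when
done; closing the last reference to each side is what lets the kernel tear the
pair down.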

Unfortunately, the current implementation is subject to a potential resource 
leak: the pty is created when the lookup occurs, but if the open never takes 
place, then the pty is leaked.  In principle, we have facilities to GC unused 
device nodes "eventually", although not a race-free way to determine that no 
race occurs, assuming that we implemented that.  This leakage turns out to 
interact particularly poorly with our resource limits on pty/pts pairs -- both 
the administrative limit imposed by sysctl and also the functional limit on 
the number of entries in /etc/ttys.  It's possible to imagine various 
sometimes messy techniques of performing this garbage collection.

Instead, what I'd like to do is modify the ptmx code to have a race-free 
protocol, in which eventual termination of processes referencing the node 
results in freeing of the nodes.  On some systems, ptmx performs a 
"bait-and-switch", in which the file descriptor of the pty node is silently 
substituted for the file descriptor of the ptmx code--similar to our model, 
only no window between lookup and open, but also not easily supported in our 
current VFS.  Another possibility is to introduce a new system call and bypass 
ptmx entirely -- similar to pipe(), socketpair(), etc.

The change that seemed to be the least disruptive, and which I have 
implemented, introduces ptmx as a true device node (not a devfs clone), and an 
ioctl that causes the allocation of the pty and pts pair -- however, the pair 
is also added to a garbage collection list.  If the ptmx node is closed 
*before* the pty is opened, then the nodes are garbage collected.  It turns 
out this also isn't easily implementable in our VFS, as we don't offer a 
per-file descriptor opaque to be used by device drivers, nor offer the file 
descriptor pointer to the device driver (as in, say, Linux).  At some point, 
this functionality will turn up, as there has been consistent interest in it 
over time.  What I've done is implement an approximation of that model -- an 
"open counter" for ptmx, which when it hits zero across all references, causes 
a garbage collection sweep.  If/when we can use per-file descriptor state, it 
is easily modified to sweep on close of a specific descriptor.
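
To make the "open counter" idea concrete, here is a toy userland model in C.
It is only a sketch of the behaviour described above, not the kernel code:
every name in it (struct ptmx_model, ptmx_open, ptmx_alloc, pty_open,
ptmx_close, MAX_PAIRS) is hypothetical, and locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PAIRS 8	/* arbitrary limit for the model */

/* Hypothetical stand-in for one pty/pts pair. */
struct pair {
	bool allocated;	/* created by the allocation ioctl */
	bool claimed;	/* pty has been opened, so the pair is in use */
};

/* Hypothetical stand-in for the ptmx driver state. */
struct ptmx_model {
	int open_count;			/* outstanding opens of ptmx */
	struct pair pairs[MAX_PAIRS];
};

/* A process opens ptmx. */
void
ptmx_open(struct ptmx_model *m)
{
	m->open_count++;
}

/* Allocation ioctl: returns the unit number of a new pair, or -1. */
int
ptmx_alloc(struct ptmx_model *m)
{
	assert(m->open_count > 0);
	for (int i = 0; i < MAX_PAIRS; i++) {
		if (!m->pairs[i].allocated) {
			m->pairs[i].allocated = true;
			m->pairs[i].claimed = false;
			return (i);
		}
	}
	return (-1);
}

/* The pty of an allocated pair is opened, taking it off the GC list. */
void
pty_open(struct ptmx_model *m, int unit)
{
	assert(unit >= 0 && unit < MAX_PAIRS && m->pairs[unit].allocated);
	m->pairs[unit].claimed = true;
}

/*
 * A process closes ptmx.  Only when the counter reaches zero are the
 * never-opened pairs swept.  Returns the number of pairs reclaimed.
 */
int
ptmx_close(struct ptmx_model *m)
{
	int freed = 0;

	assert(m->open_count > 0);
	if (--m->open_count > 0)
		return (0);
	for (int i = 0; i < MAX_PAIRS; i++) {
		if (m->pairs[i].allocated && !m->pairs[i].claimed) {
			m->pairs[i].allocated = false;
			freed++;
		}
	}
	return (freed);
}
```

Note the approximation this model makes visible: an unclaimed pair lingers for
as long as any process still holds ptmx open, and is reclaimed only once the
counter drops to zero.  Sweeping on close of a specific descriptor would
remove that delay.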

--> start reading here if you were bored by the above

Why the long-winded story?  Well, this turns out to change the convention by 
which libc communicates with the kernel -- instead of a simple open of ptmx 
and then ioctl to find the pts, we now open ptmx, perform an ioctl to allocate 
the pair, and then open both the pty and pts nodes explicitly.  Thus, libc 
requires modification, and libcs that know how to speak to the old ptmx don't 
know how to speak to the new one, and, in effect, vice versa.  This doesn't 
meet our ABI requirements for a stable branch, so what I plan to do is 
withdraw the ptmx/pts implementation from 7.0 before the release by disabling 
it in the kernel and libc.  This will prevent us from nailing down the ABI, 
and we'll instead merge the revised protocol for 7.1.  This change will, 
however, affect users of the 8-CURRENT branch, as during an upgrade cycle, 
it's likely that libc and kernel will be out of sync, and therefore if pts 
support is enabled (via the kern.pts.enable sysctl), pty devices will not be 
available, which might crimp the style of anyone performing a remote upgrade 
via, say, ssh.

So, this is notice of two upcoming changes:

(1) kern.pts.enable will be removed in 7.x, for reintroduction in 7.1.  If
     kern.pts.enable was set, then your system will silently revert to using
     old-style ptys, and the setting of the sysctl will lead to an error.

(2) I will merge the revised ptmx implementation to 7.x, potentially
     disrupting use of pty/pts devices for users who have kern.pts.enable
     explicitly set to a non-zero value.

Hopefully this will resolve the known resource leaks in the ptmx code, and get 
us on track to start enabling it by default in the near future ... in 8.x, and 
at least offering it as a production feature in 7.x.

Thanks,

Robert N M Watson
Computer Laboratory
University of Cambridge

Ed Schouten

2007-Dec-04 02:23 UTC


* Robert Watson <rwatson@FreeBSD.org> wrote:
> Unfortunately, the current implementation is subject to a potential 
> resource leak: the pty is created when the lookup occurs, but if the open 
> never takes place, then the pty is leaked.  In principle, we have 
> facilities to GC unused device nodes "eventually", although not a race-free
> way to determine that no race occurs, assuming that we implemented that.  
> This leakage turns out to interact particularly poorly with our resource 
> limits on pty/pts pairs -- both the administrative limit imposed by sysctl 
> and also the functional limit on the number of entries in /etc/ttys.  It's
> possible to imagine various sometimes messy techniques of performing this 
> garbage collection.
So this is the same issue I sent a message to arch@ about some time ago,
that /dev/ptmx already returns a reference to the new pty, already when
you stat(2) it (for example by running `ls -l /dev/ptmx')?
> Instead, what I'd like to do is modify the ptmx code to have a race-free
> protocol, in which eventual termination of processes referencing the node 
> results in freeing of the nodes.  On some systems, ptmx performs a 
> "bait-and-switch", in which the file descriptor of the pty node is silently
> substituted for the file descriptor of the ptmx code--similar to our model,
> only no window between lookup and open, but also not easily supported in 
> our current VFS.  Another possibility is to introduce a new system call and
> bypass ptmx entirely -- similar to pipe(), socketpair(), etc.
I actually think that this sounds pretty nice. You mean something like
an in-kernel implementation for openpty()?

Another thing that would make the TTY code a little bit cleaner in my
opinion is removing the PRIV_TTY_PRISON check and making something
generic inside devfs. If we have proper garbage collection of TTYs,
then we can just change make_dev_cred() to bind the new device node to a
certain jail. That way you could even choose to hide nodes in /dev that
don't belong to the jail in question.

-- 
 Ed Schouten <ed@fxq.nl>
 WWW: http://g-rave.nl/
