Hi!

> Cryptographic libraries carry pseudo random number generators to
> quickly provide randomness when needed. If such a random pool gets
> cloned, secrets may get revealed, as the same random number may get
> used multiple times. For fork, this was fixed using the WIPEONFORK
> madvise flag [1].

> Unfortunately, the same problem surfaces when a virtual machine gets
> cloned. The existing flag does not help there. This patch introduces a
> new flag to automatically clear memory contents on VM suspend/resume,
> which will allow random number generators to reseed when virtual
> machines get cloned.

Umm. If this is a real problem, should the kernel provide such an RNG in
the vDSO page using vsyscalls? The kernel can have a special interface to
its vsyscalls, but we may not want to offer this functionality to the
rest of userland...

> - Provides a simple mechanism to avoid RAM exfiltration during
>   traditional sleep/hibernate on a laptop or desktop when memory,
>   and thus secrets, are vulnerable to offline tampering or
>   inspection.

This second use has nothing to do with RNGs, right?

And I don't think we should do this in the kernel. It is userspace that
initiates the suspend transition. Userspace should lock the screen
_before_ starting it, for example. Userspace should also get rid of any
secrets first...

Best regards,
Pavel
-- 
(english) livejournal.com/~pavelmachek
(cesky, pictures) atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Sat, Jul 4, 2020 at 12:44 AM Pavel Machek <pavel at ucw.cz> wrote:
> > Cryptographic libraries carry pseudo random number generators to
> > quickly provide randomness when needed. If such a random pool gets
> > cloned, secrets may get revealed, as the same random number may get
> > used multiple times. For fork, this was fixed using the WIPEONFORK
> > madvise flag [1].
> >
> > Unfortunately, the same problem surfaces when a virtual machine gets
> > cloned. The existing flag does not help there. This patch introduces a
> > new flag to automatically clear memory contents on VM suspend/resume,
> > which will allow random number generators to reseed when virtual
> > machines get cloned.
>
> Umm. If this is a real problem, should the kernel provide such an RNG in
> the vDSO page using vsyscalls? The kernel can have a special interface to
> its vsyscalls, but we may not want to offer this functionality to the
> rest of userland...

And then the kernel would just need to maintain a sequence
number in the vDSO data page that gets bumped on suspend, right?
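
A minimal sketch of the sequence-number idea above, in C. Nothing here is an existing kernel interface: `fake_vdso_suspend_epoch` is a hypothetical stand-in for a counter the kernel would expose read-only in the vDSO data page, `fake_kernel_suspend_resume()` simulates the kernel bumping it, and the `rng_*` names are made up for illustration. The consumer remembers the last epoch it saw and reseeds when the value changes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a counter in the vDSO data page that the
 * kernel would bump on every suspend/resume. Not a real interface. */
static _Atomic uint64_t fake_vdso_suspend_epoch;

struct rng_state {
    uint64_t seen_epoch;     /* epoch observed at the last (re)seed */
    uint64_t reseed_count;   /* number of reseeds, for illustration only */
    unsigned char pool[32];  /* placeholder for the real RNG pool */
};

/* Placeholder reseed: a real library would pull fresh entropy here,
 * for example with getrandom(2). */
void rng_reseed(struct rng_state *s)
{
    assert(s != NULL);
    memset(s->pool, 0, sizeof(s->pool));
    s->reseed_count++;
}

void rng_init(struct rng_state *s)
{
    assert(s != NULL);
    s->reseed_count = 0;
    s->seen_epoch = atomic_load(&fake_vdso_suspend_epoch);
    rng_reseed(s);
}

/* Call before handing out random bytes.
 * Returns 1 if the epoch moved and the pool was reseeded, 0 otherwise. */
int rng_check_epoch(struct rng_state *s)
{
    assert(s != NULL);
    uint64_t now = atomic_load(&fake_vdso_suspend_epoch);
    if (now == s->seen_epoch)
        return 0;
    s->seen_epoch = now;
    rng_reseed(s);
    return 1;
}

/* Simulates what the kernel would do across a suspend/resume cycle. */
void fake_kernel_suspend_resume(void)
{
    atomic_fetch_add(&fake_vdso_suspend_epoch, 1);
}
```

A caller would run `rng_check_epoch()` before every use of the pool. Note that a check-then-use sequence like this can still race with a suspend that lands between the check and the use, so it is only a starting point.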
Hi!

> > > Cryptographic libraries carry pseudo random number generators to
> > > quickly provide randomness when needed. If such a random pool gets
> > > cloned, secrets may get revealed, as the same random number may get
> > > used multiple times. For fork, this was fixed using the WIPEONFORK
> > > madvise flag [1].
> > >
> > > Unfortunately, the same problem surfaces when a virtual machine gets
> > > cloned. The existing flag does not help there. This patch introduces a
> > > new flag to automatically clear memory contents on VM suspend/resume,
> > > which will allow random number generators to reseed when virtual
> > > machines get cloned.
> >
> > Umm. If this is a real problem, should the kernel provide such an RNG in
> > the vDSO page using vsyscalls? The kernel can have a special interface to
> > its vsyscalls, but we may not want to offer this functionality to the
> > rest of userland...
>
> And then the kernel would just need to maintain a sequence
> number in the vDSO data page that gets bumped on suspend, right?

Yes, something like that would work. Plus, we'd be free to change the
mechanism in the future.

Best regards,
Pavel
-- 
(english) livejournal.com/~pavelmachek
(cesky, pictures) atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
On Mon, Jul 6, 2020 at 2:27 PM Alexander Graf <graf at amazon.com> wrote:
> Unless we create a vsyscall that returns both the PID as well as the
> epoch and thus handles fork *and* suspend. I need to think about this a
> bit more :).

You can't reliably detect forking by checking the PID if it is
possible for multiple forks to be chained before the reuse check runs:

 - pid 1000 remembers its PID
 - pid 1000 forks, creating child pid 1001
 - pid 1000 exits and is waited on by init
 - the pid allocator wraps around
 - pid 1001 forks, creating child pid 1000
 - child with pid 1000 tries to check for forking, determines that its
   PID is 1000, and concludes that it is still the original process
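
The sequence above can be replayed with a toy simulation in C. This is only a sketch under stated assumptions: the PID space is shrunk to 1000..1002 so the allocator wraps quickly, and every name here (`toy_alloc_pid`, `pid_check_says_no_fork`, and so on) is hypothetical. No real processes are created.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PID_MIN 1000
#define PID_MAX 1002   /* tiny hypothetical PID space so wraparound is quick */

static bool pid_in_use[PID_MAX + 1];
static int last_pid;

void toy_reset(void)
{
    memset(pid_in_use, 0, sizeof(pid_in_use));
    last_pid = PID_MIN - 1;
}

/* Toy allocator: hand out the next free PID, wrapping at PID_MAX.
 * Returns -1 if every PID is taken. */
int toy_alloc_pid(void)
{
    for (int tries = 0; tries <= PID_MAX - PID_MIN; tries++) {
        last_pid = (last_pid >= PID_MAX) ? PID_MIN : last_pid + 1;
        if (!pid_in_use[last_pid]) {
            pid_in_use[last_pid] = true;
            return last_pid;
        }
    }
    return -1;
}

void toy_exit_and_reap(int pid)
{
    assert(pid >= PID_MIN && pid <= PID_MAX);
    pid_in_use[pid] = false;
}

/* The flawed check: "same PID as remembered" is taken to mean "not forked". */
bool pid_check_says_no_fork(int remembered_pid, int current_pid)
{
    return remembered_pid == current_pid;
}

/* Replays the scenario; returns true if the flawed check is fooled. */
bool replay_pid_reuse_scenario(void)
{
    toy_reset();
    int original = toy_alloc_pid();    /* pid 1000 */
    int remembered = original;         /* state copied into every fork */
    int child = toy_alloc_pid();       /* 1000 forks, child is 1001 */
    toy_exit_and_reap(original);       /* 1000 exits and is reaped */
    int other = toy_alloc_pid();       /* unrelated process takes 1002 */
    int grandchild = toy_alloc_pid();  /* allocator wraps: 1001 forks, child is 1000 */
    (void)child;
    (void)other;
    return pid_check_says_no_fork(remembered, grandchild);
}
```

In this simulation the grandchild inherits the remembered value 1000 and is itself assigned PID 1000, so the comparison succeeds even though two forks have happened.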
On Mon 06-07-20 14:52:07, Jann Horn wrote:
> On Mon, Jul 6, 2020 at 2:27 PM Alexander Graf <graf at amazon.com> wrote:
> > Unless we create a vsyscall that returns both the PID as well as the
> > epoch and thus handles fork *and* suspend. I need to think about this a
> > bit more :).
>
> You can't reliably detect forking by checking the PID if it is
> possible for multiple forks to be chained before the reuse check runs:
>
>  - pid 1000 remembers its PID
>  - pid 1000 forks, creating child pid 1001
>  - pid 1000 exits and is waited on by init
>  - the pid allocator wraps around
>  - pid 1001 forks, creating child pid 1000
>  - child with pid 1000 tries to check for forking, determines that its
>    PID is 1000, and concludes that it is still the original process

I must be really missing something here because I really fail to see why
something new has to be invented at all. Sure, checking for pid is
certainly a suboptimal solution because pids are terrible tokens to work
with. We do have a concept of file descriptors, which are much better and
support signaling. There is a clear source of the signal IIUC
(migration) and there are consumers to act upon that (e.g. crypto
backends). So what really prevents using standard signal delivery
over an fd for this use case?
-- 
Michal Hocko
SUSE Labs
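
As a rough illustration of fd-based delivery, the C sketch below uses a plain Linux `eventfd(2)` as a stand-in for whatever notification fd the kernel or a system service might hand out. That notification source is an assumption; no such interface is defined in this thread. The function names are hypothetical. The producer posts events and the consumer polls without blocking.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical: stands in for an fd that would signal "this VM was
 * cloned or resumed". Here it is just an ordinary eventfd. */
int open_fake_resume_notifier(void)
{
    return eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
}

/* Producer side of the simulation: post one "resume" event.
 * Returns 0 on success, -1 on error. */
int fake_post_resume_event(int fd)
{
    uint64_t one = 1;
    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Consumer side: returns the number of events posted since the last
 * call (0 if none), or -1 on error. Never blocks. */
int64_t consume_resume_events(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready = poll(&pfd, 1, 0);
    if (ready < 0)
        return -1;
    if (ready == 0 || !(pfd.revents & POLLIN))
        return 0;

    uint64_t count = 0;
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

Someone still has to call `consume_resume_events()` at the right time, which means the consumer needs a place in its control flow to poll the fd.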
On Tue 07-07-20 10:01:23, Alexander Graf wrote:
> On 07.07.20 09:44, Michal Hocko wrote:
> > On Mon 06-07-20 14:52:07, Jann Horn wrote:
> > > On Mon, Jul 6, 2020 at 2:27 PM Alexander Graf <graf at amazon.com> wrote:
> > > > Unless we create a vsyscall that returns both the PID as well as the
> > > > epoch and thus handles fork *and* suspend. I need to think about this a
> > > > bit more :).
> > >
> > > You can't reliably detect forking by checking the PID if it is
> > > possible for multiple forks to be chained before the reuse check runs:
> > >
> > >  - pid 1000 remembers its PID
> > >  - pid 1000 forks, creating child pid 1001
> > >  - pid 1000 exits and is waited on by init
> > >  - the pid allocator wraps around
> > >  - pid 1001 forks, creating child pid 1000
> > >  - child with pid 1000 tries to check for forking, determines that its
> > >    PID is 1000, and concludes that it is still the original process
> >
> > I must be really missing something here because I really fail to see why
> > something new has to be invented at all. Sure, checking for pid is
> > certainly a suboptimal solution because pids are terrible tokens to work
> > with. We do have a concept of file descriptors, which are much better and
> > support signaling. There is a clear source of the signal IIUC
> > (migration) and there are consumers to act upon that (e.g. crypto
> > backends). So what really prevents using standard signal delivery
> > over an fd for this use case?
>
> I wasn't part of the discussions on why things like WIPEONFORK were invented
> instead of just using signalling mechanisms, but the main reason I can think
> of are libraries.

Well, I would argue that WIPEONFORK is conceptually different. It is a
one-time initialization mechanism with very clear lifetime semantics.
So the programming model is really as easy as: the initial state is
always 0 for a new task, without any surprises later on, because you own
the memory (essentially an extension of the initialized .data section on
exec to any new task).

Compare that to the completely async nature of this interface. Any read
would essentially have to be properly synchronized with the external
event, otherwise the state could have been corrupted. Such a consistency
model is really cumbersome to work with.

> As a library, you are under no control of the main loop usually, which means
> you just don't have a way to poll for an fd. As a library author, I would
> usually try very hard to avoid creating such a dependency, because it makes
> it really hard to glue pieces together.
>
> The same applies to signals btw, which would also be a possible way to
> propagate such events.

Just to clarify, I didn't really mean POSIX signals here. Those would be
quite clumsy indeed. But I can imagine that a library registers with a
system-wide mechanism to get a notification. There are many examples of
that, including a lot of usage inside libraries: all the different *bus
interfaces.
-- 
Michal Hocko
SUSE Labs
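
The WIPEONFORK programming model described above, where a new task always starts from zeroed memory, can be sketched in C with `madvise(2)` and `MADV_WIPEONFORK` (Linux 4.14 and later). The helper name `child_sees_wiped_buffer` and the 0xAA fill pattern are illustrative assumptions. The fallback `#define` uses the value from the kernel uapi headers for toolchains whose libc headers predate the flag.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_WIPEONFORK
#define MADV_WIPEONFORK 18   /* value from the Linux uapi headers */
#endif

/* Returns 1 if the forked child saw a zeroed buffer, 0 if it saw the
 * parent's data, and -1 if setup failed (for example, on a kernel
 * without MADV_WIPEONFORK). */
int child_sees_wiped_buffer(void)
{
    const size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    if (madvise(buf, len, MADV_WIPEONFORK) != 0) {
        munmap(buf, len);
        return -1;
    }
    memset(buf, 0xAA, len);   /* pretend this is RNG state */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: with WIPEONFORK the range should read back as zeroes. */
        int wiped = 1;
        for (size_t i = 0; i < len; i++) {
            if (buf[i] != 0) {
                wiped = 0;
                break;
            }
        }
        _exit(wiped ? 0 : 1);
    }

    int status = 0;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        result = (WEXITSTATUS(status) == 0) ? 1 : 0;
    if (buf[0] != 0xAA)       /* the parent's copy must be untouched */
        result = -1;
    munmap(buf, len);
    return result;
}
```

A library using this pattern only needs to treat an all-zero state as "not yet seeded"; it never has to synchronize with an external event.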