Eric W. Biederman
2020-Apr-30 15:38 UTC
[PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
David Hildenbrand <david at redhat.com> writes:> Some devices/drivers that add memory via add_memory() and friends (e.g., > dax/kmem, but also virtio-mem in the future) don't want to create entries > in /sys/firmware/memmap/ - primarily to hinder kexec from adding this > memory to the boot memmap of the kexec kernel. > > In fact, such memory is never exposed via the firmware memmap as System > RAM (e.g., e820), so exposing this memory via /sys/firmware/memmap/ is > wrong: > "kexec needs the raw firmware-provided memory map to setup the > parameter segment of the kernel that should be booted with > kexec. Also, the raw memory map is useful for debugging. For > that reason, /sys/firmware/memmap is an interface that provides > the raw memory map to userspace." [1] > > We don't have to worry about firmware_map_remove() on the removal path. > If there is no entry, it will simply return with -EINVAL. > > [1] > https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmapYou know what this justification is rubbish, and I have previously explained why it is rubbish. Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com> This needs to be based on weather the added memory is ultimately normal ram or is something special. At least when we are talking memory resources. Keeping it out of the firmware map that is fine. If the hotplugged memory is the result of plugging a stick of ram into the kernel and can and should used be like any other memory it should be treated like any normal memory. If the hotplugged memory is something special it should be treated as something special. Justifying behavior by documentation that does not consider memory hotplug is bad thinking.> Cc: Andrew Morton <akpm at linux-foundation.org> > Cc: Michal Hocko <mhocko at suse.com> > Cc: Pankaj Gupta <pankaj.gupta.linux at gmail.com> > Cc: Wei Yang <richard.weiyang at gmail.com> > Cc: Baoquan He <bhe at redhat.com> > Cc: Eric Biederman <ebiederm at xmission.com> > Signed-off-by: David Hildenbrand <david at redhat.com> > --- > include/linux/memory_hotplug.h | 8 ++++++++ > mm/memory_hotplug.c | 3 ++- > 2 files changed, 10 insertions(+), 1 deletion(-) > > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h > index 0151fb935c09..4ca418a731eb 100644 > --- a/include/linux/memory_hotplug.h > +++ b/include/linux/memory_hotplug.h > @@ -68,6 +68,14 @@ struct mhp_params { > pgprot_t pgprot; > }; > > +/* Flags used for add_memory() and friends. */ > + > +/* > + * Don't create entries in /sys/firmware/memmap/. The memory is detected and > + * added via a device driver, not via the initial (firmware) memmap. > + */ > +#define MHP_NO_FIRMWARE_MEMMAP 1 > + > /* > * Zone resizing functions > * > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index c01be92693e3..e94ede9cad00 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -1062,7 +1062,8 @@ int __ref add_memory_resource(int nid, struct resource *res, > BUG_ON(ret); > > /* create new memmap entry */ > - firmware_map_add_hotplug(start, start + size, "System RAM"); > + if (!(flags & MHP_NO_FIRMWARE_MEMMAP)) > + firmware_map_add_hotplug(start, start + size, "System RAM"); > > /* device_online() will take the lock when calling online_pages() */ > mem_hotplug_done();
David Hildenbrand
2020-Apr-30 15:52 UTC
[PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
On 30.04.20 17:38, Eric W. Biederman wrote:> David Hildenbrand <david at redhat.com> writes: > >> Some devices/drivers that add memory via add_memory() and friends (e.g., >> dax/kmem, but also virtio-mem in the future) don't want to create entries >> in /sys/firmware/memmap/ - primarily to hinder kexec from adding this >> memory to the boot memmap of the kexec kernel. >> >> In fact, such memory is never exposed via the firmware memmap as System >> RAM (e.g., e820), so exposing this memory via /sys/firmware/memmap/ is >> wrong: >> "kexec needs the raw firmware-provided memory map to setup the >> parameter segment of the kernel that should be booted with >> kexec. Also, the raw memory map is useful for debugging. For >> that reason, /sys/firmware/memmap is an interface that provides >> the raw memory map to userspace." [1] >> >> We don't have to worry about firmware_map_remove() on the removal path. >> If there is no entry, it will simply return with -EINVAL. >> >> [1] >> https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap > > > You know what this justification is rubbish, and I have previously > explained why it is rubbish.Actually, no, I don't think it is rubbish. See patch #3 and the cover letter why this is the right thing to do *for special memory*, *not ordinary DIMMs*. And to be quite honest, I think your response is a little harsh. I don't recall you replying to my virtio-mem-related comments.> > Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com> > > This needs to be based on weather the added memory is ultimately normal > ram or is something special.Yes, that's what the caller are expected to decide, see patch #3. kexec should try to be as closely as possible to a real reboot - IMHO.> > At least when we are talking memory resources. Keeping it out of the > firmware map that is fine. > > If the hotplugged memory is the result of plugging a stick of ram > into the kernel and can and should used be like any other memory > it should be treated like any normal memory. > > If the hotplugged memory is something special it should be treated as > something special.I am really sorry, I can't make sense of what you are trying to say here.> > Justifying behavior by documentation that does not consider memory > hotplug is bad thinking.Are you maybe confusing this patch series with the arm64 approach? This is not about ordinary hotplugged DIMMs. I'd love to get Dan's, Dave's and Michal's opinion. -- Thanks, David / dhildenb
Dave Hansen
2020-Apr-30 16:04 UTC
[PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
On 4/30/20 8:52 AM, David Hildenbrand wrote:>> Justifying behavior by documentation that does not consider memory >> hotplug is bad thinking. > Are you maybe confusing this patch series with the arm64 approach? This > is not about ordinary hotplugged DIMMs. > > I'd love to get Dan's, Dave's and Michal's opinion.The impact on kexec from the DAX "kmem" driver's use of hotplug was inadvertent and unfortunate. The problem statement and solution seem pretty sane to me.
Eric W. Biederman
2020-Apr-30 16:33 UTC
[PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
David Hildenbrand <david at redhat.com> writes:> On 30.04.20 17:38, Eric W. Biederman wrote: >> David Hildenbrand <david at redhat.com> writes: >> >>> Some devices/drivers that add memory via add_memory() and friends (e.g., >>> dax/kmem, but also virtio-mem in the future) don't want to create entries >>> in /sys/firmware/memmap/ - primarily to hinder kexec from adding this >>> memory to the boot memmap of the kexec kernel. >>> >>> In fact, such memory is never exposed via the firmware memmap as System >>> RAM (e.g., e820), so exposing this memory via /sys/firmware/memmap/ is >>> wrong: >>> "kexec needs the raw firmware-provided memory map to setup the >>> parameter segment of the kernel that should be booted with >>> kexec. Also, the raw memory map is useful for debugging. For >>> that reason, /sys/firmware/memmap is an interface that provides >>> the raw memory map to userspace." [1] >>> >>> We don't have to worry about firmware_map_remove() on the removal path. >>> If there is no entry, it will simply return with -EINVAL. >>> >>> [1] >>> https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-firmware-memmap >> >> >> You know what this justification is rubbish, and I have previously >> explained why it is rubbish. > > Actually, no, I don't think it is rubbish. See patch #3 and the cover > letter why this is the right thing to do *for special memory*, *not > ordinary DIMMs*. > > And to be quite honest, I think your response is a little harsh. I don't > recall you replying to my virtio-mem-related comments. > >> >> Nacked-by: "Eric W. Biederman" <ebiederm at xmission.com> >> >> This needs to be based on weather the added memory is ultimately normal >> ram or is something special. > > Yes, that's what the caller are expected to decide, see patch #3. > > kexec should try to be as closely as possible to a real reboot - IMHO.That is very fuzzy in terms of hotplug memory. The kexec'd kernel should see the hotplugged memory assuming it is ordinary memory. But kexec is not a reboot although it is quite similar. Kexec is swapping one running kernel and it's state for another kernel without rebooting.>> Justifying behavior by documentation that does not consider memory >> hotplug is bad thinking. > > Are you maybe confusing this patch series with the arm64 approach? This > is not about ordinary hotplugged DIMMs.I think I am. My challenge is that I don't see anything in the description that says this isn't about ordinary hotplugged DIMMs. All I saw was hotplug memory. If the class of memory is different then please by all means let's mark it differently in struct resource so everyone knows it is different. But that difference needs to be more than hotplug. That difference needs to be the hypervisor loaned us memory and might take it back at any time, or this memory is persistent and so it has these different characteristics so don't use it as ordinary ram. That information is also useful to other people looking at the system and seeing what is going on. Just please don't muddle the concepts, or assume that whatever subset of hotplug memory you are dealing with is the only subset. I didn't see that flag making the distinction about the kind of memory it is. Eric
Possibly Parallel Threads
- [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
- [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
- [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
- [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP
- [PATCH v2 2/3] mm/memory_hotplug: Introduce MHP_NO_FIRMWARE_MEMMAP