I've looked through the code for sys_mmap2 on several architectures, and it looks like some architectures play by the "shift is always 12" rule, e.g. SPARC, and some expect userspace to actually obtain the page size, e.g. PowerPC and MIPS. On some architectures, e.g. x86 and ARM, the point is moot since PAGE_SIZE is always 2^12.

a. Is this correct, or have I misunderstood the code?

b. If so, is this right, or is this a bug? Right now both klibc and uClibc consider the latter a bug.

c. Which architectures are affected which way?

	-hpa
From: "H. Peter Anvin" <hpa@zytor.com>
Date: Wed, 22 Feb 2006 13:45:46 -0800

> I've looked through the code for sys_mmap2 on several architectures, and
> it looks like some architectures play by the "shift is always 12" rule,
> e.g. SPARC, and some expect userspace to actually obtain the page
> size, e.g. PowerPC and MIPS. On some architectures, e.g. x86 and ARM,
> the point is moot since PAGE_SIZE is always 2^12.
>
> a. Is this correct, or have I misunderstood the code?
>
> b. If so, is this right, or is this a bug? Right now both klibc and
> uClibc consider the latter a bug.
>
> c. Which architectures are affected which way?

Right. On sparc32 we had the issue where we had an 8K page size platform (sun4) and the rest were using 4K page size.

I can't even think why we do that fixed shift actually. I think Jakub Jelinek thought this might be a way to make applications assuming 4K page size work on the 8K page size machines.

I'm going to say that you can feel free to fix this to use PAGE_SHIFT correctly all the time. Applications should be calling getpagesize() and not assume what that value might be.

Please double check that we report the correct page size to userspace and not a fixed 4K value :-)
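The advice above, to query the page size at run time instead of assuming 4K, could look roughly like the following in userspace. This is only a sketch: runtime_page_size() and is_page_aligned() are hypothetical helper names, not taken from any libc, and sysconf(_SC_PAGESIZE) is used here as the POSIX equivalent of getpagesize().

```c
#include <unistd.h>

/* Hypothetical helper: ask the system for the page size at run time.
 * Returns the page size in bytes, or -1 if sysconf() fails. */
static long runtime_page_size(void)
{
    return sysconf(_SC_PAGESIZE);
}

/* Hypothetical helper: nonzero if 'off' is a multiple of 'page_size'.
 * Assumes page_size is a positive power of two, as it is on Linux. */
static int is_page_aligned(unsigned long long off, long page_size)
{
    return (off & ((unsigned long long)page_size - 1)) == 0;
}
```

A caller would check a file offset with is_page_aligned(offset, runtime_page_size()) before mapping, rather than masking with a hard-coded 0xfff.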
On Wed, Feb 22, 2006 at 01:45:46PM -0800, H. Peter Anvin wrote:
> I've looked through the code for sys_mmap2 on several architectures, and
> it looks like some architectures play by the "shift is always 12" rule,
> e.g. SPARC, and some expect userspace to actually obtain the page
> size, e.g. PowerPC and MIPS. On some architectures, e.g. x86 and ARM,
> the point is moot since PAGE_SIZE is always 2^12.

The sys_mmap2() ABI is that the page shift is always fixed to whatever page size is reasonable for the architecture, typically 2^12. The syscall should never be exposed as mmap2(), only as the large file size version of mmap() (aka mmap64()). The other consideration is that it should not be implemented in 64 bit ABIs, as those machines should be using a 64 bit native mmap(). Does that clear things up a bit? Cheers,

		-ben
-- 
"Ladies and gentlemen, I'm sorry to interrupt, but the police are here
and they've asked us to stop the party."  Don't Email: <dont@kvack.org>.
H. Peter Anvin writes:

> I've looked through the code for sys_mmap2 on several architectures, and
> it looks like some architectures play by the "shift is always 12" rule,
> e.g. SPARC, and some expect userspace to actually obtain the page
> size, e.g. PowerPC and MIPS. On some architectures, e.g. x86 and ARM,
> the point is moot since PAGE_SIZE is always 2^12.
>
> a. Is this correct, or have I misunderstood the code?

PowerPC always uses 12, even if PAGE_SHIFT is 16 (i.e. for 64k pages).

> b. If so, is this right, or is this a bug? Right now both klibc and
> uClibc consider the latter a bug.

Glibc seems to expect it to always be 12, according to this excerpt from sysdeps/unix/sysv/linux/mmap64.c:

    /* This is always 12, even on architectures where PAGE_SHIFT != 12.  */
    # ifndef MMAP2_PAGE_SHIFT
    #  define MMAP2_PAGE_SHIFT 12
    # endif

I would be very reluctant to change the shift to be PAGE_SHIFT, since that would be a change in the user/kernel ABI.

Of course, userspace is still expected to make sure addresses and offsets are multiples of the page size (and thus the offset argument to mmap2 has to be a multiple of 16 if the page size is 64k).

Regards,
Paul.
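To make the last point concrete, here is a rough sketch of how a libc-style wrapper might turn a byte offset into the mmap2 offset argument under the "shift is always 12" convention. The helper name mmap2_pgoff() and its error codes are assumptions made for illustration; only MMAP2_PAGE_SHIFT comes from the glibc excerpt above.

```c
#include <errno.h>
#include <stdint.h>

/* Fixed unit shift, as in the glibc mmap64.c excerpt quoted above. */
#define MMAP2_PAGE_SHIFT 12

/*
 * Hypothetical helper: convert a byte offset into the offset argument
 * for mmap2, which counts 4096-byte units regardless of PAGE_SHIFT.
 *
 * Returns 0 and stores the result in *pgoff on success.
 * Returns -EINVAL if page_size is not a power of two or byte_off is
 * not a multiple of both the real page size and the 4096-byte unit.
 * Returns -EOVERFLOW if the result does not fit in 32 bits.
 */
static int mmap2_pgoff(uint64_t byte_off, uint64_t page_size, uint32_t *pgoff)
{
    const uint64_t unit_mask = ((uint64_t)1 << MMAP2_PAGE_SHIFT) - 1;
    uint64_t units;

    if (page_size == 0 || (page_size & (page_size - 1)) != 0)
        return -EINVAL;
    if ((byte_off & (page_size - 1)) != 0 || (byte_off & unit_mask) != 0)
        return -EINVAL;

    units = byte_off >> MMAP2_PAGE_SHIFT;
    if (units > UINT32_MAX)
        return -EOVERFLOW;

    *pgoff = (uint32_t)units;
    return 0;
}
```

With 64k pages, a byte offset of 65536 yields an mmap2 offset of 16, while a byte offset of 4096 is rejected because it is not page-aligned, which matches the "multiple of 16" rule described above.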