On Sunday, December 14, 2014 10:53:14 AM Alfred Perlstein wrote:
> On Dec 14, 2014, at 10:12 AM, Ian Lepore wrote:
>> On Sun, 2014-12-14 at 10:09 -0800, Alfred Perlstein wrote:
>>> On Dec 14, 2014, at 9:47 AM, Ian Lepore wrote:
>>>> This is an out-of-the-blue FYI post to let people know that, despite
>>>> all the misinformation you'll run across if you search for information
>>>> on FreeBSD PAE support, it (still) works just fine.  I've been using it
>>>> (for reasons related to our build system and products at $work) since
>>>> 2006, and I can say unequivocally that it works fine on 6.x, 8.x, and
>>>> now 10.x (and presumably on the odd-numbered releases too, but I've
>>>> never tried those).
>>>>
>>>> In my most recent testing with 10-stable, I found it was compatible
>>>> with the drm2 and radeonkms drivers, and I was able to run Xorg and
>>>> GNOME just fine.  All my devices and apps, and even the linuxulator,
>>>> worked just fine.
>>>>
>>>> One thing that changed somewhere between 8.4 and 10.1 is that I had to
>>>> add a kernel tuning option to my kernel config:
>>>>
>>>> options KVA_PAGES=768  # Default is 512
>>>>
>>>> I suspect that the most frequent use of PAE is on laptops that have
>>>> 4GB, and the default tuning is adequate for that.  My desktop machine
>>>> has 12GB, and I needed to bump up that value to avoid errors related
>>>> to being unable to create new kernel stacks.
>>>
>>> There is already a #define in pmap.h that is bifurcated based on PAE:
>>>
>>> #ifndef KVA_PAGES
>>> #ifdef PAE
>>> #define KVA_PAGES 512
>>> #else
>>> #define KVA_PAGES 256
>>> #endif
>>> #endif
>>>
>>> Do you think it will harm things to apply your suggested default to
>>> this file?
>>
>> I would have to defer to someone who actually understands just what that
>> parm is tuning.
>> It was purely speculation on my part that the current default is
>> adequate for less memory than I have, and I don't know what the
>> downside might be of setting it too high.
>
> KVA_PAGES is the number of page-table pages reserved for kernel address
> space.  From the comment in pmap.h:
>
>  * Size of Kernel address space.  This is the number of page table pages
>  * (4MB each) to use for the kernel.  256 pages == 1 Gigabyte.
>  * This **MUST** be a multiple of 4 (eg: 252, 256, 260, etc).
>  * For PAE, the page table page unit size is 2MB.  This means that 512
>  * pages is 1 Gigabyte.  Double everything.  It must be a multiple of 8
>  * for PAE.
>
> It appears that our default for PAE leaves 1GB of kernel address space to
> play with?  That's an interesting default.  I wonder if it really makes
> sense for PAE: since the assumption is that you'll have >4GB of RAM in
> the box, wiring down 1.5GB for the kernel would seem to make sense.  It
> probably makes sense to ask Peter or Alan about this.

It's always been a 1GB/3GB split.  It was never a problem until certain
scaling defaults were changed to scale solely based on physical RAM
without regard for kva limits.

With the current settings and the layout of the userland address space
(the zero-memory hole, the reservation for maxdsiz, then the ld-elf.so.1
space and shared libraries), there's just enough room to mmap a 2GB file
with a tiny bit of wiggle room left.

If you change the kernel/user split to 1.5/2.5, userland is more
restricted, typically to around the 1.8-1.9GB range.

You can get a large-memory PAE system to boot with default settings by
seriously scaling things down: kern.maxusers, mbuf limits, and so on.

However, we have run ref11-i386 and ref10-i386 in the cluster for 18+
months with a 1.5/2.5 split, and even then we've run out of kva and hit a
few pmap panics and things that appear to be fallout of bounce-buffer
problems.

While yes, you can make it work, I am personally not convinced that it is
reliable.
My last i386 PAE machine died earlier this year with a busted scsi
backplane for the drives.  It went to the great server crusher.

> Also wondering how bad it would be to make these tunables; I see they
> trickle down quite a bit into the system, hopefully not defining some
> static arrays, but I haven't dug down that far.

They cause extensive compile-time macro expansion variations that are
exported to assembler code via genassym.  KVA_PAGES is not a good
candidate for a runtime tunable unless you like the pain of
i386/locore.s and friends.

-- 
Peter Wemm - peter at wemm.org; peter at FreeBSD.org; peter at yahoo-inc.com; KI6FJV
UTF-8: for when a ' or ... just won’t do…
> On Dec 15, 2014, at 3:42 PM, Peter Wemm <peter at wemm.org> wrote:
> [deep quoting of Ian Lepore's original PAE post and the KVA_PAGES
> discussion trimmed; see the previous message]
>>> I would have to defer to someone who actually understands just what
>>> that parm is tuning.
>>> It was purely speculation on my part that the current default is
>>> adequate for less memory than I have, and I don't know what the
>>> downside might be of setting it too high.
>>
>> [pmap.h KVA_PAGES comment trimmed]
>>
>> It appears that our default for PAE leaves 1GB of kernel address space
>> to play with?  That's an interesting default.
>
> It's always been a 1GB/3GB split.  It was never a problem until certain
> scaling defaults were changed to scale solely based on physical RAM
> without regard for kva limits.

Hmm, the original patch I gave for that only changed scaling for machines
with 64-bit pointers.  Why was the 32-bit behavior changed as well?

> With the current settings and the layout of the userland address space
> (the zero-memory hole, the reservation for maxdsiz, then the ld-elf.so.1
> space and shared libraries), there's just enough room to mmap a 2GB file
> with a tiny bit of wiggle room left.
>
> If you change the kernel/user split to 1.5/2.5, userland is more
> restricted, typically to around the 1.8-1.9GB range.
>
> You can get a large-memory PAE system to boot with default settings by
> seriously scaling things down: kern.maxusers, mbuf limits, and so on.
> However, we have run ref11-i386 and ref10-i386 in the cluster for 18+
> months with a 1.5/2.5 split, and even then we've run out of kva and hit
> a few pmap panics and things that appear to be fallout of bounce-buffer
> problems.
>
> While yes, you can make it work, I am personally not convinced that it
> is reliable.
>
> My last i386 PAE machine died earlier this year with a busted scsi
> backplane for the drives.  It went to the great server crusher.

Oh, I made the dumb assumption that PAE was basically an unsplit 4/4
layout.  Ok, thanks.

>> Also wondering how bad it would be to make these tunables; I see they
>> trickle down quite a bit into the system, hopefully not defining some
>> static arrays, but I haven't dug down that far.
>
> They cause extensive compile-time macro expansion variations that are
> exported to assembler code via genassym.  KVA_PAGES is not a good
> candidate for a runtime tunable unless you like the pain of
> i386/locore.s and friends.

Ouch.  Ok.

-Alfred.