Pekka Paalanen
2008-Apr-16 17:59 UTC
[Nouveau] [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs
On Wed, 16 Apr 2008 13:46:09 +0200 Ingo Molnar <mingo at elte.hu> wrote:> > * Pekka Paalanen <pq at iki.fi> wrote: > > > > we should fix this restriction ASAP. Forcibly dropping to UP will > > > cause mmiotrace to be much less useful for diagnostic purposes of > > > Linux > > > > Ok, how do you propose we solve this? > > > > I have asked the question before, and then I had two ideas. Well, the > > first one was actually your idea (so I hear) to solve the same problem for > > kmemcheck. > > - per-cpu page tables > > - instead of single-stepping, emulate the faulting instruction and never > > disarm pages during tracing. (Use and modify code from KVM.) > > > > I don't believe either of these is easy or fast to implement. Given > > some months, I might be able to achieve emulation. Page tables are > > still magic to me. > > yeah - it looks complex. Not a showstopper for now :-) > > but given that Xorg is usually just a single task, do we _really_ need > this?We're not tracing Xorg at all. Mmiotrace still cannot catch accesses originating in user space. It is tracing MMIO accesses from within the kernel, and this means that IRQ services and device syscalls may be accessing the hardware at the same time. Vblank interrupts happen quite often, some GPU commands are actually emulated in kernel via interrupts and whatnot. The nvidia proprietary kernel blob is many times bigger than my bzImage! (A simple X startup and quit creates in the order of 1-2 million MMIO events.) As do we really need this, I think it might save a lot of head scratching when someone is reverse engineering a feature and gets every time a different trace due to some events being missed. But this is theory. So far everyone has been tracing with UP, and this has not been a problem. I have no idea if it would make a real difference. [Recap for nouveau@ list: mmiotrace has a race on SMP, where during instruction single stepping other CPUs can run freely on the page which the faulted instruction accessed. This causes some of the simultaneous accesses to the same page of the same iomem-mapping to be missed.] It does sound very rare. Nouveau people, what do you think, can this be a problem?> > > i suspect the bug is that you bring the CPU down from an atomic > > > (spinlocked or irq disabled) context. > > > > Hmm, it should not be... I have to double-check, but all the other > > code, too, from where enter_uniprocessor() is called, may sleep. The > > first thing the caller does is to acquire a mutex, which I assume > > would complain loudly if spinlocked or irq-disabled. > > > > Ingo, thank you for fixing this patch, though I'd like to suggest to > > leave it out for now, since there clearly are worse problems with it > > than without it. And if we can solve the SMP issue, this is not > > needed. For the time being we can just instruct users to disable all > > but one CPU when try want to trace. > > i think we still need to make this as 'transparent' to users as > possible. Disabling CPUs can be tedious.Compared to the out-of-tree mmiotrace, the in-kernel version is already a lot easier to use. Instructing people to drop to UP before tracing is simple compared to what it was.> i'm leaving out this patch from the series for now.Thanks. -- Pekka Paalanen http://www.iki.fi/pq/
Ingo Molnar
2008-Apr-16 18:32 UTC
[Nouveau] [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs
* Pekka Paalanen <pq at iki.fi> wrote:> > yeah - it looks complex. Not a showstopper for now :-) > > > > but given that Xorg is usually just a single task, do we _really_ > > need this? > > We're not tracing Xorg at all. Mmiotrace still cannot catch accesses > originating in user space. It is tracing MMIO accesses from within the > kernel, and this means that IRQ services and device syscalls may be > accessing the hardware at the same time. Vblank interrupts happen > quite often, some GPU commands are actually emulated in kernel via > interrupts and whatnot. [...]ok, understood - i forgot about IRQ generated GPU accesses. In fact UP probably generates a more readable trace because DRM accesses from one CPU are not mixed up with IRQ completion from another CPU. So i think we need to fix your automatic-cpudown/cpuup patch. I tried that and it worked very intuitively and the cpus were disabled/enabled without any trouble - with ftrace based mmiotrace we now basically have something that most distros could enable by default without thinking twice about it. But if it means an UP kernel has to be used then it will be turned off immediately and the barrier to users will be huge again. I really envision mmiotrace to be usable by default on _any_ generic distro, without rebooting and without any hassle on the user's part. the automatic drop-to-single-CPU-when-tracing solution from you is OK - it will also test our CPU hotplug primitives some more ;-) And it's not like users expect a mmiotraced X session to be particularly fast, right? so lets fix those preemptability bugs. They show that the cpu-up/cpu-down ops are called from atomic context - it should normally be straightforward to sort out - there's no particular reason why the ->open()/->close() methods of an ftrace plugin should run in atomic context. Steve, any ideas where the atomicity might come from? Ingo
Stephane Marchesin
2008-Jul-24 15:34 UTC
[Nouveau] [BUG/PATCH] x86 mmiotrace: dynamically disable non-boot CPUs
On Wed, Apr 16, 2008 at 6:59 PM, Pekka Paalanen <pq at iki.fi> wrote:> On Wed, 16 Apr 2008 13:46:09 +0200 > Ingo Molnar <mingo at elte.hu> wrote: > >> >> * Pekka Paalanen <pq at iki.fi> wrote: >> >> > > we should fix this restriction ASAP. Forcibly dropping to UP will >> > > cause mmiotrace to be much less useful for diagnostic purposes of >> > > Linux >> > >> > Ok, how do you propose we solve this? >> > >> > I have asked the question before, and then I had two ideas. Well, the >> > first one was actually your idea (so I hear) to solve the same problem for >> > kmemcheck. >> > - per-cpu page tables >> > - instead of single-stepping, emulate the faulting instruction and never >> > disarm pages during tracing. (Use and modify code from KVM.) >> > >> > I don't believe either of these is easy or fast to implement. Given >> > some months, I might be able to achieve emulation. Page tables are >> > still magic to me. >> >> yeah - it looks complex. Not a showstopper for now :-) >> >> but given that Xorg is usually just a single task, do we _really_ need >> this? > > We're not tracing Xorg at all. Mmiotrace still cannot catch accesses > originating in user space. It is tracing MMIO accesses from within > the kernel, and this means that IRQ services and device syscalls > may be accessing the hardware at the same time. Vblank interrupts > happen quite often, some GPU commands are actually emulated in > kernel via interrupts and whatnot. The nvidia proprietary kernel blob > is many times bigger than my bzImage! > > (A simple X startup and quit creates in the order of 1-2 million > MMIO events.) > > As do we really need this, I think it might save a lot of head > scratching when someone is reverse engineering a feature and gets > every time a different trace due to some events being missed. > But this is theory. So far everyone has been tracing with UP, > and this has not been a problem. I have no idea if it would make > a real difference. > > [Recap for nouveau@ list: > mmiotrace has a race on SMP, where during instruction single stepping > other CPUs can run freely on the page which the faulted instruction > accessed. This causes some of the simultaneous accesses to the same > page of the same iomem-mapping to be missed.] > > It does sound very rare. Nouveau people, what do you think, can this > be a problem? >In the nvidia case, I don't think this would happen. The register ranges for different purposes are set apart by more than 1 page usually, so the risk of accessing a page that's been faulted in is probably extremely low. Not to mention that the design of the binary module doesn't use threads currently (only tasklets for interrupt handlers, this might be the corner case but again the interrupt handler doesn't touch the same reg families). Stephane
Reasonably Related Threads
- MmioTrace: Using the Instruction Decoder, etc.
- [PATCH] kmmio/mmiotrace: fix double free of kmmio_fault_pages
- [PATCHv2] kmmio/mmiotrace: fix double free of kmmio_fault_pages
- MmioTrace: Using the Instruction Decoder, etc.
- MmioTrace: Using the Instruction Decoder, etc.