Christian König
2021-Feb-10 10:34 UTC
[Nouveau] drm/nouveau: 5.11 cycle regression bisected to 461619f5c324 "drm/nouveau: switch to new allocator"
Hi Mike,

do you have more information than just system stuck in a loop?

What seems to happen here is that your system is low on resources and we just try to free up pages. Previously the allocation would just have failed with an out of memory condition.

Regards,
Christian.

On 10.02.21 at 11:13, Mike Galbraith wrote:
> Greetings,
>
> The symptom is tasks stuck waiting for lord knows what by calling
> sched_yield() in a loop (less than wonderful, sched_yield() sucks).
> After boot to KDE login, I immediately see tracker-extract chewing cpu
> in aforementioned loop. Firing up evolution and poking 'new' to
> compose, WebKitWebProcess joins in the yield loop fun.
>
> Hand rolled reverts of 256dd44b "drm/ttm: nuke old page allocator" and
> the fingered commit cures the problem for me at 207665fd in the bisect
> log below, and at master and tip HEAD.
>
> There's a "things that make ya go hmm" aspect to this thing though. If
> you look at the bisect log below, the starting "bad" is 207665fd. That
> commit DOES NOT exhibit the yield loop symptom immediately out of the
> box, but DOES after applying the much needed fix...
>
> 660a59953f4f "drm/nouveau: fix multihop when move doesn't work"
>
> ...to prevent an earlier regression from quickly appearing, one which
> Dave will likely recall having fixed. Relevant? No idea, but seems
> worth mentioning.
>
> Box: aging generic i4790 box with its equally aged Nvidia GTX 980.
>
>
> 461619f5c3242aaee9ec3f0b7072719bd86ea207 is the first bad commit
> commit 461619f5c3242aaee9ec3f0b7072719bd86ea207
> Author: Christian König <christian.koenig at amd.com>
> Date:   Sat Oct 24 13:13:25 2020 +0200
>
>     drm/nouveau: switch to new allocator
>
>     It should be able to handle all cases now.
>
>     Signed-off-by: Christian König <christian.koenig at amd.com>
>     Reviewed-by: Dave Airlie <airlied at redhat.com>
>     Reviewed-by: Madhav Chauhan <madhav.chauhan at amd.com>
>     Tested-by: Huang Rui <ray.huang at amd.com>
>     Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpatchwork.freedesktop.org%2Fpatch%2F397082%2F%3Fseries%3D83051%26rev%3D1&data=04%7C01%7Cchristian.koenig%40amd.com%7C8af0b5f635fe41d2eab508d8cdac7c5c%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637485488031207323%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=7%2Bi9YU1C7NpS%2FMC0FDrpcVtsZU6MHRUrvPPzdq6Q40A%3D&reserved=0
>
>  drivers/gpu/drm/nouveau/nouveau_bo.c  | 30 ++----------------------------
>  drivers/gpu/drm/nouveau/nouveau_drv.h |  1 -
>  2 files changed, 2 insertions(+), 29 deletions(-)
>
> git bisect start
> # good: [2c85ebc57b3e1817b6ce1a6b703928e113a90442] Linux 5.10
> git bisect good 3f995f8e0b540342612d3f6b1fc299f5bf486987
> # bad: [207665fd37561f97591e74d0ee80f24bdf06b789] Merge tag 'exynos-drm-next-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
> git bisect bad 207665fd37561f97591e74d0ee80f24bdf06b789
> # good: [f8394f232b1eab649ce2df5c5f15b0e528c92091] Linux 5.10-rc3
> git bisect good f8394f232b1eab649ce2df5c5f15b0e528c92091
> # good: [b3bf99daaee96a141536ce5c60a0d6dba6ec1d23] drm/i915/display: Defer initial modeset until after GGTT is initialised
> git bisect good b3bf99daaee96a141536ce5c60a0d6dba6ec1d23
> # good: [dfbbfe3c17651fa0fcf2658fb90317df08e52bb2] drm/amd/display: Add formats for DCC with 2/3 planes.
> git bisect good dfbbfe3c17651fa0fcf2658fb90317df08e52bb2
> # bad: [112e505a76de69f8667e2fe8da38433f754364a8] Merge drm/drm-next into drm-misc-next
> git bisect bad 112e505a76de69f8667e2fe8da38433f754364a8
> # bad: [49a3f51dfeeecb52c5aa28c5cb9592fe5e39bf95] drm/gem: Use struct dma_buf_map in GEM vmap ops and convert GEM backends
> git bisect bad 49a3f51dfeeecb52c5aa28c5cb9592fe5e39bf95
> # bad: [d7e0798925ea9272f8c8e66ceb1f7c51823e50ab] dt-bindings: display: bridge: Intel KeemBay DSI
> git bisect bad d7e0798925ea9272f8c8e66ceb1f7c51823e50ab
> # bad: [c489573b5b6ce6442ad4658d9d5ec77839b91622] Merge drm/drm-next into drm-misc-next
> git bisect bad c489573b5b6ce6442ad4658d9d5ec77839b91622
> # bad: [8567d51555c12d169c4e0f796030051fff1c318d] drm/vmwgfx: switch to new allocator
> git bisect bad 8567d51555c12d169c4e0f796030051fff1c318d
> # good: [5144eead3f8c80ac7f913c07139442fede94003e] drm: xlnx: Use dma_request_chan for DMA channel request
> git bisect good 5144eead3f8c80ac7f913c07139442fede94003e
> # good: [e93b2da9799e5cb97760969f3e1f02a5bdac29fe] drm/amdgpu: switch to new allocator v2
> git bisect good e93b2da9799e5cb97760969f3e1f02a5bdac29fe
> # bad: [461619f5c3242aaee9ec3f0b7072719bd86ea207] drm/nouveau: switch to new allocator
> git bisect bad 461619f5c3242aaee9ec3f0b7072719bd86ea207
> # good: [0fe3cf3a53b5c1205ec7d321be1185b075dff205] drm/radeon: switch to new allocator v2
> git bisect good 0fe3cf3a53b5c1205ec7d321be1185b075dff205
> # first bad commit: [461619f5c3242aaee9ec3f0b7072719bd86ea207] drm/nouveau: switch to new allocator
>
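
For readers skimming the bisect log quoted above, here is a minimal sketch of how such a log could be summarized with a short script. It is not part of the original thread: the function name `summarize_bisect_log` and the `SAMPLE` excerpt (the last four annotation lines of the quoted log) are chosen here for illustration, and the parser only assumes the `# good:` / `# bad:` / `# first bad commit:` annotation format visible in the log.

```python
import re

# Matches annotation lines such as "# bad: [<40-hex sha>] subject".
ANNOTATION = re.compile(
    r"^#\s*(good|bad|first bad commit):\s*\[([0-9a-f]{40})\]\s*(.*)$"
)


def summarize_bisect_log(text):
    """Return (verdicts, first_bad) parsed from `git bisect log` style text."""
    verdicts = []      # (verdict, 12-char sha, subject) in log order
    first_bad = None   # (full sha, subject), or None if the log is unfinished
    for raw in text.splitlines():
        # Tolerate mail-quote prefixes ("> ") in front of each line.
        line = raw.lstrip("> ").strip()
        match = ANNOTATION.match(line)
        if not match:
            continue   # skip the "git bisect ..." command lines
        kind, sha, subject = match.groups()
        if kind == "first bad commit":
            first_bad = (sha, subject)
        else:
            verdicts.append((kind, sha[:12], subject))
    return verdicts, first_bad


# Excerpt copied from the end of the quoted log (illustration only).
SAMPLE = """\
> # good: [e93b2da9799e5cb97760969f3e1f02a5bdac29fe] drm/amdgpu: switch to new allocator v2
> # bad: [461619f5c3242aaee9ec3f0b7072719bd86ea207] drm/nouveau: switch to new allocator
> # good: [0fe3cf3a53b5c1205ec7d321be1185b075dff205] drm/radeon: switch to new allocator v2
> # first bad commit: [461619f5c3242aaee9ec3f0b7072719bd86ea207] drm/nouveau: switch to new allocator
"""

if __name__ == "__main__":
    verdicts, first_bad = summarize_bisect_log(SAMPLE)
    for kind, sha, subject in verdicts:
        print(f"{kind:4} {sha} {subject}")
    print("first bad:", first_bad)
```

The neighbouring "switch to new allocator" commits for amdgpu and radeon test good in this excerpt, which is consistent with the regression being narrowed down to the nouveau switch.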
Mike Galbraith
2021-Feb-10 10:40 UTC
[Nouveau] drm/nouveau: 5.11 cycle regression bisected to 461619f5c324 "drm/nouveau: switch to new allocator"
On Wed, 2021-02-10 at 11:34 +0100, Christian König wrote:
> Hi Mike,
>
> do you have more information than just system stuck in a loop?

No, strace shows no syscalls but sched_yield().

-Mike
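
As a rough illustration of the pattern described here (a task that makes no progress and only calls sched_yield()), the sketch below busy-waits on a flag by yielding the CPU. It is an assumption-laden toy, not what tracker-extract or WebKitWebProcess actually do: `spin_until` and `max_spins` are hypothetical names, and the timer that eventually sets the flag stands in for whatever the stuck tasks were waiting on.

```python
import os
import threading


def spin_until(flag, max_spins=10_000_000):
    """Busy-wait on `flag` by yielding the CPU; return the number of yields."""
    spins = 0
    while not flag.is_set() and spins < max_spins:
        os.sched_yield()   # gives up the CPU but never sleeps
        spins += 1
    return spins


if __name__ == "__main__":
    flag = threading.Event()
    # Release the waiter after 0.2 s; in the reported hang nothing ever does.
    threading.Timer(0.2, flag.set).start()
    print("yielded", spin_until(flag), "times before the flag was set")
```

Run under strace on Linux, the waiting thread of a loop like this should show a stream of sched_yield() calls, and it keeps a CPU busy for as long as the flag stays unset.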
Christian König
2021-Feb-10 10:42 UTC
[Nouveau] drm/nouveau: 5.11 cycle regression bisected to 461619f5c324 "drm/nouveau: switch to new allocator"
On 10.02.21 at 11:40, Mike Galbraith wrote:
> On Wed, 2021-02-10 at 11:34 +0100, Christian König wrote:
>> Hi Mike,
>>
>> do you have more information than just system stuck in a loop?
> No, strace shows no syscalls but sched_yield().

Well, you can try commenting out the call to register_shrinker() in ttm_pool.c, but apart from that I don't have many ideas.

Christian.

>
> -Mike
>
Mike Galbraith
2021-Feb-10 10:46 UTC
[Nouveau] drm/nouveau: 5.11 cycle regression bisected to 461619f5c324 "drm/nouveau: switch to new allocator"
On Wed, 2021-02-10 at 11:34 +0100, Christian König wrote:
>
> What seems to happen here is that your system is low on resources and we
> just try to free up pages.

FWIW, box has oodles of generic RAM free right after boot.

-Mike
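
One way to back up a "plenty of RAM free" observation with numbers is to read /proc/meminfo. The sketch below is only a suggestion and not something used in the thread: `parse_meminfo` is a hypothetical helper, and it assumes the usual Linux `Name:   value kB` line layout.

```python
def parse_meminfo(text):
    """Map /proc/meminfo field names to their numeric values.

    Memory fields are reported in kB; a few fields (for example the
    HugePages_* counters) are plain counts without a unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    for key in ("MemTotal", "MemFree", "MemAvailable"):
        print(f"{key}: {info.get(key, 0) / 1024:.0f} MiB")
```

MemAvailable is usually the more telling figure, since it estimates how much memory can be handed out without swapping, whereas MemFree excludes reclaimable page cache.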