Sean
2019-Jan-09 20:40 UTC
[CentOS] high kworker CPU usage in 3.10.0-957 w/ Xorg nouveau driver?
Hi all,

I have a number of GNOME/X desktop workstations with NVIDIA GeForce GT 1030 adapters, dual monitors, Core i7-3770 quad-core hyper-threaded CPUs, and 32GB of RAM. Most (I haven't checked them all yet) are exhibiting problems since upgrading from kernel 3.10.0-862.14.4.el7.x86_64 to 3.10.0-957.1.3.el7.x86_64: significant sluggishness with mouse movement and typing, as well as screen rendering problems. The users see this behavior right after logging into GNOME, without any additional applications running (Chrome, Firefox, LibreOffice, etc.).

In top I can see multiple kworker processes consuming a large amount of CPU time, and unusually high load averages: in the 5-7 range on the 5-minute average, where a normal load average for these users would be between 1 and 2.

At one point, while troubleshooting with a user, I was logged in remotely while he was working on the desktop, and it became completely unresponsive. /var/log/messages had nouveau messages like:

kernel: nouveau: evo channel stalled
kernel: nouveau 0000:01:00.0: disp: chid 1 mthd 0000 data 00000000 10003000 00000000
kernel: nouveau 0000:01:00.0: DRM: base-1: timeout
kernel: nouveau 0000:01:00.0: DRM: core notifier timeout

Those messages might be meaningless, but they are abundant in the logs.

For grins, before rebooting, I attempted to stop and start GDM. Both operations seemed successful, and I verified that all processes owned by the user were gone. I asked him to log in again, but he reported that his screens still looked like they did before I restarted GDM and that he didn't have a login screen.

Users are currently booting their systems to the 3.10.0-862 kernel, and the problem does not present itself there. I can also add that running the proprietary NVIDIA driver (from nvidia.com, not elrepo), version 410.78, on the 3.10.0-957 kernel does not produce this problem.

I config manage all these desktops with Puppet, and they were all built from the same kickstart file.
The NVIDIA driver is not purposefully managed by Puppet; I just happened to be experimenting with it on my workstation. Before I load the proprietary driver on all the problematic systems, I was hoping someone on the list might have some insight or suggestions.

Thanks!
--Sean
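
For anyone wanting to check whether another workstation shows the same symptoms, here is a minimal Python sketch that counts the nouveau messages quoted in this post. The function name, the pattern names, and the choice of which messages to count are all illustrative assumptions, not part of any existing tool; only the message text itself comes from the log excerpt in this post.

```python
import re
import sys
from collections import Counter

# Patterns derived from the log lines quoted in the post. The labels on the
# left are hypothetical names chosen for this sketch.
NOUVEAU_PATTERNS = {
    "evo_stalled": re.compile(r"nouveau: evo channel stalled"),
    "base_timeout": re.compile(r"nouveau .*DRM: base-\d+: timeout"),
    "core_notifier_timeout": re.compile(r"nouveau .*DRM: core notifier timeout"),
}


def count_nouveau_errors(lines):
    """Count how many lines match each nouveau error pattern."""
    counts = Counter()
    for line in lines:
        for name, pattern in NOUVEAU_PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Pass the log path as the first argument, e.g. /var/log/messages.
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as log:
            for name, n in sorted(count_nouveau_errors(log).items()):
                print(f"{name}: {n}")
```

Run as root against /var/log/messages on both the 3.10.0-862 and 3.10.0-957 kernels, this would give a rough comparison of how often the timeouts occur. It says nothing about whether the messages are the cause of the kworker load or just a side effect.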