Adam Nielsen
2020-Apr-14 12:18 UTC
[LightDM] lightdm doesn't work via systemd, but does via shell
Hi all,

I'm really stuck trying to troubleshoot an issue with lightdm.

Basically if I run lightdm from the shell as root (via SSH) then it launches X11 properly and presents the greeter with a login prompt, I can log in, and everything works as expected.

However if I run 'systemctl start lightdm', or reboot the machine where systemd tries to launch lightdm during startup, then X11 loads, lightdm loads, but the lightdm-gtk-greeter locks up and never draws anything on the screen. I'm just left sitting there at a black screen.

After some debugging I have confirmed that X11 is working fine despite the black screen, as I can get a terminal to appear by running this:

  XAUTHORITY=/run/lightdm/root/:0 DISPLAY=:0 urxvt

lightdm is also fine. The difference between the systemd launch and the shell launch is that only the working shell launch causes lightdm to print log messages about the greeter connecting to it. When launched via systemd, lightdm mentions switching to VT7 but then says nothing about any greeter trying to connect.

Replacing lightdm-gtk-greeter with a shell script that runs strace on the original binary reveals that when it is launched from systemd and locks up, the strace output ends here:

  connect(6, {sa_family=AF_UNIX, sun_path="/run/user/620/bus"}, 19) = 0
  sendto(6, "AUTH EXTERNAL 363230\r\n", 22, MSG_NOSIGNAL, NULL, 0) = 22
  poll([{fd=6, events=POLLIN}], 1, -1

When running via the shell, however, these lines appear instead:

  connect(7, {sa_family=AF_UNIX, sun_path="/var/run/dbus/system_bus_socket"}, 110) = 0
  sendto(7, "AUTH EXTERNAL 363230\r\n", 22, MSG_NOSIGNAL, NULL, 0) = 22
  recvfrom(7, "OK 7f94aa8da629128be07028bc5e93e"..., 4096, 0, NULL, NULL) = 37

Is someone able to advise what the lightdm GTK greeter is doing when it connects to this socket and sends AUTH EXTERNAL to it? What is it trying to do? What is the socket for?
How does it know to connect to a system DBus socket when I run it as root from the shell, but when systemd launches it, it instead looks at a bus file in the /run/user/620/ folder? This folder is owned by and corresponds to the lightdm user, yet if I try to run the lightdm process as that user it tells me it can only be started as root unless I run in test mode.

What's stranger is that I'm migrating a machine to new hardware, so this config was (and still is) working on the old hardware, but after using rsync to copy everything onto the new machine's disk, now suddenly lightdm won't work (but apparently everything else does). Everything else is the same so I can't figure out why I'm seeing different behaviour.

I'm really stuck as I'm not sure where else to go from here, so if anyone is able to give me any pointers I'd really appreciate it!

Many thanks,
Adam.
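
A note on the AUTH EXTERNAL line in the strace output, in case it helps anyone reading along: in D-Bus's EXTERNAL authentication mechanism the argument is the client's numeric user ID, written as decimal ASCII and then hex-encoded. So 363230 should decode to 620, which matches the /run/user/620/ path and the lightdm user. A minimal sketch of that decoding follows; the function name decode_external_uid is made up for illustration and is not part of any D-Bus library.

```python
def decode_external_uid(auth_line: bytes) -> int:
    """Extract the UID from a D-Bus 'AUTH EXTERNAL <hex>' line.

    Hypothetical helper for reading strace output; it only handles the
    simple form where the hex-encoded UID is given on the AUTH line.
    """
    parts = auth_line.strip().split()
    if len(parts) != 3 or parts[0] != b"AUTH" or parts[1] != b"EXTERNAL":
        raise ValueError("not an 'AUTH EXTERNAL <hex>' line")
    # The third field is the hex encoding of the UID's decimal ASCII digits.
    uid_text = bytes.fromhex(parts[2].decode("ascii")).decode("ascii")
    return int(uid_text)


if __name__ == "__main__":
    # The exact bytes seen in the sendto() call in the strace output.
    print(decode_external_uid(b"AUTH EXTERNAL 363230\r\n"))
```

In both traces the client sends the same UID; the difference is only that the socket at /run/user/620/bus never replies, so the poll() call blocks forever.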
Adam Nielsen
2020-Jul-28 10:01 UTC
[LightDM] lightdm doesn't work via systemd, but does via shell
Hi all,

Just following up on this in case anyone else encounters a similar issue in future.

> Basically if I run lightdm from the shell as root (via SSH) then it
> launches X11 properly and presents the greeter with a login prompt, I
> can log in, and everything works as expected.
>
> However if I run 'systemctl start lightdm', or reboot the machine
> where systemd tries to launch lightdm during startup, then X11 loads,
> lightdm loads, but the lightdm-gtk-greeter locks up and never draws
> anything on the screen. I'm just left sitting there at a black
> screen.

The hint here turned out to be the fact that it worked as root. This points at a permissions issue, and in my case checking things like /dev/null showed the permissions to be wrong, which caused this problem.

Ultimately it was a malfunctioning udev rule I had added (my fault) that incorrectly changed the permissions of a bunch of unrelated nodes in /dev. I hadn't rebooted the source machine since adding the rule, so the first time it had been used during the boot process was on the destination machine. I didn't realise this at the time and it confused me further, thinking the two machines should be identical!

Removing the errant rule and rebooting fixed the problem and got LightDM working perfectly again.

Cheers,
Adam.
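
For anyone wanting to run the same kind of check, here is a rough sketch of testing a device node's type and permissions. It assumes the usual Linux default of mode 0666 for /dev/null and /dev/zero; the function name and the list of nodes are illustrative only, and the nodes affected on another system may well be different ones.

```python
import os
import stat


def describe_node_problems(path, expected_mode=0o666):
    """Return a list of human-readable problems found with a device node.

    expected_mode is an assumption (0666 is the typical default for
    /dev/null and /dev/zero on Linux); adjust it for other nodes.
    """
    st = os.stat(path)
    problems = []
    if not stat.S_ISCHR(st.st_mode):
        problems.append("not a character device")
    actual_mode = stat.S_IMODE(st.st_mode)
    if actual_mode != expected_mode:
        problems.append(
            f"mode is {actual_mode:04o}, expected {expected_mode:04o}"
        )
    return problems


if __name__ == "__main__":
    # Illustrative list of nodes; extend it as needed.
    for node in ("/dev/null", "/dev/zero"):
        if not os.path.exists(node):
            print(f"{node}: missing")
            continue
        issues = describe_node_problems(node)
        print(f"{node}: {'; '.join(issues) if issues else 'looks fine'}")
```

An empty list means the node looks as expected. Anything else suggests something (a udev rule, in my case) has changed it.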