On 4/14/21 2:22 AM, Simon Matter wrote:
>>>> On 4/13/21 11:36 PM, Chris Schanzle via CentOS wrote:
>>>>> On 4/13/21 5:00 PM, Frank Cox wrote:
>>>>>> On Tue, 13 Apr 2021 22:29:26 +0200
>>>>>> Simon Matter wrote:
>>>>>>
>>>>>>> You could try running strace on the hanging process to see what it's
>>>>>>> doing.
>>>>>> [frankcox at mutt temp]$ rsync -avv ../temp/ jeff:temp
>>>>>> opening connection using: ssh jeff rsync --server -vvlogDtpre.iLsfxC . temp (7 args)
>>>>>> sending incremental file list
>>>>>> delta-transmission enabled
>>>>>> abc is uptodate
>>>>>> total: matches=0 hash_hits=0 false_alarms=0 data=0
>>>>>>
>>>>>> Leaving that sit there apparently doing nothing (but still not giving
>>>>>> me my cursor back) I switched to another terminal window and did the
>>>>>> following:
>>>>>>
>>>>>> [frankcox at mutt ~]$ ps -FA | grep rsync
>>>>>> frankcox 5400 2435 0 60586 3160 5 14:52 pts/0 00:00:00 rsync -avv ../temp/ jeff:temp
>>>>>> frankcox 5401 5400 0 67980 7440 1 14:52 pts/0 00:00:00 ssh jeff rsync --server -vvlogDtpre.iLsfxC . temp
>>>>>> frankcox 5526 5416 0 55476 1076 3 14:53 pts/1 00:00:00 grep --color=auto rsync
>>>>>>
>>>>>> [frankcox at mutt ~]$ strace -p 5401
>>>>>> strace: Process 5401 attached
>>>>>> select(11, [5 9 10], [], NULL, NULL
>>>>>>
>>>>>> Then it just sits there with no further action. I get my cursor back
>>>>>> when I hit ctrl-c.
>>>>>>
>>>>>> [frankcox at mutt ~]$ strace -p 5400
>>>>>> strace: Process 5400 attached
>>>>>> restart_syscall(<... resuming interrupted nanosleep ...>) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>> nanosleep({tv_sec=0, tv_nsec=20000000}, NULL) = 0
>>>>>> wait4(5401, 0x7ffd45105564, WNOHANG, NULL) = 0
>>>>>>
>>>>>> The wait4-etc line just keeps repeating endlessly until I hit ctrl-c.
>>>>>>
>>>>>> Unfortunately, I have no idea what any of the above actually means.
>>>>>> Does it tell us anything interesting?
>>>>> Yay! I am glad someone else on the planet is experiencing this.
>>>>> I noticed this started happening to me after updating some CentOS Linux 8
>>>>> systems today.
>>>>>
>>>>> I discovered if I set ForwardX11=no (either on ssh command line or in
>>>>> ~/.ssh/config) the hang does not happen. But why does that matter? No
>>>>> updates to openssh.
>>>>> It is not the systemd update doing something silly with session
>>>>> management. I painfully downgraded manually and rebooted to no
>>>>> effect.
>>>>> As an aside, why can't we have nice things in life like
>>>>> 'dnf downgrade systemd\*' actually work? I did the below - might be
>>>>> dumb, but it worked -- alternate suggestions to downgrade are
>>>>> appreciated - searching the list and my google-fu was off the mark today.
>>>>>   cd [path-to-repo]/centos/8/BaseOS/x86_64/os/Packages
>>>>>   dnf downgrade $(rpm -qa systemd\* | grep 239-41.el8_3.2 | sed -e 's/3\.2/3.1/' -e 's/^/.\//' -e 's/$/.rpm/')
>>>>> Chris
>>>>
>>>> [adjusted the subject, hope that is OK.]
>>>>
>>>> Found it! It's the dbus update to 1.12.8-12. Downgrade to -11
>>>> and ssh connections close normally.
>>>>
>>>> To clarify the problem, with the new dbus, simple ssh's like:
>>>>
>>>> ssh somehost uptime
>>>>
>>>> will print the uptime, but do not return to the local shell prompt until
>>>> you hit ctrl-c. It works normally if you downgrade dbus or
>>>>
>>>> ssh -o forwardx11=no somehost uptime
>>>>
>>>> I'm sure a bug report exists somewhere, but that's something to dig for or
>>>> create tomorrow.
>>>>
>>>> To downgrade, packages were scattered in different locations, so I copied
>>>> them to one directory and did
>>>>
>>>> dnf downgrade ./*
>>>>
>>>> The packages I needed to downgrade on an x86_64 system were:
>>>>
>>>> dbus-1.12.8-11.el8.x86_64.rpm
>>>> dbus-common-1.12.8-11.el8.noarch.rpm
>>>> dbus-daemon-1.12.8-11.el8.x86_64.rpm
>>>> dbus-devel-1.12.8-11.el8.x86_64.rpm
>>>> dbus-libs-1.12.8-11.el8.x86_64.rpm
>>>> dbus-tools-1.12.8-11.el8.x86_64.rpm
>>>> dbus-x11-1.12.8-11.el8.x86_64.rpm
>>> Now that's really interesting, I was wondering why I don't see this on
>>> OL8. The thing is that certain OL8 packages have an additional RPM
>>> revision added like .0.1. Just checked dbus and its changelog shows:
>>>
>>> * Tue Feb 16 2021 Kevin Lyons <kevin.x.lyons at oracle.com> -1.12.8-12.0.1
>>> - bus: raise fd limits before dropping privs [Orabug: 31175643]
>>> - fix netlink poll: error 4 (Zhenzhong Duan)
>>>
>>> So OL is definitely not 100% bug-for-bug compatible like the other clones
>>> :-)
>>>
>>> And it makes me a bit worried why O* fixed this on Feb 16 and the broken
>>> dbus packages are now (in April) installed on CentOS servers?
>> Sorry, maybe I'm wrong here and the OL8 addons are fixing other things?
>> Could someone who experiences the issue test with the OL8 dbus packages?
>>
> Could it be BZ #1940067?
>
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1940067

Bullseye, Simon! Many thanks.

A reasonable one-liner fix / workaround is below. Also works when requesting a
terminal with 'ssh -Xt'. Adds a "tty -s || return" line
in the right spot to check if a tty exists and if not, bail out w/o starting
dbus-launch. Change "-i" to "-i.bak" to make a backup.

  sed -i '/SHLVL/atty -s || return' /etc/profile.d/ssh-x-forwarding.sh
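
For anyone who would rather see what that sed expression does before touching the real file, here is a minimal sketch against a scratch file. The scratch file's contents are made up: I am only assuming the real script has a line mentioning SHLVL, and the echo line is a placeholder for whatever actually starts dbus-launch. The one-line "a" form also assumes GNU sed, which is what CentOS ships.

```shell
#!/bin/sh
# Hypothetical stand-in for /etc/profile.d/ssh-x-forwarding.sh.
# The layout below is an assumption for illustration, not the real file.
scratch=$(mktemp)
cat > "$scratch" <<'EOF'
# stand-in profile snippet (assumed layout)
[ -n "$SSH_CONNECTION" ] || return
[ "$SHLVL" = 1 ] || return
echo "placeholder: dbus-launch would be started here"
EOF

# Same expression as in the one-liner, with -i.bak so the original is kept
# next to the edited copy. GNU sed appends the text after every line that
# matches /SHLVL/.
sed -i.bak '/SHLVL/atty -s || return' "$scratch"

# Print the edited copy; the untouched original is in "$scratch.bak".
cat "$scratch"
```

If the scratch copy looks right, the same expression with -i.bak can be pointed at the real file, and "ssh somehost uptime" should then return to the prompt on its own.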