thr3ads.net - CentOS - [CentOS] CentOS 7, systemd, NetworkMangler, oh, my [Feb 2017]

m.roth at 5-cent.us

2017-Feb-13 15:35 UTC

[CentOS] CentOS 7, systemd, NetworkMangler, oh, my

My manager tells me a system in the datacenter is down. I go down there,
and plug in a monitor-on-a-stick and keyboard. It's up, but no network. I
try systemctl restart NetworkManager several times, and ip a shows *no*
change.

Finally, I do an ifdown, followed by an ifup, and everything's wonderful.

My manager thinks that the NM daemon thinks everything's fine, and
there've been no changes, so it does nothing. He suggests that it might
have to be stopped, then started, rather than restarted.

This is completely unacceptable behavior, since it leaves the system with
no network connection. Pre-systemd, as we all know, restart *RESTARTED*
the damn thing.

Is there some Magic (#insert "pixie-dust-sparkles") incantation, either
restarting NetworkManager, or using nmcli, to force it to perform the
expected actions?

Btw, if this is supposed to be part of the "hide stuff, desktop Linux
users don't need to know this stuff", this is a *much* worse result.

    mark (and yes, my manager's truly aggravated about this, also)

James Hogarth

2017-Feb-13 16:13 UTC

I'd be interested in the journal from the NetworkManager restart as
that's not the way it behaves ... it uses the netlink API to get state
and not its own internal tracker of state (ie doing an ip link set
<interface> down will be reflected in nmcli output) ... a restart of
NetworkManager should not ignore interfaces but rather bring the system
to the on-disk configured state ... and a quick check shows the unit
file doesn't redefine restart to do a reload or similar instead ...

And indeed a quick test in a VM shows nmcli device status correctly
changing between connected and unavailable when doing ip link set eth0
down/up

Do note that on an NM-based system ifup and ifdown are effectively
aliases to nmcli conn up and nmcli conn down

nmcli conn down "connection name" will make it disconnected
nmcli conn up "connection name" will bring it back to connected

there is a slightly interesting difference between using nmcli and ip
link set though ...

with ip link set <interface> down the interface is marked
administratively down (as if you've pulled the cable) but nmcli conn
down "connection name" will unconfigure the interface but leave it in
an UP state ... just without an IP address etc
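
To see which of those two states an interface is in, one option is to look at the flag list that ip prints for the link. What follows is only a minimal sketch and not part of the original message: the helper name link_admin_state is hypothetical, and it assumes the usual one-line `ip -o link show` format in which the flags sit between angle brackets.

```shell
# Classify one line of `ip -o link show <dev>` output by its flag list.
# Hypothetical helper for illustration only.
link_admin_state() {
    # $1: a line such as "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ..."
    flags=${1#*<}        # drop everything up to and including the first "<"
    flags=${flags%%>*}   # drop everything from the first ">" onwards
    # Wrap in commas so that "UP" is matched as a whole flag, not as the
    # tail of "LOWER_UP".
    case ",$flags," in
        *,UP,*) echo "admin-up" ;;
        *)      echo "admin-down" ;;
    esac
}
```

On a live system this could be called as link_admin_state "$(ip -o link show eth0)", with eth0 standing in for the real interface name. After nmcli conn down the expectation from the text above is "admin-up" with no address configured; after ip link set eth0 down it would be "admin-down".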

anyway that's just an interesting diversion on behavioural differences

NM won't change an interface state without some sort of event though
(manual or virtual cable pulled etc), and if you have a case where it
*has* done that then you have found a bug that would be great to get
reported

TL;DR: cannot reproduce, need logs to determine what happened without
a working crystal ball

peter.winterflood

2017-Feb-13 16:17 UTC

there's a really good solution to this.

yum remove NetworkManager*

chkconfig network on

service network start

and yes that's all under Fedora 25, and CentOS 7.

works like a charm.

sometimes removing NM leaves resolv.conf pointing to the NetworkManager
directory, and it's best to check this, and replace your resolv.conf link
with a file with the correct settings.
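
The resolv.conf check mentioned above can be sketched in a few lines of shell. This is an illustration under assumptions, not something from the original message: resolv_conf_kind is a hypothetical helper name, and the path defaults to /etc/resolv.conf.

```shell
# Report whether a resolv.conf path is a regular file, a symlink, or absent.
# Hypothetical helper for illustration only.
resolv_conf_kind() {
    path=${1:-/etc/resolv.conf}
    if [ -L "$path" ]; then
        # -L is true even for a dangling link, which is the case to catch
        # when the link target has been removed along with a package.
        echo "symlink -> $(readlink "$path")"
    elif [ -f "$path" ]; then
        echo "regular file"
    else
        echo "missing"
    fi
}
```

Running resolv_conf_kind with no argument inspects /etc/resolv.conf. A "symlink" answer is the situation described above, where the link would be replaced by a plain file holding the correct settings.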

sorry if this upsets the people who maintain network mangler, but it's
inappropriate on a server.

regards peter

James Hogarth

2017-Feb-13 16:49 UTC

On 13 February 2017 at 16:17, peter.winterflood
<peter.winterflood at ossi.co.uk> wrote:
>
> there's a really good solution to this.
>
> yum remove NetworkManager*
>
> chkconfig network on
>
> service network start
>
> and yes thats all under fedora 25, and centos 7.
>
> works like a charm.
>
> sometimes removing NM leaves resolv.conf pointing to the networkmanager
> directory, and its best to check this, and replace your resolv.conf link
> with a file with the correct settings.
>
> sorry if this upsets the people who maintain network mangler, but its
> inappropriate on a server.
>
>
This is terribly bad advice I'm afraid ...

https://access.redhat.com/solutions/783533

The legacy network service is a fragile compilation of shell scripts
(which is why certain changes like some bonding or tagging alterations
require a full system restart or very careful unpicking manually with
ip) and is effectively deprecated in RHEL at this time, as it gets
major bug fixes only and no feature work.

You really should have a read through this as well:

https://www.hogarthuk.com/?q=node/8

On EL6 yes NM should be removed on anything but a wifi system but on
EL7 unless you fall into a specific edge case as per the network docs:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html-single/Networking_Guide/index.html

you really should be using NM for a variety of reasons.

Incidentally Mark, this had nothing to do with systemd ... I wish you
would pick your topics a little more appropriately rather than
tempting the usual flames.

Gordon Messmer

2017-Feb-13 17:15 UTC

On 02/13/2017 07:35 AM, m.roth at 5-cent.us wrote:
> Finally, I do an ifdown, followed by an ifup, and everything's
> wonderful.

What's in /etc/sysconfig/network-scripts/ifcfg-<interface>? Does it say
NM_CONTROLLED=no?

> My manager thinks that the NM daemon thinks everything's fine, and
> there've been no changes, so it does nothing. He suggests that it might
> have to be stopped, then started, rather than restarted.

"systemctl restart NetworkManager" completely stops the service and
starts it again.

> This is completely unacceptable behavior, since it leaves the system with
> no network connection. Pre-systemd, as we all know, restart *RESTARTED*
> the damn thing.

Still does.
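
One way to answer the NM_CONTROLLED question from a shell is sketched here, under stated assumptions: ifcfg_get is a hypothetical helper, not an existing tool, and it treats the ifcfg file as plain KEY=value lines while skipping anything commented out.

```shell
# Print the value of KEY from an ifcfg-style file, ignoring commented-out
# lines. Prints "unset" when no active assignment exists.
# Hypothetical helper for illustration only.
ifcfg_get() {
    key=$1
    file=$2
    # Keep the last uncommented KEY=value line, then strip optional quotes.
    value=$(sed -n "s/^[[:space:]]*${key}=//p" "$file" | tail -n 1 | tr -d "\"'")
    if [ -n "$value" ]; then
        echo "$value"
    else
        echo "unset"
    fi
}
```

For example, ifcfg_get NM_CONTROLLED /etc/sysconfig/network-scripts/ifcfg-eth0 (eth0 is a placeholder) prints the active value. "unset" means the file has no live line for that key, for instance because the line is commented out, in which case the default applies.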

Johnny Hughes

2017-Feb-13 17:26 UTC

On 02/13/2017 11:15 AM, Gordon Messmer wrote:
> On 02/13/2017 07:35 AM, m.roth at 5-cent.us wrote:
>> Finally, I do an ifdown, followed by an ifup, and everything's
>> wonderful.
>
> What's in /etc/sysconfig/network-scripts/ifcfg-<interface>? Does it say
> NM_CONTROLLED=no?

or

ONBOOT=no

<snip>


m.roth at 5-cent.us

2017-Feb-13 18:29 UTC

James Hogarth wrote:
> TL;DR: cannot reproduce, need logs to determine what happened without
> a working crystal ball

From journalctl, I see this happening when I do systemctl restart
NetworkManager (much edited):
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7755] manager: (em1): new Ethernet device (/org/freedesktop/NetworkManager/Devi
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7791] ifcfg-rh: add connection in-memory (79d3ed9d-cc41-498c-9169-44320e332f68,
Feb 13 09:47:52 <servername> systemd[1]: Started Hostname Service.
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7797] device (em1): state change: unmanaged -> unavailable (reason 'connection-
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7805] device (em1): state change: unavailable -> disconnected (reason 'connecti
<...>
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7986] device (em1): state change: disconnected -> prepare (reason 'none') [30 4
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.7999] policy: set 'em1' (em1) as default for IPv6 routing and DNS
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.8027] device (em1): state change: prepare -> config (reason 'none') [40 50 0]
Feb 13 09:47:52 <servername> NetworkManager[67312]: <info>  [1486997272.8034] device (em1): state change: config -> ip-config (reason 'none') [50 70 0]
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3594] device (em1): state change: ip-config -> ip-check (reason 'none') [70 80
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3661] device (em1): state change: ip-check -> secondaries (reason 'none') [80 9
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3666] device (em1): state change: secondaries -> activated (reason 'none') [90
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3667] manager: NetworkManager state is now CONNECTED_GLOBAL
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3670] manager: NetworkManager state is now CONNECTED_SITE
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3670] manager: NetworkManager state is now CONNECTED_GLOBAL
Feb 13 09:47:53 <servername> nm-dispatcher[67317]: req:2 'connectivity-change': new request (6 scripts)
Feb 13 09:47:53 <servername> nm-dispatcher[67317]: req:2 'connectivity-change': start running ordered scripts...
Feb 13 09:47:53 <servername> NetworkManager[67312]: <info>  [1486997273.3697] device (em1): Activation: successful, device activated.

Note there is no IP address being obtained. Now, when I run ifdown/ifup:

Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6804] device (em1): Activation: starting connection 'em1' (c432eaa1-023b-4f1f-a
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6809] audit: op="connection-activate" uuid="c432eaa1-023b-4f1f-a7b5-4605ec07195
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6810] device (em1): state change: disconnected -> prepare (reason 'none') [30 4
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6811] manager: NetworkManager state is now CONNECTING
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6816] device (em1): state change: prepare -> config (reason 'none') [40 50 0]
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6858] device (em1): state change: config -> ip-config (reason 'none') [50 70 0]
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6869] dhcp4 (em1): activation: beginning transaction (timeout in 45 seconds)
Feb 13 09:48:17 <servername> NetworkManager[67312]: <info>  [1486997297.6900] dhcp4 (em1): dhclient started with pid 67715
Feb 13 09:48:17 <servername> dhclient[67715]: DHCPDISCOVER on em1 to 255.255.255.255 port 67 interval 6 (xid=0x745ba623)
Feb 13 09:48:17 <servername> dhclient[67715]: DHCPREQUEST on em1 to 255.255.255.255 port 67 (xid=0x745ba623)
Feb 13 09:48:17 <servername> dhclient[67715]: DHCPOFFER from <DHCP server>
Feb 13 09:48:17 <servername> dhclient[67715]: DHCPACK from <DHCP server> (xid=0x745ba623)

And it then gets an IP address.

And looking at /var/log/messages, it *appears* that the restart never
invokes the dhclient script, while ifup does.
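
The comparison made above (whether NetworkManager started a DHCP transaction at all) can be checked mechanically by counting the "dhcp4 (<interface>)" lines in a journal excerpt. The following is a rough sketch only: count_dhcp4_lines is a hypothetical helper name, and it assumes the log text arrives on stdin with one entry per line.

```shell
# Count NetworkManager DHCPv4 log lines for one interface in text read
# from stdin. Hypothetical helper for illustration only.
count_dhcp4_lines() {
    iface=$1
    # -F treats the pattern as a fixed string, so the parentheses are
    # matched literally. grep -c prints 0 and exits non-zero when nothing
    # matches; "|| true" keeps that from aborting a "set -e" script.
    grep -c -F "dhcp4 (${iface})" || true
}
```

A possible invocation is journalctl -u NetworkManager --since "10 minutes ago" | count_dhcp4_lines em1. A count of 0 after a restart, against a non-zero count after ifup, would match what the two excerpts above show.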

       mark

m.roth at 5-cent.us

2017-Feb-13 18:31 UTC

peter.winterflood wrote:
> there's a really good solution to this.
>
> yum remove NetworkManager*
>
> chkconfig network on
>
> service network start
>
> and yes thats all under fedora 25, and centos 7.
>
> works like a charm.
>
> sometimes removing NM leaves resolv.conf pointing to the networkmanager
> directory, and its best to check this, and replace your resolv.conf link
> with a file with the correct settings.
>
> sorry if this upsets the people who maintain network mangler, but its
> inappropriate on a server.

That'd be a 100% agreement, good buddy.... We may have done it on some
systems, but in general, we appear to be stuck with the damn thing.

And why the *hell* would a server want wifi enabled, or avahi-daemon
running by default?

        mark

m.roth at 5-cent.us

2017-Feb-13 18:35 UTC

Gordon Messmer wrote:
> On 02/13/2017 07:35 AM, m.roth at 5-cent.us wrote:
>> Finally, I do an ifdown, followed by an ifup, and everything's
>> wonderful.
>
> What's in /etc/sysconfig/network-scripts/ifcfg-<interface>? Does it say
> NM_CONTROLLED=no?

Good catch. No, it doesn't say no... because the line was commented out.
I've just uncommented it, and set it to yes.
<snip>
    mark
