Vadim Bulst
2017-Nov-23 20:51 UTC
[CentOS] Cluster installation CentOS 7.4 network problems
Hi there,

after using Foreman successfully on our clusters for more than a year, I'd like to reinstall a 90-node cluster with CentOS 7.4; it is currently running CentOS 7.3. I'm not able to simply update to 7.4 because of zfsonlinux dependencies, and some nodes have died and had to be reinstalled on bare metal. I was able to install these nodes successfully by PXE-booting them with Foreman and kickstart from a regular CentOS mirror. After the final reboot, however, the nodes had no network connection at all, and puppet of course wasn't able to pull. After logging in locally and restarting NetworkManager, the connection came up - sometimes on the first try, sometimes on the second. I never saw such behavior with CentOS 7.3 or 7.2.

Network properties:
- DHCP, MTU 9000
- DHCP server: not Foreman-managed, on a different network
- TFTP server: Foreman-managed, on a different network

I've read a thread on Stack Exchange which describes a similar problem with a kickstart installation and DHCP network configuration on CentOS 7.4:
https://unix.stackexchange.com/questions/396096/centos-7-network-service-failed-to-start-because-systemd-starts-the-daemon-too

Has anybody of you run into similar problems?
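If the cause is the startup race described in the linked thread (network brought up before NetworkManager has finished), one workaround to try in %post would be to make boot wait for NetworkManager to report the network online and to order the legacy network.service after it. This is only a sketch, not verified on this cluster; the ROOT variable is there purely so the snippet can be dry-run outside the installer (in a real %post it would be empty):

```shell
# Sketch of a %post workaround for the NetworkManager startup race.
# ROOT defaults to a local directory for dry-running; in a real %post
# it would be "" so the files land in the installed system.
ROOT="${ROOT:-./ks-dryrun}"

# Equivalent of `systemctl enable NetworkManager-wait-online.service`
# in the installed system (its [Install] section is WantedBy=network-online.target).
mkdir -p "$ROOT/etc/systemd/system/network-online.target.wants"
ln -sf /usr/lib/systemd/system/NetworkManager-wait-online.service \
  "$ROOT/etc/systemd/system/network-online.target.wants/NetworkManager-wait-online.service"

# Drop-in ordering the legacy network.service after NetworkManager has
# finished bringing interfaces up, instead of racing it at boot.
mkdir -p "$ROOT/etc/systemd/system/network.service.d"
cat > "$ROOT/etc/systemd/system/network.service.d/10-wait-for-nm.conf" <<'EOF'
[Unit]
Wants=NetworkManager-wait-online.service
After=NetworkManager-wait-online.service
EOF
```

Whether this helps depends on whether the failing step really is the early start of network.service, so I'd test it on one node first.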
This is what my provisioning template / kickstart template looks like:

install
url --url http://mirror.centos.org/centos/7.4.1708/os/x86_64 --proxy=http://proxy.uni-leipzig.de:3128
lang en_US.UTF-8
selinux --enforcing
keyboard de
skipx
network --bootproto dhcp --hostname galaxy110.sc.uni-leipzig.de --device=somemacaddress
rootpw --iscrypted foo
firewall --service=ssh
authconfig --useshadow --passalgo=SHA256 --kickstart
timezone --utc Europe/Berlin
services --disabled gpm,sendmail,cups,pcmcia,isdn,rawdevices,hpoj,bluetooth,openibd,avahi-daemon,avahi-dnsconfd,hidd,hplip,pcscd
bootloader --location=mbr --append="nofb quiet splash=quiet"
zerombr
clearpart --initlabel --all
ignoredisk --only-use=sda
part biosboot --size 1 --fstype=biosboot --asprimary
part / --fstype=xfs --size=20480 --asprimary --ondisk=sda
part swap --size=131072 --ondisk=sda
part /var/log --fstype=xfs --size=10240 --ondisk=sda
part /home --fstype=xfs --size=10240 --grow --ondisk=sda
text
reboot

%packages
yum
dhclient
ntp
wget
@Core
redhat-lsb-core
%end

%post --nochroot
exec < /dev/tty3 > /dev/tty3
# changing to VT 3 so that we can see what's going on...
/usr/bin/chvt 3
(
cp -va /etc/resolv.conf /mnt/sysimage/etc/resolv.conf
/usr/bin/chvt 1
) 2>&1 | tee /mnt/sysimage/root/install.postnochroot.log
%end

%post
logger "Starting anaconda galaxy110.sc.uni-leipzig.de postinstall"
exec < /dev/tty3 > /dev/tty3
# changing to VT 3 so that we can see what's going on...
/usr/bin/chvt 3
(
# update local time
echo "updating system time"
/usr/sbin/ntpdate -sub 139.18.1.2
/usr/sbin/hwclock --systohc

# Yum proxy
echo 'proxy = http://proxy.uni-leipzig.de:3128' >> /etc/yum.conf

rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

# update all the base packages from the updates repository
if [ -f /usr/bin/dnf ]; then
  dnf -y update
else
  yum -t -y update
fi

# SSH keys setup snippet for Remote Execution plugin
#
# Parameters:
#
# remote_execution_ssh_keys: public keys to be put in ~/.ssh/authorized_keys
#
# remote_execution_ssh_user: user for which remote_execution_ssh_keys will be
#                            authorized
#
# remote_execution_create_user: create the user if it does not already exist
#
# remote_execution_effective_user_method: method to switch from ssh user to
#                                         effective user
#
# This template sets up SSH keys in any host so that as long as your public
# SSH key is in remote_execution_ssh_keys, you can SSH into a host. This only
# works in combination with the Remote Execution plugin.
#
# The Remote Execution plugin queries smart proxies to build the
# remote_execution_ssh_keys array which is then made available to this template
# via the host's parameters. There is currently no way of supplying this
# parameter manually.
# See http://projects.theforeman.org/issues/16107 for details.
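For reference, the behavior the comments above describe (appending the plugin's public keys to the ssh user's authorized_keys) can be sketched roughly as below. This is my own minimal hand-rolled equivalent, not Foreman's actual snippet; SSH_USER_HOME and the example key are stand-in values, and the default path is a local directory so the sketch can be dry-run:

```shell
# Hand-rolled sketch of the Remote Execution SSH-keys setup described above.
# SSH_USER_HOME defaults to a local dry-run directory; on a real host it
# would be the ssh user's home, e.g. /root.
SSH_USER_HOME="${SSH_USER_HOME:-./rex-dryrun}"
# Stand-in for the remote_execution_ssh_keys host parameter (one key per line).
REMOTE_EXECUTION_SSH_KEYS="${REMOTE_EXECUTION_SSH_KEYS:-ssh-rsa AAAAexamplekey foreman-proxy}"

mkdir -p "$SSH_USER_HOME/.ssh"
chmod 700 "$SSH_USER_HOME/.ssh"

# Append each key, skipping keys that are already present.
echo "$REMOTE_EXECUTION_SSH_KEYS" | while IFS= read -r key; do
  [ -n "$key" ] || continue
  grep -qxF "$key" "$SSH_USER_HOME/.ssh/authorized_keys" 2>/dev/null || \
    echo "$key" >> "$SSH_USER_HOME/.ssh/authorized_keys"
done
chmod 600 "$SSH_USER_HOME/.ssh/authorized_keys"
```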
rpm -Uvh --httpproxy proxy.uni-leipzig.de --httpport 3128 https://yum.puppetlabs.com/puppetlabs-release-pc1-el-7.noarch.rpm

if [ -f /usr/bin/dnf ]; then
  dnf -y install puppet-agent
else
  yum -t -y install puppet-agent
fi

cat > /etc/puppetlabs/puppet/puppet.conf << EOF
[main]
vardir = /opt/puppetlabs/puppet/cache
logdir = /var/log/puppetlabs/puppet
rundir = /var/run/puppetlabs
ssldir = /etc/puppetlabs/puppet/ssl

[agent]
pluginsync = true
report = true
ignoreschedules = true
ca_server = urzlxdeploy.rz.uni-leipzig.de
certname = galaxy110.sc.uni-leipzig.de
environment = production
server = urzlxdeploy.rz.uni-leipzig.de
EOF

puppet_unit=puppet
/usr/bin/systemctl list-unit-files | grep -q puppetagent && puppet_unit=puppetagent
/usr/bin/systemctl enable ${puppet_unit}
/sbin/chkconfig --level 345 puppet on

# export a custom fact called 'is_installer' to allow detection of the installer environment in Puppet modules
export FACTER_is_installer=true
# passing a non-existent tag like "no_such_tag" to the puppet agent only initializes the node
/opt/puppetlabs/bin/puppet agent --config /etc/puppetlabs/puppet/puppet.conf --onetime --tags no_such_tag --server urzlxdeploy.rz.uni-leipzig.de --no-daemonize

sync

# Inform the build system that we are done.
echo "Informing Foreman that we are built"
wget -q -O /dev/null --no-check-certificate http://urzlxdeploy.rz.uni-leipzig.de/unattended/built
) 2>&1 | tee /root/install.post.log

exit 0
%end

Thanks in advance for your suggestions.

Cheers,
Vadim

--
Vadim Bulst
Universität Leipzig / URZ
04109 Leipzig, Augustusplatz 10

phone: +49-341-97-33380
mail: vadim.bulst at uni-leipzig.de