
Restarting network with keepalived on RedHat / CentOS

Tuesday, October 8th, 2013


This was supposed to be a routine setup: highly available gateways/firewalls with CentOS and keepalived. Been there, done that, just not with CentOS. Anyway, I anticipated no problems whatsoever.

The Twist

The installation went smoothly; the firewall configuration was deployed from git from the previous server, and so was keepalived.conf. Startup was normal and things worked as expected. Then came the matter of removing the old statically configured IP addresses from the system.
I edited the appropriate files in /etc/sysconfig/network-scripts/, executed ‘service network restart’ and was cut off from the machine.


This had almost never happened to me before on RH/CentOS. OK, it did happen when I messed up the IP digits, or when I tried to restart only one interface (I don't know exactly what went wrong that time, but the machine was 200 km away and I really did not feel like a road trip, so I resorted to restarting the whole network stack, which seemed to work at the time).

Fortunately I had a backup network interface still reachable, and through that I was able to restore connectivity and debug the situation.

The Reason

It turned out that keepalived was lagging slightly behind the actual system state: it tried to follow the network interface status changes, but in doing so it messed up the network configuration.

Here is what should happen:

  1. ‘service network restart’ executed
  2. interfaces go down
  3. interfaces go up
  4. keepalived does its magic to assign correct HA IP addresses

Here is what actually happened:

  1. ‘service network restart’ executed
  2. interfaces go down
  3. interfaces go up
  4. as soon as the interfaces come up, keepalived re-adds the HA IP addresses
  5. the primary IP address then gets assigned to the interface as a secondary address
  6. keepalived receives a higher-priority VRRP advertisement and removes the HA IP address from the interface
  7. unfortunately this also removes the should-be-primary IP address, leaving the interface without any IP address and the server unreachable
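
Step 5 is the one worth checking for when debugging. On Linux, deleting an interface's primary address also deletes the secondary addresses in the same subnet, unless the promote_secondaries sysctl is enabled, which would explain step 7. A minimal sketch of such a check follows; the helper name addr_is_secondary is hypothetical, and it assumes the one-line output format of ‘ip -o -4 addr show’, where secondary addresses carry the word "secondary":

```shell
# Hypothetical helper, not part of keepalived or the CentOS init scripts.
# Reads the output of `ip -o -4 addr show dev IFACE` on stdin and succeeds
# (exit status 0) when the address given as $1 is flagged "secondary".
addr_is_secondary() {
    awk -v a="$1" '
        # In -o output: $1 = index, $2 = interface, $3 = "inet", $4 = addr/prefix
        $3 == "inet" && index($4, a "/") == 1 {
            for (i = 5; i <= NF; i++)
                if ($i == "secondary") found = 1
        }
        END { exit found ? 0 : 1 }'
}
```

With placeholder values for the interface and address, it could be used as ‘ip -o -4 addr show dev eth0 | addr_is_secondary 192.0.2.10 && echo "static address ended up secondary"’.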

The Resolution

The resolution was simple: do not restart the network while keepalived is running. To enforce that, we modified the /etc/init.d/network script, which now notifies the admin that keepalived is running and refuses to continue.

Here is the diff of the changes we made: network-keepalived.diff
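
The diff is not reproduced inline, so the following is only a rough sketch of what such a guard might look like, not the actual change. The function names are hypothetical, and the pidfile path is an assumption (it is keepalived's usual default, but check your own setup):

```shell
# Hypothetical guard for an init script; NOT the contents of the diff above.
# KEEPALIVED_PIDFILE is an assumed location and can be overridden.
KEEPALIVED_PIDFILE="${KEEPALIVED_PIDFILE:-/var/run/keepalived.pid}"

keepalived_running() {
    # True when the pidfile is readable and names a live process.
    # kill -0 sends no signal; it only checks that the process exists
    # (init scripts run as root, so permission errors are not a concern).
    [ -r "$KEEPALIVED_PIDFILE" ] || return 1
    pid=$(cat "$KEEPALIVED_PIDFILE" 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

guard_network_restart() {
    # Refuse to proceed while keepalived is running, and tell the admin why.
    if keepalived_running; then
        echo "keepalived is running; stop it before restarting the network." >&2
        return 1
    fi
    return 0
}
```

In the stop and restart branches of the script, a call such as ‘guard_network_restart || exit 1’ placed before any interface is taken down would give the behaviour described above.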

I hope no one else gets bitten by this peculiarity. Or if she does, I hope she does not spend hours trying to figure out where the carnivorous animal is hiding 🙂