Overlays and IPSec Troubleshooting

One disadvantage of overlay networking is that it considerably increases the complexity of the whole system. This consequently increases the number of places where things can go wrong.

If an overlay network does not work, the best way to start debugging is to connect over SSH to the compute nodes that host the virtual machines that cannot communicate. Then you can use several tools to find out what’s going on.

  • ipadm show-addr - lists all configured IP addresses and their NICs on a compute node (hypervisor). You need to find out which interface is used for overlay communication. The rule is simple: if the compute nodes are in the same physical datacenter, they use the admin interface; otherwise, they use the external interface. Find the appropriate interface name by looking at the configured IP addresses.

  • ping of overlay IPs - try to ping the other compute node’s adminoverlay_0 IP address

  • snoop - a network sniffer. Your Swiss Army knife for finding out which packets are (or are not) flowing between the compute nodes in question. Example: to sniff packets on the external interface between two nodes, run the following (203.0.113.110 is the public address of the compute node on the other side):

    [root@node02 (MYDC-remote1) ~] snoop -rd external0 host 203.0.113.110
    
  • ping inside the virtual machines - generate some traffic inside the overlay network so that snoop has packets to show.

  • /opt/erigones/bin/debug/ipsec_* - various IPSec debug scripts located in the /opt/erigones/bin/debug/ directory (see the IPSec debug scripts section below).
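
The interface lookup from the first step can be scripted. The following is only a minimal sketch: iface_for_ip is a hypothetical helper name, and it assumes the default column layout of ipadm show-addr (ADDROBJ, TYPE, STATE, ADDR), with address objects named like external0/_a.

```shell
# Hypothetical helper: print the interface that carries a given IP address.
# Reads `ipadm show-addr` output on stdin. Assumes the default columns
# ADDROBJ TYPE STATE ADDR, where ADDROBJ looks like "external0/_a".
iface_for_ip() {
    awk -v ip="$1" '
        NR > 1 {                        # skip the header line
            split($4, addr, "/")        # "203.0.113.120/24" -> "203.0.113.120"
            if (addr[1] == ip) {
                split($1, obj, "/")     # "external0/_a" -> "external0"
                print obj[1]
            }
        }'
}
```

On a compute node it could be used like this (the address is only an example):

    [root@node02 (MYDC-remote1) ~] ipadm show-addr | iface_for_ip 203.0.113.120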

How things should look when using snoop:

  • Overlay communication without IPSec, using UDP port 4790:

    10.xx.yy.10 -> 10.xx.yy.11   UDP D=4790 S=49251 LEN=118
    10.xx.yy.11 -> 10.xx.yy.10   UDP D=4790 S=49251 LEN=118
    10.xx.yy.11 -> 10.xx.yy.10   UDP D=4790 S=49177 LEN=94
    10.xx.yy.10 -> 10.xx.yy.11   UDP D=4790 S=49251 LEN=118
    10.xx.yy.11 -> 10.xx.yy.10   UDP D=4790 S=49251 LEN=118
    10.xx.yy.10 -> 10.xx.yy.11   UDP D=4790 S=49252 LEN=62
    
  • Initial IPSec negotiation (UDP port 500):

    xx.yy.zz.10 -> xx.yy.zz.20  UDP D=500 S=500 LEN=232
    xx.yy.zz.20 -> xx.yy.zz.10  UDP D=500 S=500 LEN=160
    xx.yy.zz.10 -> xx.yy.zz.20  UDP D=500 S=500 LEN=372
    xx.yy.zz.20 -> xx.yy.zz.10  UDP D=500 S=500 LEN=300
    xx.yy.zz.10 -> xx.yy.zz.20  UDP D=500 S=500 LEN=100
    xx.yy.zz.20 -> xx.yy.zz.10  UDP D=500 S=500 LEN=205
    
  • Normal IPSec communication (ESP protocol):

    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15027
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0xfdc85734 Replay=86306
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15028
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0xfdc85734 Replay=86307
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15029
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0xfdc85734 Replay=86308
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15030
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0xfdc85734 Replay=86309
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15031
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x11a183ba Replay=15032
    
  • IPSec fragmented packets (a bad sign; you need to lower the MTU in the overlay rule definition):

    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12388 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12388 Offset=1480 MF=0 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12389 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12389 Offset=1480 MF=0 TOS=0x0 TTL=60
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0x83c78776 Replay=30625
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12390 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12390 Offset=1480 MF=0 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12391 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12391 Offset=1480 MF=0 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12392 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12392 Offset=1480 MF=0 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP SPI=0x7fc7028d Replay=207382
    xx.yy.zz.20 -> xx.yy.zz.10  ESP SPI=0x83c78776 Replay=30626
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12394 Offset=0    MF=1 TOS=0x0 TTL=60
    xx.yy.zz.10 -> xx.yy.zz.20  ESP IP fragment ID=12394 Offset=1480 MF=0 TOS=0x0 TTL=60
    

When IPSec is working correctly, you should see IPSec negotiation packets when the virtual machines start to communicate for the first time (or when a key renegotiation is needed). Immediately after that, you should see normal IPSec communication.
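
To get a quick overview of a longer capture, the four traffic patterns shown above can be counted with a small filter. This is a sketch rather than one of the shipped debug scripts: classify_snoop is a hypothetical name, and the patterns are taken from the sample snoop lines in this section.

```shell
# Hypothetical helper: count snoop output lines by traffic class.
# Reads snoop summary lines on stdin and prints one line of totals.
classify_snoop() {
    awk '
        /ESP IP fragment/  { frag++;    next }   # fragmented ESP, MTU is too high
        / ESP SPI=/        { esp++;     next }   # normal IPSec communication
        /UDP D=500 S=500/  { ike++;     next }   # IPSec negotiation
        /UDP D=4790 /      { overlay++; next }   # overlay traffic without IPSec
        END {
            printf "overlay=%d ike=%d esp=%d fragments=%d\n",
                   overlay, ike, esp, frag
        }'
}
```

For example, to summarize the next 200 packets exchanged with the other node (-c limits the capture to the given number of packets):

    [root@node02 (MYDC-remote1) ~] snoop -rd external0 -c 200 host 203.0.113.110 | classify_snoop

On a healthy IPSec link you would expect esp to be greater than zero and fragments to stay at zero.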

What can go wrong:
  • You don’t see any IPSec packets - verify the snoop interface and parameters, and verify that the IPSec services are online (svcs ipsecalgs ike policy).
  • You see negotiation phase packets from one IP only and no packets from the other IP - verify the firewall, verify the IPSec config, and try to flush the association database on both hosts.
  • You see negotiation phase packets from both IPs but no normal IPSec ESP packets - verify the IPSec config and try to flush the association database on both hosts.
  • You see normal IPSec ESP packets but only from one host - look at the dropped packets and flush the association database.
  • You see normal IPSec ESP packets from both hosts but the VMs still don’t communicate - use a network sniffer inside the virtual machines on both nodes. The suspicion is that one node is accepting packets while the other node is dropping them. If so, you should see both incoming and outgoing packets inside one virtual machine but only outgoing packets inside the other. Also check whether ipsec_print_dropped_packets.d shows any output. To solve the problem, try to flush the association database or verify the IPSec policy.
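
Several of the cases above differ only in which host sends which kind of packet. A per-source summary of a snoop capture makes this easier to see. The sketch below uses a hypothetical helper name (ipsec_by_source) and assumes snoop was run with -r, so that the first field of each line is the numeric source address.

```shell
# Hypothetical helper: per-source counts of negotiation (IKE) and ESP packets.
# Reads snoop summary lines on stdin; ESP fragments are counted as ESP.
ipsec_by_source() {
    awk '
        /UDP D=500 S=500/ { ike[$1]++; seen[$1] = 1 }
        / ESP /           { esp[$1]++; seen[$1] = 1 }
        END {
            for (src in seen)
                printf "%s ike=%d esp=%d\n", src, ike[src], esp[src]
        }' | sort
}
```

A host that shows up with ike greater than zero but esp=0, or a host that does not show up at all, points to the corresponding case in the list.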

The following IPSec debug scripts can save you a lot of debugging time. They are listed in the order in which you should try them when searching for the answer.

IPSec debug scripts

Turn on IPSec debug

To make things simpler, you can enable IPSec debug logging by running ipsec_logging_enable.sh and watching the logs:

[root@node01 (myDC) ~] /opt/erigones/bin/debug/ipsec_logging_enable.sh
[root@node01 (myDC) ~] tail -f /var/adm/messages /var/log/in.iked.log

To turn the logging off, run /opt/erigones/bin/debug/ipsec_logging_disable.sh.

Run esdc-overlay update

To verify and (if needed) re-apply the IPSec (and overlay) configuration on all compute nodes, run esdc-overlay update on the first compute node. For more information, see the esdc-overlay documentation.

Inspect/Flush IPSec SADB

To see the current contents of the security association database (SADB) on a compute node, run /opt/erigones/bin/debug/ipsec_associations_print.sh. The output is quite detailed, but it shows the IPSec status of all connected hosts. Note that the other side does not necessarily have the same association status, which results in dropped packets. In that case, it is worth examining the SADB on the other compute node as well.

If you want to force a full renegotiation of the IPSec connection, run:

[root@node01 (myDC) ~] /opt/erigones/bin/debug/ipsec_associations_flush.sh

To flush the SADBs on all compute nodes, you can use Ansible to make things simpler:

[root@node01 (myDC) ~] esdc-overlay update-ans-hosts
[root@node01 (myDC) ~] cd /opt/erigones/ans
# test the Ansible connection to all nodes
[root@node01 (myDC) ~] ansible all -a date
# flush all SADBs everywhere
[root@node01 (myDC) ~] ansible all -a /opt/erigones/bin/debug/ipsec_associations_flush.sh

IPSec services and config files

There are three system services and three configuration files. To see the status of the IPSec services, run svcs ipsecalgs ike policy. The effective configuration files are located here:

  • /etc/inet/ike/config
  • /etc/inet/secret/ike.preshared
  • /etc/inet/ipsecinit.conf

Because SmartOS does not persist the configuration by default (when booted from a USB stick), the persistent configuration files are stored here: /opt/custom/etc/ipsec/. After changing the persistent configuration, reload IPSec by running /opt/custom/etc/rc-pre-network.d/020-ipsec-restore.sh refresh.
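
Because there are two copies of each file, it can help to confirm that the effective file matches its persistent copy after a refresh. The sketch below is a hypothetical helper (config_in_sync) that compares any two paths. The file names inside /opt/custom/etc/ipsec/ are not listed in this section, so the caller has to supply both paths.

```shell
# Hypothetical helper: report whether an effective config file matches
# its persistent copy. Both paths are supplied by the caller, because the
# layout of the persistent directory is not described here.
config_in_sync() {
    if cmp -s "$1" "$2"; then
        echo "in sync: $1"
    else
        echo "DIFFERS: $1 vs $2"
        return 1
    fi
}
```

For example, with PERSISTENT_COPY standing for the matching file under /opt/custom/etc/ipsec/:

    [root@node01 (myDC) ~] config_in_sync /etc/inet/ipsecinit.conf "$PERSISTENT_COPY"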