View Issue Details

- ID: 0000166
- Project: Rocky-Linux-9
- Category: NetworkManager
- View Status: public
- Last Update: 2022-07-29 05:57
- Reporter: Pascal Häussler
- Assigned To:
- Priority: high
- Severity: block
- Reproducibility: always
- Status: new
- Resolution: open
- Summary: 0000166: NetworkManager fails to configure IP over InfiniBand (IPoIB) connections
Description:

We set up an HPE ProLiant DL380 system with a Rocky 9 minimal install. On top of that, we installed InfiniBand support (`dnf group install "InfiniBand Support"`). The HPE InfiniBand NIC (in fact, a Mellanox ConnectX-5 adapter) is detected and the kernel modules are loaded. Both an IB verbs-capable device `mlx5_0` and a default IPoIB device `ibs2` are available.

When we try to configure an IPoIB connection with NetworkManager's `nmcli`, following the steps described in the RHEL 9 manual on InfiniBand support, NetworkManager fails to bring the connection up. We see these log entries in `journalctl`:

```
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <info> [1659073597.0438] device (ibs2): Activation: starting connection 'mlx5-ipoib' (e60615fc-b9fc-48fd-9797-db1addb6625e)
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <info> [1659073597.0439] device (ibs2): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <warn> [1659073597.4487] device (ibs2): mtu: failure to set IPv6 MTU
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <info> [1659073597.4487] device (ibs2): state change: prepare -> failed (reason 'config-failed', sys-iface-state: 'managed')
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <warn> [1659073597.4489] device (ibs2): Activation: failed for connection 'mlx5-ipoib'
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <info> [1659073597.4491] device (ibs2): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed')
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <warn> [1659073597.4493] device (ibs2): mtu: failure to set IPv6 MTU
Jul 29 07:46:37 master01.c.hpc.zhaw.ch NetworkManager[1521]: <info> [1659073597.4730] device (ibs2): carrier: link connected
```
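
To isolate the relevant entries on a busy host, the journal can be filtered for warnings and for transitions into the `failed` state on a single device. This is a minimal sketch, assuming the log lines arrive on stdin in the format shown above; the function name `nm_failures` is hypothetical and not part of NetworkManager.

```shell
#!/bin/sh
# nm_failures DEVICE
# Hypothetical helper: reads NetworkManager log lines on stdin and prints
# only <warn> entries and state changes into 'failed' for DEVICE.
nm_failures() {
    dev=$1
    # First keep lines for this device only (fixed-string match),
    # then keep warnings and "-> failed" state transitions.
    grep -F "device (${dev}):" | grep -E '<warn>|-> failed'
}

# Usage on the affected host (not run here):
#   journalctl -u NetworkManager -b --no-pager | nm_failures ibs2
```

Applied to the excerpt above, this would leave the two "failure to set IPv6 MTU" warnings, the `prepare -> failed` transition, and the "Activation: failed" warning.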

The connection `mlx5-ipoib` doesn't come online but remains in the disconnected state:

```
[root@master01 ~]# nmcli con sh
NAME UUID TYPE DEVICE
br1 cd93090e-f721-4e7b-87f9-7c1dde6fb994 bridge br1
br0 d7f956d0-f696-43b8-9ba4-86e3541aaa1e bridge br0
bond1 ead7ec11-04d5-4d00-8772-d34412e911d2 bond bond1
bond1-eno5 56affc33-d0bf-4081-b36e-527e6ea02da9 ethernet eno5
bond1-eno6 a4f443b9-35fa-416e-93f6-befe2b07b0d7 ethernet eno6
eno1 787aaf7c-58d7-361c-a85b-48b801a7e4ac ethernet eno1
eno2 2af9052d-6922-4398-bc96-68b7f5f12bef ethernet --
eno3 5af585d0-cd7f-49ba-9e37-f034b5ca0399 ethernet --
eno4 24b4ebd6-c39b-4b5a-a466-1f56bc8df945 ethernet --
eno5 e3d6d805-8919-3131-a47e-cbe9d4341037 ethernet --
eno6 0ad230e1-b47c-389e-bceb-a1722b0fdbee ethernet --
mlx5-ipoib e60615fc-b9fc-48fd-9797-db1addb6625e infiniband --
```

The device configured for this connection, namely `ibs2`, is in a disconnected state:

```
[root@master01 ~]# nmcli device
DEVICE TYPE STATE CONNECTION
br1 bridge connected br1
br0 bridge connected br0
bond1 bond connected bond1
eno1 ethernet connected eno1
eno5 ethernet connected bond1-eno5
eno6 ethernet connected bond1-eno6
ibs2 infiniband disconnected --
eno2 ethernet unavailable --
eno3 ethernet unavailable --
eno4 ethernet unavailable --
lo loopback unmanaged --
```

I can reproduce this error on a second system with the exact same hardware.

Note: Both of these systems previously ran CentOS 7.8 and successfully used a peer-to-peer IPoIB connection on these adapters.

Steps To Reproduce:

- Install Rocky Linux 9.0 minimal
- Install InfiniBand support
- Install and start the `opensm` subnet manager
- Configure IPoIB support as explained in the RHEL 9 manual "Configuring InfiniBand and RDMA support"
- Try to bring up the IPoIB connection

Additional Information:

Commands used to configure the connection:

```
nmcli connection add type infiniband con-name mlx5_ib0 ifname ibs2 transport-mode Connected mtu 65520

nmcli connection modify mlx5_ib0 ipv4.addresses 10.20.1.1/24
nmcli connection modify mlx5_ib0 ipv4.method manual
nmcli connection modify mlx5_ib0 ipv6.method ignore

nmcli connection up mlx5_ib0
```

Note: As you can see, we set the IPv6 method to "ignore". Nevertheless, the `journalctl` log excerpt above shows a warning that setting the IPv6 MTU fails.
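
To compare the requested settings (`transport-mode Connected`, `mtu 65520`) with what the kernel currently reports for the interface, the relevant sysfs and procfs entries can be read directly. The following is a read-only sketch under stated assumptions: the helper name `ipoib_state` and its optional root-directory arguments are hypothetical, and the `mode` attribute is assumed to be exposed by the IPoIB driver for this device. It only prints state and does not by itself establish the cause of the failure.

```shell
#!/bin/sh
# ipoib_state IFACE [SYSROOT [PROCROOT]]
# Hypothetical read-only helper: prints the IPoIB transport mode, the link
# MTU, and the IPv6 MTU that the kernel reports for IFACE.
# SYSROOT and PROCROOT default to /sys and /proc; they are parameters only
# so the helper can be exercised against a scratch directory.
ipoib_state() {
    iface=$1
    sysroot=${2:-/sys}
    procroot=${3:-/proc}
    for f in "$sysroot/class/net/$iface/mode" \
             "$sysroot/class/net/$iface/mtu" \
             "$procroot/sys/net/ipv6/conf/$iface/mtu"; do
        if [ -r "$f" ]; then
            printf '%s: %s\n' "$f" "$(cat "$f")"
        else
            # A missing entry is reported rather than treated as an error.
            printf '%s: not readable\n' "$f"
        fi
    done
}

# Usage on the affected host (not run here):
#   ipoib_state ibs2
```

If the reported mode or MTU differs from the values in the connection profile, that difference would be worth including alongside the log excerpt.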

Issue History

Date Modified Username Field Change
2022-07-29 05:57 Pascal Häussler New Issue