Interfaces flap due to DCB message #125

@liviozanol

Hi all.

We are having some weird issues with the lldpad process.

While the process is running, our physical Intel 10G interfaces (ixgbe) flap. It looks as if the interface is being restarted, although I'm not sure.

Using strace and comparing its output with the compute node's logs, we narrowed the issue down to a single message sent by the process.

Kernel log showing the interface going down and apparently resetting:

08:13:38.968996+0000 my.compute kernel: ixgbe 0000:19:00.0: removed PHC on eno1
08:13:39.261060+0000 my.compute kernel: ixgbe 0000:19:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
08:13:39.525041+0000 my.compute kernel: ixgbe 0000:19:00.0: registered PHC device on eno1
08:13:39.676982+0000 my.compute kernel: bond0: link status definitely down for interface eno1, disabling it

strace output showing the process sending a DCB_CMD_IEEE_DEL message for the same interface just before that:

08:13:38.968831 sendmsg(69, {msg_name={sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, msg_namelen=12, msg_iov=[{iov_base={{len=48, type=RTM_SETDCB, flags=NLM_F_REQUEST|NLM_F_ACK, seq=1744791218, pid=935354412}, {dcb_family=AF_UNSPEC, cmd=DCB_CMD_IEEE_DEL}, [{{nla_len=9, nla_type=DCB_ATTR_IFNAME}, "\x65\x6e\x6f\x31\x00"}, {{nla_len=16, nla_type=DCB_ATTR_IEEE}, "\x0c\x00\x03\x00\x08\x00\x01\x00\x01\x03\x06\x89"}]}, iov_len=48}], msg_iovlen=1, msg_controllen=0, msg_flags=0}, 0) = 48 <0.674729>
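To see what this message is actually deleting, the raw DCB_ATTR_IEEE payload from the strace line can be unpacked by hand. The following is only a sketch: the constants DCB_ATTR_IEEE_APP_TABLE = 3 and DCB_ATTR_IEEE_APP = 1 and the `struct dcb_app` layout (u8 selector, u8 priority, u16 protocol) are taken from my reading of `linux/dcbnl.h`, and a little-endian host is assumed. The helper names `parse_nla` and `decode_ieee_apps` are made up for this example.

```python
import struct

# Payload of the DCB_ATTR_IEEE attribute, copied from the strace output.
PAYLOAD = bytes.fromhex("0c000300" "08000100" "01030689")

# Assumed values from linux/dcbnl.h (not confirmed by the issue itself).
DCB_ATTR_IEEE_APP_TABLE = 3
DCB_ATTR_IEEE_APP = 1


def parse_nla(buf):
    """Yield (nla_type, value) for each netlink attribute in buf."""
    offset = 0
    while offset + 4 <= len(buf):
        # Attribute header: u16 length (header included), u16 type.
        nla_len, nla_type = struct.unpack_from("<HH", buf, offset)
        if nla_len < 4:
            break  # malformed attribute, stop parsing
        yield nla_type, buf[offset + 4 : offset + nla_len]
        offset += (nla_len + 3) & ~3  # attributes are padded to 4 bytes


def decode_ieee_apps(payload):
    """Return (selector, priority, protocol) for each nested dcb_app."""
    apps = []
    for table_type, table in parse_nla(payload):
        if table_type != DCB_ATTR_IEEE_APP_TABLE:
            continue
        for app_type, app in parse_nla(table):
            if app_type == DCB_ATTR_IEEE_APP and len(app) >= 4:
                # struct dcb_app: u8 selector, u8 priority, u16 protocol
                apps.append(struct.unpack("<BBH", app[:4]))
    return apps


for selector, priority, protocol in decode_ieee_apps(PAYLOAD):
    print(f"selector={selector} priority={priority} protocol={protocol:#06x}")
```

If those assumptions hold, the payload decodes to selector 1, priority 3, protocol 0x8906. Selector 1 would be the Ethertype selector and 0x8906 is the FCoE Ethertype, so the message appears to remove an FCoE application-priority entry from eno1.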

The call took roughly 675 ms (0.674729 s). It was issued at 08:13:38.968831, just before the kernel logged the PHC removal at 08:13:38.968996, and it returned only after the interface had been reinitialized.
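The ordering can be checked by merging the strace timestamps with the kernel log timestamps. This is a rough sketch that assumes both sets of timestamps come from the same clock and timezone; the event labels are my own shorthand for the lines quoted above.

```python
from datetime import datetime, timedelta


def parse(ts):
    """Parse an HH:MM:SS.ffffff timestamp (date is irrelevant here)."""
    return datetime.strptime(ts, "%H:%M:%S.%f")


# strace: start time of sendmsg() and its duration (<0.674729>).
sent = parse("08:13:38.968831")
returned = sent + timedelta(microseconds=674729)

events = [
    ("sendmsg(DCB_CMD_IEEE_DEL) issued", sent),
    ("sendmsg(DCB_CMD_IEEE_DEL) returned", returned),
    ("ixgbe: removed PHC on eno1", parse("08:13:38.968996")),
    ("ixgbe: Multiqueue Enabled", parse("08:13:39.261060")),
    ("ixgbe: registered PHC device on eno1", parse("08:13:39.525041")),
    ("bond0: link status definitely down for eno1", parse("08:13:39.676982")),
]

# Sort by time to get a single merged timeline.
timeline = sorted(events, key=lambda event: event[1])
for label, when in timeline:
    print(when.strftime("%H:%M:%S.%f"), label)
```

Under that assumption, the call returns at 08:13:39.643560: after the driver re-registers the PHC device, and shortly before bond0 reports the link as down.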

Question: Has anyone seen this before? Can we somehow disable the DCB/PFC and other flow-control handling in lldpad? I just want it to advertise neighbor information such as the compute node name and interface.