Packetloss several vlans DBC (Status)

[#560] Packetloss several vlans DBC (Status)

Posted: 2017-11-08 11:29

Start: 2017-11-08 08:45:00
End: 2017-11-08 11:15:00

Affects: DBC vlans 1018, 1069

At 08.45 our monitoring notified us of packet loss on vlans 1018 and 1069 at the DBC facility. The issue recovered on its own at 09.00.

We are currently investigating the cause.

Update 11.30:
We received the same reports during the timeframes 09.45 - 09.55 and 11.00 - 11.15.

Update 11.42:
Investigation is still ongoing. We are seeing VRRP-E flaps caused by a duplicated MAC address. The issue seems to originate on the links R1 <-> core-sw1.z1 and R2 <-> core-sw1.z1. We do not see any loops, and our ports facing the routers are configured as route-only. Our debugging shows that the most likely cause is a defective port that is not shutting itself down and is still forwarding partial traffic. We are currently going through all 96x 10 Gbps ports connecting core-sw1.z1 and the R1/R2 routers to locate the problematic port.
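
For readers who want to see what a duplicated MAC address looks like in practice, the following is a minimal sketch, not the tooling used during this incident. It assumes MAC-learning events are available as (timestamp, mac, port) tuples. That format, the function name find_duplicated_macs, the port names and the sample addresses are all hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def find_duplicated_macs(events):
    """Return {mac: sorted list of ports} for every MAC seen on more than one port.

    `events` is an iterable of (timestamp, mac, port) tuples. This is an
    assumed, hypothetical format standing in for whatever MAC-learning or
    MAC-move log a switch actually provides.
    """
    ports_by_mac = defaultdict(set)
    for _timestamp, mac, port in events:
        # Normalise case so the same address is not counted twice.
        ports_by_mac[mac.lower()].add(port)
    return {
        mac: sorted(ports)
        for mac, ports in ports_by_mac.items()
        if len(ports) > 1
    }


# Made-up sample data, for illustration only.
sample_events = [
    ("08:45:01", "02:00:00:00:00:0A", "port-1/1"),
    ("08:45:02", "02:00:00:00:00:0a", "port-2/7"),
    ("08:45:03", "02:00:00:00:00:0b", "port-1/3"),
]

duplicates = find_duplicated_macs(sample_events)
for mac, ports in duplicates.items():
    print(f"{mac} learned on multiple ports: {', '.join(ports)}")
```

A MAC address that shows up on more than one port within a short window is a candidate for closer inspection. It does not by itself prove that a port is defective.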

An engineer has been on-site since 09.00 for regular maintenance. That maintenance has been postponed to later today, and the engineer is working together with remote engineers to debug and resolve this issue.

The issue is intermittent, which makes it more difficult to identify. We have therefore enabled real-time monitoring to find the culprit.
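
As an illustration of why intermittent loss calls for continuous measurement, here is a minimal sketch of a sliding-window loss detector. It is an assumption about how such monitoring could work, not a description of the monitoring actually enabled here. The class name LossWindow, the window size and the threshold are hypothetical.

```python
from collections import deque


class LossWindow:
    """Track the loss rate over the most recent probe results."""

    def __init__(self, size=20, threshold=0.05):
        # Only the last `size` results are kept; older ones drop off.
        self.results = deque(maxlen=size)
        self.threshold = threshold

    def loss_rate(self):
        """Fraction of probes in the window that got no reply."""
        if not self.results:
            return 0.0
        lost = sum(1 for ok in self.results if not ok)
        return lost / len(self.results)

    def record(self, reply_received):
        """Store one probe result; return True if the loss rate exceeds the threshold."""
        self.results.append(bool(reply_received))
        return self.loss_rate() > self.threshold


# Made-up probe results, for illustration only.
window = LossWindow(size=4, threshold=0.25)
alerts = [window.record(ok) for ok in (True, True, True, False, False)]
```

With a short window, a brief burst of loss raises an alert quickly and clears again once the lost probes age out. This suits a fault that comes and goes.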

We expect to post an update or resolve the issue within approximately 1-2 hours.

Update 13.05:
We have found the problematic bridge and are currently moving the vlans away from it so we can debug this further. No incidents have occurred since 11.15.

Update 14.45:
The incident has been resolved and everything has been restored to primary paths. There have been no further incidents since the last one at 11.15.