At 08:40:03.514 this morning, access switch 15 in a customer cabinet on the first floor detected a loopback on both of its GigE connections, and both switch ports were put into the err-disabled state. Our syslog servers never received this information, so we assumed the switch had physically failed.
Timeline:
08:40 alerts received by on-call engineer, who proceeded to troubleshoot the issue
08:50 issue was perceived to be a switch failure; DEG asked to reboot the switch and to cable-test both cables connecting the switch to the core network.
09:15 DEG call back to say that the issue doesn’t appear to be related to cabling, as they had run a Fluke test on both uplink cables.
09:16 DEG report that both ge0/1 and ge0/2 are showing state down. Both syslog servers checked for data that might explain this; nothing found.
09:20 Blacknight ask DEG to connect another port on access switch 15 to our core network. DEG have to make up a cable.
09:35:02 Fa0/19 comes up (without config)
09:42 Config placed on Fa0/19 to carry trunk traffic to core network, network comes up 30 seconds later
09:43 Network in customer cab resumes and all machines come back online.
We’re going to investigate this issue, as this is not normal behavior and neither of the core access switches reports any issues with loopbacks.
We’ll swap out this switch today in case there is a fault with it.
Total downtime for this customer cab was 1 hour and 3 minutes.
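
The downtime figure can be checked against the timeline. The short Python sketch below assumes the outage ran from the first alert at 08:40 to the customer network resuming at 09:43; the variable names are illustrative only and are not taken from any of our tooling.

```python
from datetime import datetime

# Timestamps taken from the timeline above (same day, local time).
# Assumption: the outage is measured from the first alert to service resuming.
outage_start = datetime.strptime("08:40", "%H:%M")
service_restored = datetime.strptime("09:43", "%H:%M")

# Subtracting two datetimes gives a timedelta; convert it to whole minutes.
downtime = service_restored - outage_start
total_minutes = int(downtime.total_seconds() // 60)
hours, minutes = divmod(total_minutes, 60)

print(f"Total downtime: {hours} hour(s) and {minutes} minute(s)")
```

This gives 63 minutes, which matches the 1 hour and 3 minutes reported above.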

