There was an issue with connectivity into our second location in Dublin on Friday night.
Timeline: 11:17pm Friday to 02:11am Saturday
Location: InterXion, Blacknight Dub2 data centre
Problem and Resolution:
The problem was identified at 11:17pm, when we were unable to reach any equipment over our primary link into InterXion (IX).
This link is provided by Broighter Networks.
We dispatched an engineer on-site to diagnose the problem and to rule out our own hardware as the cause.
By 00:30 this was complete: we had switched both ends of the link to alternative hardware in DEG and IX.
We then notified Broighter that we had diagnosed the fault to be on their end. They attempted some fault diagnosis of their own without success, including a reboot of their fibre switch, which impacted other customers of theirs. They then dispatched an engineer with a new switch and line cards to IX at around 01:00. Once on site he had to migrate customers to the new switch, which took some time.
At approx 02:11 packets started routing again into IX and the issue was resolved.
We are awaiting a detailed explanation from Broighter regarding this outage, as we have a protected fibre ring which should be fault tolerant.
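The main problem with this outage was that the physical link never dropped: the circuit stayed up at layer 2 even though no traffic was passing. As a result nothing registered the link as failed, and the fault took significantly longer to fix than we would have liked.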
Future protection against such outages:
We’re provisioning another protected circuit between DEG and InterXion with an alternative carrier.
Unfortunately, even if we had had this in place on Friday night it would not have helped: because the physical layer never went down, no automated switchover would have been triggered.
In the future, if a similar issue recurs, we can simply disable one of the rings by hand so that traffic moves to the other.
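
To illustrate why a link-state check alone missed this failure, here is a minimal sketch in Python. It is not a description of our or Broighter's actual monitoring. The names `LinkStatus`, `is_forwarding` and `choose_link`, the link names and the 50% loss threshold are all hypothetical. It compares a decision based only on the carrier (link up/down) with one that also requires probe packets to be answered across the link.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkStatus:
    """Snapshot of one circuit (all fields are illustrative assumptions)."""
    name: str
    carrier_up: bool       # what the interface reports: is the link physically up?
    probes_sent: int       # reachability probes sent across the link
    probes_answered: int   # how many of those probes got a reply


def is_forwarding(status: LinkStatus, max_loss: float = 0.5) -> bool:
    """Treat a link as usable only if it is up AND actually passing traffic."""
    if not status.carrier_up:
        return False
    if status.probes_sent == 0:
        # No probe evidence at all: do not assume the link works.
        return False
    loss = 1 - status.probes_answered / status.probes_sent
    return loss <= max_loss


def choose_link(primary: LinkStatus, backup: LinkStatus) -> Optional[str]:
    """Prefer the primary circuit; fall back to the backup; None if both fail."""
    if is_forwarding(primary):
        return primary.name
    if is_forwarding(backup):
        return backup.name
    return None


# A failure like Friday night's: the link stays up but nothing gets through.
primary = LinkStatus("primary", carrier_up=True, probes_sent=10, probes_answered=0)
backup = LinkStatus("backup", carrier_up=True, probes_sent=10, probes_answered=10)

carrier_only_choice = primary.name if primary.carrier_up else backup.name
probe_based_choice = choose_link(primary, backup)
print(carrier_only_choice, probe_based_choice)
```

In this scenario the carrier-only check keeps using the primary circuit, while the probe-based check moves to the backup. Whether active probing of this kind suits a given ring depends on the carrier and the equipment involved.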
