Service Interruption on the VoIP Network, Thursday 4/10/2012

Post by SysAdmin » Fri Oct 05, 2012 1:05 pm

Start Time: 04/10/2012 10:23:00

Restoration Time: 04/10/2012 17:14:00

Fault Closed: 05/10/2012 08:00:00

Interruption 1:

Root Cause:

DNS resolution on the SIP server partially failed, affecting some
customers.

Investigation Outcomes:

No fault was found on the DNS servers.

DNS resolution requests from some software on the affected SIP server
began to fail or slow down.

This caused DNS lookup failures for SIP connections that use hostnames,
and disrupted incoming call routing for customers whose call routing
depends on DNS resolution.
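
For readers who want to watch for the same symptom on their own equipment, the following is a minimal sketch of a lookup check that classifies a single DNS resolution as ok, slow, or failed. It is not the tooling used during this investigation; the function name check_dns, the one-second threshold, and port 5060 are assumptions chosen for illustration.

```python
import socket
import time


def check_dns(hostname, resolver=socket.getaddrinfo, slow_after=1.0,
              clock=time.monotonic):
    """Classify one lookup as 'ok', 'slow', or 'failed'.

    Hypothetical helper: the name, the threshold, and the injectable
    resolver/clock parameters are illustrative assumptions.
    """
    start = clock()
    try:
        # 5060 is the conventional SIP port; only the hostname lookup matters here.
        resolver(hostname, 5060)
    except OSError:
        # socket.gaierror (resolution failure) is a subclass of OSError.
        return "failed", clock() - start
    elapsed = clock() - start
    return ("slow" if elapsed > slow_after else "ok"), elapsed
```

Calling check_dns with the hostname a SIP trunk registers to, on a schedule, would show a partial failure like the one described above as a mix of slow and failed results rather than a clean outage.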

Resolution:

Because all attempts to resolve the fault without a complete outage were
unsuccessful, a live backup was prepared and the SIP server was restarted.

Preparing the live backup delayed the restart of the SIP server, but it
was required to ensure the SIP server could be recovered correctly.

Future Plans to avoid such issues:

Network planning and design work to improve the performance and capacity
of the Exetel VoIP network has been underway since June. The Product
Manager will send customers further communications regarding the pending
network improvements.

Interruption 2:

Root Cause:

Quintum 1 (Verizon PRI switch) stopped accepting inbound calls.

Investigation Outcomes:

Two of the inbound PRIs failed, which left certain Direct Indial ranges
unable to receive calls from third parties.

Resolution:

We had to restart the Quintum 1 switch to return it to a proper working
state.

Future Plans to avoid such issues:

The existing network improvement and expansion work already includes
replacing these devices. Unfortunately, this migration is behind schedule
due to delays in the delivery of core fibre trunks to Exetel Data Centres.

The current ECD is October 31st; however, this is subject to delays in
delivery and testing.

Further communications will be forthcoming from the Product Manager.
