Jan 14, 2015: Unexpected outages in the last 24 hours

First off, we apologize to all our hosted customers for two consecutive outages in the last 24 hours, which resulted in downtime of 15 minutes and 60 minutes respectively.

What went wrong?
Our billing server ran into an unexpected issue at the data center that required a manual restart, which should have been simple enough. However, the server got stuck during the restart. The application server (HTTP server + DB) kept running throughout both outages, but it depends on the billing server, and when it could not reach it, many of you saw the following error:
“Plan for this account has been expired. Contact your account owner”

What are we doing to avoid this in the future?
Note: The details below are somewhat technical and may not mean much if you’re not a web expert, but we’d still like to stay transparent with all of you.

  • Responses from the billing server will be cached for a longer period, so downtime on the billing server won’t affect the application.
  • DNS cache times will be increased, so a failure of the DNS servers won’t affect access to the application for users whose cached DNS records are about to expire.
  • We’re also considering new ways to communicate with our customers during downtime.
  • The first two changes will be implemented within the next 48 hours.
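
For the technically curious, here is a minimal sketch of the first change, not our production code. It assumes a hypothetical `fetch` function that asks the billing server for an account's plan status and raises a hypothetical `BillingUnavailable` error when the server can't be reached. The names, the six-hour default, and the choice to fall back to the last known answer once the cache has expired are all illustrative assumptions.

```python
import time


class BillingUnavailable(Exception):
    """Hypothetical error raised when the billing server cannot be reached."""


class CachedPlanStatus:
    """Caches billing responses and serves the last known one during an outage."""

    def __init__(self, fetch, ttl_seconds=6 * 3600, clock=time.monotonic):
        self._fetch = fetch        # hypothetical call to the billing server
        self._ttl = ttl_seconds    # how long a response counts as fresh
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # account_id -> (status, fetched_at)

    def get(self, account_id):
        now = self._clock()
        entry = self._entries.get(account_id)

        # Fresh cache hit: no call to the billing server at all.
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]

        try:
            status = self._fetch(account_id)
        except BillingUnavailable:
            # Billing server is down: reuse the last known answer if there is
            # one, instead of reporting the plan as expired.
            if entry is not None:
                return entry[0]
            raise

        self._entries[account_id] = (status, now)
        return status
```

With something like this in place, the application only fails for an account it has never seen before while the billing server is down. Everyone else keeps the plan status from their last successful check.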

Once again, we apologize for the inconvenience.

About the Author

Abhimanyu is the founder of Test Collab, a test case management tool. Test Collab makes your testing more productive and efficient by enabling teams to collaborate in real time.