Incident Report: Dead Gateway Alerts

Date: 2024-01-24
Time: 3:06 PM (GMT+3)
Duration: 5 days 5 hours 45 minutes

Description

Multiple Dead Gateway errors were detected through OpsGenie.

Root Cause

The root cause was the deployment of new gateways without removing the old gateways from the database. Aaron manually pruned the old gateways.

Impact

These gateway errors have led to alerts and potential disruptions in monitoring.

Timeline

  • 15:06 (01-24) - Mertcan made changes and notified the team through Slack.
  • 20:58 (01-28) - Aaron noticed the occurrence of dead gateways.
  • 23:54 (01-28) - The SPY/USD and CHF/USD dead gateways were identified as related to the issue.
  • 20:51 (01-29) - Aaron found that the failure to remove old gateways from the database had caused the problem.

Lessons Learned

Pruning old gateways manually may be required after the deployment of new signed-api gateways.
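
As an illustration of what such a pruning step could look like, here is a minimal sketch. It assumes a hypothetical `gateways` table with a `url` column and uses an in-memory SQLite database purely for demonstration; the real schema, database engine, and gateway identifiers are not described in this report and may differ.

```python
import sqlite3


def prune_stale_gateways(conn, active_urls):
    """Delete gateway rows not in the currently deployed set.

    `gateways` and `url` are hypothetical names used for illustration only.
    Returns the sorted list of removed gateway URLs.
    """
    active = set(active_urls)
    rows = conn.execute("SELECT url FROM gateways").fetchall()
    # Anything stored in the database but no longer deployed is stale.
    stale = sorted(url for (url,) in rows if url not in active)
    conn.executemany("DELETE FROM gateways WHERE url = ?", [(u,) for u in stale])
    conn.commit()
    return stale


# Demo with made-up data: two old gateways and one newly deployed gateway.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gateways (url TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO gateways VALUES (?)",
    [("old-gw-1",), ("old-gw-2",), ("new-gw-1",)],
)
removed = prune_stale_gateways(conn, ["new-gw-1"])
remaining = [u for (u,) in conn.execute("SELECT url FROM gateways ORDER BY url")]
print("removed:", removed)
print("remaining:", remaining)
```

Running a step like this as part of the deployment, rather than by hand afterwards, would be one way to avoid stale entries triggering dead gateway alerts.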

Actions Taken

Alerts for the gateway errors were acknowledged. Aaron investigated the dead gateway alerts and pruned the old gateways.

Escalation links: 1 2 3

Incident Reviewer(s)

Mertcan, Aaron, Andrew, Arda