Incident Report: Unnecessary price updates

Date: 2023-12-04
Time: 14:47 (GMT+3)
Duration: 5 hours 15 minutes

Description

Excessive update triggers were observed for AAPL/USD and other stable assets on the Polygon network. The update frequency for these feeds rose from the typical once a day to around 25 updates in a short period.

Root Cause

The issue appears to be related to RPC (Remote Procedure Call) inconsistencies. Specifically, updates were being triggered by the 'On chain data timestamp older than heartbeat' condition, which suggests that the public RPC provider was likely returning stale on-chain data.
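
The heartbeat condition can be illustrated with a minimal sketch. It assumes a feed is refreshed whenever the on-chain timestamp read through the RPC is older than a heartbeat interval. The function names and the 24-hour heartbeat (inferred from the "once a day" cadence) are illustrative assumptions, not Airseeker's actual code. If a provider keeps returning a stale timestamp, every check triggers another update.

```python
HEARTBEAT_SECONDS = 24 * 60 * 60  # assumed heartbeat: one update per day


def is_older_than_heartbeat(on_chain_timestamp: int, now: int,
                            heartbeat: int = HEARTBEAT_SECONDS) -> bool:
    """Return True when the timestamp read via RPC is older than the heartbeat."""
    return now - on_chain_timestamp > heartbeat


def count_triggered_updates(reported_timestamps, check_times,
                            heartbeat: int = HEARTBEAT_SECONDS) -> int:
    """Count how many checks would trigger an update transaction."""
    return sum(
        is_older_than_heartbeat(reported, now, heartbeat)
        for reported, now in zip(reported_timestamps, check_times)
    )


# Hypothetical scenario: five checks, one minute apart.
start = 1_701_700_000
check_times = [start + 60 * i for i in range(5)]

# A healthy provider reports the real last update, one hour before the first check.
healthy = [start - 3600] * 5
# A lagging provider keeps reporting a value from two days earlier.
stale = [start - 2 * HEARTBEAT_SECONDS] * 5

healthy_updates = count_triggered_updates(healthy, check_times)
stale_updates = count_triggered_updates(stale, check_times)
```

In this scenario the healthy provider triggers no updates, while the stale one triggers an update on every check, even though the on-chain value may already be fresh.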

Impact

The abnormal update frequency raised concerns about the stability and reliability of data feeds on the Polygon network, and about whether such scenarios should trigger immediate alerts.

Timeline

  • 16:47 - Initial observation of unusual update behavior on AAPL/USD and stable assets on the Polygon network.
  • 16:51 - Identification of the issue as exclusive to the Polygon network.
  • 16:51 - Initial assumption of a possible RPC-related issue.
  • 17:14 - Confirmation that the consecutive transactions were caused by public RPC provider issues.
  • 17:39 - Discussion on the reliability of the public provider and consideration of redeployment strategies.
  • 17:56 - Decision to redeploy Airseeker and modify the public RPC URL for Polygon in the database.
  • 20:02 - Completion of all three Airseeker re-deployments.

Lessons Learned

The incident underscored the importance of having reliable RPC providers and the need for a responsive system to address anomalies in update frequencies. It also highlighted the necessity for clear protocols and instructions for switching providers and redeploying essential services.
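
As one way to picture the alerting need described above, the following is a minimal sketch of a sliding-window update counter. The class name, window length, and threshold are hypothetical and would need tuning per feed; it is not a description of any existing monitoring in place.

```python
from collections import deque


class UpdateRateMonitor:
    """Flags a feed when it is updated too often within a time window."""

    def __init__(self, window_seconds: int, max_updates: int) -> None:
        self.window_seconds = window_seconds
        self.max_updates = max_updates
        self._events = deque()  # timestamps of recent updates, oldest first

    def record(self, timestamp: int) -> bool:
        """Record an update; return True if the window now exceeds the limit."""
        self._events.append(timestamp)
        # Drop updates that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events) > self.max_updates


# Hypothetical settings: alert on more than 3 updates within one hour.
monitor = UpdateRateMonitor(window_seconds=3600, max_updates=3)
alerts = [monitor.record(t) for t in (0, 600, 1200, 1800)]
later_alert = monitor.record(10_000)
```

Here the fourth update within the hour raises the flag, and a single update much later does not, because the earlier updates have left the window.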

Actions Taken

  • Monitoring of updates and investigation into their frequency.
  • Review of logs and on-chain data for insights into the issue.
  • Redeployment of Airseeker after changing the public RPC URL for Polygon in the database.
  • Discussion and planning for further optimization to avoid consecutive redeployments.
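
One possible safeguard, which the report does not say was implemented, is to compare the timestamp reported by several RPC providers before acting on it. The sketch below assumes the on-chain timestamp only moves forward, so the newest reading is treated as the reference and providers far behind it are flagged as lagging. Provider names, the tolerance, and both function names are hypothetical.

```python
from typing import Dict, List, Optional


def freshest_timestamp(readings: Dict[str, Optional[int]]) -> Optional[int]:
    """Return the newest timestamp among providers that responded."""
    valid = [ts for ts in readings.values() if ts is not None]
    return max(valid) if valid else None


def lagging_providers(readings: Dict[str, Optional[int]],
                      tolerance_seconds: int) -> List[str]:
    """List providers whose reading trails the newest one by more than the tolerance."""
    newest = freshest_timestamp(readings)
    if newest is None:
        return []
    return sorted(
        name for name, ts in readings.items()
        if ts is not None and newest - ts > tolerance_seconds
    )


# Hypothetical readings; None marks a provider that failed to respond.
readings = {
    "public-rpc": 1_701_500_000,
    "provider-a": 1_701_690_000,
    "provider-b": 1_701_689_990,
    "provider-c": None,
}

reference = freshest_timestamp(readings)
lagging = lagging_providers(readings, tolerance_seconds=60)
```

With these readings, only "public-rpc" is flagged. A check like this would let a stale provider be skipped or swapped without a full redeployment, under the assumption that at least one provider is up to date.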

Incident Reviewer(s)

  • Ugur, Bedirhan, Burak, Vekil, Prenaam D, Mertcan Karik.