Elevated API Errors
Incident Report for Balena.io
Postmortem

We’ve observed a recurrence of the database connectivity issue.

Things we’ve tried so far:

  • examined DB monitoring metrics
  • attempted to correlate specific queries with the times leading up to the outage
  • isolated one of the affected API instances
  • examined that instance offline with debugging tools to glean the state of its DB connection pool leading up to the outage
  • modified DB connection pool behaviour to purge dead connections
  • adjusted DB working memory settings
  • implemented auto-recovery in the API code
  • began analysing some of the more expensive queries to see whether they can be optimised further
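
For readers unfamiliar with two of the mitigations listed above (purging dead connections from the pool, and auto-recovery in the API code), the following is a minimal illustrative sketch only. It is not the API's actual code: the names PurgingPool, with_recovery, is_alive and the retry parameters are all hypothetical, and a real pool would also handle locking, size limits and timeouts.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional, Protocol, TypeVar

T = TypeVar("T")


class Connection(Protocol):
    """Hypothetical connection interface: anything exposing a liveness check."""

    def is_alive(self) -> bool: ...


class PurgingPool:
    """Toy pool that discards dead connections instead of handing them out."""

    def __init__(self, connect: Callable[[], Connection], size: int) -> None:
        self._connect = connect
        self._idle: Deque[Connection] = deque(connect() for _ in range(size))

    def acquire(self) -> Connection:
        # Pop idle connections until a live one is found; dead ones are dropped.
        while self._idle:
            conn = self._idle.popleft()
            if conn.is_alive():
                return conn
        # No live idle connection remains, so open a fresh one.
        return self._connect()

    def release(self, conn: Connection) -> None:
        # Only live connections are returned to the idle queue.
        if conn.is_alive():
            self._idle.append(conn)


def with_recovery(
    pool: PurgingPool,
    work: Callable[[Connection], T],
    retries: int = 3,
    base_delay: float = 0.1,
) -> T:
    """Run work(conn), retrying on ConnectionError with exponential backoff."""
    last_error: Optional[ConnectionError] = None
    for attempt in range(retries):
        conn = pool.acquire()
        try:
            result = work(conn)
        except ConnectionError as err:
            # The failed connection is not released, so it leaves the pool.
            last_error = err
            time.sleep(base_delay * 2 ** attempt)
            continue
        pool.release(conn)
        return result
    raise ConnectionError("all retries failed") from last_error
```

The design choice in this sketch is to check liveness at checkout and to drop any connection that fails mid-request, so that a single dead connection cannot be reused by later requests.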
Posted Jan 20, 2021 - 22:02 UTC

Resolved
This incident has been resolved.
Posted Jan 20, 2021 - 20:26 UTC
Monitoring
We're experiencing an elevated level of API errors and are currently looking into the issue.
Posted Jan 20, 2021 - 19:30 UTC
This incident affected: API, Application Builder, Application Registry, Dashboard, Delta Image Downloads, Device URLs, Git, and Cloudlink (VPN).