CAPI Websocket Issues
Incident Report for Layer
Resolved
Primary and redundant database systems have continued to operate normally since our last update.
Posted 8 months ago. Mar 20, 2018 - 07:55 PDT
Monitoring
We have restored operation of our primary databases and systems and will continue to monitor for further issues. Our focus is now on adding increased capacity and redundancy throughout the day.
Posted 8 months ago. Mar 19, 2018 - 14:45 PDT
Update
We're experiencing intermittent delays in sending messages and are continuing to investigate.

Efforts are still underway to add capacity to the system and to resolve underlying issues with unread message counts.
Posted 8 months ago. Mar 19, 2018 - 12:37 PDT
Update
We have restored operation of our primary databases and systems and will continue to monitor for further issues. We'll be adding increased capacity and redundancy throughout the day.

There remain some issues with unread message counts, which we are currently triaging.
Posted 8 months ago. Mar 19, 2018 - 12:10 PDT
Update
We are currently experiencing message backlogs due to reduced throughput capacity.

The team is still working to restore full operation of our primary databases and to increase capacity by adding resources to the system. We will post a progress update shortly.
Posted 8 months ago. Mar 19, 2018 - 09:31 PDT
Update
The Infrastructure Engineering team responded to an issue with a primary operational database at 3:43 AM PST. The issue affected several production nodes, and the team decided to fail over to backup systems. Since then, we have been working through data consistency anomalies while restoring the primaries to production.

We are currently experiencing message backlogs due to reduced throughput capacity.

We will post an update as soon as we have more information.
Posted 8 months ago. Mar 19, 2018 - 07:39 PDT
Identified
Around 3:45 AM PST, we identified a problem with the databases backing our client and server messaging APIs.
We have been working on a fix since then and have failed over to a set of hot-spare services to get the majority of messaging traffic flowing again.
Posted 8 months ago. Mar 19, 2018 - 05:43 PDT