Intermittent service degradation
Incident Report for Layer

On September 6th, 2017, we experienced two separate incidents affecting API services. Both were related to underlying database issues. Layer uses a database and an access library to map database objects to more usable collections in our software. We have identified several instances where queries generated by this automated mapping tool performed poorly. Combined with elevated growth in our customer base and usage, previously unidentified issues with query optimization, indexing, and connection pools manifested as extreme latency and API call failures.

To address this, we have immediately begun a holistic review of all database queries and data models related to our API services. The first phase, covering the query and model issues that came to light today, has already been completed. We plan to identify and apply any additional optimizations over the next 72 hours. Additionally, we are assessing and implementing options to gracefully degrade non-critical messaging features during failure scenarios such as those that occurred.
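
As an illustration only, graceful degradation of a non-critical feature is often built around a circuit breaker: after repeated failures, the feature is skipped for a cooldown period and a fallback value is returned instead, so a struggling database is not hit with further load. The sketch below is a generic example of that pattern, not Layer's implementation; the class name, parameters, and thresholds are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical helper: skips a non-critical call after repeated failures."""

    def __init__(self, max_failures=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the behavior can be tested
        self.failures = 0
        self.opened_at = None  # time the breaker tripped, or None if closed

    def call(self, func, fallback):
        # While the breaker is open, return the fallback without calling func.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_seconds:
                return fallback
            # Cooldown elapsed: close the breaker and try the call again.
            self.opened_at = None
            self.failures = 0
        try:
            result = func()
        except Exception:
            # Count the failure and trip the breaker at the threshold.
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        # A success resets the failure count.
        self.failures = 0
        return result
```

A caller would wrap only non-critical work, for example `breaker.call(load_typing_indicators, fallback=[])` with a hypothetical `load_typing_indicators` function, while critical paths such as message delivery continue to surface their errors.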

Lastly, we are immediately implementing a moratorium on feature deployments and a less aggressive cadence for deployments overall, until we have completed the assessment outlined above.

Posted 9 months ago. Sep 06, 2017 - 20:07 PDT

Resolved
We have identified the primary contributing causes of today's issues, which center on critical queries experiencing elevated activity and exposing some new data model issues. We have been able to address several of these already, and will make follow-on refinements over the next 1-2 days to cement and improve these fixes and bring the system to a more permanent state.
Posted 9 months ago. Sep 06, 2017 - 18:07 PDT
Investigating
As of 11:30am PDT, we have been monitoring a server degradation that has led to increased latencies and API errors. We are investigating root causes and working to stabilize the system.
Posted 9 months ago. Sep 06, 2017 - 12:26 PDT