US Service – Intermittent Performance Issues
Incident Report for NetDocuments US

On Jan. 3rd at approximately 9:25 am MST, the US Service began experiencing elevated response times for ndMail and Workspace operations.

We were initially alerted to the issue via automated monitoring and began investigating immediately. Throughout the disruption, parts of the Service continued to operate normally, but we understand that some customers may have experienced a greater impact than others.

Our Operations and Engineering teams traced the root cause to a query that was creating excessive load on a database cluster. The query was related to a product change that was deployed to production a few months ago. Over time, as more areas of the Service began interacting with this code change, the load generated by the query increased and triggered the larger issue. The team optimized the query, which resolved the core issue. By approximately 12:27 pm MST the Service had stabilized, and response times in the affected areas had returned to normal.

In addition to the actions that resolved the issue, we have adjusted our automated monitoring for the specific queries so that we can detect potential abnormalities before thresholds are triggered.

We apologize for any inconvenience this disruption may have caused.

Thank you,

The NetDocuments Team

Posted Jan 04, 2022 - 18:06 EST

We have corrected the issue and are monitoring for any further disruptions. While the root issue has been resolved, the ndMail queue is still working through the resulting backlog of items, and ndMail may see continued delays as the queue catches up. We anticipate the queue will be fully caught up in the next couple of hours.
A postmortem is being conducted and the details will be provided in the next 24 hours.
Posted Jan 03, 2022 - 15:04 EST
NetDocuments is currently experiencing issues with the US Service. The Service may appear to be reachable, but customers have reported it is responding slowly.
We are working to remedy the situation as soon as possible and apologize for the inconvenience. We will provide further updates within the next hour or as soon as further information is available.
Posted Jan 03, 2022 - 14:16 EST
This incident affected: Platform.