On January 3 at approximately 9:25 am MST, NetDocuments began experiencing elevated response times for ndMail and Workspace operations in the US Service.
We were first alerted to the issue by automated monitoring and began investigating immediately. Throughout the disruption, some areas of the Service showed normal activity, but we understand that some customers may have experienced greater impact than others.
Our Operations and Engineering teams traced the root cause to a query that was creating excessive load on a database cluster. The query was related to a product change that was deployed into production a few months ago. Over time, as more areas of the Service began interacting with this code change, the load generated by the query triggered the larger issue. The team optimized the query, which resolved the core issue. By approximately 12:27 pm MST, the Service had stabilized and response times in the affected areas had returned to normal.
In addition to the actions that resolved the issue, we have adjusted our automated monitoring around the specific queries so that potential abnormalities can be detected before thresholds are triggered.
We apologize for any inconvenience this disruption may have caused.
Thank you,
The NetDocuments Team