At approximately 09:30 EDT, NetDocuments platform monitoring began reporting congestion in a database subsystem. In accordance with operating procedures, the impacted databases were failed over to alternate servers. The failover took about 10 minutes to complete, and by 09:45 EDT the NetDocuments platform had returned to normal operating performance.
At 11:16 EDT, the same database subsystem again reported congestion and degraded performance. The failover procedure was executed again. This time, however, the degraded performance persisted. The issue was then escalated to our Tier II Platform Engineering and Software Engineering teams. Triage teams assembled and began to evaluate the situation. Troubleshooting continued until the root cause was identified.
The root cause was traced to an application software change in the document indexing pipeline. This change placed additional load on the database infrastructure, which resulted in the degraded performance. Once the software change was identified, the software roll-back process was executed, and system performance returned to normal at around 12:10 EDT.