Update – US Service database incident and current status
We recognize that today’s disruption is ongoing, and that the earlier progress noted on our trust site did not persist or lead to full recovery. We apologize for the continued impact on your work. Below we provide more detail about what is happening, how it relates to a prior incident, what we are doing now, and workarounds that may help your team members in the meantime.
Summary of impact:
In the US region:
• Workspace search, Advanced Search, and Lookup Table interactions are still degraded.
• Some users may see intermittent errors or slow responses in other parts of the Service.
• Direct access to content (for example via Favorites, Recents, direct workspace links, or Echo-synced documents on your local device) generally continues to work.
How today’s issue started
We deployed an upgrade to our Lookup Table database platform outside of business hours. This upgrade was part of our long-term plan to move from a NetDocuments patch to our database provider’s own permanent fix for a bug that previously affected the US Service.
Although we tested this upgrade extensively in our lab environment, when it was applied to the US production cluster we experienced a disruption in which:
• Memory pressure within the cluster caused database nodes to become unresponsive and require repeated restarts.
• As the cluster became unstable, our search and lookup indexes (including Full Text indexes and Lookup Table structures that power workspace search and Advanced Search) were affected.
• Our cloud operations team deleted and recreated one of the Full Text index replicas, but indexing did not fully recover.
As soon as we saw the impact on cluster stability, we followed our protocol and rolled back the change to the previous database version.
Why service did not fully recover after the rollback
The rollback itself completed, but the cluster recovery process unexpectedly did not complete across all nodes, which is why search and lookup functionality remains degraded.
How this relates to the earlier database incident
Earlier this year, a bug in the Lookup Table database platform caused a serious disruption in the US Service. In response, we:
• Deployed an application-level patch in our software that enforces query timeouts independently of the database configuration.
That application-level patch has remained effective. Today’s incident arose when we fulfilled our commitment to deploy the database provider’s permanent fix via a database upgrade.
Both incidents involved unexpected behavior in our database layer. We are working closely with the database provider’s engineering team as part of our remediation efforts, and we are reviewing our deployment and testing practices to reduce the risk of recurrence.
What we are doing now
Our teams are working around the clock to restore full service. Specifically, we are:
• Collaborating directly with senior engineers at the database platform provider to pinpoint and remediate the memory and indexing behavior we saw after the upgrade.
• Completing the rebuild and rebalance of the affected Full Text and Lookup Table indexes in the US cluster.
• Carefully restarting web and API servers as indexes recover, while monitoring performance and error rates in real time.
Because the remaining work is dominated by index recovery and validation, we are not publishing a precise time estimate for full restoration at this point. Previous estimates reflected the best information we had at the time, but as the incident evolved they proved too optimistic. We will provide an ETA only when we have high confidence in it.
What you can do in the meantime
While we work to restore full search and lookup functionality:
• If you need to access a workspace, you can often still reach it from Favorites, Recents, or via a direct link to the workspace. Team members can share direct links with one another as needed, provided the recipient has access to the workspace.
• Documents that have been synchronized locally via Echo can be accessed directly from your local Echo folder.
If you and your collaborators work on documents during this period, please review the final versions to confirm that everyone’s updates have been merged.
We will continue to post updates here as we make progress on index recovery and will also publish a full root cause analysis after the incident is fully resolved, including the specific actions we are taking in coordination with the database provider and within our own change and testing processes to reduce the risk of recurrence.
Posted Dec 15, 2025 - 20:20 EST