We sincerely apologize for the service disruption you experienced. We understand that any interruption to your service is unacceptable, and we take full responsibility for the impact this had on your operations.
What Happened
At 12:06 PM ET on June 1, our primary database server shut down unexpectedly. We restored the server to service as quickly as possible, and during that recovery process we identified the root cause: a single failed hard drive in the storage array.
Root Cause
The incident was caused solely by the failure of a single hard drive within the storage array. The failure of this drive caused no data loss. Our system would normally handle a drive failure like this in the background. In this case, however, the hardware faulted and we had to hard reboot the system.
After the reboot, and before bringing the database back online, we performed a quick but comprehensive analysis to confirm that no data loss had occurred and that the integrity of client data was preserved.
What We've Done
The failed drive has been replaced and the storage array is fully operational, with all redundant drives functioning normally.
We performed a comprehensive analysis of the primary and secondary database servers and ruled out issues with any other hardware components. We verified that the Mean Time to Data Loss (MTTDL) for the remaining drives in the array is well within acceptable tolerances.
Our Commitment to You
We are committed to the reliability of your service and will continue to closely monitor the health of this infrastructure. You have our assurance that the server is stable and operating normally.