Update on Temporary Outage at Dallas Data Center on Saturday, Dec. 10th, 2011
On Saturday, December 10, we experienced a data collection incident in our Dallas data center. For most affected customers, approximately 5%-10% of data collection requests went unanswered for approximately 80 minutes. A smaller set of customers with highly specialized, customized data collection implementations was more significantly impacted. We are working directly with impacted customers. If you are a customer and believe you may have been impacted but have not yet been contacted by your account manager, please call 1-800-497-0335; to talk to someone live, press 1 and then 1 again. Your supported user should make this call.
While we continue to investigate the root cause of the event, we have ascertained that a key network device in the Dallas data center was not passing traffic correctly. In partnership with our vendor, we are investigating two questions: why the secondary device (the two devices form a redundant pair) did not detect the problem and fail over, and what the initial trigger or cause of the event may have been.
An event like this, while rare, often triggers a larger discussion about redundancy, a critical ingredient for any enterprise service delivered via the cloud. At Adobe, we have numerous redundancies implemented throughout our architecture. At each site, we have multiple data collection ‘networks’. This event impacted a single one of these networks. At each layer within the network, we also have redundancies (multiple firewalls, routers, switches, load balancers, servers, etc.).
This brings us to a much larger redundancy strategy we began rolling out several years ago: RDC (regional data collection). Adobe customers can participate in implementing a multi-site redundancy strategy by moving to RDC. While we have primarily touted the performance benefits of RDC, customers also inherit multi-site redundancy, which both minimizes the impact on them should any site be affected and allows us to take a problem site offline much more quickly. Additional information can be found here:
We deeply regret the inconvenience this has caused and are taking steps to protect against similar events going forward. We are committed to delivering the products and services our digital marketing customers need to drive positive business impact and ROI.