Andrew Miklas, CTO, PagerDuty
Andrew Miklas is the CTO of PagerDuty, a San Francisco startup building an incident tracking and alerting system for IT operations teams. PagerDuty integrates with a variety of monitoring tools and handles the people part of the ops equation: alerting (via phone, SMS, and email), on-call scheduling for teams, and automatic escalation of critical issues. The operations staff at Heroku, EngineYard, Linode, 37signals, and many others trust PagerDuty to dispatch their alerts. Prior to co-founding PagerDuty, Andrew interned at Amazon.com, where he worked on the Performance team and the Personalization & Recommendation team.
Ensuring the Call Goes Out, Every Time
Many systems can afford a bit of downtime now and then. Unfortunately, at PagerDuty, even a three-minute outage at the wrong time can cost thousands in lost revenue. This talk will cover some of the techniques and tricks we use at PagerDuty to ensure that our phone and SMS alerts continue to flow while staying well within the budget of a typical startup. Specifically, I will show how we do zero-downtime database migrations and deploys, and how we recover from host and data center outages with minimal disruption.