Resolved — 3 min ago: Stripe API — HTTP 503 errors on /v1/charges endpoint in us-east-1. Root cause: upstream database failover. Response time returned to 120ms baseline.
Investigating — 12 min ago: Twilio SMS Gateway — Elevated latency (2.4s avg) on messaging endpoints in eu-west-1. Engineering team notified via PagerDuty webhook at 14:32 UTC.
Resolved — 47 min ago: AWS S3 — Partial outage affecting multi-part uploads in ap-southeast-2 (Sydney). Affected 1,204 requests over 22 minutes. Fully recovered.
Major Incident Declared — 1 hr ago: Cloudflare — Degraded performance across 18 edge locations in North America. Cache hit ratio dropped from 96% to 71%. Incident commander assigned.
Operational — All systems green: GitHub API, Slack Webhook, SendGrid SMTP, Vercel Deployments, Datadog Agent. Each has stayed within normal thresholds for the past 6 hours.