Follow-the-Sun On-Call: Coverage Across Time Zones
Nobody should be paged at 3am while a teammate on the other side of the world is in the middle of their afternoon. Here's how to build follow-the-sun on-call that actually hands off cleanly.
The first time I worked on a team spread across three continents, on-call was a single rotation anchored to one time zone — which meant half the team was getting paged in the dead of their night while colleagues in another region were wide awake, well-caffeinated, and not on the hook. It was a waste of the one advantage a distributed team has: someone is always awake.
Follow-the-sun on-call fixes that. Instead of one rotation paging people at all hours, you chain regional rotations so the pager always lands on someone who’s in their working day. Done right, almost nobody gets a 3am page. Done wrong, you trade night pages for dropped handoffs, which is arguably worse. Here’s how to get the good version.
The core idea
Split coverage into regional shifts that follow daylight. A typical three-region setup:
- APAC covers roughly 00:00-08:00 UTC.
- EMEA covers roughly 08:00-16:00 UTC.
- Americas covers roughly 16:00-24:00 UTC.
Each region is on-call during its own business hours. The pager moves around the globe with the sun, and no single person eats the night shift. You need at least two well-staffed regions to make this work, but with only two, each region covers a twelve-hour window; three gives you clean eight-hour windows.
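To make the routing concrete, here is a minimal sketch of the three-window lookup described above. The `SHIFTS` table and the `region_on_call` function are hypothetical names for illustration; a real paging tool would hold this schedule itself.

```python
from datetime import datetime, timezone

# Hypothetical shift table mirroring the three windows above:
# (start_hour_utc inclusive, end_hour_utc exclusive, region).
SHIFTS = [
    (0, 8, "APAC"),
    (8, 16, "EMEA"),
    (16, 24, "Americas"),
]

def region_on_call(moment: datetime) -> str:
    """Return the region whose window contains `moment`, compared in UTC."""
    if moment.tzinfo is None:
        raise ValueError("moment must be timezone-aware")
    hour = moment.astimezone(timezone.utc).hour
    for start, end, region in SHIFTS:
        if start <= hour < end:
            return region
    # Only reachable if the table has a gap in coverage.
    raise RuntimeError(f"no region covers hour {hour} UTC")
```

Refusing naive datetimes is a deliberate choice in this sketch: a schedule that spans time zones is exactly where an unlabeled local time causes a wrong page.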
The handoff is the whole game
The promise of follow-the-sun is “no night pages.” The risk is that an incident in progress falls into the gap between two shifts and nobody owns it. The entire discipline lives in the handoff.
Treat each shift change like a shift change in a hospital — structured, mandatory, and not a vibe. A good handoff transfers:
- Active incidents — anything open, with current state and next steps. Nothing in flight should change owners silently.
- Watch items — “the queue is climbing, keep an eye on it,” “we deployed X, watch for fallout.”
- Suppressed or acknowledged alerts — so the next region doesn’t get blindsided by something the previous region was deliberately sitting on.
- Planned work — deploys, maintenance windows, migrations happening in the next window.
A handoff template
Run every shift change through the same structure so nothing falls through:
Outgoing region → Incoming region — [HH:MM UTC]
Open incidents: [ID, severity, current state, next action, who to grab]
Watch items: [metric/system + why + threshold to act on]
Acked/suppressed alerts: [what + why + when it should be revisited]
Recent & upcoming changes: [deploys, migrations, maintenance windows]
Anything weird: [gut-feel concerns that aren’t yet incidents]
Confirmed received by: [incoming on-call name]
That last line matters. A handoff isn’t complete until the incoming person acknowledges it. An unacknowledged handoff is a dropped baton.
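If you want the template to be machine-checkable, one option is to model it as a record whose completeness depends on the acknowledgment. This is a sketch under assumptions: the `Handoff` class and its field names are invented here to mirror the template, not taken from any particular tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Handoff:
    """Hypothetical record mirroring the handoff template fields."""
    outgoing_region: str
    incoming_region: str
    time_utc: str
    open_incidents: List[str] = field(default_factory=list)
    watch_items: List[str] = field(default_factory=list)
    acked_alerts: List[str] = field(default_factory=list)
    changes: List[str] = field(default_factory=list)
    anything_weird: List[str] = field(default_factory=list)
    confirmed_by: Optional[str] = None  # incoming on-call name

    def is_complete(self) -> bool:
        # A handoff counts as complete only once the incoming person has
        # acknowledged it; an empty or blank name does not count.
        return bool(self.confirmed_by and self.confirmed_by.strip())
```

A bot or checklist could then flag any shift change where `is_complete()` is still false a few minutes after the boundary.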
Tooling the handoff
A live document or channel thread per shift beats verbal sync, especially when the outgoing and incoming engineers can’t always overlap live. Some teams enforce a short overlap window — 15-30 minutes where both regions are on — for live questions on anything in flight. If you can swing the overlap, the verbal layer catches the stuff the template misses.
Your paging tool should rotate the schedule automatically at the boundary, but the human handoff has to happen regardless of what the tool does. The tool moves the pager; the template moves the context.
Don’t let an incident respect the clock
The hardest case is a live SEV1 that spans a shift boundary. Two rules prevent disaster:
- The incident commander role hands off explicitly, mid-incident, with a full state transfer — or the outgoing IC stays on until a clean handoff point even if it runs past their shift. You never drop incident command at a clock boundary just because the schedule flipped.
- The incident channel is the source of truth, so the incoming region can read the full timeline rather than relying on a hurried verbal summary. A well-kept incident log is what makes a cross-region IC handoff survivable.
The clock is for routine coverage. An active major incident overrides the clock until there’s a safe handoff point.
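The rule can be written down as a small decision function. This is a sketch, assuming that only SEV1 counts as "major" and that routine incidents follow the schedule once the normal shift handoff has happened; `Incident`, `MAJOR_SEVERITIES`, and `current_commander` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: which severities count as "major" is a team decision.
MAJOR_SEVERITIES = {"SEV1"}

@dataclass
class Incident:
    severity: str
    commander: str                          # IC when the shift boundary hit
    handoff_acked_by: Optional[str] = None  # set only after an explicit IC handoff

def current_commander(incident: Incident, scheduled_on_call: str) -> str:
    """Who holds incident command after the schedule has flipped."""
    if incident.severity in MAJOR_SEVERITIES:
        # The clock does not move command: the outgoing IC keeps it until
        # someone has explicitly acknowledged taking over.
        return incident.handoff_acked_by or incident.commander
    # Routine coverage follows the schedule.
    return scheduled_on_call
```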
Fairness across regions
Follow-the-sun can quietly become unfair if one region carries more load. Watch for:
- Uneven incident volume by window. If most deploys land in Americas hours, that region eats most of the fallout. Spread deploys, or staff that window heavier.
- The “small region” trap. A two-person region burns out fast because the rotation comes around constantly. Each region needs enough people for a humane rotation in isolation — typically four or more.
- Holiday and weekend asymmetry. Regions don’t share holidays. Map them out so coverage doesn’t silently collapse on a regional holiday nobody else observes.
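One way to check the first of these is to bucket past pages by coverage window and compare the counts. The sketch below assumes the three eight-hour UTC windows from earlier and a list of timezone-aware page timestamps exported from your paging tool; `pages_by_window` is a hypothetical helper.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Dict, Iterable

def pages_by_window(page_times: Iterable[datetime]) -> Dict[str, int]:
    """Count pages per region window (00-08 APAC, 08-16 EMEA, 16-24 Americas, UTC)."""
    counts = Counter({"APAC": 0, "EMEA": 0, "Americas": 0})
    for t in page_times:
        hour = t.astimezone(timezone.utc).hour
        if hour < 8:
            counts["APAC"] += 1
        elif hour < 16:
            counts["EMEA"] += 1
        else:
            counts["Americas"] += 1
    return dict(counts)
```

If one window consistently carries far more pages than the others, that is the signal to spread deploys or staff that window more heavily.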
When follow-the-sun isn’t worth it
Be honest about the prerequisites. Follow-the-sun needs genuinely distributed staffing — enough engineers in at least two regions to run humane rotations independently. If you have eight engineers, six in one city, you don’t have a follow-the-sun team; you have one real rotation and a token second region that will burn out.
In that case, a single well-designed rotation with fair compensation for night pages is more honest than a follow-the-sun setup that exists on paper but dumps everything on one region anyway. Don’t cargo-cult the model without the headcount to support it.
Make handoffs a measured habit
Track dropped or incomplete handoffs the way you track missed pages — they’re the failure mode unique to this model. If incidents keep getting “re-discovered” at shift boundaries, your handoff discipline is the problem, not the schedule.
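As a starting point for that tracking, here is a minimal sketch that computes the share of unacknowledged handoffs per shift boundary. `HandoffRecord` and `dropped_handoff_rate` are hypothetical; where the records come from (a channel bot, a form, a spreadsheet) is up to you.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

@dataclass
class HandoffRecord:
    boundary: str       # e.g. "EMEA->Americas"
    acknowledged: bool  # did the incoming on-call confirm receipt?

def dropped_handoff_rate(records: Iterable[HandoffRecord]) -> Dict[str, float]:
    """Fraction of handoffs at each boundary that were never acknowledged."""
    totals: Dict[str, int] = {}
    dropped: Dict[str, int] = {}
    for r in records:
        totals[r.boundary] = totals.get(r.boundary, 0) + 1
        if not r.acknowledged:
            dropped[r.boundary] = dropped.get(r.boundary, 0) + 1
    return {b: dropped.get(b, 0) / n for b, n in totals.items()}
```

Breaking the rate out per boundary matters because a problem is often local to one pair of regions, for example the pair with the least working-hours overlap.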
We keep on-call handoff and shift-change templates in our incident-response toolkit, and the AI Incident Response Assistant can summarize an incident channel into a clean handoff brief so the outgoing region passes complete context instead of a rushed paragraph.
Coverage models and rotation sizes here are general guidance. Design any on-call schedule around your real staffing, geography, and local labor norms.