We lost a good engineer to on-call before we took it seriously. He never complained loudly. He just got quieter, then he left, and in the exit conversation it came out that he had not slept through a night in his rotation week for a year. Nobody had looked at the page volume, because nobody owned the question of whether on-call was survivable.
Page volume is a budget, not a fact of nature
We set a number: more than two pages in a night, or any page outside business hours that was not genuinely urgent, is a bug in the alerting, not a fact about the system. When the on-call person gets paged for something that could have waited until morning, we fix the alert, not the human's expectations. The target is that most on-call weeks are quiet, and a loud week triggers a review.
- Track pages per rotation and treat a spike as a defect to fix, not a season to endure
- Every page must be actionable and urgent, or it gets downgraded to a ticket
- No solo heroics: a second person is reachable for anything serious
- People who carried a rough week get real recovery time, not a thank-you in standup
The rotation is a design, not a leftover
A healthy rotation is large enough that your turn comes around every five or six weeks, not every other week. It has clear handoffs so the incoming person knows what is fragile right now. And it pays back into itself: time spent on-call includes time to fix the thing that woke you, so the rotation gets quieter over time instead of accumulating debt that the next person inherits.
If on-call is something your team survives rather than something they can sustain, you are borrowing against people who will eventually leave to stop paying interest.
Compensate it honestly
On-call is work, including the weeks nothing happens, because the constraint of staying reachable is itself a cost. We compensate it, whether in pay or in time, and we are explicit that being unable to leave the city for a week has a price. Pretending it is free is how you end up with a rotation only the people who cannot say no will join.
The leader's actual job here
My job is not to thank people for sacrificing. It is to make sure the sacrifice is small and shrinking. I look at the page data every month, I carry the rotation myself sometimes so I feel it, and when a week is rough I ask what we are going to change rather than how the person coped. Humane on-call is built, and it decays the moment leadership stops paying attention to it.