Excited to share what I've been working on solo recently. Rezible is a "mission control" platform for oncall teams, aiming to automate, support, and report on the important but often overlooked, unglamorous work involved in being oncall.
GitHub repo: https://github.com/rezible/rezible
While working as an SRE in different teams across Google & Canva, I saw the impact an unhealthy oncall rotation can have on engineers as individuals and as teams.
I believe oncall is a huge missed opportunity for many teams - it is often viewed as a necessary evil rather than as a source of growth & learning. This is not surprising considering the continuous administrative upkeep needed to keep a rotation healthy.
While every dysfunctional rotation is somewhat unique, strong ones share common practices, which I am building into Rezible as features to provide "healthy oncall on rails":
- Oncall shift event annotation (e.g. flag noisy alerts, measure toil)
- Automated shift handovers
- AI-powered post-incident debriefs
- Real-time collaborative incident retrospectives
- Searchable & discoverable knowledge base (populated from retrospective learnings & analysis)
- Structured oncall training & onboarding
The backend is in Go + Huma/OpenAPI, and the frontend is Svelte 5 + SvelteKit (which has been a pleasure to use despite my not being a frontend engineer). It's still very rough around the edges, but I'm sharing now to avoid it never being "ready" :)
If you're interested in receiving updates: https://tally.so/r/wLJ5ll
Would greatly appreciate any feedback & a star on GitHub!