Leading the effort to put together this reliability training for developers was a fun project.
At GitLab, we have a focus on reliability in engineering. We have made many changes to our handbook, production documentation, and processes. While we have announced them through several channels (the engineering week in review document, Slack, email, meetings, etc.), it is likely that not everyone has seen and internalized all of the important changes.
We gathered all the crucial changes, explained why we made them, summarized each one, and linked to where you can find more information.
Most of this training is available to the public. Some content is GitLab-specific, and some applies to any company focusing on reliability in engineering.
The topics include:
- The business impact of reliability
- Reliability and values
- Blameless culture
- Limiting the impact of far-reaching work
- Risk mapping
- Change acceptance checklists
- Definition of done
- Backward compatibility
- Error budgets
- Feature change locks
If you are interested, check out this handbook page.