Discussing the process of crafting effective and creative OKRs for engineering teams, highlighting the importance of specificity, creativity, and alignment with desired outcomes.
Over the last few weeks, I was working with a co-worker on crafting good “tech excellence” OKRs for the engineering team, and I learned a lot through the process. This was not the first time we had worked together on OKRs (it is currently Q4 2023, and we did the same exercise for Q3 2023), but this time around it was a lot clearer what makes a good OKR versus a brittle one.
OKRs are hard because they need to be specific about the outcome they’re trying to measure, while also inspiring creativity in how to move that metric. It’s really hard to do both at once. The OKRs we were creating were specific to the engineering team, and the first draft looked something like this:
Key-result: 100% of all front-end components should use the new DS Library
My co-worker and I thought this was a crisp OKR, but there were a lot of issues with it.
- The objective was scoped specifically to front-end improvements, which ruled out a whole class of engineering improvements we could be making elsewhere in the system.
- The KR was quite binary: we either did it or we didn’t.
- The KR was also too prescriptive: baked into the metric was the decision that the team had to adopt one particular DS Library and migrate to it. (You can see this directly in the measurement sketch after this list.)
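To make that last point concrete, here’s roughly what measuring this KR would look like. This is a minimal sketch, not our actual tooling: the `@acme/design-system` package name and the `src/components` path are made-up placeholders.

```ts
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Made-up placeholders: these stand in for "the new DS Library" and
// wherever your components actually live.
const DS_PACKAGE = "@acme/design-system";
const COMPONENTS_DIR = "src/components";

// Recursively yield every component file under a directory.
function* componentFiles(dir: string): Generator<string> {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    const path = join(dir, entry.name);
    if (entry.isDirectory()) yield* componentFiles(path);
    else if (/\.(tsx|jsx)$/.test(entry.name)) yield path;
  }
}

let total = 0;
let adopted = 0;
for (const file of componentFiles(COMPONENTS_DIR)) {
  total += 1;
  // A component only "counts" if it imports from the one blessed package,
  // so the library decision is baked directly into the measurement.
  if (readFileSync(file, "utf8").includes(DS_PACKAGE)) adopted += 1;
}

console.log(`${adopted}/${total} components use ${DS_PACKAGE}`);
```

Notice that the script has to hard-code `DS_PACKAGE`: the metric isn’t measuring an outcome, it’s checking compliance with a decision we’d already made.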
After some self-reflection, we realized the OKR came out this way because we already knew we wanted to push for a Design System this quarter, but we were struggling to come up with an OKR that justified it. So we framed the OKR too narrowly around the project, and hoped that would be enough justification to move it forward.
At the end of the day, there should be multiple projects or initiatives that can move a key result, while the OKR stays specific about the outcome we actually want to measure.
After several rounds of feedback from a few staff+ folks on the team, we reworked the OKR to be more like this:
Key-result: Decrease the bundle size of the application
This version of the OKR was much better because there are many ways to decrease the bundle size of an application, only one of which is going all-in on a Design System[1]. Also, the outcome we ultimately want from this project is a more performant application and reduced cognitive overhead, which can be somewhat proxied through the bundle size of the application[2].
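A nice side effect of a bundle-size KR is that it’s cheap to measure continuously. Here’s a rough sketch of what a CI guardrail could look like; the `dist/main.js` path and the 300 KB budget are placeholder assumptions, not our real setup.

```ts
import { readFileSync } from "node:fs";
import { gzipSync } from "node:zlib";

// Placeholder assumptions: point this at your real build output and pick
// a budget that matches your KR target.
const BUNDLE_PATH = "dist/main.js";
const BUDGET_KB = 300;

// Gzipped size approximates what actually ships over the wire.
const gzippedKb = gzipSync(readFileSync(BUNDLE_PATH)).length / 1024;
console.log(`gzipped bundle: ${gzippedKb.toFixed(1)} KB (budget: ${BUDGET_KB} KB)`);

if (gzippedKb > BUDGET_KB) {
  console.error("Bundle exceeds budget: the bundle-size KR just regressed.");
  process.exit(1);
}
```

The nice part is that any project, adopting the Design System, code-splitting, or dropping a heavy dependency, moves this same number, which is exactly the freedom we wanted the KR to give the team.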
Once we presented this Objective the second time around, the team buzzed much more excitedly about the possibilities for driving this metric down. It was incredibly interesting to see the before and after of this OKR: going from a narrowly-defined one to one that gave folks more freedom to pursue the projects they were interested in.
As a final note, these OKRs were specific to the engineering team, and OKRs that drive improvements to engineering systems can differ in nature from more general OKRs that drive business outcomes. But I think a lot of the principles from this experience apply more broadly, namely:
- OKRs should inspire creativity.
- Projects don’t make the OKR; the OKR should make the projects.
- I will say that metrics are made to be gamed, and some common sense is still required. Again, it’s super hard to come up with a good OKR: the same KR can justify a project under one reading and something completely different under another. One teammate pointed out that if this KR were taken too literally, we should drop the design system altogether and use vanilla HTML + CSS, since that would save the most on bundle size.
- Some of you reading this will be like “wtf is he on?”, and I agree that this is a pretty big stretch, but some things are just really hard to measure, and you have to proxy them through other metrics, cognitive overhead being one of them. Some folks on the team mentioned that metrics like cycle time are better for justifying why a Design System is useful, but I’m not sure a noisy metric like cycle time beats one that can be measured accurately. And when it comes to accurately measuring cognitive overhead, I’m not yet on board with buying EEGs for everyone to measure it more precisely, let alone mandating that folks wear them throughout the work day.