- Avoid "big bang" changes such as rewriting in a new language or framework;
- Make incremental improvements using techniques like "inline" refactorings and spike stories;
- Use reflective techniques like retrospectives and root-cause analyses to steer the improvement process.
The Big Bang Trap
Developers always want to rewrite everything
- Sach Kukadia, co-founder, SecretSales
Sach's observation is not far off the mark. As you start to grow your engineering team, expect to hear complaints like "This code is as tangled as spaghetti!" or "No one knows how that bit works, so we can't change it", and consequent demands to throw out the whole thing and start over. The developers are likely to want to switch technologies at the same time, adopting a new tool or framework that they are sure is a better fit for what users need the software to do.
We don't have to think too hard to see why this might be. Your path to the product-market fit you now have was hardly straight, was it? If you're like most startups, you've pivoted more than once and substantially changed your original vision - so it's no surprise that your code has had to be twisted and mangled to keep pace, and that the libraries and technologies you chose at the start are no longer a perfect match for the job the software is doing. There are techniques for keeping the design clean and the software maintainable during such changes, but I've yet to meet a startup able to make the investment of money and time to use those methods consistently during the initial growth phase and emerge with clean, fit-for-purpose software.
No matter how tempting it is, resist the siren song of the full rewrite. Decades of experience show that code overhauls of this kind are likely to be late, buggy, and expensive, if they are finished at all. Not only will developers encounter surprising behaviour and bugs that they'll have to carefully repair or preserve in the new system, they will also have to build every new feature twice - once in the old software for live users, and once in the not-yet-live new application. The benefits are very unlikely to outweigh the costs, so steer clear! (See Joel Spolsky's classic article "Things You Should Never Do".)
Instead of a risky rewrite, aim for incremental, individually-released improvements - that is, changes that are, by themselves, very small, but together add up to a big step forward.
For instance, when you "refactor", you carry out "a series of small behaviour-preserving transformations" in your code (see Martin Fowler's article and book on the subject). A good refactoring can take just a few minutes and have a big effect on maintainability, so it's a great way to move safely and incrementally to improved software.
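As a minimal sketch of what "small and behaviour-preserving" can look like, the Python below applies two "extract function" steps to a made-up order summary routine. The function names, the item format, and the discount rule are all hypothetical and chosen only for illustration; they do not come from any real codebase.

```python
# Before: one function mixes validation, pricing, and formatting.
def order_summary_before(items, discount_rate):
    total = 0.0
    for name, price, qty in items:
        if qty <= 0:
            raise ValueError(f"bad quantity for {name}")
        total += price * qty
    total = total - total * discount_rate
    return f"Total due: {total:.2f}"


# After: the same logic, split by two small "extract function" steps.
def subtotal(items):
    """Validate quantities and add up price * quantity for each item."""
    total = 0.0
    for name, price, qty in items:
        if qty <= 0:
            raise ValueError(f"bad quantity for {name}")
        total += price * qty
    return total


def apply_discount(amount, discount_rate):
    """Reduce the amount by the given fractional discount rate."""
    return amount - amount * discount_rate


def order_summary_after(items, discount_rate):
    return f"Total due: {apply_discount(subtotal(items), discount_rate):.2f}"


# A quick check that behaviour is unchanged for a sample basket.
basket = [("pen", 1.50, 4), ("book", 12.00, 1)]
assert order_summary_before(basket, 0.1) == order_summary_after(basket, 0.1)
```

Each step is small enough to release on its own, and the final assertion is the kind of check that shows the behaviour has not changed.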
But beware - developers sometimes forget to keep their refactorings small, and instead embark on days or weeks of substantial changes. Since refactoring, by its nature, doesn't change what the software does, users see zero progress during that time, and those not working on the new code can't measure what's happening or provide feedback.
Instead, keep refactoring "inline" by tying each such change to an existing business story. For instance, at an e-commerce company where I was CTO, we wanted to move to a software framework called Symfony for greater safety and development speed. So whenever we worked on a user-requested change to a component, such as the payment page or the returns form, we spent a short time moving just that component into the Symfony framework. This made each task a little longer, but kept us tightly focussed on the key pages that users needed. A year later, we had all the most-used components in Symfony and achieved the safety and speed gains we wanted, without noticeable delay to progress on business-visible goals.
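One way to picture this component-by-component approach is a router that sends each request to the new code only once that component has been moved, and to the legacy code otherwise. The Python sketch below rests on assumptions throughout: `MigratingRouter`, the handler functions, and the paths are hypothetical names invented for illustration. It is not Symfony's API, nor a description of how the migration above was actually implemented.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


def legacy_handler(request: dict) -> str:
    """Stand-in for the old application, which can serve every page."""
    return f"legacy:{request['path']}"


def new_payment_page(request: dict) -> str:
    """Stand-in for a component already moved to the new framework."""
    return f"new:{request['path']}"


class MigratingRouter:
    """Route migrated paths to new handlers; everything else stays legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, path: str, handler: Handler) -> None:
        # Called once per component, as part of a business story touching it.
        self._migrated[path] = handler

    def handle(self, request: dict) -> str:
        handler = self._migrated.get(request["path"], self._legacy)
        return handler(request)


router = MigratingRouter(legacy_handler)
router.migrate("/payment", new_payment_page)  # moved during a payment story
```

With this shape, each story moves one entry from the legacy side to the new side, and unmigrated pages such as a hypothetical `/returns` keep working untouched until a business story reaches them.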
Another incremental technique is "spiking". When a developer works on a spike story, he or she tries out something new - a coding method, a third-party technology, or something else of interest - for a fixed period, say 1-2 days. The goal is not to produce a usable change; in fact, the developer often discards any code written during the spike. Instead, the spike helps the developer understand the benefits, costs, and limitations of the new method or technology. If it's not right for the team, you can discard it and move on to other options. But if the spike proves the new idea is valuable, you can start adopting it with inline refactorings or another technique.
Just like refactorings, the key here is keeping the spike duration very short - resist the temptation to keep exploring in spike mode. If the spike is showing early promise after a day or two, convert it into a business-led story and prioritise it as you would anything else.
Pick the Right Path Through Reflection
How do you know which refactorings, spikes, and other initiatives will move you toward more sustainable development? Use reflective practices to look back at what works and what doesn't in light of your overall business goals. Two such practices that I find really helpful are retrospectives and root-cause analyses.
Hold a retrospective meeting every week or two - wait longer, and the team forgets what happened since the last one.
- Invite everyone who works with the technology: developers, QA, product managers, designers, and internal customers. (This could be most or all of the company if you're small and tech-driven.)
- Start by reviewing important recent events on a visible timeline, then ask team members to put up sticky notes or cards on the wall indicating what went well over that time and what needs fixing.
- Group these together - there will likely be plenty of overlap - and help the team to pick several to address. After discussing each one, agree at least one action that will either keep you doing what works ("Annie to show others the new mocking framework") or fix what doesn't ("Bob to refactor to an MVC pattern for the login page feature next week").
- Assign each action to a single person in the room, who can get help from others if needed, and check up to be sure they are done by the next retrospective.
This is a simple retrospective recipe - for lots more, see the book Agile Retrospectives.
When something goes badly wrong, such as a system outage or an important missed deadline, hold a root-cause analysis with all the affected parties in the room. This involves asking "why" at least five times for each of the impacts of the failure, avoiding individual blame or castigation, and agreeing actions for each. The best actions are those that address cultural or training issues ("Carl to hold a lunchtime session on common security mistakes"), as they have the widest effect and the greatest chance of preventing many future errors. Eric Ries goes into more detail on how to run such a session in his "five whys" post.