This is a repost of a series of articles I originally published for Songbird.
Drowning in the Waterfall
Up until version 0.3, Songbird development had been following a fairly traditional waterfall model. Realizing the ambitious vision of building both a platform and a desktop media player had presented many challenges. A lot of plumbing infrastructure was needed before any features could be created.
Faced with that challenge, the engineering team did what engineers like to do best: they designed a very comprehensive system, planned for it very carefully and started cranking out code.
During the planning phase, the team estimated the work to the best of their abilities and a Gantt chart was created to reflect identified dependencies and track progress. Unfortunately, this approach led to lengthy release cycles (10-12 months) with lots of room for scope creep. By the time a release was finally completed, a lot of good work had been accomplished (over 1,200 issues were addressed in 0.3 alone), but the lack of visibility was problematic and the slow pace of releases was demoralizing.
My challenge as the new VP of Engineering was to fix this.
I needed to make the team recognize that the schedule was built on the false assumption that we knew everything upfront. There was a sentiment amongst the team that the whole Gantt thing was too removed from the actual work. “Things are going ok, we’re kind of tracking it” was not an acceptable way to run this project. We had to accept that our planning and scheduling practices were broken.
Besides the effect on morale, a long development cycle presented the following problems:
- All or nothing. Nothing could be released until everything was completed and put back together.
- No visibility. The fundamental “When can we ship?” question was hard to answer due to lack of visibility on real progress.
- Big ugly merges. The last stable release branch and trunk drifted apart significantly, which was especially problematic when supporting commercial partners.
We also felt that our task estimates were not granular enough and that the built-in slack did not take other work into account. Infrequent builds also meant a lack of visibility into product quality.
That kind of development approach also fosters poor engineering and product development habits, such as:
- Reduced sense of urgency to release code. Discipline drops, unit test failures increase, bugs pile up. A “we have enough time to fix this” mentality develops.
- The eventual release becomes a huge effort because no routine around releasing code ever forms.
- No continuity in QA, bugs accumulate in a backlog.
- No context for new work, so there is a tendency to add more things to the existing release, also known as feature creep.
Becoming more Agile
I led the team to adopt a new approach to development that would address those issues. These are the objectives we tried to fulfill:
- Satisfy our customers (end-users, developers, partners) through early and continuous delivery of valuable and innovative software.
- Provide the ability to react quickly to business changes.
- Reduce product defects and security exposures.
- Ensure team and product sustainability. Focus on ease of product maintenance, allow the team to maintain a constant pace indefinitely and be resilient to turnover.
- Provide good visibility into progress and release date.
I felt that we had a good foundation of existing practices we could build upon. Build automation, continuous integration, unit testing, a peer review process for code commits, automated API documentation and the use of Bugzilla as an issue tracking system were already well established within the team. We needed to augment those with a few others borrowed from Agile methodologies.
We added the following set of guiding principles:
- Reduce release length and maintain a releasable product at all times.
- Working software is delivered frequently (weeks rather than months) and is the principal measure of progress.
- Provide a rapid feedback loop to QA and the Product Manager.
- Develop a system for better estimation and tracking.
Reduce Release Length
We committed to releasing a major update of Songbird every other month [1]. We opted for calendar-driven releases with a short development cycle of 4 to 6 weeks of coding at most. We adopted a naming scheme for each release train drawn from a music-artist theme. Sequencing those release trains alphabetically makes it easy to see the order of releases (the Eno release came after Dokken, for instance).
Measure Progress Against Working Code
Within a release cycle, we introduced one-week iterations, which we used to plan work and to measure progress against. This gave us the context to improve our estimation and tracking system.
Develop a System for Better Estimation and Tracking
Estimating how long software development takes is one of the most difficult things to do.
We created a high level roadmap that captures the scope and sequencing of each release.
We came up with a set of lightweight artifacts that would help us represent and track what needs to be done.
- Feature [2] - High level product feature (bullet point on product box, if we had one)
- Story - Smallest increment of implementable end-user value
- Task - Engineering implementation detail, chore, etc.
- Bug - Defect in existing functionality
In support of that, we also created design docs on our wiki, with wireframes and accompanying notes that explain each feature.
For each story, task and bug, we introduced a costing system based on points. The scale starts at 1 point, for something easy that an engineer can do in a day. Two points means it will take some time to think it through and write the code. Three-pointers are reserved for the most difficult issues, those requiring research or lots of code. The idea is to come up with a normalized unit of work across the team that we can measure ourselves against. Moving from time-based estimates to point-based ones is not easy. For one, the scale is not linear (e.g. 3-pointers don’t take 3 times as long as 1-pointers). However, the goal is to simplify the cost estimation process to a level that is good enough to measure and forecast with. While it may be a blunt instrument, there are diminishing returns in spending more time refining and tracking estimates at a much more precise level.
The key benefit from developing a consistent costing practice is the ability to measure progress and use it to forecast future work. Enter Velocity. By computing how many points get completed by the team over an iteration, we were able to normalize the output in the form of a velocity metric. This gives us a sense of how much stuff we can accomplish in a typical cycle.
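To make the velocity idea concrete, here is a minimal sketch in Python. It is an illustration only, not the tooling we used: the point totals, the function names and the choice of a simple average are all assumptions made for the example.

```python
import math
from statistics import mean

# Hypothetical history: points completed in each past one-week iteration.
completed_points = [14, 11, 16, 13]


def velocity(history):
    """Average number of points the team completed per iteration."""
    return mean(history)


def iterations_needed(remaining_points, history):
    """Forecast how many iterations the remaining scope needs at the current velocity."""
    v = velocity(history)
    if v <= 0:
        raise ValueError("velocity must be positive to forecast")
    # Round up: a partially used iteration still occupies a slot on the calendar.
    return math.ceil(remaining_points / v)


if __name__ == "__main__":
    print(velocity(completed_points))               # 13.5
    print(iterations_needed(40, completed_points))  # 3
```

With a calendar-driven release, the same number can be read the other way around: velocity multiplied by the iterations left gives a rough ceiling on how many points still fit in the release.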
To track our progress, we set up two iteration meetings: one to review and plan the iteration, and the other to steer the iteration at its mid-point.
All the artifacts and measurements are tied together on a Release Plan published on our internal wiki.
Another important factor is the acceptance and validation of work. In order for points to be counted in an iteration, there needs to be an agreement on when things are done. Nightly builds are used for verification every day. On the bug front, QA validates engineering fixes and marks them validated. Similarly, the Product Manager accepts stories as completed when they are satisfied with the implementation of a feature.
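The rule that points only count once work is signed off can be sketched as follows. The status names, the `Item` type and the treatment of tasks (counted once marked "done") are assumptions for illustration; they are not Bugzilla fields or our actual workflow states.

```python
from dataclasses import dataclass


@dataclass
class Item:
    kind: str    # "story", "task" or "bug"
    points: int  # 1, 2 or 3
    status: str  # hypothetical workflow state


# Assumed sign-off state per artifact: the Product Manager accepts stories,
# QA validates bug fixes. The "done" state for tasks is an assumption.
SIGNED_OFF = {"story": "accepted", "bug": "validated", "task": "done"}


def counted_points(items):
    """Sum points only for items that reached their sign-off state."""
    return sum(i.points for i in items if i.status == SIGNED_OFF[i.kind])


iteration = [
    Item("story", 3, "accepted"),
    Item("story", 2, "implemented"),  # coded but not yet accepted: not counted
    Item("bug", 1, "validated"),
    Item("bug", 2, "fixed"),          # fixed but not yet validated: not counted
    Item("task", 1, "done"),
]

if __name__ == "__main__":
    print(counted_points(iteration))  # 5
```

Counting this way keeps velocity honest: work that is coded but not yet accepted or validated rolls into the iteration in which it is actually signed off.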
Putting It to Practice
With all these new practices in place, we were eager to tackle our first Agile release, known as “Cher”. The next post takes a look at the roller coaster of a real-life release cycle.
1. Far from the daily continuous delivery pace that can be achieved with cloud applications. Keep in mind this is a very large desktop application running on three OSes, with an enormous number of device and library dependencies.