The Art of Right-Sizing Stories
breaking down to things we can reason about
Getting story sizes right is one of the most important skills of a successful team. Too much decomposition risks losing the narrative and getting side-tracked from a focus on the value stream, while too little decomposition leaves the work at risk of going in the wrong direction or getting blocked. The essence of decomposition is breaking things down to components of work that we can reason about.
What's at Stake: Over-confidence vs. Under-confidence
Over-estimating is an expression of under-confidence. One of the consequences of under-confidence is that you risk engendering a lack of urgency, chronicled in Goldratt’s Student Syndrome. Related is the tendency for the task to fill the available time, as explained in Parkinson’s Law. Managers most fear the Golem Effect, where low expectations foster a decrease in performance. If the risks of over-estimating sound bad, consider that the consequences of under-estimating are much more severe.
Under-estimating is an expression of over-confidence. When we under-estimate, the debilitating knock-on effects on the development process can include:
- Reduced effectiveness of the project plan
- Reduced chance of meeting stakeholder expectations
- Increased risk of compromising the technical foundation
You’re also likely to suffer:
- More status meetings
- More re-estimation
- Damage to stakeholder expectations
- Greater exposure to changing requirements
While over-estimating can blunt a team’s effectiveness, under-estimating can wreak havoc on stakeholder confidence. The performance of an otherwise competent development team can be significantly degraded simply by getting decomposition wrong. Good intentions are not enough. We need sound technical skills to follow through on our resolve to right-size our stories.
Easier said than done
Decomposition sounds deceptively simple; I mean, if a guy on a chain gang can do it with a pile of rocks, we should be able to do it too, right? Except that we’re not breaking rocks, we’re sorting conceptually complex statements of work in an abstract domain. It’s deceptive to say that you’re just breaking something big into smaller components because the reality is a bit more complicated than that.
Horizontal vs. Vertical
A horizontal breakdown is by functional components, such as class structure, persistence, front end, configuration, and so on. Vertical decomposition separates the value stream into slices, with each slice encompassing all relevant functional components and forming a serviceable increment of the overall objective.
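As a rough illustration of the two breakdowns, the sketch below tags each task with both a value slice and a functional component, then groups the same tasks either way. The task names, the slice names, and the `group_by` helper are all hypothetical, invented only to make the contrast concrete.

```python
from collections import defaultdict

# Hypothetical tasks for a checkout objective: (slice, component, task name).
TASKS = [
    ("pay by card", "front end", "card form"),
    ("pay by card", "persistence", "store payment record"),
    ("pay by card", "configuration", "payment gateway settings"),
    ("email receipt", "front end", "receipt confirmation page"),
    ("email receipt", "persistence", "store receipt"),
]


def group_by(tasks, key_index):
    """Group task names by the field at key_index (0 = slice, 1 = component)."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task[key_index]].append(task[2])
    return dict(groups)


vertical = group_by(TASKS, 0)    # each story is an end-to-end slice
horizontal = group_by(TASKS, 1)  # each story is one functional component

print(vertical)
print(horizontal)
```

In the vertical grouping, every story touches several components and can be demonstrated on its own. In the horizontal grouping, no single story delivers anything a stakeholder can use until the others are done.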
Horizontal decomposition is more accessible because it’s easier for us to imagine the scope. It’s nearly inevitable to break things down this way if you’re in the habit of assigning work to team members by matching functional component with skill specialty; more so if your team is distributed, working across time zones. We’ll look at how to overcome this in a moment.
The real benefit of vertical decomposition is that it keeps the focus on the value stream. Thin slicing functionality lends itself better to discovery since we have as close to an end-to-end feature as possible with each pass. This helps avert the risk of delayed detection of inconsistency between functional areas being implemented in parallel.
Every conversation about how to sort the tasks into slices touches on the business objective because we only know we have a coherent slice when we can see the value stream through it. This is not the case with horizontal decomposition, where a discussion focused on one component, persistence for example, can wander off into the weeds without anyone noticing that you’re talking about things that, while interesting, don’t contribute to the value stream of the parent story.
What serves as a coherent story in horizontal decomposition often flounders when subjected to slicing based on business objective, because often the vertically decomposed unit of work cuts across multiple stories and even epics. While the entanglements of vertical decomposition may be distressing, the effort has the benefit of mitigating integration risks that would only emerge in a late stage when working the stories horizontally, by functional component.
When the cross-functional capabilities of a team are not strong, we can still adopt vertically sliced stories and push the functional-level breakdown down to the task level rather than the story level. This requires more coordination but has the benefit of building cross-functional strength in the process.
Backlog Refinement vs Sprint Planning
How far to go in decomposition in backlog refinement is often unclear, leading to ambiguity and inconsistent planning, so we should address that question before getting much further into how to go about decomposition. The goal of backlog refinement is prioritization and decomposition.
The business objective, independent of a solution, is the boundary to observe in backlog refinement. Keep the focus on the value stream and steer clear of a discussion of technical implementation. Inevitably, technical considerations need to be discussed, and implementation details creep into the conversation. Getting good at shepherding the conversation back to the business objective is a big part of what makes for successful backlog refinement.
Generally, it is a mistake to try to size the stories for work in backlog refinement; that’s why tee-shirt sizing is common practice. In the section on “Methods of Measurement” we discussed the Cockburn ordinal scale as a way around the common perils of tee-shirt sizing. Using the right scale goes a long way towards keeping backlog refinement on track.
Whether we’re scheduling work with Scrum, Kanban or a bespoke methodology, we’re going to have regular activities that correspond to Scrum’s Sprint Planning. For Kanban, the Ready-for-Work queue replenishment meeting is the same as a Scrum Sprint Planning meeting, except that in Kanban you don’t have to be concerned with batching a 2-week allocation for each team member; you need to batch enough work to fill the buffer. The smaller scope of the discussion of filling the Kanban On-Deck buffer helps when it comes to our decomposition effort. All the time that would be spent trying to get the Sprint allocation right can instead be focused on getting the composition of the stories right; generally a better value.
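To make "batch enough work to fill the buffer" concrete, here is a minimal sketch of queue replenishment. The `replenish` function, the buffer size of 3, and the story names are assumptions for illustration; a real team sets its own buffer limit and pulls from its own prioritized backlog.

```python
from collections import deque


def replenish(ready_queue, backlog, buffer_size):
    """Pull stories from the front of the prioritized backlog until the
    ready queue holds buffer_size stories or the backlog runs out."""
    pulled = []
    while len(ready_queue) < buffer_size and backlog:
        story = backlog.popleft()
        ready_queue.append(story)
        pulled.append(story)
    return pulled


# Hypothetical state going into the replenishment meeting.
backlog = deque(["story A", "story B", "story C", "story D"])  # highest priority first
ready = ["story X"]  # one story still waiting from the last meeting

pulled = replenish(ready, backlog, buffer_size=3)
print(pulled)  # ['story A', 'story B']
```

The meeting only has to reason about the stories in `pulled`, two in this case, which is where the extra attention to decomposition can go.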
Unlike backlog refinement, technical implementation details are the subject of the work planning meeting (Sprint Planning), and so this is the time for the comprehensive evaluation of decomposition.
Sprint Planning vs. Technical Planning
Whether we decompose horizontally or vertically, we still need to get the size right. The most common criterion people use for gauging size is the estimate of how long it will take to implement, which is a quantitative estimate. Guessing effort as a means of determining if you have the right level of decomposition is effectively short-circuiting the evaluation, substituting a guess in place of what should be an evaluation of specific attributes of the story. In a moment, we’ll discuss using a qualitative approach for decomposition at this planning stage.
A quantitative estimate of effort is harder for vertically decomposed stories because it cuts across functional components. The fact that effort estimation of horizontally decomposed stories is easier is deceptive because it avoids the need to tackle integration questions, which often turn out to be the hardest part of the work.
Leave the effort estimation to the next stage, and let’s look at what we can realistically know about a story at this stage, after backlog refinement, and before technical planning.
Qualitative vs. Quantitative Estimation
If we’re going to leave effort estimation till the technical planning meeting, how do we measure the size of a story in the Queue Replenishment or Sprint Planning meeting? When we understand the various Methods of Measurement, then we know we have more tools in our kit than just time-based effort estimation. Nominal scales give us a way to measure specific qualities of stories. We can adopt a set of qualities as a screen for story decomposition, without having to resort to time-based effort estimations, which require more technical considerations than we have access to at this stage.
Scrum practice specifies a set of attributes precisely for this purpose of qualitative assessment of stories in the planning stage which goes by the mnemonic “INVEST”: is the story Independent, Negotiable, Valuable, Estimable, Small and Testable? Let’s work through what we mean by each of those attributes, to see if they help us understand if we have the right level of decomposition.
Evaluating INVEST Attributes as Qualitative Estimation
The INVEST attributes are qualities of the work being evaluated which together constitute a sort of screen through which stories must pass to go into a work queue.
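One way to picture the screen is as a set of yes/no answers, one per attribute, with a story queued only when all six are yes. The sketch below is an illustration under that assumption; the `InvestCheck` class and its helper functions are hypothetical names, not part of Scrum or of any tool.

```python
from dataclasses import dataclass, fields


@dataclass
class InvestCheck:
    """One yes/no answer per INVEST attribute (a nominal scale, not an effort estimate)."""
    independent: bool
    negotiable: bool
    valuable: bool
    estimable: bool
    small: bool
    testable: bool


def failed_attributes(check):
    """Return the names of the attributes the story does not yet satisfy."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def passes_screen(check):
    """A story is ready to queue only when every attribute is answered yes."""
    return not failed_attributes(check)


# Hypothetical story: well understood, but entangled with other work.
check = InvestCheck(independent=False, negotiable=True, valuable=True,
                    estimable=True, small=True, testable=True)
print(passes_screen(check))      # False
print(failed_attributes(check))  # ['independent']
```

The list of failed attributes is the useful output: it tells the team which conversation to have before the story goes back through the screen.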
We want to make sure that our stories are not entangled with other work in ways that might unnecessarily complicate getting the work done. We’ll want to enumerate dependencies and evaluate whether any of those can reasonably be broken out separately. In the planning stage, we can delve into implementation, and explore how various approaches play out in terms of separation of concerns and the prospect of the work getting snarled up in related issues.
The quality of being negotiable indicates that the business objective of the story hasn’t been lost or pinned down in the technical specification. If everyone seems to understand the work and agree on the value and viability of the story, but it is in the form of a statement of what to do, rather than what objective must be met, then we’ve lost negotiability. If unforeseen dependencies or technology issues emerge when it goes into work, then it is more likely to become blocked. Making sure that the statement of work is primarily a statement of what the business needs out of the effort leaves room for the development team to renegotiate the proposed implementation, so long as the desired outcome is met.
The principle of negotiability is essential to Scrum because the work is constrained to a non-negotiable time-box. If the development team can meet the essential goal within the Sprint, then that is generally a better outcome than rolling the work into the next sprint, so that a predetermined, specific implementation of the story can be delivered.
The value of the story was already established in backlog refinement when the work was prioritized and promoted to the planning stage. Still, circumstances change. Developments in the business, deployment environment or technology may change the relative value of a story. We should ask this question at every gate, and take the opportunity to drop any work that doesn’t pass the value test before it goes into work.
It seems common for people to confuse the qualitative question, “can we estimate it?”, with actually estimating the work in terms of effort. The very fact that we can answer yes to each of the INVEST attributes for a story is a form of estimation. At this stage, we’re just asking: is the amount of uncertainty you have about the story so great that it should be further refined before queuing it for work? The effort estimation, if you choose to do that, would happen as an outcome of technical planning, just before the story goes into work.
Asking if a story is small is effectively an extension of asking if it’s estimable. We’ve already looked at the question of size in backlog refinement, and we’re not going to attempt the effort estimate until technical planning, so we should still be on some ordinal scale evaluation.
Evaluating if a story is testable is simply a time to stop and consider if there is something particular about this story that is unique concerning verification. For most stories, testability is not a problem. Whether you actually test it is another question, but here we’re just considering if we’d know how to approach it. If a story interacts with a 3rd party service, then it may not be clear how you’d verify that the interactions perform as expected; you might add a specification to build a mock of the service to test against, then you could induce failures on the service side that would be hard to do otherwise.
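As a small sketch of what testing against a mock of a 3rd party service can look like, the example below uses Python’s standard `unittest.mock` to stand in for the service and to induce a timeout on demand. The `fetch_rate` function and the `get_rate` method are hypothetical, made up for illustration; only `Mock`, `return_value` and `side_effect` come from the standard library.

```python
from unittest.mock import Mock


def fetch_rate(client, currency):
    """Return the exchange rate from the service, or None if the call times out."""
    try:
        return client.get_rate(currency)
    except TimeoutError:
        return None


# Stand-in for the third-party service; no network access is needed.
service = Mock()

# Normal case: the mock returns a canned value.
service.get_rate.return_value = 1.25
assert fetch_rate(service, "EUR") == 1.25

# Induced failure: make the "service" time out on demand.
service.get_rate.side_effect = TimeoutError("simulated outage")
assert fetch_rate(service, "EUR") is None
```

If the team can sketch something like this during planning, the story is testable in the sense that matters here: we know how we would approach verification, including the failure paths.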
Decomposition is tough work. It’s not merely a question of breaking work down to smaller chunks, but organizing how we approach the architecture, how we work together as a team and agreeing on how to structure our planning meetings. In theory, the time we spend getting to the right decomposition will be more than paid back in time saved in implementation and delivery.
Let's agree to define productivity in terms of throughput. We can debate the meaning of productivity in terms of additional measurements of the business value of delivered work, but as Eliyahu Goldratt pointed out in his critique of the Balanced Scorecard, there is a virtue in simplicity. Throughput doesn’t answer all our questions about business value, but it is a sufficient metric for the context of evaluating the relationship of practices with productivity.