Methods of Measurement
When we think of estimation, we mostly think of how much time or money will be needed. But we commonly use other scales to help us estimate, and it is important to know when and how to use each of them.
A Taxonomy of Measurement Methods
In 1946, the American psychologist Stanley Smith Stevens published a paper called “On the Theory of Scales of Measurement,” in which he defined a taxonomy of measurement methods. Stevens’s vocabulary of measurement scales includes Nominal, Ordinal, Interval, and Ratio scales (Ratio scales are called Rational scales in this article). This is not the only taxonomy of measurement methods, but it is useful for describing what we need to pay attention to in order to understand how the different measurement scales play out in estimation scenarios.
Rational scales are ordered sets that support addition, subtraction, multiplication, and division. Time and money are rational scales you use every day. You can roll up a group of time estimates to a single value, and know that 2 times that value is twice as long.
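As a minimal sketch of what rolling up means on a rational scale, the Python snippet below sums a few time estimates and then multiplies and divides the result. The task names and durations are made up for illustration.

```python
from datetime import timedelta

# Hypothetical task estimates; names and durations are illustrative only.
estimates = {
    "design": timedelta(hours=6),
    "build": timedelta(hours=16),
    "test": timedelta(hours=10),
}

# Addition is meaningful on a rational scale, so estimates roll up to a total.
total = sum(estimates.values(), timedelta())

# Multiplication and division are meaningful too.
doubled = 2 * total                        # twice the total really is twice as long
build_share = estimates["build"] / total   # timedelta / timedelta yields a float

print(total)
print(doubled)
print(build_share)
```

Every one of these operations produces a value that still means something in hours, which is what makes the scale safe for totals and comparisons of magnitude.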
Interval scales are ordered sets like rational scales, except that there is no fixed zero. On the Celsius temperature scale, for example, zero does not indicate an absence of temperature; it is effectively an arbitrary point on the scale. Differences between values are meaningful, but ratios are not. Scrum Story Points rely on the properties of an interval scale to separate time from effort, although teams using story points often make a mess of things by ignoring the behaviors and limitations of the scale that they’re using.
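The Celsius example can be worked through in a few lines of Python. This is only an illustrative sketch: the two readings are invented, and the helper name celsius_to_kelvin is an assumption of the example. It shows that a difference in Celsius survives a change of zero point, while a ratio does not.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Shift the zero point to absolute zero (Kelvin has a fixed zero)."""
    return celsius + 273.15


# Two invented temperature readings in degrees Celsius.
morning_c, noon_c = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference_c = noon_c - morning_c
difference_k = celsius_to_kelvin(noon_c) - celsius_to_kelvin(morning_c)

# Ratios are not: the Celsius "ratio" depends on where zero happens to sit.
naive_ratio = noon_c / morning_c
kelvin_ratio = celsius_to_kelvin(noon_c) / celsius_to_kelvin(morning_c)

print(difference_c, round(difference_k, 6))
print(naive_ratio, round(kelvin_ratio, 4))
```

Noon is 10 degrees warmer than morning on either scale, but it is not "twice as warm": the ratio is 2.0 in Celsius and roughly 1.035 in Kelvin.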
Ordinal scales are ordered sets, but you don’t know how much more one value is than another. The Mohs scale of mineral hardness tells us that one rock is harder than another, but not by how much. You can say granite is harder than sandstone, but the scale says nothing about how much harder.
Consider the 5-star rating system: is watching five one-star movies the same amount of fun as watching one five-star movie? Ordinal values do not roll up.
T-shirt sizes are an ordinal scale that is often conflated with a rational scale: sizes are assigned rational values and rolled up into totals. As the old saying goes, “Neither fish nor fowl nor good red herring.”
Mixing Ordinal and Rational scales ignores the properties of measurement scales and introduces ambiguity.
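One way to keep an ordinal scale from being treated as a rational one is to model it so that comparison works but arithmetic does not. The sketch below is a hypothetical illustration in Python: the Size enum and its numeric ranks are assumptions of the example, and the ranks exist only to define order.

```python
from enum import Enum
from statistics import median_low


class Size(Enum):
    """T-shirt sizes. The numbers are ranks only, not quantities."""
    S = 1
    M = 2
    L = 3
    XL = 4

    def __lt__(self, other):
        # Ordering is the only relationship an ordinal scale supports.
        if not isinstance(other, Size):
            return NotImplemented
        return self.value < other.value


# Invented sizes for five stories.
sizes = [Size.M, Size.S, Size.XL, Size.M, Size.L]

largest = max(sizes)          # ordering questions are fine
typical = median_low(sizes)   # so is a median, which only needs sorting

# A plain Enum defines no addition, so a roll-up fails loudly.
try:
    Size.S + Size.M
    can_add = True
except TypeError:
    can_add = False

print(largest, typical, can_add)
```

Using a plain Enum rather than IntEnum is the design choice that matters here: IntEnum members behave like integers and would happily add up into a meaningless total.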
The Cockburn Scale
A simple ordinal scale, explicitly developed to avoid this problem, was proposed by Alistair Cockburn. The set members are:
Cherry Pit, Peach Cobbler, Mustang, and Brontosaurus.
The Cockburn ordinal scale is ideal for backlog refinement. Everybody gets Cherry Pit, and nobody in their right mind is going to figure out how many Peach Cobblers fit in a Mustang. There is no ambiguity about Brontosaurus, which is any story too big to go into work. It may seem whimsical, but the Cockburn scale leaves little opportunity for conversion to rational scale time, and it keeps the focus on story decomposition.
The Cockburn Ordinal Scale for Backlog Refinement
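As a rough sketch of how the scale might be used during refinement, the Python snippet below orders a backlog by size and flags every Brontosaurus for decomposition. The story titles and the helper names rank and needs_decomposition are hypothetical.

```python
# The scale, smallest to largest. Position defines order and nothing else.
COCKBURN_SCALE = ("Cherry Pit", "Peach Cobbler", "Mustang", "Brontosaurus")


def rank(size: str) -> int:
    """Position of a size on the scale, usable for sorting only."""
    return COCKBURN_SCALE.index(size)


def needs_decomposition(size: str) -> bool:
    """A Brontosaurus is too big to go into work and must be split."""
    return size == "Brontosaurus"


# Invented backlog: story title -> size agreed in refinement.
backlog = {
    "Fix login typo": "Cherry Pit",
    "Export report as CSV": "Peach Cobbler",
    "Rewrite billing engine": "Brontosaurus",
    "Add audit log": "Mustang",
}

to_split = [story for story, size in backlog.items() if needs_decomposition(size)]
smallest_first = sorted(backlog, key=lambda story: rank(backlog[story]))

print(to_split)
print(smallest_first)
```

Nothing in the sketch sums or averages the sizes; the only outputs are an ordering and a list of stories to break down.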
The OWASP Risk Assessment Scale
Ordinal scales are sometimes misused. OWASP proposes a risk assessment scale of low, medium, and high, a use that respects the ordinal scale principle of not specifying how much larger one set member is than another. Instead of low, medium, and high, we could label the set members Cherry Pit, Peach Cobbler, and Tyrannosaurus Rex. For an evaluation of cybersecurity risk, however, it definitely matters how much bigger one Tyrannosaurus is than another. The choice of measurement scale should support the decisions that matter.
Cherry Pit, Peach Cobbler, Tyrannosaurus Rex: Rethinking the OWASP Ordinal Scale
Nominal scales are unordered sets of attributes. Essentially, you’re testing set membership. Nominals have no quantitative value, so you can’t roll them up: each attribute is either true or false. Thus, nominal scales support qualitative measurements. There are as many nominal scales as there are qualities of things in the universe.
The INVEST Test
A useful nominal scale commonly used by Scrum teams is the acronym “INVEST”, which stands for a set of story attributes: Independent, Negotiable, Valuable, Estimable, Small, and Testable. A story should test positive for each of the INVEST attributes before it moves to technical planning.
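Because a nominal scale is a test of set membership, the INVEST check maps naturally onto set operations. The Python sketch below assumes the usual expansion of the acronym (Independent, Negotiable, Valuable, Estimable, Small, Testable); the function names and the sample story are hypothetical.

```python
# The six INVEST attributes, as an unordered set.
INVEST = frozenset(
    {"Independent", "Negotiable", "Valuable", "Estimable", "Small", "Testable"}
)


def missing_attributes(passed: set) -> list:
    """Attributes the story has not yet tested positive for (sorted for display)."""
    return sorted(INVEST - set(passed))


def ready_for_planning(passed: set) -> bool:
    """True only when the story tests positive for every attribute."""
    return INVEST <= set(passed)


# Invented result of reviewing one story.
story_checks = {"Independent", "Valuable", "Small", "Testable"}

print(ready_for_planning(story_checks))
print(missing_attributes(story_checks))
```

There is no score here, only membership: passing four attributes is not "67% ready", it is simply not ready until the remaining two are satisfied.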
Nominal scales test set membership, so each question is binary: a thing either is or is not a member of the set. However, our certainty is usually something less than binary. We can quantify our uncertainty about binary questions by stating our confidence in a range between 50 and 100 percent. A 50% confidence means you have no idea; it is a coin toss. You can’t be any more uncertain than that.
Setting confidence levels on binary questions helps us identify what new information might increase our understanding of the work, and it makes our uncertainty explicit and persistent for reevaluation.
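One possible way to make these confidence statements comparable is to convert each one into bits of uncertainty using the binary entropy formula from information theory. This conversion is an illustration and not something the article prescribes; the questions, the confidence values, and the function name uncertainty_bits are all hypothetical.

```python
import math


def uncertainty_bits(confidence: float) -> float:
    """Binary entropy of a yes/no answer held with the given confidence.

    0.5 (a coin toss) gives 1 bit, the maximum; 1.0 (certainty) gives 0 bits.
    """
    if not 0.5 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0.5 and 1.0")
    if confidence == 1.0:
        return 0.0  # avoid log2(0) for the certain case
    p = confidence
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Invented confidence levels for three binary questions about one story.
questions = {
    "Is the story testable?": 0.9,
    "Is the story small?": 0.6,
    "Is the story valuable?": 0.5,
}

# The question with the most uncertainty is where new information helps most.
most_uncertain = max(questions, key=lambda q: uncertainty_bits(questions[q]))

for question, confidence in questions.items():
    print(f"{question} {confidence:.0%} -> {uncertainty_bits(confidence):.2f} bits")
print(most_uncertain)
```

Recording the numbers alongside the story keeps the uncertainty visible, so the team can revisit the same questions later and see whether new information actually moved them.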
Measure to Reduce Uncertainty
The reason we take measurements is to reduce uncertainty. When we measure lumber with a tape measure, we may think we’re marking exactly some length, but when we consider that even the duration of a second or the mass of a kilogram is only understood scientifically as an approximation, we can grasp that all measurements are approximations.
Any measurement that leaves us with greater certainty than we had before has some value; we’re interested in those cases where the reduction of uncertainty is of higher value than the effort of taking the measurement. We want to avoid wasting our time on measurements that are more effort than they’re worth or, worse, that actually increase uncertainty.
Different measurement scales have different attributes. When we misuse measurement scales, we are as likely as not to increase uncertainty. Learning the properties of measurement scales is a foundational skill of estimation.