Accurate Estimations

We’re constantly asked to give estimates:

How long will it take?
How much will it cost?

What time should we leave?
How much do you want?

Estimations are information about the unknown. We constantly use this information to make decisions: allocating resources, changing strategies, and choosing partners. But despite all this practice, we’re horrible at accurate estimations.

Why Estimations are Hard

Estimations are hard for both technical and social reasons.

We don’t know what we don’t know. Naive estimators fail to account for surprises. They estimate based on known factors and best-case scenarios. These estimates may be perfectly accurate beforehand, but they’re instantly broken by the first surprise.

Experienced estimators account for this problem by ‘adding some buffer’. But even then, how much should they add? Even knowing that surprises can happen, it’s impossible to know how many will happen or how large their impact will be. Choosing the right amount of buffer is a lot like making the right estimation in the first place. We still don’t know what we don’t know. And adding too much buffer can be as expensive as failing to account for surprise at all.

In addition to this technical problem, there’s a strong social problem. Let’s imagine two common scenarios.

In the first scenario, you’re giving an estimate to your team. You perfectly estimate that a project will take three weeks, but your manager gives you a puzzled look. Your teammate snickers, claiming they could do it in a week, tops. The feeling is that you must be either lazy or incompetent to give such a padded estimation, so you cut your estimate to one week. Your newly shortened estimate is wrong, so you go on to extend the project’s deadline twice in three weeks. Rather than holding you accountable to that one-week estimate, your manager commends you for being able to handle the unforeseen surprises on such a complicated project.

In the other scenario, you’re giving an estimate to a potential client. You perfectly estimate that a project will take three weeks; but your competitor only estimates it’ll take a week. The client signs with your competitor. Since their estimate was wrong, your competitor goes on to extend the deadline twice in three weeks. Even though your estimate was accurate, your competitor’s estimate got them paid.

I’ve been in countless scenarios like this. Sometimes people outright pressure us into shortening our estimations, and sometimes the voice in our heads pushes us to. Either way, giving accurate estimates is both technically hard and socially challenging [1].

Two Types of Estimators

In response to this hard problem, we become systematic underestimators or systematic overestimators.

Underestimators fail to give enough buffer. This strategy has two key benefits. First, it signals (unrealistically) high performance. Like our virtue-signalling teammate, we can underestimate ahead of time, then point at concrete surprises for our eventual underperformance. And like our overpromising competitor, we can underestimate during a bid and do whatever we want after the contract is signed. Second, tight estimates demand efficiency. Underestimators set deadlines that they and their teams must work hard to meet. Underestimation works well when the costs of going over-budget are small. But when those costs are large, underestimations lead to disasters. On the whole, underestimators systematically run the risk of being burnt out, past-deadline, and over-budget.

Overestimators are instead biased towards large buffers. Extreme overestimators might send you articles titled “Estimations are a Scam”, or claim that estimations are simply tools for worker exploitation. Overestimation works when the costs of buffer are low and the costs of going over budget are high. But overestimators are constantly taxed by Parkinson’s Law. Parkinson’s Law is the pattern where projects fill the time and resources they’re allocated, instead of the time and resources they need [2]. Rather than pushing towards peak performance, overestimators systematically move at a bored, leisurely pace. Overestimators are also demotivating. Rather than inspiring the team to reach competitive goals, they disparage those who do. So while underestimators run the risk of their teams burning out, overestimators run the risk of their teams shutting down.

To simplify the hard problem of estimations, we slowly become under- or over-estimators. We reap the systematic rewards and accept the systematic costs. These chosen strategies may work in many contexts. But for any simple strategy, there are worst-case scenarios where those systematic risks blow up. Underestimators blow up when the costs of going over budget skyrocket. Overestimators blow up when the costs of adding extra buffer skyrocket.

This line between underestimators and overestimators forms a classic “spectrum problem”. A spectrum problem occurs when we oversimplify the solution to a given tradeoff. In this case, we’re splitting the spectrum of estimation strategies in half. Underestimators fall on one side of the line, and overestimators fall on the other. With many spectrum problems, the better solution is to cut the spectrum into three pieces, choosing the middle strategy between extremes. In doing so, we acknowledge the costs and rewards of both sides, maximizing the upsides and minimizing the downsides systematically.

Playing Single-Pointed Darts

Imagine a game of darts with very simple rules. I throw my dart first, and you only score by hitting that same exact dart-sized point. This game is very simple, but so difficult that nobody would play it. To make darts playable, we specify scoring ranges. These same ranges are missing from our everyday estimations.

The most common estimate sounds something like: “I’ll have it ready by 5pm.” But this is just like playing single-pointed darts! Imagine that this estimate is perfectly precise: the project can’t be ready one second before or one second after 5pm. While impressive, there’s zero room for error. If we end up sick, or if the project ends up more complex than expected, our estimate becomes instantly wrong.

To account for surprises, we can add buffer. But adding buffer only shifts the dart board over a few inches. “I’ll have it ready by 5pm tomorrow” faces all the same problems as the first estimation. Two days of surprises still makes me wrong. We’re still playing single-pointed darts.

It’s a losing game. And yet we see it played again and again, day after day, project after project.

Playing Darts with Ranges

To enjoy this game of darts, we need to change the rules. Rather than giving a single-pointed estimate, we give two points: one for the best case, and one for the worst case. These two points create a range of targets to hit, just like the game of darts we know and love.

To demonstrate, I can give an example from my personal life. My girlfriend is very punctual, and I’m not. She’s a classic overestimator, giving as much buffer as we can afford. And I’m a classic underestimator, giving the most optimistic estimates. So whenever we have somewhere to be, and she asks me “what time should we be ready?”… it’s a classic estimation problem!

My preference would be to underestimate, and tell her the last minute we can leave without being late. Her preference would be for me to overestimate, telling her the soonest we can leave without being “too early”. But no matter what I say, we face all of the same problems stated above.

Recently, I’ve started giving her two answers. The first answer is “the green time”. The green time tells us when we should leave so that we’re pleasantly early. The second answer is “the red time”. The red time is the last minute we can leave, without definitely being late. If we can be ready by the green time without stress or shortcuts, that’s perfect. Once we pass the green time, we go into the “yellow zone.” The yellow zone is the buffer between early and late. There’s no need to take shortcuts or change plans yet, but we’re getting close. Once we approach the red line, we start discussing the need to take shortcuts, or to start telling others that we may be a little late.

Having a green time, a yellow zone, and a red time transforms our game of single-pointed darts into a proper dartboard, with zones of success. The green time makes my girlfriend happy: she can be ready early without worry. The red time makes me happy: I can fill my buffer time up with other activities. And the yellow zone is a signal for both of us to get focused or to start taking shortcuts [3].
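For the programmers in the audience, here is a minimal Python sketch of the green / yellow / red idea. The function name `zone` and the example times are made up for illustration; it only shows how two points turn a single deadline into three zones.

```python
from datetime import time


def zone(now: time, green: time, red: time) -> str:
    """Classify a moment against a two-point estimate (hypothetical helper)."""
    if green > red:
        raise ValueError("green time must not be later than red time")
    if now <= green:
        return "green"   # pleasantly early, no stress or shortcuts needed
    if now <= red:
        return "yellow"  # inside the buffer: time to get focused
    return "red"         # past the last minute: we're late, tell people


# Example values only: leave by 17:30 to be early, 18:00 at the very latest.
green, red = time(17, 30), time(18, 0)
print(zone(time(17, 15), green, red))  # green
print(zone(time(17, 45), green, red))  # yellow
print(zone(time(18, 10), green, red))  # red
```

A single-pointed estimate is the special case where `green == red`: the yellow zone disappears, and any delay at all lands in the red.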

Aside from the technical benefits, I can feel the difference in enjoyment between playing single-pointed darts vs playing scoring-ranged darts. Having a range makes inherent uncertainty explicit to the group. Adding buffer doesn’t just shift the dartboard a few inches, it expands our range of accuracy. Larger buffers signal more uncertainty and require less precision, while smaller buffers signal more confidence and require more precision. The expression of certainty and confidence isn’t possible to communicate with a single-pointed estimate.

By estimating with two numbers instead of one, we make a wealth of information and strategies available.

Simple Changes

These so-called “confidence intervals”, two numbers instead of one, aren’t complicated. So why are they so absent from everyday life?

Although nothing prevented us from discovering them earlier, the first mention of confidence intervals in scientific literature wasn’t until 1937. And it wasn’t until the 1980s that they were required in scientific journals. So if it took forty years for the most knowledgeable people to apply a simple solution towards the most urgent problems, it’s not surprising that it’s taken the rest of us at least as long. That said, the goal of this essay is to speed up that process.

We can move towards accurate estimations by practicing two simple rules. The first: When giving an estimate, state the best- and worst-case answers. The second: Whenever getting an estimate, ask for the best- and worst-case answers.

“It’ll take two weeks” should raise a red flag. A better answer sounds like: “in the best case, one week; in the worst case, three”. Boom! Now we have a range of days to deliver the project, instead of one. Not only do we have a picture of the optimal scenario, we have an idea for the hidden risks lurking in the future.
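As a rough sketch, a two-number estimate can be written down as a tiny data structure. The class name `RangeEstimate` and its fields are my own invention for this example, and the unit (weeks) is just an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeEstimate:
    """A best-case / worst-case estimate (hypothetical type, unit: weeks)."""
    best: float
    worst: float

    def __post_init__(self) -> None:
        if self.best > self.worst:
            raise ValueError("best case cannot exceed worst case")

    @property
    def spread(self) -> float:
        """Width of the range; a wider spread signals more uncertainty."""
        return self.worst - self.best

    def holds(self, actual: float) -> bool:
        """True if the actual outcome landed inside the estimated range."""
        return self.best <= actual <= self.worst


estimate = RangeEstimate(best=1, worst=3)
print(estimate.spread)      # 2
print(estimate.holds(2.5))  # True
print(estimate.holds(4))    # False
```

A single-number estimate like “two weeks” has a spread of zero, so it only holds if the project takes exactly two weeks.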

As a bonus, we’re capable of using the “green line, yellow zone, red line” imagery to guide our actions [4]. The green line is how we strive for the best and signal our competence. The red line is our insurance policy in case of surprises. The yellow zone is where time is critical, and we need to coordinate in case our plans need to change.

In a world of uncertainty, this extra byte of information is enough to create more accurate thought and action.

Conclusion

In summary, everyday estimations are hard, and confidence intervals are a simple way to make them easier.

Estimations aren’t hard because people are stupid. There are compelling social and technical reasons why we fail to make accurate estimations. Given these reasons, we see two types of estimators. Underestimators love tight estimations, while overestimators love padded estimations. Over time and across scenarios, these estimators are systematically right and systematically wrong.

We can break out of this dichotomy by changing the way we give estimates. By estimating with two numbers instead of one, we open up a world of information and process improvements. This small change in how we think, how we talk, and how we work is a huge, missing leverage point for us to be more effective and more efficient over time.

Footnotes

[1] Though questionable, accurate estimations are worth the effort. First of all, accurate estimations build trust. After a few false promises, people catch on. Like compound interest, commitments founded on years of accrued trust are worth much more than short-term deals based on deceptive estimates. Secondly, inaccurate estimations kick off a vicious cycle of degrading quality. Underestimations lead to insufficient resources, insufficient resources lead to shortcuts, shortcuts lead to poor quality, and poor quality leads to greater consumption of resources. These teams work harder and harder to do less and less. Inversely, overestimations lead to too many resources, too many resources lead to inefficiency, inefficiency leads to overestimations. These teams take longer and longer to do less and less.

[2] Parkinson’s Law is the insight that projects tend to consume the resources given to them, be it time, money, or people. Fear of Parkinson’s Law motivates people to justify tight deadlines. They claim that without tight deadlines, people tend to get lazy or unfocused.

But if Parkinson’s Law were the whole story, we’d never see projects go over budget! Since we see projects go over deadline and budget all the time, adjusting deadlines can’t be the only factor. Instead, keeping deadlines as tight as possible leads to burnout, in addition to all the other costs of underestimating. Though useful for timeboxing projects in general, Parkinson’s Law should not be used as the sole guiding principle for estimations.

[3] In professional projects, the same concepts apply a bit differently. The green time is the earliest a project can be ready, satisfying the underestimators; and the red time would be the latest a project can be ready, satisfying the overestimators. Underestimators can still push for the green time, while overestimators still find peace of mind in knowing the red time exists.

[4] “One week, give or take two days” is a common way to convey a similar amount of information. I focused on the “best case / worst case” framing, because it opens the door to the green / yellow / red style of project management. That said, two answers are always better than one, so if adding a “give or take” is an easier way to make this transition in how we estimate, I’m all for it.

Software Entropy

Defining Entropy

Entropy is a measure of chaos, or disorder, in a system.

My college physics professor described entropy using two shoe closets.

Imagine a clean shoe closet, where all shoes are paired and sorted by color. The closet’s entropy is the total number of arrangements its shoes can have. A clean closet’s entropy is relatively small. There may be a few pairs of grey or blue shoes that can be switched around – but this doesn’t add much complexity. In a closet with low entropy, it’s easy to add or remove shoes from that closet as needed.

Now imagine a messy shoe closet. None of the shoes are paired, and they’re all tangled in a big pile. How many possible combinations can these shoes be in? You can quickly find out by trying to pull out the pair you want. The messy shoe closet has a much greater entropy than the clean one.

In short, we measure entropy by counting the number of possible states a system can be in. More states mean more entropy.
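To put rough numbers on the two closets, here is a small Python sketch. It rests on simplifying assumptions of my own: in the clean closet, pairs stay together in a fixed color order and only same-colored pairs can swap places, while in the messy closet every individual shoe can sit anywhere in the pile. The closet contents are made up.

```python
from collections import Counter
from math import factorial, prod


def clean_arrangements(pair_colors: list[str]) -> int:
    """Pairs stay together, sorted by color; only same-colored pairs can swap."""
    return prod(factorial(n) for n in Counter(pair_colors).values())


def messy_arrangements(pair_colors: list[str]) -> int:
    """Every individual shoe (two per pair) can sit anywhere in the pile."""
    return factorial(2 * len(pair_colors))


# Ten pairs: three grey, two blue, and five with unique colors.
closet = ["grey", "grey", "grey", "blue", "blue",
          "black", "brown", "red", "white", "green"]
print(clean_arrangements(closet))  # 12
print(messy_arrangements(closet))  # 2432902008176640000
```

Same shoes, same closet; the only difference is how many states we allow them to be in.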

Entropy in Software

In software, our building blocks are simple enough for us to measure entropy in a crude way. Take this model for example:

Transaction(
  createdAt: String,
  buyerId: String,
  sellerId: String,
  amount: Int
)

As simple as it seems, this model is like our messy shoe closet. There are many more ways for this model to be wrong than there are for it to be right. We can see that by comparing it to an organized shoe closet:

Transaction(
  createdAt: DateTime,
  buyerId: UserId,
  sellerId: UserId,
  amount: Price
)

When `createdAt` was an arbitrary string, it could take on invalid values “foo” and “bar” just as easily as a valid value “06-23-2020”. There are many more possible states that the field can be in, and most of them are invalid. This choice of a broad data type allows chaos into our model. This unwanted chaos leads to misunderstandings, bugs, and wasted energy.

When each model is strongly typed to a strict set of values, this chaos is minimized. DateTime, UserId, and Price are typed such that all possible values are valid. Accordingly, these types are more predictable, easier to manipulate, and lead to fewer surprises in practice.
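For a concrete flavor, here is one way the stricter model might look in Python. This is a sketch under my own assumptions: `UserId` is a `NewType` (distinguished only by a static type checker, not at runtime), and `Price` is a hypothetical wrapper that stores non-negative whole cents.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import NewType

# Hypothetical: a distinct name for user identifiers, checked statically.
UserId = NewType("UserId", str)


@dataclass(frozen=True)
class Price:
    cents: int  # assumption: amounts are whole, non-negative cents

    def __post_init__(self) -> None:
        if self.cents < 0:
            raise ValueError("price cannot be negative")


@dataclass(frozen=True)
class Transaction:
    created_at: datetime
    buyer_id: UserId
    seller_id: UserId
    amount: Price


tx = Transaction(
    created_at=datetime(2020, 6, 23, tzinfo=timezone.utc),
    buyer_id=UserId("u-1"),
    seller_id=UserId("u-2"),
    amount=Price(cents=1999),
)
print(tx.amount.cents)  # 1999
```

With these types, a nonsense date such as `datetime(2020, 13, 1)` or a negative `Price` raises a `ValueError` at construction, so the invalid state never makes it into a `Transaction`.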

As in life, entropy is not all bad – some of it is desirable and some of it is not. In software, we need entropy to a certain extent: our code is valuable because it supports a variety of possible dates, users, and prices. But when this chaos grows beyond the value it adds, our software becomes painful to use and painful to maintain.

Modeling Software Entropy

Given our observations, we can describe a simple rule:

complexity = number of total possible states

A construct with only a few possible states is simple. Booleans and enums are much simpler than strings. A system with one moving piece is much simpler than a system with many moving pieces.
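We can count states directly for small constructs. The sketch below uses a made-up `Status` enum and, to keep the string case finite, assumes a string of at most five lowercase ASCII letters; real strings have far more states than that.

```python
from enum import Enum
from itertools import product


class Status(Enum):  # hypothetical enum, for illustration only
    PENDING = 1
    PAID = 2
    REFUNDED = 3


bool_states = [False, True]
status_states = list(Status)

# A record with one boolean field and one enum field: the states multiply.
record_states = list(product(bool_states, status_states))
print(len(bool_states), len(status_states), len(record_states))  # 2 3 6

# Even a tiny string (0 to 5 lowercase letters) dwarfs both.
string_states = sum(26 ** n for n in range(6))
print(string_states)  # 12356631
```

Every additional field multiplies the total, which is why a system with many moving pieces gets complex so quickly.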

Sometimes, our problems are essentially complex. In these cases, our solutions need some essential complexity to match. But when does complexity stop being essential and become unnecessary? To answer that, we can use another rule:

cleanliness = number of valid possible states / number of total possible states

If there are thousands of total possible states, but only two of them are valid: it’s a messy solution. A simple example of this is representing a boolean value as a string.

if value == "true": do this
else if value == "false": do that
else: throw error

There are many ways for this code to go wrong; not just in execution but also in interpretation. Keeping our solutions clean improves correctness, readability, and maintainability. It’s one of the primary measures of “quality” in my view.
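Here is a rough Python sketch of the cleanliness rule, along with one common remedy: convert the string into a real boolean once, at the edge of the system. The helper names `cleanliness` and `parse_flag` are hypothetical, and the list of string states is a tiny hand-picked sample, not the full space of strings.

```python
def cleanliness(states, is_valid) -> float:
    """Fraction of possible states that are valid (1.0 means none are invalid)."""
    states = list(states)
    return sum(1 for s in states if is_valid(s)) / len(states)


def parse_flag(value: str) -> bool:
    """Convert the stringly-typed value once, at the boundary."""
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"expected 'true' or 'false', got {value!r}")


# A small sample of what a string field could hold.
string_states = ["true", "false", "True", "yes", "1", "", "foo", "bar"]
print(cleanliness(string_states, lambda s: s in {"true", "false"}))  # 0.25

# After parsing, every remaining state is valid.
bool_states = [True, False]
print(cleanliness(bool_states, lambda b: isinstance(b, bool)))  # 1.0

print(parse_flag("true"))  # True
```

Code downstream of `parse_flag` only ever sees `True` or `False`, so the error branch lives in one place instead of at every comparison.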

Minimizing Software Entropy

Given these definitions, we can ask ourselves some questions to guide our software decisions:

  1. How many possible states does this solution have?
  2. How many of those states are invalid?
  3. Is there any way to make the solution simpler, by trimming the number of total possible states?
  4. Is there any way to make the solution cleaner, by trimming the number of invalid possible states?

The power of this concept is that it smoothly scales up and down the ladder of abstraction. It applies to basic data types just as well as it does to solution architecture and product development.

How many moving pieces does our solution need? When an unimaginable requirement flies in and tries to blow our solution to the ground, how many pieces can be left standing? When an unexpected input arrives, do invalid states propagate across the system, or are they contained and eliminated on sight? In short, how clean is our solution?
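One way to picture “contained and eliminated on sight” is a boundary function that separates valid inputs from invalid ones before anything else runs. This is only a sketch: the record shape (a dict with an integer `amount` in cents) and the helper names are assumptions of mine.

```python
def parse_amount(raw: dict) -> int:
    """Return the amount in cents, or raise ValueError on anything invalid."""
    amount = raw.get("amount")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, int) or amount < 0:
        raise ValueError(f"invalid amount: {amount!r}")
    return amount


def split_valid(raw_records: list[dict]) -> tuple[list[int], list[dict]]:
    """Parse what we can; quarantine the rest instead of passing it along."""
    valid, rejected = [], []
    for raw in raw_records:
        try:
            valid.append(parse_amount(raw))
        except ValueError:
            rejected.append(raw)  # contained here, never reaches downstream code
    return valid, rejected


valid, rejected = split_valid(
    [{"amount": 1999}, {"amount": "foo"}, {}, {"amount": -5}]
)
print(valid)          # [1999]
print(len(rejected))  # 3
```

Everything after `split_valid` can assume clean amounts, and the rejected records are kept in one place where they can be logged or reported.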

To make life possible, we utilize chaos by creating complex systems that support a diversity of people and their use cases. To make life predictable, we combat undesirable chaos by keeping those systems as clean and orderly as possible.

In software, we work in a world where chaos is measurable and cleanliness is achievable. We just need the right set of signals and responses to make it happen.



Notes on “Productivity” by Sam Altman

Here’s the original article on Sam’s blog.

“Compound growth gets discussed as a financial concept, but it works in careers as well, and it is magic. A small productivity gain, compounded over 50 years, is worth a lot.”

What are your productivity gains?

  • make a list of what you need to do in a day
  • cut out distractions and focus for as long as you can on tasks that need it
  • break tasks down into the smallest doable chunks
  • arrange the doable chunks into a compelling picture for a project

What you work on

“Picking the right thing to work on is the most important element of productivity and usually almost ignored.”

“I make sure to leave enough time in my schedule to think about what to work on.”

“I learned that I can’t be very productive working on things I don’t care about or don’t like.”

“Everyone else is also most productive when they’re doing what they like, so do what you’d want other people to do for you [to maximize their productivity]. Try to figure out who likes (and is good at) doing what, and delegate that way”

When you work with someone, ask them “what do you like to work on?” You can’t answer this question for a lot of people you currently and previously worked with. You only have ideas, that could be very wrong.

You can go even deeper on this. Before you start a project, understand what each team member’s interests are. Then craft the project with the qualities that maximize your team’s interest. Building something is deeper than just “here are the requirements, build them”. The second- and third-order qualities of a product and the team that builds it determine their success in the long run. Rather than just “what product do we want to build?” – “what kind of product do we want to build?” – and rather than “what team do we want to build?” – “what kind of team do we want to build?”. There needs to be a balance between focus on the first-order results and focus on the second-order results. Trying to milk out as much first-order results as you can in the short term leaves you without a healthy team in the end, and then without a healthy product.

What do you do if you and your coworker like to work on the same stuff? Collaborate and share. Sometimes you get the fun thing, sometimes they do. Make it explicit that you’re sharing – don’t make it implicitly competitive. Learn and teach. Pitch joint ventures you can deliver together.

What do you do if you and your coworker like to work on different stuff? That’s easier, split the work that way. One problem may be that then both of you think that the stuff you like to do is the most important stuff, and that is an interesting problem. Ideally, both of you understand that you complement each other and fill in for each other’s blind spots. Without that balance and appreciation, problems arise.

“If you find yourself not liking what you’re doing for a long period of time, seriously consider a major job change.” Short-term burnout should be resolvable with some time off; otherwise there’s a deeper problem.

“It’s important to learn that you can learn anything you want, and that you can get better quickly.” This is along the lines of a keystone achievement. Some achievements are breakthroughs in what you believe is possible and open up a whole world of possibilities for life. The thing about keystone achievements you’ve seen is that they’re hard to foresee, they come accidentally as the result of overcoming some first-order difficulty.

“Try to be around smart, productive, happy, and positive people that don’t belittle your ambitions. I love being around people who push me and inspire me to be better. To the degree you’re able to, avoid the opposite kind of people.”

“You have to both pick the right problem and do the work. There aren’t many shortcuts.”

Prioritization

“My system has three pillars: get the important shit done, don’t waste time on stupid shit, and make a lot of lists”

“I make lists of what I want to accomplish each year, each month, and each day.” You tried doing this monthly, and only the daily ended up sticking. But this was a 10x improvement in my productivity. Maybe I should try for the monthly again.

“Lists are very focusing, and they help me with multitasking because I don’t have to keep as much in my head. If I’m not in the mood for some particular task, I can always find something else I’m excited to do.” You find this too – as long as you wrote down the task, you will come back to the list and it won’t get lost. That trust is critical in a complex and distracting environment.

“I try to prioritize in a way that generates momentum. The more I get done, the better I feel, and then the more I get done. I like to start and end each day with something I can really make progress on.” The power of positive feedback loops – start your day with the smallest positive feedback loop you can build.

“I am relentless about getting my most important projects done” Imagine the kind of life this statement of self-identification produces.

“I find the best meetings are scheduled for 15-20 minutes, or 2 hours.” This is great.

“I have different times of the day I try to use for different kinds of work. The first few hours of the morning are definitely my most productive time of the day. I try to do meetings in the afternoon. I take a break or switch tasks whenever I feel my attention starting to fade” – you used to focus best in the evenings (because you had trouble shutting out distractions), now you focus best in the mornings and afternoons. I’m not sure exactly yet.

“I don’t think most people value their time enough – I am surprised by the number of people making $100/hr that will spend a couple hours doing something to save them $20”

“productivity porn – chasing productivity for its own sake isn’t helpful” the diminishing returns of recursion

“Sleep seems to be the most important physical factor in productivity for me.” You’ve learned this the hard way as well. And not just in pure performance, but in other factors like emotional stability and enjoyment of work.

“great mattress makes a huge difference. Not eating a lot before sleep helps. Not drinking alcohol helps a lot.”

“I use a full spectrum LED light most mornings for about 10-15 minutes. If you try nothing else on here, this is the thing I’d try.” recommends this one

“Exercise is probably the second most important physical factor” – you see this too, mostly in second order effects

“Eating lots of sugar is the thing that makes me feel worst [and thus least productive]. I don’t have much willpower with sweets, so I mostly just try to keep junk food out of the house” – same with second order effects, but also pure performance in terms of focus

“Here’s what I like in a workspace: natural light, quiet, knowing that I won’t be interrupted if I don’t want to be, long blocks of time, and being comfortable and relaxed”

“Like most people, I sometimes go through periods of a week or two where I have just no motivation to do anything”

“In general, I think it’s good to overcommit a little bit. I find that I generally get done what I take on, and if I have a little too much to do it makes me more efficient at everything.” You’ve learned this recently. Being efficient is the critical skill – valuing your time and learning to earn multiples of the money for the same amount of time.

“Finally, to repeat one more time: productivity in the wrong direction isn’t worth anything at all. Think more about what you work on.” Also from The 4-Hour Workweek – “doing the wrong thing perfectly doesn’t make it the right thing”. Some of the best advice on productivity.