
When I was a developer, I routinely underestimated effort when estimating projects. Again and again I ran into pieces of work that I had not accounted for. My colleagues advised me to multiply the estimate by 2 or 3, or even by Pi, but that did not improve precision; it only created new problems, for example when I had to explain where such a high estimate came from.
Ten years have passed since then. During that time I have taken part in estimating more than 200 projects, learned many lessons the hard way, and now I would like to share my ideas on how to estimate projects.
I hope this article helps you improve the accuracy of your estimates.
Why should we estimate?
Only about 29% of projects are successful (according to The Standish Group's research for 2015). The remaining 71% either failed or went beyond the triple constraint (time, scope, budget).
These statistics suggest that project estimates often do not match reality. Does that mean there is no point in estimating at all? There is even a movement on the internet that rejects estimation altogether and advocates simply writing code, come what may (you can find it under the tag #noestimates).
Not estimating sounds tempting, but imagine for a moment that you are calling a cab. You ask the driver, “How much will it cost to get to this street?”, and he answers, “I have no idea. Get in, and I’ll tell you the price when we arrive.”
Another example, in the Agile style: “We’ll start driving, and you’ll pay me every 10 minutes until you run out of money. When it runs out, you get out of the cab. Perhaps we will have arrived by then, or at least be close enough. And if not, well, that’s your problem. Bad luck.”
IT customers feel much the same when they are asked to start work without any estimate.
In the example above, we ideally want the driver to give us an accurate price for the trip. At the very least, it would be nice to know a price range. Then we can work out whether to take the cab, take a bus, go on foot, or stay at home.
Getting into a car with no idea of what to expect is not a decision anyone would make in their right mind.
I hope I have convinced you that estimation is an important part of decision-making in a project.
An estimate may or may not be close to reality, but it is essential.
The reasons for underestimation
Ignoring the theory of probability
Imagine that a manager asks a developer how much time the latter will need to complete a task. The developer has done similar tasks before and gives the “most probable” estimate. Let’s assume it is 10 days. There is also some probability that the task will take 12 days, but it is lower than the probability of 10 days. There is also some probability that the task will take 8 days, but it is lower as well.
It is often assumed that task and project estimates follow the normal distribution (more detailed information about the normal distribution).
If we plot the possible durations and their probabilities, we get the following picture:

The X axis shows the estimate, and the Y axis shows the probability that the estimate turns out to be exactly right, that is, that the task takes precisely that long (neither more nor less). In the center there is a point with the most probable estimate, which corresponds to our 10 days.
The area under the curve gives a total probability of 100%. It follows that if we give the most probable estimate, we will finish the task or project on time or earlier with a probability of 50% (the area under the curve to the left of the 10-day estimate is half of the total and equals 50%). So by following this principle and giving the most probable estimate, we will miss 50% of deadlines.
Moreover, that holds only if the probability distribution really is normal. Under the normal distribution, the probability of finishing earlier than the most probable estimate equals the probability of finishing later.
However, when you think about it, the probability that something goes wrong is in reality much higher than the probability that a miracle happens and we finish earlier.
In other words, the total weight of all possible negative risks is always higher than that of the positive “risks” (opportunities).
Taking this into account, we get the following distribution:

To make it clearer, let’s present this information as a cumulative curve, which shows the probability of finishing the project on time or earlier.

So, if we take the most probable estimate of 10 days, the probability that the task will be completed in that time or earlier is less than 50%.
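To put rough numbers on this, here is a minimal Python sketch that compares a symmetric and a skewed distribution around the same most probable value of 10 days. The shapes are my assumptions for illustration only (a normal distribution with a standard deviation of 1 day, and a triangular distribution from 8 to 16 days); a real project will have its own curve.

```python
from statistics import NormalDist

MOST_LIKELY_DAYS = 10.0

# Symmetric case: a normal distribution centred on the most probable estimate.
# The standard deviation of 1 day is an illustrative assumption.
symmetric = NormalDist(mu=MOST_LIKELY_DAYS, sigma=1.0)
p_symmetric = symmetric.cdf(MOST_LIKELY_DAYS)


def triangular_cdf(x: float, low: float, mode: float, high: float) -> float:
    """Probability that a triangular(low, mode, high) variable is <= x."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    if x <= mode:
        return (x - low) ** 2 / ((high - low) * (mode - low))
    return 1.0 - (high - x) ** 2 / ((high - low) * (high - mode))


# Skewed case: finishing early is bounded (8 days at best), while delays
# can stretch much further (16 days). Both bounds are assumptions.
p_skewed = triangular_cdf(MOST_LIKELY_DAYS, low=8.0, mode=10.0, high=16.0)

print(f"symmetric: {p_symmetric:.2f}, skewed: {p_skewed:.2f}")
```

With these assumed shapes, the chance of finishing within 10 days is 0.50 in the symmetric case and only 0.25 in the skewed case, which is the effect the cumulative curve illustrates.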
Ignoring the current level of uncertainty
While working on a task or project, we constantly learn new information. We get feedback from the customer, manager, test engineer, designer and other team members, and this knowledge keeps growing. At the very beginning of a task or project we usually know little about it. As we work, the requirements become clearer, and at the end of the project we can say for sure what we needed to implement and how long it actually took.

The knowledge we have directly affects the accuracy of the estimate.
Research by Luiz Laranjeira (PhD, Associate Professor at the University of Brasilia) also shows that the accuracy of a software project estimate depends on how clear the requirements are (Luiz Laranjeira, 1990).
The more thoroughly the requirements are clarified, the more accurate the estimate is. The estimate cannot be precise primarily because the uncertainty lies in the task or project itself. The only way to reduce the uncertainty of the estimate is to reduce the uncertainty in the project or task.
According to this research and common sense, as we reduce the uncertainty in the project or task, the accuracy of the estimate grows.

This diagram is simplified for clarity; in reality the most probable estimate may shift as the uncertainty is reduced.
So, the main reasons for underestimation are uncertainty and ignoring probability theory.
Dependence of estimation accuracy on the project stage
Luiz Laranjeira went further in his research and quantified how the spread of estimates depends on the project stage (the level of uncertainty).
If we take the pessimistic, optimistic and most probable estimates (the optimistic estimate is the earliest possible delivery time of all the options, the pessimistic one is the latest) and show how the ratios between them change over time, from the start of the project to its end, we get the following picture:

This diagram is called the cone of uncertainty. The horizontal axis shows time from the start of work on the project to its completion, with the main project stages marked on it. The vertical axis shows the relative estimation error.
Thus, at the original concept stage, the most probable estimate can differ from the optimistic one by a factor of 4. At the stage where the UI is designed, the spread ranges from 0.8 to 1.25 relative to the most probable estimate.
For convenience, here is the same data in table form:
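The coefficients quoted in this article can also be written down as a small lookup, sketched here in Python. The stage names and the helper `estimate_range` are my own labels for illustration. The values for stages 2 to 4 are the ones used later in the text; for the original concept, the 0.25 lower bound follows from the "factor of 4" mentioned above, and the 4.0 upper bound is my assumption, based on the commonly published form of the cone.

```python
# Multipliers applied to the most probable estimate at each stage.
# Stages 2-4 use the figures quoted in the article. For the original
# concept, the 4.0 upper bound is an assumption (see the lead-in).
CONE_COEFFICIENTS = {
    "original concept": (0.25, 4.0),
    "coordinated product definition": (0.5, 2.0),
    "requirements gathered and analyzed": (0.67, 1.5),
    "interface designed": (0.8, 1.25),
}


def estimate_range(most_likely: float, stage: str) -> tuple[float, float]:
    """Return (optimistic, pessimistic) for a most probable estimate."""
    low, high = CONE_COEFFICIENTS[stage]
    return most_likely * low, most_likely * high


# Example: how wide the range is for a 100-hour most probable estimate.
for stage in CONE_COEFFICIENTS:
    optimistic, pessimistic = estimate_range(100.0, stage)
    print(f"{stage}: {optimistic:.0f} to {pessimistic:.0f} hours")
```

The point of the lookup is only to make the narrowing visible: the same 100-hour estimate spans a much wider range at the concept stage than once the interface is designed.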

Importantly, the cone does not narrow by itself over time. To make it narrow, we have to actively manage the project and take actions aimed at reducing uncertainty. If you do not reduce the uncertainty deliberately during the project, your work will look like this:

The area shown in blue is called the uncertainty cloud.
Throughout the project, the estimate remains subject to wide variation right up to completion.
To reach the far right point of the cone, where there is no uncertainty, we need to have the finished software product :)
So, until the product is finished, there is always some uncertainty, and the estimate cannot be 100% precise.
But we can influence the accuracy of the estimate by reducing uncertainty. Any activity aimed at reducing uncertainty narrows the spread of the estimate.
This model is used in many companies, including NASA. Some adapt it to account for requirements volatility. You can read more about it in the book «Software Estimation: Demystifying the Black Art».
What can we call a good estimate?
There are many possible answers to this question, but in practice, if the estimate deviates from the project goal by more than 20%, the project manager has no room to maneuver.
If the estimate is within about 20%, the manager can still finish the project successfully by adjusting functionality, deadlines, team size and other parameters.
That sounds reasonable, so let’s settle on this definition of a good estimate as an example (this decision has to be made at company level: some are willing to take risks and find a deviation of 40–50% acceptable, while for others a deviation of 10% is already huge).
So, we’ll consider an estimate good if it deviates from the actual result by no more than 20%.
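This definition is easy to turn into a check. The sketch below is one possible reading: the function name `is_good_estimate` is hypothetical, and measuring the deviation relative to the estimate (rather than relative to the actual result) is my assumption. The tolerance is a parameter, since each company chooses its own threshold.

```python
def is_good_estimate(estimated: float, actual: float, tolerance: float = 0.20) -> bool:
    """True when the actual result deviates from the estimate by at most `tolerance`.

    The deviation is measured relative to the estimate (an assumption).
    """
    if estimated <= 0:
        raise ValueError("estimated must be positive")
    deviation = abs(actual - estimated) / estimated
    return deviation <= tolerance


print(is_good_estimate(100, 118))                  # within the 20% band
print(is_good_estimate(100, 130))                  # outside the 20% band
print(is_good_estimate(100, 130, tolerance=0.50))  # acceptable for a risk-tolerant company
```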
Practice. Estimating a project at various stages
Suppose a project manager comes to you and asks you to estimate a feature or a project.
First of all, you should study the available requirements and work out which life-cycle stage the description of the task or project is actually at.
What you do next depends on the stage the project is at:
Stage 1. The original concept
If a manager or a customer comes to you and asks, “How long will it take to build an application where doctors consult their patients?”, then you are definitely at the “original concept” stage.
When is it reasonable to estimate?
At the presale stage, when it is absolutely necessary to decide whether the project is worth discussing further. In general, it is better to avoid estimating at this stage and instead try to reduce the uncertainty and move to the next stage of the project life cycle.
What do we need to estimate the project?
We need data on the actual effort and labor costs of a similar completed project.
What tools are most suitable?
· Analogy estimation
Estimation algorithm
Strictly speaking, it is not possible to estimate the project itself at this stage. We can only say how much time we spent on a similar project.
For instance, you can state the estimate as follows: “I don’t know how much time this project will take us, because we lack data, but a similar project X took us Y. To give at least an approximate estimate of this project, we need to clarify the requirements.”
If there is no data from similar completed projects, the only option is to reduce the uncertainty and move to the next stage.
How can we move over to another stage?
To do this, we should clarify the requirements in order to understand why the application is needed and what functions it will perform.
For that, you need skills in gathering and analyzing requirements.
To improve these skills, I recommend reading “Software Requirements” by Karl Wiegers and Joy Beatty.
To gather the initial requirements, you can use the following questions:
· What is the purpose of the application;
· What types of users will use the application (for the task described above, these could be a Doctor, a Patient and an Administrator);
· What problems each type of user can solve with the help of the application;
· What platforms the application will run on.
Stage 2. Coordinated definition of the product.
At this stage, we already understand what the application will and will not do, but we have no details yet.
When is it reasonable to estimate?
Again at the presale stage, when we need to decide whether to implement the task or project at all, whether we have enough money for it, whether the deadlines are realistic, and whether the value of the project is worth the resources and effort invested in it.
What do we need to estimate the project?
We need a history of completed projects with their estimates, or extensive development experience in the domain of the project being estimated.
What tools are most suitable?
· Analogy estimation
· Top down estimation
Estimation algorithm
If a similar project has already been implemented, the time spent on it can be given as an approximate estimate.
If there is no such data, you should split the project into its main functional blocks and then estimate each block by analogy with blocks implemented in other projects.
For example, the application where doctors consult their patients could consist of the following blocks:
· Registration;
· Appointment booking system;
· Notification system;
· Video conferences;
· Feedback system;
· Payment system.
For the “Registration” block we can take the estimate from one project, and for the “Feedback system” block from another.
If some blocks have never been implemented, or there is no data on them, we can estimate their effort relative to other blocks, or reduce the uncertainty and use the estimation method from the next stage.
For instance, the “Feedback system” block may seem much more complicated to us than the “Registration” block, so we can take its estimate to be twice that of “Registration”.
This method (estimating one block relative to another) is far from precise and should be used only if the number of blocks that have never been implemented does not exceed 20% of the blocks that have actual data. Otherwise it is just guesswork.
After that, the estimates of all the blocks should be added up, and the sum is the most probable estimate. The pessimistic and optimistic estimates can be calculated using the coefficients for the current stage, x0.5 and x2 (see the coefficient table).
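The procedure described above can be sketched in a few lines of Python. All hour figures are hypothetical and exist only to show the arithmetic; the block names come from the example list, the "twice Registration" ratio and the 20% rule come from the text, and the coefficients are the stage 2 values x0.5 and x2.

```python
# Hours taken from similar blocks in past projects (hypothetical figures).
blocks_with_history = {
    "Registration": 40.0,
    "Appointment booking": 120.0,
    "Notification system": 60.0,
    "Video conferences": 200.0,
    "Payment system": 80.0,
}

# Blocks without history, expressed relative to a known block.
# "Feedback system" is taken as twice "Registration", as in the text.
relative_blocks = {
    "Feedback system": ("Registration", 2.0),
}

# Rule of thumb from the text: relative estimates are acceptable only while
# such blocks do not exceed 20% of the blocks that have actual data.
share = len(relative_blocks) / len(blocks_with_history)
if share > 0.20:
    raise ValueError("Too many blocks without history; reduce uncertainty first")

# The most probable estimate is the sum over all blocks.
most_likely = sum(blocks_with_history.values()) + sum(
    blocks_with_history[reference] * factor
    for reference, factor in relative_blocks.values()
)

# Stage 2 coefficients quoted in the article: x0.5 and x2.
optimistic = most_likely * 0.5
pessimistic = most_likely * 2.0

print(optimistic, most_likely, pessimistic)
```

With these made-up inputs the result is a range of 290 to 1160 hours around a most probable 580 hours, which shows how wide the range still is at this stage.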
Ideally, these three numbers are the answer for the manager, who then decides what to do with this information.
If the manager cannot make sense of a range and asks for a single number, it is possible to give one.
How to get a single estimate out of three (pessimistic, optimistic and most probable) is covered in a later section of this article.
How can we move over to another stage?
To move to the next stage, we should prepare a full list of requirements. There are many ways to document them, but we’ll consider a popular one based on user stories.
For every block, we should define who is going to use it and what exactly those users are going to do in it.
For example, after gathering and analyzing requirements, we could end up with the following items for the “Feedback system” block:
· A patient can see all the feedback about the chosen doctor;
· A patient can leave feedback for a doctor after a video conference with that doctor;
· A doctor can see a list of feedback left by patients;
· A doctor can comment on feedback about his or her conference;
· An administrator can see the list of all feedback on the site;
· An administrator can edit specific feedback on the site;
· An administrator can delete specific feedback on the site.
It’s also necessary to collect and write down all non-functional requirements for the project. To gather them, you can use the following checklist:
· What platforms it should work on;
· What operating systems it’s necessary to support;
· What it should be integrated with;
· How quickly it should work;
· How many users it should support simultaneously.
Clarifying these points marks the move to the next stage.
Stage 3. The requirements are gathered and analyzed.
At this stage, there is a complete list of what each user can do in the system, and a list of non-functional requirements.
When is it reasonable to estimate?
When we need to give an approximate estimate of the project before work starts under the T&M (time and materials) model. Estimates of tasks at this stage can also be used to prioritize tasks in the project and to plan the release date and the project budget. In addition, they can be used to monitor the team’s productivity on the project and to assess its effectiveness.
What do we need to estimate?
· A list of functional requirements
· A list of non-functional requirements
What tools are most suitable?
· Analogy estimation
· Top down estimation
Estimation algorithm
We need to split every task into components (decomposition). As a rule, the finer the split, the more precise the estimate. To do it well, you should think through everything you need to do for the task. You can do it in your head, but it is better to do it on paper.
For example, for our user story “A patient can see all the feedback about the chosen doctor” we could get the following picture (the illustrations below are handwritten on purpose, to show what the estimation process can look like in real life).

Here we have split the task into three components:
· Create the database infrastructure
· Create the DAL (data access layer)
· Create the UI that displays the feedback
For tasks that include a UI, you can sketch the interface by hand if possible and then review it with whoever is asking for the estimate. That will settle many questions at once, increase the accuracy of the estimate and make your life easier later.
If you’d like to improve your interface design skills, I recommend “The Humane Interface” by Jef Raskin and “About Face: The Essentials of Interaction Design” by Alan Cooper.
Next, you should work out what exactly you will do for each task and estimate how long it will take. At this stage you should calculate the time rather than guess it. You need to know what you will do to complete every task.

If there are tasks that take more than 8 hours, you should split them into subtasks.
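If you keep the decomposition in a spreadsheet or a script, the 8-hour rule is easy to check automatically. Here is a minimal sketch; the task names follow the three components above, while the hour values are hypothetical and are not the ones from the handwritten illustration.

```python
MAX_TASK_HOURS = 8.0

# Hypothetical decomposition of one user story, in hours.
tasks = {
    "Database infrastructure": 3.0,
    "DAL for data access": 4.0,
    "UI that displays the feedback": 10.0,
}

# Tasks above the limit are candidates for further splitting.
needs_splitting = [name for name, hours in tasks.items() if hours > MAX_TASK_HOURS]
total_hours = sum(tasks.values())

print("Split further:", needs_splitting)
print("Total so far:", total_hours)
```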
The estimate you get after this procedure can be considered optimistic, because it most likely reflects the shortest path from point A to point B (provided that you did not forget anything).
Now it’s time to think about what you may have missed and to correct your estimate. Checklists are usually a great help here. Here is an example of such a checklist:
· Testing
· Design
· Test data processing
· Support of various screen resolutions
· …
You can find one possible version of a checklist here.
After going through the checklist, you should add to the task list anything that was missed.

After that, you should analyze every task and subtask and think about what could go wrong, what problems might occur, and what you could have missed. This analysis usually also reveals items without which even the best case (the optimistic estimate) is not achievable. They should be added to your estimate.

Even after you account for this time as well, your estimate is likely to be closer to the optimistic estimate than to the most probable one. On the cone, it will be close to the lower edge.

An exception can be made if you have solved this task before and can say with confidence that you know how to do it and how long it took last time.
In that case, your estimate can be called the “most probable estimate” and corresponds to the line X in the cone. Otherwise, your estimate is an optimistic one.
The other two estimates can be calculated using the coefficients for the current stage, x0.67 and x1.5 (see the coefficient table).
If we calculate the estimates for the given example, we get the following:
· Optimistic estimate: 14 hours
· Most probable estimate: 20 hours
· Pessimistic estimate: 31 hours
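The arithmetic behind these three numbers can be sketched as follows. This is one plausible reading of the calculation, not necessarily the exact steps used for the figures above: I assume the decomposition produced the optimistic estimate of 14 hours, and that the most probable and pessimistic values are derived from it with the stage 3 coefficients. The function name is hypothetical.

```python
LOW_COEFFICIENT = 0.67   # optimistic = most probable * 0.67
HIGH_COEFFICIENT = 1.5   # pessimistic = most probable * 1.5


def three_point_from_optimistic(optimistic: float) -> tuple[float, float, float]:
    """Derive (optimistic, most probable, pessimistic) from an optimistic estimate."""
    most_likely = optimistic / LOW_COEFFICIENT
    pessimistic = most_likely * HIGH_COEFFICIENT
    return optimistic, most_likely, pessimistic


optimistic, most_likely, pessimistic = three_point_from_optimistic(14.0)
print(f"{optimistic:.1f} / {most_likely:.1f} / {pessimistic:.1f} hours")
```

Starting from 14 hours, this gives roughly 20.9 and 31.3 hours, which is in line with the rounded 20 and 31 hours listed above.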
How can we move over to another stage?
To move to the next stage, we should design the user interface. The best way to do this is to create wireframes.
There are many tools for this, but I’d like to draw your attention to Balsamiq and Axure RP.
Prototyping is a large topic in its own right and is beyond the scope of this article.
Having finished wireframes marks the move to the next stage.
Stage 4. The interface is designed.
At this stage, we have wireframes, a full list of what every user can do in the system, and a list of non-functional requirements.
When is it reasonable to estimate?
When we need to prepare a precise estimate for the Fixed Price model, and also for everything described at the previous stage.
What do we need to estimate?
· Ready-made wireframes
· A list of functional requirements
· A list of non-functional requirements
What tools are most suitable?
· Top down estimation
· Analogy estimation
Estimation algorithm
It’s the same as at the previous stage. The difference is in the accuracy: once the interface is designed, there is much less left to think through, and the probability of overlooking something is much lower.
How can we move over to another stage?
By designing the whole application architecture and thinking through the future implementation in detail. We won’t consider this option, since it is rarely used in practice. Besides, the estimation algorithm after architecture analysis does not differ from the current stage’s algorithm; again, the only difference is higher accuracy.
Getting one estimate from the whole range of estimates
If you already have the most probable, pessimistic and optimistic estimates, you can get a single estimate using the work of Tom DeMarco, who in his book “Waltzing With Bears” put forward the idea that the cumulative probability can be calculated by integrating the area under the curve (the same graph of the asymmetric probability distribution that we drew earlier). The original calculation template can be downloaded here (it’s also available here without registration). You just enter 3 numbers into the template and get a list of estimates with their corresponding probabilities.
For instance, for the estimates 14, 20 and 31 we would get the following result (a screenshot from the Excel template):

You can choose any probability that is acceptable for your company, but I would advise taking 85%.
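If you do not have the template at hand, the same idea (integrate the area under an asymmetric curve and read off the value at the chosen probability) can be approximated in Python. Note the assumption: the sketch below uses a plain triangular distribution built from the three estimates, which is a simplification of my own and not the curve used in DeMarco's template, so its numbers will differ from the screenshot.

```python
from math import sqrt


def triangular_quantile(p: float, low: float, mode: float, high: float) -> float:
    """Value x such that a triangular(low, mode, high) variable is <= x with probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if not low <= mode <= high or low == high:
        raise ValueError("expected low <= mode <= high with low < high")
    # Cumulative probability accumulated up to the most probable value.
    p_at_mode = (mode - low) / (high - low)
    if p <= p_at_mode:
        return low + sqrt(p * (high - low) * (mode - low))
    return high - sqrt((1.0 - p) * (high - low) * (high - mode))


# Optimistic 14, most probable 20, pessimistic 31 hours, at 85% confidence.
estimate_85 = triangular_quantile(0.85, low=14.0, mode=20.0, high=31.0)
print(f"85% confidence estimate: {estimate_85:.1f} hours")
```

Under this triangular assumption the 85% estimate comes out at about 25.7 hours, noticeably above the most probable 20 hours, which is exactly why a single number should be taken from the cumulative curve rather than from its peak.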
If you don’t know how to estimate, say so
If you don’t understand what is wanted from you, or you don’t know exactly how to implement the functionality you are asked to estimate, it is better to tell the manager honestly, give an approximate estimate (if possible) and suggest specific actions that would make the estimate more precise.
For example, if you don’t know whether a particular technology is suitable for the task, ask for some time to build a prototype, which will either confirm your estimate or show that you have missed something. If you are not sure that the task can be solved at all, say so at once.
These things should be taken care of before you make any commitments.
It’s very important to give the manager this kind of information. Otherwise he may take your estimate at face value, without suspecting that the deadline could be missed by a factor of 5, or that the product might not be implementable at all with the given technology or requirements.
A good manager is always on your side, because you are in the same boat, and his career often depends even more than yours on whether you meet the deadline.
If you have doubts, do not promise
Many companies and developers bring about the failure of their own projects simply by making commitments at too early a point in the cone of uncertainty.
At an early stage, where the possible outcome ranges from 100% to 1600%, it is very risky to make decisions and set deadlines.
Successful companies and developers postpone commitments until the work on narrowing the cone is complete.
As a rule, such behavior is typical of organizations at a higher maturity level of the CMM (Capability Maturity Model), where the procedure for narrowing the cone is thoroughly described and followed.
Here is an illustration of how estimation quality improved in the US Air Force when projects moved to a higher CMM level (Lawlis, Flowe and Thordahl, 1995).

In my view, this is something to think about. Statistics from other companies confirm this correlation.
In addition, accurate estimates cannot be achieved by estimation methods alone; accuracy is inseparable from how effectively projects are run, and it depends not only on the developers but also on the project managers and the company’s top management.
Conclusion
· It is practically impossible to give an estimate that exactly matches reality, but we can influence the range within which it will vary. To do that, we should reduce the uncertainty in the project;
· Splitting tasks into components helps to increase the accuracy of your estimate. During decomposition you think through in detail what you are going to do and how;
· Use checklists to lower the chance of missing something while estimating;
· Use the cone of uncertainty to understand the range within which your estimate is likely to vary;
· And finally, constantly compare your estimates with the time actually spent on the task. It will help you improve your estimation skills, understand what you missed and apply that in your next estimates.
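The last point, comparing estimates with actuals, needs nothing more than a small log. Here is a minimal sketch with hypothetical figures; the 20% band is the definition of a good estimate used earlier in this article.

```python
# (task, estimated hours, actual hours); all figures are hypothetical.
history = [
    ("Registration", 20.0, 26.0),
    ("Notification system", 16.0, 15.0),
    ("Feedback system", 31.0, 40.0),
]

# A ratio above 1 means the task took longer than estimated.
ratios = [actual / estimated for _, estimated, actual in history]
average_ratio = sum(ratios) / len(ratios)

# Tasks whose actual time deviated from the estimate by more than 20%.
missed = [
    task
    for task, estimated, actual in history
    if abs(actual - estimated) / estimated > 0.20
]

print(f"average actual/estimate ratio: {average_ratio:.2f}")
print("outside the 20% band:", missed)
```

Reviewing the tasks that fall outside the band is where you find out what was missed and which items belong on your checklist.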
Useful books
There is a lot of literature on estimating time and effort, but I’d like to name two books that are must-reads:
· “Waltzing With Bears: Managing Risk on Software Projects” by Tom DeMarco and Timothy Lister.

· «Software Estimation: Demystifying the Black Art» by Steve McConnell.
