Tuesday, March 2, 2010

 SOFTWARE PROJECT ESTIMATION
In the early days of computing, software costs constituted a small percentage of the
overall computer-based system cost. An order of magnitude error in estimates of
software cost had relatively little impact. Today, software is the most expensive element
of virtually all computer-based systems. For complex, custom systems, a large
cost estimation error can make the difference between profit and loss. Cost overrun
can be disastrous for the developer.
Software cost and effort estimation will never be an exact science. Too many variables—
human, technical, environmental, political—can affect the ultimate cost of
software and effort applied to develop it. However, software project estimation can
be transformed from a black art to a series of systematic steps that provide estimates
with acceptable risk.
To achieve reliable cost and effort estimates, a number of options arise:
1. Delay estimation until late in the project (obviously, we can achieve
100% accurate estimates after the project is complete!).
2. Base estimates on similar projects that have already been completed.
3. Use relatively simple decomposition techniques to generate project cost and
effort estimates.
4. Use one or more empirical models for software cost and effort estimation.
Unfortunately, the first option, however attractive, is not practical. Cost estimates
must be provided "up front." However, we should recognize that the longer we wait,
the more we know, and the more we know, the less likely we are to make serious
errors in our estimates.
The second option can work reasonably well, if the current project is quite similar
to past efforts and other project influences (e.g., the customer, business conditions,
the software engineering environment (SEE), deadlines) are equivalent. Unfortunately, past experience has not
always been a good indicator of future results.
The remaining options are viable approaches to software project estimation. Ideally,
the techniques noted for each option should be applied in tandem, each used as
a cross-check for the other. Decomposition techniques take a "divide and conquer"
approach to software project estimation. By decomposing a project into major functions
and related software engineering activities, cost and effort estimation can be
performed in a stepwise fashion. Empirical estimation models can be used to complement
decomposition techniques and offer a potentially valuable estimation
approach in their own right. A model is based on experience (historical data) and
takes the form
                               d = f(v_i)
where d is one of a number of estimated values (e.g., effort, cost, project duration)
and the v_i are selected independent parameters (e.g., estimated LOC or FP).
Automated estimation tools implement one or more decomposition techniques or
empirical models. When combined with a graphical user interface, automated tools
provide an attractive option for estimating. In such systems, the characteristics of the
development organization (e.g., experience, environment) and the software to be
developed are described. Cost and effort estimates are derived from these data.
Each of the viable software cost estimation options is only as good as the historical
data used to seed the estimate. If no historical data exist, costing rests on a very
shaky foundation. In Chapter 4, we examined the characteristics of some of the software
metrics that provide the basis for historical estimation data.
DECOMPOSITION TECHNIQUES
Software project estimation is a form of problem solving, and in most cases, the
problem to be solved (i.e., developing a cost and effort estimate for a software project)
is too complex to be considered in one piece. For this reason, we decompose
the problem, recharacterizing it as a set of smaller (and hopefully, more manageable)
problems.
The decomposition approach was discussed from two different points
of view: decomposition of the problem and decomposition of the process. Estimation
uses one or both forms of partitioning. But before an estimate can be made, the
project planner must understand the scope of the software to be built and generate
an estimate of its “size.”
 Software Sizing
The accuracy of a software project estimate is predicated on a number of things: (1)
the degree to which the planner has properly estimated the size of the product to be
built; (2) the ability to translate the size estimate into human effort, calendar time,
and dollars (a function of the availability of reliable software metrics from past projects);
(3) the degree to which the project plan reflects the abilities of the software
team; and (4) the stability of product requirements and the environment that supports
the software engineering effort.
In this section, we consider the software sizing problem. Because a project estimate
is only as good as the estimate of the size of the work to be accomplished, sizing
represents the project planner’s first major challenge. In the context of project
planning, size refers to a quantifiable outcome of the software project. If a direct
approach is taken, size can be measured in LOC. If an indirect approach is chosen,
size is represented as FP. Four approaches to the sizing problem are outlined below:

“Fuzzy logic” sizing. This approach uses the approximate reasoning techniques
that are the cornerstone of fuzzy logic. To apply this approach, the
planner must identify the type of application, establish its magnitude on a
qualitative scale, and then refine the magnitude within the original range.
Although personal experience can be used, the planner should also have
access to a historical database of projects so that estimates can be compared
to actual experience.
Function point sizing. The planner develops estimates of the information
domain characteristics discussed in Chapter 4.
Standard component sizing. Software is composed of a number of different
“standard components” that are generic to a particular application area.
For example, the standard components for an information system are subsystems,
modules, screens, reports, interactive programs, batch programs, files,
LOC, and object-level instructions. The project planner estimates the number
of occurrences of each standard component and then uses historical project
data to determine the delivered size per standard component. To illustrate,
consider an information systems application. The planner estimates that 18
reports will be generated. Historical data indicates that 967 lines of COBOL
[PUT92] are required per report. This enables the planner to estimate that
roughly 17,000 LOC (18 x 967 = 17,406) will be required for the reports component. Similar estimates and
computation are made for other standard components, and a combined size
value (adjusted statistically) results.
Change sizing. This approach is used when a project encompasses the use
of existing software that must be modified in some way as part of a project.
The planner estimates the number and type (e.g., reuse, adding code, changing
code, deleting code) of modifications that must be accomplished. Using
an “effort ratio” [PUT92] for each type of change, the size of the change may
be estimated.
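
The standard component sizing arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this sketch, the "screens" row uses hypothetical numbers, and the statistical adjustment mentioned in the text is not modeled. Only the reports figures (18 reports, 967 LOC each) come from the example.

```python
# Standard component sizing: occurrences x historical size per occurrence.
# Only the "reports" figures come from the text; the "screens" row is a
# hypothetical placeholder, and no statistical adjustment is applied.

def component_size(occurrences, loc_per_occurrence):
    """Estimated delivered LOC for one standard component."""
    return occurrences * loc_per_occurrence


def total_size(components):
    """Sum the LOC estimates over all standard components."""
    return sum(component_size(n, loc) for n, loc in components.values())


components = {
    "reports": (18, 967),   # from the example in the text
    "screens": (25, 400),   # hypothetical values, for illustration only
}

reports_loc = component_size(*components["reports"])
print(reports_loc)              # 17406, i.e. roughly 17,000 LOC
print(total_size(components))   # 27406
```

In practice the per-component sizes would be drawn from the organization's own historical project data rather than typed in by hand.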

EMPIRICAL ESTIMATION MODELS
An estimation model for computer software uses empirically derived formulas to predict
effort as a function of LOC or FP. Values for LOC or FP are estimated using the
approach described in Sections 5.6.2 and 5.6.3. But instead of using the tables described
in those sections, the resultant values for LOC or FP are plugged into the estimation
model.
The empirical data that support most estimation models are derived from a limited
sample of projects. For this reason, no estimation model is appropriate for all
classes of software and in all development environments. Therefore, the results
obtained from such models must be used judiciously.
5.7.1 The Structure of Estimation Models
A typical estimation model is derived using regression analysis on data collected from
past software projects. The overall structure of such models takes the form [MAT94]
    E = A + B x (ev)^C        (5-2)
where A, B, and C are empirically derived constants, E is effort in person-months, and
ev is the estimation variable (either LOC or FP). In addition to the relationship noted
in Equation (5-2), the majority of estimation models have some form of project adjustment
component that enables E to be adjusted by other project characteristics (e.g.,
problem complexity, staff experience, development environment). Among the many
LOC-oriented estimation models proposed in the literature are
    E = 5.2 x (KLOC)^0.91            Walston-Felix model
    E = 5.5 + 0.73 x (KLOC)^1.16     Bailey-Basili model
    E = 3.2 x (KLOC)^1.05            Boehm simple model
    E = 5.288 x (KLOC)^1.047         Doty model for KLOC > 9
FP-oriented models have also been proposed. These include
  
    E = -13.39 + 0.0545 FP           Albrecht and Gaffney model
    E = 60.62 x 7.728 x 10^-8 FP^3   Kemerer model
    E = 585.7 + 15.12 FP             Matson, Barnett, and Mellichamp model
A quick examination of these models indicates that each will yield a different result
for the same values of LOC or FP. The implication is clear: estimation models must
be calibrated for local needs!
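
To see how far the models diverge, the LOC-oriented formulas listed above can be evaluated side by side. The sketch below is a minimal illustration, not a calibrated tool: the dictionary and function names are invented here, and the 33.2 KLOC input is a hypothetical size estimate.

```python
# LOC-oriented models from the text; each returns effort in person-months.
LOC_MODELS = {
    "Walston-Felix": lambda kloc: 5.2 * kloc ** 0.91,
    "Bailey-Basili": lambda kloc: 5.5 + 0.73 * kloc ** 1.16,
    "Boehm simple": lambda kloc: 3.2 * kloc ** 1.05,
    "Doty (KLOC > 9)": lambda kloc: 5.288 * kloc ** 1.047,
}


def compare_models(kloc):
    """Return {model name: estimated effort} for one size estimate."""
    return {name: model(kloc) for name, model in LOC_MODELS.items()}


estimates = compare_models(33.2)   # hypothetical size estimate, in KLOC
for name, effort in estimates.items():
    print(f"{name:16s} {effort:7.1f} person-months")
```

For this input the lowest and highest estimates differ by roughly a factor of four, which is the practical reason a model has to be calibrated against local project data before its output is trusted.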

The COCOMO Model
In his classic book on “software engineering economics,” Barry Boehm [BOE81] introduced
a hierarchy of software estimation models bearing the name COCOMO, for
COnstructive COst MOdel. The original COCOMO model became one of the most widely
used and discussed software cost estimation models in the industry. It has evolved
into a more comprehensive estimation model, called COCOMO II [BOE96, BOE00].
Like its predecessor, COCOMO II is actually a hierarchy of estimation models that
address the following areas:
Application composition model. Used during the early stages of software
engineering, when prototyping of user interfaces, consideration of software
and system interaction, assessment of performance, and evaluation of technology
maturity are paramount.
Early design stage model. Used once requirements have been stabilized
and basic software architecture has been established.
Post-architecture-stage model. Used during the construction of the
software.
Like all estimation models for software, the COCOMO II models require sizing information.
Three different sizing options are available as part of the model hierarchy:
object points, function points, and lines of source code.
The COCOMO II application composition model uses object points and is
illustrated in the following paragraphs. It should be noted that other, more sophisticated estimation models (using FP and KLOC) are also available as part of
COCOMO II.
Like function points (Chapter 4), the object point is an indirect software measure
that is computed using counts of the number of (1) screens (at the user interface), (2)
reports, and (3) components likely to be required to build the application. Each object
instance (e.g., a screen or report) is classified into one of three complexity levels (i.e.,
simple, medium, or difficult) using criteria suggested by Boehm [BOE96]. In essence,
complexity is a function of the number and source of the client and server data tables
that are required to generate the screen or report and the number of views or sections
presented as part of the screen or report.
Once complexity is determined, the number of screens, reports, and components
are weighted according to Table 5.1. The object point count is then determined by
multiplying the original number of object instances by the weighting factor in Table
5.1 and summing to obtain a total object point count. When component-based development
or general software reuse is to be applied, the percent of reuse (%reuse) is
estimated and the object point count is adjusted:
       NOP = (object points) x [(100 - %reuse)/100]
where NOP is defined as new object points.
To derive an estimate of effort based on the computed NOP value, a “productivity
rate” must be derived. Table 5.2 presents the productivity rate
PROD = NOP/person-month
for different levels of developer experience and development environment maturity.
Once the productivity rate has been determined, an estimate of project effort can be
derived as
estimated effort = NOP/PROD
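
The object point calculation can be sketched as follows. Tables 5.1 and 5.2 are not reproduced in this text, so the weights and productivity rates in the code are the values commonly published for the COCOMO II application composition model and should be treated as assumptions. The function names and the sample application counts are hypothetical.

```python
# Assumed values: weights and productivity rates as commonly published for
# COCOMO II application composition (Tables 5.1 and 5.2 are not in this text).
WEIGHTS = {
    "screen": {"simple": 1, "medium": 2, "difficult": 3},
    "report": {"simple": 2, "medium": 5, "difficult": 8},
    "component": {"difficult": 10},   # 3GL components
}
PROD = {"very low": 4, "low": 7, "nominal": 13, "high": 25, "very high": 50}


def object_points(counts):
    """counts maps (object type, complexity) -> number of instances."""
    return sum(n * WEIGHTS[kind][level] for (kind, level), n in counts.items())


def new_object_points(points, pct_reuse):
    """NOP = (object points) x [(100 - %reuse)/100]."""
    return points * (100 - pct_reuse) / 100


def estimated_effort(nop, capability):
    """Estimated effort in person-months = NOP / PROD."""
    return nop / PROD[capability]


# Hypothetical application, for illustration only.
counts = {
    ("screen", "simple"): 10,
    ("screen", "medium"): 4,
    ("report", "medium"): 3,
    ("component", "difficult"): 2,
}
points = object_points(counts)              # 10*1 + 4*2 + 3*5 + 2*10 = 53
nop = new_object_points(points, 20)         # 20% reuse
effort = estimated_effort(nop, "nominal")
print(points, nop, round(effort, 2))
```

Changing the reuse percentage or the capability level shows how sensitive the estimate is to those two inputs.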

AUTOMATED ESTIMATION TOOLS

The decomposition techniques and empirical estimation models described in the preceding
sections are available as part of a wide variety of software tools. These automated
estimation tools allow the planner to estimate cost and effort and to perform
"what-if" analyses for important project variables such as delivery date or staffing.
Although many automated estimation tools exist, all exhibit the same general characteristics
and all perform the following six generic functions [JON96]:
1. Sizing of project deliverables. The “size” of one or more software work
products is estimated. Work products include the external representation of
software (e.g., screens, reports), the software itself (e.g., KLOC), functionality
delivered (e.g., function points), and descriptive information (e.g., documents).
2. Selecting project activities. The appropriate process framework (Chapter
2) is selected and the software engineering task set is specified.
3. Predicting staffing levels. The number of people who will be available to
do the work is specified. Because the relationship between people available
and work (predicted effort) is highly nonlinear, this is an important input.
4. Predicting software effort. Estimation tools use one or more models (e.g.,
Section 5.7) that relate the size of the project deliverables to the effort
required to produce them.
5. Predicting software cost. Given the results of step 4, costs can be computed
by allocating labor rates to the project activities noted in step 2.
6. Predicting software schedules. When effort, staffing level, and project
activities are known, a draft schedule can be produced by allocating labor
across software engineering activities based on recommended models for
effort distribution.
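
Steps 5 and 6 above can be illustrated with a small sketch that spreads a total effort figure across activities and prices each share. Every number here is hypothetical: the activity names, the effort distribution percentages, and the labor rates are placeholders, not values from any particular tool. Schedule prediction is left out because, as noted in step 3, the relationship between staffing and effort is nonlinear.

```python
# All activity names, effort shares, and labor rates are hypothetical.
EFFORT_SHARE_PCT = {"analysis": 20, "design": 25, "coding": 20, "testing": 35}
LABOR_RATE = {          # dollars per person-month
    "analysis": 9000,
    "design": 8500,
    "coding": 8000,
    "testing": 7500,
}


def distribute_effort(total_effort):
    """Split total effort (person-months) across activities."""
    return {a: total_effort * pct / 100 for a, pct in EFFORT_SHARE_PCT.items()}


def project_cost(total_effort):
    """Cost = sum over activities of (effort share x labor rate)."""
    by_activity = distribute_effort(total_effort)
    return sum(by_activity[a] * LABOR_RATE[a] for a in by_activity)


by_activity = distribute_effort(40)   # 40 person-months, hypothetical
cost = project_cost(40)
print(by_activity)
print(cost)                           # 326000.0
```

A real tool would take the total effort from one of the estimation models and the distribution from the selected process framework.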
When different estimation tools are applied to the same project data, a relatively
large variation in estimated results is encountered. More important, predicted values
sometimes are significantly different from actual values. This reinforces the notion
that the output of estimation tools should be used as one "data point

Monday, March 1, 2010

Effort and schedule Estimation.
