Monday, April 19, 2010


 Process quality and ETVX

Quality in the process

A quality process has the right inputs and performs the right actions to produce outputs that meet the needs of customer processes.
Definitions of quality thus include:
  • Fitness for purpose
  • Right output, right time, right place
  • Customer satisfaction


There are four places where the quality can be specified and checked:
  • Entry criteria define what inputs are required and what quality these must be to achieve the exit criteria. Entry criteria should be communicated to supplier processes, to become their exit criteria. If supplier processes are sufficiently well controlled, then there is no need to check inputs.
  • Task definitions specify the actions within the process.
  • Validation definitions identify test points within the process and define the tests and criteria for checking at these points. This enables problems to be caught close to their cause, reducing rework and scrap costs, and enabling problem causes to be addressed.
  • Exit criteria define what outputs are required and what quality these must be to meet the needs of customer processes. Exit criteria may be derived from the entry criteria of customer processes.
Together, these make up what is known as the ETVX model (as below), which can be used to define the process and the quality required within it completely.
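As a rough illustration, the four ETVX elements can be modelled as lists of checks and actions attached to a process. This is only a sketch: the class name `EtvxProcess`, the dictionary used as the work product, and the document-review example are all assumptions made for illustration, not part of the ETVX model itself. For simplicity, every validation is run after every task, whereas a real process would attach tests to specific test points.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Check = Callable[[Dict], bool]  # returns True when a criterion is met
Task = Callable[[Dict], Dict]   # transforms the work product


@dataclass
class EtvxProcess:
    """Hypothetical container for Entry, Task, Validation and eXit definitions."""
    name: str
    entry_criteria: List[Check] = field(default_factory=list)
    tasks: List[Task] = field(default_factory=list)
    validations: List[Check] = field(default_factory=list)
    exit_criteria: List[Check] = field(default_factory=list)

    def run(self, work: Dict) -> Dict:
        # Entry: refuse inputs that do not meet the required quality.
        if not all(check(work) for check in self.entry_criteria):
            raise ValueError(f"{self.name}: entry criteria not met")
        for task in self.tasks:
            work = task(work)
            # Validation: catch problems close to their cause.
            # Simplification: all validations run after each task.
            if not all(check(work) for check in self.validations):
                task_name = getattr(task, "__name__", "task")
                raise ValueError(f"{self.name}: validation failed after {task_name}")
        # Exit: outputs must meet the needs of customer processes.
        if not all(check(work) for check in self.exit_criteria):
            raise ValueError(f"{self.name}: exit criteria not met")
        return work


def review(work: Dict) -> Dict:
    """Illustrative task: mark the draft as reviewed."""
    return {**work, "reviewed": True}


process = EtvxProcess(
    name="document review",
    entry_criteria=[lambda w: bool(w.get("draft"))],
    tasks=[review],
    validations=[lambda w: "draft" in w],
    exit_criteria=[lambda w: w.get("reviewed") is True],
)
result = process.run({"draft": "spec v1"})
```

The exit criteria of one such process could be reused as the entry criteria of its customer process, which mirrors the way criteria are passed between supplier and customer processes above.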

Fig. 1. The ETVX model

Spiral Model

The WIN-WIN Spiral Model

In this model, the developer and the customer strive together for a “win-win” result. The customer wins by getting a system or product that satisfies the majority of the customer's needs, and the developer wins by working to realistic and achievable goals, budgets, and deadlines. Rather than a single customer communication activity, the following activities are defined:
  • Identification of the key stakeholders in the organization.
  • Determination of the key stakeholders' “win conditions”, a crucial step.
  • Negotiation of the stakeholders' win conditions into a set of win-win conditions for all concerned, including the developers, management, customers, and other stakeholders.
In addition to the negotiations, the WIN-WIN spiral model introduces three process milestones (anchor points), which help establish the completion of one cycle around the spiral and provide decision milestones. The three process milestones are:
  1. Life Cycle Objectives (LCO) – defines a set of objectives for each major software engineering activity, e.g. defining the top-level system/product requirements.
  2. Life Cycle Architecture (LCA) – defines the objectives that must be met as the system and software architecture is defined, e.g. the software team can demonstrate that it has evaluated the applicability of the software and considered the impact on architectural decisions.
  3. Initial Operational Capability (IOC) – defines the set of objectives


  • Faster software production, facilitated through collaborative involvement of the relevant stakeholders.
  • Cheaper software, via reductions in rework and maintenance.

The Spiral Model

The Spiral Model is an evolutionary software process model that couples the iterative nature of prototyping with the controlled and systematic aspects of the Linear Sequential Model. Using the Spiral Model, the software is developed in a series of incremental releases. Unlike the Iteration Model, in which the first product is a core product, in the Spiral Model the early iterations may result in a paper model or a prototype. During later iterations, more complex functionality can be added.

By combining the iterative nature of prototyping with the controlled and systematic aspects of the Waterfall Model, the Spiral Model provides the potential for rapid development of incremental versions of the software. A Spiral Model is divided into a number of framework activities, also called task regions. There are typically between three and six task regions:
  • Customer Communication - tasks required to establish effective communication between the developer and customer.
  • Planning - tasks required to define resources, timelines and other project-related information.
  • Risk Analysis - tasks required to assess the technical and management risks.
  • Engineering - tasks required to build one or more representations of the application.
  • Construction & Release - tasks required to construct, test and support the application (e.g. documentation and training).
  • Customer Evaluation - tasks required to obtain periodic customer feedback so that there are no last-minute surprises.

Advantages of the Spiral Model

  • Realistic approach to the development because the software evolves as the process progresses. In addition, the developer and the client better understand and react to risks at each evolutionary level.
  • The model uses prototyping as a risk reduction mechanism and allows for the development of prototypes at any stage of the evolutionary development.
  • It maintains a systematic stepwise approach, like the classic waterfall model, and incorporates it into an iterative framework that more realistically reflects the real world.

Disadvantages of the Spiral Model

  • It demands considerable risk-assessment expertise.
  • It has not been employed as widely as proven models (e.g. the Waterfall Model) and hence may prove difficult to ‘sell’ to the client.

Thursday, April 15, 2010


The Constructive Cost Model (COCOMO) is an algorithmic software cost estimation model developed by Barry Boehm. The model uses a basic regression formula, with parameters that are derived from historical project data and current project characteristics.
COCOMO was first published in 1981 in Barry W. Boehm's book Software Engineering Economics[1] as a model for estimating effort, cost, and schedule for software projects. It drew on a study of 63 projects at TRW Aerospace, where Barry Boehm was Director of Software Research and Technology in 1981. The study examined projects ranging in size from 2,000 to 100,000 lines of code, and programming languages ranging from assembly to PL/I. These projects were based on the waterfall model of software development, which was the prevalent software development process in 1981.
References to this model typically call it COCOMO 81. In 1997 COCOMO II was developed and finally published in 2000 in the book Software Cost Estimation with COCOMO II[2]. COCOMO II is the successor of COCOMO 81 and is better suited for estimating modern software development projects. It provides more support for modern software development processes and an updated project database. The need for the new model came as software development technology moved from mainframe and overnight batch processing to desktop development, code reusability and the use of off-the-shelf software components. This article refers to COCOMO 81.
COCOMO consists of a hierarchy of three increasingly detailed and accurate forms. The first level, Basic COCOMO is good for quick, early, rough order of magnitude estimates of software costs, but its accuracy is limited due to its lack of factors to account for difference in project attributes (Cost Drivers). Intermediate COCOMO takes these Cost Drivers into account and Detailed COCOMO additionally accounts for the influence of individual project phases.


Basic COCOMO computes software development effort (and cost) as a function of program size. Program size is expressed in estimated thousands of lines of code (KLOC).
COCOMO applies to three classes of software projects:
  • Organic projects - "small" teams with "good" experience working with "less than rigid" requirements
  • Semi-detached projects - "medium" teams with mixed experience working with a mix of rigid and less than rigid requirements
  • Embedded projects - developed within a set of "tight" constraints (hardware, software, operational, ...)
The basic COCOMO equations take the form
Effort Applied = a_b (KLOC)^{b_b} [man-months]
Development Time = c_b (Effort Applied)^{d_b} [months]
People required = Effort Applied / Development Time [count]
The coefficients a_b, b_b, c_b and d_b are given in the following table.
   Software project    a_b     b_b     c_b     d_b
   Organic             2.4     1.05    2.5     0.38
   Semi-detached       3.0     1.12    2.5     0.35
   Embedded            3.6     1.20    2.5     0.32
Basic COCOMO is good for a quick estimate of software costs. However, it does not account for differences in hardware constraints, personnel quality and experience, use of modern tools and techniques, and so on.
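The Basic COCOMO equations are simple enough to try out directly. The following is a minimal sketch using the coefficients from the table above; the function name `basic_cocomo`, the dictionary layout, and the 32 KLOC example project are assumptions made purely for illustration.

```python
# Coefficients (a_b, b_b, c_b, d_b) from the Basic COCOMO table.
BASIC_COEFFICIENTS = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kloc, project_class):
    """Return (effort in man-months, development time in months, people)."""
    a, b, c, d = BASIC_COEFFICIENTS[project_class]
    effort = a * kloc ** b   # Effort Applied
    time = c * effort ** d   # Development Time
    people = effort / time   # People required
    return effort, time, people


# Hypothetical example: a 32 KLOC organic project.
effort, time, people = basic_cocomo(32, "organic")
print(f"effort={effort:.1f} man-months, time={time:.1f} months, people={people:.1f}")
```

For this hypothetical project the estimate comes out at roughly 91 man-months spread over about 14 months. Swapping in "embedded" for the same size shows how strongly the project class alone changes the result.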

Intermediate COCOMO

Intermediate COCOMO computes software development effort as a function of program size and a set of "cost drivers" that include subjective assessment of product, hardware, personnel and project attributes. This extension considers four categories of cost drivers, each with a number of subsidiary attributes:
  • Product attributes

    • Required software reliability
    • Size of application database
    • Complexity of the product
  • Hardware attributes

    • Run-time performance constraints
    • Memory constraints
    • Volatility of the virtual machine environment
    • Required turnabout time
  • Personnel attributes

    • Analyst capability
    • Software engineering capability
    • Applications experience
    • Virtual machine experience
    • Programming language experience
  • Project attributes

    • Use of software tools
    • Application of software engineering methods
    • Required development schedule
Each of the 15 attributes receives a rating on a six-point scale that ranges from "very low" to "extra high" (in importance or value). An effort multiplier from the table below applies to the rating. The product of all effort multipliers results in an effort adjustment factor (EAF). Typical values for EAF range from 0.9 to 1.4.
Cost Drivers Ratings
   Cost driver                                     Very Low  Low   Nominal  High  Very High  Extra High
   Product attributes
     Required software reliability                 0.75      0.88  1.00     1.15  1.40       -
     Size of application database                  -         0.94  1.00     1.08  1.16       -
     Complexity of the product                     0.70      0.85  1.00     1.15  1.30       1.65
   Hardware attributes
     Run-time performance constraints              -         -     1.00     1.11  1.30       1.66
     Memory constraints                            -         -     1.00     1.06  1.21       1.56
     Volatility of the virtual machine environment -         0.87  1.00     1.15  1.30       -
     Required turnabout time                       -         0.87  1.00     1.07  1.15       -
   Personnel attributes
     Analyst capability                            1.46      1.19  1.00     0.86  0.71       -
     Applications experience                       1.29      1.13  1.00     0.91  0.82       -
     Software engineer capability                  1.42      1.17  1.00     0.86  0.70       -
     Virtual machine experience                    1.21      1.10  1.00     0.90  -          -
     Programming language experience               1.14      1.07  1.00     0.95  -          -
   Project attributes
     Application of software engineering methods   1.24      1.10  1.00     0.91  0.82       -
     Use of software tools                         1.24      1.10  1.00     0.91  0.83       -
     Required development schedule                 1.23      1.08  1.00     1.04  1.10       -
The Intermediate COCOMO formula now takes the form:
E = a_i (KLoC)^{b_i} × EAF
where E is the effort applied in person-months, KLoC is the estimated number of thousands of delivered lines of code for the project, and EAF is the factor calculated above. The coefficient a_i and the exponent b_i are given in the next table.
   Software project    a_i     b_i
   Organic             3.2     1.05
   Semi-detached       3.0     1.12
   Embedded            2.8     1.20
The Development time D calculation uses E in the same way as in the Basic COCOMO.
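To see how the cost drivers feed into the estimate, here is a small sketch of the Intermediate COCOMO calculation. It assumes the effort is the nominal size-based term multiplied by the EAF, as the definitions of E, KLoC and EAF above imply. The function names and the example ratings (high product complexity, high analyst capability, everything else nominal) are hypothetical.

```python
from math import prod

# Coefficient a_i and exponent b_i from the Intermediate COCOMO table.
INTERMEDIATE_COEFFICIENTS = {
    "organic": (3.2, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (2.8, 1.20),
}


def effort_adjustment_factor(multipliers):
    """EAF is the product of the effort multipliers of all rated attributes."""
    return prod(multipliers)


def intermediate_cocomo(kloc, project_class, multipliers=()):
    """Effort in person-months; attributes left out count as nominal (1.00)."""
    a, b = INTERMEDIATE_COEFFICIENTS[project_class]
    return a * kloc ** b * effort_adjustment_factor(multipliers)


# Hypothetical ratings: complexity "high" (1.15), analyst capability "high" (0.86).
eaf = effort_adjustment_factor([1.15, 0.86])
nominal_effort = intermediate_cocomo(32, "organic")
adjusted_effort = intermediate_cocomo(32, "organic", [1.15, 0.86])
print(f"EAF={eaf:.3f}, nominal={nominal_effort:.1f}, adjusted={adjusted_effort:.1f}")
```

Because nominal ratings carry a multiplier of 1.00, only the attributes that deviate from nominal need to be listed.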

 Detailed COCOMO

Detailed COCOMO incorporates all characteristics of the intermediate version, together with an assessment of each cost driver's impact on each step (analysis, design, etc.) of the software engineering process.


Tuesday, March 2, 2010

In the early days of computing, software costs constituted a small percentage of the
overall computer-based system cost. An order of magnitude error in estimates of
software cost had relatively little impact. Today, software is the most expensive element
of virtually all computer-based systems. For complex, custom systems, a large
cost estimation error can make the difference between profit and loss. Cost overrun
can be disastrous for the developer.
Software cost and effort estimation will never be an exact science. Too many variables—
human, technical, environmental, political—can affect the ultimate cost of
software and effort applied to develop it. However, software project estimation can
be transformed from a black art to a series of systematic steps that provide estimates
with acceptable risk.
To achieve reliable cost and effort estimates, a number of options arise:
1. Delay estimation until late in the project (obviously, we can achieve
100% accurate estimates after the project is complete!).
2. Base estimates on similar projects that have already been completed.
3. Use relatively simple decomposition techniques to generate project cost and
effort estimates.
4. Use one or more empirical models for software cost and effort estimation.
Unfortunately, the first option, however attractive, is not practical. Cost estimates
must be provided "up front." However, we should recognize that the longer we wait,
the more we know, and the more we know, the less likely we are to make serious
errors in our estimates.
The second option can work reasonably well, if the current project is quite similar
to past efforts and other project influences (e.g., the customer, business conditions,
the SEE, deadlines) are equivalent. Unfortunately, past experience has not
always been a good indicator of future results.
The remaining options are viable approaches to software project estimation. Ideally,
the techniques noted for each option should be applied in tandem; each used as
a cross-check for the other. Decomposition techniques take a "divide and conquer"
approach to software project estimation. By decomposing a project into major functions
and related software engineering activities, cost and effort estimation can be
performed in a stepwise fashion. Empirical estimation models can be used to complement
decomposition techniques and offer a potentially valuable estimation
approach in their own right. A model is based on experience (historical data) and
takes the form
                               d = f(v_i)
where d is one of a number of estimated values (e.g., effort, cost, project duration)
and v_i are selected independent parameters (e.g., estimated LOC or FP).
Automated estimation tools implement one or more decomposition techniques or
empirical models. When combined with a graphical user interface, automated tools
provide an attractive option for estimating. In such systems, the characteristics of the
development organization (e.g., experience, environment) and the software to be
developed are described. Cost and effort estimates are derived from these data.
Each of the viable software cost estimation options is only as good as the historical
data used to seed the estimate. If no historical data exist, costing rests on a very
shaky foundation. In Chapter 4, we examined the characteristics of some of the software
metrics that provide the basis for historical estimation data.
Software project estimation is a form of problem solving, and in most cases, the
problem to be solved (i.e., developing a cost and effort estimate for a software project)
is too complex to be considered in one piece. For this reason, we decompose
the problem, recharacterizing it as a set of smaller (and hopefully, more manageable)
problems. The decomposition approach was discussed from two different points
of view: decomposition of the problem and decomposition of the process. Estimation
uses one or both forms of partitioning. But before an estimate can be made, the
project planner must understand the scope of the software to be built and generate
an estimate of its “size.”
 Software Sizing
The accuracy of a software project estimate is predicated on a number of things: (1)
the degree to which the planner has properly estimated the size of the product to be
built; (2) the ability to translate the size estimate into human effort, calendar time,
and dollars (a function of the availability of reliable software metrics from past projects);
(3) the degree to which the project plan reflects the abilities of the software
team; and (4) the stability of product requirements and the environment that supports
the software engineering effort.
In this section, we consider the software sizing problem. Because a project estimate
is only as good as the estimate of the size of the work to be accomplished, sizing
represents the project planner’s first major challenge. In the context of project
planning, size refers to a quantifiable outcome of the software project. If a direct
approach is taken, size can be measured in LOC. If an indirect approach is chosen,
size is represented as FP.

  • “Fuzzy logic” sizing. This approach uses the approximate reasoning techniques
    that are the cornerstone of fuzzy logic. To apply this approach, the
    planner must identify the type of application, establish its magnitude on a
    qualitative scale, and then refine the magnitude within the original range.
    Although personal experience can be used, the planner should also have
    access to a historical database of projects so that estimates can be compared
    to actual experience.
  • Function point sizing. The planner develops estimates of the information
    domain characteristics discussed in Chapter 4.
  • Standard component sizing. Software is composed of a number of different
    “standard components” that are generic to a particular application area.
    For example, the standard components for an information system are subsystems,
    modules, screens, reports, interactive programs, batch programs, files,
    LOC, and object-level instructions. The project planner estimates the number
    of occurrences of each standard component and then uses historical project
    data to determine the delivered size per standard component. To illustrate,
    consider an information systems application. The planner estimates that 18
    reports will be generated. Historical data indicates that 967 lines of COBOL
    [PUT92] are required per report. This enables the planner to estimate that
    17,000 LOC will be required for the reports component. Similar estimates and
    computation are made for other standard components, and a combined size
    value (adjusted statistically) results.
  • Change sizing. This approach is used when a project encompasses the use
    of existing software that must be modified in some way as part of a project.
    The planner estimates the number and type (e.g., reuse, adding code, changing
    code, deleting code) of modifications that must be accomplished. Using
    an “effort ratio” [PUT92] for each type of change, the size of the change may
    be estimated.
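The arithmetic behind standard component sizing is a sum of count times historical size per component. Here is a minimal sketch using the reports figures from the text; the function name and dictionary layout are assumptions, and the statistical adjustment mentioned above is not modelled.

```python
def standard_component_size(counts, loc_per_component):
    """Sum of (estimated occurrences x historical LOC) over all components."""
    return sum(counts[name] * loc_per_component[name] for name in counts)


# Figures from the text: 18 reports at 967 lines of COBOL per report.
counts = {"reports": 18}
loc_per_component = {"reports": 967}

reports_loc = standard_component_size(counts, loc_per_component)
print(reports_loc)  # 17406, which the text rounds to 17,000 LOC
```

Further components (screens, batch programs, and so on) would simply be added as extra entries in both dictionaries, each with its own historical size.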

An estimation model for computer software uses empirically derived formulas to predict
effort as a function of LOC or FP. Values for LOC or FP are estimated using the
approach described in Sections 5.6.2 and 5.6.3. But instead of using the tables described
in those sections, the resultant values for LOC or FP are plugged into the estimation model.
The empirical data that support most estimation models are derived from a limited
sample of projects. For this reason, no estimation model is appropriate for all
classes of software and in all development environments. Therefore, the results
obtained from such models must be used judiciously.
5.7.1 The Structure of Estimation Models
A typical estimation model is derived using regression analysis on data collected from
past software projects. The overall structure of such models takes the form [MAT94]
E = A + B × (ev)^C (5-2)
where A, B, and C are empirically derived constants, E is effort in person-months, and
ev is the estimation variable (either LOC or FP). In addition to the relationship noted
in Equation (5-2), the majority of estimation models have some form of project adjustment component that enables E to be adjusted by other project characteristics (e.g.,
problem complexity, staff experience, development environment). Among the many
LOC-oriented estimation models proposed in the literature are
    E = 5.2 × (KLOC)^{0.91} Walston-Felix model
    E = 5.5 + 0.73 × (KLOC)^{1.16} Bailey-Basili model
    E = 3.2 × (KLOC)^{1.05} Boehm simple model
    E = 5.288 × (KLOC)^{1.047} Doty model for KLOC > 9
FP-oriented models have also been proposed. These include
    E = -13.39 + 0.0545 FP Albrecht and Gaffney model
    E = 60.62 × 7.728 × 10^{-8} FP^3 Kemerer model
    E = 585.7 + 15.12 FP Matson, Barnett, and Mellichamp model
A quick examination of these models indicates that each will yield a different result
for the same values of LOC or FP. The implication is clear. Estimation models must be calibrated for local needs!
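The spread between the models is easy to see by evaluating the LOC-oriented formulas for one size value. The sketch below does this for a hypothetical 32 KLOC project; the dictionary and function names are illustrative assumptions, and the constants are those listed above.

```python
# LOC-oriented models from the list above; each maps KLOC to effort (person-months).
LOC_MODELS = {
    "Walston-Felix": lambda kloc: 5.2 * kloc ** 0.91,
    "Bailey-Basili": lambda kloc: 5.5 + 0.73 * kloc ** 1.16,
    "Boehm simple": lambda kloc: 3.2 * kloc ** 1.05,
    "Doty (KLOC > 9)": lambda kloc: 5.288 * kloc ** 1.047,
}


def compare_models(kloc):
    """Evaluate every LOC-oriented model for the same size estimate."""
    return {name: model(kloc) for name, model in LOC_MODELS.items()}


# Hypothetical project size of 32 KLOC.
estimates = compare_models(32)
for name, effort in estimates.items():
    print(f"{name:16s} {effort:7.1f} person-months")
```

For this input the Bailey-Basili estimate is well under half of the Boehm estimate, while the Doty estimate is considerably higher, which is exactly why calibration against local project data matters.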

The COCOMO Model
In his classic book on “software engineering economics,” Barry Boehm [BOE81] introduced
a hierarchy of software estimation models bearing the name COCOMO, for
COnstructive COst MOdel. The original COCOMO model became one of the most widely
used and discussed software cost estimation models in the industry. It has evolved
into a more comprehensive estimation model, called COCOMO II [BOE96, BOE00].
Like its predecessor, COCOMO II is actually a hierarchy of estimation models that
address the following areas:
  • Application composition model. Used during the early stages of software
    engineering, when prototyping of user interfaces, consideration of software
    and system interaction, assessment of performance, and evaluation of technology
    maturity are paramount.
  • Early design stage model. Used once requirements have been stabilized
    and basic software architecture has been established.
  • Post-architecture-stage model. Used during the construction of the software.
Like all estimation models for software, the COCOMO II models require sizing information.
Three different sizing options are available as part of the model hierarchy:
object points, function points, and lines of source code.
The COCOMO II application composition model uses object points and is
illustrated in the following paragraphs. It should be noted that other, more sophisticated estimation models (using FP and KLOC) are also available as part of COCOMO II.
Like function points (Chapter 4), the object point is an indirect software measure
that is computed using counts of the number of (1) screens (at the user interface), (2)
reports, and (3) components likely to be required to build the application. Each object
instance (e.g., a screen or report) is classified into one of three complexity levels (i.e.,
simple, medium, or difficult) using criteria suggested by Boehm [BOE96]. In essence,
complexity is a function of the number and source of the client and server data tables
that are required to generate the screen or report and the number of views or sections
presented as part of the screen or report.
Once complexity is determined, the number of screens, reports, and components
are weighted according to Table 5.1. The object point count is then determined by
multiplying the original number of object instances by the weighting factor in Table
5.1 and summing to obtain a total object point count. When component-based development
or general software reuse is to be applied, the percent of reuse (%reuse) is
estimated and the object point count is adjusted:
       NOP= (object points) x [(100 -%reuse)/100]
           where NOP is defined as new object points.
To derive an estimate of effort based on the computed NOP value, a “productivity
rate” must be derived. Table 5.2 presents the productivity rate
PROD = NOP/person-month
for different levels of developer experience and development environment maturity.
Once the productivity rate has been determined, an estimate of project effort can be
derived as
estimated effort = NOP/PROD
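The object point calculation can be sketched in a few lines. Tables 5.1 and 5.2 are not reproduced in this post, so the complexity weights and the productivity rate used below are placeholder values chosen only for illustration, as are the function names and the instance counts.

```python
def object_point_count(instances, weights):
    """Sum of (number of object instances x complexity weight)."""
    return sum(count * weights[key] for key, count in instances.items())


def new_object_points(object_points, percent_reuse):
    """NOP = (object points) x [(100 - %reuse) / 100]."""
    return object_points * (100 - percent_reuse) / 100


def estimated_effort(nop, prod_rate):
    """Estimated effort in person-months = NOP / PROD."""
    return nop / prod_rate


# Assumed weights per (object type, complexity); stand-ins for Table 5.1.
weights = {
    ("screen", "simple"): 1,
    ("screen", "medium"): 2,
    ("report", "medium"): 5,
    ("component", "difficult"): 10,
}
# Hypothetical application: counts of object instances by type and complexity.
instances = {
    ("screen", "simple"): 4,
    ("screen", "medium"): 3,
    ("report", "medium"): 2,
    ("component", "difficult"): 1,
}

points = object_point_count(instances, weights)      # 4 + 6 + 10 + 10 = 30
nop = new_object_points(points, percent_reuse=20)    # 30 x 0.8 = 24.0
effort_pm = estimated_effort(nop, prod_rate=12)      # assumed PROD of 12 NOP/person-month
print(points, nop, effort_pm)
```

In practice the weights and the productivity rate would be looked up from the tables for the rated complexity, developer experience and environment maturity rather than hard-coded.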


The decomposition techniques and empirical estimation models described in the preceding
sections are available as part of a wide variety of software tools. These automated
estimation tools allow the planner to estimate cost and effort and to perform
"what-if" analyses for important project variables such as delivery date or staffing.
Although many automated estimation tools exist, all exhibit the same general characteristics
and all perform the following six generic functions [JON96]:
1. Sizing of project deliverables. The “size” of one or more software work
products is estimated. Work products include the external representation of
software (e.g., screen, reports), the software itself (e.g., KLOC), functionality
delivered (e.g., function points), descriptive information (e.g. documents).
2. Selecting project activities. The appropriate process framework (Chapter
2) is selected and the software engineering task set is specified.
3. Predicting staffing levels. The number of people who will be available to
do the work is specified. Because the relationship between people available
and work (predicted effort) is highly nonlinear, this is an important input.
4. Predicting software effort. Estimation tools use one or more models (e.g.,
Section 5.7) that relate the size of the project deliverables to the effort
required to produce them.
5. Predicting software cost. Given the results of step 4, costs can be computed
by allocating labor rates to the project activities noted in step 2.
6. Predicting software schedules. When effort, staffing level, and project
activities are known, a draft schedule can be produced by allocating labor
across software engineering activities based on recommended models for
effort distribution.
When different estimation tools are applied to the same project data, a relatively
large variation in estimated results is encountered. More important, predicted values
sometimes are significantly different than actual values. This reinforces the notion
that the output of estimation tools should be used as one "data point

Monday, March 1, 2010

Effort and schedule Estimation.

In the early days of computing, software costs constituted a small percentage of the
overall computer-based system cost. An order of magnitude error in estimates of
software cost had relatively little impact. Today, software is the most expensive element
of virtually all computer-based systems. For complex, custom systems, a large
cost estimation error can make the difference between profit and loss. Cost overrun
can be disastrous for the developer
Software cost and effort estimation will never be an exact science. Too many variables—
human, technical, environmental, political—can affect the ultimate cost of
software and effort applied to develop it. However, software project estimation can
be transformed from a black art to a series of systematic steps that provide estimates
with acceptable risk.
To achieve reliable cost and effort estimates, a number of options arise:
1. Delay estimation until late in the project (obviously, we can achieve
100% accurate estimates after the project is complete!).
2. Base estimates on similar projects that have already been completed.
3. Use relatively simple decomposition techniques to generate project cost and
effort estimates.
4. Use one or more empirical models for software cost and effort estimation.
Unfortunately, the first option, however attractive, is not practical. Cost estimates
must be provided "up front." However, we should recognize that the longer we wait,
the more we know, and the more we know, the less likely we are to make serious
errors in our estimates.
The second option can work reasonably well, if the current project is quite similar
to past efforts and other project influences (e.g., the customer, business conditions,
the SEE, deadlines) are equivalent. Unfortunately, past experience has not
always been a good indicator of future results.
The remaining options are viable approaches to software project estimation. Ideally,
the techniques noted for each option should be applied in tandem; each used as
a cross-check for the other. Decomposition techniques take a "divide and conquer"
approach to software project estimation. By decomposing a project into major functions
and related software engineering activities, cost and effort estimation can be
performed in a stepwise fashion. Empirical estimation models can be used to complement
decomposition techniques and offer a potentially valuable estimation
approach in their own right. A model is based on experience (historical data) and
takes the form
                               d = f (vi)
where d is one of a number of estimated values (e.g., effort, cost, project duration)
and vi are selected independent parameters (e.g., estimated LOC or FP).
Automated estimation tools implement one or more decomposition techniques or
empirical models. When combined with a graphical user interface, automated tools
provide an attractive option for estimating. In such systems, the characteristics of the
development organization (e.g., experience, environment) and the software to be
developed are described. Cost and effort estimates are derived from these data.
Each of the viable software cost estimation options is only as good as the historical
data used to seed the estimate. If no historical data exist, costing rests on a very
shaky foundation. In Chapter 4, we examined the characteristics of some of the software
metrics that provide the basis for historical estimation data.
Software project estimation is a form of problem solving, and in most cases, the
problem to be solved (i.e., developing a cost and effort estimate for a software project)
is too complex to be considered in one piece. For this reason, we decompose
the problem, recharacterizing it as a set of smaller (and hopefully, more manageable)
The decomposition approach was discussed from two different points
of view: decomposition of the problem and decomposition of the process. Estimation
uses one or both forms of partitioning. But before an estimate can be made, the
project planner must understand the scope of the software to be built and generate
an estimate of its “size.”
 Software Sizing
The accuracy of a software project estimate is predicated on a number of things: (1)
the degree to which the planner has properly estimated the size of the product to be
built; (2) the ability to translate the size estimate into human effort, calendar time,
and dollars (a function of the availability of reliable software metrics from past projects);
(3) the degree to which the project plan reflects the abilities of the software
team; and (4) the stability of product requirements and the environment that supports
the software engineering effort.
Estimation tools.
In this section, we consider the software sizing problem. Because a project estimate
is only as good as the estimate of the size of the work to be accomplished, sizing
represents the project planner’s first major challenge. In the context of project
planning, size refers to a quantifiable outcome of the software project. If a direct
approach is taken, size can be measured in LOC. If an indirect approach is chosen,
size is represented as FP.

“Fuzzy logic” sizing. This approach uses the approximate reasoning techniques
that are the cornerstone of fuzzy logic. To apply this approach, the
planner must identify the type of application, establish its magnitude on a
qualitative scale, and then refine the magnitude within the original range.
Although personal experience can be used, the planner should also have
access to a historical database of projects8 so that estimates can be compared
to actual experience.
Function point sizing. The planner develops estimates of the information
domain characteristics discussed in Chapter 4.
Standard component sizing. Software is composed of a number of different
“standard components” that are generic to a particular application area.
For example, the standard components for an information system are subsystems,
modules, screens, reports, interactive programs, batch programs, files,
LOC, and object-level instructions. The project planner estimates the number
of occurrences of each standard component and then uses historical project
data to determine the delivered size per standard component. To illustrate,
consider an information systems application. The planner estimates that 18
reports will be generated. Historical data indicates that 967 lines of COBOL
[PUT92] are required per report. This enables the planner to estimate that
17,000 LOC will be required for the reports component. Similar estimates and
computation are made for other standard components, and a combined size
value (adjusted statistically) results.
Change sizing. This approach is used when a project encompasses the use
of existing software that must be modified in some way as part of a project.
The planner estimates the number and type (e.g., reuse, adding code, changing
code, deleting code) of modifications that must be accomplished. Using
an “effort ratio” [PUT92] for each type of change, the size of the change may
be estimated.
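The standard component calculation above can be sketched in a few lines of Python. Only the report figures (18 reports, 967 LOC per report) come from the text; the function name and dictionary layout are illustrative assumptions, and the statistical adjustment mentioned above is not modeled.

```python
# Standard component sizing: size = occurrences x historical LOC per occurrence.
# Function name and data layout are hypothetical; only the report numbers
# (18 reports, 967 LOC each) are taken from the text.

def component_size(occurrences, loc_per_component):
    """Return estimated LOC per component type and the unadjusted total."""
    sizes = {name: count * loc_per_component[name]
             for name, count in occurrences.items()}
    return sizes, sum(sizes.values())


occurrences = {"report": 18}
loc_per_component = {"report": 967}

sizes, total = component_size(occurrences, loc_per_component)
print(sizes["report"])  # 17406, which the text rounds to 17,000 LOC
```

Other standard components (screens, batch programs, and so on) would be added as further dictionary entries once historical data for them is available.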

An estimation model for computer software uses empirically derived formulas to predict
effort as a function of LOC or FP. Values for LOC or FP are estimated using the
approach described in Sections 5.6.2 and 5.6.3. But instead of using the tables described
in those sections, the resultant values for LOC or FP are plugged into the estimation model.
The empirical data that support most estimation models are derived from a limited
sample of projects. For this reason, no estimation model is appropriate for all
classes of software and in all development environments. Therefore, the results
obtained from such models must be used judiciously.
5.7.1 The Structure of Estimation Models
A typical estimation model is derived using regression analysis on data collected from
past software projects. The overall structure of such models takes the form [MAT94]
$E = A + B \times (ev)^{C}$    (5-2)
where A, B, and C are empirically derived constants, E is effort in person-months, and
ev is the estimation variable (either LOC or FP). In addition to the relationship noted
in Equation (5-2), the majority of estimation models have some form of project adjustment component that enables E to be adjusted by other project characteristics (e.g.,
problem complexity, staff experience, development environment). Among the many
LOC-oriented estimation models proposed in the literature are
    $E = 5.2 \times (\mathrm{KLOC})^{0.91}$ Walston-Felix model
    $E = 5.5 + 0.73 \times (\mathrm{KLOC})^{1.16}$ Bailey-Basili model
    $E = 3.2 \times (\mathrm{KLOC})^{1.05}$ Boehm simple model
    $E = 5.288 \times (\mathrm{KLOC})^{1.047}$ Doty model for KLOC > 9
FP-oriented models have also been proposed. These include
    $E = -13.39 + 0.0545\,\mathrm{FP}$ Albrecht and Gaffney model
    $E = 60.62 \times 7.728 \times 10^{-8}\,\mathrm{FP}^{3}$ Kemerer model
    $E = 585.7 + 15.12\,\mathrm{FP}$ Matson, Barnett, and Mellichamp model
A quick examination of these models indicates that each will yield a different result
for the same values of LOC or FP. The implication is clear. Estimation models must be calibrated for local needs!
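To see how far the models diverge, the formulas listed above can be evaluated side by side. The sketch below copies the constants from the text (the Kemerer formula is used exactly as printed); the input values of 33.2 KLOC and 375 FP, and the helper name `compare`, are illustrative assumptions.

```python
# Effort in person-months predicted by each model for the same size input.
# Constants are copied from the text; the inputs below are made-up examples.

LOC_MODELS = {
    "Walston-Felix": lambda kloc: 5.2 * kloc ** 0.91,
    "Bailey-Basili": lambda kloc: 5.5 + 0.73 * kloc ** 1.16,
    "Boehm simple": lambda kloc: 3.2 * kloc ** 1.05,
    "Doty (KLOC > 9)": lambda kloc: 5.288 * kloc ** 1.047,
}

FP_MODELS = {
    "Albrecht-Gaffney": lambda fp: -13.39 + 0.0545 * fp,
    "Kemerer": lambda fp: 60.62 * 7.728e-8 * fp ** 3,
    "Matson-Barnett-Mellichamp": lambda fp: 585.7 + 15.12 * fp,
}


def compare(models, size):
    """Evaluate every model in `models` for the same size value."""
    return {name: model(size) for name, model in models.items()}


for name, effort in compare(LOC_MODELS, 33.2).items():
    print(f"{name:28s} {effort:10.1f} person-months")
for name, effort in compare(FP_MODELS, 375).items():
    print(f"{name:28s} {effort:10.1f} person-months")
```

The spread in the printed values is the point of the exercise: without calibration against local project data, the choice of model dominates the estimate.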

The COCOMO Model
In his classic book on “software engineering economics,” Barry Boehm [BOE81] introduced
a hierarchy of software estimation models bearing the name COCOMO, for
COnstructive COst MOdel. The original COCOMO model became one of the most widely
used and discussed software cost estimation models in the industry. It has evolved
into a more comprehensive estimation model, called COCOMO II [BOE96, BOE00].
Like its predecessor, COCOMO II is actually a hierarchy of estimation models that
address the following areas:
Application composition model. Used during the early stages of software
engineering, when prototyping of user interfaces, consideration of software
and system interaction, assessment of performance, and evaluation of technology
maturity are paramount.
Early design stage model. Used once requirements have been stabilized
and basic software architecture has been established.
Post-architecture-stage model. Used during the construction of the software.
Like all estimation models for software, the COCOMO II models require sizing information.
Three different sizing options are available as part of the model hierarchy:
object points, function points, and lines of source code.
The COCOMO II application composition model uses object points and is
illustrated in the following paragraphs. It should be noted that other, more sophisticated estimation models (using FP and KLOC) are also available as part of COCOMO II.
Like function points (Chapter 4), the object point is an indirect software measure
that is computed using counts of the number of (1) screens (at the user interface), (2)
reports, and (3) components likely to be required to build the application. Each object
instance (e.g., a screen or report) is classified into one of three complexity levels (i.e.,
simple, medium, or difficult) using criteria suggested by Boehm [BOE96]. In essence,
complexity is a function of the number and source of the client and server data tables
that are required to generate the screen or report and the number of views or sections
presented as part of the screen or report.
Once complexity is determined, the number of screens, reports, and components
are weighted according to Table 5.1. The object point count is then determined by
multiplying the original number of object instances by the weighting factor in Table
5.1 and summing to obtain a total object point count. When component-based development
or general software reuse is to be applied, the percent of reuse (%reuse) is
estimated and the object point count is adjusted:
       $\mathrm{NOP} = (\text{object points}) \times [(100 - \%\text{reuse})/100]$
           where NOP is defined as new object points.
To derive an estimate of effort based on the computed NOP value, a “productivity
rate” must be derived. Table 5.2 presents the productivity rate
PROD = NOP/person-month
for different levels of developer experience and development environment maturity.
Once the productivity rate has been determined, an estimate of project effort can be
derived as
estimated effort = NOP/PROD
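The object point calculation can be walked through end to end as follows. Tables 5.1 and 5.2 are not reproduced in this post, so the weights and the productivity rate below are assumptions (values commonly quoted for the COCOMO II application composition model) and should be checked against the tables before use. The counts and the 20 percent reuse figure are invented for illustration.

```python
# Object point estimate: weight the counts, adjust for reuse, divide by PROD.
# ASSUMPTION: weights and PROD are commonly quoted COCOMO II values, not
# taken from this text; verify them against Tables 5.1 and 5.2.

WEIGHTS = {
    "screen": {"simple": 1, "medium": 2, "difficult": 3},
    "report": {"simple": 2, "medium": 5, "difficult": 8},
    "component": {"difficult": 10},
}


def object_points(counts, weights=WEIGHTS):
    """Sum count x weight over every (object type, complexity) pair."""
    return sum(n * weights[kind][level] for (kind, level), n in counts.items())


def new_object_points(points, percent_reuse):
    """NOP = (object points) x [(100 - %reuse) / 100]."""
    return points * (100 - percent_reuse) / 100


def estimated_effort(nop, prod):
    """Effort in person-months = NOP / PROD."""
    return nop / prod


counts = {
    ("screen", "simple"): 4,
    ("screen", "medium"): 3,
    ("report", "medium"): 2,
    ("component", "difficult"): 1,
}
points = object_points(counts)              # 4*1 + 3*2 + 2*5 + 1*10 = 30
nop = new_object_points(points, 20)         # 30 * 0.8 = 24.0
effort = estimated_effort(nop, prod=13)     # assumed nominal productivity rate
print(points, nop, round(effort, 2))
```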


The decomposition techniques and empirical estimation models described in the preceding
sections are available as part of a wide variety of software tools. These automated
estimation tools allow the planner to estimate cost and effort and to perform
"what-if" analyses for important project variables such as delivery date or staffing.
Although many automated estimation tools exist, all exhibit the same general characteristics
and all perform the following six generic functions [JON96]:
1. Sizing of project deliverables. The “size” of one or more software work
products is estimated. Work products include the external representation of
software (e.g., screen, reports), the software itself (e.g., KLOC), functionality
delivered (e.g., function points), and descriptive information (e.g., documents).
2. Selecting project activities. The appropriate process framework (Chapter
2) is selected and the software engineering task set is specified.
3. Predicting staffing levels. The number of people who will be available to
do the work is specified. Because the relationship between people available
and work (predicted effort) is highly nonlinear, this is an important input.
4. Predicting software effort. Estimation tools use one or more models (e.g.,
Section 5.7) that relate the size of the project deliverables to the effort
required to produce them.
5. Predicting software cost. Given the results of step 4, costs can be computed
by allocating labor rates to the project activities noted in step 2.
6. Predicting software schedules. When effort, staffing level, and project
activities are known, a draft schedule can be produced by allocating labor
across software engineering activities based on recommended models for
effort distribution.
When different estimation tools are applied to the same project data, a relatively
large variation in estimated results is encountered. More important, predicted values
sometimes are significantly different than actual values. This reinforces the notion
that the output of estimation tools should be used as one "data point" rather than as the sole source of an estimate.

Saturday, February 13, 2010

Software bug

A software bug is the common term used to describe an error, flaw, mistake, failure, or fault in a computer program or system that produces an incorrect or unexpected result, or causes it to behave in unintended ways. Most bugs arise from mistakes and errors made by people in either a program's source code or its design, and a few are caused by compilers producing incorrect code. A program that contains a large number of bugs, and/or bugs that seriously interfere with its functionality, is said to be buggy. Reports detailing bugs in a program are commonly known as bug reports, fault reports, problem reports, trouble reports, change requests, and so forth.


Bugs are a consequence of the nature of human factors in the programming task. They arise from oversights or mutual misunderstandings made by a software team during specification, design, coding, data entry and documentation. For example, in creating a relatively simple program to sort a list of words into alphabetical order, one's design might fail to consider what should happen when a word contains a hyphen. Perhaps, when converting the abstract design into the chosen programming language, one might inadvertently create an off-by-one error and fail to sort the last word in the list. Finally, when typing the resulting program into the computer, one might accidentally type a '<' where a '>' was intended, perhaps resulting in the words being sorted into reverse alphabetical order.
More complex bugs can arise from unintended interactions between different parts of a computer program. This frequently occurs because computer programs can be complex (millions of lines long in some cases), often having been programmed by many people over a great length of time, so that programmers are unable to mentally track every possible way in which parts can interact. Another category of bug, called a race condition, comes about when a process runs in more than one thread, or two or more processes run simultaneously, and the exact order of execution of the critical sequences of code has not been properly synchronized.
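The word-sorting mistakes described above are easy to reproduce. The following sketch uses a hand-written selection sort purely for illustration (the function names are hypothetical); one variant contains the off-by-one error and another the wrong comparison operator.

```python
# Three versions of the same selection sort: correct, off-by-one, wrong operator.

def sort_words(words):
    """Correct version: every position, including the last, is considered."""
    items = list(words)
    n = len(items)
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


def sort_words_off_by_one(words):
    """Off-by-one bug: the loop bound stops one short, so the last word is ignored."""
    items = list(words)
    n = len(items) - 1  # BUG: should be len(items)
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            if items[j] < items[smallest]:
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


def sort_words_wrong_operator(words):
    """Typo bug: '>' where '<' was intended gives reverse alphabetical order."""
    items = list(words)
    n = len(items)
    for i in range(n):
        smallest = i
        for j in range(i + 1, n):
            if items[j] > items[smallest]:  # BUG: should be '<'
                smallest = j
        items[i], items[smallest] = items[smallest], items[i]
    return items


words = ["fig", "pear", "apple"]
print(sort_words(words))                 # ['apple', 'fig', 'pear']
print(sort_words_off_by_one(words))      # ['fig', 'pear', 'apple']
print(sort_words_wrong_operator(words))  # ['pear', 'fig', 'apple']
```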
The software industry has put much effort into finding methods for preventing programmers from inadvertently introducing bugs while writing software.[11][12] These include:
Programming style
While typos in the program code most likely are caught by the compiler, a bug usually appears when the programmer makes a logic error. Various innovations in programming style and defensive programming are designed to make these bugs less likely, or easier to spot.
Programming techniques
Bugs often create inconsistencies in the internal data of a running program. Programs can be written to check the consistency of their own internal data while running. If an inconsistency is encountered, the program can immediately halt, so that the bug can be located and fixed. Alternatively, the program can simply inform the user, attempt to correct the inconsistency, and continue running.
Development methodologies
There are several schemes for managing programmer activity, so that fewer bugs are produced. Many of these fall under the discipline of software engineering (which addresses software design issues as well). For example, formal program specifications are used to state the exact behavior of programs, so that design bugs can be eliminated. Unfortunately, formal specifications are impractical or impossible for anything but the shortest programs, because of problems of combinatorial explosion and indeterminacy.
Programming language support
Programming languages often include features which help programmers prevent bugs, such as static type systems, restricted name spaces and modular programming, among others. For example, when a programmer writes (pseudocode) LET REAL_VALUE PI = "THREE AND A BIT", although this may be syntactically correct, the code fails a type check. Depending on the language and implementation, this may be caught by the compiler or at runtime. In addition, many recently-invented languages have deliberately excluded features which can easily lead to bugs, at the expense of making code slower than it need be: the general principle being that, because of Moore's law, computers get faster and software engineers get slower; it is almost always better to write simpler, slower code than "clever", inscrutable code, especially considering that maintenance cost is considerable. For example, the Java programming language does not support pointer arithmetic; implementations of some languages such as Pascal and scripting languages often have runtime bounds checking of arrays, at least in a debugging build.
Code analysis
Tools for code analysis help developers by inspecting the program text beyond the compiler's capabilities to spot potential problems. Although in general the problem of finding all programming errors given a specification is not solvable (see halting problem), these tools exploit the fact that human programmers tend to make the same kinds of mistakes when writing software.
Tools to monitor the performance of the software as it is running, either specifically to find problems such as bottlenecks or to give assurance as to correct working, may be embedded in the code explicitly (perhaps as simple as a statement saying PRINT "I AM HERE"), or provided as tools. It is often a surprise to find where most of the time is taken by a piece of code, and this removal of assumptions might cause the code to be rewritten.
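The self-checking approach described under "Programming techniques" can be illustrated with a small class that verifies its own internal data. This is a minimal sketch: the `Inventory` class and its fields are hypothetical, and it shows both policies mentioned above, halting on an inconsistency or repairing it and continuing.

```python
# A data structure that checks the consistency of its own internal data.
# Class and attribute names are hypothetical, chosen only for illustration.

class Inventory:
    """Tracks per-item counts plus a cached total that must always agree."""

    def __init__(self):
        self.counts = {}
        self.total = 0

    def add(self, item, quantity):
        self.counts[item] = self.counts.get(item, 0) + quantity
        self.total += quantity
        self.check()  # verify the invariant after every update

    def check(self, repair=False):
        """Return True if consistent; otherwise halt, or repair and return False."""
        expected = sum(self.counts.values())
        if self.total == expected:
            return True
        if repair:
            self.total = expected  # correct the inconsistency and continue
            return False
        raise RuntimeError(
            f"inconsistent inventory: total={self.total}, expected={expected}"
        )


inventory = Inventory()
inventory.add("bolt", 3)
inventory.total = 99  # simulate a bug corrupting internal data
try:
    inventory.check()
except RuntimeError as error:
    print("halted:", error)
inventory.check(repair=True)
print(inventory.total)  # 3
```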

Bug management

It is common practice for software to be released with known bugs that are considered non-critical, that is, that do not affect most users' main experience with the product. While software products may, by definition, contain any number of unknown bugs, measurements during testing can provide an estimate of the number of likely bugs remaining; this becomes more reliable the longer a product is tested and developed ("if we had 200 bugs last week, we should have 100 this week"). Most big software projects maintain two lists of "known bugs"— those known to the software team, and those to be told to users. This is not dissimulation, but users are not concerned with the internal workings of the product. The second list informs users about bugs that are not fixed in the current release, or not fixed at all, and a workaround may be offered.
There are various reasons for not fixing bugs:
  • The developers often don't have time or it is not economical to fix all non-severe bugs.
  • The bug could be fixed in a new version or patch that is not yet released.
  • The changes to the code required to fix the bug could be large, expensive, or delay finishing the project.
  • Even seemingly simple fixes bring the chance of introducing new unknown bugs into the system. At the end of a test/fix cycle some managers may only allow the most critical bugs to be fixed.
  • Users may be relying on the undocumented, buggy behavior, especially if scripts or macros rely on a behavior; it may introduce a breaking change.
  • It's "not a bug". A misunderstanding has arisen between expected and provided behavior.
Given the above, it is often considered impossible to write completely bug-free software of any real complexity. So bugs are categorized by severity, and low-severity non-critical bugs are tolerated, as they do not affect the proper operation of the system for most users. NASA's SATC managed to reduce the number of errors to fewer than 0.1 per 1000 lines of code (SLOC)[citation needed] but this was not felt to be feasible for any real world projects.
The severity of a bug is not the same as its importance for fixing, and the two should be measured and managed separately. On a Microsoft Windows system a blue screen of death is rather severe, but if it only occurs in extreme circumstances, especially if they are well diagnosed and avoidable, it may be less important to fix than an icon not representing its function well, which though purely aesthetic may confuse thousands of users every single day. This balance, of course, depends on many factors; expert users have different expectations from novices, a niche market is different from a general consumer market, and so on.
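One way to keep severity and fix priority separate is to record them as two independent fields on each bug record. The sketch below assumes a hypothetical `Bug` record with a 1-to-4 scale for each field (1 being most severe or most urgent); the two sample bugs mirror the blue screen and icon cases described above.

```python
# Severity (technical impact) and priority (fix order) tracked independently.
# The Bug record and the 1..4 scales are assumptions made for illustration.
from dataclasses import dataclass


@dataclass
class Bug:
    title: str
    severity: int  # 1 = most severe technical impact
    priority: int  # 1 = fix first


bugs = [
    Bug("Blue screen under rare, avoidable conditions", severity=1, priority=3),
    Bug("Icon does not represent its function", severity=4, priority=1),
]

by_severity = sorted(bugs, key=lambda bug: bug.severity)
fix_order = sorted(bugs, key=lambda bug: bug.priority)

print(by_severity[0].title)  # the blue screen is the most severe
print(fix_order[0].title)    # the icon is nevertheless fixed first
```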
A school of thought popularized by Eric S. Raymond as Linus's Law says that popular open-source software has more chance of having few or no bugs than other software, because "given enough eyeballs, all bugs are shallow". This assertion has been disputed, however: computer security specialist Elias Levy wrote that "it is easy to hide vulnerabilities in complex, little understood and undocumented source code," because, "even if people are reviewing the code, that doesn't mean they're qualified to do so."
Like any other part of engineering management, bug management must be conducted carefully and intelligently because "what gets measured gets done" and managing purely by bug counts can have unintended consequences. If, for example, developers are rewarded by the number of bugs they fix, they will naturally fix the easiest bugs first, leaving the hardest, and probably most risky or critical, to the last possible moment ("I only have one bug on my list but it says 'Make sun rise in West'"). If the management ethos is to reward the number of bugs fixed, then some developers may quickly write sloppy code knowing they can fix the bugs later and be rewarded for it, whereas careful, perhaps "slower" developers do not get rewarded for the bugs that were never there.

Tools, Techniques, and Metrics


Metrics in the area of software fault tolerance (or software faults) are generally poor. The data sets that have been analyzed in the past are surely not indicative of today's large and complex software systems. The analysis by [DeVale99] of various POSIX systems has the largest applicable data set found in the literature. The advantages of the [DeVale99] research are that the systems are commercially developed, the systems adhere to the same specification, and the systems are large enough that testing them shows an array of problems. Still, the [DeVale99] research is not without its problems: operating systems may be a more unusual case than application software, and operating systems may share more heritage from projects like Berkeley's Unix or the Open Software Foundation's research projects. The issue with gathering good metrics data is the cost involved in developing multiple versions of complex robust software. Operating systems offer the advantage of many organizations building their own versions of this complex software.
The results of the [DeVale99] and [Knight86] research show that software errors may be correlated in N-version software systems. The results of these studies imply that the failure mode for programmers is not unique, destroying a major tenet of the N-version software fault tolerance technique. It is important to remember, however, that the [Knight86] research, like most other publications in this area, is a case study, and may not cover a wide enough variety of software systems to be conclusive.


Software fault tolerance suffers from an extreme lack of tools to aid the programmer in making reliable systems. This lack of adequate tools is not very different from the general lack of functional tools in software development that go beyond an editor and a compiler. The ability to semi-automate the adding of fault tolerance into software would be a significant enhancement to the market today. One of the biggest issues facing the development of software fault tolerant systems is the cost currently required to develop these systems. Enhanced and functional tools that can easily accomplish their task would surely be welcomed in the marketplace.


Recovery Blocks

The recovery block method is a simple method developed by Randell from what was observed as somewhat current practice at the time [Lyu95]. The recovery block operates with an adjudicator which confirms the results of various implementations of the same algorithm. In a system with recovery blocks, the system view is broken down into fault recoverable blocks. The entire system is constructed of these fault tolerant blocks. Each block contains at least a primary, secondary, and exceptional case code along with an adjudicator. (It is important to note that this definition can be recursive, and that any component may be composed of another fault tolerant block composed of primary, secondary, exceptional case, and adjudicator components.) The adjudicator is the component which determines the correctness of the various blocks to try. The adjudicator should be kept somewhat simple in order to maintain execution speed and aid correctness. Upon entering a unit, the adjudicator first executes the primary alternate. (There may be N alternates in a unit which the adjudicator may try.) If the adjudicator determines that the primary block failed, it then tries to roll back the state of the system and tries the secondary alternate. If the adjudicator does not accept the results of any of the alternates, it then invokes the exception handler, which then indicates the fact that the software could not perform the requested operation.
Recovery block operation still has the same dependency which most software fault tolerance systems have: design diversity. The recovery block method increases the pressure on the specification to be specific enough to create different multiple alternatives that are functionally the same. This issue is further discussed in the context of the N-version method.
The recovery block system is also complicated by the fact that it requires the ability to roll back the state of the system from trying an alternate. This may be accomplished in a variety of ways, including hardware support for these operations. This try and rollback ability has the effect of making the software appear extremely transactional, in which only after a transaction is accepted is it committed to the system. There are advantages to a system built with a transactional nature, the largest of which is the difficulty of getting such a system into an incorrect or unstable state. This property, in combination with checkpointing and recovery, may aid in constructing a distributed hardware fault tolerant system.
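The control flow of a recovery block can be sketched as follows. This is a simplified illustration, not Randell's original formulation: rollback is simulated by deep-copying a checkpoint of the state, and the function names, the dictionary-based state, and the deliberately faulty primary alternate are all assumptions made for the example.

```python
# Minimal recovery block: try alternates in order, roll back between attempts,
# and raise an exception if the adjudicator accepts none of them.
import copy


class RecoveryBlockError(Exception):
    """Raised when no alternate passes the acceptance test."""


def recovery_block(state, alternates, acceptance_test):
    """Run each alternate on a fresh copy of the checkpointed state."""
    checkpoint = copy.deepcopy(state)          # state to roll back to
    for alternate in alternates:               # primary first, then secondary, ...
        candidate = copy.deepcopy(checkpoint)  # rollback: restart from the checkpoint
        try:
            alternate(candidate)
        except Exception:
            continue                           # a crash counts as a failed alternate
        if acceptance_test(checkpoint, candidate):
            return candidate                   # commit the accepted result
    raise RecoveryBlockError("no alternate produced an acceptable result")


def primary(state):
    # Deliberately faulty "fast" implementation: it loses the last element.
    state["data"] = sorted(state["data"])[:-1]


def secondary(state):
    state["data"] = sorted(state["data"])


def acceptance_test(before, after):
    """Simple adjudicator: output must be ordered and keep every element."""
    data = after["data"]
    in_order = all(a <= b for a, b in zip(data, data[1:]))
    return in_order and len(data) == len(before["data"])


result = recovery_block({"data": [3, 1, 2]}, [primary, secondary], acceptance_test)
print(result)  # {'data': [1, 2, 3]}
```

Because each alternate works on its own copy, a rejected attempt leaves no trace in the committed state, which is the transactional behavior described above. A real system would likely use cheaper checkpointing than a full deep copy.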

Capability Maturity Model

Capability Maturity Model Integration (CMMI) is a process improvement approach that provides organizations with the essential elements of effective processes that ultimately improve their performance. CMMI can be used to guide process improvement across a project, a division, or an entire organization.
CMMI in software engineering and organizational development is a trademarked process improvement approach that provides organizations with the essential elements for effective process improvement.
CMMI, according to the Software Engineering Institute (SEI, 2008), helps "integrate traditionally separate organizational functions, set process improvement goals and priorities, provide guidance for quality processes, and provide a point of reference for appraising current processes."[2]

CMMI currently addresses three areas of interest:
  1. Product and service development — CMMI for Development (CMMI-DEV),
  2. Service establishment, management, and delivery — CMMI for Services (CMMI-SVC), and
  3. Product and service acquisition — CMMI for Acquisition (CMMI-ACQ).
CMMI was developed by a group of experts from industry, government, and the Software Engineering Institute (SEI) at Carnegie Mellon University. CMMI models provide guidance for developing or improving processes that meet the business goals of an organization. A CMMI model may also be used as a framework for appraising the process maturity of the organization.[1]
CMMI originated in software engineering but has been highly generalised over the years to embrace other areas of interest, such as the development of hardware products, the delivery of all kinds of services, and the acquisition of products and services. The word "software" does not appear in definitions of CMMI. This generalization of improvement concepts makes CMMI extremely abstract. It is not as specific to software engineering as its predecessor, the Software CMM (CMM, see below)...
CMMI was developed by the CMMI project, which aimed to improve the usability of maturity models by integrating many different models into one framework. The project consisted of members of industry, government and the Carnegie Mellon Software Engineering Institute (SEI). The main sponsors included the Office of the Secretary of Defense (OSD) and the National Defense Industrial Association.
CMMI is the successor of the capability maturity model (CMM) or software CMM. The CMM was developed from 1987 until 1997. In 2002, CMMI Version 1.1 was released. Version 1.2 followed in August 2006.
 CMMI topics
 CMMI representation
CMMI exists in two representations: continuous and staged.[1] The continuous representation is designed to allow the user to focus on the specific processes that are considered important for the organization's immediate business objectives, or those to which the organization assigns a high degree of risk. The staged representation is designed to provide a standard sequence of improvements, and can serve as a basis for comparing the maturity of different projects and organizations. The staged representation also provides for an easy migration from the SW-CMM to CMMI.[1]
 CMMI model framework
Depending on the CMMI constellation (acquisition, services, development) used, the process areas it contains will vary. Key process areas are the areas that will be covered by the organization's processes. The table below lists the process areas that are present in all CMMI constellations. This collection of sixteen process areas is called the CMMI Model Framework, or CMF.
Capability Maturity Model Integration (CMMI) Model Framework (CMF)
  • Requirements Management
  • Project Monitoring and Control (Project Management)
  • Project Planning (Project Management)
  • Configuration Management
  • Measurement and Analysis
  • Process and Product Quality Assurance
  • Organizational Process Definition (Process Management)
  • Causal Analysis
 CMMI models
CMMI best practices are published in documents called models, each of which addresses a different area of interest. The current release of CMMI, version 1.2, provides models for three areas of interest: development, acquisition, and services.
  • CMMI for Development (CMMI-DEV), v1.2 was released in August 2006. It addresses product and service development processes.
  • CMMI for Acquisition (CMMI-ACQ), v1.2 was released in November 2007. It addresses supply chain management, acquisition, and outsourcing processes in government and industry.
  • CMMI for Services (CMMI-SVC), v1.2 was released in February 2009. It addresses guidance for delivering services within an organization and to external customers.
  • CMMI Product Suite (includes Development, Acquisition, and Services), v1.3 is expected to be released in 2010.
Regardless of which model an organization chooses, CMMI best practices should be adapted by an organization according to its business objectives.