
SURVEY 1

Software cost estimation for global software development study.


Author: Manal El Baita
Publisher: Software Project Management Research Team, ENSIAS, Mohammed V University, Rabat, Morocco. (15 April 2015).
Source: IEEE Xplore.

Abstract:
Software cost estimation plays a central role in the success of software project
management in the context of global software development (GSD). The importance of
mastering software cost estimation may appear to be obvious. However, as regards the
issue of customer satisfaction, end-users are often unsatisfied with software project
management results. In this paper, a systematic mapping study (SMS) is carried out
with the aim of summarising software cost estimation in the context of GSD research
by answering nine mapping questions. A total of 16 articles were selected and
classified according to nine criteria: publication source, publication year, research
type, research approach, contribution type, software cost estimation techniques,
software cost estimation activity, cost drivers and cost estimation performances for
GSD projects. The results show that the interest in estimating software cost for GSD
projects has increased in recent years and reveal that conferences are the most
frequently targeted publications. Most software cost estimation for GSD research has
focused on theory. The dominant contribution type of software cost estimation for
GSD research is that of models, while the predominant activity was identified as
being software development cost. Identifying empirical solutions to address software
cost estimation for GSD is a promising direction for researchers.
SURVEY 2

A Systematic Review of Software Development Cost Estimation Studies

Author: Martin John Shepperd, Brunel University London. (Feb 2007).

Source: IEEE Xplore.

Abstract:

This study aims to provide a basis for the improvement of software estimation
research through a systematic review of previous work. The review identifies 304
software cost estimation papers in 76 journals and classifies the papers according to
research topic, estimation approach, research approach, study context and data set.
Based on the review, we provide recommendations for future software cost estimation
research:

1) Increase the breadth of the search for relevant studies,

2) Search manually for relevant papers within a carefully selected set of journals when
completeness is essential,

3) Conduct more research on basic software cost estimation topics,

4) Conduct more studies of software cost estimation in real-life settings,

5) Conduct more studies on estimation methods commonly used by the software industry, and

6) Conduct fewer studies that evaluate methods based on arbitrarily chosen data sets.

In recent years, software has become the most expensive component of computer
system projects. The bulk of the cost of software development is due to the human
effort, and most cost estimation methods focus on this aspect and give estimates in
terms of person-months. Accurate software cost estimates are critical to both
developers and customers. They can be used for generating requests for proposals,
contract negotiations, scheduling, monitoring and control. Underestimating the costs
may result in management approving proposed systems that then exceed their budgets,
with underdeveloped functions and poor quality, and failure to complete on time.
Overestimating may result in too many resources committed to the project, or, during
contract bidding, result in not winning the contract, which can lead to loss of jobs.
Accurate cost estimation is important because:
• It can help to classify and prioritize development projects with respect to an overall
business plan.
• It can be used to determine what resources to commit to the project and how well
these resources will be used.
• It can be used to assess the impact of changes and support replanning.
• Projects can be easier to manage and control when resources are better matched to
real needs.
• Customers expect actual development costs to be in line with estimated costs.
Software cost estimation involves the determination of one or more of the
following estimates:
• effort (usually in person-months)
• project duration (in calendar time)
• cost.

Most cost estimation models attempt to generate an effort estimate, which can then be
converted into the project duration and cost. Although effort and cost are closely
related, they are not necessarily related by a simple transformation function. Effort is
often measured in person-months of the programmers, analysts and project managers.
This effort estimate can be converted into a dollar cost figure by calculating an average
salary per unit time of the staff involved, and then multiplying this by the estimated
effort required.
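The effort-to-cost conversion described above can be sketched in a few lines. The salary figure below is a hypothetical example value, not a quoted industry rate.

```python
# Convert an effort estimate (person-months) into a dollar cost figure by
# multiplying by the average loaded salary per staff-month, as described above.
def effort_to_cost(effort_pm: float, avg_salary_per_month: float) -> float:
    """Estimated cost = estimated person-months x average salary per month."""
    return effort_pm * avg_salary_per_month

# e.g. 24 person-months at a (hypothetical) $10,000/month average salary
print(effort_to_cost(24, 10_000))  # 240000.0
```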

Practitioners have struggled with three fundamental issues:


• Which software cost estimation model to use?
• Which software size measurement to use – lines of code (LOC), function points (FP),
or feature points?
• What is a good estimate?
The most widely practiced cost estimation method is expert judgment. For many years, project managers have relied on experience and the prevailing industry norms as a basis to develop cost estimates. However, basing estimates on expert judgment is problematic:
• This approach is not repeatable and the means of deriving an estimate are not
explicit.
• It is difficult to find highly experienced estimators for every new project.
• The relationship between cost and system size is not linear. Cost tends to increase
exponentially with size. The expert judgment method is appropriate only when the
sizes of the current project and past projects are similar.
• Budget manipulations by management aimed at avoiding overrun make experience
and data from previous projects questionable.

In the last three decades, many quantitative software cost estimation models
have been developed. They range from empirical models, such as Boehm’s COCOMO
models, to analytical models. An empirical model uses data from previous projects
to evaluate the current project and derives its basic formulae from analysis of the
particular database available. An analytical model, on the other hand, uses formulae
based on global assumptions, such as the rate at which developers solve problems
and the number of problems available.
Most cost models are based on the size measure, such as LOC and FP, obtained from
size estimation. The accuracy of size estimation directly impacts the accuracy of cost
estimation.
Although common size measurements have their own drawbacks, an organization can
make good use of any one, as long as a consistent counting method is used.
A good software cost estimate should have the following attributes:
• It is conceived and supported by the project manager and the development team.
• It is accepted by all stakeholders as realizable.
• It is based on a well-defined software cost model with a credible basis.
• It is based on a database of relevant project experience (similar processes, similar
technologies, similar environments, similar people and similar requirements).
• It is defined in enough detail so that its key risk areas are understood and the
probability of success is objectively assessed.
Software cost estimation historically has been a major difficulty in software
development.
Several reasons for the difficulty have been identified:
• Lack of a historical database of cost measurement
• Software development involving many interrelated factors, which affect development
effort and productivity, and whose relationships are not well understood
• Lack of trained estimators and estimators with the necessary expertise
• Little penalty is often associated with a poor estimate
Process of estimation
Cost estimation is an important part of the planning process. For example,
in the top-down planning approach, the cost estimate is used to derive the project plan:
1. The project manager develops a characterization of the overall functionality, size,
process, environment, people, and quality required for the project.
2. A macro-level estimate of the total effort and schedule is developed using a software
cost estimation model.
3. The project manager partitions the effort estimate into a top-level work breakdown
structure. He also partitions the schedule into major milestone dates and determines a
staffing profile, which together form a project plan.

The actual cost estimation process involves seven steps:
1. Establish cost-estimating objectives
2. Generate a project plan for required data and resources
3. Pin down software requirements
4. Work out as much detail about the software system as feasible
5. Use several independent cost estimation techniques to capitalize on their combined
strengths
6. Compare different estimates and iterate the estimation process
7. After the project has started, monitor its actual cost and progress, and feed the
results back to project management.
No matter which estimation model is selected, users must pay attention to the
following to get best results:
• coverage of the estimate (some models generate effort for the full life-cycle, while
others do not include effort for the requirement stage)
• calibration and assumptions of the model.
• sensitivity of the estimates to the different model parameters.
• deviation of the estimate with respect to the actual cost.
Cost estimation
There are two major types of cost estimation methods: algorithmic and non-
algorithmic.
Algorithmic models vary widely in mathematical sophistication. Some are based on
simple arithmetic formulas using such summary statistics as means and standard
deviations. Others are based on regression models and differential equations. To
improve the accuracy of algorithmic models, there is a need to adjust or calibrate the
model to local circumstances. These models cannot be used off-the-shelf. Even with
calibration the accuracy can be quite mixed.

We first give an overview of non-algorithmic methods.

Non-algorithmic Methods
Analogy costing: This method requires one or more completed projects that are
similar to the new project and derives the estimation through reasoning by analogy
using the actual costs of previous projects. Estimation by analogy can be done either at
the total project level or at subsystem level. The total project level has the advantage
that all cost components of the system will be considered while the subsystem level
has the advantage of providing a more detailed assessment of the similarities and
differences between the new project and the completed projects. The strength of this
method is that the estimate is based on actual project experience.
However, it is not clear to what extent the previous project is actually representative of
the constraints, environment and functions to be performed by the new system.
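Analogy costing at the total project level can be sketched as a nearest-neighbour lookup over past projects. The feature choice (size and team size) and the project data below are hypothetical; real analogy tools such as ANGEL, discussed later, normalise features and may adapt the retrieved value rather than reusing it directly.

```python
import math

# Hypothetical database of completed projects:
# (size_kloc, team_size, actual_effort_pm)
past_projects = [
    (10, 3, 28),
    (50, 8, 170),
    (120, 15, 520),
]

def estimate_by_analogy(size_kloc: float, team_size: float) -> float:
    """Reuse the actual effort of the most similar completed project,
    where similarity is Euclidean distance in the feature space."""
    def distance(p):
        return math.hypot(p[0] - size_kloc, p[1] - team_size)
    closest = min(past_projects, key=distance)
    return closest[2]

# A new 45 KLOC project with a team of 7 is closest to the 50 KLOC project.
print(estimate_by_analogy(45, 7))  # 170
```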
Expert judgment: This method involves consulting one or more experts. The experts
provide estimates using their own methods and experience. Expert-consensus
mechanisms such as the Delphi technique or PERT can be used to resolve
inconsistencies in the estimates. The Delphi technique works as follows:
1) The coordinator presents each expert with a specification and a form to record
estimates.
2) Each expert fills in the form individually (without discussing with others) and is
allowed to ask the coordinator questions.
3) The coordinator prepares a summary of all estimates from the experts (including
mean or median) on a form requesting another iteration of the experts’ estimates and
the rationale for the estimates.
4) Repeat steps 2)-3) as many rounds as appropriate.
A modification of the Delphi technique, proposed by Boehm and Farquhar, seems to be
more effective: before the estimation, a group meeting involving the coordinator and
experts is arranged to discuss the estimation issues. In step 3), the experts do not need
to give any rationale for their estimates. Instead, after each round of estimation, the
coordinator calls a meeting at which the experts discuss the points where their
estimates varied widely.
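Step 3) of the Delphi procedure, where the coordinator summarises a round of estimates, amounts to a small aggregation. The estimates below are hypothetical person-month figures from five experts.

```python
import statistics

# Summarise one Delphi round with the mean and median of the experts'
# estimates, to be fed back before the next iteration.
def summarise_round(estimates_pm):
    return {
        "mean": statistics.mean(estimates_pm),
        "median": statistics.median(estimates_pm),
    }

round1 = [40, 55, 48, 90, 52]  # hypothetical effort estimates (person-months)
print(summarise_round(round1))  # {'mean': 57, 'median': 52}
```

The gap between the mean and the median flags the outlying estimate (90) that the next round would discuss.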

Parkinson: Using Parkinson's principle that “work expands to fill the available volume”,
the cost is determined (not estimated) by the available resources rather than by an
objective assessment. If the software has to be delivered in 12 months and 5 people
are available, the effort is estimated to be 60 person-months. Although it sometimes
yields good estimates, this method is not recommended, as it may provide very
unrealistic estimates. Also, this method does not promote good software engineering
practice.

Price-to-win: The software cost is estimated to be the best price to win the project.
The estimation is based on the customer's budget instead of the software functionality.
For example, if a reasonable estimate for a project is 100 person-months but the
customer can only afford 60 person-months, it is common for the estimator to be asked
to modify the estimate to fit 60 person-months of effort in order to win the project.
This is again not good practice, since it is very likely to cause a serious delivery delay
or force the development team to work overtime.
Bottom-up: In this approach, each component of the software system is separately
estimated and the results aggregated to produce an estimate for the overall system. The
requirement for this approach is that an initial design must be in place that indicates
how the system is decomposed into different components.
Top-down: This approach is the opposite of the bottom-up method. An overall cost
estimate for the system is derived from global properties, using either algorithmic or
non-algorithmic methods. The total cost can then be split up among the various
components. This approach is more suitable for cost estimation at the early stage.
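The bottom-up aggregation step can be sketched directly: each component identified in the initial design gets its own estimate, and the system estimate is their sum. The component names and figures below are hypothetical.

```python
# Bottom-up estimation: estimate each component separately from the initial
# design, then sum the component estimates for the overall system.
component_estimates_pm = {
    "user interface": 6,
    "business logic": 14,
    "database layer": 9,
    "integration & test": 7,
}

total_effort = sum(component_estimates_pm.values())
print(total_effort)  # 36 person-months
```

Top-down estimation runs the other way: a single system-level estimate would be apportioned across these same components.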

Algorithmic methods
The algorithmic methods are based on mathematical models that produce a cost estimate
as a function of a number of variables, which are considered to be the major cost
factors. Any algorithmic model has the form:

Effort = f(x1, x2, …, xn)

where {x1, x2, …, xn} denote the cost factors. The existing algorithmic methods
differ in two aspects: the selection of cost factors, and the form of the function f. We
will first discuss the cost factors used in these models, then characterize the models
according to the form of the functions and whether the models are analytical or
empirical.
Cost factors
Besides the software size, there are many other cost factors. The most comprehensive
set of cost factors is proposed and used by Boehm et al. in the COCOMO II model.
These cost factors can be divided into four types:
Product factors: required reliability; product complexity; database size used; required
reusability; documentation match to life-cycle needs;
Computer factors: execution time constraint; main storage constraint; computer
turnaround constraints; platform volatility;
Personnel factors: analyst capability; application experience; programming capability;
platform experience; language and tool experience; personnel continuity;
Project factors: multisite development; use of software tools; required development
schedule.
The above factors are not necessarily independent, and most of them are hard to
quantify. In many models, some of the factors appear in combined form and some are
simply ignored. Also, some factors take discrete values, resulting in an estimation
function with a piece-wise form.
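A sketch of how such cost factors enter an algorithmic model, in the style of COCOMO II: a nominal size-based estimate is scaled by a product of effort multipliers, one per cost driver. The multiplier values and coefficients below are illustrative assumptions, not the calibrated COCOMO II tables.

```python
import math

# Nominal effort a * KLOC^e, scaled by a product of effort multipliers.
# A multiplier > 1.0 increases effort; < 1.0 decreases it.
def effort_with_drivers(kloc, multipliers, a=2.94, e=1.10):
    """Effort (person-months) with cost-driver adjustments applied."""
    nominal = a * kloc ** e
    return nominal * math.prod(multipliers.values())

drivers = {  # hypothetical ratings for three of the factors listed above
    "required_reliability": 1.10,   # high reliability raises effort
    "product_complexity": 1.15,     # complex product raises effort
    "analyst_capability": 0.85,     # strong analysts lower effort
}
print(round(effort_with_drivers(20, drivers), 1))
```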

Performance of estimation models


Many studies have attempted to evaluate the cost estimation models. Unfortunately,
the results are not encouraging, as many of them were found to be not very accurate.
Kemerer performed an empirical validation of four algorithmic models (SLIM,
COCOMO, Estimacs, and FPA). No recalibration of models was performed on the
project data, which was different from that used for model development. Most models
showed a strong overestimation bias and large estimation errors, with a MARE (mean
absolute relative error) ranging from 57% to 800%.
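The MARE figure used throughout these evaluations is the average of |actual − estimate| / actual across projects, expressed as a percentage. The data pairs below are hypothetical, chosen to show an overestimation bias.

```python
# Mean absolute relative error (MARE), as a percentage.
def mare(actuals, estimates):
    errors = [abs(a - e) / a for a, e in zip(actuals, estimates)]
    return 100 * sum(errors) / len(errors)

actual_pm = [100, 40, 250]
estimated_pm = [180, 30, 500]  # hypothetical estimates with overestimation bias
print(round(mare(actual_pm, estimated_pm), 1))  # 68.3
```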
Vicinanza, Mukhopadhyay and Prietula used experts to estimate the project effort
using Kemerer’s data set without formal algorithmic techniques and found the results
outperformed the models in the original study. However, the MARE ranged from 32%
to 1,107%.
Ferens and Gurner evaluated three models (SPANS, Checkpoint, and COSTAR) using
22 projects from Albrecht’s database and 14 projects from Kemerer’s data set. The
estimation error is also large, with MARE ranging from 46% for the Checkpoint model
to 105% for the COSTAR model.
Another study on COCOMO also found high error rates, averaging 166%.
Jeffery and Low investigated the need for model calibration at both the industry and
organization levels. Without model calibration, the estimation error was large, with
MARE ranging from 43% to 105%.
Jeffery, Low and Barnes later compared the SPQR/20 model to FPA using data from 64
projects from a single organization. The models were recalibrated to the local
environment to remove estimation biases. Improvement in the estimate was observed
with a MARE of 12%, reflecting the benefits of model calibration.
There were also studies based on the use of analogy. With the use of a program called
ANGEL that was based on the minimization of Euclidean distance in n-dimensional
space, Shepperd and Schofield found that estimating by analogy outperformed
estimation based on statistically derived algorithms.
Heemstra surveyed 364 organizations and found that only 51 used models to estimate
effort and that the model users made no better estimate than the non-model users [15].
Also, use of estimation models was no better than expert judgment.
A survey of software development within JPL found that only 7% of estimators use
algorithmic models as their primary approach to estimation.
New approaches
Cost estimation remains a complex problem, which continues to attract considerable
research attention. Researchers have attempted different approaches. Recently, models
based on artificial intelligence techniques have been developed. For example, Finnie
and Wittig applied artificial neural networks (ANN) and case-based reasoning (CBR)
to the estimation of effort. Using a data set from the Australian Software Metrics
Association, ANN was able to estimate development effort within 25% of the
actual effort in more than 75% of the projects, and with a MARE of less than 25%.
However, the results from CBR were less encouraging. In 73% of the cases, the
estimates were within 50% of the actual effort, and for 53% of the cases, the estimates
were within 25% of the actual.
In a separate study, Mukhopadhyay, Vicinanza and Prietula found that an expert system
based on analogical reasoning outperformed other methods.
Srinivasan and Fisher used machine learning approaches based on regression trees and
neural networks to estimate costs. The learning approaches were found to be
competitive with SLIM, COCOMO, and function points, compared to the previous
study by Kemerer. A primary advantage of learning systems is that they are adaptable
and nonparametric. Briand, El Emam, and Bomarius proposed a hybrid cost modeling
method, COBRA: Cost estimation, Benchmarking and Risk Analysis. This method
was based on expert knowledge and quantitative project data from a small number of
projects. Encouraging results were reported on a small data set.
Conclusion
Today, almost no model can estimate the cost of software with a high degree of
accuracy. This state of the practice is created because
(1) there are a large number of interrelated factors that influence the software
development process of a given development team and a large number of project
attributes, such as number of user screens, volatility of system requirements and the
use of reusable software components.
(2) the development environment is evolving continuously.
(3) there is a lack of measurements that truly reflect the complexity of a software system.
To produce a better estimate, we must improve our understanding of these project
attributes and their causal relationships, model the impact of evolving environment,
and develop effective ways of measuring software complexity.
At the initial stage of a project, there is high uncertainty about these project attributes.
The estimate produced at this stage is inevitably inaccurate, as the accuracy depends
highly on the amount of reliable information available to the estimator. As we learn
more about the project during analysis and later design stages, the uncertainties are
reduced and more accurate estimates can be made. Most models produce exact results
without regard to this uncertainty. They need to be enhanced to produce a range of
estimates and their probabilities.
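One simple way to express an estimate as a range rather than a single point, as argued above, is the classic three-point (PERT) formula combining optimistic, most-likely, and pessimistic figures. The input values below are hypothetical person-month estimates.

```python
# Three-point (PERT) estimate: expected value weights the most-likely case,
# and the spread gives a rough standard deviation for the estimate.
def three_point_estimate(optimistic, most_likely, pessimistic):
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev

expected, sd = three_point_estimate(30, 45, 90)
print(f"{expected:.1f} +/- {sd:.1f} person-months")  # 50.0 +/- 10.0
```

Reporting "50 ± 10 person-months" makes the early-stage uncertainty explicit instead of hiding it behind an exact-looking number.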
To improve the algorithmic models, there is a great need for the industry to collect
project data on a wider scale. The recent effort of ISBSG is a step in the right direction.
They have established a repository of over 790 projects, which will likely be a valuable
source for builders of cost estimation models.
With new types of applications, new development paradigms and new development
tools, cost estimators face great challenges in applying known estimation models
in the new millennium. Historical data may prove to be irrelevant for future projects.
The search for reliable, accurate and low-cost estimation methods must continue.
Several areas are in need of immediate attention. For example, we need models for
development based on formal methods or iterative software processes. Also, more
studies are needed to improve the accuracy of cost estimates for maintenance projects.
