Testing Benford’s Law with Software Code Counts (IT-2)
When analyzing a data set, common thinking may lead one to suspect that the leading digit of each data point would follow a uniform distribution, where each digit (1 through 9) has an equal probability of occurrence. Benford’s law, to the contrary, states that the first digit of each data point will conform to a nonuniform distribution. More specifically, it states that a data point is more likely to begin with a one than a two, a two more likely than a three, a three more likely than a four, and so on. As a real-world example, forensic accounting has identified patterns in financial fraud cases in which criminals used fabricated numbers whose leading digits do not follow Benford’s law.
One of the largest cost drivers of a software development project is size. Counting source lines of code (SLOC) is one of the most common ways to estimate the size of software. Cost estimators typically use estimated SLOC counts to determine the labor required to develop the respective software and ultimately estimate the cost. When SLOC estimates are provided to a cost estimator (such as in a CARD, an ICBD, or input from a SME), should the estimator accept the estimates? Most estimators are not software developers or engineers, and therefore it is reasonable to accept the SLOC estimates, as they represent the best data available. This paper will present estimators with a quick test as a cross check using Benford’s law. If an estimator’s SLOC estimates do not pass this test, this paper will also discuss suggestions to mitigate some of the associated risks.
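As an illustration of the kind of cross check described above, the following minimal sketch compares the leading digits of a list of SLOC counts against the Benford proportions, log10(1 + 1/d) for d = 1 through 9. The abstract does not say which statistical test the paper uses; the chi-square goodness-of-fit statistic here is an assumption, and the function names are hypothetical.

```python
import math
from collections import Counter

# Chi-square critical value for 8 degrees of freedom at the 5% level.
CRITICAL_5PCT_8DF = 15.507


def benford_expected():
    """Benford proportion for each leading digit 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(value):
    """First digit of a positive integer count."""
    return int(str(abs(int(value)))[0])


def benford_chi_square(counts):
    """Chi-square statistic of observed leading digits vs. Benford (assumed test)."""
    digits = [leading_digit(c) for c in counts if int(c) > 0]
    n = len(digits)
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


if __name__ == "__main__":
    # Hypothetical SLOC counts, for illustration only.
    sloc_counts = [1200, 1850, 2300, 14000, 310, 980, 1100, 47000, 2600, 130]
    stat = benford_chi_square(sloc_counts)
    print(f"chi-square = {stat:.2f}, critical (5%, 8 df) = {CRITICAL_5PCT_8DF}")
```

A statistic above the critical value suggests the leading digits depart from Benford’s law. With only a handful of SLOC estimates the expected count per digit is small, so the result should be read as a flag for further questions rather than as a verdict.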
Improved Method for Predicting Software Effort and Schedule (IT-3)
Wilson Rosa – AIS/C4ISR Branch Head, Naval Center for Cost Analysis
Barry Boehm – Professor Emeritus, University of Southern California
Ray Madachy – Associate Professor, Naval Postgraduate School
Brad Clark – Vice-President, Software Metrics Incorporated
Joseph P. Dean – Operating Location Chief, Hanscom AFB Air Force Cost Analysis Agency
This paper presents a set of effort and schedule estimating relationships for predicting software development using empirical data from 317 very recent US DoD programs. The first set predicts effort as a function of size and application type. The second predicts duration using size and staff level. The models are simpler and more viable to use for early estimates than traditional parametric cost models. Practical benchmarks are also provided to guide analysts in normalizing data.
Keywords: Cost Model; Effort Estimation; Schedule Estimation; Software Engineering
Costs of Migration and Operation in the Cloud (IT-4)
Arlene Minkiewicz – Chief Scientist, PRICE Systems
At one level, cloud computing is just Internet-enabled time sharing. Instead of organizations investing in all the Information Technology (IT) assets, such as hardware, software and infrastructure, that they need to meet business needs, cloud computing technology makes these resources available through the Internet. Cloud computing allows an organization to adopt a different economic model for meeting IT needs by reducing capital investments and increasing operational investments. Gartner has predicted that “the bulk of new IT spending by 2016 will be for cloud computing platforms and applications with nearly half of large enterprises having cloud deployments by 2017”. McKinsey and Company predict that the total economic impact of cloud technology could be $1.7 trillion to $6.2 trillion by 2025.
“Cloud computing embraces cyber-infrastructure and builds upon decades of research in virtualization, distributed computing, grid computing and more recently networking, web, and software services”. In other words, although the term cloud computing is relatively new, the concepts and technologies behind cloud computing have been emerging and evolving for some time. Consumers of cloud computing access hardware, software, and networking capabilities from third party providers in much the same way they get electricity or water from their utility companies.
The utility computing model offered in the cloud clearly brings benefits – especially to small and medium sized enterprises and any sort of startup business. In addition to the cost savings from not having to purchase all the hardware, software and infrastructure associated with running a business, cloud solutions bring agility, scalability, portability and on-demand availability as well. So what’s the downside?
While the potential for cost savings is real, as with all things, getting there is not free. A company with firmly entrenched legacy systems needs to think about the trade-offs of migrating from the status quo into the cloud. Migration into the cloud could spur a host of activities. These include installation and configuration, possible changes to code to adapt to the cloud host’s operational environment, possible changes to database queries and schemas, and adaptation to changes in the way applications interface with legacy applications or other applications in the cloud. The company also needs to identify the cloud solution providers, understand their pricing models and determine a strategy to move to the cloud wisely and affordably.
This paper reports on ongoing research into the costs and benefits of cloud computing. It begins with a discussion of cloud computing: what it is, what the different types of cloud computing are, and how it is being used by businesses and the government. It then delves into the cost issues associated with moving to and operating in the cloud. Following this, there will be a discussion of the various pricing models and options currently offered by cloud providers. Finally, a methodology and model will be presented for using this information to understand the total cost of moving capabilities into the cloud.
How I Continued to Stop Worrying and Love Software Resource Data Reports (IT-5)
This presentation highlights the trends and cost estimating relationships derived from detailed analysis of the August 2013 Office of the Secretary of Defense (OSD) Software Resource Data Report (SRDR) data. This analysis was conducted by Nicholas Lanham and Mike Popp and provides a follow-on analysis to the August 2012 SRDR brief developed and previously presented by Mike Popp, AIR 4.2. As described within the August 2013 presentation, the Government’s unprecedented view of the Department of Defense’s (DoD) most comprehensive software-specific database has continued to expand from 1,890 individual records in 2012 to 2,546 records within the 2013 analysis. This expansion has also allowed previously developed software growth algorithms to be updated, including an increased number of paired (initial and final records) data, expanding from 142 paired data points within the 2012 presentation to 212 paired data points within the 2013 analysis. In addition, the latest 2013 SRDR data analysis has also produced a more comprehensive understanding of the relationships among experience level, requirements volatility, CMMI level, development process, code count type, new development, upgrade development, language type(s), and software productivity (Hours/ESLOC). As initially highlighted within the 2012 analysis, the latest analysis of the 2013 dataset further indicates a lack of influence on software productivity from contractor-reported requirements volatility ratings, CMM levels, and/or staffing experience levels, due to inconsistent SRDR reporting definitions among contractors. Considering the significant increase in data records from 2012, this presentation further supports the derived relationship between initial ESLOC and percent change in software development hours, and increases the number of records supporting the previously derived relationships between software productivity rate and software language type(s).
Mobile Applications, Functional Analysis and Cost Estimation (IT-6)
Mobile applications, their use and popularity, have increased exponentially in the past 7 years with the introduction of Apple’s iPhone, Google’s Android Operating System and mobile gaming platforms such as Microsoft’s Xbox One. This increase in applications and the data used has challenged communication service providers to provide the needed bandwidth and has led to the quick deployment of high-speed cellular networks such as LTE. New business models, revenue models, and even companies have been created on the success of one mobile application or a new piece of functionality.
Customers experience mobile applications differently than applications on computers. In addition to their portability, customers interact with mobile applications through different interfaces: multi-touch screens, voice, rotation/alignment, camera interfaces, and even blowing air on the screen. These applications are changing our communication methods and allowing customers to personalize their interactions.
Functional Analysis, as defined by ISO/IEC 14143-1:2007 and documented in the IFPUG Counting Practices Manual (CPM 4.3.1), can quickly identify the functionality provided to the customer by documenting data and transactional functions. This method can be used to estimate costs at any time during the lifecycle of a mobile application.
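To make the sizing step concrete, here is a minimal sketch of an unadjusted function point count that uses the standard IFPUG complexity weights for data functions (ILF, EIF) and transactional functions (EI, EO, EQ). The example function list, the delivery rate in hours per function point, and the labor rate are hypothetical values chosen for illustration; they are not taken from the presentation.

```python
# Standard IFPUG complexity weights per function type.
IFPUG_WEIGHTS = {
    "EI": {"low": 3, "average": 4, "high": 6},     # external inputs
    "EO": {"low": 4, "average": 5, "high": 7},     # external outputs
    "EQ": {"low": 3, "average": 4, "high": 6},     # external inquiries
    "ILF": {"low": 7, "average": 10, "high": 15},  # internal logical files
    "EIF": {"low": 5, "average": 7, "high": 10},   # external interface files
}


def unadjusted_function_points(functions):
    """Sum the weights of (function_type, complexity) pairs."""
    return sum(IFPUG_WEIGHTS[ftype][complexity] for ftype, complexity in functions)


def estimate_cost(function_points, hours_per_fp, labor_rate):
    """Cost = size x delivery rate x labor rate (both rates are assumptions)."""
    return function_points * hours_per_fp * labor_rate


if __name__ == "__main__":
    # Hypothetical mobile application, for illustration only.
    app_functions = [
        ("ILF", "low"), ("ILF", "average"), ("EIF", "low"),
        ("EI", "average"), ("EI", "average"),
        ("EO", "low"), ("EQ", "low"), ("EQ", "average"),
    ]
    size = unadjusted_function_points(app_functions)
    print(size, estimate_cost(size, hours_per_fp=10, labor_rate=100))
```

Because the count depends only on the functions delivered to the user, the same calculation can be repeated as requirements firm up, which is what allows an estimate at any stage of the lifecycle.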
This presentation will demonstrate how to derive cost estimates at different stages in a project’s lifecycle by using function points, and the advantages of using an FP-based size estimate over a SLOC-based estimate. The intended audience is software cost estimators, project managers, and anyone who is interested in software measurement.
Keywords: Function Points, Software Estimation, Agile, Mobile Applications, Mobility, Project Management, Software Measurement
In Pursuit of the One True Software Resource Data Reporting (SRDR) Database (IT-7)
Zachary McGregor-Dorsey – Cost Analyst, Technomics, Inc.
For many years, Software Resource Data Reports, collected by the Defense Cost and Resource Center (DCARC) on Major Defense Acquisition Programs (MDAPs), have been widely acknowledged as an important source of software sizing, effort, cost, and schedule data to support estimating. However, using SRDRs presents a number of data collection, normalization, and analysis challenges, which would in large part be obviated by a single robust relational database. The authors set out to build just such a database, and this paper describes their journey, pitfalls encountered along the way, and success in bringing to fruition a living artifact that can be of tremendous utility to the defense software estimating community.
SRDRs contain a wealth of data and metadata, and various attempts have been made by such luminaries in the field as Dr. Wilson Rosa and Mr. Mike Popp to excerpt and summarize the “good” data from SRDRs and make them available to the community. Such summaries typically involve subjective interpretations of the raw data, and by their nature are snapshots in time and may not distinguish between final data and those for which updates are expected.
The primary goal of this project was to develop an Access database, which would both store the raw source data in its original form at an atomic level, exactly as submitted by WBS element and reporting event, and allow evaluations, interpretations, and annotations of the data, including appropriate pairing of Initial and Final reports; mapping of SLOC to standard categories for the purposes of determining ESLOC; normalization of software activities to a standard set of activities; and storage of previous assessments, such as those of the aforementioned experts. The database design not only provides flexible queries for quick, reliable access to the desired data to support analysis, but also incorporates the DCARC record of submitted and expected SRDRs in order to track missing past data and anticipate future data.
The database is structured by Service, Program, Contract, Organization, CSDR Plan, and Reporting Event, and is flexible enough to include non-SRDR data. Perhaps its most innovative feature is the implementation of “movable” entities, wherein quantities such as Requirements, Effort, and SLOC, and qualities such as Language, Application Type, and Development Process can be reported at multiple levels and “rolled up” appropriately using a sophisticated set of queries. These movable entities enable the database to easily accommodate future changes made to the suggested format or reporting requirement found in the SRDR Data Item Description (DID).
This work was sponsored by the Office of the Deputy Assistant Secretary of the Army for Cost and Economics, and represents a continuation of the effort that produced the ICEAA 2013 Best Paper in the IT track, “ODASA-CE Software Growth Research.” A key motivation of the database is to be able to provide real-time updates to both that Software Growth Model and ODASA-CE’s Software Estimating Workbook. We are also collaborating with the SRDR Working Group on continual improvements to the database and how best to make it available to the broader community.
Optimizing Total Cost of Ownership for Best Value IT Solutions: A Case Study using Parametric Models for Estimates of Alternative IT Architectures and Operational Approaches (IT-8)
Because of a variety of architectures and deployment models, Information Technology (IT) has become more and more complex for organizations to manage and support. Current technology IT system architectures range from server based local systems to implementations of a Private Cloud to utilization of the Public Cloud. Determining a “best value architecture” for IT systems requires the ability to effectively understand not only the cost, but the relative performance, schedule and risk associated with alternative solutions. The search for best value changes the “price-only” focus to one of Total Cost of Ownership (TCO). To optimally select a “best value” approach for an information systems (IS) architecture, the IT organization must have a method to develop high confidence performance, cost, schedule, and risk estimates for each alternative. In order to assess TCO, it is critical to be able to effectively estimate the cost of ongoing operations provided by an in-house data center technical support team vs. a Managed Service Contractor and the risks associated with each model.
This paper presents IT project management support methods that incorporate parametric effort estimation models into the process of establishing IT architectures, solutions, and ongoing support to optimize TCO relative to capability. A case study of applying a parametric information technology estimate model to the development of estimates for Managed Service, or the cost of ongoing operations, for complex IT systems is presented. IT estimates in the case study include analysis of alternative operational approaches to maintain multiple data centers located globally and a widely distributed user community. The estimates in the case study include systems engineering for architecture and system design, the IT infrastructure, end user support, service desk, documentation, software and database services, development and maintenance of custom applications software, training, purchased hardware, purchased software, and facilities. In addition, ongoing support or Managed Service estimates incorporate requirements for multiple Service Level Agreements which must be satisfied.
Parametric model results for the case study are provided to demonstrate the decision support process. Utilizing a proven optimization approach, it is demonstrated that, with the support of an effective estimation model to develop effort estimates for alternative approaches, it is possible to optimize TCO, and thus establish a “best value” IT solution.
Estimating Hardware Storage Costs (IT-9)
Estimating Commercial-off-the-Shelf (COTS) hardware storage volume and cost requirements can be challenging. Factors such as storage type, speed, configuration, and changing costs can potentially lead to estimating difficulties. This is especially true when a Redundant Array of Independent Disks (RAID) configuration is implemented. Due to the multiple attributes that can vary within each RAID level, as well as other factors that may influence the total storage volume needed, developing relationships for estimating long-term storage costs can become complicated.
This research will examine the estimation of RAID storage costs. Through the evaluation of historical procurement data, we will evaluate the costs associated with several common disk drive standards and how those costs may change over time. Other areas of consideration include storage needs for different RAID levels, the need for open storage, and changing storage needs within the government. By obtaining a better understanding of storage variations, analysts will be better able to predict storage volume needs and estimate the potential cost impacts of different storage requirements.
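The effect of RAID level on storage volume can be sketched with the textbook usable-capacity rules for common levels. This is an idealized illustration, not the research's estimating relationship: it assumes equal-size disks and ignores hot spares, formatting overhead, and controller costs. The function names and the disk price are hypothetical.

```python
# Minimum number of disks for each RAID level handled by this sketch.
MIN_DISKS = {0: 2, 1: 2, 5: 3, 6: 4, 10: 4}


def usable_capacity_tb(level, disks, disk_tb):
    """Idealized usable capacity (TB) for equal-size disks."""
    if level not in MIN_DISKS:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < MIN_DISKS[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DISKS[level]} disks")
    if level == 10 and disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")
    if level == 0:
        return disks * disk_tb        # striping, no redundancy
    if level == 1:
        return disk_tb                # every disk holds the same data
    if level == 5:
        return (disks - 1) * disk_tb  # one disk's worth of parity
    if level == 6:
        return (disks - 2) * disk_tb  # two disks' worth of parity
    return disks // 2 * disk_tb       # RAID 10: striped mirror pairs


def cost_per_usable_tb(level, disks, disk_tb, price_per_disk):
    """Purchase cost per usable TB (price_per_disk is a hypothetical input)."""
    return disks * price_per_disk / usable_capacity_tb(level, disks, disk_tb)


if __name__ == "__main__":
    for level in (0, 5, 6, 10):
        usable = usable_capacity_tb(level, disks=8, disk_tb=4.0)
        unit = cost_per_usable_tb(level, 8, 4.0, price_per_disk=100.0)
        print(f"RAID {level}: {usable} TB usable, {unit:.2f} per usable TB")
```

For the same eight disks, the cost per usable terabyte rises as redundancy increases, which is one reason a requirement stated in usable volume can translate into quite different procurement quantities.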
Relating Cost to Performance: The Performance-Based Cost Model (IT-10)
Michael Jeffers – Senior Cost Analyst, Technomics, Inc.
Robert Nehring – Cost Analyst, Technomics, Inc.
Jean-Ali Tavassoli – Cost Analyst, Naval Surface Warfare Center Carderock Division
Kelly Meyers – Surface Combatant Team Lead, Naval Surface Warfare Center Carderock Division
Robert Jones – Senior Cost Analyst, Technomics, Inc.
For decades, in order to produce a cost estimate, estimators have been heavily reliant on the technical characteristics of a system, such as weight for hardware elements or source lines of code (SLOC) for software elements, as specified by designers and engineers. Quite often, a question will arise about the cost of adding additional performance requirements to a system design (or in a design-to-cost scenario, the savings to be achieved by removing requirements). Traditionally, the engineers will then have to undertake a design cycle to determine how the shift in requirements will change the system. The resultant technical outputs are finally given to the cost estimators, who will run them through their cost model to arrive at the cost impact. However, what if a single model could estimate the cost from the performance of the system alone? A Performance Based Cost Model (PBCM) can do just that.
First introduced in 1996, a PBCM is an early-stage rough-order-of-magnitude (ROM) cost estimating tool that is focused on relating cost to performance factors. PBCMs are parametric cost models that are integrated with a parametric engineering model so that they estimate cost as a function of performance by simultaneously estimating major physical characteristics. They are derived from historical data and engineering principles, consistent with experience. PBCMs are quick, flexible, and easy to use and have proven to be a valuable supplement to standard, detailed concept design and costing methods.
In this paper we explain essential PBCM concepts, including:
• A discussion of the interplay of capabilities, effectiveness, performance characteristics, and cost.
• How to identify the most meaningful cost drivers (i.e., performance characteristics, technology factors, and market conditions).
• How to identify the most meaningful output variables (i.e., those variables of prime interest to the PBCM user).
• How to create the mathematical structure that integrates cost drivers with cost and physical characteristics.
• How to obtain and normalize historical performance data, cost data, and technical data (physical characteristics).
• How to generate cost and physical characteristic equations.
• How to implement a PBCM.
• How to use a PBCM.
Lessons Learned from the International Software Benchmark Standards Group (ISBSG) Database (IT-11)
Arlene Minkiewicz – Chief Scientist, PRICE Systems
As corporate subscribers and partners to the International Software Benchmarking Standards Group (ISBSG) Database, PRICE has access to a wealth of information about software projects. The ISBSG was formed in 1997 with the mission “To improve the management of IT resources by both business and government through the provision and exploitation of public repositories of software engineering knowledge that are standardized, verified, recent and representative of current technologies.” This database contains detailed information on close to 6000 development and enhancement projects and more than 500 maintenance and support projects. To the best of this author’s knowledge, this database is the largest, most trusted source of publicly available software data that has been vetted and quality checked.
The data covers many industry sectors and types of businesses, though it is weak on data in the aerospace and defense industries. Nevertheless, there are many things we can learn from analysis of this data. The Development and Enhancement database contains 121 columns of project information for each project submitted. This includes information identifying the type of business and application, the programming language(s) used, Functional Size of the project in one of many Functional Measures available in the industry (IFPUG, COSMIC, NESMA, etc.), project effort normalized based on the project phases the report contains, Project Delivery Rate (PDR), elapsed project time, etc.
At PRICE we have used this data in many ways both to improve our estimating guidance and to improve our software CERs. One of the projects we accomplished with this data was the creation of a series of data driven cost modeling templates across industry sector and application type. These templates are pre-filled with relevant values for input parameters along with a risk range determined by the statistical goodness of the values as predictors within the data set studied.
This paper will introduce the ISBSG and the databases that are available from it. It then provides details of the data driven approach applied to develop these templates, discussing research approach, methodology, tools used, findings and outcomes. This is followed by a discussion of lessons learned, including the strengths and weaknesses of the database and the strengths and weaknesses of the solutions derived from it. While particularly relevant to software estimators, this paper should be valuable to any estimator who lacks data or has data but is not quite sure what to do with it.
Software Maintenance: Recommendations for Estimating and Data Collection (IT-12)
Shelley Dickson – Operations Research Analyst, Naval Center for Cost Analysis
Bruce Parker – Naval Center for Cost Analysis
Alex Thiel – Operations Research Analyst, Naval Center for Cost Analysis
Corinne Wallshein – Technical Advisor, Naval Center for Cost Analysis
The software maintenance study reported at ICEAA in 2012 and 2013 continued to progress in 2013 in spite of the high data variability. This presentation summarizes the past years’ software maintenance data collection structure, categorizations, normalizations, and analyses. Software maintenance size, defect, cost, and effort data were collected from Fiscal Years (FY) 1992 – 2012. Parametric analyses were performed in depth on available variables included in or derived from this U.S. Department of Defense software maintenance data set. This description of the team’s decision making, derivations, and analyses may assist others in building cost estimating methodologies. Effort Estimating Relationships (EERs) presented may support future software maintenance estimation, including uncertainty distribution characterizations based on collected historical data. Recommended EERs include using an industry formula to compute Equivalent Source Lines of Code (ESLOC) software size and using the computed ESLOC to estimate an annual number of Full Time Equivalent (FTE) personnel during the software maintenance period, or combining yearly defects fixed and source lines of code to estimate annual software maintenance effort hours. Ultimately, the goal is to routinely collect and analyze data to develop defensible software maintenance cost estimating methodologies. A synopsis of the current phase of the study will be presented.
An Update to the Use of Function Points in Earned Value Management for Software Development (IT-13)
In this follow-up to their 2013 ICEAA presentation, the authors detail their efforts in the successful implementation of an EVM methodology for a government software development project utilizing the International Function Point Users Group (IFPUG) function point software sizing metric.
Traditionally it has been difficult to apply Earned Value Management (EVM) criteria to software development projects, as no tangible value is earned until the software is delivered to production. The process developed by the authors successfully addressed this by applying the function point software sizing methodology, approved by the International Organization for Standardization (ISO), to measure earned value during the course of the software development lifecycle. The use of SLOC to determine progress in software development is difficult, if not impossible, as there are no standard SLOC counting rules and SLOC is heavily dependent upon language, platform and individual developer skills.
This presentation describes the opportunity that was presented to the team and how the recently completed pilot program was developed and implemented to address it. The authors will address how effective the pilot program was in identifying and resolving issues and measuring earned value, as well as the challenges and lessons learned with the development, implementation, and sustainment of the FP-based EVM process.
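One way to picture function points as the progress measure in EVM is sketched below. The abstract does not describe the authors' actual process, so this is an assumption: earned value is taken as the budget at completion scaled by the share of function points delivered, and the standard EVM indices (CPI = EV/AC, SPI = EV/PV) follow from it. All names and numbers are hypothetical.

```python
def earned_value_metrics(bac, fp_total, fp_delivered, fp_planned, actual_cost):
    """Standard EVM measures with function points as the progress measure.

    bac          -- budget at completion
    fp_total     -- total function points in scope
    fp_delivered -- function points delivered to date
    fp_planned   -- function points planned to be delivered to date
    actual_cost  -- actual cost incurred to date
    """
    ev = bac * fp_delivered / fp_total  # earned value
    pv = bac * fp_planned / fp_total    # planned value
    return {
        "EV": ev,
        "PV": pv,
        "CV": ev - actual_cost,   # cost variance
        "SV": ev - pv,            # schedule variance
        "CPI": ev / actual_cost,  # cost performance index
        "SPI": ev / pv,           # schedule performance index
    }


if __name__ == "__main__":
    # Hypothetical project status, for illustration only.
    status = earned_value_metrics(
        bac=1_000_000, fp_total=500, fp_delivered=200,
        fp_planned=250, actual_cost=450_000,
    )
    for name, value in status.items():
        print(f"{name}: {value:,.2f}")
```

In this hypothetical status, both indices fall below 1.0, signalling that the project is over cost and behind schedule relative to the function points it planned to deliver.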
The Federal IT Dashboard: Potential Application for IT Cost & Schedule Analysis (IT-14)
Daniel Harper – MITRE Corporation
Federal agencies have experienced a growing demand for rapid turnaround cost and schedule estimates. This need is increasing as the pressure to deploy systems rapidly mounts. The push for Agile SW development compounds this problem.
A critical component in cost estimating is the data collection of costs for the various elements within the estimate. Analogous programs constitute a robust source for credible estimates. The problem is how to find analogous programs and how to capture the cost of elements within those programs at a sufficiently detailed level to use in a cost estimate and in a timely manner so that the cost data is still relevant.
The data for analogous programs already exists within government. The Open Government initiative has provided a way to overcome some of the problems of obtaining this data rapidly.
One example of a source of data for analogous programs is the IT Spending Dashboard. The IT Dashboard is an open source, publicly available site that provides insight into over $80 billion of IT spending for over 700 major programs across 27 major agencies. The data from government sources such as the IT Dashboard can be used as a source of current, analogous cost data.
Cost and schedule data for these programs is provided directly from agencies and can be accessed and exported into Microsoft Excel for analysis. Analysis of the cost and schedule elements for these programs can provide insight into historical spending and provide program managers with an important tool for predicting cost and schedule estimates for current programs.
Trends in Enterprise Software Pricing from 2002 to 2011 (IT-17)
One of the biggest challenges in the cost estimating community is data collection. In the Information Technology (IT) cost community, technology is always evolving, while the data capturing it tend to be scarce and more difficult to use in building solid cost models. Fortunately, NCCA learned the Department of the Navy (DON) Chief Information Officer (CIO) has been collecting benchmarking measures, including pricing, since 2002 under the Enterprise Software Initiative (ESI) Blanket Purchasing Agreements (BPAs). DON CIO shared its data with NCCA so various data analyses could be conducted. NCCA generated statistical trends and pricing factors from the ESI IT Commercial Off-the-Shelf (COTS) products and services. Although this is a start, and the benefits from this initial analysis primarily assist the IT cost estimator, NCCA plans to continue its relationship with DON CIO so that, by providing continued database updates, DON CIO too will reap benefits from forecast models developed in support of various cost estimating efforts. Currently, multiple programs use IT COTS, many are using ESI BPAs to procure IT, and NCCA expects these programs to continue using the same or similar products over their lifecycles. Therefore, the results from this continued analysis will certainly benefit the community’s efforts to produce credible estimating tools.
Estimating Cloud Computing Costs: Practical Questions for Programs (IT-18)
Kathryn Connor – Cost Analyst, RAND
Cloud computing has garnered the attention of the Department of Defense (DoD) as data and computer processing needs grow and budgets shrink. In the meantime, reliable literature on the costs of cloud computing in the government is still limited, but programs are interested in any solution that has potential to control growing data management costs. We found that cloud provider solutions can be more or less expensive than traditional information system alternatives because of cost structure variations. RAND looked at the cost drivers for several data management approaches for one acquisition program to develop structured cost considerations for analysts approaching new cloud investments. These considerations can help analysts be comprehensive in their analysis until the DoD can develop more official guidance on cloud computing cost analysis.
The Agile PM Tool: The Trifecta for Managing Cost, Schedule, and Scope (IT-19)
Addressing the need to more rapidly develop and field capabilities for the warfighter, more and more software-centric DoD programs are transitioning towards an industry trend called “Agile” software development. While “Agile” is geared towards producing usable software products more rapidly than waterfall or incremental methods, it also requires more flexibility with managing requirements. The main challenge this has created for Program Managers (PMs) is figuring out how to effectively manage cost, schedule, and scope in this flexible, fast-paced development environment. In turn, PMs are looking to cost estimators to provide insight into decisions they should consider that will impact their projects’ cost, schedule, and scope.
The Agile Program Management Tool: Analysts supporting programs within the Space and Naval Warfare Enterprise have successfully developed an Agile software development management tool that aligns better to “Agile” projects. This Excel-based tool utilizes “Agile” software metrics, such as complexity points, backlogs, and velocity, to estimate a project’s future software development costs, schedule, and scope. The “Agile PM Tool” dynamically updates its projections using a historical data input area, time-phases projections based on user-defined schedule, and utilizes uncertainty analysis to analyze impacts of possible changes in future velocity, all in a matter of minutes. Key outputs of the model include time-phased uncertainty costs, project burndown charts, velocity charts, and a project management analysis section that depicts cost, schedule, or scope impacts of deviations to the original plan.
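The backlog-and-velocity arithmetic behind such a projection can be sketched as follows. This is not the Excel tool's actual logic, which the abstract does not detail: it assumes a fixed cost per sprint, projects the remaining sprints from the average historical velocity, and uses the slowest and fastest observed velocities as a crude stand-in for the tool's uncertainty analysis. All names and numbers are hypothetical.

```python
import math
import statistics


def sprints_remaining(backlog_points, velocity):
    """Whole sprints needed to burn down the backlog at a given velocity."""
    return math.ceil(backlog_points / velocity)


def project_agile(backlog_points, velocities, cost_per_sprint):
    """Project schedule and cost at average, slowest, and fastest velocity.

    velocities      -- complexity points completed in each past sprint
    cost_per_sprint -- assumed fixed team cost per sprint
    """
    cases = {
        "expected": statistics.mean(velocities),
        "pessimistic": min(velocities),
        "optimistic": max(velocities),
    }
    projection = {}
    for name, velocity in cases.items():
        sprints = sprints_remaining(backlog_points, velocity)
        projection[name] = {"sprints": sprints, "cost": sprints * cost_per_sprint}
    return projection


if __name__ == "__main__":
    # Hypothetical backlog and sprint history, for illustration only.
    result = project_agile(
        backlog_points=240, velocities=[28, 32, 30, 30], cost_per_sprint=50_000
    )
    for case, values in result.items():
        print(case, values)
```

Re-running the projection as each sprint's actual velocity is recorded mirrors the dynamic updating described above: the history grows, the backlog shrinks, and the cost and schedule range tightens or shifts accordingly.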
Benefits: The Agile PM Tool provides a PM insight into the cost, schedule, and scope of the project over time. Actual cost/performance inputs allow for continual tracking and automatically produce updated cost projections. Lastly, the model features various visual outputs that depict key project performance indicators, as well as an automated scenario analysis tool that displays impacts to the project’s cost, schedule, and scope if the project experiences any deviations from its original plan.
Disadvantages: The template requires customization for each specific project, which entails a significant up-front time investment to define the project’s requirements and complexity in detail. Additionally, when this model is implemented on a new project that has not yet begun, the Basis of Estimate is more difficult to defend when compared to traditional approaches for two main reasons: 1) no data from other “Agile” projects is currently used to formulate them, and 2) the main sizing metric used, complexity points, is a subjective measure of complexity that can be unique to the development team of the specific project.
Summary: The growing number of DoD software projects that are adopting an “Agile” development philosophy requires cost estimators to not only adapt the methodologies and metrics they use to estimate software development costs, but also re-think their models to give PMs the information they need to effectively manage these programs. The Agile PM Tool is one manifestation of this trend as it provides a logical, dynamic approach for helping the government effectively manage the cost, schedule, and scope of their “Agile” projects.