Fit, Rather Than Assume, a CER Error Distribution
Analysts usually assume a distribution (e.g., normal, log-normal, or triangular) to model the errors of a cost estimating relationship (CER) for cost uncertainty analysis. However, an assumed distribution may not match the underlying distribution of the CER errors. A distribution fitting tool is often used to hypothesize an appropriate distribution for a given set of data. It can also be applied to fit a distribution to:
1. the CER residuals (i.e., Actual – Predicted = yi – ŷi) for additive error models and
2. the CER “percent” errors in the form of ratios (i.e., Actual/Predicted = yi / ŷi) for multiplicative error models.
This way, the CER error distribution is derived based upon the residuals (or percent errors) specific to the analysis, rather than a generic assumption applied to any analysis.
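As a minimal sketch of the fitting step described above, the following Python example fits a two-parameter log-normal distribution to the Actual/Predicted ratios of a multiplicative error CER and reports a Kolmogorov-Smirnov statistic. The function name `fit_ratio_distribution`, the choice of SciPy, the log-normal candidate, and the simulated data are all assumptions made for illustration; they are not taken from the paper or from any particular curve-fitting tool.

```python
import numpy as np
from scipy import stats


def fit_ratio_distribution(actual, predicted):
    """Fit a log-normal to Actual/Predicted ratios (illustrative helper)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ratios = actual / predicted  # "percent" errors in ratio form

    # floc=0 pins the location at zero, giving a two-parameter log-normal.
    shape, loc, scale = stats.lognorm.fit(ratios, floc=0)

    # KS test against the fitted distribution. Because the parameters were
    # estimated from the same data, the p-value is optimistic.
    ks = stats.kstest(ratios, "lognorm", args=(shape, loc, scale))

    return {
        "ratios": ratios,
        "sigma": shape,    # standard deviation in log space
        "median": scale,   # exp(mean of the log ratios)
        "ks_stat": ks.statistic,
        "ks_pvalue": ks.pvalue,
    }


# Hypothetical data: 30 simulated programs, not real cost data.
rng = np.random.default_rng(0)
predicted = rng.uniform(50, 500, size=30)
actual = predicted * rng.lognormal(mean=0.0, sigma=0.25, size=30)

result = fit_ratio_distribution(actual, predicted)
print(result["sigma"], result["median"], result["ks_pvalue"])
```

For an additive error model, the same pattern would apply to the residuals `actual - predicted`, with a candidate distribution that allows negative values (for example, `stats.norm`).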
If we use a curve-fitting tool to analyze the yi / ŷi ratios for a multiplicative error CER, we cannot apply the fitted distribution directly for cost uncertainty analysis. This is because the fitted distribution does not account for (1) the distance between the estimating point and the centroid of the data set, (2) the sample size, or (3) the degrees of freedom of the respective CER. We must therefore adjust the fitted distribution before using it to perform uncertainty analysis in a simulation tool. This paper proposes an objective method to account for the above elements when modeling CER uncertainty with a fitted distribution; namely, it develops a prediction interval for cost uncertainty analysis using a curve-fitting tool.
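The paper's own adjustment is not reproduced in this abstract. As a textbook analogy only, the sketch below computes the standard prediction interval for an ordinary least squares line with a single cost driver, where the same three elements appear explicitly: the distance of the estimating point from the centroid, the sample size n, and the degrees of freedom n − 2. The function name `ols_prediction_interval` and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats


def ols_prediction_interval(x, y, x0, level=0.90):
    """Textbook OLS prediction interval at x0 (simple linear regression)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    dof = n - 2                      # two estimated coefficients
    x_bar = x.mean()                 # centroid of the driver data
    sxx = np.sum((x - x_bar) ** 2)

    slope = np.sum((x - x_bar) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * x_bar

    residuals = y - (intercept + slope * x)
    see = np.sqrt(np.sum(residuals ** 2) / dof)   # standard error of estimate

    # Widening factor: grows with distance from the centroid and
    # shrinks toward 1 as the sample size increases.
    widen = np.sqrt(1.0 + 1.0 / n + (x0 - x_bar) ** 2 / sxx)

    # Student's t quantile reflects the limited degrees of freedom.
    t_crit = stats.t.ppf(0.5 + level / 2.0, dof)

    y0 = intercept + slope * x0
    half_width = t_crit * see * widen
    return y0 - half_width, y0, y0 + half_width


# Hypothetical data: 12 simulated points, not real cost data.
rng = np.random.default_rng(1)
x = np.linspace(10, 100, 12)
y = 5.0 + 2.0 * x + rng.normal(0.0, 8.0, size=x.size)

at_centroid = ols_prediction_interval(x, y, x.mean())
far_away = ols_prediction_interval(x, y, 3 * x.max())
print(at_centroid, far_away)
```

The interval at the centroid is the narrowest, and it widens as the estimating point moves away from the data. A distribution fitted to the raw ratios alone carries none of this information, which is why it needs adjusting before use in a simulation.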
Furthermore, analysts often use a curve-fitting tool to analyze the residuals (or percentage errors) from various CERs all together. This paper discusses issues associated with this approach and explains why it is not appropriate to do so.
Tecolote Research, Inc.
Shu-Ping Hu is a Chief Statistician at Tecolote Research, Incorporated. Shu-Ping joined Tecolote in 1984 and serves as a company expert in all statistical matters. She earned her Ph.D. in Mathematics, with an emphasis in Statistics, at the University of California, Santa Barbara.
She has published many technical papers, covering such topics as developing the PING Factor to adjust the log-linear CER to reflect the mean and suggesting an adjusted R-square measure for the Minimum-Unbiased-Percentage Error (MUPE) and Minimum-Percentage Error Regression under Zero-Percentage Bias (ZMPE) CERs.
Dr. Hu has 20 years of experience supporting Unmanned Space Vehicle Cost Model (USCM) CER development and the related database. She also has 25 years of experience designing, developing, and validating statistical, learning, and regression algorithms in CO$TAT. In addition, Dr. Hu developed many of the distribution and correlation algorithms implemented in the ACE RI$K simulation tool. For over 20 years, she has been a regular presenter of the most advanced cost analysis techniques at major cost conferences.