Contents
- Function implementing a Monte Carlo method
- Step 1: Set controls for the Monte Carlo calculation
- Step 2: Set the state for the random number generator
- Step 3: Initialize the Monte Carlo calculation
- Step 4.0: Start of loop
- Step 4.1: Form values of the output quantity
- Step 4.2: Form results for the complete set of values for the output quantity
- Step 4.3: Form results for a subset of values for the output quantity
- Step 4.4: Test for stabilization of the results
- Step 4.5: End of loop
- Step 5: Form approximations to the probability distribution for the output quantity
- Step 6: Evaluate the coverage interval for the output quantity
- Function for making random draws from various probability distributions
- Function for updating the results of a Monte Carlo calculation based on a new set of values of the output quantity
- Function for evaluating the results from a Monte Carlo calculation based on a set of values of the output quantity
- Function for evaluating approximations to the probability distribution for the output quantity
- Function for evaluating a coverage interval for the output quantity
- Program identifier
Function implementing a Monte Carlo method
This function implements the Monte Carlo method for uncertainty evaluation for a measurement model
with a scalar output quantity and a general number of uncorrelated input quantities.
The inputs to the function are a specification model of the measurement model, a specification pdfin of the probability distributions assigned to the input quantities in the measurement model, a coverage probability p, the type of coverage interval, and controls controls for running the Monte Carlo calculation. The function returns an estimate yMCM of the output quantity, the standard uncertainty uyMCM associated with the estimate, the endpoints IyMCM of the coverage interval for the output quantity of the given type corresponding to coverage probability p, an approximation pdfMCM to the probability density function for the output quantity, and an indication conv of whether the Monte Carlo calculation has stabilized within the number of trials undertaken.
model is a string containing the name of a function for evaluating the measurement model for values of the input quantities. For an example of such a function, see model_additive.
Each input quantity in the measurement model is assigned a probability distribution, described by a probability density function, which may be (1) a Gaussian (or normal) distribution, (2) a scaled and shifted t-distribution, (3) a rectangular (or uniform) distribution, (4) a curvilinear trapezoidal distribution, or (5) a U-shaped distribution. (This list includes all the univariate probability distributions encountered in the examples of GUM Supplement 1. There is no consideration of joint probability distributions, such as the multivariate Gaussian distribution.)
pdfin is a matrix of N rows and 4 columns, the ith row of which defines the probability distribution for the ith input quantity. For a measurement model with a single input quantity, pdfin can take the following forms:
pdfin = [1, mu, sigma, inf] defines a Gaussian distribution with expectation mu and standard deviation sigma (the fourth element of pdfin is not used);
pdfin = [2, mu, sigma, nu] defines a t-distribution with shift parameter mu, scale parameter sigma and degrees of freedom nu;
pdfin = [3, a, b, inf] defines a rectangular distribution with lower limit a and upper limit b (the fourth element of pdfin is not used);
pdfin = [4, a, b, r] defines a curvilinear trapezoidal distribution with lower limit a, upper limit b and (fractional) reliability r for the semi-width;
pdfin = [5, a, b, inf] defines a U-shaped distribution with lower limit a and upper limit b (the fourth element of pdfin is not used).
Guidance on the assignment, in some common circumstances, of probability density functions for the input quantities in a measurement model is given in GUM Supplement 1 [3, section 6].
p is the coverage probability, which can be 0.90, 0.95 or 0.99, and type defines the type of coverage interval, which can be 'Shortest' or 'Symmetric'.
controls is a row vector of five elements and contains controls for the Monte Carlo calculation as follows:
controls(1) determines whether the calculation is adaptive (1) or not (0);
controls(2) is the maximum number of Monte Carlo trials to be undertaken (as a multiple of 10 000);
controls(3) is the number of histogram bins for defining an approximation to the probability density function for the output quantity;
controls(4) is the initial state of the random number generator used in generating random draws from the probability distributions for the input quantities;
controls(5) is the number of significant decimal digits regarded as meaningful in the value of the standard uncertainty when testing for stabilization of the results.
Section 5.1 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010, describes the procedure implemented in the Monte Carlo calculation.
Section 3.1.1 of M G Cox, P M Harris and I M Smith, Software for GUM Supplement 1: User Manual, 2011, describes the specifications of the different probability distributions. (Note that this document describes other probability distributions that are not considered here.)
function [yMCM, uyMCM, IyMCM, pdfMCM, conv] = ... MCM_calculation(model, pdfin, p, type, controls)
Input
model string Function defining the measurement model pdfin N x 4 Probability density functions for the input quantities in the measurement model p Coverage probability (0.90, 0.95 or 0.99) type string Coverage type ('Shortest' or 'Symmetric') controls 1 x 5 Controls for running a Monte Carlo calculation: adap the Monte Carlo calculation is adaptive (1) or not (0) M the maximum number of Monte Carlo trials (a multiple of 10 000) Nb the number of histogram bins for defining an approximation to the probability density function for the output quantity state the initial state for the random number generator ndig the number of significant decimal digits for testing for stabilization
Output
yMCM Estimate of the output quantity uyMCM Standard uncertainty associated with the estimate of the output quantity IyMCM 1 x 2 Lower and upper endpoints of a coverage interval for the output quantity pdfMCM 2 x Nb Approximation to the probability density function for the output quantity conv Indicates whether the results of the Monte Carlo calculation have stabilized (1) or not (0)
Step 1: Set controls for the Monte Carlo calculation
Extract the values in controls, which control the running of the Monte Carlo calculation.
adap = controls(1); M = controls(2); Nb = controls(3); state = controls(4); ndig = controls(5);
Step 2: Set the state for the random number generator
Generating pseudo-random numbers from a rectangular distribution is the basis for making random draws from any distribution. The random number generator used should be of good quality: here MATLAB's implementation of the Mersenne-Twister algorithm is used, with the initial state of the generator set to state.
rand('twister', state)
Step 3: Initialize the Monte Carlo calculation
The calculation is organized into M sets of m = 10 000 Monte Carlo trials. Doing so helps to ensure that memory requirements are modest, because only m values of each of the input quantities and the output quantity are stored. Also, the organization of the calculation in this way is the basis of an adaptive implementation of the Monte Carlo calculation. The vector yk contains the estimates of the output quantity obtained from the M sets of m Monte Carlo trials. Similarly, uyk contains the standard uncertainties associated with the estimates, and Iyk the endpoints of the coverage intervals.
yk = zeros(M,1); uyk = zeros(M,1); Iyk = zeros(M,2); m = 10000; conv = 0; k = 1;
Step 4.0: Start of loop
This is the start of the main loop. Provided the calculation has not stabilized and the maximum number of trials has not been exceeded, undertake another set of m Monte Carlo trials.
while ~conv && k < M+1
Step 4.1: Form values of the output quantity
Make m random draws from the probability distributions assigned to the input quantities, and store the values in the array xr. The measurement model is evaluated m times to obtain m values for the output quantity, which are stored in the vector yr.
See sections 5.3 and 5.4.1 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
xr = sampledata(pdfin, m);
yr = eval([model, '(xr)']);
Step 4.2: Form results for the complete set of values for the output quantity
Evaluate the average of the complete set of values of the output quantity (which provides an estimate yMCM of the output quantity) and the standard deviation of the values (which provides the standard uncertainty uyMCM associated with the estimate). The probability distribution for the output quantity is represented in terms of the values yleft of the output quantity that are less than a value ymin, the values yrght greater than a value ymax and a frequency distribution (fx, fy) for the values between ymin and ymax. MATLAB's hist function is used to form the frequency distribution, which bins the values of the output quantity between ymin and ymax into Nb uniformly-spaced containers, returning the numbers fy of values in the bins (frequencies) and the centres fx of the bins. The first set of m values of the output quantity are used to initialize the results. Subsequent sets of m values are used to update the results.
See section 5.4 and appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
if k == 1 yMCM = mean(yr); uyMCM = std(yr); ymin = min(yr); ymax = max(yr); yleft = []; yrght = []; [fy, fx] = hist(yr, Nb); else [yMCM, uyMCM, yleft, yrght, fx, fy] = ... MCM_update_results(ymin, ymax, yr, (k-1)*m, ... yMCM, uyMCM, yleft, yrght, fx, fy); end
Step 4.3: Form results for a subset of values for the output quantity
Evaluate an estimate of the output quantity, the standard uncertainty associated with the estimate, and a coverage interval for the output quantity from the current set of m values of the output quantity, and store the results in yk, uyk and Iyk.
See sections 5.4.2 and 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
[yk(k), uyk(k), Iyk(k,:)] = MCM_results(yr, p, type);
Step 4.4: Test for stabilization of the results
In a non-adaptive application of the calculation (adap = 0), a fixed number of Monte Carlo trials is undertaken and a test of whether the results obtained have stabilized in a statistical sense is performed at the end of the calculation. The test is based on a numerical tolerance tol calculated in terms of the number ndig of significant decimal digits regarded as meaningful in the value of the standard uncertainty uyMCM. In an adaptive application (adap = 1), the calculation is terminated when either the results have stabilized or the maximum number of trials has been undertaken.
See section 5.7 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
if k > 1 if uyMCM == 0 tol = 0; else r = -(floor(log10(uyMCM)) - (ndig - 1)); tol = 0.5*10^(-r); end stab = 2*std([yk(1:k) uyk(1:k) Iyk(1:k,1) Iyk(1:k,2)])/sqrt(k); if adap if max(stab) <= tol; conv = 1; end else if k == M if max(stab) <= tol; conv = 1; end end end end
Step 4.5: End of loop
This is the end of the main loop. When the calculation has stabilized or the maximum number of trials has been undertaken, exit the loop. Otherwise, increment the number k of sets of m trials.
k = k + 1;
end
Step 5: Form approximations to the probability distribution for the output quantity
In terms of the representation (yleft, yrght, fx, fy) of the probability distribution for the output quantity, form approximations (gx, gy) and (Gx, Gy), respectively, of the probability density function and distribution function for the output quantity. The approximation to the probability density function takes the form of a scaled frequency distribution with bin centres gx and corresponding probability densities gy. The approximation to the distribution function takes the form of a piecewise linear function composed of straight-line segments joining the points with coordinates (Gx, Gy). The case where all the values of the output quantity are the same is treated separately.
See sections 5.4.3, 5.4.5 and appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
if uyMCM == 0 Gx = [yMCM, yMCM]; Gy = [ 0, 1]; gx = yMCM; gy = 1; else [Gx, Gy, gx, gy] = MCM_distributions(yleft, yrght, fx, fy); end pdfMCM = [gx; gy];
Step 6: Evaluate the coverage interval for the output quantity
Use the approximation to the distribution function for the output quantity to form the coverage interval of type type for coverage probability p.
See section 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
IyMCM = coverage_interval(Gx, Gy, p, type);
Function for making random draws from various probability distributions
This function returns random draws from the probability distributions for the input quantities in a measurement model. The input to the function is the specification pdf of the probability distributions assigned to the input quantities and the number M of random draws. The function returns an array xr, the ith row of which contains M random draws from the probability distribution for the ith input quantity.
The Box-Muller algorithm is used to make random draws from the standard Gaussian distribution. A rejection method is used to make random draws from the t-distribution. MATLAB's rand function, implementing the Mersenne-Twister algorithm, is used to make random draws from the standard rectangular distribution. Finally, transformation methods are used to make random draws from the curvilinear trapezoidal and U-shaped distributions.
See section 5.3 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
function xr = sampledata(pdf, M)
Input
pdf N x 4 Probability density functions for the input quantities in the measurement model M Number of Monte Carlo trials
Output
xr N x M Random draws made from the probability distributions for the input quantities
For each input quantity, the calculation depends on the nature of the probability distribution assigned to the quantity.
N = size(pdf,1); xr = zeros(N,M); for k = 1:N switch pdf(k,1)
Gaussian distribution.
case {1} x = pdf(k,2); ux = pdf(k,3); mn = 0; ndist = zeros(1,M); while mn < M v1 = rand; v2 = rand; if mn < M mn = mn + 1; ndist(mn) = sqrt(-2*log(v1))*cos(2*pi*v2); end if mn < M mn = mn + 1; ndist(mn) = sqrt(-2*log(v1))*sin(2*pi*v2); end end xr(k,:) = x + ux*ndist; %%% % t-distribution. case {2} x = pdf(k,2); ux = pdf(k,3); nu = pdf(k,4); mt = 0; tdist = zeros(1,M); while mt < M v1 = rand; v2 = rand; if v1 < 0.5 t = 1/(4*v1 - 1); v = v2/t^2; else t = 4*v1 - 3; v = v2; end if v < 1 - abs(t)/2 || v < (1 + t^2/nu)^(-(nu+1)/2) mt = mt + 1; tdist(mt) = t; end end xr(k,:) = x + ux*tdist; %%% % Rectangular distribution. case {3} a = pdf(k,2); b = pdf(k,3); xr(k,:) = a + (b - a)*rand(1,M); %%% % Curvilinear trapezoidal distribution. case {4} a = pdf(k,2); b = pdf(k,3); r = pdf(k,4); d = r*(b - a)/2; am = a - d; ap = a + d; ak = am + (ap - am)*rand(1,M); bk = (a + b) - ak; xr(k,:) = ak + (bk - ak).*rand(1,M); %%% % U-shaped distribution. case {5} a = pdf(k,2); b = pdf(k,3); xr(k,:) = (a + b)/2 + ((b - a)/2)*cos(pi*rand(1,M)); end end
Function for updating the results of a Monte Carlo calculation based on a new set of values of the output quantity
This function evaluates the results of a Monte Carlo calculation from those obtained from a previous calculation (based on M0 trials) and a new set of m values of the output quantity. The results comprise the average and standard deviation of the values of the output quantity (taken as an estimate of the quantity with the associated standard uncertainty), and a representation of the probability distribution for the output quantity determined from those values.
See appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
function [yMCM, uyMCM, yleft, yrght, fx, fy] = ... MCM_update_results(ymin, ymax, yr, M0, yMCM, uyMCM, yleft, yrght, fx, fy)
Input
ymin Left-hand end of the first bin ymax Right-hand end of the last bin yr 1 x m New values of the output quantity M0 Number of Monte Carlo trials undertaken to obtain the current results (below)
Input/output
yMCM Estimate of the output quantity uyMCM Standard uncertainty associated with the estimate of the output quantity yleft Values of the output quantity less than ymin yright Values of the output quantity greater than ymax fx, fy Frequency distribution for the output quantity between ymin and ymax
Evaluate the estimate of the output quantity and the associated standard uncertainty from the previous values and the new values of the output quantity.
m = length(yr); d = sum(yr - yMCM)/(M0 + m); mu = yMCM + d; s2 = ((M0 - 1)*uyMCM^2 + M0*d^2 + sum((yr - mu).^2))/(M0 + m - 1); yMCM = mu; uyMCM = sqrt(s2);
Update yleft and yrght that contain the values of the output quantity that are less than ymin and greater than ymax.
ileft = find(yr < ymin); yleft = [yleft, yr(ileft)]; irght = find(yr > ymax); yrght = [yrght, yr(irght)];
Update the frequency distribution for the output quantity using those values of the output quantity between ymin and ymax.
yr([ileft, irght]) = []; fy = fy + hist(yr, fx);
Function for evaluating the results from a Monte Carlo calculation based on a set of values of the output quantity
This function evaluates from a set of values of the output quantity an estimate of the output quantity, the standard uncertainty associated with the estimate, and a coverage interval for the output quantity.
See section 5.4 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
function [yMCM, uyMCM, IyMCM] = MCM_results(yr, p, type)
Input
yr 1 x M Values of the output quantity p Coverage probability (0.90, 0.95 or 0.99) type string Coverage type ('Shortest' or 'Symmetric')
Output
yMCM Estimate of the output quantity uyMCM Standard uncertainty associated with the estimate of the output quantity IyMCM 1 x 2 Lower and upper endpoints of a coverage interval for the output quantity
Evaluate the estimate (as the average of the values of the output quantity) and the associated standard uncertainty (as the standard deviation of the values).
yMCM = mean(yr); uyMCM = std(yr);
Form an approximation to the distribution function for the output quantity. The values of the output quantity are sorted and associated with uniformly-spaced probabilities. The approximation to the distribution function is the piecewise linear function composed of straight-line segments that join the points defined by the sorted values and these probabilities.
M = length(yr); Gx = sort(yr); Gy = ((1:M) - 1/2)/M;
Determine a coverage interval from the approximation to the distribution function for the output quantity.
IyMCM = coverage_interval(Gx, Gy, p, type);
Function for evaluating approximations to the probability distribution for the output quantity
This function forms approximations to the probability density function and distribution function for the output quantity from the results of a Monte Carlo calculation. The approximation to the probability density function takes the form of a scaled frequency distribution (or piecewise constant function), and the approximation to the distribution function takes the form of a piecewise linear function.
See appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
function [Gx, Gy, gx, gy] = MCM_distributions(yleft, yrght, fx, fy)
Input
yleft Values of the output quantity less than ymin yright Values of the output quantity greater than ymax fx, fy Frequency distribution for the output quantity between ymin and ymax
Output
Gx, Gy Approximation to the distribution function for the output quantity gx, gy Approximation to the probability density function for the output quantity
Calculate the numbers of output quantity values to the left of ymin and to the right of ymax, the number of bins in the frequency distribution (fx, fy), the total number of output quantity values, and the width of the bins in the frequency distribution.
ml = length(yleft); mr = length(yrght); Nb = length(fx); M = ml + sum(fy) + mr; bw = fx(2) - fx(1);
Form an approximation to the distribution function for the output quantity. This takes the form of a piecewise linear function composed of straight-line segments joining the points with coordinates (Gx, Gy), with Gx defined by the edges of the bins, and Gy corresponding to cumulative probabilities.
If yleft is empty, the probability corresponding to the left-hand edge of the first bin is set equal to zero. If yrght is empty, the probability corresponding to the right-hand edge of the last bin is set equal to unity.
Gx(1) = fx(1) - bw/2; Gy(1) = ml/M; for k = 1:Nb Gx(k+1) = Gx(k) + bw; Gy(k+1) = Gy(k) + fy(k)/M; end %%% % Otherwise, make allowance for any values in *yleft* and *yrght*. if ml > 0 Gx = [sort(yleft), Gx]; Gy = [((1:ml) - 1/2)/M, Gy]; end if mr > 0 Gx = [Gx, sort(yrght)]; Gy = [Gy, ((M-mr+1:M) - 1/2)/M]; end
Make sure the function is strictly increasing (which is important to ensure that inverse interpolation of the function is possible).
iz = find(diff(Gy) == 0); if ~isempty(iz) for k = 1:length(iz) Gy(iz+1) = Gy(iz) + 10*eps; end end
Form an approximation to the probability density function for the output quantity. This takes the form of a scaled frequency distribution defined by uniformly-spaced bins with bin centres gx and probability densities gy (equivalently a function that is piecewise constant over the intervals defined by the bins). The approximation is normalized such that the area under the function is unity.
Add bins to the left of the first bin and to the right of the last bin to contain the values of the output quantity stored in yleft and yrght.
if ml > 0 Nbl = fix((fx(1) - bw/2 - min(yleft))/bw) + 1; fxl = linspace(fx(1)-Nbl*bw, fx(1)-bw, Nbl); if Nbl == 1 fyl = ml; else fyl = hist(yleft, fxl); end fx = [fxl, fx]; fy = [fyl, fy]; end if mr > 0 Nbr = fix((max(yrght) - fx(end) - bw/2)/bw) + 1; fxr = linspace(fx(end)+bw, fx(end)+Nbr*bw, Nbr); if Nbr == 1 fyr = mr; else fyr = hist(yrght,fxr); end fx = [fx, fxr]; fy = [fy, fyr]; end
Scale the frequencies to define probability densities so that the area under the approximation to the probability density function is unity.
gx = fx; gy = fy/(M*bw);
Function for evaluating a coverage interval for the output quantity
This function evaluates a coverage interval for the output quantity from an approximation to the distribution function for the quantity. The approximation is assumed to take the form of a piecewise linear function, such as that provided by the routine MCM_results or MCM_distributions.
See section 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.
function IyMCM = coverage_interval(Gx, Gy, p, type)
Input
Gx, Gy Approximation to the distribution function for the output quantity p Coverage probability (0.90, 0.95 or 0.99) type string Coverage type ('Shortest' or 'Symmetric')
Output
IyMCM 1 x 2 Lower and upper endpoints of a coverage interval for the output quantity
The shortest coverage interval is required.
if strcmp(type, 'Shortest')
Evaluate data (pcov, lcov) defining the length of the coverage interval as a function of the probability at the lower endpoint of the interval. MATLAB's interp1 function is used to interpolate inversely the approximation to the distribution function to find values of the output quantity that correspond to a range of probabilities between 0 and 1 - p.
M = length(Gy); pcov = linspace(Gy(1), Gy(M)-p, 101)'; ylow = interp1(Gy, Gx, pcov); yhgh = interp1(Gy, Gx, pcov+p); lcov = yhgh - ylow; %%% % Identify the coverage interval of shortest length. imin = find(lcov == min(lcov), 1, 'first'); IyMCM = [ylow(imin), yhgh(imin)]; end
The probabilistically symmetric coverage interval is required.
if strcmp(type, 'Symmetric')
MATLAB's interp1 function is used to interpolate inversely the appoximation to the distribution function to find values of the output quantity that correspond to probabilities (1 - p)/2 and (1 + p)/2.
ylow = interp1(Gy, Gx, (1-p)/2);
yhgh = interp1(Gy, Gx, (1+p)/2);
IyMCM = [ylow, yhgh];
end
Program identifier
NPLUnc_101 Open-Source Software (Release 1.1): Software for measurement uncertainty evaluation
M G Cox, P M Harris and I M Smith, Mathematics and Scientific Computing Group, National Physical Laboratory, UK
Crown copyright 2011