Contents

Function implementing a Monte Carlo method

This function implements the Monte Carlo method for uncertainty evaluation for a measurement model

$$Y = f(X_1, X_2, ..., X_N),$$

with a scalar output quantity and a general number of uncorrelated input quantities.

The inputs to the function are a specification model of the measurement model, a specification pdfin of the probability distributions assigned to the input quantities in the measurement model, a coverage probability p, the type of coverage interval, and controls controls for running the Monte Carlo calculation. The function returns an estimate yMCM of the output quantity, the standard uncertainty uyMCM associated with the estimate, the endpoints IyMCM of the coverage interval for the output quantity of the given type corresponding to coverage probability p, an approximation pdfMCM to the probability density function for the output quantity, and an indication conv of whether the Monte Carlo calculation has stabilized within the number of trials undertaken.

model is a string containing the name of a function for evaluating the measurement model for values of the input quantities. For an example of such a function, see model_additive.

Each input quantity in the measurement model is assigned a probability distribution, described by a probability density function, which may be (1) a Gaussian (or normal) distribution, (2) a scaled and shifted t-distribution, (3) a rectangular (or uniform) distribution, (4) a curvilinear trapezoidal distribution, or (5) a U-shaped distribution. (This list includes all the univariate probability distributions encountered in the examples of GUM Supplement 1. There is no consideration of joint probability distributions, such as the multivariate Gaussian distribution.)

pdfin is a matrix of N rows and 4 columns, the ith row of which defines the probability distribution for the ith input quantity. For a measurement model with a single input quantity, pdfin can take the following forms:

pdfin = [1, mu, sigma, inf] defines a Gaussian distribution with expectation mu and standard deviation sigma (the fourth element of pdfin is not used);

pdfin = [2, mu, sigma, nu] defines a t-distribution with shift parameter mu, scale parameter sigma and degrees of freedom nu;

pdfin = [3, a, b, inf] defines a rectangular distribution with lower limit a and upper limit b (the fourth element of pdfin is not used);

pdfin = [4, a, b, r] defines a curvilinear trapezoidal distribution with lower limit a, upper limit b and (fractional) reliability r for the semi-width;

pdfin = [5, a, b, inf] defines a U-shaped distribution with lower limit a and upper limit b (the fourth element of pdfin is not used).

Guidance on the assignment, in some common circumstances, of probability density functions for the input quantities in a measurement model is given in GUM Supplement 1 [3, section 6].

p is the coverage probability, which can be 0.90, 0.95 or 0.99, and type defines the type of coverage interval, which can be 'Shortest' or 'Symmetric'.

controls is a row vector of five elements and contains controls for the Monte Carlo calculation as follows:

controls(1) determines whether the calculation is adaptive (1) or not (0);

controls(2) is the maximum number of Monte Carlo trials to be undertaken (as a multiple of 10 000);

controls(3) is the number of histogram bins for defining an approximation to the probability density function for the output quantity;

controls(4) is the initial state of the random number generator used in generating random draws from the probability distributions for the input quantities;

controls(5) is the number of significant decimal digits regarded as meaningful in the value of the standard uncertainty when testing for stabilization of the results.

Section 5.1 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010, describes the procedure implemented in the Monte Carlo calculation.

Section 3.1.1 of M G Cox, P M Harris and I M Smith, Software for GUM Supplement 1: User Manual, 2011, describes the specifications of the different probability distributions. (Note that this document describes other probability distributions that are not considered here.)

function [yMCM, uyMCM, IyMCM, pdfMCM, conv] = ...
   MCM_calculation(model, pdfin, p, type, controls)

Input

model    string Function defining the measurement model
pdfin    N x 4  Probability density functions for the input quantities
                in the measurement model
p               Coverage probability (0.90, 0.95 or 0.99)
type     string Coverage type ('Shortest' or 'Symmetric')
controls 1 x 5  Controls for running a Monte Carlo calculation:
                   adap    the Monte Carlo calculation is adaptive
                           (1) or not (0)
                   M       the maximum number of Monte Carlo trials
                           (a multiple of 10 000)
                   Nb      the number of histogram bins for defining an
                           approximation to the probability density
                           function for the output quantity
                   state   the initial state for the random number
                           generator
                   ndig    the number of significant decimal digits
                           for testing for stabilization

Output

yMCM            Estimate of the output quantity
uyMCM           Standard uncertainty associated with the estimate of
                the output quantity
IyMCM    1 x 2  Lower and upper endpoints of a coverage interval for
                the output quantity
pdfMCM   2 x Nb Approximation to the probability density function for
                the output quantity
conv            Indicates whether the results of the Monte Carlo
                calculation have stabilized (1) or not (0)

Step 1: Set controls for the Monte Carlo calculation

Extract the values in controls, which control the running of the Monte Carlo calculation.

   adap  = controls(1);
   M     = controls(2);
   Nb    = controls(3);
   state = controls(4);
   ndig  = controls(5);

Step 2: Set the state for the random number generator

Generating pseudo-random numbers from a rectangular distribution is the basis for making random draws from any distribution. The random number generator used should be of good quality: here MATLAB's implementation of the Mersenne-Twister algorithm is used, with the initial state of the generator set to state.

   rand('twister', state)

Step 3: Initialize the Monte Carlo calculation

The calculation is organized into M sets of m = 10 000 Monte Carlo trials. Doing so helps to ensure that memory requirements are modest, because only m values of each of the input quantities and the output quantity are stored. Also, the organization of the calculation in this way is the basis of an adaptive implementation of the Monte Carlo calculation. The vector yk contains the estimates of the output quantity obtained from the M sets of m Monte Carlo trials. Similarly, uyk contains the standard uncertainties associated with the estimates, and Iyk the endpoints of the coverage intervals.

   yk   = zeros(M,1);
   uyk  = zeros(M,1);
   Iyk  = zeros(M,2);
   m    = 10000;
   conv = 0;
   k    = 1;

Step 4.0: Start of loop

This is the start of the main loop. Provided the calculation has not stabilized and the maximum number of trials has not been exceeded, undertake another set of m Monte Carlo trials.

   while ~conv && k < M+1

Step 4.1: Form values of the output quantity

Make m random draws from the probability distributions assigned to the input quantities, and store the values in the array xr. The measurement model is evaluated m times to obtain m values for the output quantity, which are stored in the vector yr.

See sections 5.3 and 5.4.1 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

      xr = sampledata(pdfin, m);
      yr = eval([model, '(xr)']);

Step 4.2: Form results for the complete set of values for the output quantity

Evaluate the average of the complete set of values of the output quantity (which provides an estimate yMCM of the output quantity) and the standard deviation of the values (which provides the standard uncertainty uyMCM associated with the estimate). The probability distribution for the output quantity is represented in terms of the values yleft of the output quantity that are less than a value ymin, the values yrght greater than a value ymax and a frequency distribution (fx, fy) for the values between ymin and ymax. MATLAB's hist function is used to form the frequency distribution, which bins the values of the output quantity between ymin and ymax into Nb uniformly-spaced containers, returning the numbers fy of values in the bins (frequencies) and the centres fx of the bins. The first set of m values of the output quantity are used to initialize the results. Subsequent sets of m values are used to update the results.

See section 5.4 and appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

      if k == 1
         yMCM     = mean(yr);
         uyMCM    = std(yr);
         ymin     = min(yr);
         ymax     = max(yr);
         yleft    = [];
         yrght    = [];
         [fy, fx] = hist(yr, Nb);
      else
         [yMCM, uyMCM, yleft, yrght, fx, fy] = ...
            MCM_update_results(ymin, ymax, yr, (k-1)*m, ...
                               yMCM, uyMCM, yleft, yrght, fx, fy);
      end

Step 4.3: Form results for a subset of values for the output quantity

Evaluate an estimate of the output quantity, the standard uncertainty associated with the estimate, and a coverage interval for the output quantity from the current set of m values of the output quantity, and store the results in yk, uyk and Iyk.

See sections 5.4.2 and 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

      [yk(k), uyk(k), Iyk(k,:)] = MCM_results(yr, p, type);

Step 4.4: Test for stabilization of the results

In a non-adaptive application of the calculation (adap = 0), a fixed number of Monte Carlo trials is undertaken and a test of whether the results obtained have stabilized in a statistical sense is performed at the end of the calculation. The test is based on a numerical tolerance tol calculated in terms of the number ndig of significant decimal digits regarded as meaningful in the value of the standard uncertainty uyMCM. In an adaptive application (adap = 1), the calculation is terminated when either the results have stabilized or the maximum number of trials has been undertaken.

See section 5.7 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

      if k > 1
         if uyMCM == 0
            tol = 0;
         else
            r   = -(floor(log10(uyMCM)) - (ndig - 1));
            tol = 0.5*10^(-r);
         end
         stab = 2*std([yk(1:k) uyk(1:k) Iyk(1:k,1) Iyk(1:k,2)])/sqrt(k);
         if adap
            if max(stab) <= tol; conv = 1; end
         else
            if k == M
               if max(stab) <= tol; conv = 1; end
            end
         end
      end

Step 4.5: End of loop

This is the end of the main loop. When the calculation has stabilized or the maximum number of trials has been undertaken, exit the loop. Otherwise, increment the number k of sets of m trials.

      k = k + 1;
   end

Step 5: Form approximations to the probability distribution for the output quantity

In terms of the representation (yleft, yrght, fx, fy) of the probability distribution for the output quantity, form approximations (gx, gy) and (Gx, Gy), respectively, of the probability density function and distribution function for the output quantity. The approximation to the probability density function takes the form of a scaled frequency distribution with bin centres gx and corresponding probability densities gy. The approximation to the distribution function takes the form of a piecewise linear function composed of straight-line segments joining the points with coordinates (Gx, Gy). The case where all the values of the output quantity are the same is treated separately.

See sections 5.4.3, 5.4.5 and appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

   if uyMCM == 0
      Gx = [yMCM, yMCM];
      Gy = [   0,    1];
      gx = yMCM;
      gy = 1;
   else
      [Gx, Gy, gx, gy] = MCM_distributions(yleft, yrght, fx, fy);
   end
   pdfMCM = [gx; gy];

Step 6: Evaluate the coverage interval for the output quantity

Use the approximation to the distribution function for the output quantity to form the coverage interval of type type for coverage probability p.

See section 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

   IyMCM = coverage_interval(Gx, Gy, p, type);

Function for making random draws from various probability distributions

This function returns random draws from the probability distributions for the input quantities in a measurement model. The input to the function is the specification pdf of the probability distributions assigned to the input quantities and the number M of random draws. The function returns an array xr, the ith row of which contains M random draws from the probability distribution for the ith input quantity.

The Box-Muller algorithm is used to make random draws from the standard Gaussian distribution. A rejection method is used to make random draws from the t-distribution. MATLAB's rand function, implementing the Mersenne-Twister algorithm, is used to make random draws from the standard rectangular distribution. Finally, transformation methods are used to make random draws from the curvilinear trapezoidal and U-shaped distributions.

See section 5.3 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

function xr = sampledata(pdf, M)

Input

pdf      N x 4  Probability density functions for the input quantities
                in the measurement model
M               Number of Monte Carlo trials

Output

xr       N x M  Random draws made from the probability distributions
                for the input quantities

For each input quantity, the calculation depends on the nature of the probability distribution assigned to the quantity.

   N  = size(pdf,1);
   xr = zeros(N,M);
   for k = 1:N
      switch pdf(k,1)

Gaussian distribution.

         case {1}
            x  = pdf(k,2);
            ux = pdf(k,3);
            mn = 0;
            ndist = zeros(1,M);
            while mn < M
               v1 = rand;
               v2 = rand;
               if mn < M
                  mn = mn + 1;
                  ndist(mn) = sqrt(-2*log(v1))*cos(2*pi*v2);
               end
               if mn < M
                  mn = mn + 1;
                  ndist(mn) = sqrt(-2*log(v1))*sin(2*pi*v2);
               end
            end
            xr(k,:) = x + ux*ndist;
%%%
% t-distribution.
         case {2}
            x  = pdf(k,2);
            ux = pdf(k,3);
            nu = pdf(k,4);
            mt = 0;
            tdist = zeros(1,M);
            while mt < M
               v1 = rand;
               v2 = rand;
               if v1 < 0.5
                  t = 1/(4*v1 - 1);
                  v = v2/t^2;
               else
                  t = 4*v1  - 3;
                  v = v2;
               end
               if v < 1 - abs(t)/2 || v < (1 + t^2/nu)^(-(nu+1)/2)
                  mt = mt + 1;
                  tdist(mt) = t;
               end
            end
            xr(k,:) = x + ux*tdist;
%%%
% Rectangular distribution.
         case {3}
            a = pdf(k,2);
            b = pdf(k,3);
            xr(k,:) = a + (b - a)*rand(1,M);
%%%
% Curvilinear trapezoidal distribution.
         case {4}
            a  = pdf(k,2);
            b  = pdf(k,3);
            r  = pdf(k,4);
            d  = r*(b - a)/2;
            am = a - d;
            ap = a + d;
            ak = am + (ap - am)*rand(1,M);
            bk = (a + b) - ak;
            xr(k,:) = ak + (bk - ak).*rand(1,M);
%%%
% U-shaped distribution.
         case {5}
            a = pdf(k,2);
            b = pdf(k,3);
            xr(k,:) = (a + b)/2 + ((b - a)/2)*cos(pi*rand(1,M));
      end
   end

Function for updating the results of a Monte Carlo calculation based on a new set of values of the output quantity

This function evaluates the results of a Monte Carlo calculation from those obtained from a previous calculation (based on M0 trials) and a new set of m values of the output quantity. The results comprise the average and standard deviation of the values of the output quantity (taken as an estimate of the quantity with the associated standard uncertainty), and a representation of the probability distribution for the output quantity determined from those values.

See appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

function [yMCM, uyMCM, yleft, yrght, fx, fy] = ...
   MCM_update_results(ymin, ymax, yr, M0, yMCM, uyMCM, yleft, yrght, fx, fy)

Input

ymin            Left-hand end of the first bin
ymax            Right-hand end of the last bin
yr       1 x m  New values of the output quantity
M0              Number of Monte Carlo trials undertaken to obtain the
                current results (below)

Input/output

yMCM            Estimate of the output quantity
uyMCM           Standard uncertainty associated with the estimate of
                the output quantity
yleft           Values of the output quantity less than ymin
yright          Values of the output quantity greater than ymax
fx, fy          Frequency distribution for the output quantity between
                ymin and ymax

Evaluate the estimate of the output quantity and the associated standard uncertainty from the previous values and the new values of the output quantity.

   m     = length(yr);
   d     = sum(yr - yMCM)/(M0 + m);
   mu    = yMCM + d;
   s2    = ((M0 - 1)*uyMCM^2 + M0*d^2 + sum((yr - mu).^2))/(M0 + m - 1);
   yMCM  = mu;
   uyMCM = sqrt(s2);

Update yleft and yrght that contain the values of the output quantity that are less than ymin and greater than ymax.

   ileft = find(yr < ymin);
   yleft = [yleft, yr(ileft)];
   irght = find(yr > ymax);
   yrght = [yrght, yr(irght)];

Update the frequency distribution for the output quantity using those values of the output quantity between ymin and ymax.

   yr([ileft, irght]) = [];
   fy = fy + hist(yr, fx);

Function for evaluating the results from a Monte Carlo calculation based on a set of values of the output quantity

This function evaluates from a set of values of the output quantity an estimate of the output quantity, the standard uncertainty associated with the estimate, and a coverage interval for the output quantity.

See section 5.4 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

function [yMCM, uyMCM, IyMCM] = MCM_results(yr, p, type)

Input

yr       1 x M  Values of the output quantity
p               Coverage probability (0.90, 0.95 or 0.99)
type     string Coverage type ('Shortest' or 'Symmetric')

Output

yMCM            Estimate of the output quantity
uyMCM           Standard uncertainty associated with the estimate of
                the output quantity
IyMCM    1 x 2  Lower and upper endpoints of a coverage interval for
                the output quantity

Evaluate the estimate (as the average of the values of the output quantity) and the associated standard uncertainty (as the standard deviation of the values).

   yMCM  = mean(yr);
   uyMCM = std(yr);

Form an approximation to the distribution function for the output quantity. The values of the output quantity are sorted and associated with uniformly-spaced probabilities. The approximation to the distribution function is the piecewise linear function composed of straight-line segments that join the points defined by the sorted values and these probabilities.

   M  = length(yr);
   Gx = sort(yr);
   Gy = ((1:M) - 1/2)/M;

Determine a coverage interval from the approximation to the distribution function for the output quantity.

   IyMCM = coverage_interval(Gx, Gy, p, type);

Function for evaluating approximations to the probability distribution for the output quantity

This function forms approximations to the probability density function and distribution function for the output quantity from the results of a Monte Carlo calculation. The approximation to the probability density function takes the form of a scaled frequency distribution (or piecewise constant function), and the approximation to the distribution function takes the form of a piecewise linear function.

See appendix E of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

function [Gx, Gy, gx, gy] = MCM_distributions(yleft, yrght, fx, fy)

Input

yleft           Values of the output quantity less than ymin
yright          Values of the output quantity greater than ymax
fx, fy          Frequency distribution for the output quantity between
                ymin and ymax

Output

Gx, Gy          Approximation to the distribution function for the
                output quantity
gx, gy          Approximation to the probability density function for
                the output quantity

Calculate the numbers of output quantity values to the left of ymin and to the right of ymax, the number of bins in the frequency distribution (fx, fy), the total number of output quantity values, and the width of the bins in the frequency distribution.

   ml = length(yleft);
   mr = length(yrght);
   Nb = length(fx);
   M  = ml + sum(fy) + mr;
   bw = fx(2) - fx(1);

Form an approximation to the distribution function for the output quantity. This takes the form of a piecewise linear function composed of straight-line segments joining the points with coordinates (Gx, Gy), with Gx defined by the edges of the bins, and Gy corresponding to cumulative probabilities.

If yleft is empty, the probability corresponding to the left-hand edge of the first bin is set equal to zero. If yrght is empty, the probability corresponding to the right-hand edge of the last bin is set equal to unity.

   Gx(1) = fx(1) - bw/2;
   Gy(1) = ml/M;
   for k = 1:Nb
      Gx(k+1) = Gx(k) + bw;
      Gy(k+1) = Gy(k) + fy(k)/M;
   end

%%%
% Otherwise, make allowance for any values in *yleft* and *yrght*.

   if ml > 0
      Gx = [sort(yleft), Gx];
      Gy = [((1:ml) - 1/2)/M, Gy];
   end
   if mr > 0
      Gx = [Gx, sort(yrght)];
      Gy = [Gy, ((M-mr+1:M) - 1/2)/M];
   end

Make sure the function is strictly increasing (which is important to ensure that inverse interpolation of the function is possible).

   iz = find(diff(Gy) == 0);
   if ~isempty(iz)
      for k = 1:length(iz)
         Gy(iz+1) = Gy(iz) + 10*eps;
      end
   end

Form an approximation to the probability density function for the output quantity. This takes the form of a scaled frequency distribution defined by uniformly-spaced bins with bin centres gx and probability densities gy (equivalently a function that is piecewise constant over the intervals defined by the bins). The approximation is normalized such that the area under the function is unity.

Add bins to the left of the first bin and to the right of the last bin to contain the values of the output quantity stored in yleft and yrght.

   if ml > 0
      Nbl = fix((fx(1) - bw/2 - min(yleft))/bw) + 1;
      fxl = linspace(fx(1)-Nbl*bw, fx(1)-bw, Nbl);
      if Nbl == 1
         fyl = ml;
      else
         fyl = hist(yleft, fxl);
      end
      fx  = [fxl, fx];
      fy  = [fyl, fy];
   end
   if mr > 0
      Nbr = fix((max(yrght) - fx(end) - bw/2)/bw) + 1;
      fxr = linspace(fx(end)+bw, fx(end)+Nbr*bw, Nbr);
      if Nbr == 1
         fyr = mr;
      else
         fyr = hist(yrght,fxr);
      end
      fx  = [fx, fxr];
      fy  = [fy, fyr];
   end

Scale the frequencies to define probability densities so that the area under the approximation to the probability density function is unity.

   gx = fx;
   gy = fy/(M*bw);

Function for evaluating a coverage interval for the output quantity

This function evaluates a coverage interval for the output quantity from an approximation to the distribution function for the quantity. The approximation is assumed to take the form of a piecewise linear function, such as that provided by the routine MCM_results or MCM_distributions.

See section 5.4.6 of M G Cox, P M Harris and I M Smith, Software specifications for uncertainty evaluation, 2010.

function IyMCM = coverage_interval(Gx, Gy, p, type)

Input

Gx, Gy          Approximation to the distribution function for the
                output quantity
p               Coverage probability (0.90, 0.95 or 0.99)
type     string Coverage type ('Shortest' or 'Symmetric')

Output

IyMCM    1 x 2  Lower and upper endpoints of a coverage interval for
                the output quantity

The shortest coverage interval is required.

   if strcmp(type, 'Shortest')

Evaluate data (pcov, lcov) defining the length of the coverage interval as a function of the probability at the lower endpoint of the interval. MATLAB's interp1 function is used to interpolate inversely the approximation to the distribution function to find values of the output quantity that correspond to a range of probabilities between 0 and 1 - p.

      M    = length(Gy);
      pcov = linspace(Gy(1), Gy(M)-p, 101)';
      ylow = interp1(Gy, Gx, pcov);
      yhgh = interp1(Gy, Gx, pcov+p);
      lcov = yhgh - ylow;

%%%
% Identify the coverage interval of shortest length.

      imin  = find(lcov == min(lcov), 1, 'first');
      IyMCM = [ylow(imin), yhgh(imin)];

   end

The probabilistically symmetric coverage interval is required.

   if strcmp(type, 'Symmetric')

MATLAB's interp1 function is used to interpolate inversely the appoximation to the distribution function to find values of the output quantity that correspond to probabilities (1 - p)/2 and (1 + p)/2.

     ylow  = interp1(Gy, Gx, (1-p)/2);
     yhgh  = interp1(Gy, Gx, (1+p)/2);
     IyMCM = [ylow, yhgh];

   end

Program identifier

NPLUnc_101 Open-Source Software (Release 1.1): Software for measurement uncertainty evaluation

M G Cox, P M Harris and I M Smith, Mathematics and Scientific Computing Group, National Physical Laboratory, UK

Crown copyright 2011