NPL Data Fusion Software: Release 1
Contents
- INTRODUCTION
- Simulation methodology
- Simulation options provided by the software
- Framework for the examples
- Example 2B: No bandwidth limitation, Kalman filter data fusion method
- Structure of the MATLAB code
- Generating pseudo-random numbers
- Licence agreement
- MAIN PROGRAM
- Identify the software to the user
- Add folder 'functions' to MATLAB search path
- Set seed values for random number generation
- Assign path to Excel workbook to be processed
- Define measurand type
- Define observation process
- Define sensor behaviour
- Define data fusion method
- Write results to Excel workbook
- Generate plots of results
- Remove folder 'functions' from MATLAB search path
- Acknowledgements
- Program identifier
INTRODUCTION
This MATLAB script DataFusionSoftware_2B.m and its associated Excel workbook DFSInput_2B.xls form part of a suite of MATLAB software routines and Excel workbooks that can be used to simulate distributed measuring systems. The software can be used to study network design, to compare data fusion algorithms and to evaluate measurement uncertainties associated with aggregated data in networks.
The software can be used to simulate cases in which sensors are intermittently faulty or unreliable, varying levels of noise appear in the sensor outputs, and the sensor outputs possess interdependencies, that is, the behaviour of one class of sensor depends on the quantity being measured by another class of sensor.
The software has been designed in a modular manner. Users select the modules and workbooks that are relevant to the task of interest and then combine them in a manner suitable for their application. In some cases it may be necessary for users to provide a limited amount of additional code of their own.
We have developed a set of examples that demonstrate how the software can be used. For each example we provide all the necessary MATLAB routines and an associated Excel workbook that is used to define the inputs to the MATLAB routines and to record the numerical results from the simulation.
The cases we demonstrate are:
1. A simulation of a small network of pressure and temperature sensors in which the response of each pressure sensor is temperature-dependent and some sensors may be intermittently faulty.
2. A simulation of a dynamic measurement task in which the sensors are modelled as filters and Kalman filtering is employed as the data fusion methodology.
3. A simulation based on a real wireless sensor network that was employed to make a distributed temperature measurement.
For each of the three cases we provide two examples (identified by the suffixes A and B) designed to demonstrate different aspects of the software.
This script and its associated MATLAB routines and Excel workbook implement case 2, example B from the above list. The example is explained in more detail below.
Simulation methodology
Consider a measurement of a physical quantity (the measurand) that varies with time.
In the real world, as opposed to the simulation world, we assume that the measurand is an analogue signal that can be represented by an analogue model, i.e., a model from which one can derive an exact instantaneous value of the signal at any time of interest in a manner that is not limited by sampling and quantization considerations. This also applies to noise that may be associated with the measurand.
In the real world the analogue assumptions made concerning the measurand are also to be applied to sensors and to any noise that the sensing process itself generates.
We have therefore assumed that a measurement embodies the following stages:
1. Specification of an analogue measurand (and any associated noise).
2. Specification of an analogue sensor (and any associated noise). The sensor may also be faulty. The sensor produces a continuous electrical output (e.g., voltage). The manner in which the sensor converts the measurand to a voltage signal is described by a mathematical model that may be the sensor's calibration function.
3. Sampling of the sensor output. The sampling process may be imperfect, and also introduce further noise effects such as jitter.
4. Quantization of the sampled sensor output. The quantization process may itself be imperfect, with uneven quantization intervals, for example.
5. Inversion of the quantized and sampled electrical output of the sensor to produce an estimate of the measurand and its associated standard uncertainty, using the sensor's calibration function. If the measurement is regarded as dynamic, an additional stage may be required here to allow for correction of the sensor output for sensor bandwidth limitations. This correction will typically be performed using a deconvolution algorithm.
6. If we can assume that all the sensors are measuring the same physical quantity, then estimates from the individual sensors (after any necessary correction) may be aggregated using a data fusion algorithm to produce a combined estimate of the measurand and its associated standard uncertainty. At this stage faulty sensors may be identified and excluded from the evaluation of the combined estimate.
The simulation software allows a user to include as many of these stages as are necessary to simulate the problem of interest but it does not require every stage to be included in the simulation. For example, the user may omit noise processes or effects due to imperfect sampling and quantization if these are not of interest for the simulation task in question. The software has been designed in a modular fashion that allows users to add their own modules that simulate particular features of interest. We consider the software to be a toolbox that allows users to build their own solutions.
The examples provided with the software demonstrate key features, though not necessarily all features, of the stages described above.
Simulation options provided by the software
The software examples we have developed provide three methods of simulating a measurand. These methods are:
1. Fourier series;
2. Piecewise linear model;
3. Time series explicitly defined in input worksheet.
Two methods of modelling sensors are provided:
1. Represent sensor as a filter;
2. Linear calibration function.
We provide three data fusion algorithms:
1. Chi-squared method;
2. Maximum clique method;
3. Kalman filter.
More information about these options, including references to relevant scientific papers, is provided below in the relevant sections of this documentation.
Framework for the examples
The generic framework for the examples is as follows:
1. Define the measurand in terms of probability distributions for the parameters in a mathematical model for the measurand, and distributions for additive random noise.
2. Generate 'true' parameter values and random noise values from the distributions defined at stage 1.
3. Generate 'true' values for the measurand using the parameter and random noise values at stage 2 for a sampling frequency that exceeds that for any sensor.
4. Define calibration information about the sensors in terms of probability distributions for the parameters in a mathematical model for each sensor, and distributions for additive random noise, as well as the sampling frequency, the number of bits and saturation level for quantization, and the proportion of data dropouts. For a faulty sensor, information about the 'true' behaviour of the sensor (in addition to the behaviour 'known' through calibration) is also provided.
5. Generate 'true' parameter values from the distributions defined at stage 4.
6. Generate a voltage response of each sensor to the measurand generated at stage 3, based on the 'true' parameter values generated at stage 5.
7. Add random noise to the voltage response based on the distributions defined at stage 4.
8. Sample and quantize the voltage response using the sampling frequency, number of bits and saturation level defined at stage 4.
9. Generate measurement results (estimates of the measurand and associated standard uncertainties) for the sensors based on the sampled and quantized voltage responses of stage 8 and the calibration information provided at stage 4.
10. Randomly remove measurement results based on the data dropout information given at stage 4.
11. Apply a data fusion method to the measurement results provided by the sensors at stage 10.
12. Write results to an output file.
13. Generate figures displaying the 'true' values of the measurand provided at stage 3, the measurement results provided by the sensors at stage 10 and the combined estimates of the measurand and associated standard uncertainties from stage 11.
Example 2B: No bandwidth limitation, Kalman filter data fusion method
In this example we demonstrate how to use the software to study a problem in which the response of the sensors are adequate to capture reliably the time history of the measurand. Data fusion is performed using the Kalman filter method (for more information about this method see below). Kalman filter based data fusion requires a prediction of the value of the measurand, an estimate of the uncertainty of the predicted value, and computation of the weighted average of the predicted value and the measured value.
To demonstrate the Kalman filter method of data fusion, the problem is set up in the following manner.
1. We assume that the measurand is a sinusoidal signal (a tone-burst) applied to a transmitting transducer. For this example we have assumed that the transducers in question are hydrophones for underwater acoustics applications.
2. The transmitting transducer acts on the applied signal to produce an observable output signal. We assume that the transmitting transducer is modelled as a second order linear time-invariant response that can be defined by a resonant frequency and a damping factor.
3. The observable output signal is detected by three receiving transducers, each modelled as a second order linear time-invariant system defined by a resonant frequency and a damping factor. All receivers have resonant frequencies that greatly exceed the frequency of the measurand.
4. In this example we do not simulate missing data points and quantization effects.
The input parameters for the problem described above are entered into the Excel workbook that is also used to record the numerical outputs of the simulation. The workbook contains four worksheets. Worksheet Measurand contains the rules (the model) for generating the measurand as a sum of sinusoidal components. Worksheet Observation defines the observable process stimulated by the measurand. For this example, the process is a second order linear time-invariant system defined by a frequency and a damping factor. Worksheet Sensor defines the frequency responses of the three receiving transducers, which are also modelled as second order linear time-invariant systems. In this example quantization effects are ignored. At the end of the simulation the data fusion results are written to worksheet Results that has the following format. The first columm contains the times at which sensor measurements have been made. Subsequent columns contain, for each time value, the value of the measurand, the combined estimate of the measurand obtained from the consistent sensors, the standard uncertainty associated with that estimate, a list of the sensors identified as consistent, the observable output signal, the estimate of the observable output signal obtained from each individual sensor and the standard uncertainty associated with that estimate. The results worksheet also records the values of the seeds that were employed for the random number generator. If the analysis is repeated either with the same or amended input data and no change is made to the worksheet name, the results worksheet is overwritten with the new results. Users who need to ensure traceability of results should rename the results worksheet they wish to save to avoid data being overwritten.
For comparison purposes, we also make available DataFusionSoftware_2A.m and DFSInput_2A.xls, as an example in which one of the receiving transducers does not have an adequate bandwidth for the measurement task and, as a consequence, does not form part of the consistent subset.
Structure of the MATLAB code
The MATLAB software calls a number of functions. These functions are included as separate files containing comments that explain the calculations being undertaken.
Users should note that the software does not carry out exhaustive checks on values provided in worksheets.
Generating pseudo-random numbers
Pseudo-random numbers are generated using the Wichmann-Hill random number generator [1]. This generator is simple to implement in software and allows the user to generate repeatable random number sequences with a long period.
[1] Generating good pseudo-random numbers
B A Wichmann and I D Hill
Computational Statistics and Data Analysis, 51 (2006):1614-1622
The Wichmann-Hill random number generator has been implemented as a DLL independently of this MATLAB code. The files needed to run the Wichmann-Hill generator (WH_MVS.dll and wh_mvs.h) are included as part of this software release.
Licence agreement
The software is provided with a software licence agreement (REF: MSC/L/11/004) and the use of the software is subject to the terms laid out in that agreement. By running the software, the user accepts the terms of the agreement.
MAIN PROGRAM
Identify the software to the user
fprintf('\nNPL Data Fusion Software: Release 1'); fprintf('\nNational Physical Laboratory, UK'); fprintf('\nDecember 2011 \n');
NPL Data Fusion Software: Release 1 National Physical Laboratory, UK December 2011
Add folder 'functions' to MATLAB search path
The MATLAB search path is updated to include the folder that contains the MATLAB functions called by the example scripts. It is assumed that this folder is a subfolder of the current folder and is named 'functions'.
topfolder = pwd;
funfolder = [topfolder, '\functions'];
path(path, funfolder);
Set seed values for random number generation
The user must assign a row vector of four integers each in the interval [1, 2147483647]. To assign seed values based on the current time, the user should set the seed variable to be a vector of four zeros. For traceability purposes the values of the chosen seeds are written to the results worksheet of the Excel workbook.
seed = [25 05 20 05]; global g_wh_seed if all(seed == 0) c = fix(clock); g_wh_seed = c(3:6); else g_wh_seed = seed; end fprintf('\nSeed vector for random number generation: %u %u %u %u \n', ... g_wh_seed);
Seed vector for random number generation: 25 5 20 5
Assign path to Excel workbook to be processed
The user must assign the path to an Excel workbook containing worksheets that describe all components of the measurement.
The user may choose to edit or add worksheets defining different measurand types and sensor behaviours, and may also introduce new components to the calculation by writing a new MATLAB function that may require a new worksheet format.
If on running the software the Excel workbook is open an error will be obtained and the results will not be written to the selected worksheet.
workbookpath = 'DFSInput_2B.xls'; fprintf(['\nWorkbook: %', num2str(length(workbookpath)), 's \n'], workbookpath);
Workbook: DFSInput_2B.xls
Define measurand type
In the example in this script, the measurand is generated as a sinusoidal component using the routines FourierSeries.m and GenerateFourierSeries.m and the input given in the worksheet Measurand.
'True' measurand values are stored in the structure MeasIn which includes the fields:
t, the vector containing time values, and
y, the vector containing the 'true' measurand values corresponding to the time values in t.
A list and description of the information that must be supplied by the user, together with a mathematical description of how the 'true' measurand values are calculated, is available here.
measurandworksheet = 'Measurand';
MeasIn = FourierSeries(workbookpath, measurandworksheet);
Type of measurand simulation: Fourier series Worksheet: Measurand
Define observation process
In the example in this script, the observable output signal is generated by a second order linear time-invariant system acting on the measurand using the routines SecondOrderSystemObs.m and GenerateSystemResponse.m and the input given in the worksheet Observation.
The observable output signal, and information about the process generating it, are stored in the structure Obs which includes the fields:
t, the vector containing time values, and
y, the vector containing the system response values corresponding to the time values in t.
State propagation matrices for the Kalman filter are stored in the structure Phi which includes the fields:
A, the first state propagation matrix,
B, the second state propagation matrix,
C, the third state propagation matrix, and
D, the fourth state propagation matrix.
A list and description of the information that must be supplied by the user, together with a mathematical description of how the observable output signal values are calculated, is available here.
observationworksheet = 'Observation'; [Obs, Phi] = SecondOrderSystemObs(workbookpath, observationworksheet, ... MeasIn);
Type of observation simulation: Second order system Worksheet: Observation
Define sensor behaviour
In this example, each sensor acts as a second order linear time-invariant system specified in the worksheet Sensor. The routine SecondOrderSystemSens.m takes information about the observable output signal given in Obs and returns the data structure Sens containing measurement results in the fields:
t, the vector containing time values,
Y, the array containing the estimates of the observable output signal obtained using the different sensors, corresponding to the time values in t, and
uY, the array containing the standard uncertainties associated with the estimates of the observable output signal.
A list and description of the information that must be supplied by the user, together with a mathematical description of how sensor estimates of the observable output signal are calculated, is available here.
sensorworksheet = 'Sensor';
Sens = SecondOrderSystemSens(workbookpath, sensorworksheet, Obs);
Type of sensor behaviour: Second order system Worksheet: Sensor
Define data fusion method
Combined estimates of the measurand and their associated standard uncertainties are determined using the Kalman filter data fusion method. This method combines information about the time-varying behaviour of the measurand (in the form of a model for that behaviour) with the estimates of the observable output signal provided by the sensors and the associated uncertainties. It can be expected to deliver aggregated estimates with smaller associated uncertainties compared with other methods but at the expense of knowing a model for the measurand and the process generating the observable signal. Like the largest consistent subset method, it can treat correlations associated with estimates provided by different sensors at the same sampling point, but in its conventional form assumes the estimates at different sampling times to be obtained independently. A test is included to identify (possibly) faulty sensors. Missing data are permitted. However, unlike the other methods, if no sensor provides data an aggregated result can still be determined on the basis of the model for the measurand. Consequently, the network will always deliver an estimate of the measurand.
MeasOut = KalmanFilterDataFusion(Sens, Phi, MeasIn.f0);
Data fusion method: Kalman filter
Write results to Excel workbook
Results are written to the worksheet named Results.
resultsworksheet = 'Results'; WriteObsResults(workbookpath, resultsworksheet, MeasIn, Obs, Sens,... MeasOut, seed);
Results worksheet: Results
Generate plots of results
The plots for example 2B are set out below.
1. The first figure shows the 'true' measurand.
2. The second figure shows the observable output signal.
3. The third figure shows the estimates of the observable output signal obtained using the sensors, both the estimates and the estimates plus their associated standard uncertainties.
4. The fourth figure shows the 'true' measurand and the estimates of the measurand that arise from applying the Kalman filter to the estimates of the observable output signal provided by the individual sensors shown in the previous figure. The consequence of choosing the Kalman filter as the fusion method is demonstrated at the start of the trace, where it takes some time before the prediction settles to a good estimate of the measurand.
5. The fifth figure plots the standard uncertainty associated with the pressure estimate as a function of time. Once again the time required for the Kalman filter fusion method to settle to a good estimate is clear.
6. The sixth plot indicates the times at which individual sensors fall outside the largest consistent subset. For all three sensors departures from consistency arise from the Monte Carlo processes built into the simulation. The Kalman filter model (identified as 'sensor' 0 in the Results worksheet) itself is part of the consistent subset except at the start.
PlotMeasIn(MeasIn) PlotObsMeasOut(MeasIn, Obs, Sens, MeasOut)






Remove folder 'functions' from MATLAB search path
The MATLAB search path is updated to remove the folder that contains the MATLAB functions called by the example scripts.
rmpath(funfolder);
Acknowledgements
The work described here was supported by the National Measurement Office of the UK Department of Business, Innovation and Skills as part of its NMS Software Support for Metrology programme.
Program identifier
Data Fusion Software (Release 1)
Authors: T J Esward, P M Harris, C E Matthews, H D Minh and I M Smith
National Physical Laboratory, UK
Date: December 2011
(c) Queen's Printer and Controller of HMSO, 2011