1. Fundamentals
Artificial Neural Network (ANN) technology is an approach to describing the behaviour of a physical system from process data, using mathematical algorithms and statistical techniques. ANNs resemble biological neural systems both in structure, as an interconnected system of nodes (neurons) operating in parallel, and in the way they learn and recognise patterns. A neural network can be trained to identify patterns and extract trends from imprecise and complicated non-linear data. A particular function is learned by adjusting the values of the connections (weights) between elements according to a given training algorithm.
Neural networks have been under development for
many years in a variety of disciplines to derive meaning from complicated data
and to make predictions. In recent years, neural networks have been investigated
for use in pollution forecasting. Because ozone formation is a complex
non-linear process, neural networks, which allow for the incorporation of
non-linear relationships, are well suited for ozone forecasting.
2. Strengths of artificial neural networks
Many methods exist for predicting ozone concentration. Table 1 summarises the most commonly used forecasting methods. The main strengths of ANNs include the following:
- ANNs allow for non-linear relationships between variables.
- The method can weight relationships that are difficult to quantify subjectively.
- Neural
networks have the potential to predict extreme values more effectively than
regression.
- Once
the neural network is developed, forecasters do not need specific expertise to
operate the ANN.
- Neural
networks can be used to complement other forecasting methods, or used as the
primary forecasting method.
On the other hand, neural networks are complex and not widely understood, and hence the technology can be applied inappropriately.
Table 1: Comparison of
forecasting methods.
3. Neural network architecture
The basic structure of an ANN involves a system of
layered, interconnected neurons. The neurons are arranged to form an input
layer, one or more “hidden” layers and an output layer, with nodes in each
layer connected to all nodes in neighbouring layers (Figure 1).
Figure 1: The architecture
of a multi-layered feed forward neural network.
The layer of input neurons receives the data either
from input files or directly from electronic sensors in real-time applications.
The output layer sends information directly to the outside, to a secondary
computer process, or to other devices such as a mechanical control system. The
internal or hidden layers contain many of the neurons in various interconnected
structures. The inputs and outputs of each of these hidden neurons go to other
neurons.
In most networks, each neuron in a hidden layer receives signals from all of the neurons in the layer above it. After a neuron performs its function, it passes its output to all of the neurons in the layer below it, providing a feed-forward path to the output.
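To illustrate this layered, fully connected feed-forward structure, the following Python sketch propagates a single input pattern from the input layer, through one hidden layer, to the output layer. It is only a minimal illustration: NumPy, the sigmoid transfer function, the layer sizes and all numeric values are assumptions of the sketch, not prescriptions from the text.

```python
import numpy as np

def sigmoid(x):
    # S-shaped transfer function: squashes any real value into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def feed_forward(x, w_hidden, b_hidden, w_output, b_output):
    # Input layer -> hidden layer: every input feeds every hidden neuron.
    hidden = sigmoid(w_hidden @ x + b_hidden)
    # Hidden layer -> output layer: every hidden neuron feeds every output neuron.
    return sigmoid(w_output @ hidden + b_output)

# Illustrative sizes: 3 inputs (e.g. meteorological predictors), 4 hidden
# neurons, 1 output (e.g. next-day peak ozone, scaled to [0, 1]).
rng = np.random.default_rng(0)
w_hidden = rng.normal(size=(4, 3))
b_hidden = np.zeros(4)
w_output = rng.normal(size=(1, 4))
b_output = np.zeros(1)

x = np.array([0.2, 0.7, 0.1])  # one scaled input pattern
print(feed_forward(x, w_hidden, b_hidden, w_output, b_output))
```

Each hidden neuron receives every input, and the output neuron receives every hidden value, matching the connectivity shown in Figure 1.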
Artificial neurons comprise seven major components, which apply whether the neuron is used in the input, output or hidden layers (a minimal code sketch combining these components follows the list):
1) Weighting factors, which are adaptive coefficients within the network that determine the intensity of the input signal. These input connection strengths can be modified in response to various training sets, according to the network's specific topology or through its learning rules.
2) Summation function, which combines the weighted inputs into a single number. The summation function can be complex, since the inputs and weighting coefficients can be combined in many different ways before being passed on to the transfer function: depending on the algorithm chosen for combining neural inputs, it may select the minimum, maximum, majority or product, or apply one of several normalizing algorithms.
3) Transfer
function, which transforms the result of the summation function to a working
output. In the transfer function the summation total can be compared with some
threshold to determine the neural output. If the sum is greater than the
threshold value, the processing element generates a signal. If the sum of the
input and weight products is less than the threshold, no signal (or some
inhibitory signal) is generated.
4) Scaling and limiting. Scaling multiplies the transfer value by a scale factor and then adds an offset. The limiting mechanism ensures that the scaled result does not exceed an upper or lower bound.
5) Output function (competition). Neurons may be allowed to compete with each other, inhibiting other processing elements. Competitive inputs help determine which processing element will participate in the learning or adaptation process.
6) Error function and back-propagated value. The difference between the current output and the expected output is calculated and transformed by the error function to match the particular network architecture. This artificial neuron error is generally propagated backwards to a previous layer in order to modify the incoming connection weights before the next learning cycle.
7) Learning function, which modifies the variable connection weights on the inputs of each processing element according to some neural-based algorithm. The software first
adjusts the weights between the output layer and the hidden layer and then
adjusts the weights between the hidden layer and the input layer. In each iteration, the software adjusts the weights to
produce the lowest amount of error in the output data. This process “trains”
the network.
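To make these components concrete, the sketch below implements a single processing element in Python. The threshold transfer function, the clipping range, the simple delta-rule-style weight update and all numeric values are illustrative assumptions (component 5, competition between neurons, is omitted for brevity); they are not the specific functions described above.

```python
import numpy as np

def artificial_neuron(inputs, weights, threshold=0.0, scale=1.0, offset=0.0):
    # 1-2) Weighting and summation: weighted sum of the incoming signals.
    total = np.dot(weights, inputs)
    # 3) Transfer: fire (1.0) only if the sum exceeds the threshold.
    transfer = 1.0 if total > threshold else 0.0
    # 4) Scaling and limiting: scale, offset and clip to an allowed range.
    return float(np.clip(scale * transfer + offset, 0.0, 1.0))

def learning_step(weights, inputs, target, output, rate=0.1):
    # 6) Error function: difference between expected and actual output.
    error = target - output
    # 7) Learning function: adjust each weight in proportion to its input and the error.
    return weights + rate * error * np.asarray(inputs)

# Hypothetical weights and inputs for one processing element.
weights = np.array([0.5, -0.3, 0.8])
inputs = np.array([1.0, 0.4, 0.6])
output = artificial_neuron(inputs, weights)                           # forward pass
weights = learning_step(weights, inputs, target=0.0, output=output)   # one learning cycle
print(output, weights)
```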
4. Neural network training
Training and production are the two essential phases of a neural network application (Figure 2).
Figure 2: Essential
phases of the neural network application: training and production
The development of an ANN comprises a series of consecutive steps. In addition, a thorough knowledge of the process to be modelled is required.
The general steps to develop neural
networks for ozone forecasting are the following:
- Complete
historical data analysis and/or literature reviews to establish the air quality
and meteorological phenomena that influence ozone concentrations in the area
under study.
- Select parameters that accurately represent these phenomena. This is a critical aspect in developing the neural network, since an appropriate selection significantly improves the results obtained by the ANN.
- Confirm
the importance of each meteorological and air quality parameter using
statistical analysis techniques (Cluster analysis, correlation analysis,
step-wise regression, human selection).
- Create three data sets: a data set to train the network, a data set to validate the network's general performance and a data set to evaluate the trained network.
- Train the network on the training data set using neural network software. It is important not to overtrain the neural network on the developmental data set, because an overtrained network would predict ozone concentrations based on the random noise associated with the developmental data set. When presented with a new data set, such a network will likely give incorrect output, since the random noise in the new data will differ from the random noise of the developmental data set: the network memorized the training examples but did not learn to generalize to new situations.
One of the most commonly used methods for improving generalization is called "early stopping". In this technique, when the validation error increases for a specified number of iterations, training is stopped and the weights and biases at the minimum of the validation error are retained (see the sketch after this list).
- Test the trained network on the test data set to evaluate its performance. If the results are satisfactory, the network is ready to be used for forecasting.
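The sketch below shows one possible way to implement the data-set split and early stopping described above, using scikit-learn's MLPRegressor on entirely synthetic data. The predictors, network size and stopping parameters are illustrative assumptions, not recommendations from the text.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Synthetic stand-in for the historical record: four hypothetical predictors
# (e.g. temperature, wind speed, NOx, previous-day ozone) and an ozone target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=500)

# Hold back an independent test set for the final evaluation step.
X_dev, X_test, y_dev, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# early_stopping=True holds out a validation fraction of the development data
# and stops training once the validation score has not improved for
# n_iter_no_change iterations, keeping the weights from the best iteration.
model = MLPRegressor(hidden_layer_sizes=(8,),
                     early_stopping=True,
                     validation_fraction=0.2,
                     n_iter_no_change=10,
                     max_iter=2000,
                     random_state=0)
model.fit(X_dev, y_dev)

print("iterations run:", model.n_iter_)
print("test-set R^2:", model.score(X_test, y_test))
```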
5. Neural network operation
The operation of an ANN is simple and requires
little expertise.
Although use of the network does not require an
understanding of meteorology and air quality processes, it is advisable that
someone with meteorological experience be involved in the development of the
method and evaluate the ozone prediction for
reasonableness.
As part of a forecasting program, forecasters should regularly evaluate forecast quality. The verification process can be complex, since there are many ways to evaluate a forecast, including accuracy, bias and skill. Many verification statistics need to be computed in order to evaluate completely the quality of the forecast program.
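As a small illustration of such verification, the sketch below computes one possible set of statistics: mean absolute error as a measure of accuracy, mean bias, and the fraction of observed exceedances that were correctly forecast as a simple skill measure. The exceedance threshold and the example values are hypothetical.

```python
import numpy as np

def verification_statistics(forecast, observed, exceedance_threshold=120.0):
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    # Accuracy: average size of the forecast error.
    mean_abs_error = np.mean(np.abs(forecast - observed))
    # Bias: systematic tendency to over-forecast (> 0) or under-forecast (< 0).
    mean_bias = np.mean(forecast - observed)
    # Skill: fraction of observed exceedances that were also forecast.
    observed_exc = observed >= exceedance_threshold
    forecast_exc = forecast >= exceedance_threshold
    hits = np.sum(observed_exc & forecast_exc)
    detection = hits / observed_exc.sum() if observed_exc.any() else float("nan")
    return {"mean_abs_error": mean_abs_error,
            "mean_bias": mean_bias,
            "exceedance_detection_rate": detection}

# Hypothetical daily peak ozone forecasts vs. observations (same units).
print(verification_statistics([118, 135, 90, 150], [121, 128, 95, 160]))
```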
References
- Guideline for Developing an Ozone Forecasting Program. U.S. Environmental Protection Agency. July 1999.
- Artificial Neural Networks Technology. Data & Analysis Center for Software. August 1992.
- Ad-hoc Working Group on Ozone Directive and Reduction Strategy Development. Ozone Position Paper. July 1999.
- A.C. Comrie. Comparing Neural Networks and Regression Models for Ozone Forecasting. Journal of the Air & Waste Management Association. June 1997.
- G. Reyes; V.J. Cortés. Ozone Forecasting in the Urban Area of Seville Using Artificial Neural Network Technology. Urban Transport VII. WIT Press. 2001.
- S. Amoroso; M. Migliore. Neural Networks to Estimate Pollutant Levels in Canyon Roads. Urban Transport VII. WIT Press. 2001.