Research Proposal: Study of Uncertainty in Quantitatively Represented Phenomena
Introduction and Review
The research project offered for consideration aims to explore
some of the possible effects of uncertainty on the maximal
effectiveness with which quantitative phenomena can be represented
mathematically. In particular, the project intends to identify
and categorize specific types of systems in which the consequences
of uncertainty are influential to different extents, and to
identify how (if at all possible) either the minimum unavoidable
amount of uncertainty involved or the significance of its
effect on the mathematical representation may be methodologically
reduced.
This project will not consider a decrease in uncertainty
achievable by simply improving the quantity or quality of
input data (which is trivial), nor will it consider a decrease
in uncertainty through compromise of the mathematical description
that results in it aiming to be less informative (which is
simply removal of the problem). Furthermore, the project will
limit its consideration to dynamical systems in finite configuration
space of no more than three degrees of freedom for the sake
of simplicity.
UNCERTAINTY, whose intuitive linguistic meaning refers to
a disposition such that there is likelihood of discrepancy
between an estimation of a situation and the reality of the
same situation, is also known as INFORMATION in mathematics
(a misleading term that seems to refer to how much information
is known but in fact refers to how much information must be
invested to arrive at the same knowledge, which is precisely
what uncertainty means), and is formally defined as a real-valued
function I of events E in a probability space that is dependent
only on the probability P of the events, such that all of
the following conditions are simultaneously satisfied:
1) I(E) = 0 iff P(E) = 1
Events have zero uncertainty if and only if they have unit probability.
2) I(Ea) < I(Eb) iff P(Ea) > P(Eb) for any Ea and
Eb in the probability space
Uncertainty for events decreases as their probability increases.
3) I(Ea) + I(Eb) = I(Ea and Eb) for any independent Ea and
Eb in the probability space
Uncertainty of the coincident occurrence of independent
events is the sum of the uncertainties of their individual
occurrences.
All measurable functions satisfying these conditions can
be shown to be of the following general form:
I(E) = -c log(P(E))
where c is a positive constant.
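As an illustration of the general form above, the following minimal Python sketch evaluates I(E) = -c log(P(E)) and checks the three defining conditions numerically. The function name self_information, the base-2 logarithm and the default c = 1 are assumptions made for this example only; the definition itself fixes neither the base nor the constant.

```python
import math

def self_information(p, c=1.0):
    """Uncertainty I(E) = -c * log2(P(E)) of an event with probability p.

    The base-2 logarithm and c = 1 are illustrative choices (units of bits).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    # log2(1/p) equals -log2(p) and avoids returning -0.0 for p = 1
    return c * math.log2(1.0 / p)

# Condition 1: an event of unit probability carries no uncertainty.
certain = self_information(1.0)

# Condition 2: the less probable event carries more uncertainty.
rare, common = self_information(0.1), self_information(0.9)

# Condition 3: uncertainties of independent events add.
joint = self_information(0.5 * 0.25)
summed = self_information(0.5) + self_information(0.25)

print(certain, rare > common, joint, summed)
```

Changing c (or the base of the logarithm) only rescales the unit in which uncertainty is measured; the three conditions continue to hold.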
Often uncertainty is considered not for specific events but
for categories of events, corresponding to partitioned sections
of the probability space, where partitions may be constructed
to reflect realistically meaningful subdivisions of events
in what is being represented or modelled. (For a measurable
partition of the probability space, the expectation value
of the uncertainty function over the partition is the ENTROPY of the partition.)
Uncertainty plays a part in all of quantitative experimental/observational
science, which depends ultimately on measurement of numerical
data to provide input from which abstraction is performed
for general theoretical description of the phenomena in consideration.
This is due to imperfection in measurement technique, which
may be a consequence of technological limitations (e.g. finite
empirical resolution of measuring equipment), statistical
constraints (e.g. small sample size) or fundamental restrictions
imposed by the system itself (e.g. Uncertainty Principle of
quantum mechanics). Hence the conclusions of this project,
which will provide an orderly generic understanding of the
relation between phenomena and their associated uncertainties,
and potentially offer (or lead further research towards) a
formulaic route to optimizing mathematical representation
for minimal effect of uncertainty, are likely to be of academic
and practical interest to audiences not only in mathematics
and information theory but also in quantitative science in
general, from biology to finance. This project is in simple
terms asking whether the application of mathematics to science
can be made more efficient by improved human understanding
of uncertainty and thus improved 'engineering' of mathematical
representation. Considering the growing reliance of modern
science on quantitative methods, driven by the rapidly increasing processing
power of computation (leading to the rise of ‘brute
force’ numerical methods), it may be that uncertainty
will soon become the limiting factor on the strength of future
research; the question posed by this project is therefore
timely.
Existing work in the field of uncertainty is broad, but
little of it is of direct relevance to the angle which this
project intends to take, hence the following review of literature
will be brief. Experimental science has established simple
statistical techniques for error analysis, which is mainly
a branch of metrology that tells us the maximal accuracy to
which conclusions based on measurements can be drawn, and
is an area that is already mature. (“Practical
Physics”, G. L. Squires, CUP, 1985) The main ongoing
interest in uncertainty is within computer science, and most
of all in the development of artificial intelligence systems.
(Association for Uncertainty in Artificial Intelligence www.auai.org)
The main problems in this area concern the questions of how
artificial intelligence systems can be made to reason with
uncertainty and about uncertainty. Some of the main tools
studied in terms of their suitability for this purpose include
probabilistic logic, fuzzy logic and default logic. While
intuitively (based on the above definition) it can be appreciated
why probability initially appeared to be the most obvious
technical choice for dealing with uncertainty, the alternatives
arose as its practical weaknesses were gradually
learned. This however goes beyond the scope of this proposal.
(“Reasoning with Uncertainty” www.science.uva.nl/research/pion)
A more general area that is also implicitly concerned with
uncertainty is decision theory and its applications to phenomena
such as financial markets, which includes the analysis of risks arising
from lack of knowledge about outcomes. (International Journal
of Uncertainty, Fuzziness and Knowledge-Based Systems, various
articles).
Methodology (tentative and subject to development)
The method of investigation must begin with classification
of the different types of quantitative phenomena to be examined
and compared, which will be in terms of their structural properties,
since these will have a direct effect on their qualitative
behaviour within which any uncertainty is to manifest. The
properties to be considered are expected to include (but depending
on progress may not be limited to) the following:
EQUILIBRIUM / NONEQUILIBRIUM
An EQUILIBRIUM system refers to a system whose behaviour
is constant in time. Whether or not the behaviour of the system
on average changes with time creates different effects of
uncertainty. For a system known to be in equilibrium, uncertainty
will only cloud the position of the equilibrium. For a system
known not to be in equilibrium, on the other hand, uncertainty
will cloud not only the position of its trajectory but also
its shape, hence the same magnitude of uncertainty leads to
a more significant difficulty because more parameters are
involved.
STATIONARY / TURBULENT
A STATIONARY system refers to a nonequilibrium system whose
cause of varying behaviour does not itself change, while a
TURBULENT system refers to the complement of this (which in
real systems usually stems from causes external to the system).
By similar reasoning to the above, this distinction affects how
significant uncertainty will be among
nonequilibrium systems.
DETERMINISTIC / STOCHASTIC
A DETERMINISTIC system refers to a system whose behaviour
involves no genuine randomness whereas a STOCHASTIC system
refers to the complement of this. Randomness intrinsic to
the system will generate uncertainty of itself that may be
additional to the uncertainty arising from limited precision
of knowledge, thus the effect of uncertainty in these two
types of systems must be considered separately.
CHAOTIC / NONCHAOTIC
A CHAOTIC system refers to a nonequilibrium (typically deterministic)
system possessing sensitive dependence on initial conditions:
an arbitrarily small change at one time brings about
a large difference in the trajectory some time
later, so that without perfect knowledge of initial conditions
the long-term behaviour is rendered completely unpredictable.
The effect of uncertainty on chaotic systems, compared with that on
nonchaotic systems, is intuitively easy to perceive.
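Sensitive dependence on initial conditions can be illustrated with a short numerical sketch. The logistic map x -> r x (1 - x) is used here purely as an assumed stand-in (it is not one of the systems specified by this proposal): at r = 4 it is chaotic, while at r = 3.2 it settles onto a stable two-point cycle. Two trajectories starting 1e-10 apart are iterated side by side and their separation is recorded.

```python
def logistic(x, r):
    """One step of the logistic map (an assumed example system)."""
    return r * x * (1.0 - x)

def separation(x0, delta, steps, r):
    """Gap at each step between trajectories started at x0 and x0 + delta."""
    a, b = x0, x0 + delta
    gaps = []
    for _ in range(steps):
        gaps.append(abs(a - b))
        a, b = logistic(a, r), logistic(b, r)
    return gaps

chaotic_gaps = separation(0.2, 1e-10, 60, r=4.0)   # chaotic regime
regular_gaps = separation(0.2, 1e-10, 60, r=3.2)   # stable two-point cycle

print("largest gap, r = 4.0:", max(chaotic_gaps))
print("largest gap, r = 3.2:", max(regular_gaps))
```

In the chaotic case the gap grows, on average, by about a factor of two per step until it is comparable to the size of the interval [0, 1]; in the nonchaotic case it remains negligible throughout.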
MARKOVIAN / NON-MARKOVIAN
A MARKOVIAN system refers to a stochastic system which possesses
no memory of its past history. Uncertainty in such systems
will have a different effect from uncertainty in non-Markovian
systems which have previous information embedded in the present
trajectory segment.
Together, these choices of properties offer a variety of
different combinations of resulting systems. These systems
can all be represented as dynamical functions in no more than
three configuration dimensions. Examples of each of these
systems are to be computationally generated using numerical
solution of trajectories, and uncertainty is to be introduced
into the system by imposing an adjustable artificial limitation
in measurement accuracy (e.g. cutoff of higher order significant
figures of the trajectory iterates), which can easily be incorporated
into the program. The advantage of this approach is that while
the versions of the systems with added uncertainty are used
for the simulation, the more accurate versions of the systems
are also available for reference and comparison as the standard
against which the effectiveness of decreasing effects of uncertainty
can be gauged.
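The artificial limitation in measurement accuracy described above could be sketched as follows. The helper truncate_sig, the driver iterate and the use of the logistic map at r = 4 as the example system are all assumptions made for illustration; the proposal does not fix the systems or the program structure.

```python
import math

def truncate_sig(x, digits):
    """Cut x off after `digits` significant decimal figures (no rounding)."""
    if x == 0.0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    scale = 10.0 ** (digits - 1 - exponent)
    return math.trunc(x * scale) / scale

def iterate(step, x0, n, digits=None):
    """Iterate `step` n times; truncate every iterate if `digits` is given."""
    xs = [x0]
    for _ in range(n):
        x = step(xs[-1])
        if digits is not None:
            x = truncate_sig(x, digits)   # the added uncertainty
        xs.append(x)
    return xs

def logistic(x):
    """Assumed example system: logistic map at r = 4."""
    return 4.0 * x * (1.0 - x)

reference = iterate(logistic, 0.2, 50)           # no added uncertainty
coarse = iterate(logistic, 0.2, 50, digits=4)    # 4 significant figures kept
errors = [abs(a - b) for a, b in zip(reference, coarse)]

print("largest deviation from reference:", max(errors))
```

The digits argument plays the role of the adjustable magnitude of uncertainty, and the untruncated run serves as the standard against which the truncated run is compared. Decimal truncation in binary floating point is only approximate at the last retained digit, which is acceptable for a sketch but would need care in the actual study.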
1) Study of breakdown of deterministic representation:
Deterministic representation refers to treating the system
as one whose future behaviour can be completely predicted
given perfect knowledge of its current state. For systems
which are stochastic by construction (one of the categories
above), this approach obviously does not apply and would not
usually be used for practical modelling. For chaotic systems
(another category above) where it is well known that this
approach is pointless lacking infinite accuracy, this approach
would also not usually be used in practice. However, the purpose
of this study is to see how quickly uncertainty brings about
depreciation of available knowledge about the system, hence
it will involve all the categories.
For each of a range of different starting points, the trajectories
will simply be computed for different magnitudes of added
uncertainty as well as for the case with no added uncertainty.
Then the rate at which the initially similar trajectories
diverge can be compared across the different values of uncertainty,
which may be formalized in terms of metric or topological
entropy, Liapunov exponents, or some of the types of fractal
dimension. This would provide a basis for evaluating the significance
of uncertainty on each category of system were it assumed
to be deterministic in representation.
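Of the measures named above, the Liapunov exponent is perhaps the simplest to estimate numerically. For a one-dimensional map it is the long-run average of log|f'(x)| along a trajectory, and a positive value indicates exponential divergence of nearby trajectories. The sketch below assumes the logistic map x -> r x (1 - x) as a stand-in system; the function name and the iteration counts are arbitrary illustrative choices.

```python
import math

def lyapunov_logistic(r, x0=0.2, burn_in=1000, steps=20000):
    """Estimate the Liapunov exponent of x -> r*x*(1-x) by averaging log|f'(x)|."""
    x = x0
    for _ in range(burn_in):                      # discard the initial transient
        x = r * x * (1.0 - x)
    total = 0.0
    for _ in range(steps):
        slope = abs(r * (1.0 - 2.0 * x))          # |f'(x)| at the current point
        total += math.log(max(slope, 1e-300))     # guard against log(0)
        x = r * x * (1.0 - x)
    return total / steps

print("r = 4.0:", lyapunov_logistic(4.0))   # chaotic; theoretical value is ln 2
print("r = 3.2:", lyapunov_logistic(3.2))   # stable cycle; expected to be negative
```

An exponent of ln 2 per iteration means that an initial uncertainty roughly doubles at each step, so each additional binary digit of measurement accuracy extends the useful prediction horizon by only about one iteration.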
It would be premature and spurious to attempt outlining the
procedure more precisely at this stage, as much will depend
on the computational tools used as well as confinements of
time. It is anticipated that the details of how best
to make the comparisons comprehensively will
emerge from a survey of actual results. Hence it is expected
that a set of exploratory runs will first be
conducted to provide perspective on the procedure, after which
the collection of full results should be possible with greater
efficiency.
2) Study of applicability of probabilistic representation:
Probabilistic representation refers to describing a system
not in terms of its actual dynamical trajectory but only the
proportion of time on average that the trajectory will spend
in a certain place. For stochastic systems, this is the only
informative representation approach, and it can also be applied
to deterministic systems, and for the latter is especially
useful when uncertainty leaves deterministic representation
inappropriate. The purpose of this study is to examine the
accuracy achieved by various implementations of this approach
to the cases with added uncertainty compared to the same implementations
to the cases with no added uncertainty. In particular, comparisons
to be made will include: a) the range of accuracies by different
implementations within the same system category, b) the optimal
accuracies of the different system categories, and c) the
variation in accuracy with magnitude of added uncertainty
for each category. Together, these will yield an understanding
of the comparative applicability of probabilistic representation
for the different system categories under consideration.
A subspace of a predetermined size within the configuration
space must first be defined. The use of this subspace will
be to measure the proportion of time spent within it by the
trajectory, with and without added uncertainty. For a given
starting point, the system (with and without added uncertainty)
will be allowed to run until the proportion of time spent
by the trajectory within the subspace stabilizes. This is
to be repeated for many different starting points, and also
many different positionings of the subspace. This would produce
data showing the difference in proportion of time spent within
the subspace (hence the effectiveness of the probabilistic
representation) when the system of a particular category has
uncertainty added to it. Staying with the same category, the
procedure can be repeated for different magnitudes of added
uncertainty, which would produce data showing the effect of
increasing uncertainty on the probabilistic representation.
And then the whole procedure can be repeated for the other
categories, showing the comparative effectiveness of the probabilistic
representation on the different categories.
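The occupancy measurement described above could be sketched as follows. Everything specific here is an assumption for illustration: the logistic map at r = 4 as the system, the interval [0, 0.25) as the subspace, truncation to a fixed number of significant figures as the added uncertainty, and the block-based stopping rule used to decide when the proportion has stabilized.

```python
import math

def truncate_sig(x, digits):
    """Cut x off after `digits` significant decimal figures (no rounding)."""
    if x == 0.0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    scale = 10.0 ** (digits - 1 - exponent)
    return math.trunc(x * scale) / scale

def logistic(x):
    """Assumed example system: logistic map at r = 4."""
    return 4.0 * x * (1.0 - x)

def occupancy(step, x0, lo, hi, digits=None,
              block=5000, tol=1e-3, max_blocks=200):
    """Proportion of iterates falling in [lo, hi), run until it stabilizes.

    The run stops once the running proportion changes by less than `tol`
    between consecutive blocks of `block` iterations (an assumed criterion).
    """
    x, inside, total = x0, 0, 0
    previous, fraction = None, 0.0
    for _block in range(max_blocks):
        for _step in range(block):
            x = step(x)
            if digits is not None:
                x = truncate_sig(x, digits)       # the added uncertainty
            if lo <= x < hi:
                inside += 1
            total += 1
        fraction = inside / total
        if previous is not None and abs(fraction - previous) < tol:
            break
        previous = fraction
    return fraction

reference_fraction = occupancy(logistic, 0.2, 0.0, 0.25)
coarse_fraction = occupancy(logistic, 0.2, 0.0, 0.25, digits=3)

print("without added uncertainty:", reference_fraction)
print("with 3 significant figures:", coarse_fraction)
```

For this particular map the exact long-run proportion for [0, 0.25) is 1/3, which follows from its known invariant density and gives a check on the reference run. Repeating the call over many values of x0, many positions of the interval and several values of digits would produce the comparisons listed above.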
A further study (which may or may not be called for depending
on the results of the above) could involve breaking up the
subspace into even smaller sections to see the effect of uncertainty
on the distribution of the trajectory time within the subspace.
Alternatively, different sizes of subspace may simply be used
to find out the effectiveness of the probabilistic representation
at different scales, i.e. whether it preserves structure on
a large scale but destroys structure at smaller scales.
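Subdividing the subspace amounts to partitioning it into cells, and the entropy of that partition (as defined in the introduction) gives one possible single-number summary of how the trajectory time is distributed. The following sketch is only one way this might be done; the logistic map at r = 4, the equal-width cells on [0, 1], the run length and the function names are all assumptions.

```python
import math

def truncate_sig(x, digits):
    """Cut x off after `digits` significant decimal figures (no rounding)."""
    if x == 0.0:
        return 0.0
    exponent = math.floor(math.log10(abs(x)))
    scale = 10.0 ** (digits - 1 - exponent)
    return math.trunc(x * scale) / scale

def logistic(x):
    """Assumed example system: logistic map at r = 4."""
    return 4.0 * x * (1.0 - x)

def cell_distribution(step, x0, cells, steps, digits=None):
    """Proportion of iterates landing in each of `cells` equal cells of [0, 1]."""
    counts = [0] * cells
    x = x0
    for _ in range(steps):
        x = step(x)
        if digits is not None:
            x = truncate_sig(x, digits)           # the added uncertainty
        index = min(int(x * cells), cells - 1)    # x = 1.0 goes to the last cell
        counts[index] += 1
    return [n / steps for n in counts]

def partition_entropy(probabilities):
    """Expectation of -log2(p) over the cells; empty cells contribute nothing."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)

reference = cell_distribution(logistic, 0.2, cells=4, steps=50000)
coarse = cell_distribution(logistic, 0.2, cells=4, steps=50000, digits=3)

print("reference:", reference, partition_entropy(reference))
print("coarse:   ", coarse, partition_entropy(coarse))
```

Increasing cells probes finer scales, so comparing the reference and truncated distributions at several values of cells is one way to see whether added uncertainty destroys small-scale structure while leaving large-scale structure intact.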
Once again, the more refined details of the methodology must
await practical considerations and exploratory results, even
more so in this second case due to the greater complexity
of the study compared to the first.
In both studies, the value of the results produced is likely
to increase with the number of different starting points taken,
therefore as many computational runs as possible within available
time will be aimed for.
Dissemination possibilities (provisional)
The nature of this project, with the many different categories
of systems studied and thus many different individual sets
of comparisons that can be made between them, permits a possibility
of selecting out those most relevant to various realistic
quantitative phenomena and writing separate reports, each
of which may concentrate on a specific comparison of two categories
and the associated implications on modelling phenomena within
these categories. In this way, prospective audiences from
their own disciplines may find the conclusions of the research
more conveniently accessible. Also, by thus isolating the
qualitative differences between the categories for separate
consideration, the findings may be more easily understood
by general audiences without going into the quantitative details.
The length of these reports would in this case moreover be
reasonable for publishing in popular magazines or presentation
at academic (or even public) conferences should such opportunities
arise.
The numerical simulations developed by the methodology may
also benefit from translation into interactive graphics programs
for quick trials by the audience (who for example can select
the magnitude of uncertainty) in order to allow better visual
understanding of the findings. This may be most appropriate
with online electronic publishing, where the programs can be
offered as supplements to the written reports.
BIBLIOGRAPHY
- “Quantitative and Technical Analysis” - D. VVEDENSKY; unpublished
- “A logical approach to reasoning about uncertainty” - J. HALPERN; Discourse, Interaction and Communication; 1998; Kluwer
- “Dynamical Systems: Stability, Symbolic Dynamics and Chaos” - C. ROBINSON; 1999; CRC
- “Practical Physics” - G. L. SQUIRES; 1985; CUP
- “Association for Uncertainty in Artificial Intelligence” - www.auai.org; various
- “Reasoning with Uncertainty” - www.science.uva.nl/research/pion; various
- “Dictionary of Mathematics” - E. J. BOROWSKI & J. M. BORWEIN; 1989; Collins