Probability Theory, at least in the standard approach due mostly to Kolmogorov, is essentially the study of certain real-valued set functions, enriched by the near-mystical complexity of Independence/Dependence relationships. I will give a brief synopsis of this development.
We begin with a Ring of sets, say $\mathcal{R}$. This is a set of sets which is closed under the symmetric difference $\triangle$ and the intersection $\cap$. Note that closure under these two operations guarantees closure under union and set difference as well, since $A \cup B = (A \triangle B) \triangle (A \cap B)$ and $A \setminus B = A \triangle (A \cap B)$. An Algebra is simply a Ring with a Unit element, that is, a set (remember, elements of the Ring are sets) which acts as the identity on every other element of the Ring under intersection. Note that every ring contains the empty set (since $A \triangle A = \emptyset$), which is our identity under union. If these closure properties hold not only when combining pairs (and hence any finite number) of sets, but also under countably infinite unions and intersections, the Ring (or Algebra) is called a $\sigma$-Ring (or $\sigma$-Algebra).
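$\sigma$-algebras are also often called Borel fields; the term Borel Algebra is usually reserved for the $\sigma$-algebra generated by the open sets of a topological space. The classic example of a $\sigma$-algebra is the set of all subsets of a given set (its power set).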
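The closure properties above can be checked by brute force on a small finite example. The following is only an illustrative sketch: the helper names `power_set`, `is_ring`, and `unit_of` are hypothetical, and a finite family can only illustrate finite closure, not the countable case.

```python
from itertools import chain, combinations


def power_set(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def is_ring(family):
    """True if `family` is closed under symmetric difference and intersection."""
    return all(
        (a ^ b) in family and (a & b) in family
        for a in family
        for b in family
    )


def unit_of(family):
    """Return the member acting as identity under intersection, or None."""
    for u in family:
        if all((u & a) == a for a in family):
            return u
    return None


omega = frozenset({1, 2, 3})
family = power_set(omega)

# Union and difference are recovered from the two ring operations.
for a in family:
    for b in family:
        assert (a | b) == (a ^ b) ^ (a & b)
        assert (a - b) == a ^ (a & b)

# A family that fails closure: {1} symmetric-difference {2} = {1, 2} is missing.
not_a_ring = {frozenset(), frozenset({1}), frozenset({2})}
```

Here `is_ring(family)` holds and `unit_of(family)` returns the whole set, so the power set is an Algebra, while `not_a_ring` fails the check.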
Next, we define a measure as a mapping from a $\sigma$-algebra to the non-negative real line. Such a function is called Additive if the measure of the union of any two disjoint ``$\sigma$-subsets'' (members of the $\sigma$-algebra) is the sum of their measures. If this property holds under countable (but still pairwise disjoint) unions, the measure is said to be $\sigma$-additive.
A set endowed with a $\sigma$-algebra and a $\sigma$-additive measure on it is a Measure Space.
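Additivity can likewise be tested exhaustively on a finite family. This sketch assumes the power set of a three-element set as the $\sigma$-algebra; the names `is_additive`, `counting`, and `squared_size` are made up for illustration, with the counting measure serving as the additive example and the squared cardinality as a set function that is not a measure.

```python
from itertools import chain, combinations


def power_set(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(s) for s in subsets}


def is_additive(measure, family):
    """Check measure(A | B) == measure(A) + measure(B) for all disjoint A, B."""
    return all(
        measure(a | b) == measure(a) + measure(b)
        for a in family
        for b in family
        if not (a & b)  # only disjoint pairs are constrained
    )


def counting(a):
    """Counting measure: the number of elements in the set."""
    return len(a)


def squared_size(a):
    """A non-negative set function that is NOT additive."""
    return len(a) ** 2


family = power_set({1, 2, 3})
```

On a finite family there are only finitely many distinct sets, so finite additivity and $\sigma$-additivity coincide here; the distinction only matters for infinite collections.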
Now we are ready to build our probability theory.
A Probability Space is simply a Measure Space such that the measure of the entire space is 1. This is generally notated as the triplet $(\Omega, \mathcal{F}, P)$, representing the set $\Omega$ over which the $\sigma$-algebra $\mathcal{F}$ is defined as the domain of $P$, the measure.
This is enough to develop classical probability theory. We can call the elements of $\Omega$ the Elementary or Simple Events, the elements of $\mathcal{F}$ the Compound or Complex Events, and $P$ the Probability Distribution.
This only allows us abstract, set-theoretic events and outcomes. We need to develop Random Variables to do quantitative modelling. We define a Random Variable (``RV'') as a (measurable) mapping from $\Omega$ to $\mathbb{R}^n$. This is written, for an RV $X$,
\[ X : \Omega \to \mathbb{R}^n . \]
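Note that these can, and almost always will, be vector-valued Random Variables, though I will refer to them simply as Random Variables for generality. Linear Combinations of Random Variables can be defined pointwise as follows, given two RVs $X$ and $Y$ and real constants $a$ and $b$:
\[ (aX + bY)(\omega) = a X(\omega) + b Y(\omega), \qquad \omega \in \Omega . \]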
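Since a Random Variable is just a function on the sample set, a linear combination is the pointwise combination of two functions. A minimal sketch, assuming a two-dice sample set and scalar-valued RVs for brevity; the names `omega`, `X`, `Y`, and `linear_combination` are illustrative choices, not notation from the text.

```python
# Sample set: all ordered outcomes of rolling two six-sided dice.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]


def X(w):
    """Value shown by the first die."""
    return w[0]


def Y(w):
    """Value shown by the second die."""
    return w[1]


def linear_combination(a, X, b, Y):
    """Return the RV w -> a * X(w) + b * Y(w), defined pointwise."""
    return lambda w: a * X(w) + b * Y(w)


# The difference of the two dice is itself a Random Variable on omega.
Z = linear_combination(1, X, -1, Y)
values_of_Z = sorted({Z(w) for w in omega})
```

The same construction works componentwise for vector-valued RVs, for instance if `X` and `Y` returned NumPy arrays of equal length.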
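Lastly, we note that $P_X = P \circ X^{-1}$, that is, $P_X(B) = P(\{\omega \in \Omega : X(\omega) \in B\})$, defines a probability measure over $\mathbb{R}^n$. Thus, a Random Variable can be thought of as defining a Probability Space $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n), P_X)$ over $\mathbb{R}^n$, where $\mathcal{B}(\mathbb{R}^n)$ denotes the Borel sets of $\mathbb{R}^n$. In fact, it is possible to go back the other way. It can be shown that, given such a Probability Measure over $\mathbb{R}^n$, there must exist an $\Omega$ and $X$ which induce it (for example, take $\Omega = \mathbb{R}^n$ and let $X$ be the identity map). So it would seem that we can associate the measure over $\mathbb{R}^n$, rather than the mapping from $\Omega$ to $\mathbb{R}^n$, with the RV. See [Eaton] for a more in-depth development.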
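For a finite sample set, the measure induced by a Random Variable can be computed directly by summing the probabilities of all outcomes that map to each value. The sketch below assumes two fair dice with the uniform measure; `pushforward`, `total`, and `identity` are hypothetical helper names, and exact fractions are used to avoid rounding.

```python
from collections import defaultdict
from fractions import Fraction


def pushforward(P, X):
    """Return the induced measure: value x -> P({w : X(w) = x})."""
    dist = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for w, p in P.items():
        dist[X(w)] += p
    return dict(dist)


# Uniform probability measure on the outcomes of two fair dice.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = {w: Fraction(1, 36) for w in omega}


def total(w):
    """Random Variable: the sum of the two dice."""
    return w[0] + w[1]


def identity(x):
    """Identity map, used to go back from a measure to an RV."""
    return x


# Measure induced on the values of `total`.
P_X = pushforward(P, total)

# Going the other way: take the value space itself as the sample set
# and the identity as the RV; the induced measure is P_X again.
P_back = pushforward(P_X, identity)
```

The last two lines mirror the converse remark in the text: any such measure is induced by the identity map on its own value space.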