We may characterize the filtering problem as the attempt to calculate the projection (in
the space of random variables) of the state onto the subspace generated by the observations. That is,
we will be looking for an estimate of the state that is a linear function of the
observations up to (and including) time n. The space of all such functions is the linear
span of those observations. Why is the projection the random variable we seek?
There are two possible ways to view this: