Areté: How to do Stochastic Calculus

Context

I have recently had to learn a mathematical approach called "stochastic calculus" as part of my actuarial exams. As misery loves company, I've decided to share the details with you...

This post is a short tutorial in stochastic calculus, with an emphasis on getting to the point where we can solve practical problems and skipping as much foundational material as I can get away with.

What is stochastic calculus?

Stochastic calculus is a useful tool in financial maths. In normal calculus, you might take a function and find its derivatives (gradient, curvature, etc) as time changes. Or you might take a differential equation (an equation relating a function to its derivatives) and use it to figure out what the corresponding function looks like.

"Stochastic" just means probability-related. In stochastic calculus, you take a random variable and find its derivatives, or take a differential equation and find the random variable it represents.

An example. Let's say we're looking at the size (n_t) of a population of lemurs. We reckon that the rate of population change is directly proportional to the size of the population. In mathematical notation:
dn_t/dt = g.n_t
where g is a constant growth factor (or shrinkage factor if negative). If we solved this with differential calculus we'd find that:
n_t = n₀.exp(g.t)

But wait! Real populations don't work like that. They behave far more randomly, which can lead to very different results. In particular, you'll notice that, no matter how negative the growth rate is, the lemurs can never go extinct. Sadly this is not true of real animals.

How do we represent this uncertainty? One way is to add a "noise" term to the calculation. So:
dN_t/dt = g.N_t + v.W_t.N_t
where v is a constant volatility factor and W_t is a continuously-changing random variable - a "stochastic process". You'll note that we're now using N_t rather than n_t, to show that population size is probabilistic rather than predestined.

Brownian motion

Which stochastic process should we use for our noise term? This would depend very much on the situation - our team of biologists would need to research how species' populations change in practice. However, a very popular choice is based on "Brownian motion". It's typical to try to reframe any stochastic problem in terms of this process.

Brownian motion is named after the phenomenon of pollen grains bouncing around in water (they're small enough to be affected by the impact of individual water molecules). It is a "random walk" process, with several nice properties:

- it is distributed as N(0,t²) (e.g the standard deviation at time t is t^0.5)
- it has independent increments, so you can model B_2t as B_t + B'_t
- dB_t.dB_t = dt (please take it on trust that this is important, we'll see how it's relevant later)

In the case of our population equation, we might set W_t such that W_t.dt = dB_t. In other words, B_t is in some sense the integral between times 0 and t of W_t. This allows us to rephrase our equation as:
dN_t = g.N_t.dN_t + v.N_t.W_t.dt
= g.N_t.dN_t + v.N_t.dB_t

Interlude: What the hell is that?

I keep using terms like dB_t and dN_t to mean increments of random variables. Intuitively, this doesn't make a heck of a lot of sense. Random variables, even time-dependent ones, don't have fixed values - that's kinda the point - so how the blazes can they have well-defined increments?

The answer is: the increment is also a random variable, just an infinitesimally small one (in the same way that dt is an infinitesimally small deterministic variable). When we say (for example) I = ∫₀^t 1.ds, the answer is clearly: I=t. When we say I = ∫₀^t 1.dB_s, the answer is equally obvious: I = B_t, the sum of the random increments.

More complex expressions like ∫₀^t B_s.dB_s are quite hard to understand intuitively. Multiplying a random variable by an infinitesimal random variable and taking an infinite sum? Wtf? Basically, the purpose of stochastic calculus is to allow us to transform messy situations like this back into simple ones like the two in the preceding paragraph.

The Ito formula

So anyway, we now know what the stochastic differential equation looks like. How do we solve it? We use an equation called the Ito formula to figure out what it should look like, and then we forcibly bring the two forms together.

The Ito formula says that if:
dX_t = u.dt + v.dB_t
(u, v real numbers)

and:
Y_t = f(t, X_t) for some function f(t,x)

then:
dY_t = ∂f/∂t.dt + ∂f/∂x.dX_t + 1/2.∂²f/∂²x.dX_t.dX_t
using e.g. ∂f/∂x as a shorthand for ∂f/∂x(t, X_t)

Usually it makes sense to take X_t = B_t.

A simple example. Let's say we're evaluating I_t = ∫ B_t.dB_t. We can rephrase this as: dI_t = B_t.dB_t

The variable I_t will end up depending on time and on the effects of the random variable, so we can write: I_t = f(t, B_t). Therefore, by Ito's formula:
dI_t = ∂f/∂t.dt + ∂f/∂x.dB_t + 1/2.∂²f/∂²x.dB_t.dB_t
= (∂f/∂t + 1/2.∂²f/∂²x).dt + ∂f/∂x.dB_t because dB_t.dB_t = dt (see the section on Brownian motion above).

Since we know: dI_t = 0.dt + B_t.dB_t
we can say that: ∂f/∂x = x
and: ∂f/∂t = -1/2.∂²f/∂²x
From the first of these partial derivatives, we have:
f = x²/2 + g(t)
We can now rewrite the second equality as: ∂f/∂t = -1/2.1
giving: f = -t/2 + h(x)
So overall: f = x²/2 - t/2

Therefore (drumroll), we have: I_t = f(t,B_t) = B_t²/2 - t/2

That population equation

Now we've got the tools in place, let's solve that population equation. First assume that there is some function f(t,x) such that N_t = f(t,B_t)

We know that:
dN_t = g.N_t.dN_t + v.N_t.dB_t
and that:
dN_t = (∂f/∂t - 1/2.∂²f/∂²x).dt + ∂f/∂x.dB_t

Therefore:
∂f/∂x = v.f
f = exp(v.x).p(t)
(I'm using p(t) and q(x) as my dummy functions rather than g(t) and h(x), to avoid confusion between g and g(t))

And:
∂f/∂t + 1/2.∂²f/∂²x = g.f
∂f/∂t + 1/2.v².f = g.f
∂f/∂t = (g - v²/2).f
f = exp((g - v²/2).t).q(x)

So overall we can infer that:
f = C.exp(v.x + (g-v²/2).t) for constant C

In other words:
N_t = N₀.exp(v.B_t + (g-v²/2).t)

QED. This is similar, but not identical to what we'd expect from looking at the traditional differential equation.

How we can use this

Now, just having an equation is not terribly useful. But once we've got it we can start asking interesting questions such as "what is the probability that the population of lemurs will have shrunk at time t?"

P(N_t < N₀) = P(N₀.exp(v.B_t + (g-v²/2).t) < N₀)
= P(exp(v.B_t + (g-v²/2).t) < 1)
= P(v.B_t + (g-v²/2).t < 0)
= P(B_t < (v²/2-g).t/v)

Since B_t ~ N(0,t), this can be worked out using a set of tables. For example if g = 5%, v = 2%, and t = 5, then:
P(N₅ < N₀) = P(B₅ < (0.0004/2 - 0.05)*5/0.02)
= P(Z < (0.0004/2 - 0.05)*5/0.02/5^0.5)
= P(Z < (0.0004/2 - 0.05)*5/0.02/5^0.5)
= P(X < -5.5678)

Past a certain point the population of lemurs is likely to go extinct (although this is not modelled well by the equation - N_t can never reach 0). If the population was declining, and we could figure out values of g and v, we could determine the probability of extinction. We could then allocate our animal conservation money appropriately.

Applying these principles to pension fund capital is left as an exercise to the reader.

Why is that not the full story?

There are several things I've left out here, two mathematical points and two practical points.

Firstly, W_t is not a "real" distribution. If you actually try to work out what dB_t/dt looks like, you find that it is infinite at infinitely many points. That is just plain weird.

Fortunately, there is a mathematical hold-all concept - the "extended distribution" that sweeps all that under the carpet. And since it all works out OK, the specifics aren't really a concern, any more than you need to know Real Analysis to understand differential calculus.

Or are they a concern? Point #2 is that there are actually several ways to define ∫...dB_t, which give mutually exclusive results. Explaining the differences is beyond the scope of this post. Suffice to say that the one we've been using - the Ito integral - is mathematically nice.

Practically speaking, what we've looked at here is only half the battle. As responsible scientists, we'd want to produce sensible values for g and v, and in general confirm that the population is behaving as the equation predicts. This calibration is a nightmarish task that I'm not even going to begin to discuss here.

And finally there's the question of whether our general approach - all founded, you'll recall, on Brownian motion - is remotely sensible. There has been a lot of muttering recently pointing out that extreme events are far more likely than the normal distribution would suggest. It is very hard to produce Brownian processes that can emulate this behaviour, and nigh-on impossible to calibrate. There is a strong suggestion that a broader family called Levy processes might be more plausible, but they are also a bugger to calibrate.

This sort of thing won't be solved any time soon.

References

I learned SC from the textbook by Bernt Oksendal (5th edition). In retrospect this was probably a mistake, as he goes very deeply into the mathematical fundamentals before covering anything remotely practical. If you can get past the first 3 chapters without falling asleep then it's not a bad book, although not terribly readable (even by maths book standards...).

Areté

Wednesday, March 17, 2010

How to do Stochastic Calculus

1 comment:

About Me

Blog Archive

Recent Comments

Blogroll: science and skepticism

Blogroll: Other