*Edit 14 June 08: I'm marking this post as purgeable because large chunks of it are borderline unreadable, even to me. If anyone finds the subject matter particularly fascinating, leave a comment and I'll rewrite it.*

There is a phenomenon in the educational system known as "dumbing down". It's a lowest-common-denominator situation: when some people have trouble understanding a concept, you either remove the concept from your syllabus or spend crazy amounts of time explaining it.

The problem this creates is that the "improved" material is far harder for smart people to understand. Case in point: my actuarial study notes...

Actuaries are (ideally) clairvoyant accountants. Whereas a normal accountant can only see what your assets are worth now, an actuary can look at something like the cashflows associated with a pension scheme or insurance policy and make an educated guess about how much those cashflows will be worth in the future. This requires substantial maths to try to pin down the odds of various events (death, illness, car crashes) happening to the individual who took out the policy.
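By way of illustration (my own hypothetical numbers, nothing from any actual syllabus), here's a minimal sketch of the kind of calculation this involves: the expected present value of a three-year annuity, discounting each payment and weighting it by the probability that it's actually paid.

```python
# Hypothetical illustration: a policy paying 100 at the end of each of
# the next three years, provided the policyholder is still alive.
# All numbers are invented for the example.
interest = 0.05                   # assumed flat annual interest rate
survival = [0.99, 0.97, 0.94]     # assumed P(alive at end of year t)

# Expected present value: discount each payment back to today and
# weight it by the probability that it is actually made.
epv = sum(100 * p / (1 + interest) ** t
          for t, p in enumerate(survival, start=1))
print(round(epv, 2))  # → 263.47
```

The real exercises involve far more elaborate decrement tables and cashflow structures, but the shape of the problem is the same: probability-weighted, discounted cashflows.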

So actuaries have to be quite heavily trained. In England, this training is regulated by the Institute of Actuaries. The IoA run the actuarial exam system. They also provide a sort of syllabus known as the "Core Reading", which is a very terse description of everything examinable.

Problem is, a lot of people who take the exams don't have a strong maths or finance background. So a secondary market has grown up, which is mostly filled by the Actuarial Education Company (ActEd), a for-profit organisation supplying lecture notes. ActEd license use of the Core Reading material from the IoA, and they intersperse it with lots of detailed discussion.

This can sometimes be very useful. For example, where the Core Reading might just list a type of tradeable asset by name, the ActEd notes will provide a detailed description with examples. However, sometimes it can be very very annoying.

Currently I'm trying to read up on Markov Chains. For me, this is not a difficult concept. But revision is going veeery veeery sloooooowly, because for every five lines of actual maths I also have to digest two pages of wordy, confusing, often seriously dubious information. This slows my learning pace down to a crawl, not least because I frequently have to re-read stuff to convince myself that yes, they are bothering to say something that obvious.

If their only problem was overdocumentation of the Core Reading, I'd be unhappy but I'd accept it. I'm aware that many people (cough*economists*cough) don't have my familiarity with mathematical terminology, and these people also deserve some support. But what I can't handle is the fact that the Core Reading is also underdocumented.

Sounds paradoxical? Well, let me explain. The average mathematical proof, as written down, will contain approximately one part non-obvious statements to four parts obvious ("trivial") statements. The trivial bits are just mathematical filler - they make it easy to see the links between the key non-trivial assumptions and the conclusion.

As a mathematician, I would expect ActEd to take the following approach: lay out the whole proof, devote a fair amount of time to justifying the key assumptions, and if necessary spend a smaller number of column inches dissecting and rephrasing the rest of the logic.

Apparently ActEd disagrees. Their main approach appears to be: go through the proof step by step (no overview), ignore any steps that would take too long to explain, and go into mind-numbingly overwrought detail on the bits that economists might possibly be able to get their heads round. It's a sort of triage approach.

For example, one section of the material I'm working on is devoted to "time-homogeneous Markov jump processes". This is a very technical name, but the concept is simple. Imagine a system (say a person) that can be in a number of states (say healthy, sick or dead). Specify the rates of transition between the various states (for instance the rate of transition from dead to healthy is zero), and assume those rates stay constant over time - that's all "time-homogeneous" means.

One problem we often need to solve is: what are the odds of staying in a certain state S for a certain length of time T? Now it's fairly easy* to find out the odds of a person starting in one state (say healthy) and being in that state again in e.g. two years' time. But that doesn't take into account the possibility that they might have skipped between states (healthy->sick->healthy).
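To make the distinction concrete, here's a minimal sketch (my own invented example, not anything from the notes) using a two-state healthy/sick chain with made-up rates sigma and rho. The standard closed form for the two-state chain gives the probability of being healthy at time t; it exceeds exp(-sigma*t), the probability of never having left, precisely because it also counts the healthy->sick->healthy round trips.

```python
import math

sigma, rho = 0.5, 0.2   # invented rates: healthy->sick, sick->healthy
t = 2.0                 # two years

# P(healthy at time t | healthy at time 0), standard two-state closed
# form - this allows any number of healthy->sick->healthy round trips.
p_occupy = rho / (sigma + rho) + sigma / (sigma + rho) * math.exp(-(sigma + rho) * t)

# P(never leaves the healthy state during [0, t]): the holding time in
# a state of a time-homogeneous jump process is exponential(sigma).
p_stay = math.exp(-sigma * t)

print(p_occupy > p_stay)  # True: being there at time t is easier than staying
```

The gap between the two numbers is exactly the probability mass contributed by paths that left and came back.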

One approach to calculating this is as follows. Define a set of events** B_k. B_0 is the event that the system is in state S at times 0 and T. B_1 is the event that the system is in state S at times 0, T/2 and T. B_2 gives S at times 0, T/4, T/2, 3T/4 and T. And so on, with each new event doubling the number of times after time 0 at which the system must be in state S. All of these are relatively easy to calculate.
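Why "relatively easy"? By the Markov property and time-homogeneity, B_k just asks for 2^k successful legs of length T/2^k each, so given a start in S, P(B_k) = p_SS(T/2^k)^(2^k). A sketch using an invented two-state chain (rate sigma out of S, rate rho back in; p_SS is the standard two-state closed form):

```python
import math

sigma, rho = 0.5, 0.2   # invented rates: out of S, back into S
T = 1.0

def p_SS(t):
    """P(in state S at time t | in S at time 0), two-state closed form."""
    return rho / (sigma + rho) + sigma / (sigma + rho) * math.exp(-(sigma + rho) * t)

def p_B(k):
    """P(B_k): in S at all 2**k + 1 checkpoint times 0, T/2**k, ..., T."""
    steps = 2 ** k
    # Markov property: each of the 2**k legs succeeds independently.
    return p_SS(T / steps) ** steps

for k in range(3):
    print(k, round(p_B(k), 4))
```

Notice that each extra level of checkpoints can only lower the probability - a preview of the nesting discussed next.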

Notice two other things. Firstly, event B_k "contains" event B_j if j is less than k. For example, you can't hit S at times 0, T/2 and T (event B_1) without automatically hitting S at times 0 and T (event B_0).

Secondly, the event that the system stays in state S until time T (the thing we were having trouble calculating) is the intersection of all the events B_k. You achieve stasis (an event we'll call {T_0 > T}, meaning that the first transition time is later than T) only by achieving every one of the events B_k.

This is hard to see, but imagine if the system hopped out of state S for just a minute and then hopped back in again. There would be some value of k large enough that a multiple of T/2^k would fall into that minute. So if you leave state S, you forfeit that event, and therefore you forfeit the intersection of the whole ensemble of events.

So we can now outline a method for calculating the probability of {T_0 > T}.

Step 1: {T_0 > T} = ∩_{0-∞}(B_k)

The ∩ symbol means an intersection of events - the event that every one of the B_k occurs - and the ∞ means that you include all such events right up to k = infinity.

Step 2: ∩_{0-∞}(B_k) = lim_{n→∞}(∩_{0-n}(B_k))

This means that, if you take a finite intersection (up to n) and then let n tend to infinity, you get the same result as just going straight to infinity.

Step 3: lim_{n→∞}(∩_{0-n}(B_k)) = lim_{n→∞}(B_n)

This follows from our earlier note that each of these events "contains" all lesser events. If you take the intersection of a finite number of such nested events, you just get the most demanding of those events, namely B_n.
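The nesting is easy to verify mechanically. Representing each B_k by its set of checkpoint times (a sketch of my own, with T = 1): each level's checkpoints contain the previous level's, which is exactly why demanding B_0 through B_n is the same as demanding B_n alone.

```python
from fractions import Fraction

def checkpoints(k, T=Fraction(1)):
    """The times at which event B_k requires the system to be in state S."""
    step = T / 2 ** k
    return {i * step for i in range(2 ** k + 1)}

# Each level of checkpoints contains the previous one (so, as events,
# B_3 implies B_2 implies B_1 implies B_0)...
assert checkpoints(0) <= checkpoints(1) <= checkpoints(2) <= checkpoints(3)

# ...so pooling the requirements of B_0, ..., B_3 demands nothing beyond
# what B_3 already demands on its own.
pooled = set().union(*(checkpoints(k) for k in range(4)))
assert pooled == checkpoints(3)
```

Exact `Fraction` arithmetic is used so the dyadic times compare exactly, with no floating-point fuzz.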

Step 4: Therefore P({T_0 > T}) = lim_{n→∞}(P(B_n))

P(X) just means "probability of event X taking place". (Strictly, swapping P and the limit relies on probability measures being continuous along decreasing sequences of events, but that's a standard result.)

We now have a nice elegant approach to calculating this nasty value P({T_0 > T}). We can just find a formula for the values P(B_n), and then see what happens as n gets bigger and bigger. The limiting value of this sequence will be the probability we're after.
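And for a concrete chain the recipe actually works. A minimal sketch with an invented two-state chain (state S plus one other; rate sigma out of S, rate rho back in): by the Markov property P(B_n) = p_SS(T/2^n)^(2^n), and since the holding time in S is exponential with rate sigma, the limit should be exp(-sigma*T).

```python
import math

sigma, rho = 0.5, 0.2   # invented rates: out of S, back into S
T = 1.0

def p_SS(t):
    """P(in S at time t | in S at time 0), standard two-state closed form."""
    return rho / (sigma + rho) + sigma / (sigma + rho) * math.exp(-(sigma + rho) * t)

def p_B(n):
    """P(B_n): in S at all 2**n + 1 checkpoint times 0, T/2**n, ..., T."""
    steps = 2 ** n
    return p_SS(T / steps) ** steps

# P(B_n) decreases towards P({T_0 > T}) = exp(-sigma * T) as n grows.
for n in (0, 5, 10, 20):
    print(n, p_B(n))
print("limit:", math.exp(-sigma * T))
```

Doubling the checkpoints squeezes out ever-shorter excursions from S, and the sequence settles on the probability that the first jump happens after time T.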

Now, this was not an easy proof for me to explain. But allow me to point out three things:

1) Anyone who's got this far in the course will already have a thorough understanding of terminology like events, unions, probabilities, etc. ActEd doesn't need to explain that at this point.

2) The ActEd notes focus entirely on the symbol-manipulation steps 2-4. These steps are basically trivial - they shouldn't require much more than the line or two of elaboration I gave them. The mathematically interesting step 1, which *should* be analysed in detail, is completely ignored.

3) And yet they still took more words to describe this proof than I've used in this entire essay!

Reading these notes is giving me such a headache.

* Trust me on this. Or alternatively complain loudly and I'll provide you with a more detailed explanation.

** An event is a thing that may or may not happen, depending on the random behaviour of the system being studied.
