Wednesday, September 20, 2006

The joys of jobhunting

Today I had the interview (flowerily termed an "assessment centre") for the company I mentioned previously. Thought it went quite well, although the other applicants were scarily good.

Actually, it was the most challenging interview process that I've had yet - and I thoroughly enjoyed it. First there was a test. I'd assumed it'd be a standardised test. Boy was I wrong. They asked us everything from complex actuarial case studies, through our thoughts on the company and role, to translating random Esperanto phrases (based on their homologies with other languages - we weren't expected to speak Esperanto).

Another stage was to produce and perform a ten-minute presentation on any subject. Now, being a skeptic by trade, I figured there might be a way to leverage my wasted time online here. I ended up doing my presentation on "Cranks and how to spot them". There was one restriction I had to deal with, though - I didn't know how my interviewers felt about any of the cranks I was planning to dissect, so I was forced to stick to marginal stuff like astrology, dowsing and the Hoxsey treatment. In particular, I felt unable to risk going after the big guns of Creationism. I thought there'd be too high a chance of my interviewers turning out to be Hovind devotees.

Or at least I did until 30 minutes into the test, when I turned a page to see the question:
Describe how evolution operates by natural selection.


I WANT THIS JOB DAMMIT!!!

Tuesday, September 19, 2006

CSI: the explanation

(No I'm not talking about Crime Scene Investigation, fool!)

OK, so I really should be focusing on the Protein Challenge. However, being easily sidetracked, I got thinking about Dembski's concept of CSI - Complex Specified Information. It's a surprisingly hard concept to understand, not least because AFAICT Dembski makes it as difficult as possible to do so.

The basic principle is that evolutionary processes aren't supposed to be able to produce structures that are improbable (for a sufficiently well-defined value of "improbable"), complicated (ditto) and specified. The concept of a specification is basically an attempt to extend the basic probabilistic concept of an event to handle post-hoc reasoning. A specification defines a target space of possible things that can happen.

The argument goes as follows:

1) Chance processes tend not to give results that are both unlikely and specified. So, for example, drawing 13 cards and getting all spades is highly unlikely, and you'd rightly assume that someone had tinkered with the deck.

2) Natural processes (regularities) tend not to give results that are both complex and specified. For example, though a snowflake may be complex, you won't get the same one twice.

3) Hence, anything that is complex, improbable and specified is most likely the result of intelligent intervention (nb. human brains apparently don't qualify as natural processes).
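
To put a number on the card example in step 1, here's a quick sketch of the arithmetic. This is my own back-of-envelope calculation, assuming a standard 52-card deck and a fair shuffle:

```python
from math import comb

# Number of distinct 13-card hands that can be drawn from 52 cards.
hands = comb(52, 13)

# Exactly one of those hands matches the specification "all spades".
p_all_spades = 1 / hands

print(hands)          # 635013559600
print(p_all_spades)   # about 1.57e-12
```

Every other individual hand is exactly as unlikely as this one. What makes the all-spades hand suspicious is that it's unlikely and it matches a pattern we could have named in advance.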

There are some problems when you try to extend this to evolution and genetic algorithms and so on - both are quite capable of generating complex, improbable, extremely useful systems. Dembski gets round this by saying that genetic algorithms can generate CSI if and only if the target space associated with the specification represents a local optimum of the fitness function. GAs work if and only if the problem you feed them (fitness function) is actually the one you want solved (specification).

That's why examples like the infamous "methinks it is like a weasel" work - the specification we choose (the text) is 'coincidentally' identical to the optimum of the fitness function. Dembski, if I understand correctly, points out that, unless we select our specification to correspond to the fitness function we're using, we still won't generate CSI. We'll have complex information, but it won't match the right specification. As such, he feels justified in saying that, in feeding the GA the right problem for our desired solution, we're "smuggling" CSI into the system.
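
For anyone who hasn't seen it, a weasel-style program can be sketched roughly as below. This is a minimal illustration, not anyone's canonical version: the population size, mutation rate, seed and the choice to keep the parent in the pool are all assumptions of mine. Note that the fitness function is literally "how many letters match the target", which is exactly the point made above.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "


def fitness(candidate, target=TARGET):
    """Count the positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def mutate(parent, rate, rng):
    """Copy the parent, replacing each character with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(target=TARGET, offspring=100, rate=0.05, seed=0, max_generations=10_000):
    """Return (best string found, generations used)."""
    rng = random.Random(seed)
    parent = "".join(rng.choice(ALPHABET) for _ in target)
    for generation in range(max_generations):
        if parent == target:
            return parent, generation
        children = [mutate(parent, rate, rng) for _ in range(offspring)]
        # The parent stays in the pool, so fitness never goes down.
        parent = max(children + [parent], key=lambda c: fitness(c, target))
    return parent, max_generations


result, generations = weasel()
print(result, generations)
```

Swap in a different fitness function (say, "number of vowels") while keeping the same specification, and the run will still produce something, just not the weasel sentence.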

The problem here is that, looking at the "fitness function" to which real-world genes are exposed, we see that it's basically something along the lines of "ability to survive and breed". In that context, the ability for a gene or combination of genes to produce something like a flagellum would certainly be of value for survival, and hence could represent a local optimum of the fitness function. The flagellum could evolve despite its CSI, because evolution would be selecting for the same underlying trait that we're basing our specification on - ability to live long and prosper.

Thus, simply by basing his specifications on the functionality of a system, Dembski is setting up a range of target spaces that evolution can quite definitely find. It's something of a Texas Sharpshooter issue - Dembski is running round painting targets around all the areas that evolution by natural selection is naturally inclined to hit.

Key terms:

Complex - refers to Kolmogorov complexity, best thought of as a measure of how easily a system can be described. So, for example, "AAAAAAAAAAAAAAAAA" would be low-complexity, "AABBCCDDEEFFGGHH" would be higher, and a random string like "BJECDWYIVFYUEUBUFIIHI" would be highest.
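
Kolmogorov complexity itself can't be computed in general, but compressed size gives a crude upper-bound stand-in. The sketch below uses zlib for that purpose, which is my substitution rather than anything Dembski uses, and it uses longer strings than the examples above because a compressor's fixed overhead swamps the differences on very short inputs.

```python
import random
import zlib


def compressed_size(text):
    """Bytes needed by zlib at its highest compression level."""
    return len(zlib.compress(text.encode("ascii"), 9))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

repeated = "A" * n
patterned = "AABBCCDDEEFFGGHH" * (n // 16)
scrambled = "".join(rng.choice("ABCDEFGHIJKLMNOPQRSTUVWXYZ") for _ in range(n))

for name, text in [("repeated", repeated),
                   ("patterned", patterned),
                   ("scrambled", scrambled)]:
    print(name, len(text), compressed_size(text))
```

The repeated and patterned strings should shrink to a tiny fraction of their length, while the scrambled one stays at several hundred bytes.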

Information - refers to Shannon information, also known as the "surprisal" of a system. So, for example, "EEEEEEEEEEEEEEEEEE" would be fairly low-information because E is a common letter - it doesn't surprise us to see it. "I LIKE FISH" would be higher, as not all of its components occur with such frequency. "XXXXXXXXXXXXXXXXXXXXXX" would be very unexpected (except in the context of really strong beer) so gets a high "surprisal" value.
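
As a rough sketch of how surprisal gets calculated: each symbol contributes -log2(p) bits, where p is its probability. The letter frequencies below are approximate figures for English text that I've filled in for illustration, and the code treats letters as independent of each other, which real language certainly isn't.

```python
from math import log2

# Approximate English letter frequencies (illustrative values only).
APPROX_FREQ = {
    "E": 0.127, "I": 0.070, "S": 0.063, "H": 0.061,
    "L": 0.040, "F": 0.022, "K": 0.0077, "X": 0.0015,
}


def surprisal_bits(text):
    """Total surprisal in bits, skipping characters not in the table."""
    return sum(-log2(APPROX_FREQ[ch]) for ch in text.upper() if ch in APPROX_FREQ)


def bits_per_letter(text):
    """Average surprisal per counted letter, to compare unequal lengths."""
    counted = [ch for ch in text.upper() if ch in APPROX_FREQ]
    return surprisal_bits(text) / len(counted)


for sample in ["EEEEEEEEEEEEEEEEEE", "I LIKE FISH", "XXXXXXXXXXXXXXXXXXXXXX"]:
    print(sample, round(bits_per_letter(sample), 2))
```

Comparing bits per letter rather than totals stops the longer strings from winning just by being longer.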

Target space - refers to a set of states that we'd like the system to end up in. So, for example, the target space of a system composed of lots of bits of wood might be a bookshelf.

Search space - refers to all the states that a system could end up in. So, for example, the search space of a system composed of bits of wood could include both bookshelves and mere piles of planks.

Specification - a simple delineation of the target space. For example, the specification "bookshelf".

Genetic algorithm - a program that attempts to imitate evolution in a model system.

Fitness function - something that allows a GA to tell which of a group of organisms is the most "fit". In the real world, the primary attributes of the fitness function are ability to survive (natural selection) and ability to attract mates (sexual selection).

Local optimum - an area of the search space where there are no small changes that can increase the corresponding organisms' fitness. If you think of fitness as corresponding to height on a graph, the local optima are the peaks of the resulting mountain range.
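
The mountain-range picture can be sketched in one dimension. Here "small change" is assumed to mean stepping one position left or right, and the landscape values are made up for the example:

```python
def local_optima(heights):
    """Indices where neither neighbouring step leads to a higher value."""
    peaks = []
    for i, h in enumerate(heights):
        left = heights[i - 1] if i > 0 else float("-inf")
        right = heights[i + 1] if i < len(heights) - 1 else float("-inf")
        if h >= left and h >= right:
            peaks.append(i)
    return peaks


landscape = [1, 3, 2, 5, 4, 2, 6, 0]  # hypothetical fitness values
print(local_optima(landscape))         # [1, 3, 6]
```

Only index 6 is the global optimum, but a climber that only ever takes uphill steps can get stuck on either of the other two peaks.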

Friday, September 08, 2006

Overdoing it

OK, so recently I've been looking for a job (God hel^W^W Allah sav^W^W somebody stop me!). This has been scary - it'll be my first job since graduating.

One company I've applied to is an actuarial-type firm, although they seem a bit less strait-laced than most. In particular, they obviously like to feel they're giving their applicants a challenge, an attitude that resulted in me getting this request in my inbox:

In order to progress to the next stage we need to be impressed. In ANY medium of your choice, using no more than 100 words, please convince us why you should be selected for the {company name} Graduate scheme? (Please note that no further guidance will be provided).


This is not a request you make of Lifewish if you value your sanity. I am the master of overkill.

First, the 100 words. Any medium, hmm? How about Haiku? Ten stanzas of haiku, to be precise, detailing all the excellent qualities I have to offer.

Of course, it'll be a few minutes before they figure out it's in Haiku. I locked it with a challenge. The job application process should be as much about the company as the applicant.

And for the challenge? Well, what could be a better continuation of the Japanese theme than a Sudoku problem? Which I generated using code I wrote from scratch. It's got the company initials picked out in numbers in it.

Of course, I couldn't have people peeking at the answer, so I encrypted my 100 words with a Vigenère-style cipher, using the Sudoku board as its key. That'll teach 'em to ask for experience of VBA and Excel.
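
For the curious, the general idea looks something like the sketch below. To be clear, this is not the VBA I actually sent: it's a Python illustration, the row of digits is a made-up stand-in for the real board, and treating each digit as a letter shift is just one plausible way of doing a Vigenère-style cipher with a numeric key.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase


def shift_text(text, key_digits, direction=1):
    """Shift each letter by the next key digit; other characters pass through."""
    key = cycle(key_digits)
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            shift = direction * next(key)
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


board_row = [5, 3, 4, 6, 7, 8, 9, 1, 2]  # hypothetical solved Sudoku row

secret = shift_text("HIRE ME", board_row)
plain = shift_text(secret, board_row, direction=-1)
print(secret)  # MLVK TM
print(plain)   # HIRE ME
```

Until the Sudoku is solved, the key digits aren't all known, so the message stays locked.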

Some people claim I overreact to challenges. I don't see it, myself.