A while back, I mentioned that I'd hopefully be doing some trials with a friend who claims the existence of psychic powers etc. Yesterday we finally got round to doing some preliminaries.
The tests will be a lot less complex than I'd expected because, rather than testing for psychic communication between two believers (my friend and a third party), we're going to test his ability to channel energy through a pendulum into my hand. It'll be a straightforward "which hand am I holding the pendulum over" thing.
Protocol 1 (non-rigorous tomfoolery)
Our initial experiments were not terribly promising. I thought I could feel something the first time we did it, and guessed correctly. So far so good. However, I suspected that the "something" I could feel was in fact heat off my friend's hand, so for the second and third runs I covered each of my hands with a sheet of paper. I got both wrong.
Protocol 2 (slightly more rigorous)
At this point, my friend commented that, when he'd been mucking about with a fellow believer, the first run in any given sequence had generally been the most successful. He speculated that, after that point, there was some kind of psychic residue contaminating the experiment that took a while to wear off.
To eliminate this factor, we arranged a new protocol: every time we see each other (about once a week), we'll repeat the test once. For the moment, the only specific precaution against bias will be closed eyes and paper-covered hands. After five runs, we'll check the tally of results to see whether there's any statistically significant effect. If there is, we'll up the rigour. If not, we'll investigate other test options.
PS. To my friend's credit, he didn't use the "psychic residue" as an excuse for failure. In fact I had to persuade him not to include the initial negative results in the final tally.
What is a statistically significant effect?
The basic approach used for statistical testing is "significance levels". If something is "significant at the 5% level", that means that, if there were no real effect at all, the chances of getting a result at least that impressive by accident (a false positive) would be no more than 5%.
If there is no psychic effect then, over five trials, the probabilities of success are as follows:
P(5 correct guesses) = 1/32 = 3.1%
P(4 correct) = 5/32 = 15.6%
P(3 correct) = 10/32 = 31.2%
P(2 correct) = 10/32 = 31.2%
P(1 correct) = 5/32 = 15.6%
P(0 correct) = 1/32 = 3.1%
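For anyone who wants to check these figures, here is a minimal Python sketch. It assumes each guess is an independent 50/50 coin flip, which is what "no psychic effect" means here. The function name guess_distribution is just something I made up for illustration.

```python
from math import comb

def guess_distribution(trials=5, p=0.5):
    """Return {k: P(exactly k correct)} if every guess is pure chance.

    Assumes independent guesses, each correct with probability p.
    """
    return {
        k: comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(trials, -1, -1)
    }

if __name__ == "__main__":
    # Print the table in the same order as above (5 correct down to 0).
    for k, prob in guess_distribution().items():
        print(f"P({k} correct) = {comb(5, k)}/32 = {prob:.1%}")
```

The comb(trials, k) term counts how many of the 32 equally likely right/wrong sequences contain exactly k correct guesses.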
If we wanted to do a significance test at the 20% level, we would say that the result was significant if 4 or 5 successes appeared (since P(5)+P(4) < 20% < P(5)+P(4)+P(3), i.e. 18.8% < 20% < 50%). This is a pretty damn easy hurdle to pass, so if we don't get 4 successes then there's probably not much point carrying on with this protocol.
If we wanted to do a significance test at the 5% level, we would say that the result was significant only if all 5 successes appeared (since P(5) < 5% < P(5)+P(4), i.e. 3.1% < 5% < 18.8%). This is a slightly tougher hurdle - if we pass it (i.e. if we have a 100% success rate) then it'll be worth applying stronger controls.
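The same reasoning can be sketched in Python: add up the chance of getting k or more correct by luck, and find the smallest k for which that total drops to the chosen level or below. This again assumes independent 50/50 guesses. The names tail_probability and min_successes_for_significance are my own inventions for illustration, and I've used "less than or equal to" for the comparison, which makes no difference for the levels discussed here.

```python
from math import comb

def tail_probability(k_min, trials=5, p=0.5):
    """P(at least k_min correct) if every guess is pure chance."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(k_min, trials + 1)
    )

def min_successes_for_significance(alpha, trials=5, p=0.5):
    """Smallest number of correct guesses whose tail probability is <= alpha.

    Returns None if even a perfect score is not significant at that level.
    """
    for k_min in range(trials + 1):
        if tail_probability(k_min, trials, p) <= alpha:
            return k_min
    return None

if __name__ == "__main__":
    for alpha in (0.20, 0.05, 0.01):
        print(alpha, min_successes_for_significance(alpha))
```

One thing this makes obvious: with only five runs, no result can be significant at the 1% level, because even a perfect score happens by chance 1 time in 32.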
Protocol 2 scoreboard
Date: 1 June 08
Correct guesses: 0
Incorrect guesses: 0