I’m going to betray the title of this post right away and say that, to me, these results aren’t surprising at all. However, if you read some of the material put out by the “experts”, stuff like this isn’t supposed to ever happen.
The reality is that I’ve tested thousands of actions over millions of visitors and the only constant is that sometimes really strange things happen when you split test. I suppose in a perfect world we could look at the effect of everything from what’s on TV to the tidal forces of the moon when analyzing data…but for now, all we get is what our split testing software tells us.
One thing is for sure: most people don’t let their tests run long enough to come to any kind of meaningful conclusion.
I recently went through a product on split-testing in which the author declared one version a “decided winner” after it completed only 7 actions, because the next closest result was only 5. Wow. Really? (Sadly, this product cost hundreds of dollars. This is what the “pros” are selling to unknowing beginners.)
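To put a number on how little 7 versus 5 tells you, here is a minimal sketch of an exact binomial check. It assumes both versions received the same traffic (the product didn't say), so that with no real difference each action is equally likely to land on either version. The function name is made up for illustration.

```python
from math import comb

def two_sided_binomial_p(wins_a: int, wins_b: int) -> float:
    """Exact two-sided p-value for how actions split between two pages.

    Assumption: both pages got equal traffic, so under "no real
    difference" each action is a 50/50 coin flip between them.
    """
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Chance that the leader collects k or more of the n actions by luck alone.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2 * upper_tail)

p = two_sided_binomial_p(7, 5)
print(f"7 vs 5 actions: p = {p:.2f}")
```

Under that assumption the p-value comes out around 0.77, meaning a split at least this lopsided shows up by pure chance roughly three times out of four.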
In the example I’m about to show you it will become clear that even after 100 actions things can change.
I recently ran a new split-test campaign; below are some “snapshots” taken throughout its run. The columns represent three different landing pages, left to right. The numbers represent sales, or “actions”.
A B C
03 06 07
20 27 29
At this point nothing exceptional has happened. Page C is in the lead, but not by a huge margin. Compared to A, though, C is looking strong. Carrying on…
A B C
50 62 67
I particularly like the above numbers. They show a clear leap by page C, especially over A. 67 compared to 50 looks decisive and would make me feel safe choosing C as the winner, even though a 17-action gap on counts this small is not yet statistically conclusive. At this point 90% of people would remove A and either start a new test between B and C or declare C the winner and move on to another test. This would not be bad or wrong…but let’s see what happens.
A B C
84 89 95
100 105 110
124 128 134
What do we have here? All the pages are keeping their respective positions, but page A is making a comeback. Those last numbers still favor page C, but nowhere near as strongly as before. The final numbers are even more surprising…
A B C
150 152 154
162 162 162
Wow! What the hell happened?
What happened was that after about 10,000 views each, the pages are converting at EXACTLY the same rate: 1.62%.
I see this kind of thing all the time, but the “experts” are calling tests done at only 7 actions. What does this mean? Well, it means that statistically there probably isn’t much difference between any of these pages. Strange, huh? If we run tests long enough it can be interesting to see the gaps close.
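For anyone who wants to check the snapshots themselves, here is a rough sketch that scores the gap between pages A and C at each point in the test. It assumes all pages had received equal traffic at every snapshot (I only gave the view count for the end of the test) and treats actions as rare, Poisson-like events, which lets the gap be judged from the counts alone. The helper names are made up for this example.

```python
from math import erfc, sqrt

# (A, B, C) action counts copied from the snapshots in this post.
snapshots = [
    (3, 6, 7), (20, 27, 29), (50, 62, 67), (84, 89, 95),
    (100, 105, 110), (124, 128, 134), (150, 152, 154), (162, 162, 162),
]

def count_gap_z(low: int, high: int) -> float:
    """Approximate z-score for the gap between two action counts.

    Assumptions: equal traffic to both pages and rare conversions, so
    each count behaves like a Poisson variable and the variance of the
    difference is roughly the sum of the two counts.
    """
    return (high - low) / sqrt(high + low)

def two_sided_p(z: float) -> float:
    """Two-sided p-value for a z-score under the normal approximation."""
    return erfc(abs(z) / sqrt(2))

for a, _b, c in snapshots:
    z = count_gap_z(a, c)
    print(f"A={a:3d}  C={c:3d}  z={z:.2f}  p={two_sided_p(z):.2f}")
```

Under those assumptions the biggest gap is the 50 versus 67 snapshot, with a z-score of about 1.57. That is still short of the 1.96 usually asked for at 95% confidence, so by this rough measure C never had a statistically solid lead at any point.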
Based on the above, if I had to choose, I would still pick page C over the others. It enjoyed a steady lead over the other pages throughout the test even though it was caught in the end. I would bet that if I ran the test longer, C might regain its lead. Then again, I’ve seen results like this where page A would take off and become the leader. In short, this test really just shows how strange results can be.
If in doubt, test more, but also test longer. Don’t be anxious to close a test quickly just because it’s giving you a result you like. True split-testing requires a scientific approach.
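To get a feel for how long “longer” can be, here is a back-of-the-envelope sketch using the standard sample-size formula for comparing two conversion rates. The 1.62% baseline comes from this test; the 20% relative lift, the 95% confidence level and the 80% power are assumptions picked purely for illustration, and the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def views_per_page(base_rate: float, relative_lift: float,
                   alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate views needed per page to detect a given relative lift.

    Uses the normal-approximation formula for a two-sided comparison
    of two proportions with equal traffic to each page.
    """
    p1 = base_rate
    p2 = base_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical scenario: 1.62% baseline and a page that really is 20% better.
print(views_per_page(0.0162, 0.20))
```

With those assumed inputs the answer lands at roughly 26,000 views per page, well beyond the 10,000 or so each page had when the numbers tied. Smaller real differences need far more traffic than that.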
What do you think?
Update: Page A is now the winner with 176 actions. Pages B and C sit at 173 and 174 respectively.
Update 2: This is my last update as I will now be moving on to testing other things. As I mentioned, you can never be too sure…final numbers below:
A B C
188 186 184




July 12, 2009
Marketing, PPC