Why Split Testing is Killing Your Conversion Rate

If your conversion rate is flat-lining and you’ve tried virtually everything to stop its downward spiral – split testing could be to blame.

Yes, split testing – that incredible flip-switching, conversion-boosting, audience-wooing strategy that boasts major lifts – but seldom delivers on its promises – could be the very issue that’s causing even the best-planned campaign to fall on its face.

Now, before you grab your digital torches and pitchforks, I want to clarify right from the start that the idea of split testing itself – discovering your customers’ preferences, pushing their psychological hot-buttons and giving them the best possible experience from start to finish – is a solid and sound strategy. One that every business should wholeheartedly embrace.

But unfortunately, many split testing services only give you one piece of the puzzle. Let’s dig a little deeper to see what’s causing the problem and how you can fix it – starting today.

The Big Split Testing Myth

The premise is simple: create two (or more) versions of a page. Divide traffic equally between them and gauge the results. But here’s where things get real and the flaws in the system are exposed.

You’re supposed to automagically know what to do with those results, when in fact the information you’re collecting comes in the form of a piecemeal package that tries to convince you that one option is better than another.

As an example, SumAll ran an Optimizely test with two identical versions of a page. Optimizely reported that the first version was beating the second by nearly 20%.

[Screenshot: Optimizely test results]

(Thankfully, Optimizely has since reworked their statistics system to prevent gaping flaws like this one from muddying the results).

But this illustrates a major issue in split testing – which is that you can predict large, grandiose upticks in performance all you want – but if those don’t translate into actionable steps that have a direct effect on revenues, they might as well be useless.

The Sneaky, Statistical Secret “They” Don’t Want You to Know About…

At the risk of sounding too much like the conversion optimization version of Kevin Trudeau, most split testing and optimization companies aren’t out to vacuum your bank accounts and leave you with nothing to show for it. It’s not a giant, covert operation designed to feed you “feel good numbers”.

But there is a wrench in the works when it comes to the kind of split testing done by these various platforms. Some, like Visual Website Optimizer and Optimizely, do what are called “one-tailed” tests, while others, like Adobe Test&Target and Monetate, do “two-tailed” tests.

So what’s the difference between them, and why should you care?

SumAll brought this to light in the piece mentioned above, and the implications can have a considerable effect on how you split test. Here’s the idea in a nutshell.

A one-tailed test only looks for a change in one direction, usually whether the variant beats the original. It’s like only counting your wins at the blackjack table. Because all of its statistical weight sits on that one outcome, it reaches significance with less data. Executives who don’t understand (or don’t care about) the finer points of split testing love one-tailed tests, because they seem to always throw out big wins. It sounds great for the site and the customers, not to mention job security – but there’s a big issue: a one-tailed test can’t tell you when the variant is actually performing significantly worse.

And that’s where a two-tailed test comes in.

[Figure: one-tailed vs. two-tailed test]

A two-tailed test checks for a significant difference in both directions, so it can tell you when the variant is worse as well as when it’s better. You need more data this way to reach a conclusion, but the conclusion covers outcomes that a one-tailed test doesn’t look for.

But that doesn’t always mean that a two-tailed test is the right answer. There will be times when all you want is evidence that a specific change had a positive effect. And one-tailed tests require far fewer resources to conduct. It’s just important to know what you’re measuring and how much of it is really having a direct impact on your bottom line and your conversion rate – and how much of it is just made to impress.
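To make the difference concrete, here’s a minimal sketch of a standard two-proportion z-test that reports both the one-tailed and the two-tailed p-value for the same data. The visitor and conversion counts are made up for illustration, and this is a textbook calculation, not the method any particular platform uses.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, one_tailed_p, two_tailed_p) for 'B converts better than A'."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the assumption that A and B are identical.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    norm = NormalDist()
    one_tailed_p = 1 - norm.cdf(z)             # only asks: is B better?
    two_tailed_p = 2 * (1 - norm.cdf(abs(z)))  # asks: is B different at all?
    return z, one_tailed_p, two_tailed_p


# Hypothetical test: 100 of 1,000 visitors convert on A, 125 of 1,000 on B.
z, one_tailed_p, two_tailed_p = two_proportion_z_test(100, 1000, 125, 1000)
print(f"z = {z:.2f}, one-tailed p = {one_tailed_p:.3f}, two-tailed p = {two_tailed_p:.3f}")
```

With these assumed numbers, the one-tailed p-value falls below the usual 0.05 cutoff while the two-tailed p-value does not, so the same data is a “winner” under one approach and inconclusive under the other.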

The Mountain of Statistical Significance

I’ve written in the past about statistical significance, particularly why it matters and how it affects your tests.

Many split testing tools tend to nudge you toward calling a winner before enough information is collected. Even if the test picks a winner with 90% statistical significance and you implement the change(s) on your website, of course your audience is going to notice it, especially returning visitors, simply because something’s different. That novelty may lift your conversion rate, but only temporarily.

And what if you don’t have enough traffic to reach statistical significance? Rather than tell you the honest truth that you need more visitors to really see a measurable result, most services will gloss over the problem and tell you that you can get by with “just 100” users or some other abysmally low number.
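To see why “just 100” users rarely cuts it, here’s a rough sketch of a common sample-size estimate for comparing two conversion rates. The 5% baseline, the 6% target, the 5% significance level and the 80% power are all assumptions picked for illustration, not figures from any platform.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline, target, alpha=0.05, power=0.80):
    """Approximate visitors needed in EACH variant for a two-tailed test."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)  # cutoff for the significance level
    z_power = norm.inv_cdf(power)          # cutoff for the desired power
    variance = baseline * (1 - baseline) + target * (1 - target)
    return ceil((z_alpha + z_power) ** 2 * variance / (target - baseline) ** 2)


# Hypothetical goal: detect a lift from a 5% to a 6% conversion rate.
needed = visitors_per_variant(0.05, 0.06)
print(f"Visitors needed per variant: {needed}")
```

Under these assumptions the estimate comes out at a little over 8,000 visitors per variant, which is a long way from 100. Smaller expected lifts push the number up even further.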

It’s not fair to you, and it’s doing a huge disservice to everyone involved.

So What Can You Do?

This piece isn’t meant as a giant swipe at conversion optimization platforms. But it is meant to inform you as to what you’re really getting. Thankfully, there are some steps you can take to make your split tests really work for you and get you the kind of results that truly and fully impact your conversion rate – consistently.

So when it comes to split testing, how can you know if your tests are really giving you deeper, more substantial results rather than simply scratching the surface?

You can still use your favorite tool to create a test, but to truly see how well it performed, use the Kissmetrics dashboard to create an A/B test report. Kissmetrics integrates with a wide variety of split testing platforms. Then, simply create a new report and select the experiment and the conversion event.

[Screenshot: loading an A/B test report]

Let’s assume we want to see which free trial offer brought in more paying subscribers. We’ve let our test run, and now it’s time to load the results:

[Screenshot: the A/B test report results]

Right now, there’s not much you can see that’s different from what the split testing platform itself gives you. You see common things like how long the test ran, how many visitors there were, and how many conversions each version received. From this example, we can clearly see that the variant beat the original.

Time to switch everything over to the variant, right?

Wait – we’re not quite done yet.

This is where most split test results pages end – but we’re just getting started.

What we really wanted to see was how many people became paying subscribers as a result. To check this, we switch the conversion event from “Signed Up” to “Subscription Billed – New Customer”.

[Animation: switching the conversion event in the A/B test report]

Click “Run Report” and we’ll see that, surprisingly, the original beat out the variant for the actual number of subscriptions sold to new customers. So while the variant beat out the original on free trial sign-ups – the original did a better job at converting those sign-ups into paying customers, which is the kind of conversion event we really want in the end.
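The same idea can be sketched outside of any reporting tool. The snippet below is only an illustration: the counts and the event names (`signed_up`, `billed`) are hypothetical and are not taken from the report above.

```python
# Hypothetical per-version totals for two conversion events.
results = {
    "original": {"visitors": 5000, "signed_up": 400, "billed": 60},
    "variant": {"visitors": 5000, "signed_up": 500, "billed": 50},
}


def winner(results, event):
    """Return the version with the highest conversion rate for one event."""
    return max(results, key=lambda name: results[name][event] / results[name]["visitors"])


for event in ("signed_up", "billed"):
    print(f"Best version for '{event}': {winner(results, event)}")
```

With these made-up numbers the variant wins on sign-ups and the original wins on billed subscriptions, which is why the event you choose to measure matters so much.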

The A/B Test Report is a great way to know which one of your tests really converted in the end. And you can do so much more with it, including testing for outcomes that are outside your sales funnel – like this one.

See a quick demo of how the A/B Test Report works by watching the short video below.

Segment First, Then Test

Rather than conducting a blanket test with all users, segment them. Visitors versus customers. Mobile customers versus desktop customers. Visitors who subscribed via email versus non-subscribers. The list is endless. This way, you’re making a true apples-to-apples comparison and getting more legitimate results.
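If you can export raw visit data, a per-segment breakdown takes only a few lines. This is a minimal sketch that assumes each visit is recorded as a (segment, variant, converted) tuple. That layout and the sample rows are assumptions for illustration, not the export format of any tool.

```python
from collections import defaultdict


def rates_by_segment(visits):
    """Return {(segment, variant): conversion rate} for a list of visits."""
    totals = defaultdict(lambda: [0, 0])  # key -> [conversions, visitors]
    for segment, variant, converted in visits:
        totals[(segment, variant)][0] += int(converted)
        totals[(segment, variant)][1] += 1
    return {key: conversions / count for key, (conversions, count) in totals.items()}


# Hypothetical visit log.
visits = [
    ("mobile", "A", True),
    ("mobile", "A", False),
    ("mobile", "B", False),
    ("desktop", "A", False),
    ("desktop", "B", True),
]
for (segment, variant), rate in sorted(rates_by_segment(visits).items()):
    print(f"{segment:8} {variant}: {rate:.0%}")
```

Keep in mind that every segment needs enough traffic of its own before its numbers mean anything.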

Run The Test Again

It might seem counter-intuitive, but a lot of these so-called lifts can be an illusion. Running the test again, or running it for a longer period of time, can help you get a better focus on how – and even if – the results are significant.

In his guide “Most Winning A/B Tests Are Illusory”, Martin Goodson, research lead at Qubit, makes the case for statistical power while eliminating false-positives and other points that can muddy the waters of your split test.

It’s a short, but nevertheless enlightening read, particularly if you don’t come from a statistics background (like me!).
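One way to see how illusory wins appear is to simulate tests where both versions are identical, so every declared “winner” is a false positive. The sketch below is a simplified simulation under assumed settings (a 10% conversion rate, 2,000 visitors per version, a check every 200 visitors). It is not a reproduction of Goodson’s analysis.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-tailed cutoff at 5% significance


def is_significant(conv_a, conv_b, n):
    """Two-proportion z-test with n visitors in each version."""
    pooled = (conv_a + conv_b) / (2 * n)
    if pooled in (0, 1):
        return False
    se = sqrt(pooled * (1 - pooled) * 2 / n)
    return abs(conv_b - conv_a) / n / se > Z_CRIT


def simulate_aa_tests(runs=300, visitors=2000, check_every=200, rate=0.10, seed=1):
    """Return (share flagged at any look, share flagged at the final look)."""
    rng = random.Random(seed)
    peeking_hits = final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        flagged_at_any_look = False
        for i in range(1, visitors + 1):
            # Both versions share the same true conversion rate.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if i % check_every == 0 and is_significant(conv_a, conv_b, i):
                flagged_at_any_look = True
        peeking_hits += flagged_at_any_look
        final_hits += is_significant(conv_a, conv_b, visitors)
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"False winners when stopping at any look: {peeking_rate:.1%}")
print(f"False winners when waiting until the end: {final_rate:.1%}")
```

Because the final check is one of the looks, stopping at the first significant result can never produce fewer false winners than waiting for the planned end of the test, and it usually produces noticeably more.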

Overcome Assumptions

A lot of A/B testing is grounded in a hypothesis. If we change X, will it have an effect on Y? But in the process, there are lots of assumptions being made. For example, you might assume that your visitors all share the same background – all young moms or avid fishermen or gym-goers. If that assumption is wrong, you might see huge lifts that come from the wrong target audience.

None of these points should make you want to give up on split testing entirely. Instead, they should help you realize what you’re getting and where to dig deeper, so that you don’t just get big numbers – you get big numbers that translate into definitive, long-term increases across the board.

What are your thoughts on today’s split testing platforms? What about the tests you’ve conducted? Share your thoughts with us in the comments below.

About the Author: Sherice Jacob helps business owners improve website design and increase conversion rates through compelling copywriting, user-friendly design and smart analytics analysis. Learn more at iElectrify.com and download your free web copy tune-up and conversion checklist today!