In this article I'll briefly explain A/B testing and how to start running small experiments: what to test, how long to let a test run so that you get reliable results, and how to measure the results and move forward.
- Start with an assumption that you wish to test
- Decide on how you want to measure success
- Set up a tool to implement and track the results
- Make sure you run your tests for at least 14 days to obtain reliable results
What is an A/B test?
An A/B test is an experiment that compares two versions of a single page element, for a specific period of time, in order to see which is more effective based on a given assumption.
When your product is new and you have a limited number of users, testing can help you craft copy that resonates, helping to increase your conversion rate at a time when good conversion is vital to the early growth of the product.
As your product becomes more established the same experiments can help you to optimise in order to appeal to new customers and demographics, or convert those who may be dropping out mid-way through your sales funnel.
Decide what you want to test
Start with an assumption you want to test. It helps if your assumption targets a discrete element on the page. A common example of this is messaging on a call to action, or the wording of a key promotional banner. You can hypothesise that changes to these elements could make the difference in converting a user into a customer.
However, it is best to avoid testing more general ideas. For example, it would be more difficult to see whether changing the colour of that same call to action increased conversion rate. This article sums it up quite nicely: “If you spend your time testing small things then you miss out on testing the things that consistently bring big increases in conversions”. Testing a more general idea could lead to a result that correlates with a change but is not caused by it - we want to avoid misleading results as much as possible.
Furthermore, we also want to prove causation. In order to prove that the change has caused the result you should test only one element per experiment. Do not change the messaging and the location of the element in the same test. If you change more than one variable you will be unsure whether the results reveal that the new location catches the user's eye or that the user now relates to the new messaging.
By way of an example, at Forward Partners we added an animation to the home page of The Path Forward website that we hoped would delight users. The animation triggered on page load and gently faded the article cards into view. However, we noticed that if the user did not scroll, the cards would not load and the page could appear empty. Therefore we crafted the assumption that if we removed the animation and had the cards render immediately, engagement would improve and users would stay on the site for longer. We set up an experiment to send 50 per cent of the traffic to the original home page with animated cards (the control), and 50 per cent to the home page with the animation removed (the variant), and tracked the number of pageviews for each.
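Under the hood, a 50/50 split like this is usually done by deterministically bucketing each visitor, so the same person always sees the same version. Here is a minimal sketch in Python (the function and experiment names are illustrative, not taken from any particular tool):

```python
import hashlib

def assign_variant(user_id: str, experiment: str = "home-card-animation") -> str:
    """Deterministically bucket a user into 'control' or 'variant'.

    Hashing the user ID together with the experiment name keeps each
    user's assignment stable across visits while splitting traffic
    roughly 50/50 across all users.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # uniform-ish value in 0..99
    return "control" if bucket < 50 else "variant"
```

Because the assignment is derived from the user ID rather than drawn at random on every visit, a returning visitor is never flipped between versions mid-experiment, which would otherwise contaminate the results.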
(In the image above - the results of an earlier experiment on an assumption that proved to be correct, with a 96% probability of beating the control.)
Set up your experiment
If you wish to try out A/B testing and already use Google Analytics (or Segment) I would recommend starting out by running simple experiments using Google Optimize. Once you've included the snippet in the page you can log into the Google Optimize dashboard, and it will walk you through setting up an experiment. It’s quite a simple procedure, but if you get stuck you can usually find the answer to your question in the Support Forum.
(Google Optimize running the Forward Partners experiment I outlined earlier.)
Measure the results
In order to get good results you need to wait until your test has reached statistical significance. In simple terms, reaching statistical significance means that we are “very sure” the results are reliable. It means we have tested our assumption on enough users to say with greater than 95% certainty (an accepted standard for statistical significance) that the variant is better or worse than the original.
Depending on the traffic that your site receives you may need to wait several weeks for the confidence level of your test to reach 95%. Products with more established customer bases may need less time; however, it is still a good idea to let your test run for two weeks, regardless of the level of traffic it receives. Running an experiment for only a few days can produce misleading results: low traffic or unforeseen anomalies can pollute your data and lead to unreliable results.
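If you're curious what tools like Google Optimize are doing behind the dashboard, the classic approach is a two-proportion z-test. This is a generic textbook sketch, not Google Optimize's actual method (which uses Bayesian inference):

```python
from math import sqrt, erf

def significance(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-proportion z-test: is the variant's conversion rate
    significantly different from the control's?

    conv_*: number of conversions, n_*: number of visitors.
    Returns (z, p); a two-sided p-value below 0.05 corresponds to
    the 95% confidence threshold mentioned in the article.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # combined conversion rate
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf; p-value is the two-tailed area beyond |z|.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value
```

For example, 200 conversions from 1,000 control visitors against 260 from 1,000 variant visitors clears the 95% bar, while 200 against 205 does not - which is exactly why small differences need large samples or long run times.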
(Google Optimize will help you by suggesting the period the test should run for when you commence the experiment.)
If you want to learn more about statistical significance and the number of users you'll need to test before you reach the right confidence level Optimizely has a brilliant sample size calculator on their website.
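The standard sample-size formula behind calculators like Optimizely's is straightforward to sketch yourself. The version below uses the common two-proportion approximation at 95% confidence and 80% power; it is a ballpark estimate under those assumptions, not a reproduction of Optimizely's exact method:

```python
from math import ceil

def sample_size_per_variant(baseline: float, mde: float,
                            z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate visitors needed per variant for a two-proportion test.

    baseline: current conversion rate (e.g. 0.05 for 5%).
    mde: minimum detectable effect, relative (e.g. 0.20 for a 20% lift).
    z_alpha=1.96 gives 95% confidence; z_beta=0.84 gives 80% power.
    """
    p1 = baseline
    p2 = baseline * (1 + mde)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return ceil(n)
```

With a 5% baseline conversion rate and a 20% relative lift to detect, this works out to roughly 8,000 visitors per variant - a useful sanity check on why low-traffic sites need their tests to run for weeks.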
Many well-established companies run multiple A/B tests concurrently. They make use of their heavy traffic to experiment quickly and improve their design or further optimise their conversion rates. However, A/B testing is not simply for established products. Startups with new products and a small user base can benefit from small-scale A/B tests. Split testing can provide you with early insights into what resonates better with your users, help you convert potential customers into actual customers, and quickly identify any areas where you may be losing people. So simply start with an assumption, and test something small.