Glossary
3.1. Data Gathering & Analysis
3.2. Defining the hypothesis
3.3. Prioritizing hypotheses
3.4. Creating the variant
3.5. Running the A/B test
3.6. Measuring the results
3.7. Implementing the variant
4. CASE STUDY: Where to start A/B testing
FAQ
A/B testing: The scientific approach to comparing two or more page versions to identify the best-performing one.
Sample size: The number of users required for an A/B test to reach statistical significance.
Hypothesis: An assumption that changes to a page will improve its performance.
Statistical significance: Statistical proof that A/B test results are not due to random chance.
Control: The existing version of your website's page that you want to improve.
Heatmap: A visual representation of user interactions with the website.
Variant: An improved version of the website's page, based on the hypothesis.
Expert review: A review of the website by CRO analysts to identify problems users might face.
Over the last decade, successful eCommerce businesses have been perfecting their websites using lean process optimization as part of their company strategy. For many of them, A/B testing has proven to be a trusted go-to technique.
• …% of companies perform A/B tests on their websites
• …% of companies run two or more A/B tests per month
• …% of companies believe A/B testing is highly valuable for conversion rate optimization
In order to maintain a competitive edge, most online stores and apps constantly find themselves looking for ways to improve their customer experience. At the end of the day, this is essential for enabling conversions and, ultimately, increasing sales.
In this guide, we dive into the benefits of website optimization using A/B testing, and go over the main steps necessary to set up an effective testing procedure. Gather insights on A/B/n testing to make sure you stay ahead of the competition.
"Our success at Amazon is a function of how many experiments we do per year, per month, per week, per day…"
Jeff Bezos, CEO at Amazon
"If it can be a test, test it. If we can’t test it, we probably don’t do it."
Stuart Frisby, Booking.com
A/B testing is a scientific approach to comparing two or more page versions in order to identify the best-performing one.
The variant includes any changes to the page you need to test – for example, copy changes or functionality alterations. In the best-case scenario, you should be able to attribute a KPI increase or decrease to a specific change on the page. Therefore, it is recommended to test different versions of the same element: one headline option vs. the other, video vs. image, two locations of the same element on the page, etc.
The basic concept of A/B testing:
When the A/B test is set up, your website serves two versions of the same page simultaneously. Site visitors randomly see either the control or the variant on their device, and their interactions with both versions get recorded. Later, this data can be analyzed to identify which version performed better.
As soon as visitors open the page, they enter the experiment. Their browser cookies are registered in the system, along with information about which page version was shown to them. This ensures that throughout the experiment users always see the same page version they saw on their first visit to the website.
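As an illustration of how such sticky assignment can work under the hood, here is a minimal Python sketch; the cookie value, experiment ID, and 50/50 split are illustrative assumptions, and in practice your A/B testing tool handles this logic for you.

```python
import hashlib

VARIANTS = ["control", "variant"]

def assign_variant(visitor_id: str, experiment_id: str) -> str:
    """Hash the visitor + experiment IDs into a stable bucket.

    The same visitor always lands in the same bucket, which is why
    returning users keep seeing the page version they saw first.
    """
    digest = hashlib.md5(f"{experiment_id}:{visitor_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100  # a number in 0..99
    return VARIANTS[0] if bucket < 50 else VARIANTS[1]

# Usage: the visitor_id would normally come from a first-party cookie.
print(assign_variant("visitor-cookie-123", "checkout-upsell-test"))
```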
Businesses often find themselves asking the same question: “Why should I invest in A/B testing?” In fact, there are several benefits of introducing A/B testing to your optimization program:
The introduction of radical changes to your website can sometimes cause more harm than good. A/B testing helps ensure that changes causing a drastic KPI decrease will not be implemented.
Each new A/B test brings additional data about your website visitors. This helps you understand how users respond to certain page elements and makes it easier to come up with more robust, data-driven optimization hypotheses.
A common trend among businesses is to make decisions based on the highest-paid person’s opinion (the HiPPO effect). Even though on many occasions this may be justified, human beings are heavily influenced by irrational factors such as biases, previous experience, and intuition, which can affect their ability to make the best choice on the spot. With A/B testing, decisions are more likely to be based on science rather than gut feeling.
Company employees may grow reluctant to generate and share ideas if they don’t see them implemented over time. A/B testing allows you to safely test even the craziest ideas, and let the data determine which ones will be applied. Thus, everyone feels involved and heard.
Further on, we'll walk you through the most important steps that you need to take in order to run successful A/B tests.
Continual eCommerce business growth relies heavily on how well your processes are aligned. Testing multiple random ideas at once will hardly result in continuous improvement of your website and growth of your business. At the same time, a clear and well-considered optimization workflow can and often does bring a significant KPI increase.
Representation of A/B testing process:
The A/B testing process can be divided into 6 cyclical steps:
Step #7 is taken based on the results of each A/B test you perform. This means that the changes are only implemented if they are approved and proven to be effective.
In the following chapters, we'll dig deeper into each step.
Before proceeding with any optimization measures, you need to make sure that you have the data to guide your efforts. This includes understanding who your users are, how they interact with the website, and what the main optimization opportunities are (i.e., where you’re losing money).
You can achieve this through quantitative and qualitative data gathering and analysis.
Quantitative data allows you to identify various bottlenecks across your website. To get your hands on this information you need to first set up a digital analytics tool, such as Google Analytics, Adobe Analytics, Yandex Metrica, or any other tool that fits your needs. Next, you need to analyze the data collected in your current digital analytics tool.
In general, to achieve accurate results, we recommend performing analysis on the following data set:
Our personal favorite report to analyze is the shopping behavior report. It shows the shopping stage at which most of your visitors leave the website.
Shopping behavior report:
TIP: Analyze data using different segments, splitting your users into smaller groups to identify specific patterns, e.g., separate the data for each device type.
After completing the quantitative analysis and identifying the main areas of improvement, it is time to proceed with the qualitative data research.
Qualitative data helps you see what kind of problems visitors face on the website. This research will depend on resource availability. Some of the most efficient ways to gather this data are as follows:
The data that you have gathered and analyzed in the previous step should give you an idea of what is stopping your website visitors from completing a purchase. These assumptions should be turned into hypotheses that describe possible solutions. Hypotheses are commonly defined using the following formula:
By implementing change A, we expect metric X to increase/decrease.
For example, the hypothesis could be:
Due to the low cart-to-checkout rate, we assume that if users see a prominent notification that the product was added to the cart, they will proceed to complete the purchase, hence cart-to-checkout rate and conversion rate could increase.
The hypotheses you have come up with will be A/B tested at a later stage. Given that you have multiple hypotheses, it is important to decide which ones are more likely to bring higher ROI. These should be prioritized and tested during the first iteration.
Here comes the interesting part. Let’s assume you have come up with 10 hypotheses for your website optimization. In order to prioritize them and decide which ones to test first, you need to rank them 1 to 10.
Note: You need to verify that your hypothesis is valid for A/B testing, i.e., it can generate 300–400 conversions per variant, per segment of interest, during the experiment. If this threshold isn’t met, running the test might be ineffective, as there is a high chance of getting false-positive results.
While there are multiple prioritization methods available online, below we will focus on one particular approach that has proven to work reliably, including for our team.
The approach consists of 2 steps:
• Funnel-based prioritization
• Evidence-based prioritization
First, you need to determine which of the shopping funnel steps each hypothesis is related to. An eCommerce shopping funnel usually comprises the following 5 steps: Homepage, category page, product page, shopping cart, and checkout.
Hypotheses that are the closest to the end of the funnel are of the highest priority.
Funnel-based prioritization helps identify the test hypotheses with the highest potential ROI. The deeper your prospects are in the shopping behavior funnel, the more interested they are in a product, and the more likely they are to make a purchase.
For example, you might have a hypothesis about the Homepage (1st step in the shopping behavior funnel) – it'll have the fifth priority, whereas a hypothesis related to Checkout (the last step in the funnel) will have the first priority.
Funnel-based prioritization table example:
| Hypothesis | Funnel step | Priority |
|---|---|---|
| By making the Checkout form enclosed, i.e., isolated, more users will complete the purchase, hence conversion rate will increase. | Checkout - 5th step | 1 |
| By changing the CTA "Proceed to Checkout" into a more noticeable color, more prospects will proceed to Checkout, hence cart-to-checkout rate will increase. | Shopping cart - 4th step | 2 |
One shopping funnel step can have several related hypotheses. Thus, additional prioritization may be necessary to rank hypotheses within the same funnel step. This is where the so-called evidence-based prioritization comes in handy.
Evidence-based prioritization is inspired by the PXL methodology, and the idea is to determine and prioritize the hypotheses that have the most data to back them up. By contrast, hypotheses that are based mainly on intuition should be tested last.
Evidence-based prioritization consists of 7 questions:
The first 6 are yes/no questions, and the hypothesis receives 1 point for every ‘Yes’ and 0 points for every ‘No’. The 7th question is scored by estimated implementation hours: up to 4h – 3 points, up to 8h – 2 points, up to 16h – 1 point, more than 16h – 0 points. The hypothesis that receives the highest score should be tested first.
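To make the scoring concrete, here is a minimal Python sketch of how the two prioritization steps could be combined. The funnel weights, question answers, and hour estimates are illustrative assumptions, not the exact PXL-inspired questionnaire.

```python
FUNNEL_PRIORITY = {  # higher = deeper in the funnel = test sooner
    "homepage": 1, "category": 2, "product": 3, "cart": 4, "checkout": 5,
}

def effort_points(hours: float) -> int:
    """Convert estimated implementation hours into points (<=4h: 3 ... >16h: 0)."""
    if hours <= 4:
        return 3
    if hours <= 8:
        return 2
    if hours <= 16:
        return 1
    return 0

def evidence_score(yes_no_answers: list, hours: float) -> int:
    """Six yes/no questions (1 point per 'Yes') plus the effort question."""
    return sum(yes_no_answers) + effort_points(hours)

# (name, funnel step, answers to the six yes/no questions, estimated hours)
hypotheses = [
    ("Isolated checkout form", "checkout", [True, True, False, True, True, False], 12),
    ("Noticeable CTA color", "cart", [True, False, False, True, False, False], 3),
]

# Rank by funnel depth first, evidence score second.
ranked = sorted(
    hypotheses,
    key=lambda h: (FUNNEL_PRIORITY[h[1]], evidence_score(h[2], h[3])),
    reverse=True,
)
for name, step, answers, hours in ranked:
    print(name, step, evidence_score(answers, hours))
```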
Now that you’ve ranked and prioritized your hypotheses, you are ready to start preparations for the test. Test the hypotheses with the highest scores first.
As you are developing the variant, make sure the design changes are communicated clearly enough. All parties involved – stakeholders, developers, etc. – should be on the same page as to what the new design looks like.
When the design is ready and approved, you can proceed with implementing the test. In most cases, this can be done using one of the many ready-to-use A/B testing tools available on the market. Overall, the implementation of the test will depend, among other things, on its complexity, the resources available, and the tools you are using to run it.
The most popular tool according to BuiltWith stats is Google Optimize. Developed as Google’s own tool for website experimentation, Google Optimize offers two versions: free Google Optimize Basic, and paid Google Optimize 360.
The main differences between Google Optimize Basic and Optimize 360 are:
On top of basic A/B testing, both versions offer 2 more experiment types:
Redirect testing is a form of A/B testing where separate pages are tested against one another, with the variant defined by its URL or path. This type of experiment can be helpful when testing a complete page redesign instead of changing only several of its elements.
Multivariate testing (MVT) is another form of A/B testing, where two or more elements are tested simultaneously to see which combination of elements performs best. Thus, instead of finding out which page variant performs best, you are able to find the most beneficial combination of several page elements.
In general, it is fair to say that Google Optimize Basic covers the needs of most small to medium-sized businesses, while enterprise-level businesses will find Google Optimize 360 more suitable for their needs and budget.
MVT example:
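To complement the example above, here is a small Python sketch that enumerates the test cells produced by combining element variations; the element names and options are illustrative assumptions. It also shows why MVT needs considerably more traffic than a simple A/B test: every combination becomes its own cell.

```python
from itertools import product

# Hypothetical page elements and their variations.
elements = {
    "headline": ["Free shipping", "Fast delivery"],
    "hero": ["image", "video"],
    "cta": ["Buy now", "Add to cart"],
}

# Every combination of variations is a separate test cell: 2 * 2 * 2 = 8.
combinations = list(product(*elements.values()))
print(len(combinations), "cells to test")
for combo in combinations:
    print(dict(zip(elements.keys(), combo)))
```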
At this point, you should have identified both the type of test you are planning to run, as well as the testing tool you are going to use. The only two things left to do before you launch the test are calculating test duration and sample size.
A/B test duration is the timeframe recommended for running the test in order to collect the data about each version’s performance. The requirements it has to meet are as follows:
A/B test sample size is the number of participants required to make valid decisions about the results of the experiment.
Calculating both of these values is fairly straightforward and can be done using any of the free A/B test calculators. One of the better options is CXL’s sample size calculator, but there are multiple others available online.
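For those curious about the math behind such calculators, below is a minimal sketch of the standard two-proportion sample size formula; the baseline conversion rate and minimum detectable effect used in the example are illustrative assumptions.

```python
from math import ceil, sqrt
from scipy.stats import norm

def sample_size(p_base: float, mde: float, alpha: float = 0.05,
                power: float = 0.8) -> int:
    """Visitors needed per variant to detect a relative lift `mde`
    over a baseline conversion rate `p_base` (two-sided test)."""
    p_var = p_base * (1 + mde)
    p_bar = (p_base + p_var) / 2
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for 95% significance
    z_beta = norm.ppf(power)           # 0.84 for 80% power
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p_base * (1 - p_base)
                                 + p_var * (1 - p_var))) ** 2
    return ceil(numerator / (p_base - p_var) ** 2)

# e.g., a 3% baseline conversion rate and a 10% relative lift to detect
print(sample_size(0.03, 0.10))  # roughly 53,000 visitors per variant
```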
Calculating this will let you know exactly when you have enough data to end the test and start analyzing the results. Now, feel free to launch the test and wait for the results.
After the test has run for at least two business cycles and reached the required sample size, it is time to close the test and evaluate the results.
Most A/B testing tools have built-in reporting capabilities. However, these standard reports often lack the capacity for in-depth, segmented analysis. This makes it difficult to evaluate test results correctly and draw the right conclusions.
An important side note is that A/B test result evaluation is rooted in statistical analysis. While there are multiple tools that allow automating parts of the calculation, having an understanding of at least the basics of statistics definitely helps. Let’s take a look at some of these core concepts.
Statistical significance is the statistical proof that A/B test results are valid and not due to random chance. It reflects how confident you want to be in your test results. In A/B testing, it is common to set the significance level to 95%, which allows only a 5% chance of error.
Statistical significance depends on two factors:
Calculating statistical significance is a two-fold process. On the surface level, there are multiple ready-to-use online calculators that will evaluate test results and calculate whether statistical significance has been reached. Nevertheless, it takes additional expertise to see correlations and draw conclusions from numerical data. As a rule, in the Web/eCom context, this is the responsibility of a dedicated CRO team.
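To show what such calculators do under the hood, here is a minimal sketch of a two-proportion z-test; the conversion counts are made up for demonstration and do not come from a real experiment.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_p_value(conv_a: int, n_a: int,
                           conv_b: int, n_b: int) -> float:
    """Two-sided p-value for control vs. variant conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled conversion rate
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return 2 * (1 - norm.cdf(abs(z)))

# Hypothetical results: 4.8% vs. 5.6% conversion over 10,000 sessions each.
p_value = two_proportion_p_value(conv_a=480, n_a=10_000,
                                 conv_b=560, n_b=10_000)
print(f"p = {p_value:.4f}, significant at 95%: {p_value < 0.05}")
```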
Usually, several objectives are tracked within a test. Let's look at a real example. We recently ran an A/B test for a 2-step checkout, with the following hypothesis:
“Many users assume that the upsell block shows unexpected products in their cart; therefore, by removing the upsell block from the Checkout, more users could proceed to the 2nd step (Review), and therefore more users might complete a purchase.”
Control and variant of the experiment:
For this hypothesis, we tracked the 3 following objectives:
Results in Google Optimize showed that the variant performed better in terms of transactions, but that alone wasn't enough to make the right decision. There can be a case where the number of transactions increases after removing the upsell block, but revenue actually decreases, since removing the upsell risks lowering the average product quantity per transaction.
Hence, each objective should be evaluated thoroughly before making any conclusions. However, evaluating only the objectives you set before the experiment is still not enough to draw correct conclusions.
It's important to complete a segmented analysis. There might be several specific segments your test should be analyzed under. Beyond those, you should always analyze A/B test results for each device category separately: what worked brilliantly on desktop could look bad or not work at all on mobile, and vice versa.
To sum up, it's essential to always evaluate all objectives by device type and any other test-specific segments.
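As a sketch of what such segmented analysis can look like in practice, the snippet below computes conversion rates per variant and device category with pandas; the column names and numbers are illustrative assumptions about your analytics export.

```python
import pandas as pd

# Hypothetical analytics export: sessions and transactions per day,
# split by experiment variant and device category.
sessions = pd.DataFrame({
    "variant": ["control", "control", "variant", "variant"] * 2,
    "device": ["desktop"] * 4 + ["mobile"] * 4,
    "sessions": [5200, 5100, 5150, 5250, 7900, 8100, 8000, 7950],
    "transactions": [260, 250, 290, 300, 240, 230, 215, 210],
})

# Conversion rate per (device, variant) pair.
by_segment = (
    sessions.groupby(["device", "variant"])[["sessions", "transactions"]]
    .sum()
    .assign(conv_rate=lambda df: df["transactions"] / df["sessions"])
)
print(by_segment)
# In this made-up data the variant wins on desktop but loses on mobile,
# which is exactly why each device category is evaluated separately.
```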
This is the step where you should have completed your test, evaluated the data, discovered the results of the experiment, and drawn conclusions. But before you can proceed with the implementation, you need to make sure that the changes are validated and accepted by the various parties involved, such as stakeholders, developers, and other teams. This means that the results of the experiment have to be clearly communicated. This is done best with the help of data visualization tools.
At first glance, showing multiple rows of raw data might look credible and demonstrate the amount of analysis you did, but it will be very hard for your audience to understand what this data actually means.
Consider the two report examples below:
While table data representation is commonly used and looks credible, it lacks readability, making it difficult to make sense of on the fly.
Result visualization using Excel:
The second example presents the same information but does so in a much more comprehensible manner. Another benefit of such a report is that it can serve as test documentation over time. This makes it easy for all parties involved to go back to it and see what was tested, when, and what outcomes were achieved.
This particular example was created using Google Data Studio. For those of you who are interested, we have prepared a detailed guide on how to create live-mode dashboards for A/B test results.
Evaluate monetary gains for the variant and each of the segments during every test analysis, as this clearly shows how much the variant earned or lost. Additional revenue is the strongest argument there is for persuading management to implement the changes tested in the variant.
After the test is evaluated and decisions are made, it is time to go back to the first step of the iterative A/B testing process, i.e., analyze new data, revise your hypotheses, and kick off a new test.
How can you tell whether the changes tested in the variant are ready to be implemented permanently?
The answer lies on the surface: you are good to proceed if the test results clearly suggest that the variant has outperformed the control, and an agreement has been reached to implement the changes. However, keep in mind that the experiment doesn’t end here. You still have to monitor the objectives set during the experiment, even after the changes are permanently implemented. This will help ensure that the improvements are real and sustainable, rather than influenced by external factors during the experiment, such as seasonal demand changes.
There are plenty of lists with 'go-to things to A/B test' out there on the Internet, but be careful with those. They're good for brainstorming sessions and finding the direction you'd like to move forward in, but remember to tailor those suggestions to your business!
To make sure you run valuable tests and to increase the possibility of them being winning tests, each tested hypothesis should be based on prior data analysis.
Hence, to get you inspired, we would like to share case studies of the A/B tests we've run:
During quantitative analysis for one of our clients, we noticed a very high drop-off from the Shopping cart. Based on session recordings and heatmap analysis, we concluded that a drop-off this high could be caused by the upsell block in the cart, as we saw many users trying to remove products from it, assuming these were unexpected and unwanted products in their Shopping Cart.
Due to the high cart abandonment rate and the heatmap analysis, we concluded that many users assume the upsell block shows unexpected products in their cart; therefore, by removing the upsell block and adding a progress bar to the Checkout, more users could proceed to the 2nd step (Shipping), and therefore more users might complete a purchase.
- Removed the cart-like upsell block
- Added a progress bar
- Hid the promo code under an expandable field
- Added a “Proceed to Payment” button
Tracked objectives: Cart-to-Checkout rate, Transactions, Revenue.
The new Checkout design had a 7.25% higher conversion rate, which resulted in USD 31,798.26 of additional revenue over 3 weeks. You can see a chart with the number of transactions per variant below.
Transactions by date and variant:
Control and variant of the A/B test:
Due to the high cart abandonment rate on mobile devices, we expect that displaying a pop-up on mobile devices will inform customers that the product was successfully added to the shopping cart. We expect customers’ motivation to purchase the product to increase; therefore, the Cart-to-Checkout rate on mobile devices could increase.
Control and variant of the A/B test:
When introducing the iterative A/B testing process to your optimization program, you should be prepared for 8 out of 10 tests to lose or make no difference. But this definitely should not stop you: the more tests you run (even if some are not successful), the more valuable knowledge you gain about your users, and eventually you will be able to turn it into valid hypotheses that win far more often.
We've covered the topic and process of A/B testing; now let's answer the questions we get asked most frequently:
1. What websites qualify for A/B testing?
Generally, any website can qualify for A/B testing. Before running a test, you need to calculate whether your website will be able to reach the required sample size in a reasonable time frame (you don't want the experiment running for 3 months). Hence, if your website doesn't receive high volumes of traffic yet, you may test the pages with the most traffic and test big changes. As discussed previously, the bigger the expected difference in the test objective, the smaller the required sample size.
If your website has extremely low traffic and there aren't at least 350–400 conversions made on your website per variant, you should consider re-evaluating and investing in your traffic acquisition campaigns. Check out our SEO program offering ways to increase organic traffic that converts.
2. What is the cost of running A/B tests?
The cost of running A/B tests consists of 4 variables:
=SUM(TOOL; CONVERSION STRATEGIST; DESIGNER; DEVELOPER)
TOOL - the A/B testing tool (can be free, e.g., Google Optimize);
CONVERSION STRATEGIST - hours required for prioritization, pre- & post-test analysis, results evaluation, and A/B test management;
DESIGNER - hours required for creating variant designs (not always needed);
DEVELOPER - hours required for implementing the test (not always needed).
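As a toy illustration of the formula, the sketch below turns it into code; the hourly rate and hour estimates are purely hypothetical.

```python
def ab_test_cost(tool: float, strategist_h: float, designer_h: float,
                 developer_h: float, hourly_rate: float = 60.0) -> float:
    """Total cost = tool price + (strategist + designer + developer hours) * rate."""
    return tool + (strategist_h + designer_h + developer_h) * hourly_rate

# A free tool such as Google Optimize keeps the first term at zero.
print(ab_test_cost(tool=0, strategist_h=10, designer_h=4, developer_h=8))
# 0 + 22h * 60 = 1320
```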
Nevertheless, you should be ready for hidden costs that can appear if your variant performs worse than the control. You can think of this cost as insurance that negative changes will not be permanently implemented.
3. Can I run several tests at once?
It's not recommended to run multiple tests on different pages simultaneously, as traffic can overlap, which can skew the results. You can run multiple tests at once only if you can ensure that the participants of each experiment won't overlap and you'll be able to properly analyze the results.
4. Are A/B tests exclusive to websites?
Not at all; you can run an A/B test almost anywhere. You can A/B test email campaigns, PPC campaigns, social media campaigns, and most likely any other marketing campaign you have in mind.
5. Can A/B testing hurt SEO?
It's a popular concern that A/B tests can hurt your SEO due to duplicate content, but let us assure you that this is a myth. When running a split URL test, simply add a "noindex" tag and rel="canonical" to the variant to keep Google from treating it as duplicate content on your website.
6. How to get started with A/B testing?
You'll need to put into practice the A/B testing process we described in this eBook. Identify the phase you're at and begin continuously running different experiments. If you need any advice or assistance, feel free to get in touch with us at [email protected]. Or refer to the previous section of case studies to get inspired for your first A/B test.
Scandiweb’s Digital Marketing department consists of a certified team of CRO and SEO experts - with you from preparing the first technical setups, to collecting accurate data, to producing tailored designs and user journeys.
We skillfully make use of A/B/n tests to find unique, innovative solutions that take your business above and beyond. Tell us your conversion goals and we'll make them a reality!
- A step-by-step guide on how to create a Data Studio dashboard to follow A/B test results.
- Find out how we achieved an increase in the CTR and element interaction rate with value proposition optimization.
- Introduce a successful CRO program to your workflow, and witness the wonders of structured conversion optimization.
- Reduce cart abandonment on your online store’s shopping funnel with these eCommerce cart abandonment solutions.
- Read on to find out what went on behind the scenes of creating a new checkout flow for the biggest airline in the region.
- Landing page optimization best practices to make the most out of your marketing campaigns.
- Build an effective abandoned cart email strategy and decrease checkout drop-off on your eCommerce store.