    Google Play experiments allow you to evaluate which version of a mobile app’s description, graphics, or icon performs better and generates more downloads. This is one variation of A/B testing, and the entire process takes place in the Google Play console for developers. It is possible to conduct tests on a single element, as well as on many at the same time, in the markets of your choice.

    A/B testing of mobile apps is one of the most widely used methods for increasing app performance in app stores. It is also part of optimizing app visibility in stores, so-called ASO (app store optimization). By knowing which elements affect conversion, you are able to better tailor them to your audience.

    Google Play offers an A/B testing feature, which you can configure yourself in the console. For this, the highest level of developer console access is required. Ultimately, a higher conversion rate means more installations from the same level of traffic, and thus more app users.

    Types of app tests in Google Play

    In the new version of the Google Play console you can set up two types of experiments:

    1. Global, or default. This type of test allows you to test only visual materials, such as screenshots, icons, or videos. In this version of the test, you will not test different variants of descriptions! In addition, you can have only one active global test per account. All experiments of this kind apply only to the default language of the mobile application. Therefore, if you also use the creative for other markets, users of those markets will not see the experiment.
    2. Local. This type of experiment can test all elements of the application in the store. You can set up as many as five active tests for different locations at the same time.

    What ASO elements can you test as part of Google Play experiments?

    • Icon,
    • graphics,
    • screenshots,
    • video footage,
    • short description (available only in local experiments),
    • full/long description (available only in local experiments).

    Unfortunately, it is currently not possible to test app names using Google Play experiments. 

    How to configure an A/B test of a mobile application?

    To run the test, you must first log in to your Google Play Console account. Remember that your account must have permission to create experiments. Once you have selected your mobile app, go to the Development -> Experiments tab on the app page. Then click Create Experiment.

    On the experiment creation screen, you will configure:

    • its name, mainly for statistical purposes,
    • the store listing to be tested (usually you will have one main listing),
    • the type of experiment: default/global (visual elements only) or local.

    The next step is to set up the experiment. The first item concerns the test group, where you can choose to test on all users, regardless of their quality. The second option (new in the Google Play console) takes into account only those users who have kept the app on their device for at least one day. Basically, this is meant to exclude casual users and low-quality traffic, which could otherwise distort the quality of the results.

    Next, select the number of alternatives to be tested (up to three). Note that with each added option, the system continuously “estimates” the number of new users needed for the result of the experiment to be reliable.

    The experiment audience option allows you to determine the percentage of users visiting the app’s store page who will see an experimental variant instead of the currently published listing. These users are divided equally among all the experimental variants.
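
    To make the arithmetic concrete, here is a minimal sketch of how such an equal split works. The 50% audience and three variants are hypothetical numbers, not values taken from the console:

        def variant_share(audience_pct: float, n_variants: int) -> float:
            """Share of all store visitors each experimental variant receives,
            assuming the experiment audience is split equally among variants."""
            return audience_pct / n_variants

        # Example: 50% experiment audience split across 3 variants.
        # The control (current listing) keeps the remaining 50% of visitors.
        share = variant_share(50.0, 3)
        print(f"Each variant receives about {share:.1f}% of all visitors")  # ~16.7%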

    The minimum detectable effect is the smallest percentage difference between the original and the tested version that the experiment should be able to detect. Google Play suggests a minimum of 2.50% here. The smaller this value, the larger the sample required to reach the specified confidence level, because more data is needed to detect a smaller difference in conversion rate between the control group and the variant.

    Finally, it remains to determine the confidence level – the percentage that expresses how confident you can be in the result of the experiment. The higher the confidence you expect, the more users you need to collect. However, a 90% confidence level seems sufficient in most cases (about 1 in 10 test results can then be a false positive).
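
    Google Play’s calculator is a black box, but the standard sample-size formula for comparing two conversion rates gives a feel for how the minimum detectable effect and the confidence level drive the required traffic. This is a rough approximation, not Google’s exact method; the 30% baseline conversion rate and 80% statistical power below are assumptions:

        from statistics import NormalDist

        def sample_size_per_group(baseline_cvr: float, mde_relative: float,
                                  confidence: float = 0.90, power: float = 0.80) -> int:
            """Approximate visitors needed per group for a two-proportion z-test.

            baseline_cvr  - conversion rate of the current listing, e.g. 0.30
            mde_relative  - minimum detectable effect as a relative lift, e.g. 0.025
            confidence    - 1 - alpha (two-sided test)
            power         - 1 - beta (chance of detecting a real effect)
            """
            p1 = baseline_cvr
            p2 = baseline_cvr * (1 + mde_relative)
            z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
            z_beta = NormalDist().inv_cdf(power)
            p_bar = (p1 + p2) / 2
            top = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                   + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
            return int(top / (p2 - p1) ** 2) + 1

        # A 2.5% relative lift on a 30% baseline at 90% confidence needs a
        # large sample; relaxing the MDE to 10% shrinks it dramatically:
        print(sample_size_per_group(0.30, 0.025))  # ~46,000 per group
        print(sample_size_per_group(0.30, 0.10))   # ~3,000 per group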

    When setting up the test, Google Play calculates the required number of new users needed to complete the experiment. This also shows how many days the test will need to collect data. The calculator simply uses historical traffic data from the past days and weeks.
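
    For intuition about the duration estimate, here is a minimal sketch under the same assumptions: a required sample per variant, an equal traffic split, and a hypothetical 2,000 new visitors per day taken from historical data:

        import math

        def days_to_complete(required_per_variant: int, n_variants: int,
                             daily_new_visitors: int, audience_share: float) -> int:
            """Rough test duration based on historical daily traffic.

            audience_share - fraction of visitors routed into the experiment,
                             split equally among the variants.
            """
            daily_per_variant = daily_new_visitors * audience_share / n_variants
            return math.ceil(required_per_variant / daily_per_variant)

        # Hypothetical listing: 2,000 new visitors/day, 50% experiment
        # audience, 3 variants (~333 visitors per variant per day):
        print(days_to_complete(3000, 3, 2000, 0.5))  # -> 9 days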

    At the final stage, all that remains is configuring the variants for testing. You can upload graphic materials and customize the names of the variants to make them easier to evaluate later.

    After running the experiment, you will see data for the tested variants. The results are presented according to the chosen confidence level, and the apply button allows you to quickly select the best version and implement it as the target.

    Good practices for testing mobile application views

    At the beginning of the testing process, it is a good idea to establish the purpose of the test – this makes it easier to design the experiment. Testing one attribute at a time (for example, only icons, or only screenshots) helps you understand exactly which element affected the change. When you test multiple elements at the same time, you cannot tell how each component affected the result.

    In the testing process, it is worthwhile to go through all the elements of the application page one by one, giving priority to graphical elements. Setting the duration of the experiment to at least 7 days will allow you to get data from user behavior during both the work week and the weekend, which can vary significantly.

    It is also a good idea to expose the test to a large share of your audience – at least 50% – since a larger group will allow you to get results faster. While the first tests may involve significant graphical changes, the results can be further refined by testing, for example, a call to action on seemingly similar graphics.

    Analyze results with Google Play experiments

    In the testing process, it’s worth keeping seasonality in mind. Google Play has algorithms that are statistically good, but only you can understand why a particular variant performs better or worse than the current version.

    If you use many different test variants, they will receive traffic from different sources, campaigns, or keywords. If, during the test, you significantly change the traffic sources, include new media, or change the visibility of the application, this will affect the final results.

    Google Play tests, like Google Ads experiments, can also produce false results. How can you remedy this? The best way to check is to run an A/B/B test. In this case, if both B variants perform the same, you can rely on the results. If, on the other hand, there is a large discrepancy between the two B variants, then something is probably wrong (too small a data sample, the influence of randomness, trends, etc.).
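
    One way to quantify whether the two B variants “perform the same” is a two-proportion z-test between them. This is a sketch, not Google’s internal method; the install and visitor counts are invented for illustration:

        from statistics import NormalDist

        def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
            """Two-sided p-value for the difference between two conversion rates."""
            p_pool = (conv_a + conv_b) / (n_a + n_b)
            se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
            z = (conv_a / n_a - conv_b / n_b) / se
            return 2 * (1 - NormalDist().cdf(abs(z)))

        # Hypothetical installs/visitors for the two identical B variants:
        p = two_proportion_p_value(310, 1000, 290, 1000)
        print(f"p-value = {p:.3f}")  # ~0.33: no evidence the Bs differ
        # A small p-value (e.g. < 0.10) between identical variants is a red
        # flag: the sample is probably too small or the traffic too noisy.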

    You also need to remember that most tests give negative results. Even if you don’t end up following Google Play’s recommendations, you won’t lose your invested time: you will understand what didn’t work, so subsequent tests can be designed differently.

    If you apply the test results to your actual store information page, be sure to monitor the results in subsequent periods. Just because a result came out positive during a test doesn’t mean it will always be so. For example, there may be competitors in the market who will take over existing traffic.

    Limitations of Google Play experiments

    Application version experiments involve some limitations, which are worth knowing in order to better understand the mechanics, but also to interpret the results:

    • You can’t select traffic sources for Google Play tests – the console will always use all incoming traffic (so it’s worth interpreting the results with care if there are big changes in traffic sources to the app).
    • The data in the tests do not show user-generated value; they are based only on installations, which is a big challenge for mobile app UX audits.
    • If you plan to run multiple tests and variants of tests with different attributes, you will not have the opportunity to see what impact each component had (which severely reduces the quality of such an experiment).

    Testing variants of a mobile app

    Google Play provides a simple, clear, and pleasant setup tool for experimenting with different app variants in stores. At the same time, it provides a statistical calculator that shows the amount of data needed to consider the test reliable. This makes it difficult to make a mistake and draw premature conclusions about the effectiveness of the tested variants. As part of app marketing, testing should be an ongoing part of the process of improving application visibility.

    It’s always a good idea to start with a firm hypothesis and plan an appropriate amount of time per test. With this in mind, it is worth testing large changes rather than focusing on small areas, individual colors, or calls to action. Markets with higher conversion rates are also important, as testing there will take less time. For target markets that don’t have a high flow of new users, you can use paid campaigns to increase the sample size. Even if you get a positive result in the short term, wait a few more days to confirm it.

    Tomasz Starzyński

    CEO and managing partner at Up&More. He is responsible for the development of the agency and coordinates the work of the SEM/SEO and paid social departments. He oversees the introduction of new products and advertising tools in the company and the automation of processes.