Blog

Writing tests for Android apps using Python and Linux

Introduction

If you’ve ever developed a mobile app, or any other piece of software, you have very likely encountered a bug at some point. Sometimes you get a bug in your code which stops the app from compiling. This sounds nasty, but in fact, the nastiest bugs are those which you can’t see straight away. You have to summon them by following a very specific series of steps in the app.

So how do you find these hidden bugs? Of course, you can look for them manually yourself (or ask somebody else to look for you). But it is very time-consuming, and will get more time-consuming as the app grows! So, to save time and effort, couldn’t we perhaps get a program to look for these bugs? Couldn’t we automate this process somehow?

Thanks to Appium, we can. Appium is a library designed specifically for testing apps. For example, let’s imagine you’re developing an app where a user can register for an account. Appium allows you to write a test script which will register a dummy user through the UI and check that all of the buttons and menus are working properly.

Appium works on Android, iOS and Windows apps. What’s more, Appium was designed to run with a variety of programming languages, including Python, Java, and Ruby, so you likely don’t have to learn a new language to use it.

This tutorial will show you how to set up automated mobile testing on Linux. We will be using Python to write a test for a free Android chess app. This tutorial was designed to work with Ubuntu 16.04+, but should also work with any similar Linux distribution such as Debian.

How does Appium work?

NOTE: If you just want to start testing, skip to the “Setting up the environment” section below.

Before diving into some actual testing, let’s first try to understand what Appium does.

Linking Python to the Android UI

Let’s say we want Python code to control the UI of an Android app. What do we need to achieve this?

  1. We need two machines: one machine to run the Python code, and the other to run the Android app. For our purposes, we will run the Python code on our local machine, and the app will run on the Android Emulator provided by Android Studio.
  2. We need some sort of interface to program the app UI. To do this, Android provides us with the UI Automator API. This is a collection of Java classes designed specifically for interacting with and testing the UI of apps running on Android.
  3. Finally, we need some way to convert Python code into commands understood by the UI Automator API.

The last step is where Appium steps in.

The WebDriver Protocol

Appium is really just an extension of Selenium, which was designed for testing web browsers. Selenium uses something called the W3C WebDriver Protocol to communicate with browsers and “drive” browser actions.

How are browser actions “driven”, exactly? First, a server linked to a browser is set up to listen for HTTP requests – this is the WebDriver server (for example, Google provides ChromeDriver as the WebDriver server for Google Chrome). At a certain point, the server might receive a request to go to a specific URL:

POST /session/{session id}/url     # Request-Line for going to a URL

The body of the request would contain a URL, such as www.google.com. The server would understand this as saying: “Get the browser to go to www.google.com”. The server would then route the request to the browser, and the browser would redirect to www.google.com. Finally, the server would respond with a status message.
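As a rough sketch of the request described above, the snippet below builds (but does not send) a "go to URL" request using only the Python standard library. The server address (ChromeDriver's usual local port) and the session id are placeholder assumptions; a real session id is issued by the server when a session is created.

```python
import json
import urllib.request


def build_navigate_request(server_url, session_id, target_url):
    """Build (but do not send) a WebDriver 'go to URL' request."""
    endpoint = f"{server_url.rstrip('/')}/session/{session_id}/url"
    # The request body is a JSON object holding the URL to open.
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder values, for illustration only.
request = build_navigate_request(
    "http://127.0.0.1:9515", "example-session-id", "https://www.google.com"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice you would not build these requests by hand: a WebDriver client library generates them for you from ordinary method calls.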

Similarly, the request to find a particular element on the page is:

POST /session/{session id}/element

The WebDriver protocol contains a number of such commands. Appium implements a way of getting these commands to work for mobile. For instance, Appium has a UIAutomator2 Driver which (together with a server running on the device connected via the Android Debug Bridge) can drive the UI Automator API. This means that a request like finding an element on the page will eventually be converted to the corresponding UI Automator method in Java.

In addition, Appium adds a whole collection of commands specifically designed for mobile interaction, such as installing an app on the device:

POST /wd/hub/session/{sessionId}/appium/device/install_app

or toggling wi-fi:

POST /wd/hub/session/{sessionId}/appium/device/toggle_wifi

All of these HTTP requests are generated from Python code with Appium’s Python client.

Tracing a test command from Python to Android

We are now in a position to trace a single test command from Python all the way to the app UI (see Fig. 1 for a diagram).

Let’s say we have our app open in our emulator, and we want to use Python to find the “Register” button. This element has an accessibility id ‘Register’ (in Android, the accessibility id is also known as the ‘content-desc’ attribute; more below!).

The Python command for this is:

el = uiautomator2driver.find_element_by_accessibility_id('Register')

The Appium Python client converts this to an HTTP request (the session-id is already generated before the test is run):

POST /wd/hub/session/{session-id}/element
...<rest of headers>...

{
   "strategy": "accessibility id",
   "selector": "Register",
   "context": "",
   "multiple": false
}

The Appium server will receive this request and route it to the UIAutomator2 Driver. The driver converts the request to JSON and routes it to a UIAutomator2 Server running on the device, which then converts the request to the UI Automator method:

<UiDevice>.findObject(descriptionContains("Register"))

The Android OS will internally process this command and, if it can find the element, will respond with the unique ID of the element. The ID then gets stored in el in the Python code, where we can perform further actions such as clicking on it (el.click()), getting its coordinates (el.location), and more.

Fig. 1 High-level architecture of Appium. A Python command is converted to an HTTP request and sent to the Appium server (1). The server routes the request to the desired driver (2). The driver generates a JSON object and sends it to a server on the device (3). The JSON object is eventually converted to a UI Automator API command (4).
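To make the flow in Fig. 1 more concrete, here is a toy model of it in plain Python. It is only an illustration: the real translation happens inside the Appium server and in Java on the device, and the function names and the selector templates below are assumptions made for this sketch.

```python
import json

# Toy mapping from a locator strategy to a UI Automator selector expression.
# These templates are illustrative stand-ins for what the on-device server does.
SELECTOR_TEMPLATES = {
    "accessibility id": 'descriptionContains("{value}")',
    "id": 'resourceId("{value}")',
}


def client_request(session_id, strategy, selector):
    """Step 1: the Python client turns a method call into an HTTP request."""
    body = {
        "strategy": strategy,
        "selector": selector,
        "context": "",
        "multiple": False,
    }
    return {
        "method": "POST",
        "path": f"/wd/hub/session/{session_id}/element",
        "body": json.dumps(body),
    }


def device_command(request):
    """Steps 2-4: decode the JSON payload and build the selector expression."""
    payload = json.loads(request["body"])
    template = SELECTOR_TEMPLATES[payload["strategy"]]
    return "<UiDevice>.findObject(" + template.format(value=payload["selector"]) + ")"


req = client_request("example-session", "accessibility id", "Register")
print(req["method"], req["path"])
print(device_command(req))
```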

Setting up the environment

NOTE: These steps were designed to work with Ubuntu 16.04+, but should also work with any similar Linux distribution.

Optional: Download the Java Development Kit (JDK)

NOTE: Since version 2.2, Android Studio has come bundled with OpenJDK, so installing a JDK separately is not necessary. However, some developers regard Oracle JDK as more stable than OpenJDK, so we’ve included the installation instructions for it as an alternative.

  1. Download the latest Linux distribution of JDK 8 here: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
  2. Extract the contents of the tar file into a folder of your choosing: tar zxvf jdk-8u221-linux-x64.tar.gz

1. Download and install Android Studio

  1. Download the latest version of Android Studio here: https://developer.android.com/studio#downloads
  2. Extract the contents of the tar file into a folder of your choosing: tar zxvf android-studio-ide-191.5791312-linux.tar.gz
  3. Follow the instructions inside the extracted directory to install Android Studio on your machine.

2. Set the JAVA_HOME and ANDROID_HOME environment variables

In order for packages to know which JDK and Android SDK you are using, we need to specify the JAVA_HOME and ANDROID_HOME environment variables:

  1. In your ~/.bashrc file, add the following line: export JAVA_HOME=/path/to/JDK. If you’re using the JDK bundled with Android Studio, the path will be something like JAVA_HOME=<Android Studio directory>/android-studio/jre. If you’re using the Oracle JDK distribution, it will be something like JAVA_HOME=<JDK directory>/jdk1.8.0_221.
  2. Add the following line as well: export ANDROID_HOME=<Android Studio directory>/Sdk where <Android Studio directory> is the path to your Android Studio directory.
  3. Finally, add the following line: export PATH=$PATH:$JAVA_HOME/bin:$ANDROID_HOME/tools:$ANDROID_HOME/tools/bin:$ANDROID_HOME/platform-tools.
  4. Now open up a terminal and type in java -version. You should see a status message showing the correct Java version.
  5. In the same terminal type in adb. You should see a help page for the Android Debug Bridge.
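If you prefer to script the checks from the steps above, a minimal sketch along these lines may help. The function name check_environment is our own invention; it only confirms that the two variables are set and that java and adb can be found on PATH, not that they point at the right versions.

```python
import os
import shutil


def check_environment(env=None, which=shutil.which):
    """Return a list of problems; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    # Both variables must be set to a non-empty value.
    for name in ("JAVA_HOME", "ANDROID_HOME"):
        if not env.get(name):
            problems.append(f"{name} is not set")
    # Both tools must be discoverable on PATH.
    for tool in ("java", "adb"):
        if which(tool) is None:
            problems.append(f"{tool} was not found on PATH")
    return problems


if __name__ == "__main__":
    for message in check_environment() or ["Environment looks OK"]:
        print(message)
```

Remember to open a new terminal (or run source ~/.bashrc) before running the check, so that the updated variables are picked up.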

3. Load an Android emulator

  1. Now open up Android Studio and click on Configure > AVD Manager in the bottom right.
  2. Click on Create Virtual Device, and use the configuration to create a Pixel 2 emulator with Android 9.0.
  3. Load up the emulator by clicking on the “Play” button in the Actions column of your device (if you can’t see a device listed, make sure that, from the home screen, you are under Configure > AVD Manager).
  4. After the device has launched, go to your terminal and type in adb devices. You should see one device listed in the output. NOTE: If you see the device listed as ‘unauthorized’, try configuring an emulator with an Android 9.0 (Google APIs) image instead.
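The adb devices check can also be done from Python. The sketch below parses the command's output into a dictionary of serial numbers and states; the sample text reflects the usual output format, and the serial emulator-5554 is just an example value.

```python
import subprocess


def parse_adb_devices(output):
    """Map each device serial to its state (e.g. 'device', 'unauthorized')."""
    devices = {}
    for line in output.splitlines():
        parts = line.split()
        # Device lines have exactly two fields: serial and state.
        # The header, blank lines and daemon messages do not.
        if len(parts) == 2:
            devices[parts[0]] = parts[1]
    return devices


def list_devices():
    """Run 'adb devices' and parse its output (requires adb on PATH)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


sample_output = "List of devices attached\nemulator-5554\tdevice\n\n"
print(parse_adb_devices(sample_output))
```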

4. Install Appium

  1. Before installing Appium, we strongly recommend that you use a Node version manager such as nvm to avoid permission issues.
  2. In a terminal, run the command: npm install -g appium. NOTE: If you are not using a Node version manager, you will need to run sudo npm install -g appium --unsafe-perm=true.
  3. Install appium-doctor using npm install -g appium-doctor and type appium-doctor to ensure Appium has been properly installed.
  4. Finally, type appium into a terminal. If everything has been properly installed, you should see the Appium server load up.
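Once the server is running, you can confirm from Python that it is reachable. This is a small sketch that assumes the server listens at the address used later in this tutorial (http://127.0.0.1:4723/wd/hub) and answers on its status endpoint; the helper names are ours.

```python
import urllib.request


def status_url(executor):
    """Return the status endpoint for a given server base URL."""
    return executor.rstrip("/") + "/status"


def is_server_up(executor, opener=urllib.request.urlopen, timeout=5):
    """Return True if the server answers the status request with HTTP 200."""
    try:
        response = opener(status_url(executor), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and HTTP error responses.
        return False
    try:
        return response.status == 200
    finally:
        response.close()


if __name__ == "__main__":
    print(is_server_up("http://127.0.0.1:4723/wd/hub"))
```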

5. Set up the Python environment

In order to write Python tests for mobile, we require four components:

  1. Python 3
  2. Selenium
  3. appium-python-client
  4. pytest

For this tutorial we will use Conda to manage Python environments, but you can use virtualenv as well.

  1. First, download and install Miniconda (for Python 3) if you haven’t done so already. If you have Miniconda or Anaconda already installed, update to the latest conda version by typing conda update conda into a terminal.
  2. Create a new conda environment.
  3. Inside the environment type into a terminal: conda install selenium pytest && pip install appium-python-client
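To double-check that the packages landed in the active environment, a quick sketch like the following can be run with that environment's Python. It only tests whether each module can be located, using the import names selenium, appium and pytest.

```python
import importlib.util

# Import names of the packages installed above.
REQUIRED_MODULES = ("selenium", "appium", "pytest")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing: " + ", ".join(missing) if missing else "All modules found")
```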

You are now ready to write your first Appium test!

Your first Appium test

Let’s see Appium in action by writing a short test for an Android app. This test will start up an app on the emulator and perform a few actions near the start of the app.

NOTE: The code described below can be accessed here: https://github.com/lambertlabs/automated-mobile-testing-demo

Writing the test

For our demo we’ll be using version 3.02 of Chess Free. Click on the link to download the APK file and, with the emulator running, drag the APK file onto the emulator to install it. (At this point, feel free to open the app and look around to see what you will be testing!)

Next, create an empty directory where you want to store your test code. Create a new directory inside called android_apps and paste the downloaded APK file into this directory.

Next, in the top-level project directory create a Python file called conftest.py and fill it with the code below:

import os

import pytest
from appium import webdriver

EXECUTOR = 'http://127.0.0.1:4723/wd/hub'
ANDROID_APP_DIR = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'android_apps')

apk_files = [f for f in os.listdir(ANDROID_APP_DIR) if f.endswith('.apk')]
assert len(apk_files) == 1, 'App directory can only contain one app file.'
ANDROID_APP_PATH = os.path.join(ANDROID_APP_DIR, apk_files.pop(0))


@pytest.fixture
def app_driver():
    driver = webdriver.Remote(
        command_executor=EXECUTOR,
        desired_capabilities={
            'app': ANDROID_APP_PATH,
            # Chess Free V3.02
            'appPackage': 'uk.co.aifactory.chessfree',
            'appActivity': '.ChessFreeActivity',
            'platformName': 'Android',
            'platformVersion': '9',
            'deviceName': 'Android Emulator',
            'automationName': 'UiAutomator2',
        }
    )

    yield driver
    driver.quit()

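The crucial portion of this code is the pytest fixture. A fixture is a function that pytest runs for each test that requests it: the code before yield sets up the driver before the test starts, and the code after yield (driver.quit()) cleans up once the test has finished.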

We first define a Remote WebDriver. This is just a WebDriver where we ourselves specify which server to send HTTP requests to. In this case, we are essentially specifying that we want the Appium server to act as our “browser driver”, using the command_executor parameter.

The desired_capabilities dictionary specifies some properties which we want our Appium server to have. For instance, 'automationName': 'UiAutomator2' specifies that we want the server to route our requests to the UiAutomator2 driver as we will be testing on an Android device.

The app key should hold the path to the APK file of the app that we want to test. appPackage refers to the app package name and is used to verify that the app contained in the APK file is the same app that we actually want to test. appActivity is used by Appium to know which “section” of the app should be opened up first. To find the values of appPackage and appActivity, open the app in the emulator at the activity you want to test and, in your terminal, open the ADB shell with adb shell and type in dumpsys window windows | grep -E 'mFocusedApp'.
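If you want to pull the two values out of the dumpsys output programmatically, a sketch like this may help. The sample line is an assumption about what the mFocusedApp line typically looks like (the hexadecimal tokens are made up), so check it against the output on your own device.

```python
import re

# Matches "<package>/<activity>", e.g. "com.example.app/.MainActivity".
COMPONENT_RE = re.compile(r"([A-Za-z][\w.]*)/([\w.$]+)")


def parse_focused_app(line):
    """Extract appPackage and appActivity from an mFocusedApp line."""
    match = COMPONENT_RE.search(line)
    if match is None:
        raise ValueError("no package/activity component found")
    return {"appPackage": match.group(1), "appActivity": match.group(2)}


# Illustrative sample line; the token values are invented.
sample_line = (
    "mFocusedApp=AppWindowToken{1a2b3c token=Token{4d5e6f "
    "ActivityRecord{7a8b9c u0 uk.co.aifactory.chessfree/.ChessFreeActivity t12}}}"
)
print(parse_focused_app(sample_line))
```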

Now, create a subdirectory called tests and create a file inside called test_enter_app.py with the following contents:

def test_open_app(app_driver):
    app_driver.implicitly_wait(10)

    # Find/click element by resource id
    app_driver.find_element_by_id('YesButton').click()

    # Find/click element by UiSelector
    ui_selector = 'new UiSelector().textContains("OK")'
    app_driver.find_element_by_android_uiautomator(ui_selector).click()

    # Appium interacts with the Android OS, not just the app
    resource_id = 'com.android.packageinstaller:id/permission_deny_button'
    app_driver.find_element_by_id(resource_id).click()

    ui_selector = 'new UiSelector().textContains("I agree")'
    app_driver.find_element_by_android_uiautomator(ui_selector)

    # Press "Back" key.
    # See https://developer.android.com/reference/android/view/KeyEvent.html for keycodes
    app_driver.press_keycode(4)

    app_driver.find_element_by_id('ButtonPlay')

This is what a pytest test function looks like. The argument app_driver tells pytest to use the app_driver fixture from conftest.py. Inside the test, app_driver refers to the driver yielded by the fixture at yield driver. Note how the name of the test begins with test_; pytest uses this prefix to identify which functions to run.

First, the test script sets an implicit wait time of 10 seconds on the driver. This means that, whenever the test looks to see if a condition is satisfied (e.g. if an element is present on the screen), the test will poll for the condition for a maximum of 10 seconds before raising an error. A more in-depth discussion of implicit vs. explicit wait times can be found here.
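The polling idea behind a wait time can be sketched in a few lines of plain Python. This is not how Appium implements implicit waits (that logic lives in the server and drivers); wait_until is a hypothetical helper that only illustrates "keep checking until the condition holds or the time runs out".

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Demo: a condition that only becomes true on the third check.
attempts = []


def element_is_present():
    attempts.append(1)
    return len(attempts) >= 3


print(wait_until(element_is_present, timeout=2, interval=0.01))
print(len(attempts))
```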

The rest of the test script contains a series of commands which perform different actions in the app. For example, app_driver.find_element_by_id('YesButton').click() finds an element with ID YesButton and clicks it. A complete list of such commands can be found here: http://appium.io/docs/en/about-appium/api/

The UiAutomatorViewer

Say we want to click a particular button on the app. How is Appium supposed to know which button to select? In other words, how do we obtain the ID of the button?

The simplest way to obtain the element ID is by using the UiAutomatorViewer program included with Android Studio.

First, open the app in the emulator and make sure that the element you want to inspect is visible on the emulator screen. Next, open a terminal and enter uiautomatorviewer. You should see a new window open: this is the UiAutomatorViewer program.

Along the top, click on the button labelled “Device Screenshot with Compressed Hierarchy”. You will see a screenshot of the emulator screen. Now, click on the button you want to inspect on this screen and look in the ‘Node Detail’ table in the bottom right. Here we have all of the element information that the UiAutomator has access to.

Fig. 2 How to use the UiAutomatorViewer to get the text, resource-id, and content-desc of an element.

The three most useful node details are resource-id, content-desc and text. Fig. 2 shows you where to find these node details. The table below shows which Appium methods to use to look up an element based on each of these node details:

Node detail     Appium method
resource-id     app_driver.find_element_by_id(<resource-id>)
content-desc    app_driver.find_element_by_accessibility_id(<content-desc>)
text            ui_selector = 'new UiSelector().textContains(<text>)'
                app_driver.find_element_by_android_uiautomator(ui_selector)
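The table above can be folded into a small helper that picks the lookup method from the node detail you have. This is a sketch: find_by_node_detail and RecordingDriver are hypothetical names, and the recording class merely stands in for the real driver so that the example runs without an emulator.

```python
# Node detail -> (driver method name, function building its argument).
LOOKUPS = {
    "resource-id": ("find_element_by_id", lambda value: value),
    "content-desc": ("find_element_by_accessibility_id", lambda value: value),
    "text": (
        "find_element_by_android_uiautomator",
        lambda value: f'new UiSelector().textContains("{value}")',
    ),
}


def find_by_node_detail(driver, detail, value):
    """Call the driver method that matches the given node detail."""
    method_name, build_argument = LOOKUPS[detail]
    return getattr(driver, method_name)(build_argument(value))


class RecordingDriver:
    """Stand-in for the Appium driver: returns the call it received."""

    def __getattr__(self, name):
        return lambda argument: (name, argument)


driver = RecordingDriver()
print(find_by_node_detail(driver, "resource-id", "YesButton"))
print(find_by_node_detail(driver, "text", "OK"))
```

With the real app_driver fixture, the same helper would return the matching element instead of a tuple.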

Running the test

First, make sure that Appium and your emulator are running. Next, open up a terminal in the top-level project directory, enter your virtual environment, and type in pytest. If everything has been set up correctly, you should see the Chess Free app load in the emulator, buttons being selected and, if you’re lucky, the test passing.

Congratulations! You now have all the tools required to write automated tests for Android apps.

George Lambert features on Startup Secrets Podcast

Our founder George Lambert recently featured on the Startup Secrets Podcast, where he spoke to host Seb Francis, gave insights into our company’s journey and provided tips for aspiring entrepreneurs. In the podcast George discusses the trials and tribulations of starting and growing a business from scratch. You can listen to the episode here.

The Startup Secrets Podcast is a podcast for entrepreneurs, set to inspire, educate and motivate new businesses to success. 

Startup Secrets works in association with Accounts Lab, a cloud accountancy firm who specialise in startups and growth businesses across all sectors. 

With episodes released weekly, you can subscribe on iTunes or follow on Spotify to never miss an episode.

Pair Programming at Lambert Labs

Gone are the days of ‘traditional’ programming practices, where teams of software developers might have restricted their communication to once-weekly meetings and worked towards yearly release dates; we are now accustomed to a far more iterative process: daily standups, TDD and continuous integration and/or deployment. This iterative process is of course part of the wider set of agile programming practices.

At Lambert Labs we take our agile programming practices seriously. As part of this we regularly use pair programming as a way to ensure that we are working as efficiently and effectively as possible.

Pair Programming Workstation Setup

The workstation setup for pair programming is important. If two developers are around one workstation for significant periods of time, they might get uncomfortable. This is why we use ‘dual workstations’ for our pair programming. Our dual workstations are made up of two desks and four monitors, but only one computer – the two monitors on the desk with the computer are duplicated to the two monitors on the desk without the computer. This means that both developers can sit in comfort while pair programming!

Setting up our workstations requires HDMI splitters (we use something like this). Our current desk of choice is the IKEA Skarsta Standing Desk.

The Good

Code exposure: pair programming helps developers of all levels of seniority get exposure to different parts of the codebase. This is really important because it helps developers understand the overall system that they are working on and also helps provide continuity when a developer with ownership of a section of the codebase is on holiday or otherwise engaged.

Code quality: if two developers are looking at the same piece of code during development then this is similar to having an extremely thorough review process (think how often a reviewer spends as long reviewing code as the developer spent writing it – very rarely!).

Focus: yes, pair programmers might have a chat and a giggle, but what they won’t do is things like check social media, check their emails and surf the net. In this case, working together brings more productivity.

Morale: working together is fun, and it boosts morale! The life of a developer can at times be unnecessarily quiet. Pair programming discourages deathly silences and encourages collaboration.

Fewer bugs: working as a pair means that fewer bugs creep into the code base. This reduces time spent debugging at a later stage.

The Bad

Lack of ownership: developers don’t always feel that they have ownership of a section of the codebase, and feel as though they are getting pulled in multiple directions.

The Ugly

Lack of synergy: if two developers don’t ‘click’ when they are working together then the relationship might not be productive – in some cases this is just human nature!

At Lambert Labs we find that the benefits of pair programming hugely outweigh the drawbacks. It improves our productivity and standards, and we will stick with it moving forwards 🙂

How to integrate Jira with GitHub

At Lambert Labs we work with a range of clients across different sectors. What can be awkward is that they often use completely different DevOps and project management tools. On the DevOps side, some clients use AWS while others prefer Google Cloud Platform or even PaaS providers such as Heroku. Some clients use CircleCI while others prefer to use Travis/Jenkins. From the project management perspective Jira and Confluence are very popular, but a selection of our smaller projects still make use of Trello.

As a software development agency it is important for us to be expert users of as many of these project management tools as possible because it enables us to work on a broader range of projects. We recently started working on a project with a client that uses GitHub for code hosting and Jira for issue tracking. Integrating Jira with GitHub makes it easier for our project managers and software engineers to keep track of the GitHub branches and pull requests that correspond to tickets in Jira. The benefits of integration are shown in the example Jira ticket below, where there are links to a GitHub branch and pull request in the development section (these appear automatically as part of the integration).

A Jira issue demonstrating a GitHub integration

Integration

Prerequisites

  • A Jira account with administration rights
  • A personal GitHub account, or a GitHub organisation account with administration rights

Setting up GitHub

First, navigate to your GitHub settings page. If you use your personal settings, Jira will eventually get access to your personal repositories; if you use your organisation settings, it will get access to your organisation’s repositories. Go to ‘Developer Settings -> OAuth Apps -> New OAuth App’. You will be presented with the following page:

GitHub page for registering a new OAuth application

Fill in the following details and click on ‘Register Application’:

  • Application Name: Jira
  • Homepage URL: https://yoursubdomain.atlassian.net/
  • Application Description: Jira integration
  • Authorization Callback URL: https://yoursubdomain.atlassian.net/

After clicking on ‘Register Application’ you will be taken through to a confirmation page giving you a Client ID and a Client Secret. Make a note of these – you will need them in a moment.

Setting up Jira

Now navigate to your organisation’s dashboard on Jira and go to ‘Settings -> Applications -> DVCS Accounts -> Link GitHub Account’. You will be prompted with the following popup:

Adding a GitHub account to Jira

Choose either GitHub or GitHub Enterprise as your host (depending on what is appropriate) and put either your GitHub username or GitHub organisation name as the ‘Team or User Account’ (again, depending on what is appropriate). Enter the Client ID and Client Secret that you noted down when registering the OAuth app. After you click ‘Add’ you will be taken to an authorisation page (you should authorise the app, and may be required to enter your password as part of this process). You will then be redirected to the DVCS accounts page in Jira, where you should now see your linked GitHub account.

The last step is to understand how to make sure your GitHub branches and pull requests appear in the corresponding Jira tickets. To ensure the link takes place, you must name your Git(Hub) branches as ‘<JiraProjectKey>-<Jira-Issue-Number>-normal-git-branch-name’. You can find your Jira Project Key (on a per project basis) by going to a Jira project and navigating to settings. You will see a details page similar to the following:

Jira project details page

So, for the above Jira project, an example branch name would be ‘LMS-5-my-new-feature’. As soon as you follow this naming convention for your branches you will see branches and pull requests appearing in your Jira tickets. Nice!
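If you want to generate or check branch names automatically (for example in a Git hook), the naming convention can be sketched in Python as below. The helper names are ours, and the pattern assumes a project key made of an uppercase letter followed by uppercase letters or digits, which matches the ‘LMS’ example but may not cover every Jira configuration.

```python
import re

# <JiraProjectKey>-<Jira-Issue-Number>-<normal-git-branch-name>
BRANCH_RE = re.compile(r"^(?P<key>[A-Z][A-Z0-9]*)-(?P<number>\d+)-(?P<name>.+)$")


def make_branch_name(project_key, issue_number, description):
    """Build a branch name that follows the naming convention."""
    # Lowercase the description and join its words with hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{project_key}-{issue_number}-{slug}"


def parse_branch_name(branch):
    """Return (project key, issue number), or None if the name does not match."""
    match = BRANCH_RE.match(branch)
    if match is None:
        return None
    return match.group("key"), int(match.group("number"))


print(make_branch_name("LMS", 5, "My new feature"))
print(parse_branch_name("LMS-5-my-new-feature"))
print(parse_branch_name("my-new-feature"))
```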