WebDriverAgent: Getting Started with Automated iOS Testing

By Matt Mercieca on 20 Apr 2017

The iOS simulator does not provide CoreMotion data. I wanted to visualize the jitter in the accelerometer and capture it on my computer for analysis. Not being able to simulate that, I needed a way of hooking my phone up to the computer and programmatically controlling it and capturing data from it.

Enter WebDriverAgent.

WebDriverAgent is an iOS WebDriver server provided by Facebook that can be used to remotely control iOS devices. WebDriver servers are compatible with programs like Appium or scripting languages that support web requests (Ruby, Python, and even Bash).

Here's how to set up WebDriverAgent and get started with controlling a connected iOS device.

Step 1: Make Sure Your Build Environment Is Sane

WebDriverAgent requires:

  • Xcode 8
    • Available for free from the Mac App Store. If you have written an iOS app, this is probably already on your system, but it is worth checking that you have the latest version.
  • A valid distribution certificate
    • WebDriverAgentRunner is installed on the target device or in the simulator. The certificate used must be trusted by the target device, but once installed, WebDriverAgentRunner can launch and control any application.
  • A properly set up xcode-select
    • I installed the command line tools after Xcode, so I ran into an issue where I couldn’t build. Running sudo xcode-select -s /Applications/Xcode.app/Contents/Developer fixed the problem (thanks to a helpful Stack Overflow answer).
  • The latest Node and NPM
    • These are used for building the inspector, among other things.
  • Carthage
    • Carthage is used to fetch all of the dependencies in bootstrap.sh.

Some other helpful tools:

  • Git
    • To clone the repository.
  • curl
    • I used this in Step 5 to make requests against the WebDriverAgentRunner.
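
As a quick sanity check of the lists above, here is a minimal sketch that reports which command line tools are not on the PATH. The function name missing_tools is made up for this example, and it only checks that each command exists, not its version or your signing setup.

```shell
# Print the name of every tool passed as an argument that is not on the PATH.
# (missing_tools is a hypothetical helper, not part of WebDriverAgent.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Prints nothing when every tool from the lists above is available.
missing_tools xcodebuild xcode-select node npm carthage git curl
```

If anything is printed, install that tool before moving on to Step 2.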

WebDriverAgent links directly against XCTest.framework, the testing library that ships with Xcode. So OS X alone is not enough: Xcode must also be installed on the system running WebDriverAgent.

Step 2: Get and Build WebDriverAgent

I cloned it directly from the GitHub repository.

If your build environment is set up, you should be able to run ./Scripts/bootstrap.sh from the WebDriverAgent directory and have it build.
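
The clone and bootstrap steps can be sketched as a small function. The repository URL below is an assumption (it is where I understand the project was hosted at the time of writing; it may have moved since), and fetch_and_bootstrap_wda is a name invented for this example.

```shell
# Assumption: override WDA_REPO_URL if the project lives somewhere else for you.
WDA_REPO_URL="${WDA_REPO_URL:-https://github.com/facebook/WebDriverAgent.git}"

# Clone the repository (unless the directory already exists), then run the
# bootstrap script from inside it, as described above.
fetch_and_bootstrap_wda() {
  dest="${1:-WebDriverAgent}"
  [ -d "$dest" ] || git clone "$WDA_REPO_URL" "$dest" || return 1
  ( cd "$dest" && ./Scripts/bootstrap.sh )
}

# Usage: fetch_and_bootstrap_wda            # clones into ./WebDriverAgent
```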

Step 3: Build and Install the Runner

Open WebDriverAgent.xcodeproj to set up the runner before building.

Select the provisioning certificate that is trusted on the target device.

Step 3A

  1. Select the project window in the navigator.
  2. Select the runner.
  3. Make sure the runner is signed.
  4. Select your team.

Now the project can be built (⌘B).

Once built, select the WebDriverAgentRunner scheme if it is not already selected.

Step 3B

At this point, the runner is ready to be launched by running the tests (⌘U).

Step 3C

Make sure the device is unlocked before running the tests. If all is well, the screen on the target device will go black and then return to the application screen. You’ll see the newly installed WebDriverAgentRunner. The server is running while the tests run.
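
If you would rather not keep Xcode in the foreground, the same test run can probably be started from a terminal with xcodebuild. This is a sketch, not what I did above: it assumes signing is already configured as in Step 3A, that you run it from the WebDriverAgent directory, and that you pass your own device identifier (run_wda and the UDID argument are placeholders for this example).

```shell
# Run the WebDriverAgentRunner tests on the device with the given UDID.
# The server stays up for as long as this command keeps running.
run_wda() {
  udid="$1"
  xcodebuild -project WebDriverAgent.xcodeproj \
    -scheme WebDriverAgentRunner \
    -destination "id=$udid" \
    test
}

# Usage: run_wda <your-device-udid>
```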

Step 4: Request Setup

Next, you will need the URL of the server. It is printed as debug output in Xcode's debug console.

Activate the debug console

Get the server URL

The /inspector endpoint has some useful information. To access it, point a web browser at [SERVER_URL]/inspector. On my system that is http://10.43.0.30/inspector.

The inspector endpoint

The /inspector endpoint is the only one I've found that's compatible with a browser out of the box. The information shown there is an amalgam of information available from other endpoints, but those endpoints are not browser accessible: some headers must be set to make requests to them.

Step 5: Explore

Now the WebDriverAgent is running on the phone and is ready to accept commands.

To get familiar with how to set up the automation, I found it easiest to use the command line with curl.

Setting Up curl

Before I could do that, I set up some environment variables to save me some time.

Specifically:

DEVICE_URL=[your device url]

And

JSON_HEADER='-H "Content-Type: application/json"'

Setting up DEVICE_URL and JSON_HEADER
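
A variable that contains its own quotes depends on how your shell splits words when it expands the variable, which differs between shells. An alternative sketch is to wrap curl in two small functions so the header is always passed as a single argument. The names wda_get and wda_post are made up for this example; DEVICE_URL is the same variable as above.

```shell
# GET a WebDriverAgent endpoint, e.g. wda_get /status
wda_get() {
  curl -sS -X GET -H "Content-Type: application/json" "$DEVICE_URL$1"
}

# POST a JSON body to an endpoint, e.g. wda_post /session '{"some":"json"}'
wda_post() {
  curl -sS -X POST -H "Content-Type: application/json" -d "$2" "$DEVICE_URL$1"
}
```

The rest of this article keeps using the plain curl commands, so either approach works.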

Getting the Device Status and a Screenshot

With curl set up, we can start hitting endpoints. /status can be used to get the sessionId and verify that everything is working.

curl -X GET $JSON_HEADER $DEVICE_URL/status

`/status` output
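
When scripting, it can help to turn that check into a yes/no answer. The sketch below relies only on curl's exit code; wda_ready is a hypothetical helper name and the five second timeout is an arbitrary choice.

```shell
# Succeeds (exit status 0) if the server answers /status without an HTTP
# error within five seconds; fails otherwise.
wda_ready() {
  curl -sS -f -o /dev/null --max-time 5 "$DEVICE_URL/status"
}

# Usage:
#   if wda_ready; then echo "server is up"; else echo "server not reachable"; fi
```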

The screenshot shown by the /inspector endpoint is readily available.

curl -X GET $JSON_HEADER $DEVICE_URL/screenshot

`/screenshot` output

The image comes back as a Base64-encoded string inside the JSON response, so viewing it requires a program that can decode it.
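
One way to decode it from the command line is sketched below. It assumes the image data sits in a "value" field on a single line of the response and is Base64 encoded; decode_screenshot is a name invented for this example. Some JSON encoders escape / as \/, so backslashes are stripped before decoding.

```shell
# Read a /screenshot response on stdin and write the decoded image to $1.
decode_screenshot() {
  sed -n 's/.*"value"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    tr -d '\\' |
    base64 --decode > "$1"
}

# Usage:
#   curl -sS -H "Content-Type: application/json" "$DEVICE_URL/screenshot" |
#     decode_screenshot screen.png
```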

Opening an Application

Launching an application is done by posting to the /session endpoint. I found this easiest to do using the application's bundle identifier, or bundleId. For your own applications, the bundle identifier is available from the Workspace settings screen for your project.

To launch Safari on the phone, for example, I used:

curl -X POST $JSON_HEADER -d "{\"desiredCapabilities\":{\"bundleId\":\"com.apple.mobilesafari\"}}" $DEVICE_URL/session

`/session` output: launching Safari

The sessionId will be useful later.

Saving the sessionId
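
Launching an app and saving the sessionId can be combined into one step. This is a sketch under a few assumptions: the response contains a "sessionId" field with a quoted string value, the bundle identifier contains no characters that need JSON escaping, and the helper names session_payload and start_session are invented here.

```shell
# Build the JSON body for /session without hand-escaping every quote.
session_payload() {
  printf '{"desiredCapabilities":{"bundleId":"%s"}}' "$1"
}

# Launch the app with the given bundle identifier and print the sessionId
# found in the response (first match only).
start_session() {
  curl -sS -X POST -H "Content-Type: application/json" \
    -d "$(session_payload "$1")" "$DEVICE_URL/session" |
    sed -n 's/.*"sessionId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Usage: SESSION_ID=$(start_session com.apple.mobilesafari)
```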

Issuing Commands (Clicks, Typing, and More)

A screenshot isn't the most friendly way to get information about and interact with a user interface. Fortunately the /source endpoint provides an XML description of what is on the screen.

curl -X GET $JSON_HEADER $DEVICE_URL/source

`/source` output

Automated testing wouldn't be very useful without being able to issue commands to the user interface. So far we have launched an application, but not interacted with it in any way.

A basic action would be to click on an element on the screen. The endpoint for that is /click, but to use it we need the ID of the element we want to activate.

That elementId can be searched for. Being in Safari, I can search for a link to click on using the /elements endpoint. If I want to click on the link to go to the Mutually Human Team page, I can search for that text.

curl -X POST $JSON_HEADER -d "{\"using\":\"partial link text\",\"value\":\"label=Team\"}" $DEVICE_URL/session/$SESSION_ID/elements

Using `/elements` to find an element ID
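
Copying the ID out of the response by hand gets old quickly. The sketch below pulls every element ID out of an /elements response, assuming each match is reported under an "ELEMENT" key (that key name is my assumption about the response format, so check it against your own output). extract_element_ids is a hypothetical helper name.

```shell
# Read an /elements response on stdin and print one element ID per line.
extract_element_ids() {
  grep -o '"ELEMENT"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/.*:[[:space:]]*"\([^"]*\)"$/\1/'
}

# Usage:
#   ELEMENT_ID=$(curl -sS -X POST -H "Content-Type: application/json" \
#     -d '{"using":"partial link text","value":"label=Team"}' \
#     "$DEVICE_URL/session/$SESSION_ID/elements" | extract_element_ids | head -n 1)
```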

Once the elementId is known, the command to click on it can be issued.

curl -X POST $JSON_HEADER -d "" $DEVICE_URL/session/$SESSION_ID/element/5ADA18CD-8872-4B11-8953-4B61E988F440/click

Using `/click` to click on an element.
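
With the session and element IDs held in variables, the click can be wrapped in a small function so the long URL is built in one place. click_element is a name made up for this sketch; it sends the same empty POST body as the command above.

```shell
# Click the element with ID $2 in session $1.
click_element() {
  curl -sS -X POST -H "Content-Type: application/json" -d "" \
    "$DEVICE_URL/session/$1/element/$2/click"
}

# Usage: click_element "$SESSION_ID" "$ELEMENT_ID"
```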

The /elements endpoint can be used to search by other criteria.

  • class name can be used to find buttons (XCUIElementTypeButton).
  • predicate string can be used to find elements by their properties.
  • xpath from the /source endpoint can also be used.
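
The strategies above all use the same request body shape, so a small builder keeps the quoting manageable, especially for predicate strings that contain their own quotes. This is a sketch: json_escape only handles backslashes and double quotes, find_payload is a hypothetical helper, and the example search values are illustrations rather than tested queries.

```shell
# Escape backslashes and double quotes so a string can sit inside JSON quotes.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the body for /elements: find_payload <strategy> <value>
find_payload() {
  printf '{"using":"%s","value":"%s"}' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Example bodies (pass one to curl with -d):
#   find_payload 'class name' 'XCUIElementTypeButton'
#   find_payload 'predicate string' 'label == "Team"'
#   find_payload 'xpath' '//XCUIElementTypeButton'
```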

Next Steps

The WebDriver Server implementation provided by WebDriverAgent should work with a number of testing frameworks. Anything that supports web requests will work. Swift Selenium bindings are under development. WebDriverAgent should be flexible enough to help you automate testing of your iOS app.
