The iOS simulator does not provide CoreMotion data. I wanted to visualize the jitter
in the accelerometer and capture it on my computer for analysis. Not being able to
simulate that, I needed a way of hooking my phone up to the computer and programmatically
controlling it and capturing data from it.

Enter WebDriverAgent.

WebDriverAgent is an iOS WebDriver server provided by Facebook that can be used to
remotely control iOS devices. A WebDriver server speaks HTTP, so it works with tools like
Appium and with any scripting language that can make web requests
(Ruby, Python, and even Bash).

Here’s how to set up WebDriverAgent and get started with controlling a connected iOS device.

Step 1: Make Sure Your Build Environment Is Sane

WebDriverAgent requires:

  • Xcode 8
    • Available for free from the App Store. If you have written an iOS app, this is probably already on your
      system. It is worth checking that you have the latest version.
  • A valid distribution certificate
    • WebDriverAgentRunner is installed on the target device or in the simulator. The certificate
      used must be trusted by the target device, but once installed WebDriverAgentRunner can launch and
      control any application.
  • A properly set up xcode-select
    • I installed the command line tools after Xcode, so I ran into an issue where I couldn’t build.
      Running sudo xcode-select -s /Applications/Xcode.app/Contents/Developer fixed the
      problem (from a helpful StackOverflow article).
  • The latest Node and NPM
    • These are used for building the inspector, among other things.
  • Carthage
    • Carthage is used to fetch all of the dependencies in bootstrap.sh.

Some other helpful tools:

  • Git
    • To clone the repository.
  • curl
    • I used this in Step 5 to make requests against the WebDriverAgentRunner.

WebDriverAgent links directly against XCTest.framework, a testing library that ships with Xcode. So not
only is OS X required, Xcode must also be installed on the system running WebDriverAgent.
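
The checklist above can be verified with a short script before attempting a build. This is a minimal sketch in Python, assuming each tool is reachable on PATH under its usual command name; the lists of names are my reading of the requirements above, not something WebDriverAgent ships.

```python
import shutil

# Command names assumed from the requirements listed above.
REQUIRED_TOOLS = ["xcodebuild", "xcode-select", "node", "npm", "carthage"]
OPTIONAL_TOOLS = ["git", "curl"]


def find_tools(names):
    """Map each command name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the command names that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    for name in missing_tools(REQUIRED_TOOLS):
        print(f"missing required tool: {name}")
    for name in missing_tools(OPTIONAL_TOOLS):
        print(f"missing optional tool: {name}")
```

The script only checks that the commands exist. It does not check versions or whether xcode-select points at the right developer directory.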

Step 2: Get and Build WebDriverAgent

I cloned it directly from the GitHub repository.

If your build environment is set up, you should be able to run ./Scripts/bootstrap.sh from the
WebDriverAgent directory and have it build.

Step 3: Build and Install the Runner

Open WebDriverAgent.xcodeproj to set up the runner before building.

Select the provisioning certificate that is trusted on the target device.


Step 3A

  1. Select the project window in the navigator.
  2. Select the runner.
  3. Make sure the runner is signed.
  4. Select your team.

Now the project can be built (⌘B).

Once built, select the WebDriverAgentRunner scheme if it is not already selected.

Step 3B

At this point, the runner is ready to be launched by running the tests (⌘U).

Step 3C

Make sure the device is unlocked before running the tests. If all is well, the screen on the
target device will go black and then return to the application screen. You’ll see the newly
installed WebDriverAgentRunner. The server is running while the tests run.
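
The same test run can probably be started without the Xcode window by calling xcodebuild. The sketch below only builds the argument list, assuming it is run from the WebDriverAgent directory and that signing has already been configured as in Step 3A. The UDID shown is a placeholder, and the helper name runner_test_command is mine.

```python
import subprocess


def runner_test_command(udid, project="WebDriverAgent.xcodeproj"):
    """Build the xcodebuild argument list that runs the WebDriverAgentRunner tests."""
    return [
        "xcodebuild",
        "-project", project,
        "-scheme", "WebDriverAgentRunner",
        "-destination", f"id={udid}",
        "test",
    ]


if __name__ == "__main__":
    # "YOUR-DEVICE-UDID" is a placeholder; substitute the identifier of your device.
    command = runner_test_command("YOUR-DEVICE-UDID")
    print(" ".join(command))
    # Uncomment to launch the runner; the call blocks while the server is running.
    # subprocess.run(command, check=True)
```

As with ⌘U, the server should only be reachable while the test run is active.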

Step 4: Request Setup

Next, you will need the URL of the server. It is displayed as debug output.

Activate the debug console


Get the server URL
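
If the debug output has been saved to a file, the URL can be pulled out with a regular expression instead of by eye. This sketch assumes the runner wraps the address in ServerURLHere-> and <-ServerURLHere markers, which is what I have seen in its log. If the markers are missing, it falls back to the first http URL it finds. The sample line and its address are made up for illustration.

```python
import re

# Assumed marker format; adjust if your log output looks different.
MARKER_PATTERN = re.compile(r"ServerURLHere->(.*?)<-ServerURLHere")
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")

# Illustrative log line, not captured from a real run.
SAMPLE_LOG = "XCTRunner[123:456] ServerURLHere->http://192.0.2.10:8100<-ServerURLHere"


def server_url_from_log(log_text):
    """Return the server URL found in the log text, or None if there is none."""
    marked = MARKER_PATTERN.search(log_text)
    if marked:
        return marked.group(1)
    plain = URL_PATTERN.search(log_text)
    return plain.group(0) if plain else None


if __name__ == "__main__":
    print(server_url_from_log(SAMPLE_LOG))
```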

The /inspector endpoint has some useful information. To access it, point a web browser at
[SERVER_URL]/inspector. On my system that is http://10.43.0.30/inspector.

The inspector endpoint

The /inspector endpoint is the only one I’ve found that’s compatible with a browser out
of the box. The information shown there is an amalgam of information available from other
endpoints, but those endpoints are not browser accessible: some headers must be set to
make requests to them.

Step 5: Explore

Now the WebDriverAgent is running on the phone and is ready to accept commands.

To get familiar with how to set up the automation, I found it easiest to use the command line
with curl.

Setting Up curl

Before I could do that, I set up a couple of shell variables to save some time.

Specifically:

DEVICE_URL=[your device url]

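And

JSON_HEADER='-HContent-Type:application/json'

The value contains no spaces or inner quotes, so the unquoted $JSON_HEADER used in the commands
below expands to a single, well-formed curl argument.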

Setting up DEVICE_URL and JSON_HEADER
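
For scripting beyond one-off commands, the same setup can be mirrored in Python. This is a sketch using only the standard library. It reads DEVICE_URL from the environment, and the fallback address is a placeholder. The helper names build_request and send are mine, not part of WebDriverAgent.

```python
import json
import os
import urllib.request

# Mirrors the DEVICE_URL shell variable; the fallback address is a placeholder.
DEVICE_URL = os.environ.get("DEVICE_URL", "http://192.0.2.10:8100")


def build_request(path, payload=None, base_url=DEVICE_URL):
    """Create a JSON request: GET when there is no payload, POST otherwise."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="GET" if payload is None else "POST",
    )


def send(request, timeout=10):
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a device connected, send(build_request("/status")) plays the role of the first curl command in the next section. Passing an empty dictionary as the payload produces a POST with an empty JSON object as its body.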

Getting the Device Status and a Screenshot

With curl set up, we can start hitting endpoints. The /status endpoint returns the sessionId and
confirms that everything is working.

curl -X GET $JSON_HEADER $DEVICE_URL/status

`/status` output
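
To pick the sessionId out of that output programmatically, the response can be parsed as JSON. The sketch assumes the sessionId sits at the top level of the response body. The sample body is invented for illustration and is much shorter than what the device really returns.

```python
import json

# Illustrative response body; a real one carries more fields.
SAMPLE_STATUS = """
{
  "value": {"state": "success"},
  "sessionId": "00000000-0000-0000-0000-000000000000",
  "status": 0
}
"""


def session_id_from(response_text):
    """Return the top-level sessionId from a response body, or None if absent."""
    body = json.loads(response_text)
    return body.get("sessionId")


if __name__ == "__main__":
    print(session_id_from(SAMPLE_STATUS))
```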

The screenshot shown by the /inspector endpoint is readily available.

curl -X GET $JSON_HEADER $DEVICE_URL/screenshot

`/screenshot` output

The image comes back as an encoded string inside the JSON response (Base64, per the WebDriver
protocol), so it has to be decoded before it can be viewed.
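
Decoding the screenshot takes only a few lines. This sketch assumes the response carries the image as a Base64 string in a top-level "value" field, which is how the WebDriver protocol describes screenshots. The function names and the default file name are my own choices.

```python
import base64
import json


def screenshot_bytes(response_text):
    """Decode the Base64 string in the response's "value" field into raw image bytes."""
    body = json.loads(response_text)
    return base64.b64decode(body["value"])


def save_screenshot(response_text, path="screenshot.png"):
    """Write the decoded image to disk and return the path that was written."""
    with open(path, "wb") as handle:
        handle.write(screenshot_bytes(response_text))
    return path
```

Redirecting the curl output to a file and passing its contents to save_screenshot should produce an image that any viewer can open.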

Opening an Application

Launching an application is done by posting to the /session endpoint. I found this easiest
to do using the application’s bundle identifier, or bundleId. For your own applications, the
bundle identifier is available from the Workspace settings screen for your project.

To launch Safari on the phone, for example, I used:

curl -X POST $JSON_HEADER -d '{"desiredCapabilities":{"bundleId":"com.apple.mobilesafari"}}' $DEVICE_URL/session

`/session` output: launching Safari
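
Nested quotes are the easy thing to get wrong in that request body. One way around it is to let a JSON library build the payload and a shell-quoting helper wrap it. This is a sketch with helper names of my own. The payload shape is the one used in the command above.

```python
import json
import shlex


def launch_payload(bundle_id):
    """Serialize the body for POST /session that launches the given bundle."""
    return json.dumps({"desiredCapabilities": {"bundleId": bundle_id}})


def curl_data_argument(payload):
    """Quote a JSON payload so it can be pasted after curl's -d flag in a POSIX shell."""
    return shlex.quote(payload)


if __name__ == "__main__":
    print(curl_data_argument(launch_payload("com.apple.mobilesafari")))
```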

The sessionId will be useful later.

Saving the sessionId

Issuing Commands (Clicks, Typing, and More)

A screenshot isn’t the most friendly way to get information about and interact with a user
interface. Fortunately the /source endpoint provides an XML description of what is on
the screen.

curl -X GET $JSON_HEADER $DEVICE_URL/source

`/source` output
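
Once the XML is in hand, it can be walked with a standard parser to list what is on screen. This sketch assumes the XML has already been extracted from the response as a string and that elements expose a label attribute. The sample tree is invented and far smaller than a real Safari screen.

```python
import xml.etree.ElementTree as ET

# Illustrative tree; real output is much larger and carries more attributes.
SAMPLE_SOURCE = """
<XCUIElementTypeApplication type="XCUIElementTypeApplication" name="Safari" label="Safari">
  <XCUIElementTypeWindow type="XCUIElementTypeWindow">
    <XCUIElementTypeButton type="XCUIElementTypeButton" name="Share" label="Share"/>
    <XCUIElementTypeLink type="XCUIElementTypeLink" name="Team" label="Team"/>
  </XCUIElementTypeWindow>
</XCUIElementTypeApplication>
"""


def labeled_elements(source_xml):
    """Return (tag, label) pairs for every element that carries a non-empty label."""
    root = ET.fromstring(source_xml.strip())
    return [(node.tag, node.get("label")) for node in root.iter() if node.get("label")]


if __name__ == "__main__":
    for tag, label in labeled_elements(SAMPLE_SOURCE):
        print(f"{tag}: {label}")
```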

Automated testing wouldn’t be very useful without being able to issue commands to the user interface.
So far we have launched an application, but not interacted with it in any way.

A basic action is to click on an element on the screen. The endpoint for that is /click,
but using it requires the ID of the element we want to activate.

That elementId can be searched for. Being in Safari, I can search for a link to click on using
the /elements endpoint. If I want to click on the link to go to the Mutually Human Team page,
I can search for that text.

curl -X POST $JSON_HEADER -d '{"using":"partial link text","value":"label=Team"}' $DEVICE_URL/session/$SESSION_ID/elements

Using `/elements` to find an element ID

Once the elementId is known, the command to click on it can be issued.

curl -X POST $JSON_HEADER -d "" $DEVICE_URL/session/$SESSION_ID/element/5ADA18CD-8872-4B11-8953-4B61E988F440/click

Using `/click` to click on an element.
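
Copying the element ID by hand gets old quickly, so the search and the click can be chained in a script. This sketch assumes the /elements response holds a list under "value" whose entries carry the ID under an "ELEMENT" key. It also accepts the longer W3C key in case a newer server uses it. The helper names are mine.

```python
import json

# Key defined by the W3C WebDriver specification for element references.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def element_ids(response_text):
    """Collect element IDs from an /elements response body."""
    value = json.loads(response_text).get("value")
    if not isinstance(value, list):
        return []
    ids = []
    for entry in value:
        element_id = entry.get("ELEMENT") or entry.get(W3C_ELEMENT_KEY)
        if element_id:
            ids.append(element_id)
    return ids


def click_path(session_id, element_id):
    """Build the path of the click endpoint for one element in one session."""
    return f"/session/{session_id}/element/{element_id}/click"
```

Appending the result of click_path to DEVICE_URL gives the same URL as the curl command above.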

The /elements endpoint can be used to search by other criteria.

  • class name can be used to find buttons (XCUIElementTypeButton).
  • predicate string can be used to find elements by their properties.
  • xpath from the /source endpoint can also be used.
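
The three strategies above differ only in the "using" and "value" fields of the request body. The sketch below builds one body per strategy. The values are illustrative guesses at what would match a "Team" link or a button, and the exact attribute names a predicate accepts may differ on your version of WebDriverAgent.

```python
import json

# Illustrative search values; adjust them to the /source output of your screen.
EXAMPLE_SEARCHES = {
    "class name": "XCUIElementTypeButton",
    "predicate string": "label CONTAINS 'Team'",
    "xpath": "//XCUIElementTypeLink[@label='Team']",
}


def search_payload(using, value):
    """Serialize the body for POST /session/<id>/elements."""
    return json.dumps({"using": using, "value": value})


if __name__ == "__main__":
    for using, value in EXAMPLE_SEARCHES.items():
        print(search_payload(using, value))
```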

Next Steps

The WebDriver Server implementation provided by WebDriverAgent should work with a number
of testing frameworks. Anything that supports web requests will work. Swift Selenium bindings
are under development. WebDriverAgent should be flexible enough to help you automate testing of your
iOS app.
