The iOS simulator does not provide CoreMotion data. I wanted to visualize the jitter
in the accelerometer and capture it on my computer for analysis. Not being able to
simulate that, I needed a way of hooking my phone up to the computer and programmatically
controlling it and capturing data from it.
Enter WebDriverAgent.
WebDriverAgent is an iOS WebDriver server provided by Facebook that can be used to
remotely control iOS devices. WebDriver servers work with programs like
Appium and with any scripting language that can make web requests
(Ruby, Python, and even Bash).
Here’s how to set up WebDriverAgent and get started with controlling a connected iOS device.
Step 1: Make Sure Your Build Environment Is Sane
WebDriverAgent requires:
- Xcode 8
  - Available for free from the Mac App Store. If you wrote an iOS app, this is probably already on your
    system. It is worth checking that you have the latest version.
- A valid distribution certificate
  - WebDriverAgentRunner is installed on the target device or in the simulator. The certificate
    used must be trusted by the target device, but once installed WebDriverAgentRunner can launch and
    control any application.
- A properly set up xcode-select
  - I installed the command line tools after Xcode, so I ran into an issue where I couldn’t build.
    Running sudo xcode-select -s /Applications/Xcode.app/Contents/Developer fixed the
    problem (from a helpful Stack Overflow article).
- The latest Node and NPM
  - These are used for building the inspector, among other things.
- Carthage
  - Carthage is used to fetch all of the dependencies in bootstrap.sh.
Some other helpful tools:
- Git
  - To clone the repository.
- curl
  - I used this in Step 5 to make requests against WebDriverAgentRunner.
WebDriverAgent links directly against XCTest.framework, the testing library that ships with Xcode.
So not only is OS X required, Xcode must also be installed on the system running WebDriverAgent.
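Before building, it can save time to confirm that the tools listed above are actually on the PATH. The following is a minimal sketch and not part of WebDriverAgent: the function names missing_tools and check_wda_environment are made up for illustration, and the tool list simply mirrors the requirements in this step.

```shell
missing_tools() {
    # Print each named tool that is not found on the PATH, one per line.
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

check_wda_environment() {
    # The tool list is an assumption based on the requirements above.
    missing=$(missing_tools xcodebuild xcode-select node npm carthage git curl)
    if [ -n "$missing" ]; then
        printf 'Missing tools:\n%s\n' "$missing"
        return 1
    fi
    # Show which developer directory and which Xcode version are active.
    xcode-select -p
    xcodebuild -version
}
```

Running check_wda_environment on the Mac prints any missing tools, or the active developer directory and Xcode version when everything is present. If the developer directory points at the command line tools rather than Xcode.app, the xcode-select fix mentioned above applies.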
Step 2: Get and Build WebDriverAgent
I cloned it directly from the GitHub repository.
If your build environment is set up, you should be able to run ./Scripts/bootstrap.sh from the
WebDriverAgent directory and have it build.
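Put together, the clone and bootstrap steps might look like the sketch below. The repository URL is an assumption (Facebook's WebDriverAgent repository on GitHub), and fetch_and_bootstrap_wda is a hypothetical helper name, not something the project provides.

```shell
# Assumption: the Facebook repository on GitHub.
WDA_REPO="https://github.com/facebook/WebDriverAgent.git"

fetch_and_bootstrap_wda() (
    # $1: directory to clone into (defaults to WebDriverAgent).
    # The body runs in a subshell, so the caller's working directory is unchanged.
    target_dir="${1:-WebDriverAgent}"
    git clone "$WDA_REPO" "$target_dir" &&
        cd "$target_dir" &&
        ./Scripts/bootstrap.sh
)
```

Each step only runs if the previous one succeeded, so a failed clone does not try to run a bootstrap script that is not there.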
Step 3: Build and Install the Runner
Open WebDriverAgent.xcodeproj to set up the runner before building.
Select the provisioning certificate that is trusted on the target device.
- Select the project in the navigator.
- Select the WebDriverAgentRunner target.
- Make sure the runner is signed.
- Select your team.
Now the project can be built (⌘B).
Once built, select the WebDriverAgentRunner scheme if it is not already selected.
At this point, the runner is ready to be launched by running the tests (⌘U).
Make sure the device is unlocked before running the tests. If all is well, the screen on the
target device will go black and then return to the application screen. You’ll see the newly
installed WebDriverAgentRunner. The server is running while the tests run.
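The same test run can also be started from a terminal, which is handy when scripting. This is a sketch that assumes the project and scheme names used above and that it is run from the WebDriverAgent directory; run_wda_runner is a hypothetical name and the device UDID is a placeholder you supply.

```shell
run_wda_runner() {
    # $1: UDID of the connected device (placeholder; substitute your own).
    # Builds and runs the WebDriverAgentRunner tests, which start the server.
    xcodebuild -project WebDriverAgent.xcodeproj \
        -scheme WebDriverAgentRunner \
        -destination "id=$1" \
        test
}
```

As with ⌘U, the server stays up only while xcodebuild keeps the tests running.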
Step 4: Request Setup
Next, you will need the URL of the server. It is displayed as debug output while the tests run.
The /inspector endpoint has some useful information. To access it, point a web browser at
[SERVER_URL]/inspector. On my system that is http://10.43.0.30/inspector.
The /inspector endpoint is the only one I’ve found that’s compatible with a browser out
of the box. The information shown there is an amalgam of information available from other
endpoints, but those endpoints are not browser accessible: some headers must be set to
make requests to them.
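If the test output is captured to a file or a pipe, the server URL can be pulled out of it with a small filter. The sketch below assumes the debug line wraps the URL between ServerURLHere-> and <-ServerURLHere markers; check your own output and adjust the pattern if it looks different. The function name extract_server_url is made up for this example.

```shell
extract_server_url() {
    # Reads log text on stdin and prints the first URL found between the
    # assumed "ServerURLHere->" and "<-ServerURLHere" markers.
    sed -n 's/.*ServerURLHere->\(.*\)<-ServerURLHere.*/\1/p' | head -n 1
}

# Example usage (wda.log is a hypothetical file holding the captured output):
# DEVICE_URL=$(extract_server_url < wda.log)
```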
Step 5: Explore
Now WebDriverAgent is running on the phone and is ready to accept commands.
To get familiar with how to set up the automation, I found it easiest to use the command line
with curl.
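Setting Up curl
Before I could do that, I set up some environment variables to save me some time.
Specifically:
DEVICE_URL=[your device url]
And
JSON_HEADER='-H Content-Type:application/json'
The header is written without spaces on purpose: the commands below expand $JSON_HEADER unquoted and
rely on the shell splitting it into -H and the header itself. bash does this; zsh does not split
unquoted variables by default.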
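An alternative to environment variables is to wrap curl in two small functions, so the header and device URL do not have to be repeated and quoting stays predictable in any shell. This is only a sketch: wda_get and wda_post are hypothetical names, and the default DEVICE_URL is the address from my setup, used here as a placeholder.

```shell
# Placeholder: replace with the server URL from your own debug output.
DEVICE_URL="${DEVICE_URL:-http://10.43.0.30}"

wda_get() {
    # $1: endpoint path, for example /status
    curl --silent -X GET -H "Content-Type: application/json" "$DEVICE_URL$1"
}

wda_post() {
    # $1: endpoint path, $2: JSON request body
    curl --silent -X POST -H "Content-Type: application/json" -d "$2" "$DEVICE_URL$1"
}
```

With these defined, wda_get /status is equivalent to the /status request in the next section.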
Getting the Device Status and a Screenshot
With curl set up, we can start hitting endpoints. /status can be used to get the sessionId and verify
that everything is working.
curl -X GET $JSON_HEADER $DEVICE_URL/status
The screenshot shown by the /inspector endpoint is also available directly from the /screenshot endpoint.
curl -X GET $JSON_HEADER $DEVICE_URL/screenshot
The image is returned as encoded text inside a JSON response, so viewing it requires a program capable of decoding it.
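One way to turn the response into an image file is sketched below. It assumes the image arrives as a Base64 string in a "value" field, which is how WebDriver screenshots are usually returned, and that forward slashes may be escaped as \/ in the JSON. The function name decode_screenshot is made up, and the sed pattern is a shortcut rather than a real JSON parser.

```shell
decode_screenshot() {
    # stdin: JSON response from /screenshot; $1: output file name.
    # Assumption: the image is a Base64 string stored under "value".
    sed -n 's/.*"value"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
        tr -d '\\' |
        base64 --decode > "$1"
}

# Example usage (requires a running WebDriverAgent):
# curl -s -X GET -H "Content-Type: application/json" "$DEVICE_URL/screenshot" \
#     | decode_screenshot screen.png
```

The tr step removes the backslashes from any \/ escapes before decoding, since a backslash is never part of Base64 data.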
Opening an Application
Launching an application is done by posting to the /session endpoint. I found this easiest
to do using the application’s bundle identifier, or bundleId. For your own applications, the
bundle identifier is available from the Workspace settings screen for your project.
To launch Safari on the phone, for example, I used:
curl -X POST $JSON_HEADER -d '{"desiredCapabilities":{"bundleId":"com.apple.mobilesafari"}}' $DEVICE_URL/session
The response includes a sessionId, which will be useful later. The commands below assume it has been
saved in a SESSION_ID variable.
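Rather than copying the sessionId by hand, it can be captured straight from the response. The sketch below assumes the response contains a "sessionId" key with a quoted string value; extract_session_id is a hypothetical helper, and a proper JSON tool would be more robust than this sed pattern.

```shell
extract_session_id() {
    # Reads a JSON response on stdin and prints the sessionId it contains
    # (the first matching line wins).
    sed -n 's/.*"sessionId"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example usage (requires a running WebDriverAgent):
# SESSION_ID=$(curl -s -X POST -H "Content-Type: application/json" \
#     -d '{"desiredCapabilities":{"bundleId":"com.apple.mobilesafari"}}' \
#     "$DEVICE_URL/session" | extract_session_id)
```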
Issuing Commands (Clicks, Typing, and More)
A screenshot isn’t the most friendly way to get information about and interact with a user
interface. Fortunately the /source endpoint provides an XML description of what is on
the screen.
curl -X GET $JSON_HEADER $DEVICE_URL/source
Automated testing wouldn’t be very useful without being able to issue commands to the user interface.
So far we have launched an application, but not interacted with it in any way.
A basic action would be to click on an element on the screen. The endpoint to do that is /click,
but to use it we need the ID of the element we want to activate.
That elementId can be searched for. Being in Safari, I can search for a link to click on using
the /elements endpoint. If I want to click on the link to go to the Mutually Human Team
page, I can search for that text.
curl -X POST $JSON_HEADER -d '{"using":"partial link text","value":"label=Team"}' $DEVICE_URL/session/$SESSION_ID/elements
Once the elementId is known, the command to click on it can be issued. The ID below is the one my
search returned; yours will differ.
curl -X POST $JSON_HEADER -d "" $DEVICE_URL/session/$SESSION_ID/element/5ADA18CD-8872-4B11-8953-4B61E988F440/click
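The search and the click can be chained so the element ID never has to be pasted by hand. This sketch assumes each match in the search response carries its ID under an "ELEMENT" key; both function names are hypothetical. If several matches share a line in the response, which one gets printed depends on how the JSON is laid out, so it works best when the search is narrowed to a single match.

```shell
extract_element_id() {
    # Reads a search response on stdin and prints one matching element id.
    # Assumption: ids are stored as "ELEMENT":"<id>".
    sed -n 's/.*"ELEMENT"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

click_url() {
    # $1: device URL, $2: session id, $3: element id
    printf '%s/session/%s/element/%s/click\n' "$1" "$2" "$3"
}

# Example usage (requires a running WebDriverAgent and an open session):
# ELEMENT_ID=$(curl -s -X POST -H "Content-Type: application/json" \
#     -d '{"using":"partial link text","value":"label=Team"}' \
#     "$DEVICE_URL/session/$SESSION_ID/elements" | extract_element_id)
# curl -s -X POST -H "Content-Type: application/json" -d "" \
#     "$(click_url "$DEVICE_URL" "$SESSION_ID" "$ELEMENT_ID")"
```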
The /elements endpoint can be used to search by other criteria.
- class name can be used to find buttons (XCUIElementTypeButton).
- predicate string can be used to find elements by their properties.
- xpath from the /source endpoint can also be used.
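Request bodies for these strategies all have the same two fields, so a small helper can build them and take care of quoting. This is a sketch: json_escape and elements_query are made-up names, the escaping only covers backslashes and double quotes, and the predicate and xpath values in the usage comments are illustrative guesses rather than tested queries.

```shell
json_escape() {
    # Escape backslashes and double quotes so the text can sit inside a JSON string.
    printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

elements_query() {
    # $1: search strategy, $2: search value
    printf '{"using":"%s","value":"%s"}\n' "$(json_escape "$1")" "$(json_escape "$2")"
}

# Example usage (the values are illustrative):
# elements_query "class name" "XCUIElementTypeButton"
# elements_query "predicate string" 'label == "Team"'
# elements_query "xpath" "//XCUIElementTypeLink"
```

The output can be passed to curl with -d "$(elements_query ...)" in place of a hand-written body.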
Next Steps
The WebDriver Server implementation provided by WebDriverAgent should work with a number
of testing frameworks. Anything that supports web requests will work. Swift Selenium bindings
are under development. WebDriverAgent should be flexible enough to help you automate testing of your
iOS app.