I’ve been working on an Ember.js application for a client that started life as a Rails application some years ago. When we adopted Ember we already had a good number of existing full-stack tests in place. As our application has grown we’ve added all kinds of new tests, including many new full-stack tests. We’ve also removed many of the original full-stack tests and replaced them with a combination of tests that exercise the client and server separately.
For the purposes of this post I’ll define full-stack tests as tests that run both the client and the server and talk to your application via the browser. My example application uses Capybara, but there are similar projects for most languages and frameworks.
In this post I’ll go over some of the pros and cons I’ve experienced using full-stack tests. I’ll also give a basic methodology for replacing a full-stack test with a combination of client and server tests.
Full Stack Tests: The Good
Full-stack tests promote thinking about the app from user’s point of view. With a few assertions they can exercise all the disparate pieces of the system to confirm that they’re properly working in concert. For features like websocket interactions, a single full-stack smoke test is almost essential to confirm that things are going as planned, even if the individual parts of the interaction are also covered by unit tests.
Most of the projects I work on are Rails applications that use Capybara for testing. In that case, we also have the luxury of using test factories from the Rails application to build test data for the full-stack tests (thank you FactoryGirl). Once we have some good test factories and page objects in place, new tests are quick to write and provide a lot of coverage. With a mature client framework like Ember, however, I find myself taking advantage of the separation between client and server and writing fewer full-stack tests, or even removing existing ones.
Full stack tests are great for:
- Quickly getting some test coverage for the whole application
- Testing complicated interactions between front end and back end
- Leveraging existing test code on the server
Full Stack Tests: The Bad
Why would I remove a full-stack test? Isn’t that coverage a good thing? Sometimes it can be too much of a good thing! With a traditional server-rendered application there’s almost a 1:1 ratio of user actions on the screen to controller actions on the back end. I go to a page, maybe fill out a few simple form fields, and hit ‘Submit’. With a client application the user could perform many actions before ever doing something that communicates with the server. I can create complex form validations on the client that never have to submit to the server, and having to test those interactions via the full stack starts to feel a little wasteful.
Most tests that exercise the full stack can be split up into a set of smaller tests that touch the client and the API separately. Those smaller tests are often much faster as well. Waiting 15 seconds for a Capybara test to run vs. having the Ember test server auto-reload in 2 seconds makes a huge difference in the feedback loop for a feature. Full-stack tests can also be fragile setup-wise, especially if an application has a lot of different kinds of data to load.
Creating tests for the system as a black box also doesn’t provide me as a developer with any feedback about the design of the system’s parts. Unit tests and more focused integration tests will give off a kind of ‘testability’ smell, and I’ve always found it beneficial to follow my nose in that regard. Making a component (whether it’s a part of the front end or the back end) easier to test normally results in a component that’s easier to reason about and easier to change later. In Ember terms, testability drives me towards using Services and components.
I also find that moving away from end to end tests makes me ask better questions before I write tests. The goal isn’t to write more tests, it’s to write better tests that give me the best overall bang for the buck.
Full stack tests by themselves leave a lot to be desired:
- They can be much slower than testing the client and server in isolation
- They don’t give good feedback on why they fail
- They can require a lot of setup
- They don’t provide the design feedback that more focused tests do
To talk about the conceptual process of moving from a full-stack test to a series of other tests, I’ll make up a fake application with some hypothetical tests and break them down.
Auditing a Set of Full Stack Tests
I’ve made up a rolodex app in an Ember Twiddle to use as an example for testing. This application lets users add an entry to their contacts. That’s it! It’s not much but it’s perfect for the purpose of this discussion. The app only lets users persist a contact to their rolodex if they’ve filled in a last name. It also disables the ‘save’ button if the user has entered an improperly formatted phone number. Admittedly this is all arbitrary but it gives me a few different conditions to test.
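The ‘save’ button rule could be sketched as a plain predicate. This is just an illustration of the behavior described above, assuming a simple US-style phone format; the function names and the regex are my own, not taken from the Twiddle.

```javascript
// Hypothetical sketch of the client-side phone check that drives the
// disabled 'save' button. The format is an assumption for illustration.
const PHONE_FORMAT = /^\d{3}-\d{3}-\d{4}$/;

function isPhoneValid(phone) {
  // A missing or empty phone field is allowed; only a malformed one
  // blocks saving.
  return !phone || PHONE_FORMAT.test(phone);
}

function isSaveDisabled(entry) {
  return !isPhoneValid(entry.phone);
}
```

A pure function like this is exactly the kind of logic that never needs a round trip to the server to test.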
Original Full Stack Tests
I’ll start with a fictional set of full-stack tests to evaluate. I’m going to represent tests as sequence diagrams (handily created via PlantUML) rather than actually coding them out. For me the diagrams make it easier to talk about what the tests do without getting bogged down by how they do it.
Here’s the first test for the ‘happy path’ where everything goes well.
This test creates a new entry in the rolodex, fills it out, saves, and then reloads to ensure that the new entry was properly persisted. That’s a lot of things for a single test! In some ways this is a great smoke test. It verifies that a big chunk of the system is working in a holistic fashion.
Now I’ll add three other tests. Note that these aren’t going to be shining paragons of great testing.
Separating the Client and Server
All four of the tests here will test different parts of the same page. How can I make it easier to see which tests might be redundant as full-stack tests? I’m going to add the API Server as an actor to my first test sequence to give some granularity as to what I’m actually testing.
This test makes 3 server requests:
- GET api/entries: loads the list of entries from the server.
- POST api/entries: creates a new entry.
- GET api/entries: loads the list of entries from the server again.
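The shape of those three requests can be sketched against an in-memory stand-in for the API. Everything here, including the `makeApi` helper, is hypothetical; it just illustrates the GET → POST → GET rhythm of the happy-path test.

```javascript
// Hypothetical in-memory stand-in for the entries endpoint, used to
// illustrate the GET -> POST -> GET shape of the happy-path test.
function makeApi() {
  const entries = [];
  return {
    getEntries() { return entries.slice(); },                 // GET api/entries
    postEntry(entry) { entries.push(entry); return entry; }   // POST api/entries
  };
}

const api = makeApi();
const firstLoad = api.getEntries();    // 1. initial load: empty list
api.postEntry({ lastName: "Doe" });    // 2. create a new entry
const secondLoad = api.getEntries();   // 3. reload: the entry persisted
```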
Here’s the other test that deals with persistence, broken up in the same fashion.
This test makes 2 server requests:
- GET api/entries: loads the list of entries from the server.
- POST api/entries: creates a new entry (with errors).
Both of the tests I just broke apart are interacting solely with the ‘entries’ endpoint. No matter what happens on the client — whether the form is totally filled out, empty, or something in between — the end result from the server’s point of view is just another permutation of a POST request.
Full-stack tests that consume the same API endpoints are prime candidates for splitting into client-side tests and server-side tests. I’ll go over my thinking for splitting up a few of those tests next.
Revealing Implicit Assertions
As an example of the high-level process I use, I’m going to go back to the first persistence test from above, ‘User can persist an entry with all fields filled out’, and try to add in all the implicit assertions that happen over the course of the test. As you can see in the diagram below, there are a bunch!
In a perfect world, there’d only be one test for each of those assertions in the entire test suite. Right now, this full stack test does just that: It singularly owns these assertions. At the same time, the above test ensures everything is working in concert up and down the stack. This is great, right?! Yes, it can be, but it’s not free. Implicit assertions are hard to uncover and debug. This is why I prefer testing the full stack only when the situation calls for it. This leads to more tests, but the benefit is that these tests are much smaller, narrowly focused, easier to reason about – especially when something fails.
For the time being I’ll break this test up into a few client tests and a few server tests.
For the client side I’ll assume I’m using Ember acceptance tests, replacing the real API with a mock. I’ll go over a few ways to mock the API server later.
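Conceptually, most API mocks boil down to a table of route handlers: intercept a request, look up a handler, and return a canned response instead of hitting a real server. Here’s a bare-bones sketch of that idea; real tools like Mirage or Pretender do this with far more polish, and none of these names come from their actual APIs.

```javascript
// Minimal hypothetical route-handler table, the core idea behind API
// mocks: match "METHOD path" to a handler and return a canned response.
function makeMockServer() {
  const handlers = {};
  return {
    on(method, path, handler) {
      handlers[`${method} ${path}`] = handler;
    },
    request(method, path, body) {
      const handler = handlers[`${method} ${path}`];
      if (!handler) throw new Error(`unhandled route: ${method} ${path}`);
      return handler(body);
    }
  };
}

const server = makeMockServer();
server.on("GET", "/api/entries", () => ({ status: 200, body: [] }));
server.on("POST", "/api/entries", (body) => ({ status: 201, body }));
```

The "unhandled route" error is worth keeping even in a sketch: a mock that silently swallows unexpected requests is a common source of confusing test failures.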
The first test mimics the ‘happy path’ test I did in the beginning, showing that a user can create a new contact.
This smaller test still does a lot! It visits a route, interacts with the DOM, and talks to the mock API. In this test I’m making one assertion on the client side after the mock returns its POST response because the flash message will only appear when the promise for `entry.save()` resolves.
Looking back at the sequence diagram for the full-stack test, I’ve tested the User->Client portions of the system, but I still need to test the Client->API part now. The original full-stack test was checking a few cases for the API, so I’ll need multiple server tests to make sure I don’t lose any coverage.
Here’s a test for the invalid entry (where the last name is blank). This test exposes the fact that in my application the client doesn’t have anything to do with validating a contact – the client is just processing what gets sent back from the server.
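The server-side rule that this test pins down could be sketched as a single function. In the real application this would live in the Rails API (most naturally as a model validation); the handler name and response shapes below are invented for illustration.

```javascript
// Hypothetical sketch of the server-side rule: a POST with a blank last
// name gets a 422 with an error payload, and the client just renders
// whatever errors come back.
function createEntry(params) {
  if (!params.lastName || params.lastName.trim() === "") {
    return { status: 422, errors: { lastName: ["can't be blank"] } };
  }
  return { status: 201, entry: params };
}
```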
Given this test, it might make sense to have another client test describing what happens when the server returns a 422. I’ll forgo writing that test for this post, though.
Food for Thought
In this case I diagrammed out the interaction between the User, Client (browser), and API (server) to help answer questions about the original tests. This helped me design my new tests before writing them.
Here are a few questions I find helpful to ask when designing new tests:
- What about my system am I trying to test here?
- Why am I testing that thing at this layer of the system rather than at some other layer?
- Is this test going to give me good feedback on how I’ve designed this part of the system?
These kinds of questions help me work out the broad strokes of the new tests. The next step would be to actually write them. I’m not going to discuss the technical details of how to do that in this post, but I will share some general tips I find myself coming back to when I create client tests. I’ll discuss the kinds of things I don’t test, how I pick the types of client tests to write, how I think about using preconditions and postconditions while writing and debugging those tests, and some strategies for creating test data for the client.
Tips for the Actual Conversion
What Not to Test
A good thing about using a comprehensive framework is that big chunks of functionality are already in place, and normally they ‘just work’ provided they’re used correctly. In this regard Ember’s acceptance tests are good because they force me to avoid directly asserting things like ‘when the server responds with correct POST data the state of the model is updated.’ Those kinds of tests have their place if I’m writing custom Ember Data adapters or serializers, but for most testing the system overall shouldn’t have to care. I want to test the unique aspects of my pages and leave the common stuff to the framework.
Picking Client Test Types
Just as full-stack tests aren’t appropriate as the only type of test for an application, client acceptance tests shouldn’t be the only tool in the toolbox for testing the client side.
Component tests and unit tests will also come into play in a well-designed test suite. I wrote a blog post about using component tests to improve a test suite’s design. Some parts of the system will be complicated enough that they’ll need to be tested more exhaustively than is sensible with an acceptance test. Component tests can also eliminate much of the cost and pain of setting up test data, especially when the component under test lives at the bottom of a route hierarchy.
Preconditions and Postconditions
Creating new tests can be debugging adventures unto themselves. When I find myself getting stuck while trying to set up a test (which is often), I’ll ask myself a few big-picture questions.
- What needs to be true before this test runs in order for it to succeed? (preconditions)
- What should be true after this test runs? (postconditions)
I’ve wasted a lot of time assuming something was already true when it wasn’t. For example, I once thought I’d mocked out an API request, but I’d mocked it out incorrectly, and I only checked my assumption after chasing my tail for 20 minutes. Take nothing for granted.
Creating Test Data
For me this has been the most difficult part of most test migrations. Most of my backends are Rails, and if I’m already using FactoryGirl then it’s very convenient to reuse the existing factories for full-stack tests. On the client side, Mirage can simulate a full-blown API server and can be used for development too. FactoryGuy is meant more for pushing models straight into the Ember Store, but it also has facilities for mocking HTTP requests. Both are good solutions for acceptance tests.
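Underneath either tool, the factory idea boils down to sensible defaults plus per-test overrides. Here’s a bare-bones sketch of that pattern; it is not the actual FactoryGirl, FactoryGuy, or Mirage API.

```javascript
// Bare-bones sketch of the factory pattern behind tools like
// FactoryGirl or FactoryGuy: defaults for every attribute, a sequence
// for uniqueness, and overrides for the case a given test cares about.
let sequence = 0;

function buildEntry(overrides = {}) {
  sequence += 1;
  return {
    id: sequence,
    firstName: "Jane",
    lastName: `Doe ${sequence}`,   // unique-ish default per build
    phone: "555-123-4567",
    ...overrides
  };
}

const defaultEntry = buildEntry();                 // valid by default
const blankName = buildEntry({ lastName: "" });    // override for an invalid case
```

Keeping defaults valid means each test only has to spell out the one attribute it is actually about.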
Full-stack testing is really useful but it shouldn’t be the only tool a developer reaches for. More granular tests focusing on specific parts of the application stack provide better design feedback for those parts. If an application already has a hard split between the client and the server (like an Ember.js front end), the client has its own rich set of testing tools that can be used to ask better questions and avoid relying too much on having to test the full stack.