Beginning Unit Tests

The most common thing I hear from people who don’t unit test but would like to is that they don’t know where to start. I’ve documented here the basics of how to write an initial unit test, given some knowledge of Zend Framework and PHPUnit. I’ve also tried to cover the main points of confusion that developers who have not previously automated their tests tend to run into.

Testing goes from “want” to “have”. So start with a “want”, e.g. “title must be HTML escaped”. Presumably this “want” will be a description of a new feature or a solution to a bug in your software.

Document your want in a test case:

// A PHPUnit subclass with some fixtures to initialise your application
// and make it runnable.
class ProductPageTestCase 
  extends \Zend\Test\PHPUnit\Controller\AbstractHttpControllerTestCase 
{

  public function setUp()
  {
     // ...
  }

  public function testTitleMustBeHtmlEscaped() 
  {
  }
}

Work out how to express your “want” in terms of an assertion. Start at the assertion and work back:

$this->assertContains("<title>" . htmlspecialchars("<") . "</title>", $content);

(I’m using htmlspecialchars because WordPress really loves to ruin escaping in code samples; normally you would write the escaped less-than in the simplest way.)

Next work out how to run the page. In the case of a Zend controller test you run:

$this->dispatch("/product-page/1");
$content = $this->getResponse()->getContent();

All that’s happening here is that Zend is simulating the dispatch process that would happen when it receives a real HTTP request.

Work out how you can make the content contain your use case:

$this->database->insert('product', array('product_name' => "<"));

This is the "fixture, run, test" trifecta which you will see in all tests.
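
Put together, the whole test method is just those three snippets in order; the product id in the URL is assumed to line up with whatever the fixture insert produces:

public function testTitleMustBeHtmlEscaped()
{
    // Fixture: create the state the use case needs.
    $this->database->insert('product', array('product_name' => "<"));

    // Run: simulate the dispatch of a real HTTP request.
    $this->dispatch("/product-page/1");
    $content = $this->getResponse()->getContent();

    // Test: assert the "want".
    $this->assertContains("<title>" . htmlspecialchars("<") . "</title>", $content);
}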

Note that what I've written in the example is really an "integration" or "component" test which is testing the full application stack rather than one completely isolated unit. This is usually what you must start with if your application does not currently have any test automation and will give you the most testing value in the fastest way. A unit test of your application classes would need to avoid dependencies as much as possible in order to avoid failing every time a dependency changes. The method of writing the tests is still essentially the same.
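
For comparison, a unit test of a single class skips the dispatch and the database entirely. The ProductTitle class below is made up purely for illustration:

public function testProductTitleEscapesHtml()
{
    // Hypothetical class under test; no framework, no database, no dispatch.
    $title = new ProductTitle("<");
    $this->assertSame("&lt;", $title->toHtml());
}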

The fixture is a point where many get confused. Surely you have to make massive changes to your app to make your database editable from within a test, right? The answer is yes, and that's expected. You are forced into a design by your tests which requires that you make dependencies injectable and mockable. This in turn requires that your application is modular and decoupled, which is generally a good thing, so the tests are error-checking your design too.

Tools like Zend's service locator are intended to make this kind of design easy; for example, a service factory can return an in-memory SQLite database for your database service and mock transports for your APIs when running in a test environment. If you're using a modern framework like Zend you may find that, even if you're not automating your testing now, you already have the tools you need to do it (though you will need to change the specifics of your implementation to take advantage of them).
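
As a rough sketch of what such a config could look like (the 'MyAppDb' service name and file path are the ones used later in this post; a real config would also declare modules and autoloading):

// config/environments/unit-tests.php (sketch)
return array(
    'service_manager' => array(
        'factories' => array(
            'MyAppDb' => function ($services) {
                // In the test environment the "database" is an in-memory
                // sqlite connection rather than the production credentials.
                return new \PDO("sqlite::memory:");
            },
        ),
    ),
    // ... modules, module_listener_options, etc.
);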

In Zend, a test setup fixture might look something like this:

public function setUp() 
{
   // I prefer to use a config which uses my autoloads but skips production
   // credentials and overrides everything to mocked values.  This gives a 
   // short reusable test setup but means your service factories have to do 
   // more work.
   $this->setApplicationConfig(require APP_ROOT . "/config/environments/unit-tests.php");
   $services = $this->getApplicationServiceLocator();
   // You can alter the services manually too:
   $db = new \PDO("sqlite::memory:");
   $db->exec("CREATE TABLE product ( ... )");
   $services->setService("MyAppDb", $db);
}

There are several choices for implementing fixtures. My example simply manipulates the persistence layer directly, which is usually the only choice for existing untested software (at least to begin with). Fixtures can be more readable and further encourage good design if you involve your application's real models (or mocks thereof) rather than bypassing them and using the persistence layer directly.
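
For example, a fixture that goes through an application service rather than raw SQL reads more like the domain and breaks less often when the schema changes. The ProductEditor service and its createProduct() method below are hypothetical:

public function setUp()
{
    // ... application config as before ...

    // Hypothetical application service used instead of inserting rows directly.
    $editor = $this->getApplicationServiceLocator()->get('ProductEditor');
    $this->productId = $editor->createProduct(array('product_name' => "<"));
}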

Generally when you're modifying an existing test case it should be very obvious how to create a new test method, since new methods should look a lot like the existing ones. Structural duplication is expected in a unit test: simplicity, especially avoiding deeply layered software, is worth more than avoiding duplication when it comes to test maintenance.

It is also important to be aware that the test will make it hard for you to change the behaviour of that use case or of its fixture. This will make it necessary for you to maintain backwards compatibility features for the sake of your tests. This can be awkward and seems to be a nasty surprise for new developers who are used to breaking compatibility at will. Typically large applications worked on by multiple people need to maintain backwards compatibility anyway so this is another case of the unit tests forcing you into a particular design philosophy which is usually a good idea anyway. Nevertheless this seems to be a huge annoyance among new unit testers so it's worth keeping in mind how your software design needs to change.

Once your test is complete, watch it fail. It should fail in the right way; so in the example above you should expect to see an invalid HTML title produced. This is important because it verifies that your tests are detecting the condition that you intended.

Next implement the feature. Do this in the fastest possible way you can. Don't worry about what sins you have to commit to make the thing happen. Repeatedly run the test until it passes. This is a great way to break out of design worry deadlock and over-engineering — it does not matter how it works so long as it works.

If you find yourself writing lots of print statements at the implementation stage then your test may be too broad. Try abstracting and writing unit tests of the specific part that you are working with. Once again the unit testing is giving you a signal to change the design of your software. In general your tests should be replacing debugging, not adding additional work. Avoiding the debugging process is probably the most important factor for making TDD a fast development process. This can be a hard habit to break and needs some practice. Try implementing a new assertion every time you would have added a print statement.
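
For instance, rather than a var_dump() of the escaped title somewhere in the view layer, the same check can be captured as a narrower test that keeps on running. This sketch assumes the escaping is done by Zend's Escaper component:

public function testEscaperConvertsLessThan()
{
    // This assertion replaces the var_dump($escaped) you would otherwise
    // add while debugging the full page test.
    $escaper = new \Zend\Escaper\Escaper();
    $this->assertSame("&lt;", $escaper->escapeHtml("<"));
}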

Once the test passes you can work on cleaning up the code. Remove the sins you committed, clear up and make things pretty. You should not generally need to make major architectural changes at this point; your main job is to remove anything that could be interpreted as duplication, as well as getting rid of future risks like global variables.

If you do this right, you can develop features very quickly without putting any major hurdles into the product for the next time that you have to modify it. Quality trends upwards and your existing code has less impact on the cost of new features. The more you keep the product supple, the more your simple changes become trivial, and your complex changes become achievable.

There is more to testing than this, of course, but the general mantras are:

  • always work from "want" to "have"
  • keep your tests as simple as possible
  • work in small increments
  • don't duplicate code
  • remember the cleanup stage
  • allow the unit tests to steer your design decisions

Beware of Dumb Objects

In some projects I often see systems which look like this:

class InvoiceFactory
{
    public function __construct($database);
    public function fetchInvoices($id);
}

class Invoice
{
    public function __construct($rows);
    public function getProducts();
}

class InvoiceProducts
{
    public function __construct($productList);
    public function fetchProducts();
}

class WarehouseInvoice
{
    public function __construct($invoice, $products);
    public function listProducts();
}

// Classes are used like this:
$invoices = new InvoiceFactory($database);
// Load database rows by id.
$array = $invoices->fetchInvoices($someId);
// Create usable objects from those rows.
$invoice = new Invoice($array);
// List of products.
$productList = $invoice->getProducts();
// Usable list of products.
$products = new InvoiceProducts($productList);
// Convert those rows into a usable form.
$objectsDto = $products->fetchProducts();
// Convert one usable form into the one that we actually need.
$warehouseInvoice = new WarehouseInvoice($invoice, $objectsDto);
// Convert the list of products to one we can use in a template.
$yetAnotherDto = $warehouseInvoice->listProducts();
// Actually use the object.
doSomething($yetAnotherDto[0]->actualProperty);

Here we’ve gone from a factory to an object to a list of objects and back to a single object again several times. This can go on and on through several iterations of tiny one-function stateless objects that just convert data.

In this example, the code never actually does anything until the objects are converted into primitive data because all the models only accept primitive data. It seems attractive at first that all the objects would be without dependencies and behave as simple services, however there are several problems with this:

  • usage code is long
  • all objects involved need to be understood in terms of how they accept and produce data in order to know what’s going on
  • difficult to include data from other services without performing more data wrangling
  • difficult to change the behaviour of primitive data since it’s all pre-computed
  • no common API for access to data derived from the service; every client has to build their own conversion
  • difficult to unit test correct usage because it all relies on integration

There seems to be a natural tendency to obsessively break objects down to their smallest possible form, especially around a single database table or similar technical object.

I think the example above would have been just as well served by a single object whose methods perform these conversions.
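
Something like the following sketch, with made-up method names, would cover the same conversions with only one API to learn:

// One stateless object holding all of the conversions from the example above.
class InvoiceConversions
{
    public function rowsToInvoice($rows);
    public function invoiceToProducts($invoice);
    public function productsToWarehouseListing($invoice, $products);
}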

When you find yourself implementing this kind of pattern, consider using just one root service model and smarter objects to access the data.

// Persistence and external service access.
class ObjectService
{
    public function __construct($service);
    public function fetchObjectById($id);
    public function fetchExtraObjectData($id);
}

// Containing value object.
class ObjectList
{
    public function __construct($service, $id);
    public function getActualObjects();
}

// Element value object.  This can still use the ObjectService.
class ActualObject 
{
    public function __construct($objectList, $id);
    public function getActualProperty();
}

// Declare dependencies.
$service = new ObjectService($database);
$objects = new ObjectList($service, $id);
// Load usable data.
$objects->getActualObjects()[0]->getActualProperty();

This gives you a more flexible implementation where your data access concerns are dealt with by the smart value objects and the service concerns are dealt with by the root object service. The maintainer does not need to know anything about the interactions between these objects; they only need to know the dependencies (which should be defined by the constructor interface).

The conclusions I draw are:

  1. don’t be scared of high-level value objects
  2. dependencies are fine if they mean client code needs to know less
  3. create new abstractions only when they remove work from client code

Event Based Actions to Remove Boilerplate Code

In my application I tend to have a large number of actions that fall into a set of very similar boilerplate structures. They are never close enough to use a standardised model framework and there are too many different structures to use fancy controller configuration. A useful pattern to deal with this is to use events to remove the boilerplate.

public function createSomeObjectAction() 
{
    $form = new SomeForm();
    $editor = new SomeEditor();
    $action = FormAction::create($this, $form);

    $action->onValid(function($context) use($editor) {
       $id = $editor->createSomeObject($context->getValidData());
       return $this->redirect()->toRoute('EditObject', array('id' => $id));
    });

    $action->onOther(function($context) {
       $view = new \Zend\View\Model\ViewModel();
       $view->setVariable('form', $context->getForm());
       $view->setTemplate('app/generic/bootstrap-form');
       return $view;
    });

    return $action->run();
}

It’s useful to see if you can write controller actions this way without using any if statements. This makes it easy to establish integration tests which just trigger each of the events. You also don’t need to do any deep Zend-specific configuration like you would with rendering strategies or event listeners; it’s “just PHP”.
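
A pair of such integration tests might look like this; the route and form field names are invented for the example:

public function testCreateRedirectsWhenFormIsValid()
{
    // Triggers the onValid event.
    $this->dispatch('/some-object/create', 'POST', array('name' => 'example'));
    $this->assertResponseStatusCode(302);
}

public function testCreateRendersFormAgainWhenInvalid()
{
    // Triggers the onOther event.
    $this->dispatch('/some-object/create', 'POST', array());
    $this->assertResponseStatusCode(200);
}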

The alternative here is to implement some interface which has its own idea of the onValid and onOther functions and pass this to some strategy.

public function createSomeObjectAction() 
{
    $editor = new SomeEditor();
    $model = new SomeObjectActionModel();
    $model->setEditor($editor);
    return ActionModelRunner::create($model)->run();
}

While this makes the action a lot shorter, it means that the maintainer now has to understand more APIs: the controller action, the action model interface, the action model, and the action model runner. With the event based system you only really need to understand the controller action and the FormAction. The functionality is right in front of you.

I find in general that reducing the number of APIs that the maintainer has to deal with is usually a good idea up to the point where the implementation complexity is too hard to understand in one go. These actions are basically just chaining function calls together — no if statements — so abstracting things does not have any benefit.
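
For reference, the FormAction used in these examples does not have to be anything clever. The following is a minimal sketch rather than the exact class; it simply stores the callbacks and passes itself to them as the context:

class FormAction
{
    private $controller;
    private $form;
    private $onValid;
    private $onOther;

    public static function create($controller, $form)
    {
        $action = new self();
        $action->controller = $controller;
        $action->form = $form;
        return $action;
    }

    public function onValid(callable $callback) { $this->onValid = $callback; }
    public function onOther(callable $callback) { $this->onOther = $callback; }

    public function run()
    {
        $request = $this->controller->getRequest();
        if ($request->isPost()) {
            $this->form->setData($request->getPost());
            if ($this->form->isValid()) {
                return call_user_func($this->onValid, $this);
            }
        }
        return call_user_func($this->onOther, $this);
    }

    // Context accessors used by the callbacks.
    public function getForm() { return $this->form; }
    public function getValidData() { return $this->form->getData(); }
}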

Another use case is to deal with errors as a structure:

public function createSomeObjectAction() 
{
    return $this->handleErrors(function() {
        $model = new SomeModel();
        $model->doSomethingFailable();
        return new \Zend\View\Model\ViewModel();
    });
}

private function handleErrors($action) 
{
    $handler = new ApiErrorHandlerAction($action);
    $handler->on('SpecificException', function($handler, $ex) {
        return $handler->respondWithTemporaryError($ex->getMessage());
    });
    return $handler->run();
}

This is very useful when dealing with APIs which can throw errors at any point and reduces your need for the same boilerplate try-catch in every single action. Instead your intention is shown by the structure of the action.
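
The error handler can follow the same shape. Again this is a minimal sketch rather than the exact class, and respondWithTemporaryError() is only a placeholder for whatever error response your application builds:

class ApiErrorHandlerAction
{
    private $action;
    private $handlers = array();

    public function __construct(callable $action)
    {
        $this->action = $action;
    }

    public function on($exceptionClass, callable $handler)
    {
        $this->handlers[$exceptionClass] = $handler;
    }

    public function run()
    {
        try {
            return call_user_func($this->action);
        } catch (\Exception $ex) {
            foreach ($this->handlers as $class => $handler) {
                if ($ex instanceof $class) {
                    return call_user_func($handler, $this, $ex);
                }
            }
            throw $ex;
        }
    }

    public function respondWithTemporaryError($message)
    {
        // Placeholder: a real implementation would build an error view or
        // an HTTP 503 response here.
        return array('error' => $message);
    }
}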

The drawback is that your generic action needs to be somewhat understood by the reader. This is not too bad in the case of the FormAction example, but it’s more of a problem when you write something like a PaginatedListingAction or a CreateOrEditAction which will abstract dealing with your model code. It’s important to avoid making things so abstract and deeply layered that the maintainer can’t understand how to alter the behaviour of the system.

That said, if you have a lot of actions with the same structure, this technique of declaring behaviour as events can be more understandable and more flexible than a highly engineered framework based on model interfaces.


Using Patched Dependencies with Composer

Quite frequently it’s useful to patch a dependency and use it in your project immediately from a local file path without deploying to Packagist. This is easily achieved with Composer by pointing a repository entry at the local checkout:

  "repositories": [
    {
      "type":"git",
      "url":"/home/jamesw/login-and-pay-with-amazon-sdk/"
    }
  ]

If you don’t have the dependency added yet you can use the composer require command or edit your composer.json directly to set the version of your dependency to dev-master.
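
For example, run the require command with the branch constraint:

$ ./composer.phar require amzn/login-and-pay-with-amazon-sdk-php:dev-master

or add the constraint by hand (using the package from later in this post as the example):

  "require": {
    "amzn/login-and-pay-with-amazon-sdk-php": "dev-master"
  }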

Composer’s update can be restricted to just one dependency in order to update your lock file and install the new version.

$ ./composer.phar update amzn/login-and-pay-with-amazon-sdk-php
  - Updating amzn/login-and-pay-with-amazon-sdk-php dev-master

If you get “no driver for repository” then try changing the “type” in the repository configuration entry. Some documentation says use type ‘vcs’, but I needed ‘git’.

If the dependency you are overriding is already a “dev-master” version then this is sufficient to cause your chosen repository to be the source for the dependency. If the dependency has a release version then you will need the --prefer-source argument when you update.

Remember that if you do this, your package will only be installable by you, so it’s a good idea to keep the changes to composer.json and composer.lock isolated to a single commit on a private branch. You can then use a rebase to skip or squash those changes once your commits are in the public version of the dependency or you have otherwise deployed those patches publicly.
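
For example, an interactive rebase against the public branch (assumed here to be master) lets you drop the pinning commit once it is no longer needed:

$ git rebase -i master
  # In the editor, change "pick" to "drop" (or delete the line) for the
  # commit that pointed composer at the patched repository.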

To make your patches publicly installable, create a public branch on your fork at GitHub and reference this in your requirement, for example:

{
  "repositories": [
    {
      "type": "git",
      "url": "https://github.com/bnkr/login-and-pay-with-amazon-sdk-php"
    }
  ],
  "require": {
    "amzn/login-and-pay-with-amazon-sdk-php ": "dev-abstract-transport"
  }
}