Supercharging Visual Regression Testing

Visual regression testing has, over the last few years, proven to be an additional pillow of comfort when testing that a site's user interface is operating correctly.

By taking a set of reference screenshots of a website - the gold masters - and comparing these images to how the website appears in its current state, any differences can be found and inconsistencies caught.

This can prove to be a valuable asset for showing how a change in one area of the codebase does or does not affect UI components - something that may be worth its weight in gold (if the abstract concept of software weighed anything, that is) for legacy codebases, where the above scenario is more likely to occur.

The testing skeleton

Recently, our team added visual regression testing for the first time to a site serving multiple languages, including right-to-left languages such as Arabic. The tool we chose was BackstopJS, due to previous success with it internally.

After the initial setup we had a foundation to build on. BackstopJS is configured using the backstop.json file generated during setup. This config file can be JS based if preferred, allowing for variables, logic and comments - giving a 1-2-3 punch of dev niceties. We chose this option for the extra flexibility and to help make our config DRYer.

Once more, with feeling

The newly renamed and slightly re-structured backstop.js file is where all of the setup for visual testing with BackstopJS is done. The base JS config file now looks something like this (or you can see the default BackstopJS config):

module.exports = {  
  id: 'ID',
  viewports: [ ..viewportsArray.. ],
  scenarios: [
    {
      label: 'Components',
      url: 'http://foo.bar/baz',
      selectors: [ '.foo' ],
      selectorExpansion: true,
      hideSelectors: [],
      removeSelectors: [],
      readyEvent: null,
      delay: 500,
      misMatchThreshold: 0.1,
      onReadyScript: 'onReady.js'
    }
  ],
  paths: { ..pathsObject.. },
  casperFlags: [],
  engine: 'phantomjs',
  report: ['cli'],
  debug: false
}

The viewports and scenarios parts are where the magic happens: scenarios tells BackstopJS what to screenshot and how, while viewports lists the screen sizes at which those screenshots should be taken.
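For illustration, the `..viewportsArray..` placeholder above might be filled in with something like the following - the names and dimensions here are hypothetical examples, not values from our project:

```javascript
// A hypothetical viewports array. Each entry tells BackstopJS
// a screen size at which to capture every scenario.
var viewports = [
  { name: 'phone', width: 320, height: 480 },
  { name: 'tablet', width: 768, height: 1024 },
  { name: 'desktop', width: 1280, height: 800 }
];

module.exports = viewports;
```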

Where this fits into the workflow

We are using gulp for our build, and rather than calling BackstopJS from the console, it can also be required as a module and invoked directly.

The module itself is Promise based and fits into the gulp workflow, as its own task, like a glove. We set up a config object to pass to Backstop and define two separate tasks - one for generating reference shots and one for running tests against them.

var gulp = require('gulp');
var backstopjs = require('backstopjs');

//Config object
var backstopConfig = {  
  //Config file location
  config: '/path/to/file.js',
  //Incremental reference image capturing
  i: true
};

//Generate reference files
gulp.task('visual-test:reference', function() {  
    return backstopjs('reference', backstopConfig);
});

//Generate test files and compare with references
gulp.task('visual-test:test', function() {  
    return backstopjs('test', backstopConfig);
});

Making things easier for ourselves

The base setup is to run a left-to-right test (in this case on the English page) and the exact same test as a right-to-left (Arabic page) test. This means that however many test scenarios you have will be doubled by the requirement of LTR and RTL tests. In our case the viewports and selectors for these LTR and RTL variants would be the same, so with this in mind the scenario selectors were split out into their own config file.

Next was deciding where on the site the tests would be run. Running a test on any content-managed page brings potential issues where the content in the references differs from the content on the site itself. Visual regression testing, as the name suggests, should really only care about the visual aspects - the design - so tests should be run in an environment where the content is controlled. At the same time, care needs to be taken to maximise the number of components included in the tests.

The natural fit, and something that covers both of these requirements, is a project's styleguide. Here the content can be controlled, and a multitude of components used throughout the site are placed on one page - giving a perfect platform to test on.

Bringing it all together

With the aspects mentioned earlier and a few other refactors, the final config file looked like the following.

Certain areas here have been expanded and updated. The yargs module has been brought in to allow command line arguments, adding flexibility when running Backstop - mainly for setting the debug config key and the base URL to test.

Other changes include extracting out common values - such as making the mismatch percentage a shared variable, MISMATCH_PERCENTAGE_THRESHOLD - and splitting the viewports, components and paths config data into their own files. With this, adding a new component selector in the components file means it will be tested in both the English and Arabic scenarios shown below.

Further areas of refactoring, such as the scenarios themselves, could also be done if required.

//parse command line arguments
var argv = require('yargs').argv;

//Get the base URL from command line argument. If that isn't provided, use the default
var BASE_URL = (typeof argv.url === 'undefined') ? 'http://default-test-url.foo/' : argv.url;

//Get debug value from command line argument.
// If not provided this is undefined - the 'double bang' coerces the value to a boolean
var DEBUG = !!argv.debug;

//Set the percentage mismatch threshold.
// - Anything lower is a pass
// - Anything higher is a fail
var MISMATCH_PERCENTAGE_THRESHOLD = 4;

//Test viewports
var viewports = require('./config/viewports.js');

//component selector list
var components = require('./config/components');

//BackstopJS paths
var paths = require('./config/paths');

//Helper function to create full URL string
function getFullUrl(pageUrl) {  
  return BASE_URL + pageUrl;
}

module.exports = {  
  id: 'AI',
  viewports: viewports,
  scenarios: [
    {
      label: 'Components EN',
      url: getFullUrl('en/awesome-page'),
      selectors: components.static,
      selectorExpansion: true,
      hideSelectors: components.hide,
      removeSelectors: [],
      readyEvent: null,
      delay: 1000,
      misMatchThreshold: MISMATCH_PERCENTAGE_THRESHOLD,
      onReadyScript: 'onReady.js'
    },
    {
      label: 'Components AR',
      url: getFullUrl('ar/awesome-page'),
      selectors: components.static,
      selectorExpansion: true,
      hideSelectors: components.hide,
      removeSelectors: [],
      readyEvent: null,
      delay: 1000,
      misMatchThreshold: MISMATCH_PERCENTAGE_THRESHOLD,
      onReadyScript: 'onReady.js'
    }
  ],
  paths: paths,
  casperFlags: [],
  report: ['cli', 'CI'],
  ci: {
    format: 'junit',
    testSuiteName: 'backstopJS'
  },
  debug: DEBUG
}

Example usage would be:

# run the reference task using the provided url
gulp visual-test:reference --url=http://custom-domain.dev

# run the test task with debug logging turned on
gulp visual-test:test --debug  

That's a wrap

This is a basic start to getting a project using visual regression testing. You can begin with a basic config file and mould it however you need, depending on what you are testing.

The act of implementing this has shown that the styleguide page we're testing against is missing a few components - their absence revealing gaps in the test coverage. The next job will be to bring the styleguide fully up to date!

The site being tested also uses lazyloading for a lot of its images. At the time of writing, these aren't captured by the tests - a grey block shows instead. Whilst this borders on testing the content itself, it doesn't fully represent a realistic view of the components: if the images all break, showing a grey background, the tests may return a false positive and pass.