Seeing Screenshots with a Headless Chrome

Version 59 of Chrome introduced something really cool for Mac and Linux developers: Headless Chrome. It means that now we can run Chrome without ever opening a Chrome window. How cool is that?

In the past we’ve relied on PhantomJS or Selenium WebDriver for browser automation and testing. But now, the Chrome team has given us the ability to test Chrome with Chrome.

What is one of the more common things that we want to do with a browser automation tool? Take screenshots. So what we’re going to do is walk through the very basic steps of using Headless Chrome and Node.js to grab a screenshot of a web page, entirely from the command line.

TL;DR

If you’re just here for the code, my feelings won’t be hurt. Here’s the final product.

Pre-requisites and preliminaries

As of right now, only Mac and Linux users get Headless Chrome. You Windows users are going to have to wait a bit longer. The commands below assume a Mac.

Make a command line alias to Chrome

Assuming Google Chrome is installed in your root applications folder, run this to make sure you’ve got Chrome available to you at the command line:

  alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"

Apply a healthy dose of yarn

I relied very heavily on Eric Bidelman’s intro to Headless Chrome for this, and he starts off with yarn, so I recommend you do the same. Yarn is a package manager for npm packages that caches what it downloads, to avoid ridiculous amounts of downloading later.

If you don’t have yarn, start with npm i -g yarn.

Then, in a new folder, follow up with these commands:

yarn add lighthouse
yarn add chrome-remote-interface
yarn add minimist

Starting with Eric’s Tutorial

My code is shamefully stolen from (er, inspired by) Eric Bidelman’s final example. What I first want to do is show you the code and then explain some of what I’ve learned from trial and error, which is to say mashing on my keyboard until things happened.

Here’s the example we’ll build our screengrabber from:

const CDP = require('chrome-remote-interface');

(async function() {
  // launchChrome() is defined later in this post
  const chrome = await launchChrome();
  const protocol = await CDP({port: chrome.port});
  const {Page, Runtime} = protocol;

  await Promise.all([Page.enable(), Runtime.enable()]);

  Page.navigate({url: 'https://www.chromestatus.com/'});

  // Runs once the page has finished loading
  Page.loadEventFired(async () => {
    const js = "document.querySelector('title').textContent";
    const result = await Runtime.evaluate({expression: js});

    console.log('Title of page: ' + result.result.value);

    protocol.close();
    chrome.kill(); // Kill Chrome.
  });

})();

 

(async function() {})()

Just about everything that comes out of the Chrome interface is a promise. Since we’re dealing with promises, we want the code that uses them to run asynchronously too.

When you use async, you’re going to want to use await. await pauses the function it’s inside of until the promise it’s waiting on resolves. Which is why we have this:

const chrome = await launchChrome();
const protocol = await CDP({port: chrome.port});

We’re pausing the wrapping function, and not letting it execute until these values here resolve. Put another way, we’re not doing anything until Headless Chrome first starts up.
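To see that pausing in isolation, here’s a minimal sketch that doesn’t need Chrome at all. The fakeLaunch helper is made up for this demo: it just resolves with a pretend port after a short delay, standing in for launchChrome().

```javascript
// Hypothetical stand-in for launchChrome(): resolves with a fake port after a delay.
function fakeLaunch(delayMs) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ port: 9222 }), delayMs);
  });
}

const steps = [];

async function main() {
  steps.push('before await');
  const chrome = await fakeLaunch(20); // main() pauses here until the promise resolves
  steps.push(`after await, port ${chrome.port}`);
  return steps;
}

const done = main();
// Only main() is paused; the surrounding script carries on immediately.
steps.push('script keeps running');

done.then((order) => console.log(order.join(' -> ')));
// prints: before await -> script keeps running -> after await, port 9222
```

Notice that only the async function waits. Everything outside of it keeps going, which is why all of our Chrome work has to live inside the async wrapper.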

Page and Runtime

Our next goal is to get the tools we’re going to need. We do that with these two lines:

const {Page, Runtime} = protocol;
await Promise.all([Page.enable(), Runtime.enable()]);

First, we use this really cool destructuring technique to create constants called Page and Runtime from properties that exist on protocol.

Then, we make sure that Page and Runtime are enabled.
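If destructuring and Promise.all are new to you, here’s a small sketch of both. The protocol object below is a hand-rolled stand-in (the real one from CDP() has far more on it), so treat the return values as placeholders.

```javascript
// Stand-in for the object CDP() resolves with; the real one exposes many more domains.
const protocol = {
  Page: { enable: () => Promise.resolve('Page enabled') },
  Runtime: { enable: () => Promise.resolve('Runtime enabled') },
};

// Destructuring: one line instead of two separate assignments.
const { Page, Runtime } = protocol;

// The equivalent long-hand form, for comparison.
const PageLonghand = protocol.Page;

// Promise.all waits for every promise and keeps the results in the same order.
const enabled = Promise.all([Page.enable(), Runtime.enable()]);
enabled.then((results) => console.log(results));
```

Promise.all resolves only when both enable() calls have resolved, so nothing after that await runs against a half-ready connection.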

Page.navigate()

Page.navigate({url: 'https://www.chromestatus.com/'}); is something we know we want to change. We want to be able to make that value a command line argument. But it’s good to know for now that .navigate() is a method that exists on Page.

Page.loadEventFired(async () => {})

More async goodness! We want to know when the page has loaded, and then only execute code after it’s loaded.

It’s stuff like this that makes me realize why promises were created. Without a load of promises, we’d be dealing with a ton of timeouts in a nested callback hell.

What we know is that this asynchronous callback is where we want to do all of our heavy lifting and screen grabbin’:

Page.loadEventFired(async () => {
  const js = "document.querySelector('title').textContent";
  const result = await Runtime.evaluate({expression: js});

  console.log('Title of page: ' + result.result.value);

  protocol.close();
  chrome.kill(); 
});
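The example sticks with a callback here. As a general illustration of turning a one-off event into something you can await, here’s a sketch using Node’s built-in EventEmitter and events.once (Node 11.13 or later). The fakePage emitter and its 'load' event are invented for the demo; they aren’t part of chrome-remote-interface.

```javascript
const { EventEmitter, once } = require('events');

// Hypothetical emitter standing in for a page that fires a load event.
const fakePage = new EventEmitter();

async function waitForLoad() {
  // once() returns a promise that resolves with the event's arguments as an array.
  const [info] = await once(fakePage, 'load');
  return `loaded ${info.url}`;
}

const loaded = waitForLoad();

// Simulate the page finishing its load a little later.
setTimeout(() => fakePage.emit('load', { url: 'https://www.chromestatus.com/' }), 10);

loaded.then((message) => console.log(message));
// prints: loaded https://www.chromestatus.com/
```

Either style gets you to the same place: nothing inside runs until the page says it has loaded.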

So, now that we’ve gotten that out of the way…

Into the node we go!

In a new JavaScript file, let’s go ahead and add our dependencies first:

const chromeLauncher = require('lighthouse/chrome-launcher/chrome-launcher');
const CDP = require('chrome-remote-interface');
const fs = require('fs');

Get some arguments from the command line

Let’s start at the end. We want to be able to get a screenshot and set dimensions from the command line. Like this:

node headless-screenshot.js -w 1024 -h 768 --url=http://google.com

So, this is why we installed minimist up front. Minimist will convert command line arguments into an object. So what we’ll do is create that object and have it ignore the first two arguments (since those are node and headless-screenshot.js):

const argv = require('minimist')(process.argv.slice(2));
const windowWidth = argv.w ? argv.w : 1024;
const windowHeight = argv.h ? argv.h : 1024;
const launchConfig = {
    chromeFlags: [
        `--window-size=${windowWidth},${windowHeight}`,
        '--disable-gpu',
        '--headless'
    ]
}
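One thing the snippet above doesn’t do is complain when --url is missing. Here’s a rough sketch of a validation step you could bolt on. The resolveOptions name and the usage message are my own inventions, and the sample object is just the shape minimist hands back for the command shown earlier.

```javascript
// Normalises a minimist-style argv object into validated options.
// The function name and error message are illustrative, not part of minimist.
function resolveOptions(argv) {
  if (typeof argv.url !== 'string' || argv.url.length === 0) {
    throw new Error('Usage: node headless-screenshot.js --url=<address> [-w width] [-h height]');
  }

  // Fall back to the default when a dimension is missing or not a positive integer.
  const dimension = (value, fallback) =>
    Number.isInteger(value) && value > 0 ? value : fallback;

  return {
    url: argv.url,
    width: dimension(argv.w, 1024),
    height: dimension(argv.h, 1024),
  };
}

// Shape minimist produces for: -w 1024 -h 768 --url=http://google.com
const sample = { _: [], w: 1024, h: 768, url: 'http://google.com' };
console.log(resolveOptions(sample));
```

Failing early like this beats launching Chrome only to have it navigate to undefined.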

 

Launch Chrome

So now we have all the information we need to launch Headless Chrome:

async function launchChrome() {
  return await chromeLauncher.launch(launchConfig);
}
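Something worth thinking about once Chrome is launching: if anything throws later on, that Chrome process can be left running. Here’s one possible pattern for always cleaning up, sketched with a fake launcher so it runs without Chrome installed. withChrome and fakeLaunch are hypothetical names, not part of any library.

```javascript
// Runs `task` against a launched browser and always shuts the browser down,
// even if the task throws. `launch` is injected so the sketch runs without Chrome.
async function withChrome(launch, task) {
  const chrome = await launch();
  try {
    return await task(chrome);
  } finally {
    await chrome.kill();
  }
}

// Hypothetical launcher that mimics the shape used in this post: a port and kill().
let killed = 0;
const fakeLaunch = async () => ({
  port: 9222,
  kill: async () => { killed += 1; },
});

const ok = withChrome(fakeLaunch, async (chrome) => `connected on ${chrome.port}`);
const failed = withChrome(fakeLaunch, async () => {
  throw new Error('navigation failed');
}).catch((err) => err.message);

const outcome = Promise.all([ok, failed]);
outcome.then((results) => console.log(results, `kills: ${killed}`));
```

In the real script you’d pass launchChrome as the first argument and do the navigating and screenshotting inside the task.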

Navigate to a page

We’ve gotten values from a config, we’ve launched Chrome, now it’s time to get our hands dirty in the asynchronous IIFE. We’ll navigate to the page provided by the argument:

(async function() {
     const chrome = await launchChrome();
     const protocol = await CDP({port: chrome.port});
     const {Page, Runtime} = protocol;

     await Promise.all([Page.enable(), Runtime.enable()]);

     Page.navigate({url: argv.url});

})()

 

Getting our screenshot

Ok, we’re back into our Page.loadEventFired(). And this is where I’m here to tell you two things about Page.captureScreenshot() that I didn’t grasp the first dozen times I looked at the documentation on screenshots:

  • it actually returns a promise
  • when it resolves, it’s an object that is returned, not a string

So what we want is something like this:

Page.loadEventFired(async () => {
    const titleJS = "document.querySelector('title').textContent";
    const pageTitle = await Runtime.evaluate({expression: titleJS});
    // await has already unwrapped the promise, so screenshot is the resolved object
    const screenshot = await Page.captureScreenshot();

    console.log(`title of page: ${pageTitle.result.value}`);
    Promise.resolve(screenshot)
      .then((imageData)=>{
        // do something with the image data
      });
    protocol.close();
    chrome.kill();
});

 

Save the screenshot

We’ll make a function whose job it will be to make a file from the data. The object that’s coming back to us has a data property, and that property is a base64 string. So, it should be somewhat straightforward to use the file system to make a file out of our screen capture.

We’ll set the filename to be the same as the url. We’ll just strip out the http:// (or https://) and replace those pesky / with something less likely to be confused as a folder structure:

function saveScreenshot(imageData) {
    // Strip the protocol, then swap slashes for underscores
    const filename = `${argv.url.replace(/^https?:\/\//,'').replace(/\//g,'_')}.png`;

    fs.writeFile(
        filename,
        imageData.data, {encoding:'base64'},
        (err)=>{
            // The callback runs on success too, so only warn when err is set
            if (err) {
                console.warn('error', err);
            }
        }
    );
}
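String replacement is fine for simple addresses. For URLs with query strings or other odd characters, one alternative is to lean on Node’s built-in URL class. This is only a sketch, and filenameFromUrl is a name I made up; the exact naming scheme is a matter of taste.

```javascript
// Builds a filesystem-friendly .png name from a URL.
// A sketch only: the naming scheme here is one arbitrary choice among many.
function filenameFromUrl(address) {
  const { hostname, pathname, search } = new URL(address);
  const safe = `${hostname}${pathname}${search}`
    .replace(/[^a-zA-Z0-9.-]+/g, '_') // anything unusual becomes an underscore
    .replace(/^_+|_+$/g, '');         // trim underscores from both ends
  return `${safe}.png`;
}

console.log(filenameFromUrl('http://google.com'));
// prints: google.com.png
console.log(filenameFromUrl('https://www.chromestatus.com/features?q=headless'));
// prints: www.chromestatus.com_features_q_headless.png
```

A nice side effect is that new URL() throws on a malformed address, so bad input gets caught before anything is written to disk.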

So now, we can resolve our promised screenshot with this:

Promise.resolve(screenshot)
    .then((imageData)=>{ 
      saveScreenshot(imageData); 
    });
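If you’d rather know the file has actually been written before moving on, a promise-returning save is one option. Here’s a sketch using fs.promises (Node 10 or later). saveScreenshotAsync is a hypothetical name, and the payload is a tiny fake (not a real PNG) written to the temp directory so the demo runs without Chrome.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Promise-returning variant of the save step, so callers can await it.
// `imageData` mirrors the shape described above: an object with a base64 `data` string.
function saveScreenshotAsync(imageData, filename) {
  return fs.promises.writeFile(filename, imageData.data, { encoding: 'base64' });
}

// Demo with a tiny fake payload (not a real PNG) written to the temp directory.
const fakeImage = { data: Buffer.from('pretend png bytes').toString('base64') };
const target = path.join(os.tmpdir(), 'headless-screenshot-demo.png');

const saved = saveScreenshotAsync(fakeImage, target).then(() => {
  // Reading the file back confirms the base64 text was decoded to raw bytes.
  return fs.readFileSync(target).toString();
});

saved.then((contents) => console.log(contents));
// prints: pretend png bytes
```

Inside the load handler, that would let you await the save before calling protocol.close() and chrome.kill().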

 

Putting it all together

Here’s the final gist that we get when it all gets put together. What we find is that Headless Chrome is a very useful utility for automation and browser testing. This is just one small, contrived example of how we can make use of it.