PhantomJS 1.2 Release Notes

PhantomJS 1.2, Birds of Paradise, was released on June 21, 2011. It is a major update that introduces a whole new set of APIs and is not compatible with the previous version. To port existing scripts to the new API, follow the description below.

WebPage object

To improve security (see issue 41), PhantomJS scripts no longer run in the context of web page execution. This means there is no way for malicious scripts to detect the presence of the ‘phantom’ object and exploit its API.

The "sandboxing" is achieved via a new WebPage object, which encapsulates a web page. A specific URL can be loaded using its open() function. Typical usage is:

var page = new WebPage();

page.open(url, function (status) {
    // do something
});

The callback passed to open() is executed when page loading is completed, with status equal to "success" if there is no error and "fail" if an error has occurred.

The above construct is a convenient version of the following:

var page = new WebPage();

page.onLoadFinished = function (status) {
    // do something
};

page.open(url);

Besides onLoadFinished, there is also onLoadStarted, which is invoked when page loading starts for the first time:

var page = new WebPage();

page.onLoadStarted = function () {
    console.log('Start loading...');
};

page.onLoadFinished = function (status) {
    console.log('Loading finished.');
};

page.open(url);
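
The two callbacks can be combined to measure how long a page takes to load. The following is a minimal sketch under a few assumptions: attachLoadTimer is a hypothetical helper (not part of the PhantomJS API), the URL is a placeholder, and the typeof guard is only there so that the helper can also be exercised outside PhantomJS.

```javascript
// Hypothetical helper: installs both callbacks on a page-like object and
// reports the load status together with the elapsed time in milliseconds.
function attachLoadTimer(page, report) {
    var startedAt = null;

    page.onLoadStarted = function () {
        startedAt = Date.now();
    };

    page.onLoadFinished = function (status) {
        // If onLoadStarted never fired, report 0 rather than a bogus value.
        var elapsed = startedAt === null ? 0 : Date.now() - startedAt;
        report(status, elapsed);
    };
}

// Only runs under PhantomJS, where WebPage and phantom are defined.
if (typeof WebPage !== 'undefined') {
    var page = new WebPage();

    attachLoadTimer(page, function (status, elapsed) {
        if (status === 'success') {
            console.log('Loaded in ' + elapsed + ' ms');
        } else {
            console.log('Loading failed.');
        }
        phantom.exit();
    });

    page.open('http://www.example.com/'); // placeholder URL
}
```

Keeping the timing logic in a separate function means it only depends on the two callback names, not on a real page.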

Page settings

The behavior of the web page can be configured through the properties of its settings object.

As an example, here is how to change the user agent:

var page = new WebPage();

page.settings.userAgent = 'Dragonless Phantom';

page.open(url, function (status) {
    // do something
});

Rasterization

A web page can be rasterized to an image or a PDF file using the render() function.

The following rasterize.js is all it takes to capture a web site:

var page = new WebPage(),
    address, output, size;

if (phantom.args.length < 2 || phantom.args.length > 3) {
    console.log('Usage: rasterize.js URL filename');
    phantom.exit();
} else {
    address = phantom.args[0];
    output = phantom.args[1];
    page.viewportSize = { width: 600, height: 600 };
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
            // Exit here as well, otherwise the script never terminates.
            phantom.exit();
        } else {
            // Give the page a moment to settle before rendering.
            window.setTimeout(function () {
                page.render(output);
                phantom.exit();
            }, 200);
        }
    });
}
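
The argument handling in the script above can also be pulled into a small function of its own. This is only a sketch: parseRasterizeArgs is a hypothetical helper, it is stricter than the script above in that it accepts exactly two arguments, and it renders immediately instead of waiting 200 ms. It also assumes that render() chooses the output format from the extension of the file name.

```javascript
// Hypothetical helper: validates the command-line arguments.
// Returns null when the arguments are unusable.
function parseRasterizeArgs(args) {
    if (args.length !== 2) {
        return null;
    }
    return { address: args[0], output: args[1] };
}

// Only runs under PhantomJS, where WebPage and phantom are defined.
if (typeof WebPage !== 'undefined') {
    var options = parseRasterizeArgs(phantom.args);

    if (options === null) {
        console.log('Usage: rasterize.js URL filename');
        phantom.exit();
    } else {
        var page = new WebPage();
        page.viewportSize = { width: 600, height: 600 };
        page.open(options.address, function (status) {
            if (status === 'success') {
                // Assumption: the extension of options.output selects the format.
                page.render(options.output);
            } else {
                console.log('Unable to load the address!');
            }
            phantom.exit();
        });
    }
}
```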

Network traffic

All resource requests and responses can be sniffed using the onResourceRequested and onResourceReceived callbacks. An example that dumps everything:

var page = new WebPage();
page.onResourceRequested = function (request) {
    console.log('Request ' + JSON.stringify(request, undefined, 4));
};
page.onResourceReceived = function (response) {
    console.log('Receive ' + JSON.stringify(response, undefined, 4));
};
page.open(url);
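
Instead of dumping every object, the same two callbacks can feed a summary. The sketch below tallies requests and responses per URL. createTrafficTally is a hypothetical helper, the URL is a placeholder, and the code assumes that both the request and the response object carry a url property. A response callback may fire more than once for a single resource, so the two counters are not necessarily equal.

```javascript
// Hypothetical helper: counts requests and responses per URL.
function createTrafficTally() {
    var counts = {};

    function entryFor(url) {
        if (!Object.prototype.hasOwnProperty.call(counts, url)) {
            counts[url] = { requested: 0, received: 0 };
        }
        return counts[url];
    }

    return {
        counts: counts,
        // Assumption: request.url and response.url hold the resource URL.
        onRequested: function (request) {
            entryFor(request.url).requested += 1;
        },
        onReceived: function (response) {
            entryFor(response.url).received += 1;
        }
    };
}

// Only runs under PhantomJS, where WebPage and phantom are defined.
if (typeof WebPage !== 'undefined') {
    var page = new WebPage();
    var tally = createTrafficTally();

    page.onResourceRequested = tally.onRequested;
    page.onResourceReceived = tally.onReceived;

    page.open('http://www.example.com/', function (status) { // placeholder URL
        console.log(JSON.stringify(tally.counts, undefined, 4));
        phantom.exit();
    });
}
```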

The included examples/netsniff.js shows how to capture and process all the resource requests and responses and export the result in HAR format.

The following shows the waterfall diagram obtained from the BBC website:

JavaScript evaluation

To evaluate JavaScript code in the context of the web page, use the evaluate() function. The execution is sandboxed: the code cannot access any JavaScript objects or variables outside its own page context. An object can be returned from evaluate(), but it is limited to simple objects and can’t contain functions or closures.

Here is an example to show the title of a web page:

var page = new WebPage();
page.open(url, function (status) {
    var title = page.evaluate(function () {
        return document.title;
    });
    console.log('Page title is ' + title);
});
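
Because only simple objects survive the trip back from evaluate(), a convenient pattern is to gather several plain values into one object. The following is a sketch under stated assumptions: collectPageInfo is a hypothetical function name, the URL is a placeholder, and the function relies only on the standard document.title and document.links properties of the page.

```javascript
// Runs inside the web page when passed to evaluate(); it can only see the
// page's own globals, such as document, and returns plain values only.
function collectPageInfo() {
    return {
        title: document.title,
        linkCount: document.links.length
    };
}

// Only runs under PhantomJS, where WebPage and phantom are defined.
if (typeof WebPage !== 'undefined') {
    var page = new WebPage();

    page.open('http://www.example.com/', function (status) { // placeholder URL
        if (status === 'success') {
            var info = page.evaluate(collectPageInfo);
            console.log(info.title + ' has ' + info.linkCount + ' links');
        }
        phantom.exit();
    });
}
```

Note that collectPageInfo must not refer to variables of the surrounding script, since those are not available inside the page.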

Console messages from a web page, including those from code inside evaluate(), are not displayed by default. To override this behavior, use the onConsoleMessage callback. The previous example can be rewritten as:

var page = new WebPage();
page.onConsoleMessage = function (msg) {
    console.log('Page title is ' + msg);
};
page.open(url, function (status) {
    page.evaluate(function () {
        console.log(document.title);
    });
});

To inject external code, use the injectJs function, passing the name of the file containing the code to be loaded. If the file cannot be found in the current directory, it is searched for in the path specified in the libraryPath property. Both the phantom and WebPage objects have an injectJs function.
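
As an illustration, several files can be injected into a page in order. This is a sketch only: injectAll is a hypothetical helper, helpers.js and report.js are placeholder file names, the URL is a placeholder, and the code assumes that injectJs returns a boolean that indicates whether the file was injected.

```javascript
// Hypothetical helper: injects each file in order and returns the names of
// the files that could not be injected.
// Assumption: injectJs() returns true on success and false otherwise.
function injectAll(target, fileNames) {
    var failed = [];
    for (var i = 0; i < fileNames.length; i += 1) {
        if (!target.injectJs(fileNames[i])) {
            failed.push(fileNames[i]);
        }
    }
    return failed;
}

// Only runs under PhantomJS, where WebPage and phantom are defined.
if (typeof WebPage !== 'undefined') {
    var page = new WebPage();

    page.open('http://www.example.com/', function (status) { // placeholder URL
        // helpers.js and report.js are placeholder file names.
        var failed = injectAll(page, ['helpers.js', 'report.js']);
        if (failed.length > 0) {
            console.log('Could not inject: ' + failed.join(', '));
        }
        phantom.exit();
    });
}
```

Since both phantom and WebPage objects provide injectJs, the same helper could be given either one as its target.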

To load an external JavaScript library, use includeJs. It behaves like the well-known dynamic script loading technique. An example:

page.includeJs("http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js", function () {
    // jQuery is loaded, now manipulate the DOM
});

Bug fixes

New features

Examples
