webpage

This module provides a create() function that returns a webpage object. This object lets you load and manipulate a web page.

var page = require("webpage").create();

The page variable then holds an object with many properties and methods, described below.

Note: most properties and methods are implemented, but not all are documented yet. Help us document them ;-)

Webpage object

Properties list:

clipRect canGoBack canGoForward content cookies customHeaders event focusedFrameName frameContent frameName framePlainText frameTitle frameUrl framesCount framesName libraryPath navigationLocked offlineStoragePath offlineStorageQuota ownsPages pages pagesWindowName paperSize plainText scrollPosition settings title url viewportSize windowName zoomFactor

Functions list:

addCookie() childFramesCount() childFramesName() clearCookies() close() currentFrameName() deleteCookie() evaluateJavaScript() evaluate() evaluateAsync() getPage() go() goBack() goForward() includeJs() injectJs() open() openUrl() release() reload() render() renderBase64() sendEvent() setContent() stop() switchToFocusedFrame() switchToFrame() switchToChildFrame() switchToMainFrame() switchToParentFrame() uploadFile()

Callbacks list:

onAlert onCallback onClosing onConfirm onConsoleMessage onError onFilePicker onInitialized onLoadFinished onLoadStarted onNavigationRequested onPageCreated onPrompt onResourceRequested onResourceReceived onUrlChanged

Internal methods to trigger callbacks:

closing() initialized() javaScriptAlertSent() javaScriptConsoleMessageSent() loadFinished() loadStarted() navigationRequested() rawPageCreated() resourceReceived() resourceRequested() urlChanged()

clipRect

This is an object indicating the coordinates of an area to capture, used by the render() method. It contains four properties: top, left, width, height.

To modify it, set an entire object on this property.

page.clipRect = { top: 14, left: 3, width: 400, height: 300 };

canGoBack

canGoForward

content

This property contains the source code of the current web page. You can assign the source code of an HTML page to this property to replace the content of the current web page.

cookies

This is an array of all Cookie objects that are stored in the current profile and correspond to the current URL of the web page.

When you assign an array of Cookie objects to this property, the cookies are set for the current URL: their domain and path properties are changed accordingly.

Note: modifying an object in the array won’t modify the cookie. You should retrieve the array, modify it, and then assign it back to the cookies property. To modify a single cookie, you will probably prefer the addCookie() method.

If cookies are disabled, modifying this property does nothing.

Be careful about the inconsistent behavior of the expiry property.
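The retrieve, modify, reassign cycle described in the note can be sketched as follows. The helper name setCookieValue is hypothetical, and the sketch assumes that each Cookie object exposes name and value properties.

```javascript
// Hypothetical helper: change the value of every cookie called `name`.
// Assumes Cookie objects expose `name` and `value` properties.
function setCookieValue(page, name, value) {
    var cookies = page.cookies;          // retrieve the array
    var changed = 0;
    cookies.forEach(function (cookie) {
        if (cookie.name === name) {
            cookie.value = value;        // changes the local object only
            changed++;
        }
    });
    if (changed > 0) {
        page.cookies = cookies;          // assign the array back to apply it
    }
    return changed;                      // number of cookies updated
}

// In a SlimerJS script:
//   var page = require("webpage").create();
//   setCookieValue(page, "session", "abc123");
```

Nothing is written back when no cookie matches, so the stored cookies stay untouched in that case.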

customHeaders

This property is an object defining additional HTTP headers that will be sent with each HTTP request, both for pages and resources.

Example:

page.customHeaders = {
    "foo": "bar"
};

To set the user agent, prefer webpage.settings.userAgent.

event

This is a read-only object that holds constants to use with sendEvent().

There is a modifier property containing constants for key modifiers:

page.event.modifier.shift
page.event.modifier.ctrl
page.event.modifier.alt
page.event.modifier.meta
page.event.modifier.keypad

There is a key property containing constants for key codes.
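As a sketch of how these constants might be used: in PhantomJS, the modifier constants are bit flags that can be combined with a bitwise OR and passed as the last argument of sendEvent(). The example below assumes SlimerJS behaves the same way; combineModifiers is a hypothetical helper, not part of the API.

```javascript
// Hypothetical helper: OR together the named modifier constants.
// Assumes the constants are numeric bit flags, as in PhantomJS.
function combineModifiers(page, names) {
    return names.reduce(function (mask, name) {
        var flag = page.event.modifier[name];
        if (typeof flag !== "number") {
            throw new Error("Unknown modifier: " + name);
        }
        return mask | flag;
    }, 0);
}

// In a SlimerJS script (PhantomJS-style sendEvent() signature assumed):
//   var mods = combineModifiers(page, ["ctrl", "shift"]);
//   page.sendEvent("keypress", page.event.key.A, null, null, mods);
```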

focusedFrameName

frameContent

This property contains the source code of the current frame. You can assign the source code of an HTML page to this property to replace the content of the current frame.

frameName

framePlainText

frameTitle

frameUrl

framesCount

framesName

libraryPath

offlineStoragePath

Indicates the path of the SQLite file where the content of window.localStorage is stored. Read only.

Note: in PhantomJS, this is the path of a directory; its storage differs from the one in Gecko. Unlike in PhantomJS, this property cannot be changed with the --local-storage-path command-line flag.

offlineStorageQuota

Contains the maximum size of the data that a page can store in window.localStorage. The value is in bytes. The default is 5 242 880 (5 MB). Read only.

To change this number, use the --local-storage-quota flag on the command line.

ownsPages

This boolean indicates whether pages opened by the web page (with window.open()) should be children of the webpage object (true) or not (false). The default is true.

When it is true, child pages appear in the pages property.

pages

This is the list of child pages that the page has currently opened with window.open().

If a child page is closed (by window.close() or by webpage.close()), the page is automatically removed from this list.

Do not keep a long-lived reference to this array: you only get a copy, so later changes will not show up in it.

If ownsPages is false, this list won’t contain the child pages.

pagesWindowName

List of the window names (strings) of child pages.

The window name is the name given to window.open().

The list only includes child pages that were created while ownsPages was true.

paperSize

plainText

scrollPosition

This property contains an object indicating the scrolling position. You can read or modify it. The object contains two properties: top and left.

Example:

page.scrollPosition = { top: 100, left: 0 };

settings

This property lets you set options for loading a page. Changing them after the page has loaded has no effect.

  • javascriptEnabled (not supported yet)
  • javascriptCanCloseWindows (not supported yet)
  • javascriptCanOpenWindows (not supported yet)
  • loadImages (not supported yet)
  • localToRemoteUrlAccessEnabled (not supported yet)
  • maxAuthAttempts (not supported yet)
  • password (not supported yet)
  • userAgent: string defining the user agent sent in HTTP requests. By default, it is something like "Mozilla/5.0 (X11; Linux x86_64; rv:21.0) Gecko/20100101 SlimerJS/0.7" (depending on the version of Firefox/XulRunner you use)
  • userName (not supported yet)
  • XSSAuditingEnabled (not supported yet)
  • webSecurityEnabled (not supported yet)

page.settings.userAgent = "My Super Agent / 1.0";

title

This property contains the title of the loaded page. Read only.

url

This property contains the current url of the page. If nothing is loaded yet, this is an empty string. Read only.

viewportSize

windowName

zoomFactor

childFramesCount()

childFramesName()

clearCookies()

Delete all cookies corresponding to the current url.

close()

currentFrameName()

deleteCookie(cookiename)

It deletes all cookies that have the given name and correspond to the current URL.

It returns true if some cookies have been deleted. It works only if cookies are enabled.

evaluateJavaScript()

evaluate()

evaluateAsync()

getPage(windowName)

This method returns the child page that matches the given “window.name”.

Only children opened while ownsPages was true are checked.
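A minimal sketch of looking up a child page by its window name. The helper name childTitle is hypothetical, and the sketch assumes that getPage() returns a falsy value (such as null) when no child matches.

```javascript
// Hypothetical helper: return the title of the child page that was opened
// with window.open(url, windowName), or null if there is no such child.
function childTitle(page, windowName) {
    // Ask the parent page each time instead of caching page.pages,
    // because page.pages only returns a copy of the list.
    var child = page.getPage(windowName);
    return child ? child.title : null;
}

// In a SlimerJS script, after the page has called window.open(url, "popup"):
//   console.log(childTitle(page, "popup"));
```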

go()

goBack()

goForward()

includeJs()

injectJs()

open(url...)

This method opens a page in a virtual browser.

Since this operation is asynchronous, you cannot work on the page immediately after calling open(). Provide a callback, or use the returned promise (not compatible with PhantomJS), to act on the loaded page. The callback or the promise receives the string “success” if the page has loaded successfully.

Example with a callback function:

page.open("http://slimerjs.org", function(status){
     if (status == "success") {
         console.log("The title of the page is: "+ page.title);
     }
     else {
         console.log("Sorry, the page is not loaded");
     }
})

Example with the returned promise (not compatible with PhantomJS):

page.open("http://slimerjs.org")
    .then(function(status){
         if (status == "success") {
             console.log("The title of the page is: "+ page.title);
         }
         else {
             console.log("Sorry, the page is not loaded");
         }
    })

To load two pages one after another, nest the calls:

page.open("http://example.com/page1", function(status){
     // do something on the page...

     page.open("http://example.com/page2", function(status){
         // do something on the page...
     })
})

With the promise, the code is cleaner (not compatible with PhantomJS):

page.open("http://example.com/page1")
    .then(function(status){
        // do something on the page...

        return page.open("http://example.com/page2")
    })
    .then(function(status){
        // do something on the page...

        // etc...
        return page.open("http://example.com/page3")
    })

Other arguments:

The open() method accepts several forms of arguments:

  • open(url)
  • open(url, callback)
  • open(url, httpConf)
  • open(url, httpConf, callback)
  • open(url, operation, data)
  • open(url, operation, data, callback)
  • open(url, operation, data, headers, callback)

Remember that in all cases, the method returns a promise.

httpConf is an object. See webpage.openUrl below. operation, data and headers take the same kinds of values as the corresponding properties of httpConf.

Note that open() actually calls openUrl().
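As an illustration of the open(url, operation, data, callback) form, here is a sketch that sends a URL-encoded form body with a POST request. The helpers encodeForm and postForm are hypothetical names, and the URL in the usage comment is a placeholder.

```javascript
// Hypothetical helper: encode a plain object as a URL-encoded form body.
function encodeForm(fields) {
    return Object.keys(fields).map(function (key) {
        return encodeURIComponent(key) + "=" + encodeURIComponent(fields[key]);
    }).join("&");
}

// Hypothetical helper: POST the fields using the
// open(url, operation, data, callback) form described above.
function postForm(page, url, fields, callback) {
    return page.open(url, "post", encodeForm(fields), callback);
}

// In a SlimerJS script:
//   postForm(page, "http://example.com/login", { user: "bob", lang: "en" },
//            function (status) { console.log("Load status: " + status); });
```

If the server needs an explicit Content-Type header, the form that takes a headers argument, or openUrl(), can supply one.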

openUrl(url, httpConf, settings, callback)

Like open(), it loads a webpage. The only difference is the number and the type of arguments.

httpConf is an object with these properties:

  • httpConf.operation: the HTTP method. Allowed values: 'get' or 'post' (other methods are not supported in SlimerJS)
  • httpConf.data: the body. Useful only for the 'post' method
  • httpConf.headers: the headers to send. An object like webpage.customHeaders, but it doesn’t replace webpage.customHeaders. It lets you specify additional headers for this specific load.

httpConf is optional and you can give null instead of an object. The default method is 'get', with no data and no specific headers.

settings is an object like webpage.settings. In fact the given value changes webpage.settings. You can pass null if you don’t want to set new settings.

callback is a callback function, called when the page is loaded.

openUrl() returns a promise.
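The following sketch builds an httpConf object and passes it to openUrl(), with null as the settings argument so that webpage.settings is left unchanged. The helper names jsonPostConf and openWithConf are hypothetical, and the JSON body is only an example of what data might contain.

```javascript
// Hypothetical helper: build an httpConf object for a POST request
// that carries a JSON body.
function jsonPostConf(payload) {
    return {
        operation: "post",
        data: JSON.stringify(payload),
        headers: { "Content-Type": "application/json" }
    };
}

// Hypothetical helper: load `url` with the given httpConf; null is passed
// as the settings argument so that no new settings are applied.
function openWithConf(page, url, httpConf, callback) {
    return page.openUrl(url, httpConf, null, callback);
}

// In a SlimerJS script:
//   openWithConf(page, "http://example.com/api", jsonPostConf({ id: 7 }),
//                function (status) { console.log("Load status: " + status); });
```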

release()

Similar to close(). This method is deprecated in PhantomJS. webpage.close() should be used instead.

reload()

render(filename, options)

This method takes a screenshot of the web page and stores it into the given file. You can limit the area to capture by setting the clipRect property.

By default, it determines the format of the file by inspecting its extension. Only the jpg and png formats are supported (PDF and gif will probably come in a future version).

The second parameter is an object containing options. Here are its possible properties:

  • format: the file format (the file extension is then ignored). Possible values: jpg, png, jpeg.
  • quality: the compression quality. A number between 0 and 1.
  • ratio: (SlimerJS only), a number between 0 and 1, indicating the “zoom level” of the capture.
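A short sketch combining clipRect with the options object. The helper name renderRegion and the file name in the usage comment are hypothetical; the ratio option is SlimerJS only, as noted above.

```javascript
// Hypothetical helper: save a half-size PNG capture of one region of the page.
function renderRegion(page, filename, rect) {
    // Limit the captured area to the given rectangle.
    page.clipRect = rect;
    // Force the PNG format, whatever the extension of `filename` is,
    // and capture at half the normal zoom level (SlimerJS only).
    page.render(filename, { format: "png", ratio: 0.5 });
}

// In a SlimerJS script, once the page is loaded:
//   renderRegion(page, "header.png", { top: 0, left: 0, width: 800, height: 120 });
```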

renderBase64(format)

This method takes a screenshot of the web page and returns it as a string containing the image in base64. The format indicates the format of the image: jpg, png, jpeg.

You can limit the area to capture by setting the clipRect property.

Instead of giving the format, you can give an object containing options (SlimerJS only). See the render() function.
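One possible use of the returned string is to build a data: URI. In this sketch, captureDataUri is a hypothetical helper, and the mapping from the format name to a MIME type is an assumption of the example rather than something renderBase64() provides.

```javascript
// Hypothetical helper: capture the page and wrap the base64 string
// in a data: URI. "jpg" and "jpeg" are mapped to the image/jpeg MIME type.
function captureDataUri(page, format) {
    var mime = (format === "png") ? "image/png" : "image/jpeg";
    return "data:" + mime + ";base64," + page.renderBase64(format);
}

// In a SlimerJS script, once the page is loaded:
//   var uri = captureDataUri(page, "png");
```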

sendEvent()

setContent(content, url)

This method replaces the content of the current page with the given HTML source code. The url argument indicates the URL of this new content.

stop()

It stops the loading of the page.

switchToFocusedFrame()

switchToFrame()

switchToChildFrame()

switchToMainFrame()

switchToParentFrame()

uploadFile(selector, filename)

A form may contain an <input type="file"> element. Because SlimerJS is a scriptable browser, you cannot manipulate the file picker that opens when you click on this element. uploadFile() lets you set the value of such elements.

Arguments are the CSS selector of the input element, and the full path of the file. The file must exist. You can also give an array of paths, if the input element accepts several files.

Note that a virtual file picker is opened when calling uploadFile(), and so the onFilePicker callback is called. If this callback exists and returns a filename, the filename given to uploadFile() is ignored.

onAlert

onCallback

onClosing

onConfirm

onConsoleMessage

onError

onFilePicker

This callback is called when the browser needs to open a file picker. This is the case when a click is made on an <input type="file"> element.

The callback receives the previously selected file, and should return the path of the newly selected file. If the target element accepts several files, you can return an array of file paths.
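A minimal sketch of such a callback, which answers every file picker with the same path or paths. The helper name answerFilePicker and the path in the usage comment are hypothetical.

```javascript
// Hypothetical helper: answer every file picker with the same path(s).
// `paths` may be a string, or an array of strings for inputs that
// accept several files.
function answerFilePicker(page, paths) {
    page.onFilePicker = function (previousFile) {
        // previousFile is the previously selected file; it is ignored here.
        return paths;
    };
}

// In a SlimerJS script:
//   answerFilePicker(page, "/tmp/avatar.png");
```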

onInitialized

onLoadFinished

onLoadStarted

onNavigationRequested

onPageCreated

onPrompt

onResourceReceived

This callback is invoked when the browser has received part of a resource. It can be called several times, with multiple chunks of data, during the load of this resource. A resource can be the web page itself, or any other resource such as images, frames, CSS files etc.

The only parameter received by the callback is an object containing this information:

  • id: the number of the requested resource
  • url: the url of the resource
  • time: a Date object
  • headers: the list of headers (list of objects {name:'', value:''})
  • bodySize: the size of the received content (may increase across multiple calls of the callback)
  • contentType: the content type of the resource
  • contentCharset: the charset used for the content of the resource
  • redirectURL: if the request has been redirected, this is the redirected url
  • stage: “start”, “end” or “” for an intermediate chunk of data
  • status: the HTTP response code (200..)
  • statusText: the HTTP response text for the status (“Ok”...)
  • referrer: the referrer URL (SlimerJS only)
  • body: the content; it may change across multiple calls for the same request (SlimerJS only).

page.onResourceReceived = function(response) {
    console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};

onResourceRequested

This callback is invoked when the browser starts to load a resource. A resource can be the web page itself, or any other resources like images, frames, css files etc.

The callback may accept two parameters:

  • requestData, a metadata object containing information about the resource
  • networkRequest, an object to manipulate the network request.

page.onResourceRequested = function(requestData, networkRequest) {
    console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
};

Properties of requestData are:

  • id: the number of the requested resource
  • method: the http method (“get”, “post”..)
  • url: the url of the resource
  • time: a Date object
  • headers: the list of headers (list of objects {name:'', value:''})

The networkRequest object has two methods:

  • abort(): call it to cancel the request. onResourceReceived and onLoadFinished will be called.
  • changeUrl(url): abort the current request and do an immediate redirection to the given url.
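As an illustration of abort(), the sketch below cancels every request whose URL matches a pattern. The helper name blockRequests and the pattern in the usage comment are hypothetical.

```javascript
// Hypothetical helper: cancel every request whose URL matches `pattern`.
// Returns an array that collects the URLs of the cancelled requests.
function blockRequests(page, pattern) {
    var blocked = [];
    page.onResourceRequested = function (requestData, networkRequest) {
        if (pattern.test(requestData.url)) {
            blocked.push(requestData.url);
            networkRequest.abort();
        }
    };
    return blocked;
}

// In a SlimerJS script, to skip common image files:
//   var skipped = blockRequests(page, /\.(png|jpe?g|gif)(\?|$)/i);
```

The pattern should not use the g flag, because a global regular expression keeps state between calls to test().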

onUrlChanged

closing(page)

Call the callback onClosing with given parameters, if the callback has been set.

initialized()

Call the callback onInitialized if it has been set.

javaScriptAlertSent(message)

Call the callback onAlert with given parameters, if the callback has been set.

javaScriptConsoleMessageSent(message, lineNumber, fileName)

Call the callback onConsoleMessage with given parameters, if the callback has been set.

loadFinished(status, url, isFrame)

Call the callback onLoadFinished with given parameters, if the callback has been set.

loadStarted(url, isFrame)

Call the callback onLoadStarted with given parameters, if the callback has been set.

rawPageCreated(page)

Call the callback onPageCreated with given parameters, if the callback has been set.

resourceReceived(response)

Call the callback onResourceReceived with given parameters, if the callback has been set.

resourceRequested(requestData, networkRequest)

Call the callback onResourceRequested with given parameters, if the callback has been set.

urlChanged(url)

Call the callback onUrlChanged with given parameters, if the callback has been set.
