Headless browser

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Citation bot at 12:57, 2 June 2020. It may differ significantly from the current revision.

A headless browser is a web browser without a graphical user interface.

Headless browsers provide automated control of a web page in an environment similar to popular web browsers, but are driven through a command-line interface or over network communication. They are particularly useful for testing web pages because they render and interpret HTML the same way a browser would, including styling such as page layout, colour and font selection, and they execute JavaScript and AJAX, which are usually not available when using other testing methods.[1][2]
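
As an illustration of command-line control, the following Python sketch asks a headless Chrome or Chromium build to load a page, run its scripts and print the resulting DOM. The executable names in `CHROME_CANDIDATES` and the helper names are assumptions made for this example; installations differ, and the function returns `None` when no browser is found.

```python
import shutil
import subprocess

# Candidate executable names are assumptions; they vary by platform and install.
CHROME_CANDIDATES = ("google-chrome", "chromium", "chromium-browser", "chrome")


def find_chrome():
    """Return the path of the first Chrome/Chromium binary on PATH, or None."""
    for name in CHROME_CANDIDATES:
        path = shutil.which(name)
        if path:
            return path
    return None


def dump_dom_command(binary, url):
    """Build the argument list that prints the page's DOM after scripts have run."""
    return [binary, "--headless", "--dump-dom", url]


def rendered_html(url, timeout=30):
    """Return the serialized DOM of `url`, or None if no browser is installed."""
    binary = find_chrome()
    if binary is None:
        return None
    result = subprocess.run(
        dump_dom_command(binary, url),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return result.stdout
```

Because the markup is produced after script execution, a test can assert on content that a plain HTTP fetch of the same URL would not contain.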

Use cases

Headless browsers are used for:[3][4]

  • Test automation in modern web applications.
  • Taking screenshots of web pages.
  • Running automated tests for JavaScript libraries.
  • Scraping web sites for data.
  • Automating interaction with web pages.
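
The screenshot use case can be sketched with Chrome's headless command-line flags. The function names and the default viewport size are assumptions chosen for this example, and `binary` must point to an installed Chrome or Chromium executable.

```python
import subprocess


def screenshot_command(binary, url, output_path, width=1280, height=720):
    """Build a Chrome command line that saves a PNG screenshot of `url`."""
    return [
        binary,
        "--headless",
        f"--window-size={width},{height}",  # viewport used for layout
        f"--screenshot={output_path}",      # file the PNG is written to
        url,
    ]


def take_screenshot(binary, url, output_path, timeout=60):
    """Run the command; raises CalledProcessError if the browser exits non-zero."""
    subprocess.run(
        screenshot_command(binary, url, output_path),
        check=True, timeout=timeout,
    )
```

Keeping command construction separate from execution lets the arguments be checked in a test without launching a browser.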

Google stated in 2009 that using a headless browser could help their search engine index content from websites that use AJAX.[5]

Malicious

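Headless browsers can also be used to:

  • Perform DDoS attacks on web sites.[6]
  • Increase advertisement impressions.[7]
  • Automate web sites in unintended ways, e.g. for credential stuffing.[8][9]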

List of headless browsers

This is a list of browsers providing a complete or near-complete headless implementation.

  • Google Chrome – headless mode is available on Linux and macOS since version 59 and on Windows since version 60.[10]
  • Firefox – headless mode is available on Linux since version 55.[11] Version 56 added support for headless mode on Windows and macOS.[12]
  • PhantomJS – a headless web browser using the WebKit layout engine for rendering web pages and JavaScriptCore for executing scripted tests. PhantomJS was originally developed by Ariya Hidayat in 2010 and gained a wide following and an extensive development ecosystem. However, the project has since been archived and is no longer under active development.[13][14][15][16][17][18]
  • HtmlUnit – a headless browser written in Java. HtmlUnit uses the Rhino engine to provide JavaScript and AJAX support as well as partial rendering capability.[19][20]
  • TrifleJS – a scriptable headless Internet Explorer browser using the Trident layout engine for rendering pages and the V8 JavaScript engine for executing scripted tests. TrifleJS uses the same API as PhantomJS and works by using the .NET WebBrowser object to control whatever version of IE is installed on the machine.[4][21]
  • Splash – a headless web browser with an HTTP API, Lua scripting support and a built-in IPython (Jupyter)-based IDE. Splash is written in Python and uses the WebKit layout engine. Development started at ScrapingHub in 2013; it is partially funded by DARPA.[22][23]
  • SimpleBrowser – a lightweight headless web browser with a scriptable .NET Standard API. SimpleBrowser is written in C# and supports .NET Standard 2.0.
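
Control over network communication can be illustrated with Splash's HTTP API. The sketch below builds a request for its `render.html` endpoint, which returns the page's HTML after rendering. The address `http://localhost:8050` and the helper names are assumptions for this example; a Splash instance must already be running for `fetch_rendered_html` to succeed.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a locally running Splash instance; adjust to your deployment.
SPLASH_BASE = "http://localhost:8050"


def render_html_url(page_url, wait=0.5, base=SPLASH_BASE):
    """Build a render.html request URL; `wait` is seconds to pause after load."""
    query = urlencode({"url": page_url, "wait": wait})
    return f"{base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, base=SPLASH_BASE, timeout=30):
    """Ask Splash to render `page_url` and return the resulting HTML as text."""
    request_url = render_html_url(page_url, wait, base)
    with urlopen(request_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Since the browser sits behind a plain HTTP endpoint, any client that can issue a GET request can drive it, regardless of programming language.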

Simulated

These are browsers that simulate a browser environment. While they support common browser features (HTML parsing, cookies, XHR, some JavaScript, etc.), they do not render the DOM and have limited support for DOM events. They usually run faster than full browsers, but are unable to correctly interpret many popular websites.[24][25][26]

  • Zombie.js – a simulated browser environment for Node.js.[27]
  • ENVJS – a simulated browser environment written in JavaScript for the Rhino engine.[28]
  • Edbrowse (limited DOM support)
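
The limitation of a simulated environment can be shown with a deliberately minimal sketch that uses only Python's standard HTML parser. It does not reproduce any of the projects listed above; the class name, function name and sample page are hypothetical. The parser sees the markup delivered by the server, but never runs the script, so the link the script would add is missing.

```python
from html.parser import HTMLParser


class PageSummary(HTMLParser):
    """Collect links and count scripts without rendering or executing anything."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.script_count = 0

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "script":
            self.script_count += 1


def summarize(html):
    """Return (links, script_count) found in the static markup."""
    parser = PageSummary()
    parser.feed(html)
    parser.close()
    return parser.links, parser.script_count


# Hypothetical page: one static link, plus a script that would add a second one.
SAMPLE = """
<html><body>
  <a href="/static">Static link</a>
  <script>
    var a = document.createElement('a');
    a.href = '/dynamic';
    document.body.appendChild(a);
  </script>
</body></html>
"""

links, scripts = summarize(SAMPLE)
print(links)    # ['/static']  (the script-generated link is absent)
print(scripts)  # 1
```

A full headless browser loading the same page would execute the script and expose both links, which is why simulated browsers misinterpret sites that build their content in JavaScript.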

Scriptable

These are browsers that may still require a user interface but have programmatic APIs and are intended to be used in ways similar to traditional headless browsers.

References

  1. "What is a headless browser?". arhg.net.
  2. "Quick Start". phantomjs.org.
  3. "PhantomJS - PhantomJS". phantomjs.org.
  4. "trifleJS".
  5. "Official Google Webmaster Central Blog: A proposal for making AJAX crawlable". Official Google Webmaster Central Blog.
  6. "Headless Browser Botnet Used in 150 hour DDoS attack". Business 2 Community.
  7. "Headless Web Traffic Threatens Internet Economy". ecommercetimes.com.
  8. "Headless browsers: legitimate software that enables attack". ITProPortal.
  9. "Credential stuffing". owasp.org.
  10. "Getting Started with Headless Chrome". developers.google.com.
  11. "Headless mode - browser support". developer.mozilla.org.
  12. "Firefox 56 release notes". developer.mozilla.org.
  13. "PhantomJS - PhantomJS". phantomjs.org.
  14. "FAQ". phantomjs.org.
  15. "Google Groups". google.com.
  16. "Commits · ariya/phantomjs · GitHub". GitHub.
  17. "ariya/phantomjs". GitHub.
  18. "Archiving the project: suspending the development · Issue #15344 · ariya/phantomjs". GitHub. Retrieved 2018-12-05.
  19. Mike Bowler. "HtmlUnit – Welcome to HtmlUnit". sourceforge.net.
  20. "Platform (Vaadin 7.3.4 API)". vaadin.com. 6 November 2014.
  21. "Home". GitHub.
  22. "scrapinghub/splash". GitHub.
  23. "Archived copy". Archived from the original on 2015-05-28. Retrieved 2015-05-28.
  24. "assaf/zombie". GitHub.
  25. "ヘルペスが口や目からうつる?感染した時の症状と病院の治療方法とは". www.envjs.com. Archived from the original on 2015-02-23. Retrieved 2015-03-13.
  26. "JavaScriptMVC - EnvJS". javascriptmvc.com.
  27. "Zombie". labnotes.org.
  28. Resig, John (29 January 2018). "env-js: A pure-JavaScript browser environment" – via GitHub.
  29. Laurent Jouanneau. "SlimerJS". slimerjs.org.