Type: Technology startup
Founded: 2012
Founders: David White (CEO), Matthew Painter (CTO), Andrew Fogg (CDO)
Headquarters: London, UK; San Francisco, US

Import.io is a web-based platform for extracting data from websites without writing any code. The tool lets people create an API through its point-and-click interface.

Users navigate to a website and teach the app to extract data by highlighting examples of data on the page. Learning algorithms then generalise from these examples to work out how to get all the data on the website. The data that users collect is stored on Import.io's cloud servers and can be downloaded as CSV, Excel, Google Sheets or JSON and shared. Users can also generate an API from the data, allowing them to integrate live web data into their own applications or into third-party analytics and visualization software. For more technical users, Import.io offers real-time data retrieval through JSON REST-based and streaming APIs, integration with several common programming languages and data manipulation tools, and a federation platform that allows up to 100 data sources to be queried simultaneously.
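As a rough illustration of the export formats mentioned above, the following Python sketch converts a JSON payload of extracted rows into CSV. The payload shape, the field names and the `rows_to_csv` helper are assumptions made for illustration; they are not taken from the platform's actual API.

```python
import csv
import io
import json

# Hypothetical payload: a list of extracted rows, one JSON object per row.
SAMPLE_JSON = """
[
  {"title": "Widget A", "price": 9.99, "url": "https://example.com/a"},
  {"title": "Widget B", "price": 14.5, "url": "https://example.com/b"}
]
"""


def rows_to_csv(rows):
    """Serialise a list of dicts to CSV text.

    Columns are the union of all keys, in first-seen order; missing
    values are written as empty cells.
    """
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(json.loads(SAMPLE_JSON)))
```

The same rows could just as well be written to a spreadsheet format; CSV is used here only because the Python standard library handles it directly.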


The company has offices in London and San Francisco. It was founded by David White, Andrew Fogg and Matthew Painter, incorporated in June 2012, and launched in beta in September 2013.[1]

Awards

Import.io has won a number of startup awards, including Best Startup from O'Reilly Strata Santa Clara,[2] GigaOM[3] and Web Summit.[4]

Funding

Import.io has raised a total of $4.3M[5] from its founders, from angel investors David Axmark (co-founder of MySQL), Andy McLoughlin (co-founder of Huddle), Emmanuel Javal and Louis Monier (co-founder of AltaVista), and from the venture capital firms Open Ocean Capital, AME Cloud Ventures (the fund of Yahoo founder Jerry Yang) and Wellington Partners.[6]

Features

Import.io has a number of features:[7]

  • Extractors - Turn unstructured and semi-structured data from similar web pages into a structured Dataset
  • Crawlers - Convert an entire website's worth of pages into a structured database
  • Connector - Convert any website's search box into a queryable API
  • Table Auto Extract - Extract a table of data automatically
  • Authenticated APIs - Extract data from a site that requires you to log in
  • Throughput - Fast, parallelized data acquisition distributed automatically by the platform's cloud architecture
  • Uptime - High availability even for high volume usage
  • Federation - Query up to 100 unique sources with the same set of inputs to create your own aggregated search
  • Data Extraction Formats - Data can be extracted as text, numbers, locations, URLs, images and more; for easier sorting, filtering, and metadata retrieval
  • Dataset - Combine up to 100 unique data sources into one Dataset
  • Client Libraries - Leverage expert-approved client libraries for your language
  • Integrations - Generate example code to integrate with your own data sources in the language of your choice
  • Tests - All data sources are automatically tested twice a day, and users are notified if any issues are detected
  • Storage - All of a user's data sources are listed in the platform's cloud search infrastructure, so they can be searched, sorted and filtered quickly
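The Federation feature above, which sends the same inputs to many sources and aggregates the results, can be sketched in Python as follows. This is a minimal illustration and not the platform's implementation: `federated_query`, the source callables and the row fields are all hypothetical, and only the limit of 100 sources comes from the feature list.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_SOURCES = 100  # limit quoted in the feature list


def federated_query(sources, query, max_workers=8):
    """Run the same query against every source and merge the rows.

    `sources` maps a source name to a callable that takes the query
    and returns a list of row dicts (a hypothetical interface).
    """
    if len(sources) > MAX_SOURCES:
        raise ValueError(f"at most {MAX_SOURCES} sources can be federated")
    if not sources:
        return []

    merged = []
    workers = min(max_workers, len(sources))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every source first so the calls run concurrently.
        futures = {name: pool.submit(fetch, query) for name, fetch in sources.items()}
        # Collect in insertion order so the output is deterministic.
        for name, future in futures.items():
            for row in future.result():
                merged.append({"source": name, **row})
    return merged


# Stand-in sources; a real source would call a remote API instead.
def shop_a(query):
    return [{"title": f"{query} from A", "price": 10}]


def shop_b(query):
    return [{"title": f"{query} from B", "price": 12}]


if __name__ == "__main__":
    for row in federated_query({"a": shop_a, "b": shop_b}, "lamp"):
        print(row)
```

Tagging each row with its source name keeps the aggregated result traceable back to the site it came from.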


  1. ^ Sarah Marshal, "Data scraping tool for non-coding journalists launches", September 9, 2013
  2. ^ "Winning and Strata Santa Clara", Import.io blog, March 6, 2013
  3. ^ Stacey Higginbotham, "Import.io wins the Structure:Europe 2013 Launchpad", GigaOm, September 18, 2013
  4. ^ Sieuwert van Otterloo, "Placemeter and Import IO winners of the Websummit startup competition", StartUpJuncture, November 1, 2013
  5. ^ Crunchbase, "Import.io Crunchbase profile"
  6. ^ Mike Butcher, "Import.io raises $3m seed round backed by Yahoo and MySQL founders", Venture Beat, September 9, 2014
  7. ^ [1], "Import.io features"
