Databricks

Databricks, Inc.
Type: Private
Industry: Computer software
Founded: 2013
Founders: Ali Ghodsi, Andy Konwinski, Ion Stoica, Patrick Wendell, Reynold Xin, Matei Zaharia, Arsalan Tavakoli
Headquarters: San Francisco
Revenue: $813 million (2022)[1]
Website: databricks.com

Databricks is an American enterprise software company founded by the creators of Apache Spark.[2] Databricks develops a web-based platform for working with Spark that provides automated cluster management and IPython-style notebooks.

History

Databricks grew out of the AMPLab project at the University of California, Berkeley, which developed Apache Spark, an open-source distributed computing framework written in Scala. The company was founded by Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia,[3] Patrick Wendell, and Reynold Xin.

In November 2017, Databricks was announced as a first-party service on Microsoft Azure through the Azure Databricks integration.[4]

The company develops Delta Lake, an open source project aimed at bringing reliability to data lakes for machine learning and other data science use cases.[5]

In June 2020, Databricks acquired Redash, an open source tool designed to help data scientists and analysts visualize and build interactive dashboards of their data.[6]

In February 2021, together with Google Cloud, Databricks provided integration with Google Kubernetes Engine and Google's BigQuery platform.[7] Fortune ranked Databricks as one of the best large "Workplaces for Millennials" in 2021.[8] At the time, the company said more than 5,000 organizations used its products.[9]

In August 2021, Databricks closed its eighth funding round, raising $1.6 billion at a valuation of $38 billion.[10]

In October 2021, Databricks made its second acquisition, the German no-code company 8080 Labs. 8080 Labs makes bamboolib, a data exploration tool that does not require coding.[11]

Funding

In September 2013, Databricks announced it had raised $13.9 million from Andreessen Horowitz and said it aimed to offer an alternative to Google's MapReduce system.[12][13] Microsoft invested in Databricks in 2019, participating in the company's Series E for an undisclosed amount.[14][15] As of February 2021, the company had raised $1.9 billion in funding, including a $1 billion Series G led by Franklin Templeton at a $28 billion post-money valuation. Other investors include Amazon Web Services, CapitalG (a growth equity firm under Alphabet Inc.) and Salesforce Ventures.[9]

Funding Rounds
Series | Date      | Amount (million $) | Lead investor
A      | 2013      | 13.9[12]           | Andreessen Horowitz
B      | 2014      | 33[16]             | New Enterprise Associates
C      | 2016      | 60[17]             | New Enterprise Associates
D      | 2017      | 140[18]            | Andreessen Horowitz
E      | Feb. 2019 | 250[19]            | Andreessen Horowitz
F      | Oct. 2019 | 400[20]            | Andreessen Horowitz
G      | Jan. 2021 | 1,000[21]          | Franklin Templeton Investments
H      | Aug. 2021 | 1,600[22]          | Morgan Stanley
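
As a quick arithmetic check, the $1.9 billion total cited in the text corresponds to the sum of Series A through G in the funding table. The Python sketch below uses only the amounts listed there; the dictionary and function names are illustrative and not taken from any source.

```python
# Amounts in millions of US dollars, copied from the funding table (Series A-H).
rounds = {
    "A": 13.9, "B": 33, "C": 60, "D": 140,
    "E": 250, "F": 400, "G": 1000, "H": 1600,
}

def total_through(series: str) -> float:
    """Sum all rounds up to and including the given series letter."""
    return sum(amount for name, amount in rounds.items() if name <= series)

through_g = total_through("G")
through_h = total_through("H")

# Convert from millions to billions for display.
print(f"Through Series G: ${through_g / 1000:.1f} billion")
print(f"Through Series H: ${through_h / 1000:.1f} billion")
```

Adding the Series H round brings the table's cumulative total to roughly $3.5 billion.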

Products

Databricks develops and sells a cloud data platform using the marketing term "lakehouse", a portmanteau of "data warehouse" and "data lake".[23] Databricks' lakehouse is based on the open source Apache Spark framework, which allows analytical queries against semi-structured data without a traditional database schema.[24]
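
To illustrate what an analytical query over semi-structured data without a predeclared schema can look like, here is a minimal sketch in plain Python. It does not use Spark or any Databricks API; the sample records and the function name `average_by` are hypothetical. Each record's fields are discovered only when the record is read.

```python
import json
from collections import defaultdict

# Hypothetical semi-structured records: the set of fields varies per line.
raw_lines = [
    '{"user": "a", "country": "US", "latency_ms": 120}',
    '{"user": "b", "country": "DE", "latency_ms": 80, "device": "mobile"}',
    '{"user": "c", "country": "US"}',
    '{"user": "d", "country": "US", "latency_ms": 100}',
]

def average_by(lines, key, value):
    """Average `value` grouped by `key`, skipping records that lack either field."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        record = json.loads(line)  # structure is discovered while reading
        if key in record and value in record:
            sums[record[key]] += record[value]
            counts[record[key]] += 1
    return {group: sums[group] / counts[group] for group in sums}

result = average_by(raw_lines, "country", "latency_ms")
print(result)
```

No table definition is declared up front: records with extra fields are accepted, and records missing the queried fields are skipped instead of causing an error.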

Databricks' Delta Engine launched in June 2020 as a new query engine that layers on top of Delta Lake to boost query performance.[25] It is compatible with Apache Spark and MLflow, which are also open source projects from Databricks.[26]

In November 2020, Databricks introduced Databricks SQL (previously known as SQL Analytics) for running business intelligence and analytics reporting on top of data lakes. Analysts can query data sets directly with standard SQL or use product connectors to integrate directly with business intelligence tools like Tableau, Qlik, Looker, and ThoughtSpot.[27]

Databricks also offers a platform for other workloads, including machine learning, data storage and processing, streaming analytics and business intelligence.[28]

The company has also created Delta Lake, MLflow and Koalas, open source projects that span data engineering, data science and machine learning.[29] In addition to building the Databricks platform, the company has co-organized massive open online courses about Spark[30] and a conference for the Spark community called the Data + AI Summit,[31] formerly known as Spark Summit.

Operations

Databricks is headquartered in San Francisco.[32] It also has operations in Canada, the United Kingdom, the Netherlands, Singapore, Australia, Germany, France, Japan, China, India and Brazil.[citation needed]

References

  1. "Databricks reaches $813M ARR".
  2. Dwoskin, Elizabeth (June 9, 2016). "This is where the real action in artificial intelligence takes place". Washington Post. Retrieved 2016-08-16.
  3. Zaharia, Matei. "Matei Zaharia". Retrieved 2016-08-16.
  4. "Microsoft makes Databricks a first-party service on Azure". TechCrunch. Retrieved 2021-04-06.
  5. "Databricks launches Delta Lake, an open source data lake reliability project". VentureBeat. 2019-04-24. Retrieved 2021-04-06.
  6. "Databricks acquires Redash, a visualizations service for data scientists". TechCrunch. Retrieved 2021-04-06.
  7. "Databricks brings its lakehouse to Google Cloud". TechCrunch. Retrieved 2021-02-18.
  8. "100 Best Large Workplaces for Millennials". Fortune. June 16, 2021. Retrieved 2021-07-16.
  9. Konrad, Alex (February 2, 2021). "Databricks Raises $1 Billion At $28 Billion Valuation, With The Cloud's Elite All Buying In". Forbes. Retrieved July 29, 2021.
  10. Mellor, Chris (2021-09-01). "Databricks raises data lake of cash at monstrous $38bn valuation". Blocks & Files. Retrieved 2021-09-04.
  11. Rosenbaum, Eric (October 6, 2021). "$38 billion software start-up Databricks makes acquisition to leave code behind". CNBC. Retrieved February 20, 2022.
  12. Harris, Derrick (September 25, 2013). "Databricks raises $14M from Andreessen Horowitz, wants to take on MapReduce with Spark". Retrieved September 28, 2014.
  13. Lorica, Ben (September 25, 2013). "Databricks aims to build next-generation analytic tools for Big Data". O'Reilly Media. Retrieved September 28, 2014.
  14. "Databricks raises $250M at a $2.75B valuation for its analytics platform". TechCrunch. Retrieved 2021-04-08.
  15. Novet, Jordan (2019-02-05). "Microsoft used to scare start-ups but is now an 'outstandingly good partner,' says Silicon Valley investor Ben Horowitz". CNBC. Retrieved 2021-04-06.
  16. Miller, Ron (June 30, 2014). "Databricks Snags $33M In Series B And Debuts Cloud Platform For Processing Big Data". TechCrunch. Retrieved September 28, 2014.
  17. Shieber, Jonathan. "Databricks raises $60 million to be big data's next great leap forward". TechCrunch. Retrieved 2016-12-16.
  18. "Databricks Secures $140 Million to Accelerate Analytics and Artificial Intelligence in the Enterprise". Databricks. Retrieved 2019-05-16.
  19. "Databricks' $250 Million Funding Supports Explosive Growth and Global Demand for Unified Analytics; Brings Valuation to $2.75 Billion". Databricks. Retrieved 2019-02-05.
  20. "Databricks announces $400M round on $6.2B valuation as analytics platform continues to grow". TechCrunch. Retrieved 2019-10-24.
  21. "Databricks raises $1B at $28B valuation as it reaches $425M ARR". TechCrunch. Retrieved 2021-02-14.
  22. "Databricks raises $1.6B at $38B valuation as it blasts past $600M ARR". TechCrunch. Retrieved 2021-07-01.
  23. Armbrust, Michael; Ghodsi, Ali; Xin, Reynold; Zaharia, Matei (January 2021). "Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics" (PDF). Conference on Innovative Data Systems Research. Retrieved July 29, 2021.
  24. "With massive $1B infusion, Databricks takes aim at IPO and rival Snowflake". SiliconANGLE. 2021-02-01. Retrieved 2021-04-08.
  25. "Databricks Cranks Delta Lake Performance, Nabs Redash for SQL Viz". Datanami. 2020-06-24. Retrieved 2021-04-08.
  26. "Databricks launches Delta Lake, an open source data lake reliability project". VentureBeat. 2019-04-24. Retrieved 2021-04-08.
  27. "Databricks launches SQL Analytics". TechCrunch. Retrieved 2021-04-08.
  28. Brust, Andrew. "Databricks, champion of data "lakehouse" model, closes $1B series G funding round". ZDNet. Retrieved 2021-04-08.
  29. "The Two Sigma Ventures Open Source Index". Two Sigma Ventures. Retrieved 2021-04-08.
  30. "Databricks to run two massive online courses on Apache Spark". Databricks. 2014-12-02. Retrieved 2016-12-16.
  31. "Data + AI Summit". Databricks. Retrieved 2021-04-08.
  32. CNBC.com staff (2020-06-16). "36. Databricks". CNBC. Retrieved 2021-04-08.