Apache Kylin

Developer(s): Apache Kylin Committee
Initial release: June 10, 2015[1]
Stable release: 3.1.0 / July 2, 2020[2]
Preview release: 4.0.0-alpha / September 13, 2020[3]
Repository: Kylin Repository
Written in: Java
License: Apache License 2.0
Website: kylin.apache.org

Apache Kylin is an open-source distributed analytics engine that provides a SQL interface and multi-dimensional analysis (OLAP) over extremely large datasets stored on Hadoop and Alluxio.

It was originally developed by eBay, and is now a project of the Apache Software Foundation.[4]
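
As an illustration of the SQL interface, the sketch below builds (but does not send) an HTTP request that submits a SQL statement to a Kylin server. The endpoint path `/kylin/api/query`, the port `7070`, the `ADMIN`/`KYLIN` credentials, the `learn_kylin` project and the `kylin_sales` table are assumptions based on Kylin's sample deployment and should be checked against the documentation for the installed version. The helper name `build_kylin_query_request` is hypothetical.

```python
import base64
import json
import urllib.request


def build_kylin_query_request(base_url, project, sql, user, password, limit=100):
    """Build a POST request carrying a SQL statement (hypothetical helper)."""
    # Assumed request body fields: "sql", "project" and "limit".
    payload = {"sql": sql, "project": project, "limit": limit}
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/kylin/api/query",  # assumed endpoint path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are assumed sample-deployment defaults, not taken from this article.
request = build_kylin_query_request(
    base_url="http://localhost:7070",
    project="learn_kylin",
    sql="select part_dt, sum(price) from kylin_sales group by part_dt",
    user="ADMIN",
    password="KYLIN",
)

# Against a live server the request could be sent with:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the example runnable without a Kylin server.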

History

The Kylin project was started in 2013 at eBay's R&D center in Shanghai, China. In October 2014, Kylin v0.6 was open-sourced on github.com under the name "KylinOLAP".[5]

In November 2014, Kylin joined the Apache Software Foundation incubator.

In December 2015, Apache Kylin graduated to become a Top-Level Project.[4]

In March 2016, Kyligence, Inc. was founded by the creators of Apache Kylin.[6][7] Kyligence provides a commercial analytics platform based on Apache Kylin for on-premises and cloud-based datasets.[8]

Architecture

Apache Kylin is built on top of Apache Hadoop, Apache Hive, Apache HBase, Apache Parquet, Apache Calcite, Apache Spark and other technologies.[9] These technologies enable Kylin to easily scale to support massive data loads.[10]

Kylin has the following core components:[11][9]

  • REST Server: receives and responds to user and API requests;
  • Metadata: persists and manages system metadata, especially cube metadata;
  • Query Engine: parses SQL queries into execution plans and passes them to the storage engine;
  • Storage Engine: pushes operations down to, and scans, the underlying cube storage (HBase by default);
  • Job Engine: generates and executes MapReduce or Spark jobs that build source data into cubes.
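
The division of labour between the Job Engine (building cubes) and the Query Engine (answering from them) can be illustrated with a toy in-memory sketch. It is not Kylin's implementation: the rows, the dimension names and the functions `build_cube` and `query_sum` are hypothetical. It pre-aggregates a sum for every combination of dimensions, so that a later GROUP BY becomes a lookup rather than a scan of the source rows.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical source rows: two dimensions and one measure.
rows = [
    {"region": "east", "category": "toys", "price": 10.0},
    {"region": "east", "category": "books", "price": 5.0},
    {"region": "west", "category": "toys", "price": 7.5},
]
DIMENSIONS = ("region", "category")


def build_cube(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of the dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                # Group key: the row's values for the chosen dimensions.
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[frozenset(dims)] = dict(totals)
    return cube


def query_sum(cube, group_by):
    """Answer a GROUP BY by looking up the matching precomputed aggregate."""
    return cube[frozenset(group_by)]


cube = build_cube(rows, DIMENSIONS, "price")
by_region = query_sum(cube, {"region"})  # sum of price per region
grand_total = query_sum(cube, set())     # sum of price over all rows
```

With two dimensions the toy cube holds four aggregates. The number doubles with each added dimension, which is why building cubes is treated as a distributed batch job.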

Users

Apache Kylin has been adopted by many companies as their production OLAP platform. Typical users include eBay, Meituan, Xiaomi, NetEase, Beike, and Yahoo! Japan.

Roadmap

Apache Kylin roadmap (from Kylin website[12]):

  • Hadoop 3.0 support (Erasure Coding) - completed (v2.5)
  • Cube engine running fully on Spark - completed (v2.5)
  • Connect more data sources (MySQL, Oracle, SparkSQL, etc.) - completed (v2.6)
  • Real-time analytics with Lambda Architecture - completed (v3.0)
  • Cloud-native storage (Parquet) - in progress (v4.0.0-alpha)
  • Ad-hoc queries without Cubing

References

  1. "Previous Release". v0.7.1-incubating (First Apache Release). Retrieved 2019-06-15.
  2. "Previous Release". v3.1.0. Retrieved 2020-09-30.
  3. "Apache Kylin - Release Notes". v4.0.0-alpha. Retrieved 2020-09-30.
  4. "The Apache Software Foundation Announces Apache™ Kylin™ as a Top-Level Project". Apache Software Foundation. 2015-12-08.
  5. "Announcing Kylin: Extreme OLAP Engine for Big Data". www.ebayinc.com. 2014-10-20. Retrieved 2018-11-08.
  6. "Apache Kylin Through the Eyes of the Founders - Part One". Kyligence. 2020-06-12. Retrieved 2020-09-30.
  7. "Big Data Analytics Platform | Learn More About Kyligence". Kyligence. Retrieved 2020-09-30.
  8. "Big Data Analytics Platform: Apache Kylin vs. Kyligence". Kyligence. Retrieved 2020-09-30.
  9. "Apache Kylin | Analytical Data Warehouse for Big Data". kylin.apache.org. Retrieved 2020-09-30.
  10. Knorr, Eric (2016-03-07). "What eBay looks like under the hood". InfoWorld. Retrieved 2020-09-30.
  11. "Apache Kylin Adds Real-time OLAP". www.i-programmer.info. Retrieved 2020-09-30.
  12. "Apache Kylin | Development Quick Guide". kylin.apache.org. Retrieved 2020-09-30.