Azure Data Lake


Azure Data Lake
Developer(s): Microsoft
Initial release: November 16, 2016
Available in: English
Type: Data storage and analytics service
Website: azure.microsoft.com/en-us/solutions/data-lake/

Azure Data Lake[1] is a scalable data storage and analytics service. The service is hosted in Azure, Microsoft's public cloud.

History

The Azure Data Lake service was released on November 16, 2016. It is based on COSMOS,[2] which is used to store and process data for applications such as Azure, AdCenter, Bing, MSN, Skype and Windows Live. COSMOS features a SQL-like query engine called SCOPE, upon which U-SQL was built.[2]

Azure Data Lake Store

Users can store structured, semi-structured or unstructured data from sources including social networks, relational data, sensors, videos, web apps, and mobile or desktop devices. A single Azure Data Lake Store account can hold trillions of files, and a single file can exceed a petabyte in size.

Azure Data Lake Analytics

Azure Data Lake Analytics is a parallel on-demand job service. The parallel processing system is based on the Microsoft Dryad solution.[3] Dryad can represent arbitrary Directed Acyclic Graphs (DAGs) of computation. Data Lake Analytics provides a distributed infrastructure that can dynamically allocate or de-allocate resources so customers pay for only the services they use.
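
To illustrate what representing a computation as a directed acyclic graph means, the following is a minimal Python sketch. It is not Dryad or Data Lake Analytics code; the vertex names and the `run_in_stages` helper are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for a real job scheduler. Each stage lists the vertices whose inputs are all complete, which a distributed system could in principle run in parallel.

```python
from graphlib import TopologicalSorter

# Hypothetical job graph: each vertex maps to the set of vertices it depends on.
dependencies = {
    "extract_a": set(),
    "extract_b": set(),
    "join": {"extract_a", "extract_b"},
    "aggregate": {"join"},
    "output": {"aggregate"},
}


def run_in_stages(deps):
    """Group the vertices of a DAG into stages of mutually independent work."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    stages = []
    while sorter.is_active():
        # Every vertex returned here has all of its dependencies finished.
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        # Mark the whole stage as finished so that dependents become ready.
        sorter.done(*ready)
    return stages


stages = run_in_stages(dependencies)
for number, stage in enumerate(stages):
    print(number, stage)
```

In this sketch the two extract vertices land in the same stage because neither depends on the other, while the join must wait for both of them.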

Azure Data Lake Analytics uses Apache YARN, the central part of Apache Hadoop, to govern resource management and deliver operations across Hadoop clusters. Microsoft Azure Data Lake Store supports any application that uses the open Apache Hadoop Distributed File System (HDFS) standard.[3]

U-SQL

Using Data Lake Analytics, users can develop and run parallel data transformation and processing programs in U-SQL, a query language that combines SQL with C#. U-SQL was designed as an evolution of the declarative SQL language, with native extensibility through user code written in C#. U-SQL uses C# data types and the C# expression language.


References

  1. "Data Lake". Microsoft Azure. Retrieved 2019-06-17.
  2. Harris, Derrick (2015-02-05). "Why opening up its Cosmos big data system would be the right move for Microsoft". gigaom.com. Retrieved 2017-07-27.
  3. Harris, Ed. "Cosmos" (PDF).
