Apache Storm
Developer(s) | BackType, Twitter |
---|---|
Stable release | 1.0.2 / 10 August 2016 |
Written in | Clojure & Java |
Operating system | Cross-platform |
Type | Distributed stream processing |
License | Apache License 2.0 |
Website | storm |
Apache Storm is a distributed stream processing computation framework written predominantly in the Clojure programming language. Originally created by Nathan Marz[1] and his team at BackType,[2] the project was open-sourced after BackType was acquired by Twitter.[3] It uses custom-created "spouts" and "bolts" to define information sources and manipulations, allowing batch, distributed processing of streaming data. The initial release was on 17 September 2011.[4]
A Storm application is designed as a "topology" in the shape of a directed acyclic graph (DAG), with spouts and bolts acting as the graph vertices. The edges of the graph are called streams and direct data from one node to another. Taken as a whole, the topology acts as a data transformation pipeline. At a superficial level, the general topology structure is similar to a MapReduce job, the main difference being that data is processed in real time rather than in individual batches. Additionally, Storm topologies run indefinitely until killed, while a MapReduce job DAG must eventually end.[5]
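The spout, bolt and stream structure can be illustrated with a minimal sketch in plain Python. It is a toy in-memory model rather than Storm's actual API: the `Topology` class, its `set_spout`, `set_bolt` and `connect` methods, and the `word_count` helper are all hypothetical names chosen for illustration. Unlike a real topology, which runs until killed, the sketch stops once its finite spout is exhausted.

```python
from collections import Counter, defaultdict, deque


class Topology:
    """Toy in-memory stand-in for a topology (hypothetical, not Storm's API)."""

    def __init__(self):
        self.spouts = {}                # spout name -> iterable of tuples
        self.bolts = {}                 # bolt name -> function(tuple) -> list of tuples
        self.edges = defaultdict(list)  # source vertex -> [(stream name, target bolt)]

    def set_spout(self, name, source):
        self.spouts[name] = source

    def set_bolt(self, name, process):
        self.bolts[name] = process

    def connect(self, source, stream, target):
        # An edge is a named stream from one vertex to a downstream bolt.
        self.edges[source].append((stream, target))

    def run(self):
        # Each queue entry is (emitting vertex, tuple). A real Storm topology
        # runs until killed; this sketch stops once the finite spouts drain.
        pending = deque()
        for name, source in self.spouts.items():
            for item in source:
                pending.append((name, item))
        while pending:
            vertex, item = pending.popleft()
            for _stream, target in self.edges[vertex]:
                for out in self.bolts[target](item):
                    pending.append((target, out))


def word_count(sentences):
    """Wire a spout and two bolts into a word-count pipeline and run it."""
    counts = Counter()

    def count_bolt(word):
        counts[word] += 1
        return []  # terminal bolt: emits nothing downstream

    topology = Topology()
    topology.set_spout("sentences", sentences)
    topology.set_bolt("split", lambda sentence: sentence.split())
    topology.set_bolt("count", count_bolt)
    topology.connect("sentences", "lines", "split")
    topology.connect("split", "words", "count")
    topology.run()
    return dict(counts)


if __name__ == "__main__":
    print(word_count(["the cow", "the moon"]))
```

Here the spout and the two bolts are the vertices, and the streams named "lines" and "words" are the edges that carry data from one vertex to the next.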
Storm became an Apache Top-Level Project in September 2014,[6] having been in incubation since September 2013.[7][8]
Development
Apache Storm is developed under the Apache License, making it available for most companies to use.[9] Git is used for version control and Atlassian JIRA for issue tracking.
Version | Release Date |
---|---|
1.1.0 | 29 March 2017 |
1.0.0 | 12 April 2016 |
0.10.0 | 5 November 2015 |
0.9.6 | 5 November 2015 |
0.9.5 | 4 June 2015 |
0.9.4 | 25 March 2015 |
0.9.3 | 25 November 2014 |
0.9.2 | 25 June 2014 |
0.9.1 | 10 February 2014 |
Historical (non-Apache) Version | Release Date |
0.9.0 | 8 December 2013 |
0.8.0 | 2 August 2012 |
0.7.0 | 28 February 2012 |
0.6.0 | 15 December 2011 |
0.5.0 | 19 September 2011 |
Peer platforms
Storm is one of dozens of stream processing engines; for a more complete list, see Stream processing. On 2 June 2015, Twitter announced Heron,[10] which is API-compatible with Storm. Other comparable streaming data engines include Spark Streaming and Flink.[11]
See also
- Ateji PX
- Boost.Thread
- Charm++
- Cilk
- Coarray Fortran
- CUDA
- Dryad
- C++ AMP
- Global Arrays
- Lambda architecture
- MPI
- OpenMP
- OpenCL
- OpenHMPP
- OpenACC
- TPL
- PLINQ
- PVM
- POSIX Threads
- RaftLib
- UPC
- TBB
References
- Marz, Nathan. "About Nathan Marz". Nathan Marz. Retrieved 28 March 2013.
- "BackType Website (defunct)". BackType. Retrieved 28 March 2013.
- "A Storm is coming: more details and plans for release". Engineering Blog. Twitter Inc. Retrieved 29 July 2015.
- "Storm Codebase". GitHub. Retrieved 8 February 2013.
- "Tutorial - Components of a Storm cluster". Documentation. Apache Storm. Retrieved 29 July 2015.
- "Apache Storm Graduates to a Top-Level Project".
- "Storm Project Incubation Status". Apache Software Foundation. Retrieved 29 October 2013.
- "Storm Proposal". Apache Software Foundation. Retrieved 29 October 2013.
- "Powered By Storm". Documentation. Apache Storm. Retrieved 29 July 2015.
- "Flying faster with Twitter Heron". Engineering Blog. Twitter Inc. Retrieved 3 June 2015.
- "Benchmarking Streaming Computation Engines: Storm, Flink and Spark Streaming" (PDF). IEEE. May 2016.
External links
- Project Homepage
- Storm's issue tracker
- Storm Code Repository on GitHub
- Storm is used to improve Twitter Search
- Nathan Marz's Presentation on Storm: Distributed and Fault-Tolerant Real-time Computation
- Storm Mailing List Archives
- Petrel, a tool for creating Storm applications in Python
- FsStorm, a library for authoring Storm components and topologies in F#