Data warehouse appliance
In computing, a data warehouse appliance is a marketing term for an integrated set of servers, storage, operating system(s), DBMS and software specifically pre-installed and pre-optimized for data warehousing (DW). The term can also apply to similar software-only systems promoted as easy to install on specific recommended hardware configurations or preconfigured as a complete system.
Most DW appliances use massively parallel processing (MPP) architectures to provide high query performance and platform scalability. MPP architectures consist of independent processors or servers executing in parallel. Most MPP architectures implement a "shared-nothing architecture" where each server operates self-sufficiently and controls its own memory and disk. DW appliances distribute data onto dedicated disk storage units connected to each server in the appliance. This distribution allows DW appliances to resolve a relational query by scanning data on each server in parallel. This divide-and-conquer approach delivers high performance and, for workloads that partition evenly across servers, can scale close to linearly as new servers are added to the architecture.
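The shared-nothing pattern described above can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's implementation: the names `distribute`, `local_scan`, `run_query` and `NUM_NODES` are hypothetical, the sample sales rows are invented, and threads in a single process stand in for independent servers that would each own their own memory and disk. Rows are distributed by a key, each node scans only its own slice, and a coordinator merges the partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

NUM_NODES = 4  # hypothetical appliance size


def distribute(rows, num_nodes):
    """Distribute rows across nodes by sale_id; each node owns its own slice."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row["sale_id"] % num_nodes].append(row)
    return nodes


def local_scan(node_rows):
    """Runs on one node: scan only local rows and return a partial aggregate.

    Roughly: SELECT color, SUM(qty) FROM sales
             WHERE store = 'north' GROUP BY color
    """
    partial = Counter()
    for row in node_rows:
        if row["store"] == "north":
            partial[row["color"]] += row["qty"]
    return partial


def run_query(nodes):
    """Coordinator: fan the scan out to every node, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(local_scan, nodes))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return dict(total)


# Invented sample data: 12 sales, alternating stores, three colors.
rows = [
    {
        "sale_id": i,
        "store": "north" if i % 2 == 0 else "south",
        "color": ("red", "blue", "green")[i % 3],
        "qty": 1,
    }
    for i in range(12)
]
nodes = distribute(rows, NUM_NODES)
result = run_query(nodes)
print(result)
```

Because no node reads another node's rows, adding a node shrinks every slice, which is where the near-linear scaling comes from. The merge step on the coordinator is the part that does not parallelize.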
MPP database architectures have a long pedigree. Some consider Teradata's initial product, or Britton Lee's, to be the first DW appliance. Teradata acquired Britton Lee, by then renamed ShareBase, in June 1990. Others disagree, regarding appliances as a "disruptive technology" for Teradata.
Additional vendors, including Tandem Computers and Sequent Computer Systems, also offered MPP architectures in the 1980s. Open-source and commodity computing components aided a re-emergence of MPP data warehouse appliances. Advances in technology reduced costs and improved performance in storage devices, multi-core CPUs and networking components. Open-source RDBMS products, such as Ingres and PostgreSQL, reduced software-license costs and allowed DW-appliance vendors to focus on optimization rather than on providing basic database functionality. Open-source Linux became a common operating system for DW appliances.
Other DW appliance vendors use specialized hardware and advanced software instead of MPP architectures. Netezza announced a "data appliance" in 2003 that used specialized field-programmable gate array (FPGA) hardware. Kickfire followed in 2008 with what it called a dataflow "SQL chip".
In 2009, more DW appliances emerged. IBM integrated its InfoSphere Warehouse (formerly DB2 Warehouse) with its own servers and storage to create the IBM InfoSphere Balanced Warehouse. Netezza introduced its TwinFin platform based on commodity IBM hardware. Other DW appliance vendors have also partnered with major hardware vendors to help bring their appliances to market. DATAllegro, prior to acquisition by Microsoft, partnered with EMC Corporation and Dell and implemented open-source Ingres on Linux. Greenplum has a partnership with Sun Microsystems and implements Greenplum Database (based on PostgreSQL) on Solaris using the ZFS file system. HP Neoview has a wholly owned solution and uses HP NonStop SQL. XtremeData offers a software stack that can be used to create a "virtual data-warehousing appliance" built on commodity hardware, on-premises or in the cloud, for "deep analytics" and data mining.
The market has also seen the emergence of data-warehouse bundles, in which vendors combine their hardware and database software as a data warehouse platform. The Oracle Optimized Warehouse Initiative combines the Oracle Database with hardware from various computer manufacturers (Dell, EMC, HP, IBM, SGI and Sun Microsystems). Oracle's Optimized Warehouses offer pre-validated configurations, and the database software comes pre-installed. In September 2008, Oracle introduced a more classic appliance, the HP Oracle Database Machine, a jointly developed and co-branded platform that Oracle sold and supported and that HP built in configurations specifically for Oracle. In September 2009, Oracle released a second-generation Exadata system, based on its newly acquired Sun Microsystems hardware.