Pentaho

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by ChazzI73 at 23:39, 16 April 2014. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Pentaho Business Intelligence
Developer(s): Pentaho Corporation
Stable release: Business Analytics Suite 4.8.0 / November 2012
Operating system: Windows, Linux, Mac OS X
Platform: Java (software platform)
Type: Business intelligence
License: Pentaho Community Edition (CE): Apache version 2.0; Pentaho Enterprise Edition (EE): Commercial License
Website: www.pentaho.com

Pentaho is a company founded in 2004 by five founders.[1][2] It offers a suite of open source business intelligence (BI) products called Pentaho Business Analytics, which provides data integration, OLAP services, reporting, dashboarding, data mining and ETL capabilities.[3] Pentaho is headquartered in Orlando, Florida, USA.[4]

Overview

The Pentaho suite consists of two offerings: an enterprise edition and a community edition. The enterprise edition contains extra features not found in the community edition, is obtained through an annual subscription, and includes extra support services. Pentaho's core offering is frequently enhanced by add-on products, usually in the form of plugins, from the company itself and from the broader community of users and enthusiasts. The tables below summarize the most popular products and plugins in the Pentaho ecosystem.

Server Applications

Product | Offering | Type | Recent version | Description
Pentaho BI Platform | EE, CE | Server Application | 4.8.0 | Commonly referred to as the BI Platform and recently renamed the Business Analytics Platform (BA Platform), this is the core software that hosts content, whether created in the server itself through plugins or published to the server as files from the desktop applications. Out of the box, it includes features for managing security, running reports, displaying dashboards, report bursting, scripted business rules, OLAP analysis and scheduling. Commercial plugins from Pentaho expand the out-of-the-box features, and a few open-source plugin projects also expand the capabilities of the server. The Pentaho BA Platform runs in the Apache Tomcat Java application server and can be embedded into other Java application servers.
Pentaho Analysis Services (Mondrian) | EE, CE | Server Application | 3.5.0 | Pentaho Analysis Services, codenamed Mondrian, is an open source OLAP (online analytical processing) server written in Java. It supports the MDX (multidimensional expressions) query language and the XML for Analysis and olap4j interface specifications. It reads from SQL and other data sources and aggregates data in a memory cache. Mondrian can be run separately from the Pentaho BI Platform, but is always bundled with the platform itself in both the EE and CE versions.
Pentaho Dashboard Designer (PDD) | EE | Server Plugin | 4.8-GA | A commercial plugin provided to enterprise edition (EE) subscribers. It allows users to create dashboards, which are collections of content components displayed together to provide a centralized view of key performance indicators (KPIs) and other business data movements, letting users monitor them and make decisions. Content components are usually individual information graphics, tables, OLAP views or reports. The plugin simplifies dashboard creation through layout templates, drag-and-drop interaction and a GUI for providing parameters and inputs to dashboard components.
Pentaho Analysis (Analyzer) (PAZ) | EE | Server Plugin | 4.8-GA | The Pentaho Analyzer plugin provides an advanced web-based, drag-and-drop OLAP viewer. It allows a user to create MDX queries visually by dragging parts of a previously defined Mondrian OLAP schema onto a canvas, where the user can also filter, sort, create calculated members from other measures, export the result table to PDF or MS Excel, and optionally graph the data. It is also known to work on Apple iPads through the Safari web browser.
Pentaho Interactive Reporting (PIR) | EE | Server Plugin | 4.8-GA | This plugin enables users to create ad hoc reports in a visual drag-and-drop fashion.
Pentaho Data Access Wizard | EE, CE | Server Plugin | 4.8-GA | This plugin is bundled with all servers. Through a setup wizard, it allows users to create new data sources for use throughout the system from other databases or from CSV files uploaded to the server. While creating a data source, users can also create a data model describing how columns or fields relate to each other, forming hierarchies such as Time (Year, Quarter, Month, Week) or Product (Division, Category, Type). The resulting model is used by Mondrian and by plugins such as Analyzer or Saiku to create queries against the newly created data source. This component is part of what Pentaho calls agile BI, which simply means starting from basic data and quickly iterating to discover the proper way to structure, study and present the data.[5]
Pentaho Mobile | EE | Server Piece | 4.8-GA | A user interface adapted for the Apple iPad, added to the suite in 4.5-GA. It exposes the major functionality of OLAP analysis and the running of reports and dashboards in a form that allows greater interaction on a small touch screen. Mobile also adds bookmarking of favorite content for easy access and the ability to open several pieces of content in tabs.
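
As a rough illustration of the kind of MDX query that Analyzer assembles from drag-and-drop selections and that Mondrian evaluates, the following Python sketch builds a simple two-axis query string. The cube, dimension and measure names (Sales, Time, Unit Sales, Store Cost) and the helper build_mdx are hypothetical and chosen only for illustration. This is not Pentaho's own code, and it only constructs text without contacting a server.

```python
def bracket(name):
    """Quote one MDX identifier segment, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_mdx(cube, measures, row_level):
    """Build a two-axis MDX SELECT: measures on columns, level members on rows.

    cube      -- cube name (hypothetical in this example)
    measures  -- list of measure names
    row_level -- (dimension, level) tuple whose members go on rows
    """
    if not measures:
        raise ValueError("at least one measure is required")
    columns = ", ".join("[Measures]." + bracket(m) for m in measures)
    dimension, level = row_level
    rows = bracket(dimension) + "." + bracket(level) + ".Members"
    return (
        "SELECT {" + columns + "} ON COLUMNS, "
        "{" + rows + "} ON ROWS "
        "FROM " + bracket(cube)
    )


query = build_mdx("Sales", ["Unit Sales", "Store Cost"], ("Time", "Year"))
print(query)
```

A visual tool such as Analyzer presumably generates considerably richer queries (filters, calculated members, sorting); the sketch covers only the basic SELECT ... ON COLUMNS, ... ON ROWS FROM shape.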

Desktop / Client Applications

Product | Offering | Type | Recent version | Description
Pentaho Data Integration (PDI) | EE, CE | Desktop Application | 4.4.0-stable | Pentaho Data Integration, codenamed Kettle, consists of a core data integration (ETL) engine and GUI applications that allow the user to define data integration jobs and transformations. It supports deployment on single-node computers as well as in a cloud or on a cluster.
Pentaho for Big Data | EE, CE | PDI Plugin | N/A | Pentaho for Big Data is a data integration tool based on Pentaho Data Integration.[6] It allows executing ETL jobs in and out of big data environments such as Apache Hadoop or Hadoop distributions such as Amazon, Cloudera, EMC Greenplum, MapR, and Hortonworks.[7] It also supports NoSQL data sources such as MongoDB and HBase.[8]
Pentaho Report Designer | EE, CE | Desktop Application | 3.9.1-stable | Pentaho Report Designer is a visual, banded report writer. Features include subreports, charts and graphs. It can query and use data from many sources, including SQL, MDX, Community Data Access, scripting and static table definitions. It is built on a core reporting engine that generates reports from an XML definition stored in a Zip file with a .PRPT extension. Many tools have been developed around the reporting engine, including GUI designers and ad hoc wizards that guide the user step by step through creating a report using solely graphical tools, without the need to write any code.
Pentaho Data Mining | EE, CE | Desktop Application | [2] | Pentaho Data Mining uses the Waikato Environment for Knowledge Analysis (Weka) to search data for patterns. Weka consists of machine learning algorithms for a broad set of data mining tasks.[9] It contains functions for data processing, regression analysis, classification methods, cluster analysis, and visualization. Based on the discovered patterns, users can predict future trends.[10]
Pentaho Metadata Editor (PME) | EE, CE | Desktop Application | Metadata/ 4.8 | The metadata editor is used to create business models that act as an abstraction layer over the underlying data sources. The resulting metadata models are used by Pentaho Interactive Reporting, Saiku Reporting, and Pentaho's legacy ad hoc reporting plugin to create reports within the BA server without using any of the external desktop applications.
Pentaho Aggregate Designer (PAD) | EE, CE | Desktop Application | 1.5.0 | Aggregate Designer operates on a Pentaho Analysis (Mondrian) XML schema file and on the database containing the underlying tables described by that schema. It generates precalculated, aggregated answers to speed up analysis work and MDX queries executed against Mondrian. To do this, the software examines the hierarchies and measures defined in the schema and generates SQL that creates tables storing those answers for future use by Mondrian. After the aggregate tables have been generated, the original Mondrian XML schema file describing the OLAP cube is modified to reference the precomputed results.
Pentaho Schema Workbench (PSW) | EE, CE | Desktop Application | 3.5.0 | Pentaho Schema Workbench provides a graphical interface for designing OLAP cubes for Pentaho Analysis (Mondrian). The schema created is stored as a regular XML file on disk. It is not necessary to use Schema Workbench to create a schema, but it is often helpful for beginners, and even for experts who need to inspect a cube visually and come up to speed on how to maintain or extend it.
Pentaho Design Studio (PDS) | EE, CE | Desktop Application | 4.0 | The Pentaho BA Server supports special XML scripts called xactions to implement business logic and other forms of automation in the platform. Design Studio is a modified version of the Eclipse Development Environment with a plugin designed to understand the components supported by xaction scripts. Xactions are powerful and useful, but can prove difficult to troubleshoot because of the low-level way they interact with parts of the BA server. Developers are starting to use Pentaho Data Integration (PDI) transformation files to carry out automation and business logic tasks instead. The transformations can be run directly by the BA Server and visually debugged in PDI, and they are quickly gaining favor in the community over xactions. It is a small leap to imagine PDI transformations will eventually replace xactions entirely.
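
The idea behind the aggregate tables that Aggregate Designer generates can be sketched with Python's built-in sqlite3 module: roll a fact table up to a coarser level once, store the result, and answer later queries from the smaller table. The table and column names (sales_fact, agg_year_sales, fact_count) and the helper aggregate_sql are assumptions made for this example. They are not output of the tool itself, and the SQL the tool emits will differ by database.

```python
import sqlite3


def aggregate_sql(fact_table, agg_table, level_columns, measure_columns):
    """Return a CREATE TABLE ... AS SELECT statement that pre-sums measures.

    Identifiers are interpolated directly, so they must come from a trusted
    schema definition, never from user input.
    """
    levels = ", ".join(level_columns)
    sums = ", ".join("SUM({0}) AS {0}".format(m) for m in measure_columns)
    return (
        "CREATE TABLE {agg} AS SELECT {levels}, {sums}, "
        "COUNT(*) AS fact_count FROM {fact} GROUP BY {levels}"
    ).format(agg=agg_table, levels=levels, sums=sums, fact=fact_table)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (year INTEGER, month INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [(2011, 1, 10.0), (2011, 2, 15.0), (2012, 1, 20.0)],
)

# Precompute yearly totals once (hypothetical aggregate table name).
conn.execute(aggregate_sql("sales_fact", "agg_year_sales", ["year"], ["amount"]))

# Later queries at the year level can read the small aggregate table.
rows = conn.execute(
    "SELECT year, amount, fact_count FROM agg_year_sales ORDER BY year"
).fetchall()
print(rows)
```

The extra row-count column records how many fact rows each aggregate row summarizes, which is useful when averages or further roll-ups are derived from the precomputed table.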

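Because a Mondrian schema is stored as a regular XML file, it can also be produced or inspected with any XML library instead of Schema Workbench. The sketch below uses Python's xml.etree.ElementTree to write a minimal cube definition with one dimension and one measure, then reads it back. The element and attribute names are given here as they are commonly documented for the Mondrian 3 schema format and should be treated as an assumption to verify against the Mondrian documentation. The schema, cube, table and column names are invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_schema():
    """Build a minimal Mondrian-style schema tree with hypothetical names."""
    schema = ET.Element("Schema", name="SalesSchema")
    cube = ET.SubElement(schema, "Cube", name="Sales")
    ET.SubElement(cube, "Table", name="sales_fact")

    # One time dimension with a Year > Month hierarchy stored on the fact table.
    dim = ET.SubElement(cube, "Dimension", name="Time")
    hier = ET.SubElement(dim, "Hierarchy", hasAll="true")
    ET.SubElement(hier, "Level", name="Year", column="year", type="Numeric")
    ET.SubElement(hier, "Level", name="Month", column="month", type="Numeric")

    # One measure summed over the fact table's amount column.
    ET.SubElement(cube, "Measure", name="Amount", column="amount", aggregator="sum")
    return schema


schema = build_schema()
xml_text = ET.tostring(schema, encoding="unicode")
print(xml_text)

# Reading the text back: list the level names of each dimension.
parsed = ET.fromstring(xml_text)
levels = {
    d.get("name"): [lv.get("name") for lv in d.iter("Level")]
    for d in parsed.iter("Dimension")
}
print(levels)
```

A schema written this way should still be opened in Schema Workbench or tested against Mondrian before use, since the sketch performs no validation.
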
Community Driven, Open-Source Pentaho Server Plugins

All of these plugins function with Pentaho Enterprise Edition (EE) and Pentaho Community Edition (CE).

Product | Type | Recent version | Description
Ctools | Server Plugin Suite | Various | Known as the Community tools, the suite includes a growing array of features, usually contained in packages with abbreviated names in which the first C always stands for community and signals that the package is both free of cost and open source. The tools are produced and managed by Webdetails.[11] Documentation on the tools is found at ctools.webdetails.org. Most often the Ctools suite is installed by using a Linux script,[12] but there are plans in an upcoming release to include a package manager in the BA Server that helps with installation.[13]
Community Charting Components (CCC) | Server Plugin | Various | A charting library built on top of Protovis,[14] a free and open-source visualization toolkit. The aim of CCC is to give developers a way to include the basic chart types in their dashboards without losing the main principle: extensibility. The charts created with CCC become components that appear in dashboards.
Community Build Framework (CBF) | Build Script Framework | 3.7 | Focused on multi-project, multi-environment scenarios, the Community Build Framework (CBF) provides a way to set up and deploy Pentaho-based applications. It is an Apache Ant (Java) build script that allows a user to create a template of a Pentaho BA Server installation, including patches and any customizations or special content, and roll it out quickly. It can help with migrations to new versions of the BA Server and with rapidly producing customized Pentaho servers for clients.
Community Data Access (CDA) | Server Plugin | latest | Acts as a common layer for accessing data on the Pentaho BA server. CDA files can contain SQL, MDX, Pentaho Data Integration transformation files, scripted data sources and more.[15] CDA also provides a REST API for calling the Pentaho BA server directly and receiving the results of a query as JSON, XML, XLS, HTML or CSV. The default is JSON.[16] HTML output makes it easy for MS Excel users to perform Web queries and pull results directly into an Excel workbook without additional software in the middle. CDA comes bundled in all of Pentaho's servers.
Community Data Browser (CDB) | Server Plugin | | Community Data Browser uses a visual OLAP browser called Saiku to create a query whose result set can then be used by R for analytics.
Community Distributed Cache (CDC) | Server Plugin | latest | CDC provides a high-performance, scalable, distributed in-memory clustered cache based on Hazelcast for both CDA and Mondrian. It is a Pentaho plugin that provides the following features:
  • CDA distributed cache support
  • Mondrian distributed cache support
  • Ability to switch between the default cache and the CDC cache for CDA and Mondrian
  • Graceful handling of cache nodes being added or removed
  • Selective clearing of the cache for specific CDE dashboards
  • Selective clearing of the cache for specific Mondrian schemas, cubes and dimensions
  • An API to clear the cache from the outside (e.g. after running ETL)
  • A view over cluster status
  • Support for multiple Pentaho servers using the same cluster (e.g. stage and production)
  • Several memory configuration options
Community Data Generator (CDG) | PDI Jobs | N/A | CDG is a data warehouse generator that helps create sample data for proof-of-concept dashboards. Given the definition of the desired dimensions, CDG randomizes data within certain parameters and outputs three things:
  • A database and table for the fact table.
  • A file with inserts for the fact table.
  • A Mondrian schema file to be used within Pentaho.
Community Data Validation (CDV) | Server Plugin | | CDV adds the ability to create validation tests on the Pentaho BA server in order to verify both the integrity of the server itself and the data being used by the server.
Community Graphics Generator (CGG) | Server Plugin | latest | A Pentaho plugin that allows the user to export CCC/CDE charts as images, enabling the inclusion of CDE charts inside Pentaho Report Designer reports. In short, this plugin renders server-side exactly the same chart that CDE/CDF renders in the browser. Main characteristics:
  • Executes a CCC chart definition server-side and outputs the chart as an image or an SVG file.
  • Exposes the chart as a URL, so chart exports can be used wherever a link can be embedded.
  • Seamless integration with CDE.
  • Can also render custom-made SVG transformations and JavaScript files server-side and output them as images.
Community Dashboard Editor (CDE) | Server Plugin | 20120719 | CDE is an advanced user tool for creating dashboards in the Pentaho BA server. CDE and the underlying technologies (CDF, CDA and CCC) allow dashboards to be developed and deployed in the Pentaho platform quickly and effectively. It is not as user-friendly as the Pentaho Dashboard Designer plugin, but it enables users to create much more sophisticated designs.
Community Dashboard Framework (CDF) | Server Plugin | 4.8-stable | CDF comes bundled in all of Pentaho's servers. It is the framework used by both CDE and Pentaho's Dashboard Designer to create dashboards on the system.[17]
  • It separates logic (JavaScript) from presentation (HTML, CSS).
  • It features a life cycle in which components interact with each other.
  • It uses AJAX.
  • It is extensible, which gives users a high level of customization.
  • Advanced users can extend the library of components.
  • They can also insert their own snippets of JavaScript and jQuery code.
Community Startup Tabs (CST) | Server Plugin | 1.0 | Out of the box, a Pentaho BA Server comes with a user interface called the Pentaho User Console (PUC), which shows all content by opening tabs within itself. Community Startup Tabs provides an easy way to define and show specialized content to users by automatically opening tabs when they sign in.[18]
  • It allows you to define different startup tabs for each user that logs into the PUC.
  • It is easy to configure.
  • It allows you to define startup tabs based on user names or user roles.
  • In the definition of the startup tabs, user names or roles can be specified using regular expressions.
Saiku | Server Plugin | 2.4 | Saiku is a modular open-source analysis suite offering lightweight OLAP that remains easily embeddable, extendable and configurable. It is similar in form and function to the Pentaho Analyzer plugin. A RESTful server connects to existing OLAP systems and powers user-friendly, intuitive analytics through a lightweight jQuery-based front end.
Saiku-Reporting | Server Plugin | 1.0-GA | A rapidly developing ad hoc reporting tool, similar to Pentaho's Interactive Reporting plugin. Key features:
  • Drag & Drop Report-Design
  • Export to: PDF,CSV,XLS,CDA,PRPT
  • Uses Pentaho Report Designer PRPT-Templates
  • Grouping
  • Aggregation
  • Totals
  • OpenFormula Support
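
To make the description of Community Data Access more concrete, the following Python sketch shows how a client might build a query URL for CDA's REST interface and turn a JSON result into rows. The endpoint path (/content/cda/doQuery), the parameter names (path, dataAccessId, outputType, the param prefix) and the metadata/resultset response layout are assumptions that should be checked against the CDA doQuery reference for the installed version. The server address, the .cda file path and the query id are invented, and the sample response is hand-written rather than captured from a server.

```python
import json
from urllib.parse import urlencode


def cda_query_url(base_url, cda_path, data_access_id, output_type="json", params=None):
    """Build a doQuery-style URL. Endpoint and parameter names are assumptions."""
    query = {
        "path": cda_path,
        "dataAccessId": data_access_id,
        "outputType": output_type,
    }
    # Query parameters are passed with a "param" prefix (assumed convention).
    for name, value in (params or {}).items():
        query["param" + name] = value
    return base_url.rstrip("/") + "/content/cda/doQuery?" + urlencode(query)


def rows_as_dicts(response_text):
    """Turn a JSON result with 'metadata' and 'resultset' keys into dicts."""
    payload = json.loads(response_text)
    names = [col["colName"] for col in payload["metadata"]]
    return [dict(zip(names, row)) for row in payload["resultset"]]


url = cda_query_url(
    "http://localhost:8080/pentaho",
    "/samples/sales.cda",
    "salesByYear",
    params={"year": 2012},
)
print(url)

# A hand-written sample response, not captured from a real server.
sample = '{"metadata": [{"colName": "year"}, {"colName": "total"}], "resultset": [[2012, 99.5]]}'
print(rows_as_dicts(sample))
```

Changing output_type to another of the formats listed for CDA (XML, XLS, HTML or CSV) would only alter the query string; parsing those formats is outside this sketch.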

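The rule matching that Community Startup Tabs describes, in which user names or roles are specified as regular expressions, can be sketched in Python as follows. The rule format, the tab paths and the function startup_tabs are hypothetical and do not reflect the plugin's actual configuration file. Whether the plugin requires a full match or a partial match is also an assumption here.

```python
import re

# Each rule: (kind, regular expression, tab to open). All values are invented.
RULES = [
    ("user", r"admin.*", "/dashboards/server-health"),
    ("role", r"sales_(emea|apac)", "/dashboards/regional-sales"),
    ("role", r".*", "/dashboards/welcome"),
]


def startup_tabs(user, roles, rules=RULES):
    """Return the tabs whose pattern fully matches the user name or any role."""
    tabs = []
    for kind, pattern, tab in rules:
        candidates = [user] if kind == "user" else roles
        if any(re.fullmatch(pattern, c) for c in candidates):
            tabs.append(tab)
    return tabs


admin_tabs = startup_tabs("admin01", ["Administrator"])
sales_tabs = startup_tabs("jane", ["sales_emea"])
print(admin_tabs)
print(sales_tabs)
```

Rules are evaluated in order and every matching rule contributes a tab, so a catch-all role pattern gives every signed-in user with at least one role a default tab.
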
Social Media Communication


Official Sources

  • Pentaho's Official Blog
    • The blog is a good resource to read about Pentaho's corporate direction, current commercial development focus and company philosophy.
  • Pentaho's Official Evaluation Center
    • Pentaho employees post video demonstrations and tutorials of tasks that frequently add value for companies pursuing enterprise solutions. Interesting and educational uses of the suite's commercial features are frequently on display.
  • Pentaho's Official Documentation Center
    • Previously known as the 'Knowledge Base' and included only in commercial offerings, it is now fully open to the public. The information helps users come up to speed on Pentaho products and understand what is required to achieve and manage an enterprise grade deployment.
  • Pentaho Community Forums
    • Enterprise subscribers, product evaluators, hobbyists, developers et al. post messages and respond to questions, learning quests and problems encountered while using or creating solutions with all of the products in the suite. Everyone is welcome to read and post.
  • Pentaho Community Wiki
    • The Pentaho Community Wiki provides not only product configuration and usage information, but also developer oriented information to help newcomers acquire the source code, set up a development environment, understand scrum development, see current development objectives, and understand the architecture of the business analytics platform and surrounding products.
  • Pentaho Community Technical WebEx Recordings
  • Pentaho Enhancement and Bug Tracking System
  • Twitter: @pentaho


Prominent Pentaho Figures and Social Media

Notable Pentaho Partners and Community Members

Licensing

Pentaho follows a commercial open source business model. It provides two different editions of Pentaho Business Analytics: a community edition and an enterprise edition. The enterprise edition is purchased through an annual subscription, which includes support, services, and product enhancements.[34] The enterprise edition is available under a commercial license. There are three variants of the enterprise edition: Basic, Professional, and Enterprise. The community edition is a free open source product licensed under the GNU General Public License version 2.0 (GPLv2), GNU Lesser General Public License version 2.0 (LGPLv2), and Mozilla Public License 1.1 (MPL 1.1).

Accolades and Awards

  • InfoWorld Bossie Award 2008, 2009, 2010, 2011, 2012 [35]
  • Ventana Research Leadership Award 2010 for StoneGate Senior Care [3]
  • CRN Emerging Technology Vendor 2010 [4]
  • ROI Awards 2012 - Nucleus Research
  • More awards on Pentaho's Awards Page

See also

References


  1. Glyn Moody, Computerworld UK. "How Do You Make a Pentaho?" May 5, 2010. Retrieved April 9, 2012.
  2. Seth Grimes, InformationWeek. "Open-Source BI Startup Pentaho Makes Debut." June 16, 2005. Retrieved April 5, 2012.
  3. Madan Sheina, Ovum. "Pentaho BI Suite Enterprise Edition." September 15, 2010. Retrieved February 12, 2011.
  4. Steven Brown, San Francisco Business Times. "Florida’s Pentaho hires Quentin Gallivan as CEO in San Francisco." October 4, 2011. Retrieved April 12, 2012.
  5. Michael Terallo. "Pentaho Data Access Wizard." Retrieved July 29, 2012.
  6. Surya Mukherjee, Ovum. "Pentaho expands coverage for Big Data." March 8, 2012. Retrieved April 11, 2012.
  7. James Kobielus, Forrester Research. "The Forrester Wave: Enterprise Hadoop Solutions." February 2, 2012. Retrieved May 10, 2012.
  8. David Menninger, Ventana Research. "Pentaho 4 Unites Enterprise Business intelligence and Data Integration." June 22, 2011. Retrieved April 8, 2012.
  9. Nikos Mastorakis, Valeria Mladenov and Vassiliki Kontargyri. "Proceedings of the European Computing Conference." Heidelberg, Germany: Springer Science and Business Media, 2009. ISBN 978-0387848136. p. 789. Retrieved July 11, 2012.
  10. Ed Woord, FLOSS for Science. "Machine Learning with WEKA: An Interview with Mark Hall." July 1, 2012. Retrieved July 25, 2012.
  11. Webdetails Consulting Company, Portugal.
  12. Pedro Alves. "Back to basics: Step by step Pentaho + Ctools installation." December 15, 2011. Retrieved July 27, 2012.
  13. Will Gorman, Pentaho Wiki. "Pentaho BI Server Marketplace Plugin." February 17, 2012. Retrieved July 27, 2012.
  14. Stanford Visualization Group. Protovis. http://mbostock.github.com/protovis/
  15. CDA Documentation. Retrieved July 26, 2012.
  16. CDA web API reference: doQuery. Retrieved July 27, 2012.
  17. CDF Documentation.
  18. CST Documentation.
  19. Pentaho. "Pentaho Appoints Quentin Gallivan As CEO." October 4, 2011.
  20. Pentaho. "Meet the Pentaho Team." Retrieved July 27, 2012.
  21. James Dixon. "James Dixon's Blog - Dixon's thoughts on commercial open source and open source business intelligence." Retrieved July 27, 2012.
  22. Will Gorman. "Pentaho Reporting 3.5 for Java Developers." Packt Publishing, September 2009.
  23. Julian Hyde. "Julian Hyde on Streaming Data, Open Source OLAP. And stuff." Retrieved July 27, 2012.
  24. Matt Casters. "matt casters on data integration." Retrieved July 27, 2012.
  25. Matt Casters, Bouman, Dongen. "Pentaho Kettle Solutions: Building Open Source ETL Solutions with Pentaho Data Integration." Wiley, September 2010.
  26. Thomas Morgner. "Reporting Tales - Pentaho Reporting Tips and Tricks from the Author." Retrieved July 27, 2012.
  27. Roland Bouman. "Roland Bouman's blog." Retrieved July 27, 2012.
  28. Roland Bouman, Dongen. "Pentaho Solutions: Business Intelligence and Data Warehousing with Pentaho and MySQL." Wiley, August 2009.
  29. Slawomir Chodnicki - Google Analytics Plugin
  30. Slawomir Chodnicki - Excel Writer Plugin
  31. Slawomir Chodnicki - Edi2Xml Plugin
  32. Slawomir Chodnicki - Ruby Scripting Plugin
  33. Paul Stoellberger, Saiku. "Paul Stoellberger - Pentaho Community Meetup 2011 / Frascati." Retrieved July 27, 2012.
  34. Torben Pedersen and Mukesh Mohania. "Data Warehousing and Knowledge Discovery." Heidelberg, Germany: Springer Science and Business Media, 2009. ISBN 978-3642037290. pp. 296-298. Retrieved April 6, 2012.
  35. [1] Retrieved October 1, 2012.

External links