Jump to content

Hierarchical and recursive queries in SQL

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 144.36.227.150 (talk) at 10:58, 24 October 2013 (Unary operators). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

A hierarchical query is a type of SQL query that handles hierarchical model data. They are special case of more general recursive fixpoint queries, which compute transitive closures.

In standard SQL:1999 hierarchical queries are implemented by way of recursive common table expressions (CTEs). Unlike the Oracle extension described below, the recursive CTEs were designed with fixpoint semantics from the beginning.[1] The recursive CTEs from the standard were relatively close to the existing implementation in IBM DB2 version 2.[2] Recursive CTEs are also supported by Microsoft SQL Server,[3] Firebird 2.1[4] , PostgreSQL 8.4+,[5] Oracle 11g Release 2 and CUBRID.

An alternative syntax is the non-standard CONNECT BY construct; it was introduced by Oracle in the 1980s.[6] Prior to Oracle 10g, the construct was only useful for traversing acyclic graphs because it returned an error on detecting any cycles; in version 10g Oracle introduced the NOCYCLE feature (and keyword), making the traversal work in the presence of cycles as well.[7]

CONNECT BY

"CONNECT BY" is supported by EnterpriseDB,[8] Oracle database,[9] CUBRID,[10] and DB2 although only if it is enabled as a compatibility mode.[11] Syntax:

 SELECT select_list
 FROM table_expression
 [ WHERE ... ]
 [ START WITH start_expression ]
 CONNECT BY [NOCYCLE] { PRIOR parent_expr = child_expr | child_expr = PRIOR parent_expr }
 [ ORDER SIBLINGS BY column1 [ ASC | DESC ] [, column2 [ ASC | DESC ] ] ...
 [ GROUP BY ... ]
 [ HAVING ... ]
 ...
For example
 SELECT LEVEL, LPAD (' ', 2 * (LEVEL - 1)) || ename "employee", empno, mgr "manager"
 FROM emp START WITH mgr IS NULL
 CONNECT BY PRIOR empno = mgr;

The output from the above query would look like:

 level |  employee   | empno | manager
-------+-------------+-------+---------
     1 | KING        |  7839 |
     2 |   JONES     |  7566 |    7839
     3 |     SCOTT   |  7788 |    7566
     4 |       ADAMS |  7876 |    7788
     3 |     FORD    |  7902 |    7566
     4 |       SMITH |  7369 |    7902
     2 |   BLAKE     |  7698 |    7839
     3 |     ALLEN   |  7499 |    7698
     3 |     WARD    |  7521 |    7698
     3 |     MARTIN  |  7654 |    7698
     3 |     TURNER  |  7844 |    7698
     3 |     JAMES   |  7900 |    7698
     2 |   CLARK     |  7782 |    7839
     3 |     MILLER  |  7934 |    7782
(14 rows)

Pseudocolumns

  • LEVEL
  • CONNECT_BY_ISLEAF
  • CONNECT_BY_ISCYCLE

Unary operators

The following example returns the last name of each employee in department 10, each manager above that employee in the hierarchy, the number of levels between manager and employee, and the path between the two:


SELECT ename "Employee", CONNECT_BY_ROOT ename "Manager",

  LEVEL-1 "Pathlen", SYS_CONNECT_BY_PATH(ename, '/') "Path"
  FROM emp
  WHERE LEVEL > 1 and deptno = 10
  CONNECT BY PRIOR empno = mgr
  ORDER BY "Employee", "Manager", "Pathlen", "Path";

Functions

  • SYS_CONNECT_BY_PATH

Common table expression

A Common Table Expression, or CTE, (in SQL) is a temporary named result set, derived from a simple query and defined within the execution scope of a SELECT, INSERT, UPDATE, or DELETE statement.

CTEs can be thought of as alternatives to derived tables (subquery), views, and inline user-defined functions.

Common table expressions are supported by DB2, Firebird,[12] Microsoft SQL Server, Oracle (with recursion since 11g release 2), PostgreSQL (since 8.4), HyperSQL and H2 (experimental).[13] Oracle calls CTEs "subquery factoring".[14]

Syntax:

WITH [RECURSIVE] with_query [, ...]
SELECT...

with_query looks like

with query_name [ (column_name [,...]) ] AS (SELECT ...)

Recursive CTEs (or "recursive subquery factoring"[15] in Oracle lingo) can be use to traverse relations (as graphs or trees) although the syntax is much more involved because there are no automatic pseudocolums created (like LEVEL above); if these are desired, they have to be created in the code. See MSDN documentation[3] or IBM documentation[16][17] for tutorial examples.

The RECURSIVE keyword is not usually needed after WITH in systems other than PostgreSQL.[18]

In SQL:1999 a recursive (CTE) query may appear anywhere a query is allowed. It's possible for example to name the result using CREATE [RECURSIVE] VIEW.[19] Using a CTE inside an INSERT INTO one can populate a table with data generated from a recursive query; random data generation is possible using this technique without using any procedural statements.[20]

An example of recursive query computing the factorial of numbers from 0 to 9 is the following

WITH RECURSIVE temp (n, fact) AS 
(SELECT 0, 1 -- Initial Subquery
  UNION ALL 
 SELECT n+1, (n+1)*fact FROM temp -- Recursive Subquery 
        WHERE n < 9)
SELECT * FROM temp;

See also

References

  1. ^ Jim Melton; Alan R. Simon (2002). SQL:1999: Understanding Relational Language Components. Morgan Kaufmann. p. 343. ISBN 978-1-55860-456-8.
  2. ^ Jim Melton; Alan R. Simon (2002). SQL:1999: Understanding Relational Language Components. Morgan Kaufmann. p. 353. ISBN 978-1-55860-456-8.
  3. ^ a b Microsoft. "Recursive Queries Using Common Table Expressions". Retrieved 2009-12-23.
  4. ^ Helen Borrie (2008-07-15). "Firebird 2.1 Release Notes". Retrieved 2009-02-07.
  5. ^ "WITH Queries". PostgreSQL
  6. ^ Benedikt, M.; Senellart, P. (2011). "Databases". In Blum, Edward K.; Aho, Alfred V. (eds.). Computer Science. The Hardware, Software and Heart of It. p. 189. doi:10.1007/978-1-4614-1168-0_10. ISBN 978-1-4614-1167-3.. {{cite book}}: Check |isbn= value: invalid character (help)
  7. ^ Sanjay Mishra; Alan Beaulieu (2004). Mastering Oracle SQL. O'Reilly Media, Inc. p. 227. ISBN 978-0-596-00632-7.
  8. ^ Hierarchical Queries, EnterpriseDB
  9. ^ Hierarchical Queries, Oracle
  10. ^ "CUBRID Hierarchical Query". Retrieved 11 February 2013.
  11. ^ Jonathan Gennick (2010). SQL Pocket Guide (3rd ed.). O'Reilly Media, Inc. p. 8. ISBN 978-1-4493-9409-7.
  12. ^ Comparison of relational database management systems#Database capabilities
  13. ^ http://www.h2database.com/html/advanced.html#recursive_queries
  14. ^ Karen Morton; Robyn Sands; Jared Still (2010). Pro Oracle SQL. Apress. p. 283. ISBN 978-1-4302-3228-5. {{cite book}}: Unknown parameter |coauthors= ignored (|author= suggested) (help)
  15. ^ Karen Morton; Robyn Sands; Jared Still (2010). Pro Oracle SQL. Apress. p. 304. ISBN 978-1-4302-3228-5. {{cite book}}: Unknown parameter |coauthors= ignored (|author= suggested) (help)
  16. ^ http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/topic/com.ibm.db2z9.doc.apsg/src/tpc/db2z_xmprecursivecte.htm
  17. ^ http://publib.boulder.ibm.com/infocenter/iseries/v5r4/index.jsp?topic=%2Fsqlp%2Frbafyrecursivequeries.htm
  18. ^ Regina Obe; Leo Hsu (2012). PostgreSQL: Up and Running. O'Reilly Media. p. 94. ISBN 978-1-4493-2633-3.
  19. ^ Jim Melton; Alan R. Simon (2002). SQL:1999: Understanding Relational Language Components. Morgan Kaufmann. p. 352. ISBN 978-1-55860-456-8.
  20. ^ Don Chamberlin (1998). A Complete Guide to DB2 Universal Database. Morgan Kaufmann. pp. 253–254. ISBN 978-1-55860-482-7.

Further reading

Academic textbooks. Note that these cover only the SQL:1999 standard (and Datalog), but not the Oracle extension.

  • Abraham Silberschatz; Henry Korth; S. Sudarshan (2010). Database System Concepts (6th ed.). McGraw-Hill. pp. 187–192. ISBN 978-0-07-352332-3.
  • Raghu Ramakrishnan; Johannes Gehrke (2003). Database management systems (3rd ed.). McGraw-Hill. ISBN 978-0-07-246563-1. Chapter 24.
  • Hector Garcia-Molina; Jeffrey D. Ullman; Jennifer Widom (2009). Database systems: the complete book (2nd ed.). Pearson Prentice Hall. pp. 437–445. ISBN 978-0-13-187325-4.