Correlated subquery

In a SQL database query, a correlated subquery (also known as a synchronized subquery) is a subquery (a query nested inside another query) that uses values from the outer query. Because the subquery may be evaluated once for each row processed by the outer query, it can be slow.

The following is a typical example of a correlated subquery. The objective is to find all employees whose salary is above the average for their department.

 SELECT employee_number, name
   FROM employees emp
   WHERE salary > (
     SELECT AVG(salary)
       FROM employees
       WHERE department = emp.department);

In the above query the outer query is

 SELECT employee_number, name
   FROM employees emp
   WHERE salary > ...

and the inner query (the correlated subquery) is

 SELECT AVG(salary)
   FROM employees
   WHERE department = emp.department

Conceptually, the inner query in the nested query above is re-executed for each employee. (A sufficiently smart implementation may cache the inner query's result on a department-by-department basis, but even in the best case the inner query must be executed once per department.)
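The once-per-department evaluation can also be written out explicitly. The following is a minimal sketch, assuming Python's built-in sqlite3 module and a small made-up employees table (the sample rows and the names ROWS, CORRELATED, JOINED and department_average are illustrative assumptions, not part of the original example). It runs the correlated query, then an uncorrelated formulation that computes each department's average once in a derived table and joins it back to the outer table, so that the two results can be compared.

```python
import sqlite3

# Hypothetical sample data: (employee_number, name, department, salary).
ROWS = [
    (1, "Alice", "Sales", 50000),
    (2, "Bob", "Sales", 70000),
    (3, "Carol", "Engineering", 90000),
    (4, "Dave", "Engineering", 110000),
    (5, "Erin", "Engineering", 100000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_number INTEGER PRIMARY KEY, name TEXT, "
    "department TEXT, salary REAL)"
)
conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", ROWS)

# Correlated form: the inner query refers to emp.department from the outer query.
CORRELATED = """
SELECT employee_number, name
  FROM employees emp
  WHERE salary > (
    SELECT AVG(salary)
      FROM employees
      WHERE department = emp.department)
  ORDER BY employee_number
"""

# Uncorrelated rewrite: one average per department, joined to the outer table.
JOINED = """
SELECT emp.employee_number, emp.name
  FROM employees emp
  JOIN (SELECT department, AVG(salary) AS department_average
          FROM employees
          GROUP BY department) dept
    ON dept.department = emp.department
  WHERE emp.salary > dept.department_average
  ORDER BY emp.employee_number
"""

correlated_rows = conn.execute(CORRELATED).fetchall()
joined_rows = conn.execute(JOINED).fetchall()
print(correlated_rows)
print(joined_rows)
```

On this sample data both queries return the same employees. Whether a particular database system performs such a rewrite on its own depends on its query optimizer.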

Correlated subqueries may also appear outside the WHERE clause. For example, the following query uses a correlated subquery in the SELECT clause to return the entire list of employees alongside the average salary for each employee's department. Again, because the subquery is correlated with a column of the outer query, it is conceptually re-executed for each row of the result.[citation needed]

 SELECT
   employee_number,
   name,
   (SELECT AVG(salary)
      FROM employees
      WHERE department = emp.department) AS department_average
   FROM employees emp;
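To see what the SELECT-clause form returns, the query can be run against a small table. This is a minimal sketch, assuming Python's built-in sqlite3 module; the three sample rows and the names QUERY and select_rows are made up for illustration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_number INTEGER PRIMARY KEY, name TEXT, "
    "department TEXT, salary REAL)"
)
# Hypothetical sample rows: two employees in Sales, one in Engineering.
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "Alice", "Sales", 50000),
        (2, "Bob", "Sales", 70000),
        (3, "Carol", "Engineering", 90000),
    ],
)

# The scalar subquery is correlated with emp.department, so each output row
# carries the average salary of that row's own department.
QUERY = """
SELECT
  employee_number,
  name,
  (SELECT AVG(salary)
     FROM employees
     WHERE department = emp.department) AS department_average
  FROM employees emp
  ORDER BY employee_number
"""

select_rows = conn.execute(QUERY).fetchall()
for row in select_rows:
    print(row)
```

Every employee appears exactly once in the output, and the two Sales employees share the same department_average value.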

Correlated subqueries in the FROM clause

A correlated subquery in the FROM clause is generally not meaningful: the table it produces is needed to evaluate the outer query, but the subquery cannot be evaluated until the outer query supplies the values it references, which creates a chicken-and-egg problem. MariaDB, for example, lists this as a limitation in its documentation.[1]

However, some database systems allow a correlated subquery on the right-hand side of a join in the FROM clause, provided that a specific keyword is used. Such a subquery may reference the tables listed before it in the join; conceptually, it is evaluated for each row of the left-hand table, and the rows it produces are joined to that row. In PostgreSQL this is done by placing the keyword LATERAL before the right-hand subquery,[2] and in Microsoft SQL Server by using CROSS APPLY or OUTER APPLY instead of JOIN.[3]

References

  1. ^ "Subquery Limitations". MariaDB Knowledgebase. Retrieved 2020-12-24.
  2. ^ "Table Expressions". postgresql.org. Retrieved 2020-12-14.
  3. ^ "FROM clause plus JOIN, APPLY, PIVOT (Transact-SQL)". docs.microsoft.com. 2019-06-01. Retrieved 2020-12-24.
