In a SQL database query, a correlated subquery (also known as a synchronized subquery) is a subquery (a query nested inside another query) that uses values from outer query. The subquery is evaluated once for each row processed by the outer query
Here is an example for a typical correlated subquery. In this example we are finding the list of all employees whose salary is above average for their departments.
SELECT employee_number, name FROM employees AS Bob WHERE salary > ( SELECT AVG(salary) FROM employees WHERE department = Bob.department);
In the above query the outer query is
SELECT employee_number, name FROM employees AS Bob WHERE salary > ...
and the inner query (the correlated subquery) is
SELECT AVG(salary) FROM employees WHERE department = Bob.department
In the above nested query the inner query has to be re-executed for each employee. (A sufficiently smart implementation may cache the inner query's result on a department-by-department basis, but even in the best case the inner query must be executed once per department. See "Optimizing correlated subqueries" below.)
Correlated subqueries may appear elsewhere besides the WHERE clause; for example, this query uses a correlated subquery in the SELECT clause to print the entire list of employees alongside the average salary for each employee's department. Again, because the subquery is correlated with a column of the outer query, it must be re-executed for each row of the result.
SELECT employee_number, name, (SELECT AVG(salary) FROM employees WHERE department = Bob.department) AS department_average FROM employees AS Bob;
The effect of correlated subqueries can in some cases be obtained using joins. For example, the queries above (which use inefficient correlated subqueries) may be rewritten as follows. The example uses a Natural JOIN for convenience.
-- This subquery is not correlated with the outer query, and is therefore -- executed only once, regardless of the number of employees. SELECT employee_number, name FROM employees NATURAL JOIN (SELECT department, AVG(salary) AS department_average FROM employees GROUP BY department) AS temp WHERE salary > department_average;
Another way of improving performance is to create a view (which is computed once), and then query the view:
CREATE VIEW dept_avg AS SELECT department, AVG(salary) AS department_average FROM employees GROUP BY department; -- List employees making more than their department average. SELECT employee_number, name FROM employees NATURAL JOIN dept_avg WHERE salary > department_average; -- List employees alongside their respective department averages. SELECT employee_number, name, department_average FROM employees NATURAL JOIN dept_avg; DROP VIEW dept_avg;