5. Advanced SQL: Mastering Subqueries

When basic SELECT, WHERE, and JOIN clauses aren't enough, you need Subqueries. A subquery (or inner query) is simply a query nested inside another SQL query (the outer query). Subqueries allow you to perform advanced filtering, retrieve complex results, and perform calculations dynamically.

There are two main types of subqueries: Non-Correlated and Correlated.


1. Non-Correlated Subqueries (Independent)

A Non-Correlated Subquery is a query that can be executed independently of the outer query. It runs once, returns its result, and the outer query then uses that result. These are most often found in the WHERE clause.

Example 1: Finding Products Above Average Price

Goal: Retrieve the name and price of all products that are more expensive than the overall average product price.

  1. Inner Query (Runs First): Calculates the single average price of all products.

  2. Outer Query (Uses Result): Filters the Product table using that single average price.


SELECT Name, Price
FROM Product
WHERE Price > (
    SELECT AVG(Price)
    FROM Product
);
NamePrice
Laptop1200.00
Monitor450.00
......

2. Correlated Subqueries (Dependent)

A Correlated Subquery is a query that depends on the outer query for its values. It runs once for every single row processed by the outer query. This is computationally expensive but necessary when the inner calculation needs to be customized for each row.

Correlated subqueries often use the EXISTS or IN operators and can sometimes be rewritten using joins, though they excel at filtering based on group aggregates.

Example 2: Finding Customers with High-Value Transactions

Goal: Retrieve the name of every customer who has made at least one sale transaction with a quantity greater than 50.

  1. Outer Query (Iterates): Selects a customer row from the Customer table.

  2. Inner Query (Runs for Each Customer): Checks the Sales table to see if a sale exists for that specific customer where the quantity was $> 50$.


SELECT Name
FROM Customer C
WHERE EXISTS (
    SELECT 1  -- We just need a true/false signal (1 is faster than *)
    FROM Sales S
    WHERE S.CustomerID = C.CustomerID  -- This links the inner and outer query
      AND S.Quantity > 50
);
Name
Customer A
Customer C
...

In this example, the inner query is "correlated" because it refers to C.CustomerID (a value from the outer query's current row). This dependency is what distinguishes a correlated subquery.

Comments

Popular Posts