At the beginning of December, at the UKOUG Tech17 conference in Birmingham (GB), I presented a comparison of the query optimizers of MySQL 8.0.3 and PostgreSQL 10.1. One of the things I talked about is their ability to handle subqueries. I summarized my findings with the following sentence:
Simple sub-queries that are not correctly optimized were observed.
It goes without saying that such a sentence leaves a lot of questions open. After all, it is just a summary. The aim of this post is to show you which subqueries I tested, and to compare my expectations with the execution plans generated by the query optimizers. In addition, since I’m not limited in time and scope as during a 50-minute presentation, I also discuss how the Oracle Database 12.2 query optimizer handles the same queries.
To check how well a query optimizer handles subqueries, it’s in my opinion sufficient to challenge it with queries that should be obvious (at least for a human being). The type of queries where the response time ratio between a good and a bad execution plan is of several orders of magnitude. If a query optimizer isn’t able to correctly handle such queries, with more complex ones it can only be worse…
For the tests I use two very simple tables:
CREATE TABLE small (u INTEGER NOT NULL, nu INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL); CREATE UNIQUE INDEX small_u ON small (u); CREATE INDEX small_nu ON small (nu); CREATE INDEX small_n ON small (n); CREATE INDEX small_nn ON small (nn);
CREATE TABLE large (u INTEGER NOT NULL, nu INTEGER NOT NULL, n INTEGER NULL, nn INTEGER NOT NULL, p VARCHAR(128) NULL); CREATE UNIQUE INDEX large_u ON large (u); CREATE INDEX large_nu ON large (nu); CREATE INDEX large_n ON large (n); CREATE INDEX large_nn ON large (nn);
The table “small” contains 10 rows; its unique key contains the integer values between 1 and 10. The table “large” contains 1 million rows; its unique key contains the integer values between 1 and 1 million. Note that for both tables the columns “nu” (not unique), “n” (null), and “nn” (not null) contain the same value as the unique key column. The only exception is that the column “n” contains “null” instead of the value “7.” Basically, only the constraints applied to them are different. In addition, the column “p” contains a string of 128 characters that is only present to have tables that aren’t too small (i.e. not very representative).
I considered six types of subqueries:
- Type A – Scalar subqueries with equality predicate
- Type B – Scalar subqueries with inequality predicate
- Type C – Uncorrelated subqueries with either IN or EXISTS
- Type D – Uncorrelated subqueries with either NOT IN or NOT EXISTS
- Type E – Correlated subqueries with either IN or EXISTS
- Type F – Correlated subqueries with either NOT IN or NOT EXISTS
Few notes:
- For each type I considered two sub-types; the difference between them is given by the position of the tables “small” and “large”
- I didn’t consider subqueries outside the WHERE clause
- “IN” could be replaced by either “=ANY” or “=SOME”
- “NOT IN” could be replaced by “!=ALL”
Type A – Scalar subqueries with equality predicate
Subtype A1 – Table “large” in the subquery
A10: SELECT * FROM small WHERE u = (SELECT nu FROM large WHERE u = 6)
A11: SELECT * FROM small WHERE n = (SELECT n FROM large WHERE u = 6)
A12: SELECT * FROM small WHERE n = (SELECT nn FROM large WHERE u = 6)
A13: SELECT * FROM small WHERE nn = (SELECT n FROM large WHERE u = 6)
A14: SELECT * FROM small WHERE nn = (SELECT nn FROM large WHERE u = 6)
The execution plan I expect for these queries carries out the following operations:
- Access the table “large” through an index scan that returns at most one value. The operation is executed one single time.
- Access the table “small” through either a table scan or an index scan and return the rows matching the value returned by the previous operation. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
A10
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | 1 | PRIMARY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
A11/A12
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ref | small_n | small_n | 5 | const | 1 | 100.00 | Using where | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
A13/A14
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ref | small_nn | small_nn | 4 | const | 1 | 100.00 | Using where | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
A10
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 141 | 1 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | SMALL | 1 | 141 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 2 - access("U"= (SELECT "NU" FROM "LARGE" "LARGE" WHERE "U"=6)) 4 - access("U"=6)
A11
----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 141 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 141 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SMALL_N | 1 | | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- 2 - access("N"= (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6)) 4 - access("U"=6)
A12
----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 141 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 141 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SMALL_N | 1 | | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- 2 - access("N"= (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6)) 4 - access("U"=6)
A13
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 141 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 141 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SMALL_NN | 1 | | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ 2 - access("NN"= (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6)) 4 - access("U"=6)
A14
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 141 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 141 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SMALL_NN | 1 | | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ 2 - access("NN"= (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6)) 4 - access("U"=6)
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
A10
Seq Scan on small (cost=8.44..9.57 rows=1 width=148) Filter: (u = $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
A11/A12
Seq Scan on small (cost=8.44..9.57 rows=1 width=148) Filter: (n = $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
A13/A14
Seq Scan on small (cost=8.44..9.57 rows=1 width=148) Filter: (nn = $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
Subtype A2 – Table “small” in the subquery
A20: SELECT * FROM large WHERE u = (SELECT nu FROM small WHERE u = 6)
A21: SELECT * FROM large WHERE n = (SELECT n FROM small WHERE u = 6)
A22: SELECT * FROM large WHERE n = (SELECT nn FROM small WHERE u = 6)
A23: SELECT * FROM large WHERE nn = (SELECT n FROM small WHERE u = 6)
A24: SELECT * FROM large WHERE nn = (SELECT nn FROM small WHERE u = 6)
The execution plan I expect for these queries carries out the following operations:
- Access the table “small” through either a table scan or an index scan that returns at most one value. The operation is executed one single time.
- Access the table “large” through an index scan and return the rows matching the value returned by the previous operation. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
A20
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+ | 1 | PRIMARY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
A21/A22
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | large | NULL | ref | large_n | large_n | 5 | const | 1 | 100.00 | Using where | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
A23/A24
+----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | large | NULL | ref | large_nn | large_nn | 4 | const | 1 | 100.00 | Using where | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+-------+------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
A20
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 149 | 3 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 149 | 3 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 2 - access("U"= (SELECT "NU" FROM "SMALL" "SMALL" WHERE "U"=6)) 4 - access("U"=6)
A21
----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 149 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 149 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | LARGE_N | 1 | | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- 2 - access("N"= (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6)) 4 - access("U"=6)
A22
----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 149 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 149 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | LARGE_N | 1 | | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- 2 - access("N"= (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6)) 4 - access("U"=6)
A23
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 149 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 149 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | LARGE_NN | 1 | | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ 2 - access("NN"= (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6)) 4 - access("U"=6)
A24
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 149 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 149 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | LARGE_NN | 1 | | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID | SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ 2 - access("NN"= (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6)) 4 - access("U"=6)
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
A20
Index Scan using large_u on large (cost=1.55..9.57 rows=1 width=148) Index Cond: (u = $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
A21/A22
Index Scan using large_n on large (cost=1.55..9.57 rows=1 width=148) Index Cond: (n = $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
A23/A24
Index Scan using large_nn on large (cost=1.55..9.57 rows=1 width=148) Index Cond: (nn = $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
Type B – Scalar subqueries with inequality predicate
Subtype B1 – Table “large” in the subquery
B10: SELECT * FROM small WHERE u != (SELECT nu FROM large WHERE u = 6)
B11: SELECT * FROM small WHERE n != (SELECT n FROM large WHERE u = 6)
B12: SELECT * FROM small WHERE n != (SELECT nn FROM large WHERE u = 6)
B13: SELECT * FROM small WHERE nn != (SELECT n FROM large WHERE u = 6)
B14: SELECT * FROM small WHERE nn != (SELECT nn FROM large WHERE u = 6)
The execution plan I expect for these queries carries out the following operations:
- Access the table “large” through an index scan that returns at most one value. The operation is executed one single time.
- Access the table “small” through a table scan and discard the rows matching the value returned by the previous operation. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
B10
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | range | small_u | small_u | 4 | NULL | 9 | 100.00 | Using where | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
B11/B12
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | range | small_n | small_n | 5 | NULL | 8 | 100.00 | Using where | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
B13/B14
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | small_nn | NULL | NULL | NULL | 10 | 90.00 | Using where | | 2 | SUBQUERY | large | NULL | const | large_u | large_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
B10
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 6 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | SMALL | 9 | 1269 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("U"<> (SELECT "NU" FROM "LARGE" "LARGE" WHERE "U"=6)) 3 - access("U"=6)
B11
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 6 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | SMALL | 9 | 1269 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("N"<> (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6)) 3 - access("U"=6)
B12
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 6 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | SMALL | 9 | 1269 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("N"<> (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6)) 3 - access("U"=6)
B13
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 6 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | SMALL | 9 | 1269 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("NN"<> (SELECT "N" FROM "LARGE" "LARGE" WHERE "U"=6)) 3 - access("U"=6)
B14
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 6 (0)| 00:00:01 | |* 1 | TABLE ACCESS FULL | SMALL | 9 | 1269 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("NN"<> (SELECT "NN" FROM "LARGE" "LARGE" WHERE "U"=6)) 3 - access("U"=6)
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
B10
Seq Scan on small (cost=8.44..9.57 rows=9 width=148) Filter: (u <> $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
B11/B12
Seq Scan on small (cost=8.44..9.57 rows=8 width=148) Filter: (n <> $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
B13/B14
Seq Scan on small (cost=8.44..9.57 rows=9 width=148) Filter: (nn <> $0) InitPlan 1 (returns $0) -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = 6)
Subtype B2 – Table “small” in the subquery
B20: SELECT * FROM large WHERE u != (SELECT nu FROM small WHERE u = 6)
B21: SELECT * FROM large WHERE n != (SELECT n FROM small WHERE u = 6)
B22: SELECT * FROM large WHERE n != (SELECT nn FROM small WHERE u = 6)
B23: SELECT * FROM large WHERE nn != (SELECT n FROM small WHERE u = 6)
B24: SELECT * FROM large WHERE nn != (SELECT nn FROM small WHERE u = 6)
The execution plan I expect for these queries carries out the following operations:
- Access the table “small” through either a table scan or an index scan that returns at most one value. The operation is executed one single time.
- Access the table “large” through a table scan and discard the rows matching the value returned by the previous operation. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
B20
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | range | large_u | large_u | 4 | NULL | 494590 | 100.00 | Using where | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
B21/B22
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | large_n | NULL | NULL | NULL | 989170 | 50.00 | Using where | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
B23/B24
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | large_nn | NULL | NULL | NULL | 989170 | 50.00 | Using where | | 2 | SUBQUERY | small | NULL | const | small_u | small_u | 4 | const | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+-------+--------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
B20
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 5791 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL | LARGE | 999K| 142M| 5790 (1)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("U"<> (SELECT "NU" FROM "SMALL" "SMALL" WHERE "U"=6)) 3 - access("U"=6)
B21
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 5791 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL | LARGE | 999K| 142M| 5790 (1)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("N"<> (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6)) 3 - access("U"=6)
B22
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 5791 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL | LARGE | 999K| 142M| 5790 (1)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("N"<> (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6)) 3 - access("U"=6)
B23
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 5791 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL | LARGE | 999K| 142M| 5790 (1)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("NN"<> (SELECT "N" FROM "SMALL" "SMALL" WHERE "U"=6)) 3 - access("U"=6)
B24
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 5791 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL | LARGE | 999K| 142M| 5790 (1)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter("NN"<> (SELECT "NN" FROM "SMALL" "SMALL" WHERE "U"=6)) 3 - access("U"=6)
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
B20
Seq Scan on large (cost=1.12..34724.12 rows=999999 width=148) Filter: (u <> $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
B21/B22
Seq Scan on large (cost=1.12..34724.12 rows=999999 width=148) Filter: (n <> $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
B23/B24
Seq Scan on large (cost=1.12..34724.12 rows=999999 width=148) Filter: (nn <> $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = 6)
Type C – Uncorrelated subqueries with either IN or EXISTS
Subtype C1 – Table “large” in the subquery
C10: SELECT * FROM small WHERE n IN (SELECT n FROM large)
C11: SELECT * FROM small WHERE n IN (SELECT nn FROM large)
C12: SELECT * FROM small WHERE nn IN (SELECT n FROM large)
C13: SELECT * FROM small WHERE nn IN (SELECT nn FROM large)
C14: SELECT * FROM small WHERE EXISTS (SELECT * FROM large)
The execution plan I expect for the queries C10-C13 carries out a semi-join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the table “large” through an index scan.
However, for the query C14, I expect the following operations:
- Access the table “large” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
- If the table “large” contains at least one row, access the table “small” through a table scan. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
C10
+----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_n | NULL | NULL | NULL | 10 | 100.00 | Using where | | 1 | SIMPLE | large | NULL | ref | large_n | large_n | 5 | chris.small.n | 1 | 100.00 | Using index; FirstMatch(small) | +----+-------------+-------+------------+------+---------------+---------+---------+---------------+------+----------+--------------------------------+
C11
+----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_n | NULL | NULL | NULL | 10 | 100.00 | Using where | | 1 | SIMPLE | large | NULL | ref | large_nn | large_nn | 4 | chris.small.n | 1 | 100.00 | Using index; FirstMatch(small) | +----+-------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+--------------------------------+
C12
+----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_n | large_n | 5 | chris.small.nn | 1 | 100.00 | Using index; FirstMatch(small) | +----+-------------+-------+------------+------+---------------+---------+---------+----------------+------+----------+--------------------------------+
C13
+----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_nn | large_nn | 4 | chris.small.nn | 1 | 100.00 | Using index; FirstMatch(small) | +----+-------------+-------+------------+------+---------------+----------+---------+----------------+------+----------+--------------------------------+
C14
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | NULL | | 2 | SUBQUERY | large | NULL | index | NULL | large_nu | 4 | NULL | 989170 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
Oracle Database selects the following execution plans. Even though the execution plan of the queries C10-C11 doesn’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.
C10
------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 9 | 1314 | 12 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 9 | 1314 | 12 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 3 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_N | 1000K| 4882K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ 3 - filter("N" IS NOT NULL) 4 - access("N"="N")
C11
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1314 | 12 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 9 | 1314 | 12 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 3 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NN | 1000K| 4882K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("N" IS NOT NULL) 4 - access("N"="NN")
C12
------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10 | 1460 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 10 | 1460 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | LARGE_N | 1000K| 4882K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------ 3 - access("NN"="N")
C13
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1460 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 10 | 1460 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | LARGE_NN | 1000K| 4882K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------- 3 - access("NN"="NN")
C14
---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1410 | 5 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | | 3 | INDEX FAST FULL SCAN| LARGE_NN | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "LARGE" "LARGE"))
PostgreSQL selects the following execution plans. Only the execution plan of the query “C14” fulfills the expectations. That said, the other execution plans, in this specific case, perform well. I checked that when the data changes, also the execution plans changes. So, even though I was surprised by them, everything looks good.
C10
Merge Semi Join (cost=11.52..13.58 rows=9 width=148) Merge Cond: (small.n = large.n) -> Index Scan using small_n on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_n on large (cost=0.42..81855.69 rows=1000000 width=4)
C11
Merge Semi Join (cost=11.52..13.58 rows=9 width=148) Merge Cond: (small.n = large.nn) -> Index Scan using small_n on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_nn on large (cost=0.42..81855.69 rows=1000000 width=4)
C12
Merge Semi Join (cost=0.56..13.59 rows=10 width=148) Merge Cond: (small.nn = large.n) -> Index Scan using small_nn on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_n on large (cost=0.42..81855.69 rows=1000000 width=4)
C13
Merge Semi Join (cost=0.56..13.59 rows=10 width=148) Merge Cond: (small.nn = large.nn) -> Index Scan using small_nn on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_nn on large (cost=0.42..81855.69 rows=1000000 width=4)
C14
Result (cost=0.03..1.13 rows=10 width=148) One-Time Filter: $0 InitPlan 1 (returns $0) -> Seq Scan on large (cost=0.00..32223.00 rows=1000000 width=0) -> Seq Scan on small (cost=0.03..1.13 rows=10 width=148)
Subtype C2 – Table “small” in the subquery
C20: SELECT * FROM large WHERE n IN (SELECT n FROM small)
C21: SELECT * FROM large WHERE n IN (SELECT nn FROM small)
C22: SELECT * FROM large WHERE nn IN (SELECT n FROM small)
C23: SELECT * FROM large WHERE nn IN (SELECT nn FROM small)
C24: SELECT * FROM large WHERE EXISTS (SELECT * FROM small)
The execution plan I expect for the queries C20-C23 carries out a semi-join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the table “large” through an index scan.
However, for the query C24, I expect the following operations:
- Access the table “small” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
- If the table “small” contains at least one row, access the table “large” through a table scan. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
C20
+----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+ | 1 | SIMPLE | small | NULL | index | small_n | small_n | 5 | NULL | 10 | 100.00 | Using where; Using index; LooseScan | | 1 | SIMPLE | large | NULL | ref | large_n | large_n | 5 | chris.small.n | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+---------+---------+---------------+------+----------+-------------------------------------+
C21
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+ | 1 | SIMPLE | small | NULL | index | small_nn | small_nn | 4 | NULL | 10 | 100.00 | Using index; LooseScan | | 1 | SIMPLE | large | NULL | ref | large_n | large_n | 5 | chris.small.nn | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
C22
+----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+ | 1 | SIMPLE | small | NULL | index | small_n | small_n | 5 | NULL | 10 | 100.00 | Using where; Using index; LooseScan | | 1 | SIMPLE | large | NULL | ref | large_nn | large_nn | 4 | chris.small.n | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+---------------+------+----------+-------------------------------------+
C23
+----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+ | 1 | SIMPLE | small | NULL | index | small_nn | small_nn | 4 | NULL | 10 | 100.00 | Using index; LooseScan | | 1 | SIMPLE | large | NULL | ref | large_nn | large_nn | 4 | chris.small.nn | 1 | 100.00 | NULL | +----+-------------+-------+------------+-------+---------------+----------+---------+----------------+------+----------+------------------------+
C24
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | NULL | | 2 | SUBQUERY | small | NULL | index | NULL | small_nu | 4 | NULL | 10 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
C20
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1368 | 18 (6)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1368 | 18 (6)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1368 | 18 (6)| 00:00:01 | | 3 | SORT UNIQUE | | 9 | 27 | 1 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | 27 | 1 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_N | 1 | | 2 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 4 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("N"="N")
C21
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1520 | 18 (6)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1520 | 18 (6)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1520 | 18 (6)| 00:00:01 | | 3 | SORT UNIQUE | | 10 | 30 | 1 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | SMALL_NN | 10 | 30 | 1 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_N | 1 | | 2 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 4 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 5 - access("N"="NN")
C22
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1368 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1368 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1368 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 9 | 27 | 1 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | 27 | 1 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_NN | 1 | | 2 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("NN"="N")
C23
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1520 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1520 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1520 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 10 | 30 | 1 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | SMALL_NN | 10 | 30 | 1 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_NN | 1 | | 2 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 3 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 5 - access("NN"="NN")
C24
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 142M| 5789 (1)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL| LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | | 3 | INDEX FULL SCAN | SMALL_NN | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------- 1 - filter( EXISTS (SELECT 0 FROM "SMALL" "SMALL"))
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
C20
Merge Semi Join (cost=1.74..2.59 rows=9 width=148) Merge Cond: (large.n = small.n) -> Index Scan using large_n on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.n -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
C21
Merge Semi Join (cost=1.69..2.60 rows=10 width=148) Merge Cond: (large.n = small.nn) -> Index Scan using large_n on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.nn -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
C22
Merge Semi Join (cost=1.74..2.59 rows=9 width=148) Merge Cond: (large.nn = small.n) -> Index Scan using large_nn on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.n -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
C23
Merge Semi Join (cost=1.69..2.60 rows=10 width=148) Merge Cond: (large.nn = small.nn) -> Index Scan using large_nn on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.nn -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
C24
Result (cost=0.11..32223.11 rows=1000000 width=148) One-Time Filter: $0 InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.10 rows=10 width=0) -> Seq Scan on large (cost=0.11..32223.11 rows=1000000 width=148)
Type D – Uncorrelated subqueries with either NOT IN or NOT EXISTS
Subtype D1 – Table “large” in the subquery
D10: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large)
D11: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large)
D12: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large)
D13: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large)
D14: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large)
The execution plan I expect for the queries D10-D13 carries out an anti-join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the index associated to referenced column of the table “large” to check whether rows that fulfill the WHERE clause exist.
However, for the query D14, I expect the following operations:
- Access the table “large” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
- If the table “large” contains no row, access the table “small” through a table scan. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations.
D10
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | index_subquery | large_n | large_n | 5 | func | 2 | 100.00 | Using where; Using index; Full scan on NULL key | +----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+-------------------------------------------------+
D11
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | index_subquery | large_nn | large_nn | 4 | func | 1 | 100.00 | Using where; Using index; Full scan on NULL key | +----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------------------------------------------+
D12
+----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | index_subquery | large_n | large_n | 5 | func | 2 | 100.00 | Using where; Using index | +----+--------------------+-------+------------+----------------+---------------+---------+---------+------+------+----------+--------------------------+
D13
+----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | index_subquery | large_nn | large_nn | 4 | func | 1 | 100.00 | Using index | +----+--------------------+-------+------------+----------------+---------------+----------+---------+------+------+----------+-------------+
D14
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+ | 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE | | 2 | SUBQUERY | large | NULL | index | NULL | large_nu | 4 | NULL | 989170 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+------------------+
Oracle Database selects the following execution plans. The execution plan of the queries D13-D14 fulfills the expectations. The others, because of the table scan on “large”, are suboptimal. Note that for the queries D12 the query optimizer is restricted by the fact that the “large_n” index doesn’t contain the null value. I consider this a physical database design issue, not a query optimizer issue. However, the execution plan of the queries D10-D11 is suboptimal because of a query optimizer limitation with nullable values.
D10
---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 146 | 5793 (1)| 00:00:01 | |* 1 | HASH JOIN ANTI NA | | 1 | 146 | 5793 (1)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| LARGE | 1000K| 4882K| 5787 (1)| 00:00:01 | ---------------------------------------------------------------------------- 1 - access("N"="N")
D11
---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 146 | 616 (2)| 00:00:01 | |* 1 | HASH JOIN ANTI SNA | | 1 | 146 | 616 (2)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | | 3 | INDEX FAST FULL SCAN| LARGE_NN | 1000K| 4882K| 610 (1)| 00:00:01 | ---------------------------------------------------------------------------------- 1 - access("N"="NN")
D12
---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 146 | 5793 (1)| 00:00:01 | |* 1 | HASH JOIN ANTI NA | | 1 | 146 | 5793 (1)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL| LARGE | 1000K| 4882K| 5787 (1)| 00:00:01 | ---------------------------------------------------------------------------- 1 - access("NN"="N")
D13
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 146 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 146 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | LARGE_NN | 900K| 4394K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------- 3 - access("NN"="NN")
D14
---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1410 | 5 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | | 3 | INDEX FAST FULL SCAN| LARGE_NN | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE"))
PostgreSQL selects the following execution plans. Only the execution plan of the query D14 fulfills the expectations. The others, because of the table scan on “large” and its materialization, are really bad.
D10-D13
Seq Scan on small (cost=0.00..218151.12 rows=5 width=148) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Materialize (cost=0.00..41130.00 rows=1000000 width=4) -> Seq Scan on large (cost=0.00..32223.00 rows=1000000 width=4)
D14
Result (cost=0.03..1.13 rows=10 width=148) One-Time Filter: (NOT $0) InitPlan 1 (returns $0) -> Seq Scan on large (cost=0.00..32223.00 rows=1000000 width=0) -> Seq Scan on small (cost=0.03..1.13 rows=10 width=148)
Subtype D2 – Table “small” in the subquery
D20: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small)
D21: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small)
D22: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small)
D23: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small)
D24: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small)
The execution plan I expect for the queries D20-D23 carries out an anti-join between two data sets:
- Access the table “small” through a table scan and put the resulting data into a memory structure. This operation is executed one single time.
- Access the table “large” through a table scan and, for every row, check the memory structure created by the previous operation to find out whether rows that fulfills the WHERE clause exist. This operation is executed one single time.
However, for the query D14, I expect the following operations:
- Access the table “small” through either a table scan or an index scan. The operation is executed one single time, and only needs to check whether one row exists.
- If the table “small” contains no row, access the table “large” through a table scan. The operation is executed one single time.
MySQL selects the following execution plans. All of them fulfill the expectations (what isn’t visible in the execution plans is that for the queries D20-D23 the result set generated by the subquery is materialized).
D20/D22
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | SUBQUERY | small | NULL | index | small_n | small_n | 5 | NULL | 10 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
D21/D23
+----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | SUBQUERY | small | NULL | index | small_nn | small_nn | 4 | NULL | 10 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+----------+---------+------+--------+----------+-------------+
D24
+----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+ | 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Impossible WHERE | | 2 | SUBQUERY | small | NULL | index | NULL | small_nu | 4 | NULL | 10 | 100.00 | Using index | +----+-------------+-------+------------+-------+---------------+----------+---------+------+------+----------+------------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
D20
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5795 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI NA| | 999K| 144M| 5795 (1)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 30 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | --------------------------------------------------------------------------------- 1 - access("N"="N")
D21
------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5793 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI SNA| | 999K| 144M| 5793 (1)| 00:00:01 | | 2 | INDEX FULL SCAN | SMALL_NN | 10 | 30 | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | ------------------------------------------------------------------------------------- 1 - access("N"="NN")
D22
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5795 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI NA| | 999K| 144M| 5795 (1)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 30 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | --------------------------------------------------------------------------------- 1 - access("NN"="N")
D23
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5793 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI| | 999K| 144M| 5793 (1)| 00:00:01 | | 2 | INDEX FULL SCAN | SMALL_NN | 10 | 30 | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | --------------------------------------------------------------------------------- 1 - access("NN"="NN")
D24
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1000K| 142M| 5789 (1)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL| LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | | 3 | INDEX FULL SCAN | SMALL_NN | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL"))
PostgreSQL selects the following execution plans. All of them fulfill the expectations.
D20-D23
Seq Scan on large (cost=1.12..34724.12 rows=500000 width=148) Filter: (NOT (hashed SubPlan 1)) SubPlan 1 -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
D24
Result (cost=0.11..32223.11 rows=1000000 width=148) One-Time Filter: (NOT $0) InitPlan 1 (returns $0) -> Seq Scan on small (cost=0.00..1.10 rows=10 width=0) -> Seq Scan on large (cost=0.11..32223.11 rows=1000000 width=148)
Type E – Correlated subqueries with either IN or EXISTS
Subtype E1 – Table “large” in the subquery
E10: SELECT * FROM small WHERE n IN (SELECT n FROM large WHERE large.u = small.u)
E11: SELECT * FROM small WHERE n IN (SELECT nn FROM large WHERE large.u = small.u)
E12: SELECT * FROM small WHERE nn IN (SELECT n FROM large WHERE large.u = small.u)
E13: SELECT * FROM small WHERE nn IN (SELECT nn FROM large WHERE large.u = small.u)
E14: SELECT * FROM small WHERE n IN (SELECT n FROM large WHERE large.nu = small.u)
E15: SELECT * FROM small WHERE n IN (SELECT nn FROM large WHERE large.nu = small.u)
E16: SELECT * FROM small WHERE nn IN (SELECT n FROM large WHERE large.nu = small.u)
E17: SELECT * FROM small WHERE nn IN (SELECT nn FROM large WHERE large.nu = small.u)
E18: SELECT * FROM small WHERE EXISTS (SELECT * FROM large WHERE large.u = small.u)
E19: SELECT * FROM small WHERE EXISTS (SELECT * FROM large WHERE large.nu = small.u)
The execution plan I expect for these queries is a join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the table “large” through an index scan and check whether at least one row fulfills the WHERE clause.
Note that only for the queries E14-E17/E19 a semi-join is necessary. For the others, because the WHERE clause in the subquery references a unique value in table “large”, a “regular” join can take place.
MySQL selects the following execution plans. Except for the query E18, the others fulfill the expectations. For the query E18 the query optimizer does not recognize that no semi-join is necessary.
E10
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_n | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
E11
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_n | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
E12
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
E13
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
E14
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_n | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_nu,large_n | large_nu | 4 | chris.small.u | 1 | 5.00 | Using where; FirstMatch(small) | +----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
E15
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_n | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_nu,large_nn | large_nu | 4 | chris.small.u | 1 | 5.00 | Using where; FirstMatch(small) | +----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
E16
+----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_nu,large_n | large_nu | 4 | chris.small.u | 1 | 5.00 | Using where; FirstMatch(small) | +----+-------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+--------------------------------+
E17
+----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_u,small_nn | NULL | NULL | NULL | 10 | 100.00 | NULL | | 1 | SIMPLE | large | NULL | ref | large_nu,large_nn | large_nu | 4 | chris.small.u | 1 | 5.00 | Using where; FirstMatch(small) | +----+-------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+--------------------------------+
E18
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u | large_u | 4 | chris.small.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
E19
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu | large_nu | 4 | chris.small.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
Oracle Database selects the following execution plans. Even though the execution plan of the queries E10/E11/E14/E15 doesn’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.
E10
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1359 | 20 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1359 | 20 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1359 | 20 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("LARGE"."U"="SMALL"."U") 6 - filter("N"="N")
E11
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1359 | 20 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1359 | 20 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1359 | 20 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 10 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("LARGE"."U"="SMALL"."U") 6 - filter("N"="NN")
E12
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1510 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1510 | 23 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1510 | 23 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 5 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 4 - access("LARGE"."U"="SMALL"."U") 5 - filter("NN"="N")
E13
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1510 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1510 | 23 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1510 | 23 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 5 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 4 - access("LARGE"."U"="SMALL"."U") 5 - filter("NN"="NN")
E14
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1359 | 29 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 9 | 1359 | 29 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 3 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 4 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1000K| 9765K| 3 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_NU | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("N" IS NOT NULL) 4 - filter("N"="N") 5 - access("LARGE"."NU"="SMALL"."U")
E15
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1359 | 21 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 9 | 1359 | 21 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 1269 | 2 (0)| 00:00:01 | |* 3 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 4 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1000K| 9765K| 3 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | LARGE_NN | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("N" IS NOT NULL) 4 - filter("LARGE"."NU"="SMALL"."U") 5 - access("N"="NN")
E16
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1510 | 33 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 10 | 1510 | 33 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1000K| 9765K| 3 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NU | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("NN"="N") 4 - access("LARGE"."NU"="SMALL"."U")
E17
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1510 | 33 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 10 | 1510 | 33 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1000K| 9765K| 3 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NN | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("LARGE"."NU"="SMALL"."U") 4 - access("NN"="NN")
E18
------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10 | 1460 | 13 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1460 | 13 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN| LARGE_U | 1 | 5 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------ 3 - access("LARGE"."U"="SMALL"."U")
E19
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1460 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS SEMI | | 10 | 1460 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | LARGE_NU | 1000K| 4882K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------- 3 - access("LARGE"."NU"="SMALL"."U")
PostgreSQL selects the following execution plans. Except for the execution plan of the query E19, the others fulfill the expectations. Note that, in this specific case, the execution plan selected for the query E19 perform well. I checked that when the data changes, also the execution plans changes. So, even though I was surprised by it, it is good.
E10-E13
Seq Scan on small (cost=0.00..45.47 rows=5 width=148) Filter: (SubPlan 1) SubPlan 1 -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = small.u)
E14-E17
Seq Scan on small (cost=0.00..45.47 rows=5 width=148) Filter: (SubPlan 1) SubPlan 1 -> Index Scan using large_nu on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (nu = small.u)
E18
Merge Join (cost=1.69..2.60 rows=10 width=148) Merge Cond: (large.u = small.u) -> Index Only Scan using large_u on large (cost=0.42..81855.69 rows=1000000 width=4) -> Sort (cost=1.27..1.29 rows=10 width=148) Sort Key: small.u -> Seq Scan on small (cost=0.00..1.10 rows=10 width=148)
E19
Merge Semi Join (cost=0.56..13.59 rows=10 width=148) Merge Cond: (small.u = large.nu) -> Index Scan using small_u on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_nu on large (cost=0.42..81855.69 rows=1000000 width=4)
Subtype E2 – Table “small” in the subquery
E20: SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.u = large.u)
E21: SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.u = large.u)
E22: SELECT * FROM large WHERE nn IN (SELECT n FROM small WHERE small.u = large.u)
E23: SELECT * FROM large WHERE nn IN (SELECT nn FROM small WHERE small.u = large.u)
E24: SELECT * FROM large WHERE n IN (SELECT n FROM small WHERE small.nu = large.u)
E25: SELECT * FROM large WHERE n IN (SELECT nn FROM small WHERE small.nu = large.u)
E26: SELECT * FROM large WHERE nn IN (SELECT n FROM small WHERE small.nu = large.u)
E27: SELECT * FROM large WHERE nn IN (SELECT nn FROM small WHERE small.nu = large.u)
E28: SELECT * FROM large WHERE EXISTS (SELECT * FROM small WHERE small.u = large.u)
E29: SELECT * FROM large WHERE EXISTS (SELECT * FROM small WHERE small.nu = large.u)
The execution plan I expect for these queries is a join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the table “large” through an index scan and select the rows that fulfills the WHERE clause.
Note that only for the queries E24-E27/E29 a semi-join is necessary. For the others, since the WHERE clause in the subquery references a unique value in table “small”, a “regular” join can take place.
MySQL selects the following execution plans. The ones for the queries E20-E27 fulfill the expectations. The execution plan of the others, because of the way the table “large” is accessed, is really bad.
E20
+----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | index | small_u,small_n | small_n | 5 | NULL | 10 | 100.00 | Using index | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
E21
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | index | small_u,small_nn | small_nn | 4 | NULL | 10 | 100.00 | Using index | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
E22
+----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | index | small_u,small_n | small_n | 5 | NULL | 10 | 100.00 | Using index | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
E23
+----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+ | 1 | SIMPLE | small | NULL | index | small_u,small_nn | small_nn | 4 | NULL | 10 | 100.00 | Using index | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 5.00 | Using where | +----+-------------+-------+------------+--------+------------------+----------+---------+---------------+------+----------+-------------+
E24
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nu,small_n | NULL | NULL | NULL | 10 | 100.00 | Start temporary | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.nu | 1 | 5.00 | Using where; End temporary | +----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
E25
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nu,small_nn | NULL | NULL | NULL | 10 | 100.00 | Start temporary | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.nu | 1 | 5.00 | Using where; End temporary | +----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
E26
+----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nu,small_n | NULL | NULL | NULL | 10 | 100.00 | Start temporary | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.nu | 1 | 5.00 | Using where; End temporary | +----+-------------+-------+------------+--------+------------------+---------+---------+----------------+------+----------+----------------------------+
E27
+----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+ | 1 | SIMPLE | small | NULL | ALL | small_nu,small_nn | NULL | NULL | NULL | 10 | 100.00 | Start temporary | | 1 | SIMPLE | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.nu | 1 | 5.00 | Using where; End temporary | +----+-------------+-------+------------+--------+-------------------+---------+---------+----------------+------+----------+----------------------------+
E28
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u | small_u | 4 | chris.large.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
E29
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu | small_nu | 4 | chris.large.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
Oracle Database selects the following execution plans. Even though the execution plan of the queries E20-E17 doesn’t completely fulfill the expectations (the table “small” isn’t accessed through a table scan), the difference wouldn’t be noticeable at runtime. Therefore, I consider that all of them fulfill the expectations.
E20
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1395 | 20 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1395 | 20 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1395 | 20 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 54 | 2 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("SMALL"."U"="LARGE"."U") 6 - filter("N"="N")
E21
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1550 | 22 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1550 | 22 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1550 | 22 (0)| 00:00:01 | | 3 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 4 | HASH JOIN | | | | | | | 5 | INDEX FAST FULL SCAN | SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 6 | INDEX FAST FULL SCAN | SMALL_U | 10 | 60 | 1 (0)| 00:00:01 | |* 7 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 8 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 4 - access(ROWID=ROWID) 7 - access("SMALL"."U"="LARGE"."U") 8 - filter("N"="NN")
E22
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1395 | 20 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1395 | 20 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1395 | 20 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 54 | 2 (0)| 00:00:01 | |* 4 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 6 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("N" IS NOT NULL) 5 - access("SMALL"."U"="LARGE"."U") 6 - filter("NN"="N")
E23
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1550 | 22 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1550 | 22 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1550 | 22 (0)| 00:00:01 | | 3 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 4 | HASH JOIN | | | | | | | 5 | INDEX FAST FULL SCAN | SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 6 | INDEX FAST FULL SCAN | SMALL_U | 10 | 60 | 1 (0)| 00:00:01 | |* 7 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 8 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 4 - access(ROWID=ROWID) 7 - access("SMALL"."U"="LARGE"."U") 8 - filter("NN"="NN")
E24
-------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1395 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1395 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1395 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 9 | 54 | 2 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 54 | 2 (0)| 00:00:01 | |* 5 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 7 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 149 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- 5 - filter("N" IS NOT NULL) 6 - access("SMALL"."NU"="LARGE"."U") 7 - filter("N"="N")
E25
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1550 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1550 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1550 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 10 | 60 | 2 (0)| 00:00:01 | | 4 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 5 | HASH JOIN | | | | | | | 6 | INDEX FAST FULL SCAN | SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 7 | INDEX FAST FULL SCAN | SMALL_NU | 10 | 60 | 1 (0)| 00:00:01 | |* 8 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 5 - access(ROWID=ROWID) 8 - access("SMALL"."NU"="LARGE"."U") 9 - filter("N"="NN")
E26
-------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1395 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 9 | 1395 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 9 | 1395 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 9 | 54 | 2 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 9 | 54 | 2 (0)| 00:00:01 | |* 5 | INDEX FULL SCAN | SMALL_N | 9 | | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 7 | TABLE ACCESS BY INDEX ROWID | LARGE | 1 | 149 | 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- 5 - filter("N" IS NOT NULL) 6 - access("SMALL"."NU"="LARGE"."U") 7 - filter("NN"="N")
E27
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1550 | 13 (8)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1550 | 13 (8)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1550 | 13 (8)| 00:00:01 | | 3 | SORT UNIQUE | | 10 | 60 | 2 (0)| 00:00:01 | | 4 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 5 | HASH JOIN | | | | | | | 6 | INDEX FAST FULL SCAN | SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 7 | INDEX FAST FULL SCAN | SMALL_NU | 10 | 60 | 1 (0)| 00:00:01 | |* 8 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | |* 9 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 5 - access(ROWID=ROWID) 8 - access("SMALL"."NU"="LARGE"."U") 9 - filter("NN"="NN")
E28
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1520 | 12 (0)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1520 | 12 (0)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1520 | 12 (0)| 00:00:01 | | 3 | INDEX FULL SCAN | SMALL_U | 10 | 30 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 4 - access("SMALL"."U"="LARGE"."U")
E29
----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 1520 | 8 (13)| 00:00:01 | | 1 | NESTED LOOPS | | 10 | 1520 | 8 (13)| 00:00:01 | | 2 | NESTED LOOPS | | 10 | 1520 | 8 (13)| 00:00:01 | | 3 | SORT UNIQUE | | 10 | 30 | 1 (0)| 00:00:01 | | 4 | INDEX FULL SCAN | SMALL_NU | 10 | 30 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 149 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- 5 - access("SMALL"."NU"="LARGE"."U")
PostgreSQL selects the following execution plans. Only the execution plan of the queries E28/E29 fulfill the expectations. The execution plan of the others, because of the way the table “large” is accessed, is really bad.
E20-E23
Seq Scan on large (cost=0.00..598473.00 rows=500000 width=148) Filter: (SubPlan 1) SubPlan 1 -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = large.u)
E24-E27
Seq Scan on large (cost=0.00..598473.00 rows=500000 width=148) Filter: (SubPlan 1) SubPlan 1 -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (nu = large.u)
E28
Merge Join (cost=1.69..2.60 rows=10 width=148) Merge Cond: (large.u = small.u) -> Index Scan using large_u on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.u -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
E29
Merge Semi Join (cost=1.69..2.60 rows=10 width=148) Merge Cond: (large.u = small.nu) -> Index Scan using large_u on large (cost=0.42..81855.69 rows=1000000 width=148) -> Sort (cost=1.27..1.29 rows=10 width=4) Sort Key: small.nu -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
Type F – Correlated subqueries with either NOT IN or NOT EXISTS
Subtype F1 – Table “large” in the subquery
F10: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large WHERE large.u = small.u)
F11: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large WHERE large.u = small.u)
F12: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large WHERE large.u = small.u)
F13: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large WHERE large.u = small.u)
F14: SELECT * FROM small WHERE n NOT IN (SELECT n FROM large WHERE large.nu = small.u)
F15: SELECT * FROM small WHERE n NOT IN (SELECT nn FROM large WHERE large.nu = small.u)
F16: SELECT * FROM small WHERE nn NOT IN (SELECT n FROM large WHERE large.nu = small.u)
F17: SELECT * FROM small WHERE nn NOT IN (SELECT nn FROM large WHERE large.nu = small.u)
F18: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large WHERE large.u = small.u)
F19: SELECT * FROM small WHERE NOT EXISTS (SELECT * FROM large WHERE large.nu = small.u)
The execution plan I expect for these queries is an anti-join between two data sets:
- Access the table “small” through a table scan. The operation is executed one single time.
- For every row of the table “small,” access the table “large” through an index scan and check whether rows that fulfill the WHERE clause exist.
MySQL selects the following execution plans. All of them fulfill the expectations.
F10
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
F11
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
F12
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u,large_n | large_u | 4 | chris.small.u | 1 | 19.00 | Using where | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+------+----------+-------------+
F13
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u,large_nn | large_u | 4 | chris.small.u | 1 | 10.00 | Using where | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+------+----------+-------------+
F14
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu,large_n | large_nu | 4 | chris.small.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
F15
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu,large_nn | large_nu | 4 | chris.small.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
F16
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu,large_n | large_nu | 4 | chris.small.u | 1 | 19.00 | Using where | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+------+----------+-------------+
F17
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu,large_nn | large_nu | 4 | chris.small.u | 1 | 10.00 | Using where | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+------+----------+-------------+
F18
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | eq_ref | large_u | large_u | 4 | chris.small.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+------+----------+-------------+
F19
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+ | 1 | PRIMARY | small | NULL | ALL | NULL | NULL | NULL | NULL | 10 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | large | NULL | ref | large_nu | large_nu | 4 | chris.small.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+------+----------+-------------+
Oracle Database selects the following execution plans. All of them fulfill the expectations.
F10/F12
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 18 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."U"=:B1 AND LNNVL("N"<>:B2))) 3 - filter(LNNVL("N"<>:B1)) 4 - access("LARGE"."U"=:B1)
F11
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 18 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID| LARGE | 1 | 10 | 3 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 2 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."U"=:B1 AND LNNVL("NN"<>:B2))) 3 - filter(LNNVL("NN"<>:B1)) 4 - access("LARGE"."U"=:B1)
F13
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 151 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 151 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID| LARGE | 1000K| 9765K| 2 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | LARGE_U | 1 | | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 3 - filter("NN"="NN") 4 - access("LARGE"."U"="SMALL"."U")
F14/F16
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 23 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 10 | 4 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NU | 1 | | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."NU"=:B1 AND LNNVL("N"<>:B2))) 3 - filter(LNNVL("N"<>:B1)) 4 - access("LARGE"."NU"=:B1)
F15
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9 | 1269 | 23 (0)| 00:00:01 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1 | 10 | 4 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NU | 1 | | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "LARGE" "LARGE" WHERE "LARGE"."NU"=:B1 AND LNNVL("NN"<>:B2))) 3 - filter(LNNVL("NN"<>:B1)) 4 - access("LARGE"."NU"=:B1)
F17
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 151 | 33 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 151 | 33 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LARGE | 1000K| 9765K| 3 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | LARGE_NN | 1 | | 2 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 3 - filter("LARGE"."NU"="SMALL"."U") 4 - access("NN"="NN")
F18
------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 146 | 13 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 146 | 13 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX UNIQUE SCAN| LARGE_U | 900K| 4394K| 1 (0)| 00:00:01 | ------------------------------------------------------------------------------ 3 - access("LARGE"."U"="SMALL"."U")
F19
------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 146 | 23 (0)| 00:00:01 | | 1 | NESTED LOOPS ANTI | | 1 | 146 | 23 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL| SMALL | 10 | 1410 | 3 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | LARGE_NU | 900K| 4394K| 2 (0)| 00:00:01 | ------------------------------------------------------------------------------- 3 - access("LARGE"."NU"="SMALL"."U")
PostgreSQL selects the following execution plans. Except for the queries F18/F19, the others fulfill the expectations. However, in this specific case, the execution plan selected for the queries E18/E19 perform well. I checked that when the data changes, also the execution plans changes. So, even though I was surprised by them, they are good.
F10-F13
Seq Scan on small (cost=0.00..45.47 rows=5 width=148) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Index Scan using large_u on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (u = small.u)
F14-F17
Seq Scan on small (cost=0.00..45.47 rows=5 width=148) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Index Scan using large_nu on large (cost=0.42..8.44 rows=1 width=4) Index Cond: (nu = small.u)
F18
Merge Anti Join (cost=0.56..13.59 rows=1 width=148) Merge Cond: (small.u = large.u) -> Index Scan using small_u on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_u on large (cost=0.42..81855.69 rows=1000000 width=4)
F19
Merge Anti Join (cost=0.56..13.59 rows=1 width=148) Merge Cond: (small.u = large.nu) -> Index Scan using small_u on small (cost=0.14..12.29 rows=10 width=148) -> Index Only Scan using large_nu on large (cost=0.42..81855.69 rows=1000000 width=4)
Subtype F2 – Table “small” in the subquery
F20: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small WHERE small.u = large.u)
F21: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small WHERE small.u = large.u)
F22: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small WHERE small.u = large.u)
F23: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.u = large.u)
F24: SELECT * FROM large WHERE n NOT IN (SELECT n FROM small WHERE small.nu = large.u)
F25: SELECT * FROM large WHERE n NOT IN (SELECT nn FROM small WHERE small.nu = large.u)
F26: SELECT * FROM large WHERE nn NOT IN (SELECT n FROM small WHERE small.nu = large.u)
F27: SELECT * FROM large WHERE nn NOT IN (SELECT nn FROM small WHERE small.nu = large.u)
F28: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small WHERE small.u = large.u)
F29: SELECT * FROM large WHERE NOT EXISTS (SELECT * FROM small WHERE small.nu = large.u)
The execution plan I expect for these queries is an anti-join between two data sets:
- Access the table “small” through a table scan and put the resulting data into a memory structure. This operation is executed one single time.
- Access the table “large” through a table scan and, for every row, check the memory structure created by the previous operation to find out whether rows that fulfill the WHERE clause exist. This operation is executed one single time.
MySQL selects the following execution plans. None of them, because of the access to table “small” for every row in the table “large,” fulfill the expectations. They are really bad.
F20
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u,small_n | small_u | 4 | chris.large.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
F21
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u,small_nn | small_u | 4 | chris.large.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
F22
+----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u,small_n | small_u | 4 | chris.large.u | 1 | 19.00 | Using where | +----+--------------------+-------+------------+--------+-----------------+---------+---------+---------------+--------+----------+-------------+
F23
+----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u,small_nn | small_u | 4 | chris.large.u | 1 | 10.00 | Using where | +----+--------------------+-------+------------+--------+------------------+---------+---------+---------------+--------+----------+-------------+
F24
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu,small_n | small_nu | 4 | chris.large.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
F25
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu,small_nn | small_nu | 4 | chris.large.u | 1 | 100.00 | Using where | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
F26
+----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu,small_n | small_nu | 4 | chris.large.u | 1 | 19.00 | Using where | +----+--------------------+-------+------------+------+------------------+----------+---------+---------------+--------+----------+-------------+
F27
+----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu,small_nn | small_nu | 4 | chris.large.u | 1 | 10.00 | Using where | +----+--------------------+-------+------------+------+-------------------+----------+---------+---------------+--------+----------+-------------+
F28
+----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | eq_ref | small_u | small_u | 4 | chris.large.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+--------+---------------+---------+---------+---------------+--------+----------+-------------+
F29
+----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+ | 1 | PRIMARY | large | NULL | ALL | NULL | NULL | NULL | NULL | 989170 | 100.00 | Using where | | 2 | DEPENDENT SUBQUERY | small | NULL | ref | small_nu | small_nu | 4 | chris.large.u | 1 | 100.00 | Using index | +----+--------------------+-------+------------+------+---------------+----------+---------+---------------+--------+----------+-------------+
Oracle Database selects the following execution plans. Only the execution plan of the queries F23/F27-F29 fulfill the expectations. The others, because of the missing anti-join optimization, are bad.
F20/F22
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 1000K (1)| 00:00:40 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5789 (1)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."U"=:B1 AND LNNVL("N"<>:B2))) 3 - filter(LNNVL("N"<>:B1)) 4 - access("SMALL"."U"=:B1)
F21
---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 1000K (1)| 00:00:40 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5789 (1)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID| SMALL | 1 | 6 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | SMALL_U | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."U"=:B1 AND LNNVL("NN"<>:B2))) 3 - filter(LNNVL("NN"<>:B1)) 4 - access("SMALL"."U"=:B1)
F23
-------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 147M| 5794 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI | | 999K| 147M| 5794 (1)| 00:00:01 | | 2 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 3 | HASH JOIN | | | | | | | 4 | INDEX FAST FULL SCAN| SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 5 | INDEX FAST FULL SCAN| SMALL_U | 10 | 60 | 1 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | -------------------------------------------------------------------------------------------- 1 - access("SMALL"."U"="LARGE"."U" AND "NN"="NN") 3 - access(ROWID=ROWID)
F24/F26
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 1994K (1)| 00:01:18 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5789 (1)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 6 | 2 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | SMALL_NU | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."NU"=:B1 AND LNNVL("N"<>:B2))) 3 - filter(LNNVL("N"<>:B1)) 4 - access("SMALL"."NU"=:B1)
F25
------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 142M| 1994K (1)| 00:01:18 | |* 1 | FILTER | | | | | | | 2 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5789 (1)| 00:00:01 | |* 3 | TABLE ACCESS BY INDEX ROWID BATCHED| SMALL | 1 | 6 | 2 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | SMALL_NU | 1 | | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------- 1 - filter( NOT EXISTS (SELECT 0 FROM "SMALL" "SMALL" WHERE "SMALL"."NU"=:B1 AND LNNVL("NN"<>:B2))) 3 - filter(LNNVL("NN"<>:B1)) 4 - access("SMALL"."NU"=:B1)
F27
-------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 147M| 5794 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI | | 999K| 147M| 5794 (1)| 00:00:01 | | 2 | VIEW | index$_join$_002 | 10 | 60 | 2 (0)| 00:00:01 | |* 3 | HASH JOIN | | | | | | | 4 | INDEX FAST FULL SCAN| SMALL_NN | 10 | 60 | 1 (0)| 00:00:01 | | 5 | INDEX FAST FULL SCAN| SMALL_NU | 10 | 60 | 1 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | -------------------------------------------------------------------------------------------- 1 - access("SMALL"."NU"="LARGE"."U" AND "NN"="NN") 3 - access(ROWID=ROWID)
F28
-------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5793 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI| | 999K| 144M| 5793 (1)| 00:00:01 | | 2 | INDEX FULL SCAN | SMALL_U | 10 | 30 | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | -------------------------------------------------------------------------------- 1 - access("SMALL"."U"="LARGE"."U")
F29
--------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 999K| 144M| 5793 (1)| 00:00:01 | |* 1 | HASH JOIN RIGHT ANTI| | 999K| 144M| 5793 (1)| 00:00:01 | | 2 | INDEX FULL SCAN | SMALL_NU | 10 | 30 | 1 (0)| 00:00:01 | | 3 | TABLE ACCESS FULL | LARGE | 1000K| 142M| 5788 (1)| 00:00:01 | --------------------------------------------------------------------------------- 1 - access("SMALL"."NU"="LARGE"."U")
PostgreSQL selects the following execution plans. Only the ones of the queries F28/F29 fulfill the expectations. The others, because of the table scan on “small” for every row in the table “large,” are really bad.
F20-F23
Seq Scan on large (cost=0.00..598473.00 rows=500000 width=148) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (u = large.u)
F24-F27
Seq Scan on large (cost=0.00..598473.00 rows=500000 width=148) Filter: (NOT (SubPlan 1)) SubPlan 1 -> Seq Scan on small (cost=0.00..1.12 rows=1 width=4) Filter: (nu = large.u)
F28/F29
Hash Anti Join (cost=1.23..44849.14 rows=999990 width=148) Hash Cond: (large.u = small.u) -> Seq Scan on large (cost=0.00..32223.00 rows=1000000 width=148) -> Hash (cost=1.10..1.10 rows=10 width=4) -> Seq Scan on small (cost=0.00..1.10 rows=10 width=4)
Summary
The number of queries that the query optimizers handle correctly are the following:
- Oracle Database 12.2: 72 out of 80
- MySQL 8.0.3: 67 out of 80
- PostgreSQL 10.0: 60 out of 80
Since not all queries are handled correctly, for best performance it is sometimes necessary to rewrite them.
Hi Chris,
Are you just creating a ACID test for database optimizer?
Cheers,
Miguel Anjo
Hi Miguel
I’m not creating anything specific… just asking myself how a specific engine can handle common queries.
Cheers,
Chris
[…] Read More (Community […]
Hello… I have rerun the queries on MariaDB. My results were that MariaDB was 5 queries better than MySQL:
* E18 – is converted into semi-join
* E28, E29 – EXISTS-to-IN conversion works, the query plan is now good
* F28, F29 – Materialization is now applied, the query plan becomes good
Full details are posted at https://jira.mariadb.org/browse/MDEV-18835 .
Thank you very much for sharing that info!
Hello, I’m just going through you article and occasionally executing some examples and it seems that in Oracle db version 18.5, the queries C10 and C11 do produce the expected results (full tablescan on “small”):
explain plan for SELECT * FROM small WHERE n IN (SELECT nn FROM large);
——————————————————————————-
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————-
| 0 | SELECT STATEMENT | | 10 | 1310 | 17 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 1310 | 17 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| SMALL | 10 | 1180 | 17 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | LARGE_NN | 95651 | 1214K| 0 (0)| 00:00:01 |
——————————————————————————-
Hi, thank you for the information. This is basically expected… newer version should be better. Not only in case of Oracle Database. I should re-run all tests to see what the difference is. Best.