I want to address a topic that I think a lot of people have been taught in the past and still live by today. I’ve heard different variations of this “rule” in the years that I have been programming and they go something like this.
- “Where possible ALWAYS have you smallest tables joined to first.”
- “Where possible ALWAYS have you largest tables joined to first”
So already you might start to doubt the validity of the rule :-) GOOD
I’ve created 5 tables all with the same structure.
1: CREATE TABLE <Table Name>
2: (
3: RowNum INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
4: SomeInt INT NULL,
5: SomeString CHAR(2) NULL,
6: SomeCSV VARCHAR(80) NULL,
7: SomeNumber MONEY NULL,
8: SomeDate DATETIME NULL
9: )
I’ve added different amounts of data to each table as follows:
--Table1 100 rows
--Table2 1000 rows
--Table3 10000 rows
--Table4 100000 rows
--Table5 1000000 rows
I then created 6 queries all with various join orders:
1: --Query1
2: SELECT m.*
3: FROM Table5 m
4: INNER JOIN Table1 a1 ON m.RowNum = a1.RowNum
5: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
6: INNER JOIN Table3 a3 ON m.RowNum = a3.RowNum
7: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
8:
9: --Query2
10: SELECT m.*
11: FROM Table5 m
12: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
13: INNER JOIN Table3 a3 ON m.RowNum = a3.RowNum
14: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
15: INNER JOIN Table1 a1 ON m.RowNum = a1.RowNum
16:
17: --Query3
18: SELECT m.*
19: FROM Table5 m
20: INNER JOIN Table1 a1 ON m.RowNum = a1.RowNum
21: INNER JOIN Table3 a3 ON m.RowNum = a3.RowNum
22: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
23: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
24:
25: --Query4
26: SELECT m.*
27: FROM Table1 m
28: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
29: INNER JOIN Table3 a3 ON m.RowNum = a3.RowNum
30: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
31: INNER JOIN Table5 a5 ON m.RowNum = a5.RowNum
32:
33: --Query5
34: SELECT m.*
35: FROM Table5 m
36: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
37: INNER JOIN Table3 a3 ON m.RowNum = a3.RowNum
38: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
39: INNER JOIN Table1 a1 ON m.RowNum = a1.RowNum
40:
41: --Query6
42: SELECT m.*
43: FROM Table3 m
44: INNER JOIN Table1 a1 ON m.RowNum = a1.RowNum
45: INNER JOIN Table5 a5 ON m.RowNum = a5.RowNum
46: INNER JOIN Table4 a4 ON m.RowNum = a4.RowNum
47: INNER JOIN Table2 a2 ON m.RowNum = a2.RowNum
Each Query returns 100 rows as expected and using :
1: SET STATISTICS TIME ON
I get the same result for each query: (I ran each query over 10times):
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(100 row(s) affected)
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 3 ms.
For some people this might seem strange as they may have been taught that changing the order of your joins can dramatically affect your result.
Now let’s look at the execution plans:
OK so this is all very strange isn’t it!!!
If you run Queries 1 -3 together you get this:
Query 1: Query cost (relative to the batch) 55%
Query 2: Query cost (relative to the batch) 13%
Query 3: Query cost (relative to the batch) 33%
If you run Queries 4 -6 together you get this:
Query 4: Query cost (relative to the batch) 33%
Query 5: Query cost (relative to the batch) 33%
Query 6: Query cost (relative to the batch) 33%
The reason the above percentages are different and why the percentages in the Query Plans for each Clustered Index Seek/Scan are different for each query is mainly due to the statistics on the tables. After running the following statement:
1: UPDATE STATISTICS Table1
2: UPDATE STATISTICS Table2
3: UPDATE STATISTICS Table3
4: UPDATE STATISTICS Table4
5: UPDATE STATISTICS Table5
The Query Costs relative to the batch for each query became equal and the percentages for each operator in the different Execution plans became the same.
More importantly we notice that the Execution plans themselves are very much the same for each query.
At this stage I would like to point you to Grant Fritchey’s blog about a recent article Execution Plan Estimated Operator Cost for more information on the strange percentages in execution plans.
The most important thing to remember here is to understand what you are writing. Test what people tell you if you not sure, on top of this understand what an execution plan is showing you. Understand the plan don’t blindly believe the percentages …