SQLpronews Home PageAbout iEntryNewsWebProWorld ForumsJaydeiEntryContactAdvertiseDownloadsiEntry
^ click above ^
01.27.04

SQL Server Parallel Execution Plans

By Joe Chang

Microsoft SQL Server introduced parallel query processing capability in version 7.0. The purpose of parallel query execution is to complete a query involving a large amount of data more quickly than possible with a single thread on computers with more than one processor.

Books Online and various Microsoft documents describe the principles of parallel execution and the use of configuration setting that affect parallel execution. However, there is very little explanation on the matter of interpreting execution plans when parallelism is involved. A comparison of several queries with parallel execution plans, to the execution plan with parallelism disabled, provide a means of explaining some of the characteristics of parallel execution plans, including the meaning the of the estimated cost. An examination of actual queries within and without parallel execution plans provides additional guidance on circumstances where parallel execution is beneficial and when it should be disabled.
Parallel Query Processing

Figure 1 below shows a portion of the execution plans for a query with and without parallel operations. The query is a simple SELECT with a WHERE clause search argument (SARG), aggregates in the SELECT list, and a GROUP BY clause. The non-parallel execution plan is forced with the OPTION (MAXDOP 1) clause.





Figure 1. Non-parallel and parallel portions of an execution plan.

Note that the symbols for the common SQL operations--such as Index Seek, Hash Match, and Compute Scalar--have a yellow circle with arrows in the lower right corner when parallel execution is involved for that step. The Parallelism/Gather Streams symbol in Figure 1 is specific to parallel plans. Other symbols specific to parallel plans include Parallelism/Broadcast, Parallelism/Distribute Streams, and Parallelism Repartition Streams.


Get UnixProNews Newsletter Free ">Go Here


Figure 2 below shows the left side portion of the two execution plans from Figure 1. The top query with cost 66.65% relative to the batch (Query 1 and Query 2) is the non-parallel execution plan and the bottom query with cost 33.35% is the parallel execution plan.



Figure 2. Non-parallel (top) and parallel (bottom) execution plan relative cost.

Figure 3 below show the detail boxes for the SELECT symbols on the non-parallel (left) and parallel (right) execution plans. The non-parallel plan has a total estimated cost of 121 and the parallel plan has a total estimated cost of 60.7, from which the 66.65% and 33.35% relative costs in Figure 2 are derived.



Figure 3. Non-parallel (left) and parallel (right) plan cost detail.

Nowhere in the public Microsoft documentation for SQL Server is the unit of measure for the execution plan cost explained. In fact, all Microsoft SQL Server documentation on this subject seems to be deliberately vague. Is the execution plan cost some measure of time or processor utilization? If it were processor utilization, is it relative to one processor or all processors? An examination of non-parallel and parallel execution plans provide a hint that the unit of measure is most probably time, and that lower plan costs represents lower execution time, but makes no representation of CPU resources.

Figure 4 below shows the cost details for the two major components of each execution plan discussed above. Figure 4a is the non-parallel Clustered Index Seek and Figure 4b is the parallel plan Clustered Index Seek operation. In both cases the estimated row counts are identical, yet the parallel plan Clustered Index Seek has a cost of 47.1095, approximately one-half the cost of the non-parallel Clustered Index Seek at 94.2190.



Figure 4a. Non-parallel Clustered Index Seek cost details.



Figure 4b. Parallel Clustered Index Seek cost details.


The same pattern is observed in the non-parallel and parallel Hash Match/Aggregate component operations shown in Figure 4c and Figure 4d.



Figure 4c. Non-parallel Hash Match/Aggregate cost details.



Figure 4c. Non-parallel Hash Match/Aggregate cost details.

There is no rational reason to believe that having two or more threads separately process portions of the index seek or hash match operations reduces the total CPU-cycles in processing the entire operation, except under unusual circumstances. In fact, the parallel operation should be more expensive taking into account the cost of merging the partial results from each thread. It is reasonable, however, to expect the parallel operation to complete faster than the non-parallel operation, assuming all required resources are available. The logical conclusion is that the unit of measure for the cost in SQL Server’s execution plan is probably time and not processor utilization.

Go Here to Read the Full Article


About the Author:
Joe Chang is a freelance consultant specializing in SQL Server, database architecture, design, performance tuning, and scalability analysis. Joe has more than 12 years experience in software development, including performance and scalability analysis, for microprocessors, server systems and database applications. The materials and tools in this series of articles are available as a 1- or 2-day onsite training course.

Read this newsletter at: http://www.sqlpronews.com/2004/0127.html

Free Newsletters
Part of the iEntry Network
over 4 million subscribers
SQLpronews
FlashNewz
DevWebPro


Send me relevant info on products and services.


 

 

 

 

 

 

 

 

 

 

 

 

-- SQLProNews is an iEntry, Inc. publication --
iEntry, Inc. 880 Corporate Drive, Lexington, KY 40503
2004 iEntry, Inc.  All Rights Reserved  Privacy Policy  Legal

advertising info | news headlines | free newsletters | comments/feedback | submit article