I had the pleasure of mentoring a very good class last week. One feature in SQL Server 2005 they really liked was table partitioning. In brief, this technology allows a table to be fragmented over n-many file groups that consist of one or more database files. There a couple of very good reasons to do this: for transactional tables, it takes fewer resources to backup, restore since only the active parts of the database are written to specific file groups. Second, if the partitioning is aligned to anticipated, commonly used queries, the data access engine will have to read a minimum amount of index and tabular data to complete them. That means that I/O is reduced and queries complete faster and cheaper.
That is, at least, the theory. However, a demonstration of table partitioning illustrated a two other interesting effects. First, a well-designed partitioning schema can produce lower cost, non-parallelized query plans that same designed tables. Second, even queries that would not benefit from the partitioning design can still show considerable improvement.
Let us start with the design of the database. For our purposes, this database will host a single table -- one that records the line items of sales transactions. In a real database, this design may or may not be appropriate. However, we are interested in showing how the concept works. Following is the create database statement used. Notice that we have file groups distributed over three name drives. In my case, I am using a USB hub with three USB 2.0 memory sticks plugged into it[i]. We will create a non-partitioned version of the table on drive "D." We will distribute the transaction log over three files on drives D, F and G. We will also create sixteen file groups each containing a single file. We will eventually create the partitioned table over these file groups.
create database [xmlazon] on primary
( name = N'xmlazoN',filename = N'd:\xmlazon.mdf',size = 224mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg0(name = N'xmlazon_f0',filename = N'd:\xmlazon_f0.ndf',size = 64mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg1(name = N'xmlazon_f1',filename = N'd:\xmlazon_f1.ndf',size = 32mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg2(name = N'xmlazon_f2',filename = N'f:\xmlazon_f2.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg3(name = N'xmlazon_f3',filename = N'g:\xmlazon_f3.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg4(name = N'xmlazon_f4',filename = N'f:\xmlazon_f4.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg5(name = N'xmlazon_f5',filename = N'g:\xmlazon_f5.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg6(name = N'xmlazon_f6',filename = N'f:\xmlazon_f6.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg7(name = N'xmlazon_f7',filename = N'f:\xmlazon_f7.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg8(name = N'xmlazon_f8',filename = N'g:\xmlazon_f8.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fg9(name = N'xmlazon_f9',filename = N'f:\xmlazon_f9.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgA(name = N'xmlazon_fA',filename = N'g:\xmlazon_fA.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgB(name = N'xmlazon_fB',filename = N'f:\xmlazon_fB.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgC(name = N'xmlazon_fC',filename = N'g:\xmlazon_fC.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgD(name = N'xmlazon_fD',filename = N'f:\xmlazon_fD.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgE(name = N'xmlazon_fE',filename = N'g:\xmlazon_fE.ndf',size = 8mb,maxsize = unlimited,filegrowth = 1mb )
,filegroup xmlazon_fgF(name = N'xmlazon_fF',filename = N'd:\xmlazon_fF.ndf',size = 32mb,maxsize = unlimited,filegrowth = 1mb )
log on
( name = N'xmlazon_log0',filename = N'd:\xmlazon_log0.ldf',size = 74mb,maxsize = 2048gb,filegrowth = 10% )
,( name = N'xmlazon_log1',filename = N'f:\xmlazon_log1.ndf',size = 74mb,maxsize = 2048gb,filegrowth = 10% )
,( name = N'xmlazon_log2',filename = N'g:\xmlazon_log2.ndf',size = 74mb,maxsize = 2048gb,filegrowth = 10% );
Now we can turn our attention to the two versions of the table in question. First, here is the non-partition version. It is a basic table design illustrating some of the basics of good design such as using the smallest possible data type and having clustered index on the table.
create table dbo.saleDetail ( customerID smallInt not null , orderDate smallDateTime not null check(orderDate > '2005-10-06') , orderLine tinyint not null check(orderLine > 1) , productID smallint not null , quantity smallint not null check(quantity > 0) , unitSalePrice decimal(7,2) not null check(unitSalePrice > 1.0) , lineTotal as quantity*unitSalePrice persisted constraint pkSaleDetail primary key(customerID,orderDate,orderLine));
The partitioned version of the table depends on both a partitioning function and partitioning scheme. The partition function is simple. Based on an input data type (SmallDateTime in this case), it returns a whole number. Think of it as being like a CASE ... WHEN statement with a twist. The values listed in the body of the function act as delimiters. When a value enters the function, it is compared with the entries in the function value. In our example, the value 7 October 2005 returns one, whereas 1 January 2006 returns two and so on.
create partition function quarteringFunction(smallDateTime) as right for values ('20060101','20070101','20070201','20070301','20070401','20070501','20070601','20070701','20070801','20070901','20071001','20071101','20071201','20080101','20090101');
The partition scheme is also straight forward as this object actually invokes the partitioning function and uses the result to return a file group name on action statements:
create partition scheme quartersToFileGroups as partition quarteringFunction to(xmlazon_fg0,xmlazon_fg1,xmlazon_fg2,xmlazon_fg3,xmlazon_fg4,xmlazon_fg5,xmlazon_fg6,xmlazon_fg7,xmlazon_fg8,xmlazon_fg9,xmlazon_fga,xmlazon_fgb,xmlazon_fgc,xmlazon_fgd,xmlazon_fge,xmlazon_fgf);
For example, if a row is inserted with a date of 7 October 2005, the partition function returns one and the partition scheme cues to the data engine to write to file group xmlazon_fg0.
The table definitions are nearly identical, however the partitioned table must be created on the partitioning schema and must have the partitioning column value passed in:
create table dbo.saleDetail2 ( customerID smallInt not null , orderDate smallDateTime not null check(orderDate > '2005-10-06') , orderLine tinyint not null check(orderLine > 1) , productID smallint not null , quantity smallint not null check(quantity > 0) , unitSalePrice decimal(7,2) not null check(unitSalePrice > 1.0) , lineTotal as quantity*unitSalePrice persisted constraint pkSaleDetail primary key(customerID,orderDate,orderLine)) on quartersToFileGroups(orderDate);
As for test data: I wrote a little C# application that generates an ADO.NET DataTable with about two and half million rows t in it, then wrote that data to the non-partitioned version. To make repeating the test a bit easier and more stable, I then used the BCP program to dump the data to native, binary file. I can then load the file using the BULK INSERT statement:
bulk insert xmlazon.dbo.saleDetail from 'd:\saleDetails.raw' with(batchsize=100000,dataFileType='native',tablock);
Loading the entire file into the non-partition table took, on average, about 74 seconds. Loading into the partitioned table took, again on average, about 113 seconds, or about 52.7% longer. This makes sense: each of the inbound rows needs to resolved to a file group using the partition scheme (and thus the partition function).
I wanted to begin testing by using a query that should take maximum advantage of the partitioning concept. The following query against the non-partitioned version of the table selects about 2.78% (69,584) row from the database. Since I am running on a dual-core machine I used the MAXDOP option emulate a having a single CPU. Data is read from the table using a Clustered Index Scan.
select productID,quantity,lineTotal from dbo.saleDetail where orderDate between '2007-11-23' and '2007-12-26' option(MAXDOP 1);
We can increase MAXDOP to two easily enough:
select productID,quantity,lineTotal from dbo.saleDetail where orderDate between '2007-11-23' and '2007-12-26' option(MAXDOP 2);
Running these same queries against the partitioned table yields dramatic differences. The following table summarizes the results for all four of the test query versions:
Measure | Non-Partitioned, Non-Parallelized | Non-Partitioned, Parallelized | Partitioned, Non-Parallelized | Partitioned, Parallelized |
Initial row count | 68415.62 | 68415.62 | 64729.71 | 64729.71 |
Total plan IO | 7.783866 | 7.783866 | 0.9781018 | 0.9781018 |
Total plan CPU | 2.757001562 | 1.528797662 | 0.350537321 | 0.350537321 |
Total plan Cost | 12.74087 | 10.41267 | 1.384889 | 1.384889 |
While the numbers tell a large part of the story here -- mainly that partitioning the table dramatically reduces the plan costs in all categories -- there is something else you need to see. Here are the plan tree texts for the partition table queries. The MAXDOP 1 listing is first:
Compute Scalar(DEFINE:([xmlazon].[dbo].[saleDetail2].[lineTotal]=[xmlazon].[dbo].[saleDetail2].[lineTotal]))
Clustered Index Scan(OBJECT:([xmlazon].[dbo].[saleDetail2].[pkSaleDetail2]), SEEK:([PtnId1000] >= (13) AND [PtnId1000] <= (14)), WHERE:([xmlazon].[dbo].[saleDetail2].[orderDate]>='2007-11-23 00:00:00.000' AND [xmlazon].[dbo].[saleDetail2].[orderDate]<='2007-12-26 00:00:00.000') ORDERED FORWARD)
Here is the MAXDOP 2 plan:
Compute Scalar(DEFINE:([xmlazon].[dbo].[saleDetail2].[lineTotal]=[xmlazon].[dbo].[saleDetail2].[lineTotal]))
Clustered Index Scan(OBJECT:([xmlazon].[dbo].[saleDetail2].[pkSaleDetail2]), SEEK:([PtnId1000] >= (13) AND [PtnId1000] <= (14)), WHERE:([xmlazon].[dbo].[saleDetail2].[orderDate]>='2007-11-23 00:00:00.000' AND [xmlazon].[dbo].[saleDetail2].[orderDate]<='2007-12-26 00:00:00.000') ORDERED FORWARD)
Yes, you are right -- they are exactly the same plan. Here we have such low-cost access to the data that parallelizing the plan does not help reduce its cost. This can be very helpful in situations where you find that you running into issues with CPU load. It also demonstrates one of the good side-effects of using partitioned tables.
All of this is well and good you might be thinking but what queries that run "against the grain" of the table partitioning design? Is there some downside to it? Well, to test that, consider the following query. It has been designed to read all of the data in table, not just some parts of it:
select distinct sd.customerID,sd.productID,COUNT(sd.productID),SUM(sd.lineTotal) from dbo.saleDetail sd group by sd.customerID,sd.productID with rollup order by sd.customerID,sd.productID option(maxdop 1);
The following table summarizes the statistics for the combinations of parallelization and use of the partition table:
Measure | Non-Partitioned, Non-Parallelized | Non-Partitioned, Parallelized | Partitioned, Non-Parallelized | Partitioned, Parallelized |
Initial Row Count | 2500003 | 2500003 | 2500003 | 2500003 |
Final Row Count | 11103.24 | 11103.24 | 11098.89 | 11098.89 |
Total IO | 7.80646652 | 7.795127262 | 7.84741552 | 0.005630631 |
Total CPU | 213.045881 | 137.0510621 | 132.3262086 | 11.79203117 |
Total Cost | 220.8524 | 144.8462 | 140.1736 | 94.92677 |
As you can see, while this query would not seem to benefit from table partitioning, it actually does. This demonstrates the second side-effect of table partitioning. Since partitioned tables can be spread over many file groups, the data engine may be able to access data resting in tables and indexes more effectively. In some cases, this affect can out-perform parallelization when compared to non-partitioned tables. However, partitioning should not be considered a "silver bullet" for solving all performance issues as there is a demonstrated impact in insert -- a thus a like similar impact on update and delete -- performance as well.
Source code and data are available for download on request, use use the contact feature here.
[i] Think a RARID. A redundant array of REALLY inexpensive disks.
Source Click Here.
Post a Comment