Database Administrators\Developers with minimal experience of partitioning in OLTP\OLAP databases.
This mainly demo based session will cover the following:
- An introduction to partitioning, core concepts and benefits.
- Choosing a Partitioning Key
- Creating Partition Functions & Schemes
- Indexing Considerations
- Splitting, Merging & Switching Partitions
- Implementing Partition Sliding Windows
- Filegroup Restores with Partitioning
After the session, attendees will have an insight into partitioning and have a platform on which to be able to investigate further into implementing a partitioning solution in their own environment.
Enjoy the Podcast?
Why I Want to Present This Session:
I worked with partitioning for a pretty long time and I think it is an absolutely amazing tool that increases the manageability of our data.
However, there are a bunch of gotchas when it comes to building an effective partitioning solution so I want to share the knowledge and tips that I’ve accumulated over the years.
(Plus running these demos is really good fun 🙂 )
Andrew Pruski: Thank you very much and welcome to everyone to this session on an introduction to partitioning. My name is Andrew Pruski. I’m a SQL Server DBA and a Microsoft Data Platform MVP; originally from Swansea, Wales, but I’ve been living in Dublin for coming up on five years now. My Twitter handle and my email address are there, @dbafromthecold and email@example.com – so if you have any questions after today, please feel free to ping me. I’m always willing to talk about this stuff. My blog is there, dbafromthecold.com – I’ve posted multiple articles about partitioning; some of which I’m basing this session on and some of which go into a little bit further detail. So if you want to, you’re more than welcome to check them out. But whilst I’m presenting this session, I’ll be Tweeting out links to those blog posts that I’ve written. So if you follow #GroupByconf, you’ll be able to find them there. And finally, there’s my GitHub, /dbafromthecold – and on there, you’ll find all the code and the slides for the demos I’ll be running today.
So, first things first, what’s the aim of this session? Now, partitioning has been around as we know it since about 2005. And there’s been various updates throughout the years, the biggest one, I think, being SQL Server 2016 SP1, where it went from an Enterprise only feature to being available in Standard Edition. But it’s been around for quite some time, and as such, I think that partitioning is a core DBA skill. Every DBA should know the theory behind partitioning, how to set it up, and how to manage it on an ongoing basis. So that is the aim of this session; to give you a basic knowledge to work with partitioning in SQL Server and to give you some code as well that you can take away and use to play around and build your own partitioning solution.
So here’s what we’ll cover; we’ll start off with partitioning, its definition, so what is partitioning. And then we’ll go into looking at what is a partitioning key. Once we’ve got that, we’ll then look at partition functions and schemes, the objects that we build partition tables upon. Then we’ll have a look at indexing, because when you apply indexes to a partitioned table, there are some considerations that we need to take into account. And then, we can get into the specialist operations that we can apply to individual partitions on a table; so the split, the merge, and my favorite operation, the switch operation. Then, once we’ve covered those, we’ll have a look at wrapping them all together and building what’s known as a partition sliding window. And that’s just a way of combining the split, the merge, and the switch operations on a rolling basis. And then finally, we’ll have a look at what people might not consider is a benefit of partitioning, but filegroup restores. If we implement partitioning on a table and spread our partitions across multiple file groups, in the event of a disaster recovery situation, we can reduce our recovery time by only restoring the file groups that we need. Kind of handy to have in your arsenal as a DBA, so we’ll have a look at the end as well.
But first things first, what is partitioning? Now, partitioning is a way of splitting a table up horizontally into units or partitions that can be spread across multiple physical locations. We have a limit of up to 15,000 partitions per table. And each one of those partitions can have specialist operations apply to it which increase the manageability of our data. Now, I stress manageability because that is what partitioning is for. It is primarily for maintenance of data. I have seen partitioning improve query performance, but more times than not, I have seen partitioning absolutely decimate query performance. You need to be very, very careful if you’re applying partitioning to a table thinking you’ll get a performance boost because, more than likely, the opposite is going to happen.
Okay, so what benefits doo we get from partitioning? Now, the first one there is fairly obvious; partitioned tables appear as normal tables. We don’t need to tell SQL Server, once we partition our table, how to insert the data into the different partitions. That’s done automatically for us. I’ve already mentioned the specialist operations that we have that we can use to easily switch data out of tables or switch it back in; very, very handy when it comes to archiving or deleting data from your tables. Individual partitions can be compressed, saving bit of disk space. And individual partitions can be rebuilt, saving us a little bit of maintenance.
Okay, there are some drawbacks. The first one there, it requires management of partitions and file groups. Yes, once we set up partitioning, we need to set something up; usually a stored procedure that runs via a SQL Server agent job, just to manage our partitions for us. If you don’t set something up, you will end up with one partition with an absolute ton-worth of data in that and you’ll have to manage that. The specialist operations I mentioned, they can be blocked by DML operations. You need to be careful with this because those specialist operations try to acquire schema mod locks. So if you try and apply these at an incorrect time, you will end up with severe blocking on your system. Next point down there, foreign keys referencing partitioned tables – you can have foreign keys referencing partitioned tables no problem, but if you do that, you will lose the ability to perform something like the switch operation, and therefore, lose a lot of the power of partitioning. And the final one there, I think is the biggest one, performance of your queries that don’t reference your partitioning key on your table will have their performance affected. You need to be very, very careful when you’re applying partitioning to a table in case you end up in a performance nightmare and you end up having to dig your way out of it.
So those are, very briefly, the benefits, drawbacks, and the definition of partitioning. I really wanted this to be a demo-based session, so I want to get straight into building a partitioned table. So the first thing we need to do is pick our partitioning key. Now, a partitioning key is the column on the table that’s going to define our partitions for us. So we need to think about things like, how is our data going to split up? How are we going to partition our data? Do we have any archiving or retention policies for that data? And very importantly, how is that table going to be queried? Now, I wish there was some sort of golden rule I could give you on how to pick your partitioning key, but unfortunately, the answer when people ask me is, always it depends. It depends on your data. It depends on how you want to partition your table. And it depends on how that table is going to be queried. The best advice I can give is know your systems. Know your data. Know your queries. Know how it’s going to be used. And if you know all that, then you will be able to pick an appropriate partitioning key.
And you do have quite a few options. You can pretty much use all column types, except the ones listed there; timestamp, ntext, text, image. You can even use persisted computed column if you want to. So you do have options there. So that’s the partitioning key. Once you’ve analyzed your system, decided on your partitioning key, we can then go and start building the objects to finally create our partitioned table.
So the first thing we need to do is build a partition function. Now, the partition function is what maps rows in the table to our partitions. Really simple to set up, we simply say create partition function, give it a name, specify a data type, then we pick as range right or left? This is how our partition boundaries work, and I’ll come to that in a second. and then at the end, we say, for values, and then we specify, in order, our partition boundary values; so N1, N2, all the way up to however many partition boundaries you want.
Now, I mentioned the range right or left, and this is how we define how our partition boundaries work. So we have an example here. I’m going to say, create my partition function, and it’s going to be partitioned by date. So this tables is going to be partitioned by a date column and the values there I have are the 1/1/2016, 2017, 2018. So this table is going to be partitioned by a date column, and each partition is going to have, say, a year’s worth of data. Now, if I picked a right range type for my partition function and I have partition X there, I would have data in that partition all the way from 1/1/2017 all the way up to, but not including 1/1/2018. So I would have a year’s worth of data in that partition.
So if I picked a left partition function, I would have data not including 1/1/2017, so the first date in that partition would be 2/1/2017, all the way up to and including 1/1/2018. So there would be an overlap of a year’s worth of data there. So this is why, when I come to working with partitions, especially when they’re date or date time partition functions, I’d always recommend using a right range type. And you need to be a little bit careful because you don’t have to specify it. And if you don’t specify it, the default will, of course, be left. And there’s another thing to consider, that there’s a kind of nasty little gotcha with right range partition functions that raises its head when you perform a merge operation. And I’ll cover that in in the demo in a little bit.
Erik Darling: So what would be a good example of a data type that you would use a left function with? Would it be, like, integers or something else?
Andrew Pruski: you could use integers, yeah.
Erik Darling: So like, if you wanted an example of when it’s a good idea to use a left-based partition function, when would you do that?
Andrew Pruski: I’m trying to think now – it all depends on the – it really does depend. I’ve personally always used range right, but more often than not, I’m working with date or date time. I’m, on personal preference, using right because that’s how I think of it in my head, how I visualize how the partition boundaries are working…
Erik Darling: far less confusing…
Andrew Pruski: Yes, absolutely. I find with left, you end up having quite complicated partition boundaries as well.
Erik Darling: So there’s a question over in GoToWebinar and it is, “Is partitioning a table by a bit field an unwise choice?”
Andrew Pruski: By a bit field? So you partition either by one or zero?
Erik Darling: I mean, unless there’s anulls…
Andrew Pruski: I would say, more likely an unwise choice, yes.
Erik Darling: I would agree.
Andrew Pruski: Alright, so we have our partitioning key. We’ve built our partition function. We can now build our partition scheme. And this is what maps our partitions to our file groups in our database. This is where, for me, partitioning starts to show how powerful it is. We can spread our table across multiple different physical locations just by using this one object here. So we say create partition scheme, give it a name as partition, and then specify our partition function. And then we say to, and then file groups, and we list our file groups in order that we want our partitions to be mapped to. We do have the option here to say all, and that will say all mapped partitions to one file group. So we’d say all to, and then file group. And that will map every single partition on that table to that one file group. You can do this. I do it in demos quite a lot. I wouldn’t recommend doing this in a live system. At least initially, specify individual file groups; it just gives you a little more flexibility when you’re working.
So there we are. We have our partitioning key, our partitions function, and then our partition scheme. We can then build our partition table. And building a partitioned table is exactly the same as a normal table, with one exception; instead of specifying a file group that the table goes on, we specify our partition scheme and then our partitioning key. And this is what partitions our table for us. So we just say create table, and we have a look at this table here, dbo.partitoned table. We have an integer identity column there, some random columns, and then a date at the end, and that’s our partitioning key. So this table will be partitioned by date.
So, enough of me talking, let’s go and build a partitioned table. So if I switch over to Management Studio here, and I’m going to set up a database with a whole bunch of file groups that we’re going to build our partitioned table in.
Erik Darling: You’re so much more ambitious than me. I always just do all the primaries. Like, you’ll get it later…
Andrew Pruski: Okay, so we have our database. Let’s create our partition function scheme. What I’m going to do is, the same as the slide, create a partitioned table that’s going to be partitioned by date. So the first thing I’m going to do here is partition as range left. And one thing to note, one, two, three, four, five, six, seven, and eight file groups. So we have seven partition boundaries, and that will create eight partitions. So we need to specify eight file groups. Seems obvious, but it’s caught me out more than once. Always have one more file group for your partitions. So let’s create that, hit execute, there we go. And now, we can create our partitioned table. I’m just saying create table, exactly the same as the slides. Here’s my partition scheme and then my partitioning key. Hit execute. There we go, and then this nice little query here will show us our partitions. So we have our eight partitions here, an open ended one, our file groups, boundary values, zero rows in it, we haven’t put any data in. but there’s our partition table.
Now, there’s a nice little function when it comes to partitioning, which is dollar partition, which will allow us to check the destination of any data that we want to insert into our table. So I’ve got three values here, 1/1/2012, 31/12/2011, and 1/12011. I can use this function to see which partition of my table this data, if inserted into my table, would end up in. so hit execute, and if I have a look at this – so 1/1/2011 is going to go into partition one. 31/12/2011 is going to go into partition two. And 1/1/2012 is also going to go into partition two. So that partition number two is going to have data in it from 2011 and 2012. So it’s not something I really want, so what I’m going to do is blow it all away and rebuild it as range right, which changes how the boundaries work. And we’ll then have a look at the destination partitions again.
So I’m going to blow everything away including the table, rebuild my function and my scheme, hit execute, and now I can recheck the destination of the data. So now I have 1/1/2011 destination partition two, 31/12/2011 also in partition two, and 1/1/2012 in partition three. So that’s looking good to me. I’ve got 2011 all in one partition, and the next batch of data, 2012, is going into a separate partition. So in my head, I’m getting one partition for one year and that makes it a lot easier for me to manage. Let’s put some data in and then we can have fine, we can have another look at partitions.
Okay, good, there’s my table, and I’ve just done this as prep for the next demos really, but exactly the same, just with a whole bunch of data in. so that’s great. We’ve got our partition function, we’ve got our partition scheme, we’ve built our partitioned table, but what we’ve actually done is built a partitioned heap. Now, you can perform all the same specialist operations on a partitioned heap as you can with a partitioned table that has a clustered index. But you’re going to end up with problems that are inherent to heaps; forwarded pointers and such. So we probably want to apply a couple of indexes to this table.
So before we go into the next demo, let’s head back to the slides and let’s talk about some considerations we need to take into account when working with indexes on partitioned tables. So first one, clustered indexes – I’d always recommend putting a clustered index on a new table. What we need to take into account is whether or not our clustered index is going to be unique or non-unique. If our clustered index is unique, we need to explicitly put the partitioning key in our index definition. And this is so that the uniqueness of that index can be checked by SQL just by checking that one partition. If it’s non-unique, we don’t need to explicitly add the partitioning key to our index, but SQL will implicitly put it in the background for us. It doesn’t tell you it’s doing it, but it happens in the background.
So that’s clustered indexes. We don’t want to just stay with one clustered index on the table. We probably want to add some non-clustered indexes as well. Now, there’s another consideration we need to take into account with non-clustered indexes, and that’s whether they are aligned or non-aligned. An aligned non-clustered index is an index that is partitioned using the same partition scheme as our base table. So it is partitioned the same as our base table. It is aligned. A non-aligned non-clustered index is a non-clustered index that is either not partitioned at all, we just specified a file group and we created it, or it uses a different partitioning scheme.
Now, I would always recommend using aligned non-clustered indexes on your tables, as if you have a non-aligned, non-clustered index on your partitioned table, you will lose the benefits of things like the switch operation, and again, lose a lot of the power of partitioning. And I typically see people putting non-aligned non-clustered indexes on partitioned tables to try and dig themselves out of some sort of performance hole that they’ve found them in. So that’s whether our non-clustered indexes are aligned or non-aligned. We also have the same considerations as the clustered indexes; whether it’s going to be unique or non-unique.
If our non-clustered index is going to be unique, again, we have to explicitly enter the partitioning key in our index definition. If it’s non-unique, the partitioning key, and we don’t include the partitioning key, it will be added as an included column in our index. SQL won’t tell you it’s doing it, but it’s happening in the background. So let’s add some indexes to our newly created partitioned table.
So if we can close that demo, open up the second one, have a quick look at our table – there we go, and if we jump in here, we can have a look at it. No indexes, let’s go and put some indexes on it. So first things first, let’s create, say, a unique clustered index on the ID column, that identity integer on our partitioned column. And look where I’m creating the index; I’m not specifying a file group, because that would make it non-aligned. Well actually, it’s a clustered index, so if I put a file group here, it would actually rebuild the whole table and make it non-partitioned. So I’m saying our partition scheme and my partitioning column, and then create my unique clustered index. Not good, and that’s because – I’ve got my error there because I’m creating a unique clustered index on a partitioned table and I haven’t included my partitioning key. I need to explicitly put that in my index definition or SQL will error out like this.
So let’s say, okay, let’s not make it unique. Let’s just make it a clustered index. So exactly the same index statement, just drop in unique and hit execute. There we go, nicely done. So let’s drill in and see what’s happened in the background there. What I’m going to do is just grab a file ID. I’m going to take five, and I’m going to use DBCC IND to have a look at the data pages of my table. And what I’m going to do is take a data page and drill down into it. We can actually have a look at the data on that data page. So using DBCC IND, database name, table name, output type, hit execute. Now, I’ve got a whole bunch of stuff here and what I’m looking for is page titles one data page, take that, drop that in. And now I can use DBCC PAGE to actually drill down into that data page and have a look at that data on disk. So using a trace flag and DBCC PAGE, hit execute… Okay, and I’ve got a whole bunch of stuff here, but if I scroll down, here we go, I can have a look at the data on my data page. Here’s my columns, ID, create a date, columns A, column B.
First thing to note, if I uniqueify here, this is Standard SQL Server behavior. SQL adds this uniqueifier to all non-unique clustered indexes. It does it in the background, so that’s why that’s there. Then I have my ID, which is in my clustered index. But then I have my created date column. Now, if this was a heap, the columns would be in order of the table. So I’d have ID, column A, column B, and then created date. But because I’ve put a clustered index on a partitioned table, SQL has actually added that partitioning key into my clustered index. It didn’t tell me it was doing it, which actually changed the data on the disk in the background. So we need to be a little bit aware of what’s happening in the background when we do things like add non-unique clustered indexes to partitioned tables.
So, there’s our clustered index. Let’s add a non-clustered index to our table. So let’s do that. So I’m going to put in on column A; this varchar column here. And I’m going to drop the clustered index as well, there we go. Using DBCC IND again, and I’m going to grab an index page, which is page number two, and I’m going to use the same file ID, five. Throw them into DBCC PAGE, excellent. Good output this time. We’ve got a column A here which is the key of our index. Look here, we have a created date. SQL has added that implicitly in the background without telling us. It’s added it as an included column and then it’s changed the physical structure of our index. So we need to be aware of what happens in the background when we add indexes to partitioned tables.
Now, the reason I mention this is it becomes a bit of a problem when we check out things like the switch operation. If we’re switching into another table, that table has to be exactly identical to our table that we’re switching from, and that includes index definitions. And so if stuff like this is happening here…
Erik Darling: Real quick, before you go into the next part, there’s two questions. One, “Can partition boundaries be made up of multiple columns?”
Andrew Pruski: No.
Erik Darling: No, okay, straight up and down on that. So, there’s another question, “If you’ve got a table that’s partitioned, why would you have a non-aligned index? Is there any advantage for that?”
Andrew Pruski: Yes, it’s typically done to try and get out of a performance hole. You get a query that’s coming in that is suffering badly because, say, it’s having to scan multiple partitions, so people drop non-aligned indexes on there so they can get a little bit of a performance boost.
Erik Darling: I actually have a link to that. I’ll stick that in the Slack chat.
Andrew Pruski: Okay, so the switch operation, it has to be identical and SQL is doing all this stuff in the background. And with the switch operation, you don’t have to switch into a partitioned table. So you’ll be switching from a partitioned table into a non-partitioned table thinking your indexes are identical, but SQL’s done all this stuff in the background to your partitioned table. And your switch will fail, you’ll be sitting there scratching your head going, why, and this is the reason. Now, there is a simple way of fixing this and avoiding this problem, but I like to go through all that because it’s kind of cool. And the simplest way is to put your partitioning key in a clustered index on your table, because as soon as you do that, that partitioning key will be on each non-clustered index, so you can safely add indexes to a non-partitioned table and switch into it without having to worry about all this stuff happening in the background.
So that’s what I’ve done there. I’ve dropped those other two indexes, created a unique clustered index on my created date, and I’ve dropped in that ID column there so that it becomes unique. I don’t like SQL adding that uniqueifier in the background. I’m a bit of a control freak. I like to have something that I can control, so I use an identity integer column like that. and then finally, we have a non-clustered index here, exactly the same, drop that in as well. Excellent.
So there we are. We have created our partition table and we have dropped a couple of indexes onto it as well. Cool, okay, so we’ve create dour table, we’ve got our indexes, we can now start looking at some of the specialist operations that we can apply to those partitions. Now, let’s start off with the merge and the split operations. First one, merging partitions. Now, these do exactly what they say on the tin; merging a partition effectively removes a partition from the table by merging two partitions into one. This is a metadata only operation if performed on an empty partition. However, if that partition is not empty, SQL will need to read the data in that partition to determine, if it needs to move that, which partition to move it into. This can cause blocking as these things hold schema mod locks and transaction log growth as well. So you need to be a little bit careful about when you apply this operation.
Very simple syntax, alter partition function, give it a name, merge, range, and then we pass in the boundary value that we want to drop. Very, very simple function. Now, we can move this split operation as well. Again, exactly what it says on the tin. This creates a new partition with a new boundary value that we pass into it. New boundary value obviously has to be distinct from other values. So obviously, you can’t have a boundary value that already exists on your table. It takes a schema mod lock on the table, and this again is a metadata only operation if the partition that we’re splitting from – so if we’re putting a partition at the end of the table, is empty, SQL won’t need to read any data, it will be metadata only. If SQL needs to read data, again, holds that schema mod lock or get some severe blocking on your system, whilst SQL reads that data to determine which partition that data needs to end up in…
Erik Darling: We’ve got a merging question, “Can you merge part of a partition with another partition?”
Andrew Pruski: No, it’s pretty much – you’re dropping the boundary value, so you’ll get the two partitions going straight into one. It’s a pretty much all or nothing there. There’s a nasty little gotcha with it as well, which I’ll come into in the demos which I kind of like showing people because it catches people out.
Erik Darling: I’m looking forward to this one.
Andrew Pruski: So we have the split partition syntax there. It’s a little bit more complicated than the merge, but not really. We start off with alter partition scheme next use, because we need to tell SQL where our new partition is going to live. So we just say which file group our partition is going to live on, and then we can split out our partition function. So we just say split range and pass in the new boundary value.
Excellent, okay, so let’ jump straight back in, close down the last one. Okay, so here are our partitions. We have a whole bunch of them. We’ve got eight of them. We’ve got five of them with data in and I’ve purposefully left these two empty. So let’s say we merge this partition here. So we’re going to merge 1/1/2011. Now, I’m setting statistics I/O on. So if no one’s used this before, this will show me the I/O operations, logical and physical, that SQL is going to perform. I’m going to switch this on to show you that SQL doesn’t perform any I/O work when I do this on an empty partition. It is a metadata only operation. So I run that, straight through, no work done, metadata only operation.
If we have a look here, partition is gone. But something odd seems to have happened here. I was dropping 1/1/2011, which was the boundary value of data one, right? So why is that file group still there? Why is data one still there? I thought I was getting rid of that. Now, this is that nasty little gotcha that I was talking about when it comes to merging right range functions. That boundary value there does not belong to that partition. It is the lower boundary value of this partition.
Now, the reason this is nasty is because I’m going to merge this partition here. 1/1/2012, hey, that’s an empty partition. I can merge that with no problems. So let’s run the merge again. SQL’s had to do some work there. Okay, it’s not a lot of work, but there weren’t a lot of records in the table. The fact that it’s had to do work that I wasn’t expecting is a little bit nasty. Let’s have a look at the table again. Again, data one is there. So what’s happened is when I’ve merged 1/1/2012, it dropped partition ID two and merged its data into data one. That boundary value there does not belong to this partition. It is the lower boundary of this partition.
Erik Darling: Oh man, that sucks…
Andrew Pruski: I remember when I first came across this and I was sitting there going, that doesn’t make any sense at all. It’s a bit difficult to get your head around. So this is why, when I’m working, especially with data, large volume tables have at least two week’s worth of empty partitions so that when you run your merge, you know, you are sure, that there’s no chance this can happen.
Erik Darling: So a question of mine here is would you just avoid merging partitions if it causes this much trouble?
Andrew Pruski: You could. You can have up to 15,000 partitions on your table, so if there’s no reason for you to merge, no problem.
Erik Darling: Even if you had a partition for each day, that’s a lot of days…
Andrew Pruski: That’s a lot of days, yeah, absolutely. I typically have done this when I’ve been archiving data out and I’ve just been running it with a sliding window and then just two weeks back, merging two empty partitions into each other, never ever getting anywhere near data at all. Always give yourself a nice little buffer there. So at least seven days because if you do three days and it fails on a Friday, by that Sunday, you’re going to run into issues, so…
Erik Darling: There goes your weekend.
Andrew Pruski: Pretty much, you get a page at something like 10 o’clock Sunday evening with a shed load of blocking happening on your system. Okay, so that’s merging. That’s that nasty little boundary.
Let’s have a look at the split operation. Now, I’ve got this one here, latest value is 1/1/2017. Set statistics I/O on. New partition, because I’m on data eight, so I’m going to specify a new file group of data nine, and then I’m going to split out my range of 1/1/2018. Hit execute, there we go, and I can have a look at my partitions. So no I/O performed, metadata only operation, and there is my new partition. So let’s just fire in some data and have a look at that again. So fire in some data to those empty partitions at the end, hit execute, and now I have two partitions here with a whole bunch of data in. so I’ve got that partition in there, sorry, with a whole bunch of data in them, spread across two years. So if I – yep, two years, three years.
So if I go to split out now another partition, I’ll switch the I/O on again, specify a new file group, hit execute, and boom, SQL has had to read that data because it needs to read the data in that partition to determine which side of the new boundary value that data is going to live. And if I have a look at the tables, hit execute, some rows ended up here, some rows ended up here. So SQL read it and determined which partition that data was going to live in. And whilst it was doing that, it was holding a schema mod lock, nothing else was hitting that table. So again, same with the merge, I’d always have at least two weeks ahead of your table so you’re always splitting off empty partitions, you won’t run into this issue.
Okay, setting up for the next demo, depending on a viewpoint, this is either a really great feature or a really terrible feature that came in in SQL Server 2016, and it is the ability to truncate an individual partition on a table. So all we say is truncate table, dbo.partiton table, with partitions so we can specify more than one if we want, and we can hit execute. There we go, and the data is now gone…
Erik Darling: I think that’s exciting.
Andrew Pruski: Yeah, I like that. It’s nice because before we had this, if you weren’t archiving your data, if you were just deleting your data from a partition table, you had to build a switch table, switch it out, truncate it. And now, you can just run, boom, gone, boom, gone. The danger is, obviously, getting the incorrect partition.
So setting up the last one, there we go. Excellent. Okay, so that’s the merge and split. Let’s get on and talk about my favorite operation, and that is the switch operation. Now, this is my favorite operation because it allows us to move one partition from one table to another pretty much instantaneously. This is always a metadata only operation because there’d be no point in doing this if it was affected by data. We don’t want to move empty partitions around. We want to move partition with all its data from one table to another. There are some caveats for this. Both tables must have the same structures. So that’s the same columns, the same columns, the same data types, the same indexes, and they also have to live on the same file group, unfortunately. Which is a bit of a kicker because it would be really nice to have an archive table sitting off in an archive file group and just say, switch over there. But we do what we can. We can just switch into another table.
The destination partition, if we’re switching into another partition table, must be empty, but that table doesn’t need to be partitioned. We can just switch into an empty table. However, if we are doing that, that table must be completely empty. Really, really simple syntax again. We’re just saying alter table, our source table, switch partition, specify the partition number, to our destination table. And if that destination table is partitioned, we then specify our destination partition. So we say partition and then our partition number. Okay, so no mucking around, let’s go straight back into the demos. So here is my table that we’ve built and we have a partition here with 196 records. And say that is now outside of our retention policy, we want to switch that data out into another table. So first things first, we need to build a table that it can be switched into.
So this table has to be identical to our live table. So I’m creating it with the same columns, same data types with the same indexes, but I’m not going to partition this table. This partition lives on data one file group, so I’m just going to create that switch table on data one. There we go, if I refresh, I now have my two tables. We have our columns with our indexes. If I scroll down, they are exactly the same. Cool. So no mucking around; built the table. I can now switch that partition out. So I’m going to set statistics I/O on and switch to my destination table. I don’t have to specify a partition because it’s not partitioned. Excellent. Pretty much instantaneous. Let’s have a look – and my 196 records are gone. Count the switch table – there they are, switched pretty much instantaneously from my live table into my destination switch table.
Now, this has happened to me more than once where I’ve had partition maintenance running, data has been switched out into a switch table, someone’s come running along going where’s the data? Well no, we’ve got an archiving and retention policy of, say, three months. That day, you get switched out. Oh, we’ve changed that. We’d like you bring that data back please. Can you switch that data from your switch table back into your live table? I’m like, sure, not a problem at all. Let’s switch that back. So switching it round here, we’re saying alter from my switch table. Switch partition one – but we don’t really need to specify that because it’s not partitioned, but I do need to specify, to my live table, partition one. Hit execute.
Okay, what have we got here? Okay, what that’s telling me is that there’s data in my switch table that aren’t allowed, by the range defined in my partition, in my target table, my live table. Now, I know that the data in my switch table fits into that partition because I’ve just switched out from it. But SQL doesn’t know that. There’s nothing telling SQL, hey that data fits into that partition. So we have to tell SQL. We have to apply a constraint to say, SQL, that data definitely fits into that partition.
So if you have a look at the partition, it has an upper boundary of 1/1/2013. Nothing in that partition can be equal to or above 1/1/2013. So we need to tell SQL that. so let’s add a constraint to our switch table and just say none of that data can go past 1/1/2013. Hit execute, couple of reads there, and now we can switch back in. that’s telling me there that it’s ignored partition one for my switch table because it wasn’t partitioned. But who cares? The data is now all back. If I check the switch table, the data is gone. And if I check my partitions, my data is back, the business is happy, and I can hopefully go home.
So that’s something to be aware of if we are switching out into a non-partitioned table. If you try and switch back, you need to put a constraint on that table just to tell SQL, hey, that data does fit into that partition you’re trying to switch back into. Cool, so that is switching partitions.
Okay, so let’s have a look at wrapping all of the merge, switch, and split operation into one and implementing what’s known as a partition sliding window. Again, this is pretty much what it says on the tin. This is a method of pushing all the data out of a table and bringing new data in and then rolling the table forward by merging and splitting off partitions. So at the end of the operation, we have the same number of partitions that we did at the start of the operation.
No mucking around, let’ shave a look at demos. Okay, so what I’m going to do here is go through the step by step, not running a stored procedure or anything, just showing the individual operations that we perform in a sliding window operation. So let’s have a look at our partitions. Cool, okay, so we have our eight partitions there and we have data in six of them. So what I’m going to do is create an archive table exactly the same as a switch table, but I’m not going to use that on data one, that’s going to hold my old partition. And I’m going to create a staging table, which is going to be on data nine, which is going to be where I load my data and switch it into my live table. Okay, so now, I can switch out the oldest partition to my archive table. If I refresh here, we have all of our tables. There we go. Don’t specify destination partition, it’s not partitioned. Hit execute and we can have a look – our data is gone and it is in our archive table, brilliant.
So now what I can do is merge that oldest partition. And I usually do this live, so is anyone going to spot what I’m about to do wrong? I’m thinking I’m going to merge that partition. I’m going to say merge range 1/12013. Let’s see what happens… That little gotcha has caught up to me again. If I have a look at the table, I thought I was merging 1/1/2013, I dropped that partition boundary but it merged the partition that I thought was above it’s data into it. So be very, very careful when you’re doing the merge if you’re just looking at one empty partition because the merger operation you’re applying might not behave the way you’re expecting. So always have at least a couple of weeks behind so you’re always performing this stuff on empty partitions.
Okay, so let’ shave a quick look there, and we’re going to split off a new partition, say data 11, 1/1/2020, yes. Hit execute, have a look here, and there is our new partition with its lower boundary value. Okay, first things first, I’m just going to drop a load of data into my staging table, which I’m then going to switch into a partition in my table very quickly. And then because that staging table isn’t partitioned, I need to drop a constraint onto that table to tell SQL that it will fit in that partition. And because this partition is in the middle of the table and has two values, I need to have greater or equal to 1/1/2018 and less than 1/1/2019. And I also need to specify the created date is not null because SQL doesn’t like dealing with nulls when it comes to this. If it doesn’t know of that data will fit in, it will throw an error. So I need to drop that in as well and hit execute.
There we go, so let’s check those partitions on my staging table. There’s my 250 rows. We’re going to fire it up to here. Alter table switch. Don’t need to specify a partition – staging table’s not partitioned – to my partitioned table and my destination partition. Excellent. And then finally, we can check our partitions. And we’ve got our data in our new partition and we’ve ended up with eight partitions; the same as we had at the start of the operation. And that’s pretty much the individual steps that you would implement, more than likely, in a stored procedure, running as a SQL Server agent job, that would perform your partition sliding window maintenance for you. So you’re archiving all the data out and bringing new data in on an ongoing basis.
Okay, final section, and then I’ve got a little story to tell just to finish off. So let’s do the session on file group restores. Now, this is something that people wouldn’t really consider when it comes to partitioning, but it is a benefit that we can get. Think about the table that we’ve set up. We have our table with each partition on a different file group. So if something happens to one of the data on that file group, we can use a file group restore to restore that data. This can be useful for very large databases and can be useful to store live partitions into development, if that’s your thing. Individual partitions need to be on different file groups and you need to take into consideration whether or not data in the old file groups doesn’t change or is not needed, because as we’ll see, when you do a file group restore, you need to restore all the file groups of that table otherwise they won’t be online.
But this is really, really nice because what you can do is restore side by side, bring that database up, restore only the file group that you need and then maybe copy that data in, so that in the event of a disaster recovery situation, you are reducing your recovery time by implementing this. It’s a really nice thing to have in your arsenal. So let’ snot much about much, let’s go straight into the final demo.
Okay, so first things first, I am going to take a full backup. Let’s make sure my directory is empty. There we go. Excellent., and then let’s have a look at our data. So there’s our data and what I’m going to do here is basically something disastrous is going to happen to all those rows there in partition number six. So, before I do any of this, the first thing I need to do is have a backup of all my file groups. So this database has a whole bunch of file groups which I need to take backups of. So using master, I say backup database, keyword of file group and I do this for each one. So each file group that I have, I take a backup of.
There we go, if we have a look here, there are all my file group backups. And of course, if we’re running the same in production, we’ll be taking log backups. So let’s take log backups as well. Okay, there’s my log backup, there they all are, and let’ shave a look at that data. Okay, there it all is, happy as Larry. Something disastrous is about to happen there. And this is truncate table with partition six – I’m going to wipe out that partition’s entire worth of data with one click. Cool, so let’s have a look. And there is the data. The data is now gone. Massive panic, oh no, we need to get the data back, we need to get the data back. Okay, first things first, let’s take a [tail log backup] of that database. Execute… And that has put my database into restoring and I can now start my recovery process.
The first thing I need to do is restore the primary file group. You need to do this each time and this is a case for keeping your primary file group as small as possible because you always need to restore it. If anyone works with query store, guess where query store stores its data – and there’s nothing you can do about it. Bit of a pain. Okay, so we’re going to restore the database, file group primary, and the keyword here is with partial. Replace no recovery, keep the database so I can continue restoring. Hit execute, and now I’m going to restore the file group that contained the data that I wiped. So it’s file group data nine, so I can say restore database file group from disk with no recovery. Hit execute… Excellent. And then I’m going to restore my transaction log backup and I’m going to bring my database now online.
Have a look at my partitions… Boom, and my data is back. Everything looks great. Not quite, okay. If I select data from that partition, all is well. However, if I try and select data from any other partition, I’m going to get an error because I have not brought that file group online. So yes, this is really good for reducing recovery time for that one partition, but when I have had to do this a couple of times in production, I’ve done a side by side restore, just bringing that up, and then I’ve copied the data across instead of having to take the entire database offline, run through the restores of each individual file group. So this is a nice little way of bringing data up quickly, getting the data across into my tables. It’s a nice little thing to have in your arsenal there and I really like doing that demo. It’s good fun.
Okay, cool, so that is pretty much it for the presentation. I just want to round off with a short story about partitioning. This is 100% true. It happened to me at the start of this year…
Erik Darling: The best kind of story.
Andrew Pruski: Yeah, a friend of mine had – well, he’s inherited – a partitioned table in a database and there was no maintenance running on it. So what had happened was, over a period of time, that table had all of its data in one partition. And we were chatting about it, and we’re talking millions and millions and millions of records, very big table – well, relatively big table. I was talking to him on a Friday and he was going, well can I just split out a new partition and then continue on with the maintenance? I was saying, well yes. If you work out a partition boundary value that none of your data crosses, that will be a metadata only operation. It will go in straight away and then you can split off more partitions and you’ll be good to go.
This was about two o’clock on a Friday afternoon and I didn’t hear from him for the rest of the day until I think about six. He said, fancy a pint? I was like, yeah sure, what time? And he said, I’ll meet you in the pub at about seven. Cracking, great stuff. And there’s a great little pub, I live in Bray in Ireland, I live on the seafront and there’s a great little pub down there. It has a nice little room at the front you can sit in, sea views, absolutely brilliant, so I knew he was going to be in there. Wandered down there happy as Larry by this point, workday is over, all this. walked into the pub and he’s sitting there with his laptop with his pint, looked at me and went, right ready to do this? Ready to do what? Well, we’re going to implement this partitioning solution…
Erik Darling: You’re going to need more than a pint for that…
Andrew Pruski: you want to implement a partitioning solution on a Friday evening in production in the pub? Yeah, sure, welcome to Ireland. I was like, yeah alright, why not… So we sat down. I got a pint first, sat down. I said, have you got all the scripts there. Yep, got all the scripts. It’s a metadata only operation, we’ll split off a couple of partitions and we’ll have the rest of the evening. Okay, fine, went to run the split, ran the split, errored out immediately. Not even started executing, errors everywhere. I was looking at it going, okay, what’s happened here? What I hadn’t realized and what I’d forgotten about is that that table had a clustered column store index on it. If you have a column store index on a partitioned table, you cannot split off a non-empty partition. SQL will not allow you to do it.
The advice from Microsoft is to disable or drop the column store index, split off the partitions, and then rebuild it. Not the greatest advice, I must admit. Very big table – well, we could easily get this stuff up and running, we definitely couldn’t drop the column store index because everything hitting that table would have gone to pot. So what we ended up doing was building a dev system, because there wasn’t any dev system that you could test ready to go. So we built that and we switched that partition’s data out into another table, split off all the partitions, and switched it back in. Now, this is about nine o’clock at night by this point when we were testing and building all the scripts and we had to do it in a transaction. We had to open a transaction block, all the incoming other transactions, switch the data out, split the partitions off, switch it back in, and close the transaction. So we needed to get this right first time.
I’m having a great time by the way because I’m on my third pint. He’s looking a bit stressed, but we finally got it all ready by about half past nine that evening. And ready to go, hit execute, yep, transaction open, data was switched out, partitions were split off, data came back in, closed the transaction, job done. And I was pretty pleased with that and I’ll never forget what he said to me. About quarter to 10 at this point, I’m sitting there going, great, this is brilliant. He looked at me and went, great, one down, four to go…
We were there until about midnight, but we eventually got it all sorted. So that’s the key point of this is, if you have a column store index on your partitioned tables, you need to be pretty careful with your maintenance because once they fill up, you cannot split off non-empty partitions.
Okay, so there’s a couple of resources here. Just for the code for all these demos on my GitHub, dbafromthecold/introtopartitioning. I’ve been Tweeting out the links to all my blogs that I’ve written, but there’s the link there. I’ve got a summary of every single blog that I’ve written about partitioning, so you can access that there. Thank you very much.
Erik Darling: Someday, you’re going to have to tell me what happy as Larry means. I’ve never heard that one before.
Andrew Pruski: Never heard happy as Larry? Just means really, really happy.
Erik Darling: Alright, cool. I didn’t know Larry was such a happy guy. I met one Larry, he was miserable. You know what always gets me about partitioning is when I see non-aligned non-clustered indexes out there, I always forget that it’s not alter index rebuild with the partitioning thing. It’s create index with drop existing [alter] partition. And every time, I’m like, wait a minute, why doesn’t alter index fix that?
Andrew Pruski: Absolutely, caught me out a couple of times. When we were testing in the story, because we were switching the data out and we weren’t partitioning the switch table and we were trying to switch it back in, and obviously it was failing because of the constraints. And it was the nulls – we were sitting there for ages going, why is this not working? My friend eventually went, hang on a minute, and not null, boom, straight back in.
Erik Darling: Very nice. Alright, there’s a question from Pablo here, “Can partitioning help with locking activity on an OLTP system?”
Andrew Pruski: It depends…
Erik Darling: That’s a big it depends on that…
Andrew Pruski: It depends on exactly what’s happening with queries coming in. if you’re inserting into one partition and you want to read off another, yeah, sure, absolutely.
Erik Darling: You can change your lock escalation from table to auto, and that will get you, sometimes, partition level locking. But there’s a pretty big warning on that feature, most of the Microsoft documentation around pretty frequent deadlocks, which is why it’s not the default behavior for partitioned tables. Again, test that carefully in a development environment.
Andrew Pruski: I think with anything, when it comes to partitioning, as I said at the start, know the system. If you know your system, you can pretty much limit how much issues you can run into. It’s not one of those things you can just implement and see, well if it doesn’t work, then we can just drop it, because you end up with a really poor performing system. It’s difficult to get rid of it because you have to basically rebuild a table.
Erik Darling: I was going to say it’s worse than Hekaton, but Hekaton you have to drop the whole database. You have other issues… It’s funny that that’s a limitation with column store because partitioning and column store do kind of go hand in hand so well together. It’s like, no, you can’t do that, and you’re like, that’s basic stuff, Microsoft, come on.
Andrew Pruski: Pretty much. There was a bit of cursing going on when we saw that one, yeah…
Erik Darling: Like, really, that? For all the pushing you do of these two things together, I can’t do that? Strange. It was a pleasure having you. I don’t see any other questions. I assume you want to go back to that fancy seaside pub and have several pints this evening?
Andrew Pruski: I think so. That sounds good.
Erik Darling: Sounds like a good one. I would buy you one if I could, but I’m stuck here in Brooklyn. Anyway, thank you. Thank you everyone for coming and hanging out with us today at GroupBy.