Enjoy the Podcast?
Database Administrators, SQL Architects, Developers who write queries, BI Devs working with ETL.
Have you ever considered a situation where Columnstore Index can be quite the opposite of what one would expect from it? A slow, wasteful source of painfully slow queries, lagging the performance, consuming irresponsible amount of resources …
Setting the wrong expectations (it won’t run 100 times faster on EVERY query), selecting the wrong architecture (partition by 100s of rows instead of millions), using and aggregating by the large Strings in the fact tables – this list is actually quite large.
What about some of the less known limitations for building Columnstore Indexes? The ones that will bite you suddenly in the middle of the project – when you do not expect it at all?
Let me show you how to achieve those painful mistakes and you will surely know how to avoid them 🙂
Why I Want to Present This Session:
I have seen people giving up on the Columnstore Indexes because they did not understand the internals and I would like to help people avoid those painful mistakes.
Brent Ozar: This is Niko, I’m not even going to try to pronounce Niko’s last name. Niko is infamous around the world for his expertise on column store indexes but of course, now he’s going to share that knowledge with you. So Niko, take it away.
Niko Neugebauer: Alright ladies and gentlemen and everybody who’s watching this recording. Welcome to the session on the column store indexes worst practices and lesser known limitations. Not unknown, less known. This will not be a very basic session, I’m sorry for that. I will not do a really explanation, I will try to just like, try to dive into this stuff. If you are missing the concepts, please feel free to re-watch the recording. There are great blogs, there is a very nice – it was written by Hugo Kornelis on SQL Server central, on the column store indexes. Sunil Agarwal, the principal program manager who led the 2016 release of the column store indexes. There’s a huge number of blog posts on the column store indexes, there are presentations and so on, go and learn this stuff. If you are like – you know, I’m talking absolutely gibberish, that’s not your fault. I’m Portuguese, I’m crazy, I’m not even sorry.
Alright, so let’s dive in to this stuff. So my name is Niko Neugebauer, don’t try to pronounce my last name, as Brent just mentioned. I describe myself as a Microsoft data platform professional. It means I like working with different aspects of Microsoft data platform. I work for a great company, which is called OH22 and so on and so forth. So you can see all these details, I run a couple of user groups locally in Portugal, organize a couple of events blah, blah, not interesting. I blog quite a lot on the Column store indexes, I have over a hundred blog posts. Every month I do like three, four new and check out the CISL, column store indexes script library – this is a library which you can use to diskover the inside and some of the internals and do the optimizations of the column store index. Free library, GitHub, use it. Do whatever you want, I’m fine with it. Don’t blame me, okay you can, if you wish, who cares. So today’s agenda is very simple. I have like six, seven, whatever slides and then we just go – just examples, nothing else.
So intro, overall recommendation, computing column store indexes, quite important to this stuff. Computing and reviewing, so loading data into column store indexes, very important. I’ve seen cases of row store process running in few minutes, 15, 20 minutes, changing to column store and running over five hours, which drove the client absolutely nuts on the next morning, because of course it broke a couple of things. So loading is really, really important, because you can load the data into the column store indexes faster than into the row store.
Querying the column store indexes, well, once you get the data into the column store, you want to extract the data, you want to get some insight. That’s not about artificial intelligence, that’s what I have, I have an artificial intelligence, nobody knows, but let’s go into the column store indexes. Column store indexes are – and that’s the only basic slide I’ve got. They are vertically separated, so I used to say this is how the non-IT people used to think how the databases are storing indexes, because people – they look at the Excel and they see like the columns and they’re like this is how we store that in computers. And now you can say with the column store indexes, yes, this is how we do it. So they are grouped into these segments, so these segments, they can contain between one and 1,048,576. Not one million rows, because you can and you will have the very small rows.
But these are kind of a partitioning, so you have the huge amount of data, on the image you can see I have like three row groups, and let’s say you have three million rows, you will have at least three row groups. Let’s say everything went fine, we got no memory pressure, not dictionary pressure, no other limitations. We’ve got every row group with one million rows, and segment corresponds to one columns row group. So for me, column C1 is split into three segments. One segment for each of the horizontal row group. It contains the values for the rows of these one million rows, hopefully, and just the values of the C1 column. Very extremely compressed.
Once you are migrating from the heaped table, non-compressed, completely non-compressed table into the columns for what I tell to my clients, the expectation is you get around ten X improvements. This is what I’ve seen. I have seen over a hundred times, let’s put it this way. This will depend hugely on your data. If you Google, there are – I’ve also read a couple of blog posts on how column store indexes compresses – how they achieve these. Extremely, extremely compressed. If you compare with the page compression, the best available row store traditional index compression, you can expect two, three times. Your mileage will vary because it will depend on the quality of the data, and they are tuned for processing large volumes of data.
So right, I’m going a big data, I’ve got 50,000 rows, no sir, you’ve got 50,000 rows, that’s a rather small dimension in data reality. It’s – okay, you can put the column store indexes – they can bring you some significant advantages but it might not happen. They’re thought for the millions, millions, billions and gazillions of rows. And I told you this, you already know it so let’s dive into the real stuff.
So back to the agenda again, overall, the highest scene, once, especially you migrate into 2016 and the same will apply to the RTM and the future versions, as expected of the SQL Server. If you do not use the highest availability compatibility level, you will be removing some of the most advanced column store indexes batch processing mode functionalities and it will do a huge – and I can tell you that every single week I receive emails, people basically they do the migration and they don’t see like the advanced functionalities bringing them these advantages. What is going on, and I’m like check your compatibility level, might be the thing.
What about the building? Cutting memory, just like get the biggest amount of memory you can get, scream at your DBAs, scream at your managers, cry like a baby, I don’t care, do something. Get some more memory. When you are rebuilding column store indexes, one of the most important things will be resource governor. Do not put – rebuild the column store indexes to compete for the resources with row store processes. You will not like what you will see. Do not even put outer index reorganize the top remover into this same situation, never. Just – I can tell you in the future, there will be more information on this stuff definitely, but right now I have found out at SQLbits that some of the problems that take place on some of the machines I have avoided just by using this practice, which I set a couple a years ago for myself. So that’s an important one.
Don’t use strings – well, if you can avoid strings, they are evil. Strings are evil for the column store indexes. 2016 has improved enough stuff but still they are so evil, and if you’re using…
Brent Ozar: There’s a – one of the Oleg – Oleg asks, “How about GUIDs? Are GUIDs considered strings or is it okay to put GUIDs in column store?”
Niko Neugebauer: Not a very good idea, because GUIDs are very, very – and most probably you will get a lot of problems. I would expect instantly the dictionary problem on the GUIDs, which is not solvable at all, right now, currently.
Alright, so partitioning. You should be absolutely using partitioning. If your 2016 service pack one, there is no reason you can’t use partitioning, right? And it’s not like just because we can we should do this. No, it’s not about that. So there are scenarios and I’ll show you the demo when over partitioning can bring the column store indexes on its knees, virtually on its knees. Not a very good thing. Compression delay, just a small feature but can irritate and piss off badly people. Just you need to check on it.
What about the loading? Not using bad code API, so if you’re loading less than 102400 rows at a time, your data will end not into the compressed segment beautifully, it will end into the delta store.
In 2014, these delta store page compressed, in 2016, very important improvement, they are not compressed anymore but they will end there. Very, very slow process. Very, very slow. So basically the recommendation here is group – if you’re loading into data rows and huge amount of data, just group it. Just re-sort it on the most important column and then load it. It will be beautiful. Using IDENTITY, not a good idea. Using merge, evil. Parallel loading, you need to be using parallel loading in 2016. Amazing feature, I don’t know why Microsoft even blogged about it so late, like they took – after somebody mentioned it publicly, I blogged about it, it took like six, seven months for them to blog – to put a blog post on this. This is huge, huge thing in 2016 and you need to use this. A couple of things on integration services, a couple because there are no many news in integration services 2016, oops, I said it.
And then we have something about the [create], so casting on the wrong side, which is not the east side or the west side, it’s the wrong side. Overusing NULL expressions, you know, I know that Brent, he loves NULLs, he just – you know, his life is like – it’s getting better when there are NULLs.
Focusing on cross apply where it’s not needed, don’t use it. I mean, if you need to use it, if there are good reasons, go for it, but be aware what it might bring to you. Not using the advanced storage features, this is like a bad thing. And one of the most cardinal scenes is not believing that the estimate and the actual, they can be very, very different and for the column store indexes, they can be absolutely different. I’m not talking about the adaptive query processing 2017, which is bringing like another – oh yes, got it. Is not NULL.
Brent Ozar: I couldn’t fit and I tried that, I wanted like longer things on the license plate or NULL – NOT NULL – didn’t work.
Niko Neugebauer: That’s beautiful sir. So estimate or not, do not compare the estimate for the column store indexes. This can lead you to the wrong path for the performance evaluation. And well, I’m sure I’ve mentioned just – don’t use natively compiled stored procedures when you have the in-memory column store indexes. They will not be used. That’s bad and you will not be able to see it.
SQL, the lone star, the Texas style. Sir, yes sir, with honors. And so that’s it, that slides are gone, Niko is going down to the management studio, hopefully. Alright, and so I have a couple of demos. So basically this is what I’ve prepared, there’s like 13 different folders, each one has – some of them have over 10 scripts that’s absolutely crazy, we wouldn’t fit it into two hours and we’ll try it anyway. Let’s do this. Just so I can join myself, I don’t care. Unless Brent shuts me down, I’m doing my 60, 70 minutes hour and [inaudible] will take over and the data monster will be showing you a blue tongue. Scary stuff people. Alright, so what I have here is a very, very beautiful 2016 SQL Server with a service pack one. I know, I know, can be very evil without the latest cumulative update. Cumulative update two has fixed nine very bad bugs, so use the latest cumulative updates for column store. Just a little tip. And I have here a couple of databases. First of all, I have the infamous or famously famous – three top … times famous Contoso retail data-house database, which didn’t change things Brent, 2008, 2009? Probably 2010, let’s be you know, respectful of the old databases. And I also got TPC-H database. What is it? TPC-H. These are a kind of a data warehousing schema kind of snowflake thing comparable; and how can you get one? Because I will not give you mine, I’m bad. There is something, which is called HammerDB. It’s a free stuff, many people – I believe Kevin Klein, Kendra Little have blogged, published video, so many people talked about this. Google it, find it, install it, generate your database. You can go into terabytes.
The latest version, they generate column store indexes by default. Smart people. Alright, okay, so let’s run a couple of things on my – there was a Contoso retail data-house database. So I will go to it and I will set the compatibility level to 130.
The compatibility level, directly, can be set through clicking – right clicking on the database, selecting the properties, going into the options and selecting the right compatibility level. This is what you want to have on 2016. When you do upgrade, when you use the query store, all these techniques to keep your execution plans correct while upgrading, this is all the same stuff that everybody told you before. I’m talking, this is your goal, this is where you will get the performance improvement. So I have it on 130, I will set the statistics time I/O on and I will run a simple query, doing a fact online sales table join with the dimension of the product, doing a small filter aggregate. And it takes me well, let’s say 580 milliseconds, very slow. Fact online sales table has 12.6 million rows, so it’s half a second, alright, in my case.
Two quarters, and let me just show you the fact online sales because I’m not showing you the actual execution plan right now. But it’s got this beautiful cluster and at the same time column store index, right? Okay, so let’s run the same query on the 120 compatibility level. Let’s see what happens. How will it work? Well, that’s a similar time, and actually it’s even a little bit faster. That’s incredibly. Well, are you saying, Niko, that I’m doing downgrade? No. It’s an average time because sometimes – I consider to be similar, okay? Sometimes you’ll get one faster, sometimes other – maybe I’m writing some more code in parallel – I don’t know, whatever. So let’s run this query on 120, on 2014 compatibility level. I’m just going from two – DLP, from two cores into one core. What time did it took? Ladies and gentlemen, we burnt our CPU.
That’s actually a little bit slower. Slower how? Like that’s – that’s almost ten times slower. Niko, what did you do? Well, well, well, well, that’s 2014. What if we do an upgrade to 2016? So compatibility level back, let’s rerun the script. How much time will we take? Oh gosh, that’s not five seconds, that’s 1.5 times slower. This is what you would expect. The thing is, this is the biggest fix Microsoft did, thiswas a huge pain in the back in 2014 and in 2012. Batch execution mode runs only when you have two cores or more. And 2016 – on 2016, if your compatibility level 120, you’re still not getting it and that will drive people crazy. It’s like – it’s not faster. Check your compatibility level, alright? I have a query which is using a – not very much documented or supported trace flag but which will – basically will put this execution plan into the row execution mode on 2016. And you can see that I have a similar execution time with the compatibility level. So it’s all only about the batch execution mode and on one core you can get it only with 2016 compatibility level. On two days ago announced 2017, there will be stuff like this definitely. So just keep in mind this, be aware.
Sorting, this is one huge improvement. No matter what people will say, windowing functions are beautiful. The most of the time the execution plan is pending, it is doing sorting the data. If you don’t get the sorting right, it will be slow. This was [inaudible] So the same thing, I’m not running the demos because I want to spare because I have too much stuff. You’ll have to trust me. I know it’s tough, but you got to do this.
Brent Ozar: A couple people asked are they going to get the scripts? Are they like on your website?
Niko Neugebauer: Never. Never, I’m not showing. It’s just like ever, Brent, I’m sending them – not after I finish the session but tonight.
Brent Ozar: Light them on fire.
Niko Neugebauer: All the time, so 130 compatibility level, I’m running a basic aggregation predicate pushdown, okay, and it took me whatever time it took, so this is four milliseconds. Four milliseconds, this is 12.6 million rows ladies and gentlemen.Four milliseconds of getting MAX, not bad. Okay, let’s do it the old style, 2014. That’s not milliseconds. Not very much milli, not very much – 2.4 seconds. 2016 is not faster. Did you ever check the compatibility level? I warn you, that’s not going – you know, I tried, I talked to the psychologist, my wife, she’s like – she said like in your case Niko, sorry.
Brent Ozar: The best you can do is just get on webcast.
Niko Neugebauer: Bingo, Brent Ozar [crosstalk] this is how we’re rolling, alright. So compatibility level is important. Another one is memory pressure, so here is a script. You’ll get it, you can’t run it, don’t run it on production please. So I create something evil, a resource pool which is called low memory, with a max memory percent on 20, set, and then a greedy workload group, which is using just five percent over those 20 and we run the reconfigure, we go to clear the classified function and say if Niko is entering with [inaudible] no, then you use the greedy pool. And let’s do the system maintenance, so I already rebuild my column store index, fact online sales. And you can see that I have here on fact online sales it’s – well, whatever, it’s not filtering very much but I have 13 row groups, okay?
The maximum row here you can see is number 30 and it goes on to the other tables. I can use my CISL function, which I always use and it will get me a little bit more complete information and so on and so forth. I can see I only got the compressed row groups and it’s 13 and it’s 12.6 million rows over there. This is all the information. Wrong button Niko, wrong button. My…
Brent Ozar: It’s always hard to remember those.
Niko Neugebauer: Yes, my [inaudible] is not the professional one. So let’s do this, let’s do something evil. I will do this and I will enter as a Niko slow and I will run the rebuild on the right database. You got to be aware, [crosstalk] on the wrong database, most of your queries will be blazing really fast.
Brent Ozar: I like the demo that is – this is how security works. Oh, while you’re in there, Elsemer asks, “With the MAXDOP stuff – does this still hold true for using the legacy cardinality estimator from a query hint? Do we get slowed down if we’re using a query hint for a legacy CE?”
Niko Neugebauer: I don’t think so because it has nothing to do with the query optimizer improvement. These are about the estimation, not about the processing itself. The batch execution mode is processing, not the building of the execution plan, alright? So while it’s running, I want to see the estimations. Ladies and gentlemen, we’ve got 13 row groups for 12.6 million rows, how many will we get? How many? Tell me the numbers, Brent.
Brent Ozar: It was how many rows?
Niko Neugebauer: 12.6.
Brent Ozar: 13 row groups?
Niko Neugebauer: Yes, the original one, the good one.
Brent Ozar: The good one – it’s going to be different now?
Niko Neugebauer: Because I’ve set 20% of the memory…
Brent Ozar: Oh memory – [crosstalk] I didn’t think that – wow, that affects the size of your row groups too? Oh man. Oh, that blows. That sucks.
Niko Neugebauer: You don’t see what I’m typing because I’m not typing.
Brent Ozar: Yes, oh sure, yes, evil assistant. Dang, so it depends on how much memory the box has. Oh, that sucks.
Niko Neugebauer: It’s still running, it’s 1 million 36 seconds, should be actually around 1 million. Most of the time it’s so evil that you will have no idea. So any estimations?
Brent Ozar: Oh, because I don’t know how much memory the box has.
Niko Neugebauer: It has 8GB.
Brent Ozar: I’m going to say like 50, I’m just guessing.
Niko Neugebauer: Okay, so any bigger numbers on the guesses?
Brent Ozar: No, nobody’s guessed first. I don’t think anybody else realized what’s going on here either. Oh wow, jeez, memory really does matter, huh? Wow, and I guess it matter what other queries are running at the same time too if you can’t get enough of a memory grant.
Niko Neugebauer: I am rebuilding my column store indexes, who cares? People who care on the next day.
Brent Ozar: Wow, so then it makes select queries slower.
Niko Neugebauer: So if I ran for example the details function, which will show me the row groups, you can see that it’s [crosstalk] – on the total rows…
Brent Ozar: But on the bright side they now fit on a floppy disk, so that’s kind of cool.
Niko Neugebauer: Some of them. This first one will not, I’m sorry. If you’re not old enough, don’t care. Floppy disks, they don’t exist like [inaudible] never existed.
Brent Ozar: Wow.
Niko Neugebauer: So scary, scary, really scary stuff. Don’t try this at home. Alright, so I spent half of the presentation – okay, I’m rebuilding, I’m bringing back the good old solution, right, so let’s run it. Okay, the other thing is evil is you can see this demo, this is creating a C4 with a character 800. This one, no matter – so if I’m doing here with the regular user, not with the Niko slow, this will cut me, I’ll not run this demo because it’ll take me two minutes. It will get me to – maybe I can actually – maybe last demo test I did so you’ll see the result. It was 65,000. Okay of course, now it’s competing for the resources. So learn with Niko, don’t run … parallel with the other stuff. Bad idea, bad Niko.
Brent Ozar: There’s a question for I Jeb, he says, “What about – does it get any better in 2017 if you do things like off-road data? Are you allowed to do off road data in a column store?”
Niko Neugebauer: Yes, so we finally get the lobs on the cluster with columns store in 2017.
Brent Ozar: You don’t sound excited…
Niko Neugebauer: The limits are still applied because this is the two part of in-memory. You need to keep the whole row group in memory fitted. If you don’t have for the memory grant of the row group, it will be cut automatically. Lower the number of course, half of the memory grant until it fits or you know, until it might be by rebuild simply get you some [inaudible] and throw you out, everything is possible. Okay, so there is something wrong with my virtual machine again, I can see. That’s beautiful because it’s running too slow and still enough memory.
It’s 135 so let’s hope that it will come back because this is one of the bad things on the column store. You can – while you’re doing the rebuild, you cannot run the DMVs information, you can’t get the information on the DMVs. It will – but you can see, so this C4 column 800, it will be cut to these two ribbons, ladies and gentlemen. And I will tell you a secret, a little secret, there is nothing you can do about this at all. If you get big strings, you’re doomed, they will be cut. Bad thing, okay? The void strings, strings I warned you, alright? Okay, then – oh it’s strings time, it will get even crazier ladies and gentlemen. So I will be using the TPC-H, you’ll get the restore, once you generate it. You can use the different 1GB, 10,000GB, whatever you wish. For the restore, I create from the original line item table, line item CCI with column store index and I will do something about one of the most problematic structures there. The line item table in TPC-H has a little, little, little problem. It’s a fact table or a kind of a fact table. I’m not a very big fan of this schema – I’m using it. Originally it has a ship mode column, which is char ten – so I said let’s take the – let’s see the diff synced values within this column with our table and maybe we can put them into the dimensional separate table and by querying we’ll create a view and basically join them, and see if it might bring us something, right?
Okay, so line item CCI, test query to see how many – Niko, get the right database – we’ve got seven distinct values. If this would be millions, not a good idea to put on a separate table. In our case, seven distinct values, I’m fine. This will be a short lookup, a basic optimization, alright? So from number three, we’re going to the four, we create the line item CCI optimized table where we basically substitute L for the line item, underscore ship mode into line item ship mode new, which will be a tiny INT. I know we should be putting BIGINT, this is a – I’m a bad person, okay. Cluster with column store on it, you got to put a column store on it, then we – I’m like Brent, I love the idea of your presentation, you’ve got to put a key on it. That’s beautiful.
Brent Ozar: Right? Yes, once you start singing, that’s the end of it.
Niko Neugebauer: Bad idea.
Brent Ozar: All the single tables, all the single tables.
Niko Neugebauer: This is a professional broadcast.
Brent Ozar: I’m not so sure.
Niko Neugebauer: I would like to thank my parents for giving me such a [inaudible]. They will never see this video but at least I will try to – so our dimensional table, we’re putting the unknown members who will have the eight rows. They’re loading the data, doing the lookup, building the foreign key – please put the key on it. That’s – so many things – SQL Server is so intelligent unless you make it act stupid. And then I will run this to queries.
The first one, the three aggregates on the line item CCI and the same three aggregates on the line item CCI optimize doing a lookup on the ship mode, okay? So no execution plan showing right now because the whole idea is basically if you are putting the execution match showing SSMS, your last query will get a bonus to it, and if they are very fast, this might be like triple the time of the execution bonus. So view the execution, then view the execution plan. Do not mix it up, very important. So this is something wrong but 16 seconds – no, something is wrong because it should be actually the half of this time, the regular execution. Maybe I did something…
Brent Ozar: There was a guy who told me to check no matter what, check the compatibility mode. That’s what I would say.
Niko Neugebauer: No, that’s fine. This one is – I was doing something bad with max view P recently where I was like maybe I did something really, really bad. I’ll do – so cost threshold for parallelism is…
Brent Ozar: There you go, there you go.
Niko Neugebauer: Let’s do it the right way as much as possible, but the results you see, it’s two times faster. The lookup, the second query, eight seconds, the first one, 16, okay? So this technique is basic. In a real data warehouse you might – in existing fact table, you might not be able to redesign it, but if it’s critical, you might get some conversation started. So it will be – might be very significant. So this one will basically show you in 2016 that we get no row group elimination here on this query. So set statistics I/O here – we have different values for this L command and this would filter the row groups but things still do not support row group elimination. Please upvote for all the Connect item for allowing this, even for the default collation, we need to get this started. String are so common. If you’re running analysis services with a direct query, there is no – nothing else but strings, just separate. There are other demos, I’ll just stroll through it, so delta stores, if your data is not compressed, a string predicate pushed down will not work on the delta stores and I have a demo here comparing with a row store table – row store will function faster. So bad it will be. It’s rare to get these occasions but these are the limitations.
You can use the extended events and the demo scripts and you can find on my blog, the detailed information to see which single instruction multiple data, synced instructions are being pushed or not. The message here, if you have got a lot of dictionary entries, more exactly more than 65,000 you will not get string predicate pushdown on this column.
Brent Ozar: And it sounds like this is a big part of the column store work is figuring out why the queries aren’t going as fast as you want them to and then it takes a new set of instrumentation skills, so now I’m seeing really why you wrote those scripts that you did, to see deeper into column store indexes because you have to figure that stuff out yourself.
Niko Neugebauer: And there will be – so let me announce, there are new things coming for the library. I don’t know – this will take place, I will finish them, after [inaudible] so during the summer I’ll bring them. The whole idea is here basically, all the knowledge that I have joined over the years to give you a possibility to go through this and other not published also practices, and give you an idea why is it slow. Check this, this item is probably slowing you down. To – how to make this work, to make you concentrate on the really important stuff on redesigning this and whatever.
Brent Ozar: Very cool.
Niko Neugebauer: Okay, so we’re half the presentation, maybe 40 minutes and we’re number four out of 15. So partitioning, so I created the line item table, I created the same orders table and then I will do something really crazy. I create a partitioning function, which is called FN_dayrepartition. I will take my line item table and I will partition it daily because I’m crazy, because I can.
Brent Ozar: Well, if you have three days worth of data.
Niko Neugebauer: No, a couple of years.
Brent Ozar: That could be bad.
Niko Neugebauer: A couple of – oh, that will be bad. And I also create the monthly partitioning. So this one is monthly partitioning, here I’m creating a function which will – let me just show you, create a monthly partition between 92 and 99. Of course – so you can see 1st of January, 1st of February, so the result will be, I will have three tables. Line items_CCI non-partitioned, line item CCI part, this is – I partitioned – the daily partitioned one, daily, and then the _month, this is the monthly.
Let’s run it. Basic aggregation, nothing spectacular, I run it a couple of times because it’s always a good idea. So first one, 20 milliseconds, as expected. The second one, is 30 times slower.
Brent Ozar: Because of all the data and partitions.
Niko Neugebauer: Every single partition is a separate column store index, which you can sort and you should, on a different way, but it also contains all the dictionaries, deleted buffer, whatever, deleted bitmap, all those structures, all those internal tables, it will kill the performance. This is the same tables 30 times slower because of the partitioning. Look at the monthly, when we go from daily to monthly, look at the result. That’s 30 times less partitioning and it’s the same performance, virtually.
Brent Ozar: Is there any guidance about like how many rows you should have per partition or?
Niko Neugebauer: 1,048,576. One million rows. This is on the SQL beats just – I spent like 10 minutes, most of the answering the questions, most of them were how big are the partitions. I’m like, one full row group. It’s not – you know, maybe you’ll be fine with 500,000 rows if you get like comparable performance but make sure – so you can simply go into Cstore_getrowgroups, I can’t run it on this partition table because it will take half an hour or whatever, but on the other, it will just tell me – I need the details, so I have – I can calculate, 12.6 million rows, I have 13 row groups. Fabulous, all fine, and so on the library you can go and say like show me one partition, show me an aggregate, whatever. Read the docs. Drop Brent an email, he can answer it.
Brent Ozar: No, no, no, no, no.
Niko Neugebauer: Hopefully. And Brent, do not forward it to me please.
Brent Ozar: Bad behavior [crosstalk].
Niko Neugebauer: Alright sir, you asked for it. Alright, so another test query, I took a TPC- H query, which is more like what you will see in real life. This end exist and non-exist, not the best practices but running the non-partitioned with the daily partitioned table, compare them and what we will get is the performance will be very different. It’s not 20 times, but still, 1.5, 4.7 seconds. Just because of the partitioning. Just don’t over partition. Keep your row groups tight, as close as possible to one million rows and please, please send messages to Microsoft, tell them we want to have opportunities to set them two million rows, maybe even like in analysis services up to 16 million rows, might be a good idea. Scary, because we will need more memory, right? Otherwise, we’ll get the memory pressure because the row group needs to feed into the memory.
But you know, this parameter is missing like for a good number of years, since the first release to be precise. Using compression delay, it just – compression delay is basically a parameter, 2016 again, which is telling you that once the data is landing in the column store index, and you can set it on all three column stores in 2016 and 2017 on cluster column store so data housing, on operational analytics or HTAP disk based non-clustered, updatable, or in memory column store.
It’s basically a parameter in minutes. You can specify, so once data landing is landing into the delta store and delta store has reached the maximum size with no other circumstances, it will stay like this after its closed before being compressed into the compressed row group. So it’s good because if you have a hot data, you say like, I want one day after the data has landed to be still in the data store. I don’t want – because it might be updated and the compressed data, which has been updated, it’s stored in a deleted bitmap so logical fragmentation – 2014, the only solution is to rebuild the whole index or using partitioning, just use partition.
It’s nothing – you want these parameters to set sometimes, but if you’re a consultant and you’re landing and you’re just like I don’t know why it’s compressing, you know, you’re using my library or check the DMVs, you see ah-ha, it’s not compressed because somebody said it’s like 24 hours delay. Copy paste from somebody else’s script, good idea. Not always, alright?
So about identity, because we’re running like not very slow, I’ll stop at the second to take a breath, to show you once again the data monster. Any questions, Brent?
Brent Ozar: There is, Brad says, “Is it a bad thing if I create multiple partitions for the future that are just empty or should I just only create them on demand as like I actually need to create those extra partitions?”
Niko Neugebauer: The answer is you need to have at least one empty partition. So I’m – hopefully, next week, maybe in two weeks, I’ll put a very big blog post I’m writing for a lot of time on the partitioning for the column store. There are a lot of small and not so small quirks on them, huge topic, hugely important, keep at least one empty partition always. So keep two, because sometimes you just – enter, well not what I wanted. More questions?
Brent Ozar: No, that’s it. Steph Lock is like, the 100 plus articles that you’ve written are giving her a tough start, she has so much stuff that she has to go read, but yes, keep going.
Niko Neugebauer: Book is coming soon.
Brent Ozar: Are you actually going to do a book?
Niko Neugebauer: Yes.
Brent Ozar: Oh good, cool. Very cool.
Niko Neugebauer: Wish I could announce on this show, I wanted to announce earlier, too many stuff. Let’s get back to the real stuff.
Alright, so let’s do the identity test. I’m creating here a staging table where I’m holding just two million rows. I’m creating a cluster column store index on it, okay? On this script. Then I will create the fact online sales identity table and I will load the data into it and you will see that – because I already have this table, if I truncate it – no I don’t, of course, because I reset this part of the demo, but I’m easily recreating this table in the script number two. Yes, I am that chaotic and I’m jumping around the different scripts. Just watch it time over time to find the right sequence. So sorry, no, not really.
So if I insert the actual execution plan, because I didn’t create the stage, trying to speed up stuff, not always gets you the stuff you want to do, so I’m creating this staging table here, this two million rows. I could have created it actually while I was talking. I decided to do this in a bad way, as always. So if I will include the actual execution plan once this one is ready, should be like a couple of seconds, just two million rows, not hundreds, 20 seconds, we’re done. If I run this query – because I selected – Brent is not here to help me to tell me that I’m running on the wrong database.
So if I run this query, you will see – even if I tried with TABLOCK in 2016, it will not have the parallel insertion part. It’s because identity is preventing it, and all these demos that you will find in the scripts, they will show you that you can try it with a sequence, you can do a lot of tricks but you will not be able to get the improvements and you can see, even the reading of the data from this stage is single threaded. There is no parallelism sign, and they most important, here what I’m trying to achieve is the parallel insertion because this would speed up – trying with TABLOCK would not help me in this case.
So the solution is I have to say thank you to Arvind from the SQL cat team who tweeted me this solution, I was struggling to find to use this – I believe originally, if I’m mistaken I’m sorry, but I believe Adam Machanic was the one who introduced this solution, this running the row number over order by select NULL and if I will run this one, I will get the best speed for insertion. So it took me 20 – not 20, whatever – the most important part is the parallelism and you should be able to see it.
I’m writing, of course, identity insert, and I hope the demo gods will be fair to me and they’re not unfortunately. But you can see at least that I’m running the window aggregate in the batch execution mode and I know why it’s happening, that it’s something else for the other demo and I need to reset it. It’s the course threshold for parallelism. In order for this demo to be fast, I need to – because the overall execution plan – no, it’s 250, there is something else going on with my computer. I’m sorry for that, but you can run this on your computer, it will – it should work correctly and on my blog post you can read the details about this one. So I didn’t use this trick, row number select over NULL.
Another one, don’t use merge. Just don’t, ever, ever, ever, ever. You can use the merge like this – besides all the problems and I will direct it to Aaron Bertrand article. He blogged very good things, some problems you can get into while using merge statement and nobody cares or knows how to write correct merge on the top of their head. Merge is basically delete and insert because the update is delete and insert for the column store indexes, and so running the alternative of delete using – with EXIST clause, to make sure that you’re deleting something that makes sense and then inserting the data – I will show you the difference, so this is the identity but this is the merge.
Huge difference on the merge and delete and insert, and the more data you’re loading – if you’re loading 500 million rows, this will go absolutely nuts. Just, column store indexes, merge, avoid it, because merge will not run in the batch execution mode while your insertion can. You can get your insertion running in the parallel – in the batch execution mode, which will bring you a huge advantage. So for parallel data loading, and I see we are approaching, this is a very important item for the column store indexes, especially you will need to use a TABLOCK. If you’re loading the data in the data housing, not in OLTP scenario, use a TABLOCK kit. This will basically split the workload between the number of course.
So I’m loading, for example, two million rows and I have four quarters on my VM, it will create four threads, each one with 500,000 rows and it will not load them into the delta store in a slow way. It will load the data into the compressed row group in parallel. So TABLOCK hint is just hugely influential. It’s 2016, even if you’re working with a row store, always use a TABLOCK for data warehousing or BI. On the OLTP, once again, very bad idea to load the data into your huge table which is running live with a TABLOCK hint, okay? I hope you agree, Brent. If you’re not, well, then you don’t.
Brent Ozar: I walk back and saw TABLOCK and I went wow.
Niko Neugebauer: It just runs faster. Alright, so on the integration services, where we’re loading the data, important things, important settings – so in 2016 you’ve got one good setting, but you got to have enough memory because I’ve seen it crash in integration servers, besides giving errors, which is a basic stop. Oh come on, my virtual machine doesn’t like me… Testing connectivity.
Yes, it’s testing. Okay, so once it will be back I will be going to show it, but I will show you the content which we will get. We will get SSIS, virtual studio 2015 project and you will get four files, which you – for integration services, projects which you can basically run and test and see the results. I didn’t set it up because I’m trying to speed it up, let me just solve this part. That’s a havoc, Niko. Professional havoc – I need to do like a professional havoc or a…
Brent Ozar: The chaos monkey.
Niko Neugebauer: The chaos monkey, I need this toy, chaos monkey. So let’s put it on the right because I, of course, I put it on 2014, I have three instances here on this virtual machine. Come on, I’ve seen a lot of years ago, 32GB virtual machine with five instances, each one having six GB of RAM and for a long time it would be running slow randomly and everybody was wondering and I’m like I think we’ve got a winner, like let’s lower the amount of memory and you know what, in big companies, they have servers like let’s not, let’s solve the problem. I think they are still running that virtual machine. One of these systems is slow, why is it today?
Brent Ozar: Yes, one of them was – I had a group of people come up to me and go, “Hey you know what, were you serious about in the presentation you just gave, that you can have multiple databases on an instance?” And I went, “What?” and they said, “Yes, you can have more than one database on an instance?” and I said, “Yes, why wouldn’t you have able to do that?” and they’re like, “Oh, all of our database servers – we have one database per instance because we didn’t know you could have more than one.” I just saved you – I just made your whole life much better, and they were like, “This is actually amazing.”
Niko Neugebauer: Alright, so you need to save the buffer size to 2GB and set the default buffer row size to the maximum size of the row group, which is 1,048,576 and set it to 100MB, the maximum size of the buffer – of the data for buffer. I’m sorry but it just…
Brent Ozar: So this is so that SSIS puts in the right amount of data for exactly one-row group and calls it a day?
Niko Neugebauer: The worst thing that happens is by default you’re limited to 10MB.
Brent Ozar: Oh wow, so those row groups had 10MB in size basically? Oh, okay, because, yeah, it’s small, that makes sense.
Niko Neugebauer: It just – you need to set a couple of the parameters, take a look at the blog post I have on this integration services. Just these little things. So casting types, I love this one. Let’s run this same query. This one receipt date is on the TPC-H – I’m not doing the same mistake hopefully. It’s – hopefully, I said it not for real – so what is it name, receipt date – receipt date is a date.
I’m using a – on this item string, it’s not a date because if I will show you this set up table, receipt date is a char ten. I’m not using a date, okay? So if I’m using the default casting or if I’m casting on a different side, so the difference – did I drop it? Okay, let’s run it. How many – I’ll load all the data, beautiful. Okay, it will take some time, but – because – I know I edited the wrong demo while doing the test. It’s the same demo I run on the beats and I edited the beats with the changes. I will never learn. At least I will correct this one here.
Brent Ozar: Presenter’s life.
Niko Neugebauer: Yes, so it’s running and hopefully my – come on, be a nice person at least once. Okay, 100%. Typical situation, column store – it’s a hungry beast, it will eat of what you throw in it. If you want to run something parallel, create the workload group, very important. Really, like saves lives, as I have found out later. It’s running.
Brent Ozar: Oh, Brad says he’s got your alignment script and he’s got a partition on an integer, and he sees that it’s not 100% aligned, what does not aligned mean?
Niko Neugebauer: It means – alignment is you select one column and you will – sorry the bells are ringing, I got to go – so alignment means that for the integer data type and for the most data types, column store indexes for each segment within the row group, it stores the minimum and the maximum value.
So when the query is entering and started processing, it gets this many information and it says your predicate is saying I need values over 30 but this row groups contain 10 to 20. And my scripts does something that column store indexes still can’t deliver, is sorting the data. Even in 2016 there is an error message saying it’s still not working. I don’t know what it means, the error message, but maybe we’ll find out soon, hopefully.
So I run this thing, these queries, 14.3 seconds, the other 2.5. Wait, why? I just did casting. If I take a look at the execution plans, this one is for all of you data housing people who are running like reports, whatever, or big processing, this one – and I’ll do something bad I shouldn’t be doing and don’t ask me what I’m doing. I hope it’s not recorded, okay?
Brent Ozar: I’m sure we’ll edit that part out in post.
Niko Neugebauer: Okay, so I’ll accidentally save this script.
Brent Ozar: And these things just happen.
Niko Neugebauer: So you see, I’m doing compute-scalar, so the trace rec 8666, it – besides a lot of other things, don’t ever run it in production, really, not kidding. It shows you the statistics, all really cool stuff but it shows me this one was missing in the original execution.
Now it shows me, I’m doing casting here on the first one for the column store index and I’m doing it with the batch execution mode, right? This one is blazingly fast, the casting of the data type and I’m doing here also blazingly fast but I’m doing this on the dimension and…
Brent Ozar: Way shorter, way smaller.
Niko Neugebauer: Very smaller, but the result is the first one, 3.5, 2.3. That is a winner. So if you are casting – if you are doing it on another same data type, I know it never happens to you, it happen to me, only to me all the time in my project, you let SQL Server decide, that’s cool, but just, you know, understand. Maybe casting will just bring you to a different level of the performance. Another one is the NULL expressions. It just – I hope – I’m pushing it I know, so I will run this query, I don’t know if I have this table created but I’m just – wait, I’m [inaudible] for real. Bad rows, bad behaving… So I’m doing here a predicate lookup on ship mode and the string and I’m running ISNULL function. Do you see a predicate pushdown? No. Do you see a very bad filters saying just 9%? Yes, okay, if I will run on my row store table, the same one, this will take some time, but it’s just about the concept. It’s – who wins where is another thing. Do you see the filter?
Brent Ozar: No. So it did the pushdown?
Niko Neugebauer: It did the pushdown. Okay, how can you solve this? So this one, this type, well, I see a lot of these types of predicates, not now, exactly – so this one is the same table rewriting with comparison and ISNULL.And ladies and gentlemen, parallel running query, no filter at all. It’s been pushed down the storage engine, beautiful.
Brent Ozar: How did you rewrite it? What did you change about it? You only changed it to the OR? Oh, you changed it to the OR?
Niko Neugebauer: Yes, I changed it to the OR. It’s one of the limitations currently. I believe there is no changes in this sense in 2017, the new release. But there are some cool changes, little changes like incredible stuff. I – you know, column store indexes – let’s say something important, they have a lot of bugs in this 2016 release. You know why? Because they did an incredible job. I would say the best – for me personally, the best ever job. Going from one hub architecture to three fully completely implemented architectures and I’m just bowing down before all the teams that Georgey, Vasily, all the developers, Devin – everybody did incredible job and 2017, there are some little queries and I’ll show you one today. One which is 2016, is running slow and 2017 it’s running fast and just query optimizer. It just – the thing they just tweaked it and it really runs faster. Incredible stuff.
Brent Ozar: Vladi asks, “Does COALESCE have the same problem that is NULL has?”
Niko Neugebauer: Of course. It’s a NULL expression, right? It’s not about IS NULL – this is why even the demo is called NULL expressions, okay? So query construction, there are a lot of things I wanted to show you. Don’t use cross apply if you can. Sometimes you can rewrite cross apply and you’ll find on the web – from much smarter people, the examples some of the cross apply you can rewrite – stop taking drugs, sir – maybe not, maybe not. Please for the best of you and all of us, do it.
Brent Ozar: I’m even worse without them.
Niko Neugebauer: Just you know, Stack, Stack Overflow. So this one is running with a hash match but there will be enough cross applies, in most cases, they will throw you in a loop, and in a loop does not work and it’s not scalable really, so it probably will never work in the batch execution mode so your queries will be slow. Bad things, bad people. Okay, cross apply joins and – query construction. So check out these two queries, I’m doing the aggregation, I’m doing multiplication outside of the sum and on the first I’m doing multiplication on the constant within the sum. Let’s run it a couple of times. Let’s take a look at the result. The first one, 600 milliseconds, this is 60 million rows, this is not a bad table, weird. What about the first one – do you want to see the difference? So it’s four times, it’s just how I wrote the query right?
Brent Ozar: Wow.
Niko Neugebauer: It’s – conceptually, it’s four times the difference, right? And if it works, I spent – I burnt just a fourth of the CPU time as well. So let’s take a look at the existing execution plan. Do you see the difference? Brent Ozar: Yes, the scalers moved. They do the scalers…
Niko Neugebauer: Kendra even posted about this. When you see actual number of rows, it’s zero, it’s because it’s a full aggregation predicate being pushed down the storage engine. The blazingly, amazingly fast feature. It – for the first query, when we put the constant into the sum, the SQL Server just, you know, optimizer does miracles but it doesn’t understand this one and if we go into the properties, we’ll basically see – of course it does, it’s the best what it can. But it still – it runs in the batch execution mode but it’s not getting this thing, and it does computer scaler, and it does run slow and it does all the troubles and here is the thing, although I never blogged about this, I need – I have like 25 or 26 blog posts work in progress. This one, the first one on 2017, it runs like the second one, for real.
Brent Ozar: Nice, good, wow. So they just probably rewrite the order of the math.
Niko Neugebauer: They did a better row rewriting, and I’m like, guys, I’m just like, so a lot of the things, smaller things, unless you find them, you’re just like I don’t know what they do, maybe they sleep all the time. I hope they get some sleep. So Georgey Georgeevsky, Vasily, Conner of course, who’s leading them, just like…
Brent Ozar: Pretty impressive, because otherwise if you have to tune between these two execution plans, it’s really hard to understand exactly what’s going on here.
Niko Neugebauer: [crosstalk] will help you, for this one. He will say like you’re not getting – even though you have the aggregation, you do not have – get the aggregation pushdown, it’s easy to recognize this better. So one of the things is like, basically, 130 compatibility level, do you push something into the aggregation, which you shouldn’t?
Basic stuff, really, because you see, the results are the same and it does miracles.
I need to finish because I need to pick up my kids, I’m sorry. So the last one I wanted to show you is estimated versus actual, so I created two test tables. There is some – one column which might has a little difference and I’ll explain to you why, but I’ll run the query. So sample A, sample B on the Contoso retail data warehouse database, you see – of course, I need to run this and I need to load just two million rows into them in – Brent, every time I do this I need to say Niko, select the correct database. I couldn’t – I could have included the use statement but I’m like, Niko you’re so smart. This will never happen to me, happens all the time, and now I’m on video.
Brent Ozar: Well it happens to me usually when I’m in rooms full of people and they’re like you know, you’re in the wrong database, I’m like oh yes, I’ve just created a bunch of tables in master.
Niko Neugebauer: I don’t want to see my masters here, I don’t want to see it, I don’t care, I don’t.
Brent Ozar: It’s horrible.
Niko Neugebauer: Okay, so let’s run this – estimated execution plan for these two queries, same, look the same, right? I run – I focus on one aspect just the I/O – if I see lob logical reads 11, lob logical reads 11. Tables are the same, right? Let’s run – I’m just focusing on statistics I/O okay? So let’s take a look, 365, oh well, the second one has 367. Yes, close enough. Let’s do another thing. Let’s give it to the users. Eight milliseconds, 148 milliseconds.
Brent Ozar: What was the difference between those two? What are the two tables’ differences? Oh A and B, oh there was – I thought it was the same one at first. Okay.
Niko Neugebauer: No, it’s just like there is something with this column man – wait, let’s see the expected execution plan estimated, check it out. There is nothing you can find there, there is nothing in this estimated thing. Let’s run the actual. If you really like it, you put an aggregate predicate down on it.
Brent Ozar: That’s exactly what Beyonce said.
Niko Neugebauer: Exactly, like respect booyakasha, in the words of Ali G. So the thing is, I’ll show you the definition of the tables and then I’m gone, really gone. So the first table has a total cost numeric 18, 3. The second one has a total cost numeric 19.3.
Brent Ozar: Wow.
Niko Neugebauer: One, it doesn’t really – it is. One, 18 occupies eight bytes and aggregate eight predicate push down works on eight bytes as a max.
Brent Ozar: Wow.
Niko Neugebauer: Last week I was at a client, I was tuning them and they had everything set to 36 and I’m like, guys, you need to lower your expectation. So eight bytes is the limit for the aggregate predicate pushdown and 18,3 versus 19,3 gives you the result. Aggregate pushdown working beautifully, zero rows here forwarding just the result will be coming out of the hash match and here, this one is getting all two million. Imagine 20 billion rows, how beautiful would it be.
Brent Ozar: All from one data type. Wow.
Niko Neugebauer: Back to the roots, like if you are looking for the performance, and I’m not saying like you want to go faster and you don’t care about the pieces, no, pieces is the first but analyze, run – you should have – buy some software that does for you if you don’t have your own script or find some good scripts that do, analyze your data types and see you know, what you would have defined and what you’re really storing. Start the conversation with the business, maybe there is something to discuss. On this note, I – really, I shouldn’t be doing this but I want to show you to say thank you very much to all of you and to show of course the other one, thank you, great idea, no bullshit, really thank you for organizing this conference, I’m a fan here.
Brent Ozar: Oh good, thank you.
Niko Neugebauer: I’m trying to participate, I love that you’re interacting with the people, which is really like – this is what a [crosstalk] want…
Brent Ozar: They really want just the verbal questions and answers, they absolutely love that, so yes, everybody loves it. Thank you for volunteering your time, sir.
Latest posts by Niko Neugebauer (see all)
- Columnstore Clustered vs Nonclustered, when to choose what - June 11, 2018
- Columnstore Indexes: What are we waiting for? - May 10, 2018
- Worst Practices & Less Known Limitations for Columnstore Indexes - April 21, 2017