SQL Server 2016 Features for Performance Tuning Lovers

Audio podcast:

Enjoy the Podcast?

Don’t miss an episode, subscribe via iTunes, Stitcher or RSS
Leave us a review on iTunes

Target Audience:

Anyone with experience of SQL Server performance tuning


It’s a work day, and you’re in front of your monitor battling a vicious query. You’ve already tuned a query or two in your life and have some tricks up your sleeve, but nothing seems to get this query down to a reasonable time and allow you to drink that cup of coffee you want so bad.

SQL Server 2016 adds a few impressive new features that might help you with that.

During the session, we will look at the different performance tuning aspects the new features enhance: Execution plan history and enforcement using the Query Store, live query execution visualization, DMV enhancements, new query hints, and more.

Join this session to see how SQL Server 2016 makes performance tuning easier and more efficient.

Why I Want to Present This Session:

I’ve been working with SQL Server for many years, and performance tuning is my bread and butter. The SQL Server 2016 new features will enhance my arsenal, and I believe anyone doing performance tuning work should know them.

Additional Resources:

Session Transcript:

Brent Ozar: Good morning everyone, or good afternoon, depending on where you live. Next up, or the first session this morning of GroupBy, we have Matan Yungman talking about performance tuning features for haters – just kidding, for lovers. Matan, take it away.

Matan Yungman: Thank you Brent and thanks again very much for having me and thanks everyone who voted to see my ugly face at this time of the day. So we’ll talk about a few features of SQL Server 2016 for performance tuning.

Hello, all execution plan lovers. Now, there were a few sessions already talking about performance enhancements in SQL Server and I will refer you to those, but what I try to do is make it as practical as possible. So this session will be very demo-heavy. We will see this slide and the about-me slide and then jump straight to the demos.

So in case you haven’t heard of me, my name is Matan Yungman, I live in Israel, and I’m a Data Platform MVP. I talk a little bit on Twitter about SQL Server and rant about MySQL. I’ve had to work a little bit with MySQL lately and I don’t really like it, so you can see my rants over there. I work for Madeira Data Solutions; we’re a consulting and training company dealing with everything data: SQL Server development, performance tuning, BI, and so on. I also co-host SQL Server Radio, a podcast for SQL Server professionals, alongside Guy Glantser, who had a great session last Friday. We release two shows a month, and if you haven’t checked it out, maybe you’ll like it. It’s a nice way to learn while you’re doing other things like jogging or commuting to work or washing the dishes. We also had two shows with Brent, which were very good, I believe, so if you haven’t heard those, check them out.

Without further ado, let’s jump to the first demo. What we’re doing over here is we have a relatively simple query where we use ROW_NUMBER in order to get all of the rows which have a rank of one according to some PARTITION BY clause. It doesn’t matter too much what the query does. What we’ll do over here is start running it, and we’ll make sure to enable Live Query Statistics. What Live Query Statistics means is that once I start running the query, I see the rows starting to flow between the operators in the execution plan, and for each and every operator that has started to get rows, I see how much time the operator has been working and what percentage of rows this operator has already handled compared to the estimated number of rows SQL Server thinks it should handle in this plan. This is something I’ve blogged about in the past; I also wrote a very ugly utility a few years ago, and maybe we’ll talk about it a little later.

This is something that can help us mitigate execution plan problems which were quite hard to mitigate a few years ago. So this query is about to finish, but we’ll stop and execute it again because I want to show you something in this plan. Another nice thing about Live Query Statistics, or LQS, is that it also teaches you a few things about execution plans. If you see here, those two operators, the table scans under the parallelism operator, started to work and started sending rows to the sort operator, and the sort operator, as you can see, doesn’t send the rows to the operators that come after it in the execution plan. This is because the sort operator is what is called a blocking operator, or a stop-and-go iterator. This means that it has to get all of the rows from the previous operators, and only after it gets all of the rows does it send the rows onward in the execution plan.

So now the query finished again, but as we can see, the sort blocks the rows, and only after the sort gets all of the rows does it start sending chunks of sorted rows onward in the execution plan. So as we understand, we have some problem in the sort, and now let’s think about how we can solve that.

So the query ran for 52 seconds, and this pattern of ROW_NUMBER over PARTITION BY transaction date, ORDER BY quantity kind of hints that if we create an index on the transaction date and the quantity, maybe it will help us mitigate the sort operator. So we create an index on the transaction date and quantity; it will take about 30 seconds, and meanwhile we can talk about the fact that we can also see Live Query Statistics using Activity Monitor.
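The query pattern and the supporting index might look something like this. This is a sketch only: the table and column names are assumptions modeled on Adam Machanic's bigTransactionHistory table, since the actual demo script isn't shown.

```sql
-- The ROW_NUMBER pattern from the demo (names assumed):
WITH Ranked AS
(
    SELECT  TransactionID, ProductID, TransactionDate, Quantity,
            ROW_NUMBER() OVER (PARTITION BY TransactionDate
                               ORDER BY Quantity) AS rn
    FROM    dbo.bigTransactionHistory
)
SELECT *
FROM   Ranked
WHERE  rn = 1;

-- The supporting index that can remove the blocking sort,
-- because rows arrive already ordered by the window's keys:
CREATE NONCLUSTERED INDEX IX_bigTransactionHistory_Date_Quantity
    ON dbo.bigTransactionHistory (TransactionDate, Quantity);
```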

So if we go over to Object Explorer, right-click on our server and go to Activity Monitor, here we can see Active Expensive Queries, and when our query runs we will see it over here. Now we can see that we have a CREATE INDEX running, and we can see Show Live Execution Plan. There isn’t too much to see in the case of creating an index, but when we have an actual plan that is running, we will see the rows flowing between the operators. Now, we have two types of profiling for Live Query Statistics. One is called lightweight profiling and one is called full profiling, or something like that.

Full profiling means that we get the full profile only for queries where we enabled Include Actual Execution Plan or Include Live Query Statistics, or where we gather the actual execution plan using Profiler or Extended Events. Now, this type of profiling generates a very big overhead on SQL Server, and that’s why in SQL Server 2016 Service Pack 1, the product team developed something called lightweight profiling, which you can enable with trace flag 7412 at the global level of the server. So if you go here to the SQL Server service, you go to startup parameters and you can see -T7412. You have to restart the SQL Server service, and starting from this point, you will have lightweight profiling on your server and you’ll be able to see the live execution plan, the live query statistics, in Activity Monitor.

Just to show you, we will run this query again and we will not include live query statistics and we will not include actual execution plan and we will go to Activity Monitor and we will close that and open that again, and we will be able to see live execution plan. So we can see we don’t have the sort at all. Excuse me, Brent?

Brent Ozar: No, I didn’t know – can you turn it on, on the like, per session level outside or you just have to turn it on across the entire server?

Matan Yungman: So you have to do it globally. I guess you can also do it by using DBCC TRACEON, but then after a restart it won’t persist, so you have to do it at the global level. What the product team says is that it’s an overhead of only about 2%, so most systems can handle that, I guess [crosstalk]. So this is very, very nice, a very nice tool for the DBA, and we will talk about it in a bit. Now, I like it a lot, but I still think there are a few things to improve; I will talk about that in a few minutes. So, not to spoil the party — it’s a great tool.
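For reference, the two ways to turn the trace flag on, as discussed — the DBCC form does not survive a restart:

```sql
-- Enable lightweight profiling globally until the next restart:
DBCC TRACEON (7412, -1);

-- To make it permanent, add -T7412 as a startup parameter of the
-- SQL Server service (SQL Server Configuration Manager -> Startup
-- Parameters) and restart the service.
```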

So as we can see, we went down from about 52 seconds to 12 seconds using the index. This is not new to SQL Server 2016, obviously, but there is another thing that we can do that is new in SQL Server 2016. So let’s drop this index and talk about something that is called the Itzik Ben-Gan trick.

Itzik has many, many tricks; this one is called the columnstore index Itzik Ben-Gan trick, and for that, I will spend two minutes talking about columnstore indexes. Let’s start with the traditional approach, the row store approach. Up until SQL Server 2012, where columnstore was introduced, we didn’t even know we had a row store approach, because it was the only approach. What row store means is that traditionally, SQL Server stores rows in pages; each page is 8K, and generally speaking — there are nuances — each page holds all of the columns of a row.

So let’s say I want to get only two columns out of a table that has 30 columns; I still have to read all of those other columns, and in many cases this doesn’t perform very well — and so columnstore arose. Starting with SQL Server 2012, then 2014, which was much better, and 2016, which is better still, we have columnstore, and columnstore means that we separate our columns and store each one of the columns separately, and data is stored in chunks called segments. Generally speaking, each segment holds one million values of a certain column. There are all kinds of optimizations that allow SQL Server to compress the data, and because of that, much more data fits in memory, and even if we have to read data from disk, it’s much more compressed so we have to read much less data, and so on and so on.

This isn’t a columnstore session; I refer you to Niko Neugebauer’s blogs and presentations, and I also have a presentation about columnstore. But what interests us for the purpose of this session is batch mode processing. This is one of the major improvements in the query processing mode. In SQL Server, even if we write set-based queries — let’s say we execute a query that has to go over one million rows and it returns one thousand rows — even though we didn’t write it row by row, under the hood SQL Server did work row by row, because each one of the operators in the execution plan works on a row-by-row basis. Each operator has a method in C++, I believe, called GetNextRow, and it works that way: iteratively, on each and every row, one at a time.

In batch mode processing, we take groups of roughly one thousand rows at a time, treat them as a vector of rows, and manipulate them all at once together. This performs much better, and this is what we want to see over here: an estimated execution mode of batch and an actual execution mode of batch.

So those were a few minutes about columnstore, but how does this relate to our query? Our query sits on top of row store, not columnstore. So this is the Itzik Ben-Gan trick, and what this trick means is that in SQL Server 2016, we can create a filtered non-clustered columnstore index. It’s a little complicated: a filtered index that is a non-clustered columnstore index. Up until SQL Server 2014, we could create a non-clustered columnstore index, but it would make the table read-only.

But starting with SQL Server 2016, we can create, I believe, one non-clustered columnstore index on the table, and it won’t make the table read-only. So what Itzik says is: I will create a non-clustered columnstore index, but I will make it empty. We have a WHERE clause of transaction ID equals one and transaction ID equals minus one, and obviously, until we have quantum computing, no row can be equal to one and minus one at the same time. So we’ll create this filtered non-clustered columnstore index and execute the query again.
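The trick itself is a one-liner. A sketch, again assuming the bigTransactionHistory table; the column list is illustrative — what matters is the contradictory filter:

```sql
-- An "empty" filtered non-clustered columnstore index: the WHERE clause
-- can never be true, so the index stores no rows, but its mere presence
-- lets the optimizer consider batch mode for queries on this table.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_bigTransactionHistory_Empty
    ON dbo.bigTransactionHistory (TransactionID, TransactionDate, Quantity)
    WHERE TransactionID = 1 AND TransactionID = -1;
```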

Let’s just go over the run times, first execution 52 seconds, after creating the index for the row number 12 seconds, and let’s see now, this trick, let’s see how much time it runs – and the demo isn’t working. Great. So we were 20 seconds, that’s okay.

Brent Ozar: Niko says, check your compatibility level.

Matan Yungman: Yes, so I actually made sure we were on 130, but let’s try it again. Let’s look at the estimated execution plan and see if we’re getting what we’re supposed to get. So yes, first of all, we have the Window Aggregate operator, which is exactly for those scenarios and for columnstore indexes. This is a new operator in SQL Server 2016, which is optimized for those situations, and here in the sort, we do have an estimated execution mode of batch.

So basically, we’re okay, and that was only because of the demo gods, so let’s run it again. But generally speaking, even when we were at 20 seconds, it was still much better than the 52 seconds we had at the start, and we didn’t have to create an index — because this is a big table, we don’t have to create the index on the table. We’ll be at around 20 seconds again, so not as good as traditional indexing, but much better than the original state. So this is something very interesting. Now, this is a hack, and I would expect Microsoft to give us a hint like “use columnstore” or something like that instead of making us go through this workaround, but this is nice. Be very careful, because it’s not documented or supported. Use it in production only if you really know what you’re doing and only if you really want or need to do it. But it’s nice to know that such a thing exists. Any questions up until now, Brent?

Brent Ozar: Nope, you’re good. A few questions – well, one question in – hold on a second, let me go back and look at it – is 7412 a documented trace flag?

Matan Yungman: Yes, Microsoft did a very good job of documenting the trace flag. For many years, trace – even the supported trace flags were not documented but now they are. I don’t have the exact thing but you can search for it and they have a very nice page that is documented with all the documented trace flags. By the way, this was the Itzik Ben-Gan trick. Niko Neugebauer also has a trick and I believe Niko is saying left join into an empty table with the column store index and this will generate the same thing, so this is another trick that you can do. Yes, maybe Niko can say a few words about that afterwards.

So let’s continue and have a look at Query Store, which is maybe the most significant thing Microsoft has released in this area of performance tuning. We will go over here to Object Explorer and go to Query Store, and we have a few reports over here. Then we will go to the Top Resource Consuming Queries report, and what we can see here are our top consuming queries from the last hour. I can go here to Configure and change that: last hour, last day, and so on. As we can see, we have the query that we just executed — this is the top consuming query, obviously — and we can see all of the execution plans that this query generated.

So if we go over here, we can see that this is the one with row mode; if we go over here, this is the one that we had after we created the index; and if we go over here, we should have the one that we had with batch mode. This is a very nice way to see our executing queries, compare execution plans, and start to see which plan fits our queries the best.

Let’s say I see that I have a problem in a query and the execution plan that SQL Server is currently generating isn’t good enough, so I can go to this query, select the plan that I want, and choose, for example, Force Plan. In this case, I’m forcing the bad plan, but if for some reason that’s what I want to do, I can do that; and if I do that, go back to the query window, and execute the query, even though we have the Itzik Ben-Gan columnstore trick in place, we won’t use it, and we will have a table scan and the bad sort, as we can see over here. We have an execution mode of row instead of batch, and I will just go back to Query Store and un-force the plan. Now, Query Store is much more than that; again, I refer you to Enrico’s session, which goes much deeper into Query Store, but this is a very nice way to look at our plans, and if I choose not the last hour but, let’s say, the last year, I will see many more plans over here. I believe Enrico also has a project for showing those reports in Power BI, which can be much nicer than this UI that we have in Management Studio. And I think we’re done with this demo, and we’ll go to the next one.
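The same forcing can be done in T-SQL through the Query Store catalog views and procedures; the IDs below are placeholders you would look up first, and the LIKE filter is only illustrative:

```sql
-- Find the query and its plans in Query Store:
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM   sys.query_store_query      AS q
JOIN   sys.query_store_query_text AS qt ON qt.query_text_id = q.query_text_id
JOIN   sys.query_store_plan       AS p  ON p.query_id = q.query_id
WHERE  qt.query_sql_text LIKE '%bigTransactionHistory%';  -- illustrative

-- Force a chosen plan, and later un-force it:
EXEC sys.sp_query_store_force_plan   @query_id = 42, @plan_id = 7;
EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;
```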

So we’ll open demo number two. I want to show another thing with Live Query Statistics. We have another query, we’ll run DBCC DROPCLEANBUFFERS, and what we have here is that we declare two variables and execute the query on those variables, making sure to include Live Query Statistics. What we can see, if we go over here to the right side, is that in the index seek we’re at way more than 100%. This is because when using local variables, SQL Server doesn’t know the values of the variables and generates some kind of default plan derived from the statistics that we have on the table. But because the execution plan isn’t optimized for the specific values inside the variables, we can get into all kinds of these situations. As we can see, we’re at 3000% of the number of rows that we were supposed to handle in this operator. So this is another example, and we see that we’re finished; just as an example, we will execute the query with OPTION (RECOMPILE), which says to SQL Server: read the values that are inside the variables and generate the right plan. When we do that, we can see that the plan is slightly different: SQL Server chose to first seek the big transaction history table, then sort the data, and only then perform the clustered index seek.
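The local-variable problem and the recompile fix look roughly like this; the variable and column names are assumed, not the actual demo script:

```sql
-- With local variables, the optimizer cannot sniff the values and
-- builds a generic plan from the density information in statistics:
DECLARE @ProductID int = 1001,
        @Quantity  int = 100;

SELECT COUNT(*)
FROM   dbo.bigTransactionHistory
WHERE  ProductID = @ProductID
  AND  Quantity  > @Quantity
OPTION (RECOMPILE);  -- compile using the actual variable values instead
```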

So again, live query statistics is not only for performance tuning, it also shows us and teaches us how SQL Server executes queries and how the flow of data in the execution plan works. So again, live query statistics, a very nice tool.

So how can Live Query Statistics be better? Again, I think it’s a great tool and a great job by the product team. What I would want to see in Live Query Statistics is this: today, I have to hover around and search for the problem. In this case, in those plans, it was relatively easy to find, because I searched for the operators that started work, the rows that started to flow, and the percentage that went up and up and up. But in more complicated queries, it will be a little harder to do that, and what I would want from the tool is for it to highlight in red or in yellow where the problem is, to make our lives a little easier. So hopefully the product team will hear that and make this adjustment.

Another thing that I would want to see: let’s say that I’m not near my computer, we have a production problem during the weekend, and after the weekend I come back and my boss tells me that on Saturday at two o’clock there was a problem. I can’t use the tool because the query isn’t running anymore, so what I would want to see is the ability to replay the execution plan. And we have something like that: if anyone has started to play with Azure Data Lake — it was a little hard to find the source of this — if we go here, over at the bottom, we have a “play it again” option; we can play the execution plan after the query has already finished, and if you search for videos on Azure Data Lake, you will see the operators of the execution plan start to fill. So I think it would be a great investment to have this in this tool: the ability to see live query statistics when they’re not live anymore. Again, an awesome tool, but I hope the product team will make it slightly better and make our lives much easier as performance tuners.

Okay, so we’re done with this demo, and let’s continue to execution plan comparison, another cool tool introduced in SQL Server 2016. This is actually the same query we had in the previous demo, but we will execute it — actually, I already executed it. I will execute the good version of this query with constants, without variables. And after I execute it — excuse me, I didn’t include the actual execution plan — after the query finishes, I can go to the execution plan and hit Compare Showplan.

When I hit that, I can select an already saved execution plan which I want to compare to. So it will compare to the query plan with the local variables, and when we do that in this case, SQL Server does show me where the differences are. As we can see, this is the old query and this is the new query. In the old query, we had a clustered index seek, then a nested loops, and then the index seek at the bottom; in the new query, we have an index seek of the big transaction history table, then the sort, and only then the nested loops into the clustered index seek, and we also have the window over here. So — the GoToWebinar window interferes — let’s do it again: Compare Showplan, and over here, on the side, we can also see the differences at the plan level and at the operator level. Even if we look only at the plan, SQL Server highlights the changes, and we can see how SQL Server chose to execute the query in each case. We can also do that using Query Store, and when we do that, we understand what the difference between the query plans is, and then we can force the query plan that we want. So again, another nice tool for execution plan lovers. Any questions, Brent, before we move on to the next demo?

Brent Ozar: Let’s see. Several people say that for live query statistics, they want to see performance software — like, say, SQL Sentry, Spotlight, or Idera SQL DM — capture these replays over time so that they could jump back in time and see what queries were slow at a given time.

Matan Yungman: Yes, so I guess that’s a problem because it is an overhead. You have to keep more data over time and the question is whether you keep that. And if you do that in the Cloud it’s easier, but it’s harder to do that on an on-prem box but I do hope we will have this option because I think it’s a very strong tool to have.

Brent Ozar: Harry says, or Doug says, are there any additional set up steps he needs to do in order to enable that plan comparison you just showed? Or how does he get that?

Matan Yungman: So what you have to do is save a file — let’s do that together. So, execution plan compare: we will execute the bad query with the actual execution plan — I believe you can do that with the estimated plan as well, but let’s do it with the actual one. After it finishes, we will save it as a file, a .sqlplan file. We will do Save Execution Plan As and overwrite the existing file, and after that, I can take a certain execution plan and compare it to the file that we’ve just saved. So this is basically what you do. At least for now, you can’t compare two plans that sit, let’s say, in Query Store or in the cache; you have to save one plan as a file, but I hope it will get better over time. Not too bad, though — we just have to save that and then compare.

Brent Ozar: And also this is Management Studio, you’re showing 2016 I believe?

Matan Yungman: 2016, yes.

Brent Ozar: Then there was one other one, Teddy asks, “Is the job playback on by default in Azure Data Lake or is that something you have to turn on?”

Matan Yungman: I don’t know enough about Azure Data Lake in order to answer this question.

Brent Ozar: And Teddy was probably even hoping it was Azure SQL DB, but no, this is Azure Data Lake.

Matan Yungman: Yes. Okay, so let’s move on and talk about session wait stats, so let’s close this one. If some of the audience were in Pedro’s session — a great session, by the way, covering basically what we talked about in this session and more — Pedro talked about the fact that we now have wait stats information inside the execution plan. So if you go to an execution plan, you will see, right at the bottom, the top five wait stats, I believe, for the plan.

So this is very, very good in many cases, but I will show a case where we need another approach for seeing that. We will make sure we use a database called “data loading”, we will create a table called “load me fast”, and we will insert 100,000 rows into the table. If I include the actual execution plan and go over here, I will get 100,000 instances of this execution plan. If I go here to the properties, I can see a good amount of information about the execution plan, but I will get 100,000 of those, and it basically kills my Management Studio.

So I need another approach for that. What I will do, once my Management Studio revives, is use a DMV called sys.dm_exec_session_wait_stats, which allows us to see wait stats information per session. So let’s wait a few more seconds; maybe we’ll kill this tab and open it again. Sorry about that. Yes, I want to cancel this query. Sorry about that, so let’s do it again, but now without the actual execution plan. We will use the database, create the table, and insert 100,000 rows into the table in a loop, and when I go to sys.dm_exec_session_wait_stats, we can see that our problem is the WRITELOG wait type, which basically means we are waiting for writes to the transaction log to go to disk. This is a relatively simple case, but obviously in production you will have very long stored procedures that use loops and run many queries and so on, and you will need this information in order to see what the most problematic part of your query is in terms of wait stats.
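The per-session view is a plain SELECT from the DMV; session 54 here is a placeholder for whichever session is running the load:

```sql
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       signal_wait_time_ms
FROM   sys.dm_exec_session_wait_stats
WHERE  session_id = 54          -- the session running the workload
ORDER  BY wait_time_ms DESC;
```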

Now, this DMV resets each time the session disconnects, and it can be a little misleading, so you need to be careful with that. But if I execute this query while my query or my process is running, we will see the information; let’s execute it again and see the numbers go up, so we were at 100,000 and we’re going up and up and up. So obviously this is updated while the session is running.

So this is nice, and now that I know the most problematic part of my plan is the WRITELOG wait type, I can go and solve the problem. Obviously, a loop isn’t a very good way to insert values into a table, so we will either use a cross join, or another way to do it is to wrap the insert — wrap the loop or the cursor, which is essentially the same in this case — in a transaction. So we have a BEGIN TRAN at the beginning and a COMMIT at the end. What we basically say to SQL Server is: we are handling the transaction; don’t open an implicit transaction for each and every one of those 100,000 rows like you do in this case. Just write that to the buffer, and when I decide, I will commit the transaction and then you will write the rows to disk. Basically, each time we don’t use a transaction, when we insert a row into the table, SQL Server, in order to have the ability to roll back, opens an implicit transaction. What this means is that it has to write the data to the transaction log, and the transaction log obviously sits on disk, and the disk is very slow, and that’s why our inserts are slow. It’s not only because we work row by row; it’s because we have to go to disk for each and every row.
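A minimal sketch of the pattern — the table and loop here are illustrative, not the actual demo script:

```sql
BEGIN TRAN;  -- one explicit transaction instead of 100,000 implicit ones

DECLARE @i int = 1;
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.LoadMeFast (ID, SomeData)  -- assumed table/columns
    VALUES (@i, 'x');
    SET @i += 1;
END;

COMMIT TRAN;  -- log records are hardened at commit, not after every row
```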

So when we use this method — and again, use it with care, it doesn’t fit each and every situation — it works much faster, and instead of running 20 seconds, it runs just one second. This is an approach I used with one of our clients. The client had a stock trading platform and a stored procedure which ran many times a second and performed very complex logic inside a cursor. I used this method because re-architecting the cursor would have taken a month, and another month to fix the bugs we would have added when converting the cursor to a set-based solution.

So we just used transactions, we reduced the amount of I/O we generated significantly, and things went much faster. So again, sys.dm_exec_session_wait_stats gives you another view, and again, I think the tools that the product team added in SQL Server 2016 are very good because they enrich our arsenal as performance tuners. We had very good tools previously, but now our arsenal is much better, and this is another tool that we can use.

Okay, let’s move on to the next demo, which is about the actual number of rows read. So we’ll open that, and we’ll use a database called “test”, and what we have here is an example from another client: we have a table called stock buys, and this query goes over all of the stocks that were bought in 2010, but they want to focus on the hour of two o’clock. I don’t know why it’s two o’clock; something happens then, and this is the only hour that interests them. So we execute the query with the actual execution plan included, and it took three seconds, and when we go here, we can see — if we hover over here — that we have a clustered index seek, which generally sounds good, because we like seeks; we don’t really like scans, as a general rule of thumb.

Of course, things are in many cases much more complicated. When we go over here, we have an estimated number of rows of almost 300,000; the actual number of rows is actually smaller; but when we go to something new in the execution plan called Number of Rows Read, we can see that we actually read almost three and a half million rows. So even though the number of rows that came out of the operator was relatively small, in order to decide which rows would go out of the operator, we had to read a very large number of rows. This is something that we didn’t have before, and now we have it. The reason is that if we go over here, at the bottom we have the seek predicate, which is by date and time, and over here we have the residual predicate, the time of two o’clock.

SQL Server didn’t know how many rows it would have to read in order to filter out the other hours that aren’t two o’clock, and the actual number of rows read can help us understand how much data we had to read. Now, one of the options to solve it in this case — I won’t get into it too much — is to actually use a date table. A date table is a sister of a numbers table, and it basically means that we insert all of the dates and all of the hours, let’s say from 2000 to 2050, by which year we will all be retired, so we don’t care. What we do is filter on the date table and only then join back to the original table. Excuse me, I didn’t show it: we read 3200 pages out of the table, according to STATISTICS IO, and when we execute using the date table, we can see that the execution time is slightly better, but the amount of I/O that we read is much less — divided by ten. So this is very nice; again, this is a specific optimization, but the purpose of this demo was to show the actual number of rows read.
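A sketch of the date-table approach; all table and column names here are assumed, since the demo script isn't shown. The idea is to filter the small date table to the hour you want, then join back:

```sql
-- Assumed: dbo.DateTable holds one row per calendar hour, with
-- HourStart/HourEnd boundaries and helper columns like HourOfDay.
SELECT sb.*
FROM   dbo.DateTable AS d
JOIN   dbo.StockBuys AS sb
  ON   sb.BuyDateTime >= d.HourStart
 AND   sb.BuyDateTime <  d.HourEnd
WHERE  d.CalendarYear = 2010
  AND  d.HourOfDay    = 2;   -- only the two o'clock hour
```

The range join on HourStart/HourEnd lets the seek touch only the two o'clock rows instead of reading every row in 2010 and discarding the other 23 hours with a residual predicate.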

Now, another thing I can show you while I’m here: if we go to a specific operator in the execution plan and open its properties, we can see more information than we had previously. For example, we have actual time statistics – the actual elapsed time in milliseconds and the actual elapsed CPU time in milliseconds – and we also have actual I/O statistics, where we can see the same number of pages that STATISTICS IO reports, shown in the execution plan as actual logical reads. So we have much more information in the execution plan, and we can mix those tools: in this case I actually like to see the information in STATISTICS IO like that, but I also have another tool that shows me how much I/O and time each one of those operators took.

So if I have a more complex plan, I can see how much time each one of the operators took, understand which operator was the most problematic one and … mitigate the problem.

Okay, let’s move on and see what we can cover in the next five or ten minutes. We actually covered demo number six, so let’s talk about a new dynamic management function that gives us DBCC INPUTBUFFER. For everyone who knows or doesn’t know, DBCC INPUTBUFFER shows us the last command a session sent – let’s say I open a window and run a big select. We’ll check the session number – say it’s 54 – and run DBCC INPUTBUFFER (54).
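For instance (session 54 here is just whatever SPID the window happens to report):

```sql
-- Show the last batch the client sent on session 54.
DBCC INPUTBUFFER (54);
-- The EventInfo column contains that last batch, e.g. the EXEC call with
-- its parameter values -- even while a statement inside a procedure is
-- the thing actually running.
```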

DBCC INPUTBUFFER shows me the last command that was sent to SQL Server by this session. We have other options for seeing which query is currently running, but let’s say I execute a stored procedure and a certain statement inside the stored procedure is currently running. The other ways, which we will see in a second, will show me which statement inside the stored procedure is running, but they won’t show me what command was sent to SQL Server – and in many cases, I want to see this command in order to see, for example, which parameters were passed and so on.

So, for example, we will create a procedure called WhatAGreatProc, which executes the query we’ve known since the start of the session, and we will execute the stored procedure – we’ll just check that it’s session 51.

Okay, so we execute the stored procedure, come back over here, and filter WHERE session_id = 51. What you see here is a query I use many times when I do performance tuning. It goes to sys.dm_exec_requests, which basically says which queries are currently running, and I cross apply to sys.dm_exec_sql_text; if I want, I can also get the query plan using sys.dm_exec_query_plan, or sys.dm_exec_text_query_plan, which shows the execution plan on a statement-by-statement basis rather than for the whole batch. The stored procedure already finished, so we execute it again, and this time, using sys.dm_exec_input_buffer, which is a new function in SQL Server 2016, we can see the input buffer – this is what we executed – along with the running batch and the running statement. In our case, the input buffer was WhatAGreatProc with a parameter of two, the batch currently running is the stored procedure itself, and the running statement inside the stored procedure is this select.
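A sketch of that “what’s running right now” query (the column list is trimmed, and the offset arithmetic is the usual pattern for cutting the running statement out of the batch text):

```sql
SELECT r.session_id,
       ib.event_info AS input_buffer,   -- what the client actually sent
       t.text        AS running_batch,  -- the whole batch / procedure text
       SUBSTRING(t.text,
                 r.statement_start_offset / 2 + 1,
                 (CASE r.statement_end_offset
                       WHEN -1 THEN DATALENGTH(t.text)
                       ELSE r.statement_end_offset
                  END - r.statement_start_offset) / 2 + 1) AS running_statement
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
CROSS APPLY sys.dm_exec_input_buffer(r.session_id, r.request_id) AS ib
WHERE r.session_id = 51;
```

The new piece is `sys.dm_exec_input_buffer(session_id, request_id)`, which returns the input buffer as a relational result you can cross apply, instead of a separate DBCC call per session.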

So this option allows us to see all of that in a single query. Previously, we had to run the currently-running-queries query and then bolt on DBCC INPUTBUFFER output by concatenating it in a weird way, which sometimes worked and sometimes didn’t. This is very, very nice because we can get all of the information together.

So again, another way and another tool for performance tuners to perform their job better and go drink some coffee faster because they mitigated the problem faster.

Let’s see if we have time for one last demo. The last demo of today is about sys.dm_exec_query_stats, which stores cumulative information about the queries that run on the system – queries in stored procedures and so on. Again, we use sys.dm_exec_query_stats, with sys.dm_exec_sql_text to get the text and sys.dm_exec_text_query_plan to get the execution plan. When we run that, we see many, many queries – obviously all of the queries we ran during the session – and we can, for example, look at the query plan, or order by total elapsed time divided by execution count, and then divide by one million in order to get the time in seconds. It doesn’t matter whether I do the division in the ORDER BY; I’ll do it over here in the SELECT. When we do that, we get the most expensive queries in terms of a single execution.
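That query probably looks something like this sketch (`total_elapsed_time` is in microseconds, hence the division by 1,000,000):

```sql
-- Most expensive cached queries by average elapsed time per execution.
SELECT TOP (20)
       qs.total_elapsed_time / qs.execution_count / 1000000.0 AS avg_elapsed_sec,
       qs.execution_count,
       t.text       AS batch_text,
       p.query_plan                      -- plan for just this statement
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
CROSS APPLY sys.dm_exec_text_query_plan(qs.plan_handle,
                                        qs.statement_start_offset,
                                        qs.statement_end_offset) AS p
ORDER BY avg_elapsed_sec DESC;
```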

So this is something that I do a lot, but now we have more tools in this DMV. If we scroll all the way to the right, you can see that we have many new columns. They start with total_dop – the total degree of parallelism – and we also have the last, min and max degree of parallelism, and so on. And we have information about the memory grants that were granted to the query: min grant, max grant, ideal grant and so on, all in kilobytes.

And now we can use those new columns to get another view of our plan cache – basically, sys.dm_exec_query_stats shows information from the SQL Server plan cache. So, for example, I can find queries whose total_ideal_grant_kb was greater than their total_grant_kb, which means the query didn’t get as large a memory grant as it needed to execute properly. Such a query would likely have performed better had it gotten a bigger memory grant.

I can do another thing: for example, I can order by total_grant_kb descending, which gives us the most expensive queries in terms of memory grants – which can imply that a query isn’t written well, or grabs a lot of data and thus has to get a very big memory grant. Or I can order by last_grant_kb multiplied by last_dop, which surfaces queries that are very, very expensive. And then I can use the old execution plan tools, or, for example, the new hints called MIN_GRANT_PERCENT and MAX_GRANT_PERCENT, in order to give the query the memory grant it needs to execute properly – or to keep it from taking more than it should.
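Two sketches along those lines, using the new 2016 grant columns (exact column picks are mine, the idea is from the session):

```sql
-- Queries that wanted more memory than they were granted.
SELECT t.text,
       qs.total_ideal_grant_kb,
       qs.total_grant_kb,
       qs.total_ideal_grant_kb - qs.total_grant_kb AS grant_shortfall_kb
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
WHERE qs.total_ideal_grant_kb > qs.total_grant_kb
ORDER BY grant_shortfall_kb DESC;

-- The biggest recent consumers: last grant weighted by last DOP.
SELECT TOP (20)
       t.text,
       qs.last_grant_kb,
       qs.last_dop,
       qs.last_grant_kb * qs.last_dop AS weighted_grant_kb
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS t
ORDER BY weighted_grant_kb DESC;
```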

And again, these are very good new tools that allow us to be better performance tuners. There was an interesting scenario: the Visual Studio Online team had a very big crisis, and it was because of the new cardinality estimator. For anyone who doesn’t know, when you upgrade to SQL Server 2014 and raise the compatibility level, SQL Server starts to work with a new cardinality estimator, which is part of the optimizer and is in charge of estimating how many rows will go out of each operator. The new cardinality estimator calculates that differently in some cases – in many cases better, in some cases worse. The Visual Studio Online team – like basically everything in Microsoft land – runs at high scale with many customers, and they had downtime because they upgraded to new code on SQL Server 2014 or 2016 and hit problems with the new cardinality estimator. So they used the MAX_GRANT_PERCENT hint in order not to allow each one of those queries to grab a very big memory grant and thus block the server. Arguably the right way to do it was to use the old cardinality estimator for those specific queries, but maybe Brent will say what he thinks about that in a minute, because I see you think otherwise.
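As a sketch, here’s what that mitigation looks like at the query level (the query itself is hypothetical, and 10 percent is an arbitrary cap):

```sql
-- Cap this query's memory grant at 10% of the maximum the server allows.
SELECT CustomerID, COUNT(*) AS OrderCount   -- hypothetical query
FROM dbo.Orders
GROUP BY CustomerID
OPTION (MAX_GRANT_PERCENT = 10);

-- The alternative Matan mentions: keep the new compatibility level but
-- force the legacy cardinality estimator for just this query (2016 SP1+):
-- OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```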

But in any case, they used the MAX_GRANT_PERCENT hint, which solved their problem, so you can use it as well. And I think we’re done. So I will show you this beautiful slide again – feel free to contact me, check out SQL Server Radio – and let’s see if we have any questions.

Brent Ozar: Perfect, so everybody go ahead and get your questions in for Matan. I want to thank you for presenting today – it was a great rundown. I really believe SQL Server 2016 is an amazing release for performance tuners. And here’s one: when you’re talking to clients who are having performance problems on older versions of SQL Server, how do you explain why they should move up to 2016, to make your job easier as a performance tuner?

Matan Yungman: Well, mainly I need to explain what will benefit them, because generally speaking they care less about me. But as a DBA, I can say that I have much more tooling, my arsenal is much bigger, and thus the organization will have fewer crises and so on. At the end of the day, it’s an expensive product, so we have to have very good reasons to upgrade, and this is one of them: if your organization will have less downtime, that’s a motivation for your manager to upgrade.

Brent Ozar: Joe says, “Aren’t licensing costs higher in 2016?”

Matan Yungman: So they are higher – from what I know. I’m not a licensing expert, and licensing is very, very complicated; you can make a career out of it. But in SQL Server 2012 there was a big change in licensing: they went from licensing per socket to licensing per core, which in many cases made the product much more expensive. I don’t believe there’s a change between 2012 and 2014 or 2016, but the big change is that in SQL Server 2016 Service Pack 1, you get many, many great features that were previously Enterprise-only, so if we look at it this way, licensing costs actually went down. You can run columnstore on Standard Edition, plus partitioning, compression and so on – a very, very good list of features. By the way, something I forgot to say during the session: lightweight profiling – the one that lets you connect to any running query – is only supported starting with SQL Server 2016 SP1, so if you’re running 2016 or upgrading to 2016, it’s very common to jump to Service Pack 1 from the start.

Brent Ozar: I have an interesting question from J. D. Walker, he says so he noticed your shirt in the background and he says where is the shirt from and – so I said it’s a SQL Saturday shirt from SQL Saturday Israel, he says what is SQL Saturday? So explain SQL Saturday in a nutshell.

Matan Yungman: Okay, so SQL Saturday is something that belongs to PASS – it was started by PASS, I believe. It’s a community event: speakers come for free to present, and almost every week there is some sort of SQL Saturday somewhere in the world – many, many in the US, in Europe, in South America – and we had one in Israel as well. And no, you’re not mistaken: it’s just a brand name, and not every SQL Saturday actually happens on a Saturday. Most of them do, but in Israel we have the Shabbat, and many attendees couldn’t attend if it happened on Saturday, so we had ours on a Thursday. It’s relatively easy to find a SQL Saturday happening near you, just like a user group, so check it out. It’s a very, very nice initiative by PASS, run totally by the community, and lots of fun.

Brent Ozar: Yes, you can volunteer for them, they’re all over the world – you can go hit SQLSaturday.com and either sign up for one that’s coming soon near you or sign up to present. Astaway says, “They say SQL 2016 Standard SP1 has many great features, but is it chained?” Astaway, I’m not sure what you mean by chained, I’ve never heard that before. Chained down in the data center? You may want to try a different word for what you’re thinking of there. Another thing I’ve heard people say before is “I don’t want to go to a version of SQL Server until Service Pack 1 hits” – so now not only is it stable, but you get more features too. I’ve also heard people ask, is it really ready to go yet? Well, remember, SQL Server 2017 has already been named, so it’s coming soon.

Matan Yungman: Yes, and if we did this session, say, a few months from now, maybe we’d talk about adaptive query plans and all kinds of new features in 2017 – they’ve started to move much faster than they used to.

Brent Ozar: Astaway says for chained, he meant “a limitation of memory and CPU.” Yes, 2016 Standard is still limited to 24 cores and 128GB of RAM for the buffer pool. It can use more memory for things like execution plan caching, though.

Matan Yungman: Yes, but in addition, remember you have an extra 32GB for columnstore and an extra 32GB for In-Memory OLTP, so that’s a nice hack to bypass that limit.

Brent Ozar: Michael Wigg says in Slack, “2016 was probably the most stable RTM version we’ve ever seen.” RTM meaning released to manufacturing – when it first came out, before any cumulative updates and service packs. I’d agree with that, and I would also say, with a caveat, that the cumulative updates and the service packs have probably been the most but…

Matan Yungman: I was just about to say that I don’t wait for a service pack, but I do wait for cumulative update one or two, because in many cases, what I tell people is that Microsoft releases an RTM, the Stack Overflow guys go and deploy it on the first day and get all the cool bugs, Microsoft fixes them in CU1, and then we can go and deploy to production. But I agree – I understand you’re saying something a little different.

Brent Ozar: You know what’s funny is that Stack Overflow was running the community – well, one of the preview versions of 2016 before it went live. They couldn’t go to RTM due to a bug; they had to wait a while before they could move to RTM. Funny how that stuff works. Alright, well, thanks a lot…

Matan Yungman: [crosstalk] super early adopters.

Brent Ozar: Yes, insane. I even asked them, “Are you going to do SQL Server 2017 on Linux, are you going to put that into production?” – they’ve thought a lot about it, but it doesn’t sound like they’re going to do it yet. Alright, well, thanks everyone for hanging out with us this morning. That’s Matan’s session – give Matan a big round of virtual applause. You can say thanks to him on Twitter as well; he’s Matan Yungman on Twitter, and his Twitter handle’s there on the screen.


8 Comments

Dimitri Grinkevich
December 22, 2016 6:36 am

Matan is much better than Angeom ;-)))

Hugo Kornelis
January 3, 2017 12:03 pm

I absolutely love the title and wish I had thought of it myself. I absolutely consider myself a performance tuning lover!

Based on the entire submission, I do find myself wondering what the level of the presentation is. The title suggests you target people with a huge amount of tuning experience, and hence a true deep-dive session. The second sentence of the abstract suggests a target audience with some tuning experience, which suggests a more entry-level session. And the list of subjects you want to cover suggests that this will be a pure entry-level session that gives a quick overview of all new features in SQL2016: Query Store, Live Execution Plans, DMV extensions, and new hints – four subjects, and at least two of them could fill a full session by themselves.

Can you amend the abstract and/or title to help me assess the actual level of the presentation?

    Matan Yungman
    January 3, 2017 5:08 pm

    Thanks for the great feedback Hugo. It gave me a few things to think about.
    I will revise the abstract and submit it for the next edition of the conference.


Michellea David
January 12, 2017 4:04 pm

I love the title. I am interested in this class. I would love to know why, on SQL Server 2016, when a database is set to the 2014 compatibility level, a query will run slower than when the compatibility level is set to 2016. I know it has to do with the execution plan, but what changed from 2014 to 2016?


I consider myself to be good at performance tuning and spend a lot of my time doing it, but I know there’s much more for me to learn even in the 2012 I use all day. I haven’t worked with 2016 yet, and am interested in what new goodies await me. I don’t want just a list of what’s new, I want to see how to use the new features effectively. So I hope there will be good demos.


This seems similar to “Gems to help you troubleshoot query performance”. If this is different, the abstract should make that more obvious.


Lot of good stuff in there; I’ll go back and listen again. Where can I get a copy of those demo scripts?

