Enjoy the Podcast?
This session is aimed at anyone who writes T-SQL and wants to improve their skills.
Whether you are a developer, DBA, or anything in between, chances are you are not always following best practices when you write T-SQL. Unfortunately, many so-called “bad habits” aren’t always obvious, but can lead to poor performance, maintainability issues, and compatibility problems.
In this session, you will learn about several bad habits, how they develop, and how you can avoid them. While we will briefly discuss advice you’ve probably heard before, like avoid SELECT * and don’t use NOLOCK, you will also learn some subtleties in SQL Server that might surprise you, how some shorthand can bite you in the long run, and a very easy way to improve cursor performance.
By changing your techniques and ditching some of these bad habits for best practices, you will take new techniques back to your environment that will lead to more efficient code, a more productive workflow, or both.
Why I Want to Present This Session:
I have delivered this session several times, to audiences large and small, and even I learn something new just about every time I present. These are mostly things I’ve learned the hard way, and as a result, the session has evolved over time. Deep down, though, I like it for three reasons:
- I always surprise people with something they didn’t realize could lead to poor performance or unexpected results.
No matter how often I give this session, I still see people out on forums and Twitter encountering these same issues over and over again.
I am a huge fan of the trickle-down effect – encouraging people to follow best practices, not just for their own code, but also for anyone who might learn from them. This includes fellow presenters, bloggers, and documentation writers.
Brent Ozar: In this first session at GroupBy today, Aaron Bertrand is going to be explaining all of the bad habits that you have writing T-SQL. I just laugh at the example up on the screen because I’m like I’ve totally written one big long monster line stored procedure that has been kind of heinous.
Aaron Bertrand: Usually it’s just one line. Usually, it’s one long line, but I wanted to fit it so you could see it on the slide.
So who’s this Aaron Bertrand guy? I’m a product evangelist at SentryOne. We make some pretty cool software I think. You might have heard of Plan Explorer, but we also have some performance monitoring software for just about every aspect of Microsoft’s data platform. We’re in SQL DW, we’re in Azure SQL database, on premise SQL Server obviously. So, check us out.
I’ve been an MVP for a long time. I started with SQL Server 6.5. I’m a moderator on a couple of sites. If you have database related questions, I think dba.stackexchange.com is one of the best places to go for help. If you have execution plan related problems and you like people like Paul White and Rob Farley to help you answer your questions, answers.sqlperformance.com is the way to go. If you use Plan Explorer, you can actually upload your troublesome plans right from within the tool. That’s pretty handy. I’m an editor on a couple of blogs, sqlperformance.com, and then we have a team blog. I also write for mssqltips.com.
Brent Ozar: And he was the author of the year this year at mssqltips as well.
Aaron Bertrand: I was. I was author of the year this year. I lost out to Koen last year, so congrats to Koen for last year. This year, I was actually the top pick, so that was pretty humbling. That was pretty cool to win that award.
I did something with this deck that I don’t know why, in 10 years I’ve never thought of doing this before. I put my contact info at the bottom of every slide, because I imagined there are going to be people that are tuning in that get through half of this presentation and think this guy’s full of crap or whatever. Then, I might say something that actually interests you, or that you want to respond to, or something like that. So, I think it’s good to have contact info up on every slide. So, if you see a slide and suddenly you think, “Oh, maybe this guy does know what he’s talking about,” you can contact me directly through either of those methods. Usually, it’s just on the beginning and then you would have to go and watch the presentation or download the deck and go to the first slide. So, I thought it’d be neat to put that at the bottom of every slide.
Brent Ozar: That’s genius, I love it.
Aaron Bertrand: A few disclaimers before I get started. One is, I do come across in this session as demeaning and judgmental, please don’t take offense. I’ve learned all of these things the hard way. These are things that I found out why they’re bad habits and why you should do things a different way. Some of them are exceptions. You have cases where this thing works 99% of the time, but if you’re in this 1%, it’s going to cause a problem. And if there’s another way that you can do something that covers 100%, why use the one that only works 99% of the time? So, those are the kinds of things that I try to focus on.
Most of the slides have links in the notes, and if you download the deck, it’s already up on the GroupBy site. If you go to my session page, the deck is already there, the samples are already there, and they have all kinds of details. If you don’t believe me, there’s proof, the background to why things are a problem or why things have changed, and all of the demos obviously.
Then the most important thing for me, I want to make sure that everyone takes away something. When I do this in person, I always ask at the end of the session, “Did anybody not learn anything?” You can’t really trust the answers, but I do appreciate when people will say, “Well I knew everything in this deck.” My goal when I build this presentation is that I know that every slide is not going to be relevant for everyone. So, Brent, you’re going to watch this deck and you know 99% of the things that are in here already. You’re not going to really learn that much. The average person is probably not going to learn something from every slide, but I do want to make sure that I hit something for everyone.
Brent Ozar: If you’ve seen one line of code in sp_Blitz, you would know better than to say that.
Aaron Bertrand: That’s true. That’s true. I have seen that, yeah. All right. So, anyway, that’s the goal of my session. It’s not to fill your brain with an hour of knowledge, it’s to get you that one nugget that might help the problematic stored procedure you might have, or set one of your behaviors to a better way to do things.
The agenda. How I used to break this deck up was really, I just had a buffet of slides. I had 15 slides in here and if I was doing a 60 minute talk, I would hide three of the slides and if I was doing a 90 minute talk, I would make those appear again. They weren’t really in any logical order. It was just a bunch of slides. I would step through a slide and I would do a demo, then the next slide, I’d do a demo. I thought this time that I would break it up into sections, so that there’d be less transition to and from demo, and less crap on slides and more demo. So, I have a section and then I do a bunch of demos related to that section. The sections are: coding conventions, date responsibly, get predictable behavior, query efficiently, and stay employed. The last one is just something that if we have a couple of seconds at the end, I’ll talk about it. It’s kind of a funny little thing.
Coding conventions, I just want to talk about a bunch of things that people do that they could improve readability for other people. A lot of things that when you’re writing SQL code, you think, “Oh, well, this is just for me and it’s fine,” but if you put anything into a codebase or into an application or a production app, someone else is going to inherit that code, because you’re not going to own it forever. Or, someone else is going to learn from it, especially if you put your code into blogposts or answers on Stack Exchange and that kind of thing. Coding conventions are actually pretty important to instill best practices in other people, not necessarily yourself. You might know better, and maybe you’re lazy or you do things a certain way because that’s just what you’re used to, but think about the person who’s consuming that code, and what they’re going to get from that.
Dating responsibly, that’s just a bunch of things where people tend to handle date/time operations or data types incorrectly. They make assumptions based on a closed environment, so US English, there are certain things that a lot of us, I think, take for granted in how things work.
Get predictable behavior. I’m going to show a couple of things where you don’t quite get the things that you would expect.
Then query efficiently, I think that’s kind of straightforward. I’m going to talk about a few things that you can do to make your queries more efficient or a few things that you should avoid that you might think make your queries efficient, but don’t really, nolock comes to mind.
So, coding conventions. Coding conventions are preferences. This is a very subjective thing. Nobody can tell you what is the right coding convention, what is the wrong coding convention. Like I was saying before, you should focus on making your code readable and self-documenting. I should be able to look at a piece of code and just because you’ve made it cryptic by hiding something in a bit column or something like that, there should be enough in the code that I can understand what it’s doing. What standard you choose, not important, I don’t care if you like upper case in your keywords or lower case in your keywords, none of these things are correct or incorrect. I’m not wrong and we can both be right about the way that we code, even if we code differently—but not tabs versus spaces. There’s no argument about that. We all know that spaces is correct and tabs is wrong.
Brent Ozar: Thank you. Thank you.
Aaron Bertrand: The important thing is, pick a convention that makes sense and be consistent and make sure that your entire team is on board. There’s nothing worse than having to format code just to modify it. You have a co-worker that writes something a different way and you can’t—like some things that I read, I have to reformat before I can actually work on it because I can’t read it the way that it’s written.
A few of the specific things, begin and end. I see a lot of people write code where they say, “If some condition,” and then they just have a statement. After the “if” they just say, “Go ahead and do that.” This becomes a problem when someone else comes along and modifies the code and says, “Well, I also want it to do this other thing.” So they just add that other thing in the indented portion of the code and they don’t understand why it happens every time. So, begin and end, it’s really nice to have those.
Carriage returns and indents, that’s just the code that you saw on the beginning slide, that’s a mess. People post questions on Stack Overflow all the time where it’s literally one line of code and you have to scroll five screens over to see all the code. Nobody can read that. Nobody wants to read that. And everything justified to the left is no good either. Nice to have some indents, and things like that.
Table and column aliases, these are really handy to make your code short. I see code sometimes where someone is doing a join between tables in two different databases and every column reference has database.owner.tablename.column and order by database.table.column. It gets really messy. So, aliases help clean that up. But don’t do things like A, B, C or T1, T2, T3. Trying to reverse engineer this code, I’ve seen, I had an example a few weeks ago where I was trying to answer a question and they had customers as A join orders as B join addresses as C. So you had customers was A and addresses was C and then you’re messing around with all the joins and the aliases later and the A and C makes no sense. Makes no sense.
Name your stored procedure parameters. A lot of lazy coders out there that say, exec sp_who2 active and don’t specify what the parameter name is. For a systems stored procedure, that’s not really a big deal because they’re not going to go and change that on you tomorrow. But for user stored procedures, absolutely should be very careful about naming parameters. Because anytime, someone can come in anytime and change the order of the parameters and now your code breaks. It might be troublesome to troubleshoot that.
Key strokes. So, this is a fun one. I find a lot of people try to improve their productivity by typing fewer characters. One of the examples that I’m going to show is month. Someone will say dateadd M date one, date two, instead of typing up month, which is fine for month, because nobody would ever confuse that for minutes, right? But some people might. So, that’s just a case of self-documenting. But people do all kinds of things all over the place to save a couple of key strokes.
It drives me nuts because if you save four keys strokes today you’re not going on vacation early. You’re not going to make the train versus not making the train if you type four extra key strokes. People put a high priority on saving key strokes and it’s just not—I don’t know, it’s like scraping one bug off your bumper to get better gas mileage. It’s not going to help.
I have some demos about some of these things. One of them is properly defining variable length data types. If you have a varchar—what do you say? You say “var-chair”? How do you say it, Brent?
Brent Ozar: I say I’m a caring guy, so I say “var-care”.
Aaron Bertrand: A caring guy? Okay, “var-care.” So, everybody pronounces this differently and it doesn’t really matter. I’ve always said “var-car.” Anyway, I see people say convert to varchar all the time, and they don’t specify how many characters the varchar should be. That leads to some unpredictable behavior.
Lower casing data types. This is a new one for me. Forever, I would upper case all of my T-SQL keywords and all of my data types. Just always, just upper case, that was just my habit. Made it very easy to differentiate those from objects and other things that are proper cased or lower case, especially if I was reviewing code. Because when I first started, I was always reviewing code in plain text editors. We didn’t have all this nice, fancy color. I didn’t always write in Query Analyzer, and so, I would write a Notepad++ or things like that and it was all black on white or white on black. So, it was nice to have that differentiation.
I learned recently though that when you have, and this is again, this luxury we have with US English and having this closed environment, where we only have certain collation and case sensitivity, that kind of thing. There are certain data types that if you specify, if you type them in upper case, they’ll actually fail. So geography fails if you try to invoke a method on geography and you say upper case and you have a case-sensitive collation. So, what I learned is that you’re supposed to, technically, match what’s in sys.types. Because in certain cases, if you have a user defined type, you need to match what’s in sys.types. Because, if you deploy your code—and it might work fine on your machine, and that this the whole “works fine on my machine” syndrome.
But you deploy to a case-sensitive collation, maybe you don’t even know that the destination is case sensitive, and all of a sudden, all of your code is broken because your case doesn’t match for your data types. So, I’ve recently got into the habit of always lower casing all of my data types. I see my old code that I posted, even a year ago, and I’m like, “Those data types are upper case,” and I feel compelled to go and fix those.
Match exact case for entities. IntelliSense really helps out with this a lot but you need to make sure that your case matches, for the same reason. Your code could end up on a case-sensitive collation and then SQL Server’s not going to be able to find dbo.employees if you put a lower case e. Then the metadata it stores an upper case E. It will actually expect those to be different objects.
For T-SQL keywords, like I said before, if you like upper case keywords or lower case keywords, it doesn’t matter. Just, again, pick something, and be consistent.
Semi-colons. So, Adam, we had a discussion this week with Adam, and Adam wants some confirmation that he should be using semi-colons and I absolutely think he should. I think everybody should be using semi-colons to terminate all of their statements, and I’ll show some examples of why.
The schema prefix. One of the coding conventions that I see a lot of people do is they’ll leave out the schema prefix because right now everything’s in DBO or they want to use the behavior where everything looks like your default schema. So, we’ll take a look at some examples of why you don’t want to do that.
All right. So properly defined variable length data types. I have some code here where I’m declaring a variable as a varchar and I’m setting it equal to my first name. I’m going to select that and I’m also going to convert in-line to a nvarchar, the string, “There once was a man from Nantucket.” Brent, what do you think will show up here?
Brent Ozar: Oh, I remember, there is a gotcha, and I can’t remember—I think Aaron works, but “There once was a man from Nantucket” gets clipped and I have no idea where.
Aaron Bertrand: Okay.
Brent Ozar: I’m probably wrong on both but… dammit!
Aaron Bertrand: Selecting the variable ends up with an A. So that truncated at one character. Doing a convert truncated in the middle of Nantucket.
Brent Ozar: Golly.
Aaron Bertrand: So, there are cases where a varchar without a length is one character and there are cases where varchar without a length is 30 characters. It will silently truncate, and that, if you have a parameter to a stored procedure that takes that in, the data is gone. It doesn’t get logged. There’s no error. There’s no warning. It just gets truncated.
Brent Ozar: And it’s like six hours of debugging right there as to why your code doesn’t work.
Aaron Bertrand: Right. So, let’s say you have the table, call it dbo.ruh_roh, that has two varchar max columns. And now I have a stored procedure, and it’s not just about specifying a length or not, it’s also about specifying the length that’s not long enough. So, I have a stored procedure here that takes two parameters, parameter one, and parameter two. One’s a varchar, and one’s a varchar 10. All they do is just insert those values directly into the table. Then I execute that stored procedure with two parameters. One is some important data and the other parameter is even more important data. Now, if we take a look at what actually got stored in the table, we don’t have all of our important data.
Some important data became an S and even more important data became even more. So, I call that stored procedure, and because maybe my unit tests just checked for success, I don’t think there’s a problem here. So I deploy the stored procedure to my production environment, people start using the new stored procedure, and we’re losing data all over the place, and we have no idea, no idea that we’ve lost all that data. Because it’s varchar max. Why would anything be truncated?
Brent Ozar: Wow.
Aaron Bertrand: Because the parameter truncates it and it’s silent.
Brent Ozar: Nicely done.
Aaron Bertrand: Another thing about declaring the right size for your varchar. If you have a column, I see what a lot of people do is, they’ll create a table, and they’ll put a varchar column in there. They’ll say, “Well, I don’t really know how big the biggest string will be, so let’s just do max, or let’s just do 4,000 or 8,000.” They’ll create that table and they’ll put that information in there. Even if no value was ever more than 10 characters, at least they’re covered, just in case. But it actually does cause problems.
So, we have three tables here, and they have columns A, B, C and D. They’re all nvarchar 32, one is all nvarchar 4000, and one is all nvarchar max. So, we have those three tables. I’m just going to insert a whole bunch of data in here and the data is all going to be very short. So, it all fits into a nvarchar 32 no problem. But we have these tables that were designed with larger sizes.
Brent Ozar: So, for those of you who are watching, there’s three different insert statements there but the reason the first one’s different is he’s dumping those all into T1 first, then he’s just taking T1 and populating T2 and T3.
Aaron Bertrand: Right. I’m saving keystrokes. Which made this demo come together much quicker.
Brent Ozar: He’s going on vacation.
Aaron Bertrand: I got this done so much faster. So, now I’m going to run a few queries, against these tables. I’m just going to run them each once. It’s just select distinct A, B, C, D. I’m doing this on purpose so that there is grouping and sorting. So, we’re going to need to use memory in order to satisfy these queries.
Brent Ozar: Seems like the second one’s taking a little bit longer.
Aaron Bertrand: Yeah. The ones with the wider columns do take a little bit longer, we’ll see why in a second. So, it took 30 seconds. Let’s see where all the time was spent. You would think, “Oh well, they probably each took 10 seconds,” right? They’re returning a bunch of data to a grid. But if we take a look at sys.dm.exec.query stats, we see what happened. So, even though we selected the same data we ran the exact same queries because of the data types, SQL Server made assumptions about the size of the data that would be returned and therefore the amount of memory that would be required. It actually generated different plans because of that.
So, when SQL Server looks at a column and you’ve decided, “Oh well, we’ll just make this nvarchar 4000 so we’re covered just in case.” SQL Server actually believes that the average value will contain 2000 characters. So, when you have varchar 4000 and it’s vastly oversized and all of the values are 10 characters you’re actually—the memory that SQL Server will grant to this query is 2000 bytes per row, just for that column, instead of the 10 bytes that it really needed. So, you can see how the granted KB goes way up over time and how that actually affects the elapsed time.
Brent Ozar: I’m going to guess that that monitor behind you is so big because when you display varchar max data you actually need to show all of it.
Aaron Bertrand: That’s true. It wraps around the entire room. There’s a monitor here and a monitor here. And I spin in circles.[Laughter]
Aaron Bertrand: Alright, so what else do we have? Oh, so lower case data types. So, create database floobmort, and I’m going to use collation of Latin general N2. Now this is the case I was talking about where you have something like geography and you think, “Oh well, it doesn’t matter. Case for data types doesn’t matter. I can do this and I can get an answer and it works no problem. Here’s my spatial results.” No big deal. Well, it doesn’t really show anything, it’s just a stupid line. Now if I try the same thing in that case-sensitive database with upper case geography, it tells me that’s not even a system defined type.
Brent Ozar: That’s amazing. That’s insane.
Aaron Bertrand: So, you have to be very careful about that and I’m telling you you’re going to have people that deploy SQL Server in all kinds of weird ways, with different collations and different case sensitivity, and you don’t know where you going to end up. So it’s much safer to just be safe and match whatever is in sys.types. This isn’t as important for varchar. I couldn’t find a case where varchar would break if you typed in an upper case. But just as a consistency thing, just match what’s in sys.types. I just think it’s safer, that might break in the future, I don’t know. But I would be very careful about that, lower case your types.
All right, so matching exact case, so let’s have a table called sales orders as an int column. I’m going to free the procedure cache and I do this in lot of my demos just so you don’t think I’m pulling the wool over your eyes or benefiting from something I did before. I’m just going to run queries against this table. I’m going to run three different queries and I’m going to run one twice just so it’s clear which one is which in the results. So, select top one sales order id from dbo.salesorders. Then the second version of the query has all of the entities in lower case and the third version of the query has all of the keywords in lower case.
This is going to demonstrate two things. One is to use consistent code for your keywords. So, if you like upper case, always use upper case. Don’t do this, “Well, most of the time I use upper case, but in this one situation I’m going to use lower case.” Then the other is obviously the entities themselves. So, let’s run these queries. I have nothing in the cache, so no tricks here, no magic, anything like that.
Then if we go look at what has been cached in the plan cache, I’m just taking a look at sys.dm.exec_cached_plans and then cross applying to sys.dm.exec_sql_text, so I can find my queries where lower of the text is like “sales orders.” If we look at those, I actually stored three different copies of that identical query in the cache just because of differences in case. So, the query text is hashed to binary. So when you have an upper case s and a lower case s, that ends up as a different value. So, SQL Server has to store a separate copy of that plan. So, I ran that same query four times and I actually had to store three different plans. I tripled my plan cache usage for that one query because one of my developers used lower case and one of my developers used upper case.
Let’s take a look at something else.
Brent Ozar: Albanian.
Aaron Bertrand: Yeah, well, it was first in the list, it’s alphabetical.[Laughter]
I didn’t pick Albania because I’ve got anything against Albanians. So, I’ve created database called Boom. Same thing, I create a table called sales orders. And maybe I’ve deployed this code using SSDT or a DACPAC or whatever they call those things now.[Laughter]
So, selecting sales order id from sales orders works. Selecting sales order id from dbo.salesorders in all lower case doesn’t even work, it can’t find the object. So, that’s no good. But it will let me create a stored procedure with the wrong case. So, if my initial testing is just, well, the procedure compiled, so we must be good, right? That’s not necessarily going to be the case because deferred name resolution allows you to specify objects that don’t exist yet and SQL Server will assume that you didn’t type this wrong, that there’s an object you’re going to create later called dbo.salesorders. I think we need a feature in SQL Server that allows you to turn off deferred name resolution.
Brent Ozar: Especially stuff with if branches. Because if I screw up one if branch, and usually everybody hits the other one, you know my stuff compiles, it runs fine. But when they hit that one odd if branch, the thing explodes.
Aaron Bertrand: Right. Right. So, semi-colons, Adam, if you’re listening. So, with every new version more and more syntax requires semi-colons. I believe in 2008 they deprecated transact SQL statements without semi-colons. This is something that might not happen in our lifetime, but in case it does, I really think that everybody should be using semi-colons at the end of their statements.
I have a few examples here of where it can happen, or where it can bite you, and how it might be difficult to troubleshoot. I have a stored procedure here called problem and it takes i as an integer. What we’re doing is we’re going to begin try, we’re going to start a transaction. We’re going to select one over i, you can probably see where this is going. Then we’re going to commit the transaction. We’re going to go into end try.
When we go into our catch handler, we’re going to have roll-back transaction and then throw. We’re going to throw the exception, which is a much more powerful method now than raiserror because throw allows you to raise the initial exception instead of a custom, a user defined one. And then end catch, end, and go. And so, we’re going to call the stored procedure with the parameter of i is equal to one, and it works no problem. Oh, and I already created the procedure. Actually, I had create or alter here, but then when I uploaded the samples I was like, “Well, I can’t put create or alter because not everybody who runs the samples is going to have a version of SQL Server that supports that.”
So, I have the stored procedure already created and I’m going to execute it with a parameter of one, and it works no problem. I got select one over one, which equals one. Then now I’m going to try it with exec dbo.problem @i is equal to zero. You would expect a divide by zero error, right? Because we’re trying to divide one by the parameter and if it fails, obviously we’re going to go into our catch handler. We’re going to roll back and we’re going to throw. So that’s how it’s going to work, right?
Brent Ozar: Yeah. It should, yeah.
Aaron Bertrand: Yeah, not quite. So, what happens is, “Cannot roll back THROW. No transaction or savepoint of that name was found.”
Brent Ozar: What?
Aaron Bertrand: Yeah. So, a “Transaction count after EXECUTE indicates a mismatching number…” So, now imagine you’re the developer trying to troubleshoot this. What do you do?
Brent Ozar: What?
Aaron Bertrand: The problem is because roll back transaction did not have a semi-colon.
Brent Ozar: That it thinks it’s a transaction with throw?
Aaron Bertrand: What it thought this was is roll back transaction throw.
Brent Ozar: What an idiot. Wow.
Aaron Bertrand: So, that is all caused by not having semi-colon after roll back transaction. If I change this to alter and I just add that one semi-colon here—never type in a demo by the way, this is bad. So, I’m going to change my stored procedure, and I’m going to do this again, exec zero, now I get to divide by zero encountered.
Brent Ozar: Nice. Wow.
Aaron Bertrand: So, all I have to do is add that one semi-colon and problem goes away.
Brent Ozar: That’s horrible.
Aaron Bertrand: There’s some other new syntax, so, merge. We’re going to talk in a few minutes about merge and why you shouldn’t be using merge, I think anyway. But if, let’s say, we have a table called X and of course it already exists because I ran through my demos this morning, and now I’m going to run a merge statement. This should work no problem, right? Merge X as X using values as Y on X.I is equal to Y.I when matched then update set I is equal to Y.I, when not matched then insert. So, it’s just an upsert using merge. Except a merge statement must be terminated by semi-colon.
Brent Ozar: Come on.
Aaron Bertrand: So, there are cases where you have to put a semi-colon there.
Brent Ozar: That’s kind of cheesy because if you’re good enough to memorize all the parts of the merge statement, and you forget the semi-colon, you deserve like a special prize, because the merge statement is the hard part.
Aaron Bertrand: Right. There should be a consolation, “Oh, you did it, first try without a syntax error? You don’t have to put a semi-colon. You’re good.”
Brent Ozar: We’ll give you bonus points.
Aaron Bertrand: The excuse that I always hear, the only justification I ever hear for not using semi-colons is that, “Well, I don’t have to.” I haven’t found a good reason to not use semi-colons. It’s just the “I don’t have to” excuse and I don’t think that’s good enough because what I feel you’re doing, you’re just justifying more technical debt. So, I’ve a blog post out on SQL Blog where I’ve had a bunch of people comment back and saying, “Well, Microsoft will never enforce this. Blah, blah, blah.” I want to bring up some political stuff right now, that people promised we wouldn’t do, or promised and did. I’m not going to get into that, but what I say to them is, “What you’re doing basically, is if you’re saying, “Well, I’m going to adamantly not put semi-colons in my code,” you are making your codebase. You’re just adding more to your codebase that you’ll have to fix later, if they do ever enforce this.
Brent Ozar: They’ve made worse decisions before. They’ve made stranger things happen before.
Aaron Bertrand: Right, right. Notification services. That just happened. This has been announced for nine years now, they’re going to make semi-colons mandatory. Nine years. Just because they haven’t enforced it yet doesn’t mean they won’t. So, when I got really adamant about always having semi-colons in my code was I used to post answers to Stack Overflow that use CTEs. CTEs came out in SQL Server 2005 and they were the first real syntax problem where semi-colons were relevant.
So I would write an answer like this, and obviously this wouldn’t be an answer on Stack Overflow because it’s too simple, but just imagine that this was a much bigger piece of code. I would say, “Here’s a CTE that does this with XI, select one. Select I from X.” I would paste that into an answer and I would post it on Stack Overflow. An hour later, I would get a nasty-gram from this person because what they did, is they would take this code, just like this, and they would have a stored procedure that they were running, and they would paste it right here, like this.
They would say, “Well, I took your code and I put it in my procedure.” So, let’s create the stored procedure. And it would break. “Incorrect syntax near the keyword with,” because they weren’t using semi-colons. So their code had this declare at I int, that wasn’t terminated. And now I’m the bad guy because I broke their stored procedure basically.
Brent Ozar: Except if this really happened on Stack Exchange, it would also have things like, “My client asked me to” and “I got fired because… ”
Aaron Bertrand: So, I just find that it’s very rare that you can have too many semi-colons. So, you can put as many semi-colons into this thing as you want. This is valid SQL, all these semi-colons in here, you can do that. It’s stupid and I would never do that, but I can.
It’s usually more common that a missing one causes problems than anything else. So, Adam was very adamant that I needed to convince him to use semi-colons, but it doesn’t look like I need to convince him at all. This is a snippet from sp_whoisactive and it’s actually pretty consistent in there. He uses semi-colons even on things where I don’t. Like begin and end. So, he’s already doing this. So, the reason I brought it up to Adam specifically is because in his session last week, and I downloaded the demos to make sure I wasn’t hallucinating, not a single semi-colon in any of the demos.
Aaron Bertrand: So, I just wanted to kind of call him out about that.
Brent Ozar: Oh, jeez.
Aaron Bertrand: So, it’s do as I say, not as I do. Because his actual production code does have semi-colons.
Brent Ozar: Did he show you that too, or no?
Aaron Bertrand: I’m sorry?
Brent Ozar: Did he show you his production code?
Aaron Bertrand: No, he did not. But I’m just basing it on sp_whoisactive. All right, so last demo in this section, we’re going to create two schemas called Brent and Aaron. I’m going to create two users called Brent and Aaron, and they have default schemas that match their names. I’m going to create a simple table in the dbo schema. And I’m going to grant select to both of these users.
So, just to prove I’m not messing with anything, I’ve got another [mumbles freeproccache] blah, freeproccache. I’m going to execute these two statements as Brent and as Aaron.
They’re both doing the exact same thing, select ID from simple table, and we’re going to see what happens. So, there’s nothing in the table, but I did cache some plans. So, I’m going to check the plan cache again and see from cache plans and sys.dm_exec_sql_text, get me all the queries that involve simple table, and text is not like sys. So I don’t want to pull back my queries against the system objects. And we see the same query stored in cache twice and using up twice as much plan cache memory as I need to for this query. They’re both accessing the same table, and they’re both returning the same data. So, why should I have two different plans for this that are identical? The only difference is that they have different default schemas.
So, if we go back, and I go back into this code and just do what I should have done in the first place, which is type dbo dot at the beginning of these things. I’m going to free the proc cache again, and I’m going to run these batches again with that one change. All I did was add the schema prefix like I should have always done. I get the same queries back. Now if we check the plan cache again, now we have one copy of that plan. I’m only taking 32K of plan cache memory, and it was used twice. So, I get plan cache reuse because I specified the schema prefix correctly. There’s an example down here that shows why that is and you can dig a little bit deeper into the exec plan attributes.
Brent Ozar: Oh my goodness, I didn’t have the questions panel open and you guys have poured in a ton of questions. We will go through and ask the questions at the end.
Aaron Bertrand: Yeah, that’s probably good. All right, so act two is about dating responsibly. One of the things I see people do, and I mentioned it already, is date/time shorthand. So, they do things like date at d instead of date at day, because saving two characters really changed their day. Or get date minus one, you ever do get date minus one to find right now yesterday? You’ve done that, right? Yeah, so we’ll talk about that.
Regional formats, have to be very careful about regional formats. People love to pass in date literals as M/D/Y, or D/M/Y, and I’ll show why that’s a problem.
Between, a lot of people like using between for date range queries and it’s very problematic because it’s very difficult to determine the end of a period. What is the end of February? Really. Can you tell me?
Brent Ozar: I’m so bad with milliseconds. February 28th and some change.
Aaron Bertrand: Right, if it’s not a leap year.
Brent Ozar: Oh, yes. Yes.
Aaron Bertrand: Yeah so, calculating the end of a period has a bunch of little complications.
Format, so people like format because they can now have output in SQL Server that looks just like the output in C# with the same incoming format, right? But there’s some significant overhead with CLR.
Then non-sargable expressions against columns. People do things like year of order date is equal to 2017, and month of order date is equal to one to find all of the orders in January. And that’s not necessarily good either, especially if you have an index on that column.
Then predicates using dateadd/datediff, I don’t have a demo of this, but I do have a link in the slide notes. I have two links in the slide notes actually that show a couple of bugs in the optimizer that make for really bad estimates when you’re using dateadd or datediff in the where clause. All right, so let’s go to a demo for this one. So, act two. So Brent, you have a piece of paper and a Sharpie handy?
Brent Ozar: Mm-hmm.
Aaron Bertrand: I’m going to have you take a little quiz. I’m declaring a date variable as get date. So, this is today, no tricks, I didn’t change my system time or anything like that. I want you to tell me what these four answers will be. So, just write down your answers for datepart D.
Brent Ozar: Okay, so datepart D is going to be B. datepart week is… oh man, I don’t know the calendar. Week, W, shoot. I don’t have a, where’s a calendar… Let’s see here, it is the third week? So, I’m going to guess something is not on your list. I thought the second one datepart week would be—oh no, it’s week day, it’s week day, it’s week day. Oh man, Microsoft Certified Master at work. I’m going to say datepart… That’s going to be… God, I’m going to guess five, I’m going to guess that that’s B five. Month? Oh, see I don’t know if M is month or minute, and I would have to hit Books Online. I’m going to guess this minute, I’m going to say C minute, and then so the datepart of the year I’m going to say is 2017.
Aaron Bertrand: All right, so you’ve written down your answers?
Brent Ozar: Yes.
Aaron Bertrand: Let’s hold them up and compare.
Brent Ozar: All right, cool there we go. So I got 20 right! I got five and six wrong, because I wasn’t sure when the weekday starts. I don’t know if it depends on your…
Aaron Bertrand: Yeah, Friday is six.
Brent Ozar: Yeah, and the datepart I got wrong, because I wasn’t sure if it’s minutes or months, and how the hell is datepart Y 20?
Aaron Bertrand: Alright, so the reason is, so D that’s just—I did a progression of date/time, so D isn’t really a trick, right? D and day are the same. So today’s the 20th day, no problem, you got that right. W, that’s day of week, not week, a lot of people will look at that and say, “Oh it’s W, oh that’s week,” easy right?
Brent Ozar: Yeah.
Aaron Bertrand: So, day of week is six, and actually I should’ve had three and four as the week or the ISO week. Yeah, so that’s my bad. Then datepart M, so you said a very interesting thing, “When I see datepart M I would have to go look at Books Online to see what it means.” That’s not self-documenting code. If you have to go look at Books Online to figure out what you’ve written, that’s terrible, right? So, that’s actually datepart month, that’s not minute.
Now datepart Y, that’s not the year, that’s the day of the year. So, if you meant Y, if you meant year, just type year. It took you eight extra micro seconds. There’s nothing wrong with actually making this self-documenting, and I harp on people all the time about this, just spell it out, type it out. If you meant month type month, it’s not hard. It’s a few extra key strokes and it’ll prevent you from having to go look in Books Online.
Brent Ozar: There’s a saying in drivers, if you make another driver hit the brakes then you’re a bad driver. Doesn’t matter how good you think you are, if you make somebody else hit the brakes—if you make me hit Books Online when I’m reading your code, you suck as a developer.
Aaron Bertrand: Yeah, that’s true. So another one is the short hand, the plus one short hand. If I do this, if I use datetime and I say declare @dt datetime is equal to current timestamp, and I say select @dt plus one, no problem. I get tomorrow at this time. It’s 8:44 east coast time tomorrow. If suddenly the data type of the column where I’m performing this plus one or minus one operation against changes or the data type of the parameter, and that can happen outside of SQL Server, I can declare a parameter in my SQL code inside of C#. And that would be invisible to whatever code is in SQL Server. So, declare @d datetime2 is equal to current timestamp, and I try select @d plus one, I get an operand type clash.
So, the new datetime types that were introduced in SQL Server 2008, they don’t play well with others. They don’t do all of the things that the old types did, and implicit integer math is one of those. So, be careful about that. Just use explicit date at.
Regional formats. So we have something like this where these three formats, year.month.day and month/day/year and YMD space time, these all work in my local system because I’m using US English and I have an English locale, and they all produce September 8th, no problem there. However, if I do something silly, like change my language to French, no one would ever do that, right?
Brent Ozar: America.
Aaron Bertrand: I select those same dates that I thought were safe, I actually get August 9th for all three. Even this one, which surprised me the first time I saw this. So, I have Y-M-D and this is interpreted as Y-D-M in French and German, and most of the European languages, this is interpreted as Y-D-M for whatever reason, I don’t know why.
Brent Ozar: Same reason that they won’t use inches and pounds and feet.
Aaron Bertrand: Yeah, I guess so. Yeah. So, for this case, if I want this to be safe I have to use YYYYMMDD. Or, if I have time I need to use the dashes and put a T in where that space was. This tells SQL Server, I don’t give a rip what their language is, I don’t give a rip what their locale is, this is how I’m going to interpret this. So even in the French language, this is going to be interpreted as September 8th the way I intended. This is one of those cases where there is an exception, where it always works. So date, if I’m converting to date it will always work correctly, whether I put the dashes or not, but that doesn’t mean that you should do that. Use these two formats, these are the only two that are truly safe in SQL Server across the board.
Brent Ozar: And it sucks because for some reason it makes my eyes hurt.
Aaron Bertrand: It is a little harder to read, yeah. So I’m going to create a table called sales orders and it has an order date column called datetime, or that’s datetime2. I’ve inserted a bunch of data in here. So, February 1st, February 5th, the 19th, the 28th, the 28th at 1:27 in the afternoon, the 28th, just before midnight, and March 1st. We can look at the data in this table and I’ve got my seven rows here, there are seven rows in total, six of them are in February and one of them is in March.
So, the old millisecond trick, so this is what people will do to find the end of February they would say, “Well, the start is February 1st,” let’s say I’m trying to find all the data from February 1st to the end of February. The start is February 1st. The end is, well, let’s take March 1st and subtract three milliseconds, so we got 23:59:59.997.
Now when we select the data from that table between those two points in time, we only get five rows. We’re missing a row, because we had datetime2, we’re missing this row right here. Because .999 is not between midnight on the 1st and 23:59:59.997 on the 28th. So, we lost data. It might be a very small portion of data that we lost but we want our reports to be right.
It gets worse. So, let’s say the parameters for this stored procedure changed to small datetime. So, I’m passing in the same information. The start is February 1st and the end is the beginning of the next month minus three milliseconds. I’m going to do between start and end. Now, because it’s small datetime, it rounded up. I included data from March 1st, but I didn’t know that. My application, I’m the application developer and I’m passing parameters into this stored procedure and the stored procedure parameter data types happen to change. Or maybe they convert into a local variable or something like that.
Brent Ozar: Wow. What’s the resolution on small datetime? I don’t even know.
Aaron Bertrand: Small datetime is minute.
Brent Ozar: Oh wow.
Aaron Bertrand: Yep. It rounds, if you have at the end of a minute, it rounds up. So, if you try to find the end of the day, it rounds up to the next day, which is not good. Makes your data wrong. So, parameters changed to date. We have start date is equal to February 1st and the end is three milliseconds before March 1st and we run that query. Now we only have four rows. So, because we said start is between the 1st of the month and essentially the 28th of the month at midnight, we lost all of the rows from after midnight on the 28th. So we have only four rows.
Brent Ozar: Several folks have asked in the Q&A and in Slack, why minus three instead of minus one? It’s that there’s three millisecond resolution for datetime.
Aaron Bertrand: Right. Three millisecond resolution for datetime in SQL Server. Then eomonth, I see a lot of people all excited about eomonth and it’s terrible. It does the same thing that we just did here where we convert to date. So, declare at start date is equal to February 1st and you think, “Oh well, I want everything from the start of the month to the end of the month. Eomonth tells me, it’s the end of the month. Now I don’t need to calculate it. Yay! How awesome is that?” Except the end of the month is the last day of the month at midnight. Not truly the end of the month.
Brent Ozar: Oh god.
Aaron Bertrand: So, what I recommend for this, never use between for date range queries. Always use greater than or equal to the beginning of the period and less than the beginning of the next period. So, instead of doing all this fun math and everything else…
Brent Ozar: Good demo there.
Aaron Bertrand: Yeah, so stop using between, open ended range is safer and far easier to calculate than the end of a period. So, if I say, declare at start date. If I want all the data for February, I just say, where it’s greater than or equal to the 1st of February and less than the next month. So, less than date at month one start. This will always work. It doesn’t matter what my data types are. I’m always going to capture all of the rows from February and none of the rows from March.
Brent Ozar: Then in Slack, D. Briskin says, “That only works until developers pass in 12.31.9999.” Come on now!
Aaron Bertrand: All right. So the Y10k problem, got it.
Aaron Bertrand: I won’t be around when that happens, so enjoy that. Next thing I’m going to do, format. I’m going to show how inefficient format is. So, these two queries return the same output. So, here’s convert char (8) 112 returns 20161210. If I use format and use YYYYMMDD, returns the same output. These are the same queries, except one uses format and one uses convert.
So, I’m going to free the procedure cache and I’m going to run both of these queries 50 times just to see what the average run time is. So now, a lot of times what people will do is they’ll set up datetime variables at the beginning or the end and they’ll time these things, or they’ll have a table where they log stuff. This is all in query stats. So, we can go look at sys.dm_exec_query_stats and see what the run time of these things were.
So, here’s convert. Total elapsed time, 725 milliseconds. For format, 5,078 milliseconds. So, the average time, it’s almost 10X performance penalty you’re paying to have this nice pretty format instead of memorizing what 112 means.
Brent Ozar: Ouch.
Aaron Bertrand: So, format looks like it’s something that’s cool that you should be doing and it’s actually pretty expensive to do.
So last one for this section. I’m going to create a new table called objects and it’s just 50,000 objects and columns and modify date from sys.all_objects. I’m going to create a unique clustered index on ID and column ID. Then I’m going to create a non-clustered index on the modify date column. Then I’m going to run these three queries and just look at the execution plan for these.
So, I want to find all the data for January of this year. All the objects that were modified this year. And I have no idea what the answer is because I’ve made a bunch of changes in this database, but select ID from dbo.objects where year of modified date is equal to 2017 and month of modified date is equal to one.
Pretty simple, right? People write this kind of thing all the time. Or, I can do it this other way, where I strip the time from the date by making it a string. So convert, modify, convert char 10, and actually that should be convert char. So, people will say, well, I don’t need to care about stripping time or doing it any of that, I can just convert this thing to a string where it’s six characters which means…
Brent Ozar: This is the safe one.
Aaron Bertrand: Right, this is the safe one. Then the last one, well this is one that’ll show me year-month and that’s it. It doesn’t care about what the day or the time is. Or I can do it the way that I’m supposed to do it which is greater than or equal to the beginning of the month and less than the beginning of the next month.
Brent Ozar: That’s hard work.
Aaron Bertrand: Because remember, we’re using an index. So, when you have an index and you do anything against that index, against the column in that index, you effectively negate the use of the index. You can’t use the index if you apply an expression to it first.
So we’ll take a look at the execution plan for these. The two methods we had where we apply transformations to the column cost a lot in terms of estimated cost at least. The one that used my preferred method of greater than or equal two and less than, is much less, and you can see why. You don’t have to necessarily take too much stock in these numbers, but just look at the plan itself. We’ve got an index scan and an index scan versus an index seek. And a seek isn’t always the best way to get data from a table, but in this case, it probably is.
Brent Ozar: What was the yellow bang on the second one?
Aaron Bertrand: The yellow bang?
Brent Ozar: On the select…
Aaron Bertrand: Right here?
Brent Ozar: No, run that query again and there was a yellow bang on the select in the execution plan.
Aaron Bertrand: Oh, probably a conversion. So, if we convert it to char 6. It’s just a cardinality estimate warning, based on the conversion.
Brent Ozar: Love the SSMS team.
Aaron Bertrand: Okay. Let’s move on. Predictable behavior, so merge. I’m sure you’ve used merge, I’m sure you have clients that have used merge, I have changed almost every client I’ve seen using merge to not use merge. So, I have a URL here bit.ly/merge-with-caution. I don’t have a demo for this, but there are a bunch of problems with merge.
One is one that Michael Schwartz has blogged about as well, and that is that most people don’t realize that just because merge is a single statement, it’s not a single operation, it’s not an isolated operation. It actually handles those operations underneath—the update, the delete, the insert—it handles those as separate operations. So, if you don’t use hold lock or serializable transaction around that merge statement, you’re actually still doing the same thing you used to do with update and insert, except you don’t have any transactional control over that. So you can still end up with raise conditions.
I also, in this URL, I have I think 14 connect items that haven’t been resolved yet. Some of them are no big deal, some of them actually lead to the ability to bypass constraints because of the order of the operations that merge does. My suggestion if you are thinking about using merge, just use the way that you already know works. Just use update and insert.
Sp_MSforeachdb, please don’t use this stored procedure to loop through databases and do something. I wrote a replacement that’s in the slide notes and the reason is they use a cursor, and this is an undocumented and unsupported procedure, so you probably shouldn’t be using it any way. But the way that they use the cursor in this stored procedure, it means that if there’s any activity in databases, specifically that affect sys.databases, this stored procedure will actually skip that database and it’ll essentially do a repast and it won’t tell you. So, if you say, “Let’s do fragmentation, some kind of index de-fragmentation against all of our databases,” and it skips a database, it doesn’t tell you, “Oh, by the way, I didn’t do this database.” It just doesn’t do it and that can be a pretty big problem.
There are a lot of common myths about order. One is that a table or query has a default order. So if I insert a bunch of data into a table and I say select * from table, I’m always going to get data in the same order, right? No, you won’t. An outer query has to obey an inner order. So, if I have a CTE or a sub-query that says select top 10 * from wherever, order by something, and then I select from that CTE on the outer thing, I don’t have to put the order by anymore, that’s not true either.
Top 100 percent order by. I still see people do this, they put this in views and they expect that the order by will actually do something, it doesn’t do anything. More importantly than all of these, is that the optimizer will process your query in a specific order. So, people think, “Oh well, if I filter out these rows first and then I filter out these other rows, it would never do that in the opposite order.” Actually, SQL Server isn’t bound to process your statement in the order you wrote it.
I have a few examples for these. So, I’m going to create a simple table with an ID. It’s an integer and it’s a primary key and that means it’s clustered. So, the clustered index is on the integer and that’s just because SQL Server does cluster primary key by default. Then I have a first name column, that’s varchar 32, and I upper case my keyword there—bad Aaron—for my data type. I’m going to purposefully insert three rows. So, Mike, Grant, and Aaron are one, two, and three, they’re not in alphabetical order.
So I have three rows in there. Now, if I just say select ID first name from dbo.what, and I don’t have an order by clause, I’m going to get these back in the clustered index order, one, two, three, Mike, Grant, Aaron. Now, I might have written an application around this that sends this specific query to the database and says select from this table and it comes in a certain order and I have my nice little grid in my application that has one, two, three. Then someone comes along and for some other purpose, might be some completely different person, has a search stored procedure that looks up people by name.
So, they get a missing index recommendation from Management Studio or from digging into plan cache and they say, “Oh well, I’m going to create this index on first name.” So, they do that and then users start calling and complaining because this select ID first name from dbo.what now comes back in alphabetical order. The reason is that it chose the non-clustered index instead of the clustered index.
Top 100 percent doesn’t help. So we have this sub-query where we say select top 100 percent ID first name from dbo.what order by I’d. We select from that and we say, well, if we put this in a view, for example, which some people will try to do and now we don’t have to change the application code, because we’re calling this view. And no, that doesn’t help. I still get it back in alphabetical order, because that’s still chosen.
I can’t add the explicit ordering in the CTE either. Same thing happens, I get it back in alphabetical order. So, the only way that I’ve been able to do this with a CTE is if I actually have to materialize the information from the CTEs. So, if I take the actual ordering column out of the CTE and display it, then, I get the order I want. But that’s not documented, not guaranteed. That’s just observation, don’t rely on observation. That’s what happened here. You relied on observation here that it came out in a certain order, right? So, don’t believe that.
Another thing about order by, don’t using lazy shorthand, like order by one, order by two. That refers to the column ordinal in the select list, and that’s fine for ad hoc stuff, don’t put that in production code. Because what’s going to happen is you’re going to have this big, long query that has order by one at the end, and someone’s going to go throw a column at the beginning of your column list. Or change the table structure. Then it’s going to be a problem. We don’t like maintenance that requires us to go fix things later.
Evaluation order. So, I have a table called ID int. It has a maybe numeric column that’s a varchar 32. I’m going to insert three rows in there. One has a value of 900, two has a value of the name of this conference, and three has a value of 420.
Brent Ozar: Is that a weed joke?
Aaron Bertrand: No, uh, yeah, maybe.
Aaron Bertrand: If I try to do something against this maybe numeric column, I’m obviously going to have an error if I try to filter out the non-numeric rows first. So, select ID, maybe numeric times 10 from this table, where maybe numeric is greater than 500, I’m going to get an error. Conversion failed when converting the value GroupBy to data type int. That’s not because of the math. That’s because it’s trying to figure out if the string GroupBy is greater than 500.
It also fails if you filter it away first. This is the case I was talking about, where I say, “Well, I said, maybe numeric not like—not a number first.” So, this is a replacement for, isnumeric. Don’t use isnumeric. It’s terrible. Use string parsing. So, select ID maybe numeric times 10 from dbo.splunge, where maybe numeric, not like not a number, and maybe numeric is greater than 500. So, SQL Server still tries to do the greater than 500 before it tries to do any string parsing. So, not good.
This even happens if you try to stuff it into a CTE. So, I’m going to use a CTE first and I’m going to make sure that the only thing that can possibly come out of the CTE are the numeric values. We can test that. This is why I like CTEs, because you can test the CTE by itself first. Select these values from the table where the maybe numeric is not like not a string. I get these two rows, 900 and 420. That should work, right?
Brent Ozar: Mm-hmm.
Aaron Bertrand: So, with filtered as, select those two rows, select ID maybe numeric times 10 from filtered where maybe numeric is greater than 500. Nope, still fails. So, SQL Server, it isn’t forced to evaluate the CTE first. In fact, in a lot of cases, it will collapse that. Treat it like an in-line view and it will actually expand it into your query.
Try convert works if you’re on SQL Server 2012 or better. So, select ID may be numeric times 10 from dbo.splunge where try convert, maybe numeric is greater than 500. That works. I don’t know if that’s guaranteed to always perform the filter before it performs the expression. It still might not be a safe thing to do.
In older versions, you can use a case expression. So, case when maybe numeric is not like not a number, then maybe numeric end is greater than 500. What that does is for any row that doesn’t have a numeric value in that column, it will return ‘null’ inside of that case expression and compare null to greater than 500.Then it drops it out of the result. So, that works as well.
What’s in the CTE doesn’t really matter. So, even in this case, I’m doing the try convert, which should filter out those rows first. But it’s still, even though I used try convert inside the CTE, it still fails. So, it’s not going to work in all scenarios. This is why you shouldn’t use isnumeric. So, I’m taking isnumeric of a period, of 1e99, a dollar sign, 0d27 and a tab.
Brent Ozar: [Laughs] A tab.
Aaron Bertrand: Those all return isnumeric is equal to one. So, that’s why you shouldn’t use isnumeric. Terrible. Terrible.
Brent Ozar: Yeah I was going to say Andy says the same thing in Slack. He says, “scientific notation,” yeah that was what I figured too. Anything that even looks remotely like scientific notation will pass.
Aaron Bertrand: All right. I have a couple more demos for the query efficiently section. I know that people are going to use cursors. I don’t expect that everybody’s going to just stop using cursors because people like me say that they’re bad and slow, even though they are. I know people are going to keep doing that. But most people, when they’re using cursors, it’s a firehose cursor.
So, they start at the beginning, they loop through every row, exactly once and then they end, and they’re not doing where update of, they don’t need outer-scope. So, when you set up a default cursor in SQL Server, it has all this extra overhead, all this extra locking, that allows you to scroll backward and forward and skip rows, and do all this stuff. People aren’t writing cursors that do that. 99% of the cursors out there are just go through once and you’re done. So, if you have to use cursors, and I’m not judging you for using cursors, make sure you use the right options and I’ll show that.
Select *, this seems like an obvious one that I don’t have to say but I still find it in production code all the time. Don’t use select * in production. Aside from the performance, I’ll show you why, maintenance wise, you don’t want to do that.
Count of a table should not need to touch actual data. I see people all the time saying select count * from table. They lock the entire table and wait for that count to come back. I always argue that they should use the DMVs for that because the DMVs have a pretty accurate picture of how many rows are in that table at any given time. Not transactionally consistent, because it’s in memory and there are transactions and action.
They say, “Well, I need my count to be super accurate.” I say to them, “If your system is busy enough that the DMV is not accurate, here’s what you’re doing. You’re locking the entire table.” Right? Because the count is going to escalate to a table lock. You’re going to lock that entire table and you are going to scan the entire thing and count all those rows. If your system is really busy, what are all those other transactions doing while you’re scanning that entire table? They’re waiting.
So, you’ve locked the table, you’ve scanned that entire thing, you get your accurate count. Then you release the locks on your table. Then you release the locks on the table and all those transactions come in and insert and update and delete. How accurate is your count now? It’s not accurate at all. It would have been more accurate if you looked at the DMV for that point in time. Because the scan took time and a whole bunch of things happened after the scan finished.
Try not to use nolock like a blanket. I see a lot of people that have this Linux nolock complex where they actually have policies in their organizations to put nolock on every select statement. That’s bad and we know about all the little things that can happen with nolock. I’m going to show a really fun one.
Pick group by over distinct generally. That is because in some cases what’ll happen is distinct and group by will do essentially the same thing. They’ll eliminate duplicates and group by isn’t the most intuitive way to do that but because group by is used for aggregates, not for eliminating duplicates. But what distinct will do often is if you have an expression that happens and it’s eliminating the results of that expression—so, let’s say you are doing XML for XML path string concatenation. It will create all of the rows that have that string concatenation and it will do the concatenation every time for every row and then it will apply distinct. Group by will group the rows first and then only apply those calculations, once per de-duplicated row.
A few examples here. Cursor options. I have a couple of variables, sys.name, declare @i sys.name @d datetime2. Datetime2 is just to track how long it took to run this cursor. I’m going to declare C cursor for select name from sys.all_objects, cross join select top 50 sys.all_objects. This is just a nonsense query to chew up some time. Open the cursor, fetch into the cursor, while the fetch status is not equal to negative one. I’m just incrementing an empty string to a variable and I’m closing and de-allocating and measuring the amount of time. Pretty simple thing. So, we’ll see how long that takes. Brent, do you have any jokes? You must have a joke.
Brent Ozar: Golly, you know, I don’t have any database jokes. There’s always the two DBAs walk into a bar, a NoSQL bar, and they leave because they couldn’t find a table. That’s really bad. That’s all I got.
Aaron Bertrand: So, that cursor took 8.4 seconds to run. Again, all it did was just loop through the values in the row and do something silly. Now, if this cursor is like most cursors, it can be a local fast forward cursor which means it’s not visible globally. It means you can’t scroll and it means you can’t use where update of. We run this exact same code and I’ll drop the clean buffers again just to make sure I’m not fooling you by benefiting from having sys.all_objects data in memory. We run that code again and all I changed was local fast forward on my cursor. It finished in 1.5 seconds. You wouldn’t even have had time to tell your joke.
Brent Ozar: Dang it. I just looked up a joke.
Aaron Bertrand: So, I quadrupled the efficiency of this query just by adding those things to the cursor.
Schema stability. When we have select *—we have a simple table with two integer columns. We insert two values, one and two. Now we have a view that says select * from dbo.X. Now I can reference the view and get all the rows from that table. Someone comes along and says “Well you know what, that column B, that really should have been called C.” This doesn’t happen often, but it can happen. Then they say, “What B really should have been is a datetime column. Then we’re also going to add this other column D, which is new ID.
I get a warning when I change that column. The person who wrote the view didn’t get that warning, right? It doesn’t send out an email to those people. If I select from the table, I’ve got my four columns now: A, C, B and D. B is a datetime you’ll notice and C is an integer that has what used to be in B. But if I select from the view, I get a very different thing. These are supposed to reference the same objects. Select * from the table and select * from the view and the output looks very different. Even though the view should really just be a pointer to the table. But what it ends up being is a pointer to very specific metadata from when the view was created, what the table looked like at that time.
That’s not good. Now your application breaks because its expecting B to be an integer but it’s actually a datetime. You can fix that with sp_refreshview or sp_refreshsqlmodule. Now when I select from the view it’s actually correct. Remus, who works at Microsoft, he showed me this little trick a long time ago and I use it in this demo all the time. If you want people to not use select * against your table, you can add a computed column called “don’t use select * as one over zero.” It allows you that to the table but if I try to select star from dbo.X, I get a divide by zero error encountered. It doesn’t tell me why but if I look at the results I get this column called “Don’t use select *,” which is a pretty neat trick. Don’t do that in production. That’s fun to do to your developers.
Brent Ozar: For the record, Remus is a developer.
Aaron Bertrand: Yes true. So, select count * from table name. This is obviously going to scan the entire table even if you use nolock. It’s far more efficient to just look at sys.partitions. Or sys.dmdb_partition_stats. Those will have a count that’s going to be pretty accurate especially if all you’re doing is ball parking. So, nolock. Let’s clean up the cache and the buffers. We all know with nolock that you can read a row that never existed. You can read the same row twice. You can miss a row entirely and you can get this ‘could not scan due to data movement error.’ Paul White showed me this example once, and I forget where he posted this. We have a table that has an int column and a varchar max column. We’re going to insert one row and it’s just 16,100 Xs, that’s the value that’s in the table.
Brent Ozar: It’s pretty dirty.
Aaron Bertrand: Now, in a different session, I’m going to run this code. This actually requires me to take this and open a new query window. So, I’m just declaring the value that I read. So this is just a variable that’s going to hold an assignment from the table. Then I have two values, one is 16,100 Xs and one is 16,100 Ys. I’m just going to run in a loop and I’m going to alternate between Xs and Ys—I’m just going to read whether the table has all Xs or all Ys in that value. If we go back here, and in this session what I’m going to do is something very similar. I have the same declare values, so all Xs, 16,100 Xs and all Ys, 16,100 Ys. I’m just going to run in a loop and I’m going to update the table back and forth.
I’m going to switch it from all Ys to all Xs, to all Ys, to all Xs, and just alternate back and forth. We’ll see what happens. So, I’m going to do this and my disks are going to spin up and you probably won’t hear anything because it’s quiet. We go back to this new query window I opened. Now I’m going to run this and what this does is if I ever find a value in that column that I’m alternating in this other thing—because I’m reusing read uncommitted, I want to see if I ever see a value that doesn’t quite match what I expect to be in there, and shouldn’t be possible. If an update is an atomic operation, I should never see a mix of values.
Brent Ozar: Some Xs and some Ys, right.
Aaron Bertrand: Yeah. So, what this does is it breaks as soon as it finds a value that has some Xs and some Ys, that isn’t all Xs and all Ys.
Brent Ozar: That’s genius.
Aaron Bertrand: It happens right away. So, if we scroll over at some point—yeah, so you see the Ys. What we did is we actually—because we’re using nolock, we actually read a single value that never existed. Because what happened was SQL Server was updating the value to be 16,100 Xs and it finished one page and we read that, and then we read the next page before it got to it, before the update proceeded to the next page. Or we read the second page first. So, it’s hard to tell what actually happened behind the scenes and you could if we really dug in, but that’s kind of out of scope for this. Anyway, it’s kind of impressive that you can read a value that is essentially data corruption. That’s not the data that was ever in the table.
Brent Ozar: Nice.
Aaron Bertrand: If you have an email, if you’re reading and sending out emails to customers and you’re using nolock and you have a lob for the message body, you can end up with some pretty interesting things. You could pull a message for your Consumer Report subscriber and mix it with your Penthouse subscriber, and that would be some interesting…
Brent Ozar: That’s where your Xs and Ys are coming from, I see.
Aaron Bertrand: All right, distinct versus group by. I’m going to pull these up in Plan Explorer. Basically what these are, are just two queries that do the same thing. They do XML path concatenation, they use the same methods for everything, except one uses distinct and one uses group. This is the problem with having both task bars on the bottom. So, we go in here and we have these two queries in here and we take a look at the results, and “This demo failed because…” Yeah, so this is doing the same amount of reads now.
In some cases when you run this, I think it might be because I’ve done this before and it cached a plan, but in some cases I’ve done this and it actually ended up doing a worktable. You’ll see worktables pop in and out of certain queries, depending on how many times you’ve run them. The first couple times you might see a worktable and the other times you won’t. Where the extra reads, it did 10 times more reads for the distinct, and what it is, is from the worktable that it used to generate the X amount concatenation over and over and over.
Brent Ozar: I was going to guess a database with a different CE was what I was going to guess, but…
Aaron Bertrand: That’s possible. I’m not sure. But anyway, a lot of times you’ll see these two operations are the same. But you will see cases where they’re different, where distinct will actually be more expensive.
Then I have one more thing about querying efficiently. That is the “Kitchen Sink” procedure. I’m not going to demo this because I know I’m over on time, so I don’t want to go through the demo. But basically you have this massive search form where people can enter 50 different criteria in any combination of criteria. What people do is they try to optimize—they try to create a stored procedure that takes all these optional parameters and they have where column is equal to—where parameter is null or column is equal to parameter.
They have 50 of these, and they expect SQL Server to build some magical plan that can cater to all combinations of parameters, and that’s a terrible idea. SQL Server is never going to be able to build a single plan that will satisfy all of those different criteria. Adding option recompile doesn’t help because it still builds the same plan shape even if you have different parameters that are passed. So, my solution to that is to use Dynamic SQL. When you have a certain set of parameters that are passed into the stored procedure, you get a specific plan built for those specific parameters. Not even the parameter values, but just those parameters.
So, the first time you call it—let’s say someone says, “Well, I want to search on account opened is between…” I wouldn’t use between, I would use greater than or equal to June 1st and less than June 15, let’s say. So SQL Server builds a plan that says, “Okay, I’m going to use the non-clustered index on the datetime column to find all the rows that match the query. Then I’ll go get the other columns.” So, it builds a plan based on that index seek.
The next time someone runs the stored procedure like, “Oh I have a plan for this stored procedure already,” but they passed in last name. Now you’re trying to use an index seek on a datetime column and finding all the rows where the customer’s last name is equal to Fu. That doesn’t work so well, right? So, you’re reusing that same plan over and over again even though the parameters have changed. My solution is to use Dynamic SQL, where you build a statement based on the actual parameters that were passed. You can do this, and I have a blogpost and a video about it that are linked in the notes. It’s pretty simple to do it’s just, it’s Dynamic SQL so it makes the…
Brent Ozar: Builds a perfect query plan for whatever combo of parameters you pass in.
Aaron Bertrand: It builds a perfect plan for exactly what combo of parameters you have, right. If you use optimize for ad-hoc workloads, you don’t have to worry about the odd guy that specified all the parameters once taking up valuable space in the plan cache because optimize for ad-hoc workloads will allow you to only cache a full plan for a query that’s been called more than once. So, you’ll have all these separate varieties of execution plans that are tuned specifically for the parameters that were passed.
Now to deal with the values, you can add option recompile to the Dynamic SQL statement that you create in each case, so that you aren’t bitten by parameter sniffing. So, you’ve solved two problems by using Dynamic SQL for that. I’m going to skip the demo for that.
The last thing I have is stay employed. And I always get a couple of laughs out of this. So there’s no undo or control Z in SQL Server. So, people will go in and they’ll delete from some table and they’ll leave out the where clause or they’ll forget the where clause and they deleted six billion rows. The next thing they do is, they go refresh their resume.
So, Mladen’s SSMS Tools Pack allows you to modify the new query template. So, every time you click ‘new query’ you can have it pop up this, which is begin transaction. Then the cursor goes, the cursor will be placed right in between the begin. Then you have commit and roll-back commented out. So, what you can do is if you pop up a new query window and you write something really quickly and hit execute and you made a mistake, now I can highlight roll-back, and roll it back, and possibly not get fired. I might lock the table enough where I’ve disrupted the application and the users that I still—maybe I only get a dock in pay or something instead of actually getting fired.
That will at least allow you to avoid those problems where you have to go to backup to fix the problem. There is an issue though, you can cause a different problem and that is you can say begin transaction, delete 10 rows from some table and it works, and you’re happy and its offline. You go to lunch or you go on vacation, you forgot to highlight ‘commit’. So you can have this dangling orphan transaction that never gets committed and that can cause a resume changing event as well.
Brent Ozar: Agreed.
Aaron Bertrand: So, that is all I have. Like I said, I have the demos and the slide deck up on GroupBy.org. You can download them right from my session. And I saw Adam said that those queries are not really logically equivalent, “Distinct is on both columns whereas your group by only on one.” The same thing happens if you say group by id column, comma, and then the output of the concatenation.
Brent Ozar: Before we wrap up here, I want to thank Aaron for spending his time this morning working with you guys and teaching you about bad habits in T-SQL. Normally I would ask for a round of applause, but of course, virtual applause is better here. So, feel free to go get Aaron’s stuff. It’s available now at groupby.org/go/session9. You can leave comments over there to ask him questions as well. Thanks, Aaron, for teaching everybody this morning. Appreciate it.
Aaron Bertrand: Thank you