Enjoy the Podcast?
This session is for users who are new to writing T-SQL.
Are you new to SQL Server and not sure where to begin with T-SQL? Does it feel like a foreign language? The objective of this session is to go over simple T-SQL Statements and to show you how to build on them. We will focus on simple select statements and translating English into T-SQL. Once you have the Rosetta stone, you will find that T-SQL becomes like second nature. We will also focus on adding a simple where clause, order by and join. We will also create a simple backup script and learn about the built-in intellisense in SQL Server Management Studio.
Why I Want to Present This Session:
I have a lot of people ask me where to start with T-SQL. Because SQL Server Management Studio allows you to do most things through the UI, writing T-SQL is becoming a lost art. Developing a basic understanding can greatly improve how your daily processes run.
Brent Ozar: So in this session of GroupBy, Andrea Allred, who, for those of you who are just listening to the audio, you don’t know that she’s wearing a princess costume. That is your loss; you should go see the recording on YouTube or watch it live. Andrea Allred is going to be talking through T-SQL for beginners, a new hope. So take it away, Andrea.
Andrea Allred: Thank you, I’m super excited to be here today. The reason I am doing this class is because I want people who are brand new to SQL to understand that it’s super easy. It works just like English, once you figure out the Rosetta Stone, and how to translate it back and forth. So, just so that we make sure we have all the right people here, this is for people who are new to T-SQL. We’re going to be covering just a few basic things, like SELECT, FROM, WHERE, GroupBy, HAVING, ORDER BY and how to do a basic backup; how to actually script it out and do simple backups.
So it should be super fun and, in my mind, super easy. So this is me…
Pinal Dave: And I like it when you say GroupBy, that’s fantastic. So let’s contuinue…
Andrea Allred: Yeah, GroupBy’s the best, right?
Pinal Dave: Exactly.
Andrea Allred: This is me, I’m going to let you read my slide on your own, but I blog at RoyalSQL.com, that’s the reason for the dress. I’m also on Twitter at RoyalSQL and feel free to reach out to me, I love to talk to people. If you see me at an event, I love to take pictures, so please, please, please come take pictures with me, it’s super fun. I’m married; a lot of people don’t realize that I’m married. I have an awesome husband who dresses up as a prince, so I feel really lucky about that. So he indulges a lot of the princessness, so…
Alright, so let’s get started. The first thing we want to talk about is, what is T-SQL? SQL stands for structured query language; also when you’re re referring to T-SQL, it’s transact-structured query language. That is the Microsoft standard, so it’s Microsoft specific language. Pl-SQL is Oracle’s. So just a quick introduction on that…
So I want to start at the very, very, very beginning. T-SQL is a lot like a spoken language; so once you flip that switch in your brain, it will be super simple. When I was first starting to learn T-SQL, I started to dream in it, and that’s how I know that I’m learning a language is when I start to dream in it. And I went through and I cleaned my entire house using T-SQL. So I was like updating all the cans in my cupboard to make sure they were at the right date, I was going through and truncating my trash. I was selecting all from the clutter and I was moving it to different areas; the whole thing was in T-SQL. And I woke up the next morning completely exhausted and I still had a messy house, so I was kind of disappointed, but I knew that was how I was learning SQL.
So the basic commands that we’re going to be dealing with today are SELECT, FROM, WHERE, GroupBy, HAVING and ORDER BY. S SELECT is what I want. FROM is where I want to get it from. WHERE is only if it matches what I want. GroupBy is what I want it to summarize for me. HAVING is when it’s used after a GroupBy to do more filtering. And then ORDER BY is to sort it the way I want it.
So how is it like English? The easiest way for me to start this is to give you a sentence. “I buy milk from the store if it is chocolate.” So my SELECT statement, my SELECT would be “I buy milk”. My FROM would be “from the store”. My WHERE is “if it is chocolate”. So my SELECT statement would look like what I have on the screen here. SELECT milk FROM store WHERE flavor equals Chocolate. It’s super simple.
So let’s get into some more details on our SELECT statement. With SELECT, you want to use it wisely, especially if you’re playing with SELECT star. I also wrote a post on this on RoyalSQL.com, so feel free to go check that out. SELECT star is a quick way to see all of your fields, but you want to be super careful with it. I like to do a SELECT TOP five when I use star, just so I’m not pulling back too much data and I’m not filling up my buffer cache or kicking things out. When writing code for reports, I prefer – when I’m writing code or when I’m writing stuff for reports, I like to list out my fields specifically. This way, if something new is added to the table, I’m not accidentally getting a field that I’m not expecting. So that also helps me to know what I’m going to be getting back every single time. And then the final thing with SELECT is you want to know what you’re selecting from, so you know how many rows to expect, so you’re not getting too much data back.
Okay, with FROM, you’re going to be picking from a specific table. You will also want to include your schema; dbo is the default schema. So you can see dbo.Users is the table that we’re going to be using in our examples. There’s other schemas, there’s like the person schema, the context schema. Your company may have a specific schema that they’re using, so you want to always specify the schema when you’re using your table. Also, when you have a schema and a table, you can have – like I have a DBA schema, I can have a table that is like user in the DBA schema, and I can also have a user table in the dbo schema. That’s why it’s super important to always include your schema in you SELECT statements or in your FROM clauses.
You can connect to multiple tables using JOINS, and we’re going to cover that just a little bit later. And then this is the basic SQL statement that I’m going to be starting with, this SELECT TOP five from Stack Overflow.dbo.Users. Stack Overflow is my database, so I’m fully qualifying here, I’m specifying my database, my schema, and my users.
So let’s go ahead and take a look at how this looks in SQL Server. I’ve run this first query – this is the SELECT TOP five, you can see I’m using that TOP five so that I can get the data back quickly. But if you look at this table, I have a whole lot of information here that I don’t really need. So I’m going to go ahead and filter this down a little bit with my SELECT clause, and I’m only going to get a few columns back.
So you can see, I changed it to TOP ten because I am bringing back more columns. So the columns I am listing out here now are display name, location, reputation, and up-votes. So I’m SELECTing with my specific columns FROM, here’s my database name, my schema and my table. Now, if I didn’t want to specify my database name, I just need to be sure to SELECT my database name from here and it will automatically go to the Stack Overflow database and I can remove my database name. Do we have any questions so far?
Pinal Dave: I think everybody’s saying that they really like it. How you are presenting this session may be a benchmark for a lot of people to learn about T-SQL, the way you explain in the simplest possible words. I love the English part really, really very well.
Andrea Allred: Oh good, okay, then I’m going to go ahead and keep going. So we’re going to flip back over to our presentation here and we’re going to add in the WHERE clause. So, WHERE is a way that we can go through and filter out that data that we don’t need. It’s going to help us to improve performance by not bringing back too much data. The WHERE clause that I’m going to be adding is going to be WHERE display name equals Anon. It’s similar to the name Anonymous, but I found that there are a lot of people that like to use the name Anon. so I’m going to be using that so that you can see how it filters out the data. Also, there’s going to be – this is the way that you can make it just a little bit more useful. So with a WHERE clause, what you’re going to be looking for is your column, your operator and what it is being compared against. So with this, I’m going to use display name, that’s my column. My operator, in this case, is going to be an equals, because we’re starting with something really simple, and then what I’m, comparing against is that Anon.
So let’s go ahead and flip over and take a quick look at how that’s going to work. See, this is my WHERE clause, and you can see I have the exact same SELECT statement that we were using right before. And now you can see that I’m only bringing back people that have the display name Anon. there are about 900 rows that come back with this if I don’t have the TOP on there; so that’s why I’m leaving the top on there for now.
But I want to show you more operators because you’re not always going to be looking for something that is equal to. So yes, we have the basic equals operator, so we have WHERE display name equals Anon, or we can have the does not equal. There are two operators that can be used for does not equal, it’s the greater than/less than brackets. That’s my preferred way to do it. There’s also the exclamation point equals. Both of those will work in T-SQL. So the WHERE clauses here would look like WHERE display name does not equal Anon, and also WHERE display name does not equal Anon.
A few other basic operators that can be used is the greater than, it means that this column has more than what’s in our – what we’re comparing against. So WHERE up-votes is greater than a specific number. You can also use dates for the greater than and less than. Along with greater than, we have greater than and equal to. The difference between greater than and greater than and equal to is that you will get that specific number back and anything above it. If you don’t include the equal to, it will SKIP that number and just go above it. It’s the same thing with the less than and the less than and equal to. If you include the equal to, you will get that number in your results. If you don’t include the equal to, it will just be everything below that number.
And finally, LIKE. LIKE and NOT LIKE are fun… Oh wait, we’re going to go back before we do this, because I want to show you just our basic operators before we get into this, because I like LIKE, like a lot. So here you can see, we’re using the does not equal to, and you’ll see that there is nothing with Anon returned. We will get Anonymous user, because we’re not using – it has to exactly match what is in between these quotes here. And I’m also going to show you that I can use the greater than/less than operators, the does not equal, and that works exactly the same.
Pinal Dave: I have a quick question, [inaudible] any reason for that?
Andrea Allred: It’s the way that I learned, and so that’s why I prefer it that way. It’s also the standard here where I work, and so that’s why I stick to that one. A lot of people – I don’t know if the does not equal was an older way of doing it or was from a different language or something like that, but when I use the greater than and less than together, it just means does not equal better for me than the bang equals. Do you have a preference on the two?
Pinal Dave: No, used to write this way only, but as soon as I start blogging, initial time of the blogging, this one was getting converted to some ampersand number, so I went with an exclamation mark and equal to. But I personally love this one because of what you just said, it’s greater than or less than like [inaudible] so it’s really not equal to. So this particular sign really conveys the message, but other one, you have to learn it. So I really love your explanation.
Andrea Allred: I feel more comfortable with it. Visually I can look at it and go, okay that does not equal, but the exclamation point being the same thing, I had to learn that before it made sense to my brain.
Pinal Dave: Absolutely.
Andrea Allred: Okay, so here is where we are doing the greater than, and I am including a specific number so that you can see that that specific number is not showing up in our up-votes list here. That’s because I didn’t include the equals to, it’s just greater than. But now if I throw on the equals, you will see, with the same query now, we get our number right there. So that’s why it’s super important to understand the difference, because you may be missing results if you’re not using the equals to.
Now I’m going to flip over to the less than and show you the same type of thing. So our up-votes are less than 45. Oh also, have you guys ever figured out how to tell the difference between greater than and less than? This was something I learned in elementary school. It looks like an alligator and the alligator’s super hungry, and so the alligator’s going to eat whatever is bigger. So with greater than, he’s going this direction and with less than he’s going that direction. I don’t know if that made any sense, but it always made me laugh.
Okay, so if I run this one, which I think I actually – yeah I have already run it, you will see that the 45 is not listed here at all. But if I add my equals to again, my 45 will show up, yay. Super great way to keep track and make sure. So now we’re going to move onto the super cool operators, which I really like. I like having the additional options; the problem with super cool operators is they do have more of a performance impact, so you’ll want to keep that in mind whenever you’re using them.
LIKE is fascinating to me because it’s everything that is similar to these first four letters, and then whatever could possibly come after the percent sign. Now you can put this percent sign anywhere you want within this word. You can do it at the beginning, and I’m going to show you how this works. So if we put it at the beginning of Anon, it will be everything that will end with that Anon. if we put that around the Anon, it will get everything that has things – that have the word Anon in the middle. So it’s really exciting. NOT LIKE works just the opposite of LIKE. It’s going to bring back everything that doesn’t look like what you have in between those brackets – not brackets, I can’t think of the word… Quotes, between your single quotes.
Alright, also there’s IN. IN is where you’re going to create a list, so as you can see here, I have Word Display Name IN, and I’ve created a specific list. So my results are going to look like anything that is included in this list. With NOT IN, it’s just the opposite. It’s going to give me display name where it doesn’t look like anything in this list.
And finally, BETWEEN. BETWEEN is awesome. I use it mostly for dates, but it also works with numbers. So with BETWEEN, you’re going to use the words BETWEEN, a number or a date, and, the word and, and then another number or date. So your results are going to come back with anything that falls in between this range. And when you do NOT BETWEEN, it’s everything that is not between that range.
So now let me show you how it works. So with LIKE, this is exactly what I was talking about with the demo, we’re going to have that percent sign at the end, and I’m going to go ahead and execute it. You can see that all of our display names are similar or start with Anon. We can get the exact word, or we can get a whole bunch of different names that look very similar to that that start with those first four letters. I’m going to take out the A and I’m going to throw a percent sign at the beginning. So now it’s going to bring back anything that has Non in it, so we will still get like Anonymous user, but we’re also going to get things like Brannon and Josh Gagnon. And if I have it only looking for things that end in that, we will see that our Anon does show up again, but again, we also get names that just end in that.
There can’t be anything after that Non in order for it to show up, so it won’t show up like Anonymous user because that would be in the middle of the word. Let’s do NOT LIKE. So you can see the exact same things. So now you can see that we get everything – our results coming back don’t end in Non at all. If we move this percent sign around, we will see that our results don’t include anything that has Non in the middle, and if we remove it off the front, we will see that we no longer get that Anon user.
So now let’s play with IN a little bit. Here I have my list, and you can make this list as big as you want really, but I don’t recommend it. If you’re going to use a big list, you will want to join to a table, so I will show you how to do – that will actually be in a different presentation, how to join to a list, but you can take my word for it right now. This is just a simple way that you can create a fast list. So with – you can see that in my results, I’m only getting back things that are listed in my list in that IN clause. So I am getting Anon, Anonymous and Anonymoose. I think that one’s really funny, that one was one of my favorite ones. Now, if I add the NOT IN to my WHERE clause, you’ll see that I don’t get anything like Anon, Anonymous or Anonymoose. I do still get Anonymous user though, that’s because that was not specified in my list.
Now let’s play with BETWEEN. Again, you can see that I have my WHERE clause, here is my column name, my operator is the BETWEEN, here is the start of my range, my AND is in the middle and here’s the end of my range. So we’re looking for anything that’s between 5000 and 5045. Something to remember with BETWEEN, it does – it works very much like the greater than and less than. It does not have the equal to so if you have like 5045 for the top of your range, it will not show that number, and also with the 5000 at the bottom of my range, it will not show that number. It’s just things that are between those two numbers. So if you want those numbers included, you need to adjust your range accordingly.
So you can see in our results that we got everything that’s greater than 5000 but less than 5045. Now, if I use NOT BETWEEN, it’s going to be everything outside of that range. So you’ll get numbers that are lower and that are higher than 5000 and 5045. Do we have any questions so far on the operators? I’m hoping now, I’m going to keep going.
Drew Furgiuele: Yes, it doesn’t look like you have any questions so far but I’ll keep an eye on it.
Andrea Allred: Okay, great. So let’s add in GroupBy because it’s awesome. GroupBy gathers similar data together, it’s used to create aggregates, and it’s awesome for totals, averages, counts, min, max. The GroupBy columns, you’re going to be grouping by the columns you don’t want to sum or average or count. You can have multiple columns in your GroupBy, and I’m giving an example right here of what a GroupBy will look like, so we’re going to be grouping by display name and location. So here you can see we have our GroupBy by display name and location. Those are the only two normal columns that we have up here, and you can see that I’m going to be counting multiple columns here so that you can see the difference between a sum and a count.
Count is actually a function that is in SQL Server, and I’m going to show you a couple other functions that work similar to it, that also work in SQL Server. So our results are going to be showing us all of the Anons that I have listed, and their location, but it’s actually grouping them also. So you can see that these two columns here, the up-votes and the ID are now being counted. So we can see that we have 356 users that are called Anon that have not specified a location.
Now, maybe you’re wondering why we have India listed twice here. This is because this specific database is case sensitive, and so it is looking at the capitals and the lower case, and because those are spelled out differently – not spelled differently, but they have a different upper case and lower case, it is separating them out. So you can see that we have 356 users, so if I change the count of up-votes to a sum, you’re going to see how many votes are actually there. So for the up-votes, we have 809 now for Anon, but we’re looking at our top row here, and again, we’re still grouping by display name and location.
Let’s say I just wanted to GroupBy the display name. I can remove location, but I also have to remove it from my query at the top too, otherwise we will get an error. And now I’m only getting one row back, it removed the location from the GroupBy and you can see that our sum and our count did increase because it summed everybody together. I’m going to go ahead and add location back in, and I’m going to change this sum to an average. The only row that this average really changed for was the first one, and the reason being is that there are 356 users in there, and some of those users have zero votes, some of them have a lot of votes, and we’re going to show that in just a second, but you’ll notice that down here, they didn’t change at all with their average, because there’s only one user there and there is nothing for it to be averaged by, this number, the nine is being divided by a one. Whereas the count – the sum that was up here before the 809, is being divided by 356, and that’s how it’s getting the average. I love that SQL does all this cool stuff without me having to think about it.
So now we’re going to look at the min, and we’re going to see what the lowest amount of up-votes is there, and see, there are Anon users that have zero up-votes, that’s why the average is so different. Now we’re going to get the max. I want to see what the ma number of up-votes is for that Anon user. And here we can see that one of them has 148, that’s why it’s keeping the average higher, is because some of them have something like 148 and some of them have zero up-votes.
Now we’re going to move on to HAVING. I love HAVING. I learned HAVING right after I learned GroupBy, and it helped me solve a massive problem that I was trying to figure out. The nice thing about HAVING is that it works with GroupBy, but it allows you to filter using the things that you have just grouped by. Like a WHERE clause can’t filter based on a count or a sum, but HAVING can, so it gives you that opportunity to filter more. You can have multiple clauses in your – or multiple columns in your HAVING clause, and I’m giving a quick example of HAVING right here on the slide. So HAVING count ID greater than 50. So we’re using operators that we’re all familiar with, we’re using a function which is the count that we’re familiar with, and we’re using our ID column that I’m going to show you.
So here is what my HAVING clause looks like, we’re going to be summing our UP-VOTES, and we’re looking for things that are greater than five, or up-votes that are greater than five. So you can see we only get back three columns. We removed everybody that had zero up-votes and now we’re only seeing these three that had more than five. If I wanted to change this number, say to 11, it would remove those other two rows.
Let’s say that I wanted to add additional filters into my HAVING clause. I could absolutely do that. I would add an AND, and I want you to notice that right here, Intellisense is trying to help me, so you can see that it’s saying, oh hey, are you trying to use the ID column, well, here it is, and I can hit Tab, and it will automatically prefill in that column for me. I know it was just two letters, but with larger columns it becomes super helpful. So now you can see that it only brings back one row, and if I remove that you so can see the difference, you’ll see it bring back three. Super fun, super easy. That’s what I love about T-SQL.
Pinal Dave: This reminds me one thing like couple of time before when I was doing the HAVING, one person just told me that HAVING – you don’t ever have to use WHERE clause because you can just use HAVING, and I had a lot of difficulty explaining that there are two entirely different thing. HAVING can do your stuff once in a while, but you are using something else for something else, but the way you explain, I think it totally makes sense.
There is one more question also on the chat going on, so I just want to say, how the BETWEEN works, so I would answer on your behalf, that BETWEEN is inclusive and it definitely includes the both the values which are there and it also supports integer as well as state and some of the other values. So now let’s continue, I just thought I would just give a small interruption and I just clear out all the questions which we have. So fantastic, let’s continue princess.
Andrea Allred: That’s awesome, thank you. Yes, absolutely. With HAVING, you don’t want it to be your only filtering out clause, and I’m going to show you that with the order of operations, because the way that the order of operations works, HAVING comes a lot later than a WHERE clause, so you’re going to be bringing back more data to process when you’re filtering with a HAVING clause.
Alright, so let’s move to ORDER BY. This is how we can sort out data, and a lot of people like data to be sorted so it makes sense to them so that they can see it. When you’re sorting your data with an ORDER BY, it automatically assumes that you want to sort the column ascending, which means like the lowest number to the highest number or A to Z, or the oldest date to the newest date. You can change that by specifying DESC or descending at the end of that specific column that you’re sorting on. You can have multiple columns in your ORDER BY, and the ascending and the descending can be done by column. It’s not just for the whole thing.
Let’s play with ORDER BY. Because ORDER BY comes at the very end of order of operations, I can actually ORDER BY account, which is right here. Something that I created, it’s a summed column, it is something that wouldn’t be able to be filtered out in my WHERE clause. I’m going to show you that first that I can do that, and so you can see that it’s going lowest to highest. Now if I add the descending, it’s going to go highest to lowest. But say I want to sort on different columns, I don’t want to just sort on the count. I would rather just sort – I would rather use the display name. You can see I’m using my Intellisense here, and let’s also do my location. Now again, it’s assuming that I want it ascending for both of those, it will do display name first and then location second because that’s the order that I have them listed in.
So here I’m getting – and you can’t see it because it’s all Anon, so we’re going to change this to a LIKE. So now you can see that I have Anon, and then I have others, but they’re all in alphabetical order. Now, why did this one pop up down here? Well, it’s because it has a period next to it, and that comes later than a space, so that’s what’s moving it down. We’re going to change this to top ten so that I can speed it up a little bit, and now let’s say I wanted to flip my display name and my location, and I want to sort by location first and then display name. NULL comes first alphabetically, let’s add on to my WHERE clause where location is NOT NULL. So now you can see why did that come up with blank? Well, the reason that came up with blank is that this is actually a space, not a NULL. They are different in SQL, so you want to remember that space and NULL are very different in SQL. It’s going to put the space first, and then I’m going to be getting it alphabetical by location, and then it will go back and sort the display names second.
What if I flipped these and I added like descending to my location but not my display name? Well, it’s going to flip my location, but my display name will still be going ascending, and if I add descending on to my display name, you’re going to see those flip as well.
Drew Furgiuele: Choo choo, here comes the SQL train.
Andrea Allred: Yay.
Pinal Dave: [inaudible] for staying near the railway station, but I think your queries are running faster than the train, which is passing by my home. I tried to click on mute a couple of times, I miserably failed and I think that’s why GroupBy, it shows the real side of us.
Andrea Allred: I think it’s fantastic and I really, really liked it, so no worries there. Okay, does anyone have any questions up to this point?
Pinal Dave: All questions are answered, but one – you have been presenting in such a way that questions are being answered as soon as they are being asked, so that’s good. But one comment somebody made in between about does not equal to, they said the one which we discussed about less than and greater than is common across all SQL language, whereas does not equal to is only supported in portion of the SQL Doug Stevens said, so I think that’s the reason – I think since he also prefers the same one which we like, so that’s awesome. I didn’t know this trivia and it’s very fantastic. That’s good.
Andrea Allred: Yes, this is awesome. I always love this because I learned from these and it’s like yes, it’s fantastic. So I have better answers for next time.
Drew Furgiuele: There was one other question that I’m going to wait until the end to bring up to everyone, because I think it’s probably open for discussion, but keep going, it’s a really good question, and keep your questions coming in by the way.
Andrea Allred: Okay, awesome. Alright, so we’re going to talk about the order of operations. This is why it’s so important to have your FROM and your WHERE clause set up properly, because those two are evaluated first. That means that when the SQL engine goes through, it’s going to look at your FROM clause and your WHERE clause so anything that you can limit down between those two is going to mean that less data is being read and less data is being brought back. Then it’s going to evaluate your GroupBy, then it’s going to evaluate your HAVING, and that’s where it’s going to filter out a little bit more, then it will do your SELECT. So this is why you want to keep your SELECT short, with just the specific columns that you need, because it’s happening so much further down that you want to make sure that it’s short and then finally it will do your ORDER BY. That is why I can put count and things that I have aliased in my SELECT statement down in my ORDER BY.
Now we’re going to do joins. I like joins. They are the magic of T-SQL, honestly. So there’s four kinds of joins that we’re going to talk about today, we’re going to talk about an INNER JOIN, and that’s only if it matches in both tables, that’s what it’s going to be returning. A LEFT JOIN is going to bring back everything from the left table and only what matches in the right. With the RIGHT JOIN, it’s going to be just the opposite, everything from the right table and only what matches in the left, and a FULL JOIN is everything from both tables.
Now, when you do a LEFT JOIN and it’s bringing back everything from the left table and only what matches in the right, that means you’re going to be getting some NULLs on the right side. With the RIGHT JOIN it’s just the opposite, you’re going to be getting some NULLs on the left side, and with the FULL JOIN you’re going to be getting NULLs on both sides. I’m going to show you all this.
So this is what an INNER JOIN looks like if you have a specific visual for it. So remember, it’s only bringing back results that match in both tables, and you’re not going to see what’s eliminated. Those NULLs won’t show up and you’re not going to know what rows you’re not getting back, so that’s why usually when I’m playing with a SELECT statement, I’ll play with an INNER JOIN and I’ll play with a LEFT JOIN, so that I can see the things that are not coming back and that I’m missing. This is an example of how we would write an INNER JOIN, so we have our FROM StackOverflow.DBO.users, and I’m putting the letter U at the end of this. This means that I’m aliasing this table, and I’m going to be showing you why that’s important in just a second. Then I have my INNER JOIN StackOverflow.DBO.badges table, and again, I’m putting a B at the end of this. This is my alias, so that when I go to do my ON clause, which I’m doing next, this is how I tell it what two columns are the same so it can connect the two. U stands for my users table, and B stands for my badges table. So I can say U.ID equals B.UserID instead of having to do the whole DBO.users, or DBO.badges. It saves me a lot of time and a lot of typing, and it also makes my code look cleaner.
So let’s look at what a LEFT JOIN would look like. It’s going to be bringing all the rows in from the tables on the left, and only what matches on the right. Left OUTER JOIN and LEFT JOIN are the same thing. So if you’re lazy like me, I usually just type LEFT JOIN, I used to type left OUTER JOIN when I was first starting, but it will assume the same thing. The same thing happens with the INNER JOIN, and I SKIPped that but it was on the slide.
With the left OUTER JOIN, you are going to see NULL, so let’s hop and take a quick look at what our LEFT JOIN would look like. So we have our FROMStackOverflow.DBO.users, and it looks exactly the same as our INNER JOIN, except instead of the word inner, we have LEFT JOIN. Any questions so far? I’m going to keep going to the RIGHT JOIN. So with the RIGHT JOIN it’s the same as the LEFT JOIN, and usually the only time that I use the RIGHT JOIN is if I don’t want to rechange the order of the tables that I have been typing them in and I suddenly realize those should have been reversed. I should have had this table above this table, and so RIGHT JOIN, the only time I use it is when I’m being lazy. So if you see a RIGHT JOIN in my code that’s why. Usually I will use LEFT JOINs, but that’s just a personal standard. Do you guys have an opinion on that with the right and left?
Pinal Dave: Actually I use a LEFT JOIN only because I think that’s how I just read it. I just read left to right when I’m writing SQL or any language or any words, so I use LEFT JOIN all the time, but I think I prefer with myself because I see the execution plan, I think from right to left, so I think I’m just saying two different things right now, but about join I use LEFT JOIN. Let’s see what Brent and Drew says.
Brent Ozar: I was just going to say it’s like explosive over in Slack. People are like I always type it that way, I never type it that way.
Drew Furgiuele: Yes, battle lines have been drawn.
Brent Ozar: Drew’s on unfortunately the wrong side of the battle, but we will continue.
Drew Furgiuele: I feel like wrong is not the right word.
Brent Ozar: Losing. Losing is the better word.
Andrea Allred: For me, a RIGHT JOIN is when I’m being lazy honestly. I usually will use a LEFT JOIN, but I think it comes down to like your personal preference. T-SQL is very much an art, it is not a science, and so it comes down to your personal preference. Everybody draws a certain way, everybody writes T-SQL a certain way, and as long as you know T-SQL and you understand it, you can read anybody’s code and I think that’s what’s most important. Having clean code that works properly is the most important part.
This is what our RIGHT JOIN is going to be looking like, and again, it looks exactly the same as the LEFT JOIN, I’ve just changed the word from left to right, but it’s going to be bringing back vastly different results because it’s bringing back everything from the right table and only what matches on the left. Finally, FULL JOINs. Everything, everything. It brings back everything in both tables, you’re going to have lots of NULLs everywhere, full out join and FULL JOIN are exactly the same thing. You’re going to see NULLs on both sides like I said, and I’m going to show you what the FULL JOIN looks like in T-SQL right here.
So we have again, it looks exactly like everything we’ve been typing, but instead of left or right or inner, we’ve typed the word full. So it is important to pay attention to every single word in your T-SQL query because one change of one little word can bring back a whole bunch of different results. So let’s play with joins. So right now we’re going to play with the INNER JOIN first, so you’ll see I have inner right here, and I’m only going to bring back the top five right now, and I’m listing out the specific columns that I want. So if we look at this, it’s kind of hard because you don’t see any difference. Well, that’s because there is no NULLs coming back, so our line right here is between the up-votes and the name, so these first five columns are what’s on the left side, and this last column is what’s on the right. I’m going to show you what it looks like if I bring back all of those columns. So I’m going to do a select top five star with my INNER JOIN, and so you can see this massive amount of data, and you’ll see that all of it is there.
Right here is our divider line between the two tables and all of our data is there. Let’s go over to the LEFT JOIN and take a look. Now, I’ve picked out a specific user so that you can see how the LEFT JOIN works. This is our divider line again right here, right after – between count ID and ID. Everything is filled up, there’s a few NULLs in there but there is rows being returned for that data, but then everything on the right side is NULL. So I’m going to go to a specific group of columns just so you can see those specific five, so that makes it just a little cleaner. This is our dividing line, everything in the left table and then the right side has NULLs.
With the RIGHT JOIN, I’m bringing back a user that specifically doesn’t have anything in the users table and only has information in the badges. This actually would be dirty data, but you know, glad it was there, because in my opinion, if they’re in the users table, they should also be in the badges table. But it does show the point really well, and now let’s do the FULL JOINs so that you can see that everything comes back. I’m using both of those user IDs that I was using before so that you can see what happens with both sides. This one does take a little bit longer because I’m pulling back specific rows.
Pinal Dave: This teaches us that we should not use this so frequently unless and until we need it.
Andrea Allred: Right, absolutely. Because it brings back so much, and you can see here that you’ve got the left table that has the NULLs here where it just has the right information, and then you can also see the left table that’s populated where the right is NULL. So you can see that it’s bringing back everything, and yes, it’s a really difficult join. I want to show you one other thing, and we’re going to have to talk to fill the time because if I do a SELECT star from this table – now, remember it took 14 seconds to bring back those specific rows. Doing a SELECT star makes it take a lot longer. So do we have questions while we’re waiting for this to come back?
Pinal Dave: That is not a question, but this FULL JOIN reminded me of a very first lesson which my very first boss taught me. One time there was really, really need for me to use this FULL JOIN, and I used the FULL JOIN okay, and I went to him and he said FULL JOIN bad, and before he finished the statement I said, you know what, that’s the business need, that’s the logic which you gave me. He said hold on, listen, instead of using FULL JOIN, just use LEFT JOIN and RIGHT JOIN and union them, and I had a moment of like wow, I never thought about it, and that was much more faster at that time when I practiced it and it really made sense. Now, after looking at your FULL JOIN, maybe I will try the same thing one more time and see as homework if that is still true with all the change in the logic in SQL. This is like homework for me.
Andrea Allred: That’s fantastic, yes, because it begins to make you wonder like, is this a really good way to do it, and you do want to find the best way that works for your system, so playing with it is the best way to figure out what is going to be best for your performance.
Pinal Dave: Exactly, and somebody – I just got school, somebody wrote to me in an email saying well, that may produce the same execution plan, and now I ask him back saying which version of SQL Server, and he’s quiet. All this happen in last 15 seconds.
Andrea Allred: You can see we’re at like a minute forty, and it is still – there we go. A minute and forty one. So the difference between doing the SELECT star and listing out those six columns is huge, so this is why you want to be super careful with SELECT star first of all, and then also with FULL JOINs, so that’s kind of a warning when you’re playing with it, but remember to play with it, test it out, that’s the best way to learn your system and to learn what’s going to be best for you and your system.
Alright, let’s get back to this. Now we’re going to script out a basic backup. I think everyone needs to know how to do a basic backup, whether you’re a developer, or a DBA, it is super important to know how to do a backup. You always want to be able to roll back and fix anything that you may have broken. So I’m going to bring back my object explorer, I’m going to hover over the Stack Overflow database, I’m going to right click, and there is this awesome thing called script – nope, not script database as, I could do that but I’m not going to. We’re going to go to tasks and go to backup, and this is how you could do a basic backup. You could just run this and it would work for you, but instead, we’re going to script it to a new query window.
SQL Server has a lot of this built in so that you can go ahead and just script anything anytime that you need it. So now that I’ve got this query here, what does this all mean? It’s kind of like a lot of jumbled stuff. Well, let’s talk through it so you understand exactly what you’re doing here and why. Okay, so we’ve got our backup database and that’s our database name, so we’re backing up the Stack Overflow database. TO DISK is the location where we’re going to be putting it, so that makes complete sense. With NOFORMAT, what is that? That’s something that is from old, old, back when T-SQL was brand new and you would go and reformat drives whenever you were doing backups or reformat the space when you were doing a backup. So the default now is NOFORMAT.
This NO INT – what’s that? So NO INT means that you’re going to be appending to the data set. I – to the backup set. I usually use just INT because I want to overwrite it if it’s there, but that’s my own personal preference, so I would just remove that NO and I would just do INT like that.
You’ve got the name, this is a name you can put in, and I have had people ask the questions, what does this N mean, that just means that text is about to follow. It doesn’t always have to be there, but that’s what it means. Now, the SKIP – is it going to SKIP checking whether or not that database has expired before I overwrite it? You can set an expiration date on backups, and if you say no SKIP, it will check that expiration date, make sure that has happened before it overwrites that backup. The NOREWIND is also from way back in the day when you were backing up to tape. It used to be – the default was rewind and it would get your – as soon as it finished the backup, it would start rewinding the tape, so NOREWIND means it won’t rewind the tape, and then the NOUNLOAD or unload means that after it finishes rewinding the tape it would unload it so that it frees up that slot for another tape to go in.
The one that was most interesting to me was the stats ten. I’d never paid attention to it before, and it turns out that this is how you can tell the progress of your backup. Your backup, it’ll say like 10% complete, 20% complete, this stats ten means that it’s going to be going by ten. You could do five, you could do just about any number that you want. So if you did stats equals five, it would be 5%, 10%, 15%, 20%, and it just helps you to know how far along your backup is before you complete it.
Pinal Dave: That’s good. Actually I – that’s a very, very interesting part. I didn’t know this for at least – last year somebody asked me and then I learned about stats, and you are very, very right. There are things which we see for the tape, nowadays we don’t use it so kind of if you’re not using tape, I believe we can escape them right, we can overwrite it, and just like N is stand for Unicode in a name, which we can also escape it if we don’t want to – if you’re not using Unicode name, so you’re right. In one way we can skip few things here and there if we are not using it, but if you have Unicode names, you will use N and format and skipping things definitely makes sense. Thanks, this is good, this is really good.
Andrea Allred: Now, if you wanted to script it out the specific way that you want this backup to happen, you could go ahead and script it out differently. Go through, set your options in the interface. Like this is the append versus the overwrite. I’d want it to overwrite every time; different things like that. You can get this exactly the way you want it, you know, checking your expiration date, different things like that, and then script it out and it will give you with those specific settings that you set here. So that’s kind of a way to cheat a little bit when you’re first learning T-SQL and when you’re first learning how to do this.
You can do the same thing when you’re selecting from a table. You can do SELECT TOP 1000 rows or something like that, and it will return the query along with your result set; so you can see what you’re going to be getting back and what the query looks like, and then you can adjust your query from there. So that’s also a great way, for when you’re learning, to set up a SQL query and top see how SQL would write it so that you can write it the same way…
Pinal Dave: And I think the backup here, I’m glad you’re teaching about the backup part because I think a lot of developers are just ignoring the backup. They just do things because they’re worried, they don’t know how to take backup, and they fire one update without WHERE clause and I think their world resets. So I think it’s really cool we are talking about it.
Andrea Allred: To me, backups are the – like the first thing that you should learn as a DBA is how to do a backup, and then, for me, security is the second, because you want to keep your database healthy and be able to restore it, and you want to keep the bad people out. And so those are the two things that I always do very first when I want people to learn SQL or when I want someone to learn about being a DBA. But T-SQL is an important part of that because if you don’t know how to navigate through your database and if you don’t know how to set up these specific scripts, you kind of get lost. And this is kind of like a little secret weapon you can keep in your back pocket and be like, oh I know how to do that, I know how to find that information. Do we have any other questions? No? Awesome.
And this is just another – I am going to be putting my slides up on SlideShare, so that you can go back and re-review these slides, if you would like. And it does have that information in the notes section of the slide of what everything means.
So finally, if you need more resources, the place where I learned to write T-SQL was SQLcourse.com. Now, you’ll want to be a little careful with that because the joins get a little different than the way that T-SQL uses their joins; so be careful with that piece, but other than that, I think it’s a great resource.