Developers, DBAs, and anyone else responsible for writing or maintaining dynamic SQL queries.
SQL injection is one of the most common ways that hackers gain access to your SQL server. Do you know how to harden your queries and protect your data from malicious users?
This session will provide an overview of how SQL injection works and how to write injection-proof queries through a series of T-SQL demos. We’ll also take a look at why some commonly used techniques aren’t as secure as many people think.
If you ever write or maintain dynamic SQL queries, or work with developers who do, then this session is for you.
Enjoy the Podcast?
Why I Want to Present This Session:
The first time I saw a SQL injection attempt made against my server was around 2001. Injection vulnerabilities are fascinating because they’ve been around for so long yet there isn’t really a one-size-fits-all solution to prevent them (besides not running dynamically generated code).
With what seems like a major data breach occurring each week (many due to SQL injection!), reviewing secure query writing best practices is more relevant than ever.
Full Session Transcript:
Brent Ozar: So in the session at GroupBy, Bert Wagner is going to be talking about SQL injection. So take it away, Bert.
Bert Wagner: Alright, thanks. Yes, so SQL injection, hot topic because of all the security problems that have kind of been making the news recently. And if – you know, my goal for today – we’re going to talk about injection, we’re going to talk about how to – what it is and how to protect yourself from it. But the key thing that I want to talk about is if you look at all these company logos here, these companies are all related by the fact that they have either put out software that has SQL injection vulnerabilities in it, or that they have had SQL injection attacks on their databases, which have caused, you know, customer data to be released to the world. And if there’s one thing I want you to walk away with today, it’s just that we’re going to see a lot of solutions and there’s no one perfect solution for how to protect yourself from SQL injection. There’s – it’s going to be a multilayered approach, don’t think you can use one of the examples from one of the demos and then you’re going to be totally safe. It’s really a problem and the reason it’s such a big problem is that it’s not necessarily easy to solve.
So with that said, couple CYA disclaimers here, you’re going to see a lot of scripts that are going to be really tempting to run. I want you to try this at home. It’s going to be really tempting to maybe try running these scripts on a production server or on some public-facing website, and that would not be a good idea. Definitely, don’t run it at work unless you’re a security analyst or a DBA who has permission to be doing that. Last thing you want to have happen is come up in some security audit. That’s not the way you want to find out if your company is using any kind of SQL injection monitoring software because you’d be surprised by that. And like I said, don’t try this on other people’s systems. You saw the company logos on the previous slide, these are customer-facing websites that have these kinds of problems and it’s very likely that there are many out there that are still problematic. I updated that slide most recently this week from SQL injection attack that occurred to the India’s database of identification for all of their citizens. I mean, it happened a few weeks ago, so this stuff is still happening all the time.
Quick background. I’m a business intelligence developer but I’m a tech security enthusiast. When I was running one of my first websites back in the early 2000s, I remember seeing some SQL injection attempts in my SQL logs. I was wondering, “Okay, what’s this funny looking – where are these funny looking character strings here with equal signs and quotes and stuff?” And that’s kind of how I found out about it. I don’t do security professionally, although I think all developers and DBAs, that’s kind of one of the many hats they wear. I’m just an enthusiast, I like it a lot, and I know the bottom of this slide says the demo code’s already available, it is not. I didn’t get it to, I’ll put it up right after this presentation, but it’ll be up on my website, it’ll be up on GroupBy.org.
And so basically today what we’re going to do is we’re going to run through, you know, what SQL injection is in case – if you’re unfamiliar with it, why is it important that we’re looking at it, what problems can come from it, and then just kind of common misconceptions and how to prevent it from happening. Or at least try as hard as you can to prevent it from happening on your server.
Alright, so the importance of SQL injection protection. So the first thing that people think of is, you know, leaky data, that someone’s going to – some hacker’s going to get into your database and they’re going to steal your data and then they’re going to go sell it or they’re going to release it or blackmail you with it, whatever. And that’s obviously a really big problem, it’s our duty to keep that data safe, and SQL injection can definitely result in whole databases of customer information being compromised. But that’s not the only reason you want to kind of protect yourself from SQL injection attacks. You also have to be concerned with data validity, right? So I always think about like, imagine you’re an e-commerce store and someone gets into your database and is able to update all the prices. And instead of having your thousand-dollar TVs, they’re now only a buck and people are buying them. That’d be the best Black Friday sale ever, until you got caught, I guess. But I mean obviously, on a more serious note, there are places with maybe data is very critical and important to safety, so if you can’t trust the data that’s in your database because it may be compromised by a SQL injection attack, that’s a big problem.
And finally, just for your server availability, your apps availability. As you’ll see in a moment, SQL injection is basically where someone is able to execute any code they want on a SQL Server that was unintended, and that doesn’t just mean stealing data or changing data. It could also mean making your server not responsive. It’s pretty easy to write some queries that’ll lock up a server or even issue a shutdown command or something like that. So, suddenly your customers aren’t able to get to your website or your application or anything like that, that becomes a big problem as well.
So SQL injection can only happen if you’re using dynamic SQL. And so a lot of people might stop right there and say, “If SQL injection’s bad, why even bother using dynamic SQL?” It makes me think of back when I was visiting my cousins in Munich, and they gave me this. And you might think it’s some spaghetti with sauce and cheese, but if you look really closely, if you zoom in, you’ll see that this is actually ice cream. And when I saw that, I thought, you know, just because you can make spaghetti ice cream doesn’t mean you should. And I think that’s one of the concepts, one of the ideas that a lot of people have with dynamic SQL. It’s like, “Yeah, you can do it, but there’s other ways of writing queries. Why even bother with dynamic SQL?” And so just like I’m sure there’s a good reason for people liking spaghetti ice cream, I’m not sure what that reason is. If anyone knows, please let me know.
But in terms of dynamic SQL, there are some very good reasons of why you may want to use it. For starters, you can’t parameterize everything like in – I know from our previous presentation with Rich, a lot of those maintenance scripts and things, they may parameterize things like tables. And that’s, you know, it’s okay if you’re using that that way, but you can’t parameterize everything in a regular SQL query. And so sometimes you need to use dynamic SQL in order to parameterize things like table names or order by statements. Things like that.
Additionally, if you want some flexibility in how your queries and your reports or your application code runs, you might just have to write dynamic SQL, or you need some dynamic code to handle different scenarios based on inputs and that might be the only way to do it. And then obviously one of the most important reasons to use dynamic SQL would be for performance, right? Sometimes there’s just no other way around it. If you need maximum performance from some queries, the only way you’re going to get it is potentially by writing some dynamic SQL code. Maybe not the cleanest or nicest code, but in certain situations that’s what you need to do.
And so let’s talk about what SQL injection is. This is kind of the typical example that you would see where you have a SQL string we’re putting into this full query. It’s varchar variable here, and you know, selecting from users where username equals, and then we are passing in a parameter. That parameter presumably has an input from a user, they typed in their log in name or something like that, and then we’re using an execute SQL statement to run that query. And what happens is in order for SQL injection to work, we need to have this kind of dynamic string execution scenario happening, where we’re concatenating in some kind of user input parameter. That user input parameter also needs to be unsanitized, or at least you know, it hasn’t been checked over to see if it has some value that is injectable, and we’ll take a look at some examples of that in a minute.
But finally, SQL injection essentially, it’s going to perform something that we didn’t originally intend for this query to do. it’s likely that if we inject some malicious query code into that parameter, we’re not just going to return the full name from our users table. We’re going to maybe dump the whole users table or go grab passwords or whatever other personal data. And one quick thing I want to point out is a lot of times SQL injection people think, “Oh, this only happens if you’re using an input parameter like this, concatenating it.” And that’s not necessarily true. It can also happen when you don’t have a direct input parameter. So think about an application where you’re executing code based on some value that’s coming from a column inside SQL Server already. So this code we’re looking here, imagine we have a website where our user can define their preferences for the way column data should be ordered, in some report or something like that. We’re storing those order preferences in JSON value that gets stored in the database. That whole preferences front end may not have an injection vulnerability itself because it’s not doing anything, it’s not executing any dynamic SQL. But if we’re using a query later on, like what you see on screen here at the very bottom line, if we’re pulling that user input value from a database column and then executing it, that kind of works the same way. And it’s just one of those things, I bring it up because once you know it, it’s common sense. Like, oh yeah, there’s user input in a column in a table, of course that leaves you vulnerable because you can’t trust your users. But a lot of times it’s overlooked because in this type of scenario we – if this is a stored procedure, our sort column variable isn’t coming from – directly from our user input. It’s coming from a different place in our database.
And so with that said, let’s switch over and just kind of look at the demos of how this basic SQL injection works. So like I said, these will all be included online but let me create this database. It already exists, that’s fine. Let’s just use it, and I’ve created this table here with some users. Let’s take a look. It’s your typical users table. We’ve got some – the full names of users, their usernames, their unhashed passwords, their join dates. Imagine this is some, you know, website where you have registered users. Clearly, I was working on this demo over breakfast one day because we’ve only got cereal characters in the database. But this is kind of the table we’re going to be working with here. And here’s this one other JSON preferences table that I talked about. So let’s see SQL injection in action.
So if we use – we create this stored procedure here, this is basically from that first demo slide where we’re concatenating in a username into our query here and then executing it. So we’re calling that GetFullName. And so, our basic SQL injection attack works like this. So if we call our stored procedure with an expected value like TFly37, that’s the username of our Toucan Sam character. Query works as expected, we get his full name back from the database. Now, the problem is a malicious user might input something like this. We’ll take a look at what happens, but if we execute that as the username, we get the entire table of full names of users, so more than what we originally intended to with that stored procedure. And the reason that happens is because if you pass in this value into that parameter that’s been concatenated in our where clause, this is essentially the query that’s running here. And you’ll see the logic that ends up happening is you’re getting something that says username equals and then a valid username, but then our malicious user has tacked on this condition of or one equals one. And essentially what that’s doing with your or here, if you say or one equals one, that’s just going to return everything in all the rows in your table. And so this is an example of where a user was able to kind of form their own fragment of a query and put it into the procedure that you created in order to get results that you originally weren’t intending for the user to get. That’s kind of the classic SQL injection example.
And the same thing happens if we talk about that example where we have user preferences. So imagine we have these two user preferences, the correct value is this value with key of one, where we’re storing some JSON in a database, developers like me love JSON and we’re storing lots of different user preferences for background colors and stuff. We’re also storing the default sort order of a column in our report. So our expected input is something like username descending because that’s how we’re going to build our query. And then user number two here manages to inject a piece of code. So this means that the application that’s saving the user preferences has some problems in it. Maybe the developer should have been looking at the input values and seeing is there something not kosher in there, but this second value that’s going to be here isn’t just username descending, we’re actually going to add a – tack on a semicolon here and then run this second SQL statement. That’s going to insert basically all the data from our table, as well as our passwords into our users table, and you’ll see what that accomplishes here in a second.
So user one is two, and user two is bad. So if we run this query with user one, not a huge deal, we’re going to grab that value, we’re going to parse it out using the SQL Server JSON parsing functions, and we’re going to execute our dynamic SQL statement where we are passing in that column value. So this is coming straight from the database. And things work okay. So we’re ordering by descending order, things work great.
The problem is when we do something like use user number two. If user number two is logged in, remember, so they’re injecting this insert into users’ code here, so if they run this query the first time, it’s going to return exactly what we would expect. The same output, it looks like nothing’s wrong. But just remember, behind the scenes, an insert statement just ran as well. So if that user, our malicious user runs that query again, suddenly, they’re going to get back what was previously inserted with that insert statement. So now they inserted all of the usernames and passwords into the username field. Obviously, this is not a very discreet way of doing it, but for purposes of an example, it’s pretty clear our malicious user was able to basically create his own data result set here with usernames and passwords, which he can now download or do whatever he wants to with it. And so that’s the point there. It’s not always just a user input value concatenated directly into a query. It could be something where a user saved a value into the database and then your dynamic SQL query is going and grabbing that value from the database and executing it as well.
So back to the presentation here. Quick history, so this is not a new problem. It’s been around forever. I know 1998 isn’t forever ago, but that was as far back as I could find something on the internet that was easily accessible, that documented what SQL injection is. This is from some hacker zine, searching things from the 90s is really tough on Google. So it’s been around for a long time, and it is still prominent today. So in 2004, in the heyday of web 2.0, everyone’s doing their e-commerce websites, they’re selling, you know, whatever they’re selling in their regular stores online. They’re doing it very poorly, very insecurely.
So this group called OWASP came together, some volunteers, and they decide, “Let’s create a ranking of the most high-risk vulnerabilities on the web.” And they tried to do this using data, tried to do this very scientifically, there was no data out there for them to use. So they just kind of came up with their own list and here’s the whole list. Hopefully your developers are familiar with it, but number six of 10 on this list is injection clause, which includes SQL injection attacks. And so remember, this is kind of without any data, this is just kind of where they thought SQL injection attacks fell in the scheme of things.
Fast forward to the most recent list, which was released this past fall, and now they are collecting survey data every year of verified attacks on databases and applications, and it turns out that SQL injection attacks are still on the list. Injection attacks in general, which include the SQL injection attacks are number one. So all the way back from 1998 to present day, this has been a problem and it still will continue to be a problem forever as long as you can execute dynamic strings in SQL Server.
And so there’s a lot of common misconceptions surrounding SQL injection, like why you maybe wouldn’t have to deal with it. And one of the biggest ones I hear is that the structure of my database isn’t public. And so whenever I hear that from someone, I kind of think back to – so when I was growing up as a teenager, I was a magician, I worked birthday parties and I tell people that, “Hey, you know, I used to be a magician I still practice magic a little bit. Let me read your mind here. So you’re saying the structure of your database isn’t public. Well, I bet you have tables that are pretty easy to guess, like users, or products, or inventory, or sales, invoices.” It’s not that hard to guess names of tables necessarily. And so it’s not a big deal then – basically this is an example of security through obscurity, or where you think my database info isn’t public; well, it doesn’t need to be. It’s pretty easy to guess.
So taking that one step further, maybe you have met someone who does this. They obfuscate their table names and their column names. They offer some reason of job security, “I’m never going to get fired because no one else will be able to maintain my database,” which if that’s you, if you’re creating these kinds of tables called AZ2538, please stop, it’s really bad. It doesn’t actually help you with security either because like we looked at if there’s a SQL injection vulnerability in your website, there’s plenty of ways to get column, date, and table name information out of a database. So obviously there’s the system tables like sys objects and sys columns. We’ll run a query real quick to look at that, but also if your develops aren’t hiding their exception logs from the internet, which believe me, I’ve been on many publicly facing websites that will just have their whole stack dump there and you can get column names and table names from those error logs. So it’s a SQL Server example because it’s kind of just cool to see. If we jump here, if we are a malicious user, we can query something like sys.objects to get the table names, procedure names in our database. So if we scroll down here we should see somewhere – here’s our users table, our preferences table, our stored procedure that we created before. If we throw on our good buddy for XML path, what that’ll do is we’re going to get that entire sys.objects table in a single string, single row, single column. And so what that means is if we try to run an inject attack, unioning the result of that sys.object for XML path query into our injectable stored procedure, well now, suddenly instead of you know, the username displaying in the corner of a website or something like that for a logged in user, the malicious user will get to see your entire sys.objects table. So security through obscurity doesn’t work, even if you name your tables crazy things. A hacker who is dedicated will be able to figure out what they are anyway.
Alright, another common misconception is that I’m not important enough to get hacked. So maybe I run some boutique garden gnome online shop that gets like, 40 customers a year. I mean, these are pretty expensive garden gnomes so it’s worth it for me to keep up my business, but who out there is going to go and find my tiny little website that gets a few thousand hits. They’re not going to come and get my data. And the key here is that it doesn’t really matter because there’s plenty of tools that you can get, and this is one that I linked to, and I’ll include it in my demos there. But open source tools, SQLmap runs in Python. It’s a tool that’ll automatically test for SQL injection vulnerabilities against your website and it’ll test the whole variety of different types of SQL injection vulnerabilities. So basically, you know, you can go watch some teen on YouTube explain how to use the SQLmap tool and put it in a four loop, looping over all of the IPv4 IP address block, and suddenly – yeah, no one’s directly manually going to your garden gnome website to target your e-commerce store or whatever, but this tool will automatically do that for a malicious user on its own. So they’ll kick it off in the morning and by the next day or whatever they’ll see your site in a list of vulnerable sites. So don’t think just because you’re unknown that you won’t get hacked.
Drew Furguiele: Hey Bert, I’m sorry to interrupt you. One person had a question on the web conference back on your demo scripts. Steve asked, “What’s the significance of the double hyphen at the end of the objection scripts?”
Bert Wagner: Sure. That’s a great question. So in this last example here, you have to think about – let’s actually scroll up, this will be better. So imagine our first example we looked at where we’re passing in TFly37 or one equals one. And so what we want to do is if you look back at our original procedure, the parameter name is getting concatenated into our where clause, and we are adding single quotes, escaped single quotes to both sides of our parameter name right here because we’re expecting username to be some kind of varchar value. So in order for the injectable input parameter to work, we need to comment out this last single quote here, or else the query will fail. So if we don’t include that for example, if we don’t add that comment, we’re going to get some kind of error saying unclosed quotation mark after the string because essentially what would be happening is we would be saying where username equals TFly37, or one equals one, and then a random quote. So we need to comment that out. So you could do something like dash dash to get that to work. So that’s the reason there. Great question. Any others, Drew? I think that was Drew.
Drew Furguiele: Yeah, no, I’m pretty sure that was the only one.
Bert Wagner: Alright, cool. Moving on then. Lastly, last common misconception I hear here is that I use an ORM to code my SQL queries so I don’t need to worry about it. And I know there’s a big debate on ORMs and whether they should be used in SQL Server or not. I do like them in certain situations when you use them correctly, and correctly is the key word there. I don’t want to start a big debate about it. But point is, it is not foolproof. Most ORMs are going to offer us some kind of method that just allows you to execute a query string that you type in manually in your application. And so I mean, that’s not the only way that an ORM is injectable, but it is one way. So just to blindly say I don’t need to think about it because I’m using an ORM, that does not fly.
So with that said, let’s get to – we’re going to do demos for the rest of the session here of how to actually protect yourself against SQL injection. And so we’ll look at a bunch of these demos and we’ll take a look. So the very first way to prevent yourself from SQL injection, and this is – what I’m going to say is the only one 100% foolproof way of doing it is to not use dynamic SQL. You might be saying, “Bert, that’s dumb. Why waste our time with this?” And the reason I’m saying this, you might be taking over a database, or you might be working with some new developers who didn’t realize that you can parameterize queries, right? So based on someone’s background, maybe they’ve never encountered this before, or maybe they worked in a really small shop where it wasn’t an issue, they might not know that you can – that you don’t need to concatenate a parameter into your where clause, that you could just actually pass in a parameter value and it’ll work. And so by far the best thing you can do is if you’re finding queries that could just be rewritten using parameters, well, get rid of the dynamic aspect to them and just rewrite them that way. You’ll be 100% safe. And so we can prove that here but you know, so we pass in a regular value, we’re good, we get one result. If we pass in our injection syntax here, it’s not going to work because SQL Server is looking for a value in the database called TFly37 or one equals one. Can’t find that record, so we’re totally safe. That would be a very boring presentation if I just said get rid of your dynamic SQL queries, so let’s see what else we could do.
So one of the things that I recommend you do for all your queries, where you’re dealing with important data is limit the user accounts that have access to that data. And limit to what those users can do. So here I’m going to create a user, limited user. I’m going to grant it only select access to our users table. Now, doing this is not going to prevent SQL injection. Let me be clear. It is not a SQL injection solution at all. However, if we do get a query, we do – let’s build this stored procedure here that’s executing as our limited user, and we will include our SQL injectable query. What that’s going to do for us is that while, you know, regular queries are still going to work, certain injectable queries like where we try to append the entire output of our sys objects table, they’re also going to work, but you might see this string – I didn’t show you the full string before but before we had hundred of tables, and here this string is pretty short because it’s only showing us our users table. Because our user that’s executing the stored procedure only has access to our users table, a malicious user is only going to get – only going to be able to see what’s in the user’s table. So they’ll still be able to run a query like this that’s injectable, and it’ll still work. So it’s not preventing your SQL injection at all, but let’s say you’re storing credit care information in another table in your database, they’re not going to be able to access that data through this type of stored procedure that’s executing under a limited account. So just as a best practice, you want to limit access as much as possible on all your queries really where there’s important information that you don’t want to get compromised. But that won’t prevent injection.
One other thing that will prevent injection is to use sp execute SQL. So up to this point, I’ve just been using execute to execute my dynamic query strings. What you can use instead of sp execute SQL. So if you do need to build a query dynamically, so for performance, maybe there’s a bunch of conditional logic there and you’re kind of piecing together a query based on some run time parameters, if you piece together your query and don’t concatenate any parameter values, but if you actually just include a little @ sign parameter inside your query string, sp execute SQL will be able to safely execute that. So the way sp execute SQL works is you have to define what parameters are in your query and so I created an ParmDefinition variable here, and then you’re passing in your query string here, full query, you’re passing in the definition of what parameters are in your query, and then you’re passing in the actual values for those parameters. So ParmUsername is an input to our stored procedure here. So if you use sp execute SQL, everything works great because SQL Server is parameterizing your query and values that should work and values that shouldn’t work don’t work, and you are not vulnerable to that injection attack.
The problem though is if you have a query like we were talking about, like with maintenance scripts maybe, where you want to parameterize something like a table name or an order by clause. That won’t necessarily work with sp execute SQL, and I’ll just run through this to prove it to you. So now we’re passing in our table name of users, it fails, incorrect syntax in your table name. That’s because sp execute SQL’s only going to parameterize queries the way that SQL Server, in general, allows parameters and queries. So not in places like the table name, for example. So it’ll work on your where clauses, it will not work on your from statements. But that’s a good thing to know because if you are only parameterizing your where clauses, just use sp execute SQL. It’s a no-brainer, you don’t have to think about it. Parameterize all your inputs, you’ll be good to go.
Alright, but sometimes we need to parameterize something like a table name, and that’s where quote name comes in, and you’ve probably seen lots of code with quote name online as a way to protect against SQL injection. And it works. It works pretty well most of the time. We’ll take a look at maybe where it might not a little later, but let’s say – basically, what quote name does is we’re using this as a way – you know, if you’re using quote name for SQL injection prevention, you’re not using quote name for the original intended purpose in Microsoft SQL. Quote name is meant to basically limit things like table names. So if you have a space in value and you want to have brackets added to it, that’s what quote name’s going to do. So here, if we have some input value that has a single quote, an escaped single quote in it, quote name is just going to go and add brackets around it by default. You can also overload quote name with another – with a second parameter input here to use something like single quotes instead of brackets. But the idea here is that if you wrap your input variable with brackets, you’re not going to be able to execute any dangerous code. And that’s definitely the case. So here we’re going to take our same stored procedure and we’re going to pass in our table name and we’re going to wrap it around with the quote name function, so it’s going to add brackets at run time and let’s see if it’s injectable. So if we pass in our regular query, everything works like you want it to. If we pass in our query that’s supposed to run some injection code, it doesn’t work because we’ve added brackets. SQL is considering this user select from dbo users statement as a table name, so it can’t find the table called dbo user select from dbo users. So in that case, quote name is working exactly like you want it to.
Now, the downside of quote name is since it’s just built for SQL Server object names, it won’t work on anything longer than 128 characters. So if you need to sanitize your input or by escaping it with the brackets on either end, if your input you know is always going to be less than 128 characters, use quote name. It’ll work great. If you have something longer than that, that’s where you start finding solutions online that we’re using things like replace. And this is where things start to get really scary. This is what a lot of people don’t consider because you’ll go on Stack Overflow and you’ll see some solution saying, okay, just strip out all of the single quotes in your input parameter with a replace function and replace it with two single quotes in a row. And I realize this syntax here is a little confusing because we’re escaping single quotes inside single quote input values. But that’s what this is doing. We’re taking all of our single quotes and we’re replacing them with a set of two single quotes. And the idea there is that in all of our SQL injection attempts up, up here we’re always adding things like single quotes in our query string. That’s like a very common way that you’re going to have to force a SQL injection attack. So the idea behind using replace is that if you can just get rid of any single quotes or escape them, then your query will be injection proof.
Well, let’s see if that actually works. Hint: it does not. So we have our custom replace function that’s supposed to keep our queries safe here. If we pass in values that are valid, our query works, as we would expect. If we pass in our SQL injection attempt here of TFly37 or one equals one, which has our single quote in it, the replace function actually works exactly like you would expect it to. It’s going to escape our single quote and so the injection attack is foiled. Now, that works great except in instances where a SQL injection piece of code doesn’t use single quotes. So where the case where you’re concatenating a second statement altogether, it’s not going to catch that kind of SQL injection attempt. And so here we’re returning two result sets. So yeah, replace is good, but at the end of the day, it’s as clever as you can get with writing your own replace functions because maybe you’re going to now say, let’s replace selects or froms. There’s probably someone really bored, some malicious user, some hacker that’s really bored or if he’s really motivated to get into your data, he will find a way to get around your custom written replace function, and we’ll see an example of that here in a second.
The last thing I want to talk about first is limiting inputs, checking inputs. So obviously, work with your developers. If you’re a DBA or SQL developer and you’re working with app developers, have them check user input. That’s very important. It’s everyone’s role to practice good security. But I have two procedures here that I’m going to create now. One of them has primary username, which has an input of varchar 100, so length is 100. We’ll run that one and then we’ll run this second query here, which is called full name short. So the above was full name long, this one’s called full name short because our username input is limited to only six characters. And that’s assuming we’re on some system where we know all our usernames are never longer than six characters in length. And so what that’s going to give us here is if we pass in our injectable code into our long procedure, which is varchar 100, for the user input, the injection attack’s going to work because we can type in this nice long string here, I know how many characters it is. 20 or something, right? So because we’re not tightly defining our input parameter lengths, that happens. But if we do tightly define the lengths, so we’re limiting it to only six, if we pass in our injection attempt here, it’s only going to return the first six characters. The rest of that’s getting truncated and our injection attack is foiled.
So once again, this isn’t foolproof in all instances. This only works when you’re working with data where you know you can do that. And I scrolled past this second procedure here, our full name short input, but I want to go back to it because there’s a couple other features that you can implement. So not just the limiting the length of your user input, but you can do things like write basic custom validation functions.
Now, generally, this still scares me. I think this only really works in scenarios where you are very comfortable with the data coming in. So if you know you’re only supposed to get certain characters, like a username is only letters and numbers, maybe you can check it with this kind of pattern index search here. Maybe you can create a table that has all valid table names, right? Let’s say you’re doing a parameter on a table name and you know that your list of valid table names is, you know, 20 different table names that that query would ever run on. Maybe you can check your input parameter, check to see if it’s in that list of – that white list of valid table names. Those are the kind of checks I’m talking about, once that you can be confident of, that they’re going to work all the time.
So in this case where we’re checking for username values where we know we’re totally limited only to letters and numbers, you know, if we do an injection attack then where if we’re passing a username that’s only letters and numbers, the query runs without an error. There’s no user T37, but we don’t get any kind of error message, SQL just didn’t find any results. If we now try to attempt an injection attack where we’re maybe using something like a semicolon or a dash, that stored procedure is not going to give us any output at all because we hit our return statement because we had characters that weren’t just letters or numbers. So the thing you got to be careful of though when you start implementing things like this, like using a white list of valid table names or using a white list of allowable characters is really great, but you have to be careful that you don’t miss anything. And that’s where things get scary.
So this last example I want to show you is why I definitely recommend doing things like limiting user access, using sp execute SQL, using the quote name function, and you know, definitely do those. If you want to then add additional validation after that, great, but don’t rely solely on that. And this is I think a perfect example of why. So we have this query we’re writing here, USP get profile. It is accepting a username as our input, we’re going to do nvarchar100, that’ll be important as you’ll see in a second. And we’re going to write our customer place function because we’re very smart and we’re going to steer away any injection attempts by replacing single quotes with two sets of single quotes, and then here’s our query string that we’re concatenating our username into. Thing to know here is that our query variable is of type varchar max, so this attack that we’re going to look at now is called a homoglyph attack, and it happens when you have an implicit conversion. So we’re going to get an implicit conversion because we’re passing in an nvarchar value, but our query is executing as a varchar, we’re concatenating to a varchar. So SQL Server will automatically convert our nvarchar to our varchar. That’s how the implicit conversion in SQL Server works. You might be saying, “Bert, why would you mismatch your data types like this?” And I’m sure I’m not the only one who has seen implicit conversions happen on their database, where users don’t always strictly follow their data types. So as crazy as a set up this may seem, I’ve definitely seen this before. It’s not that uncommon.
But let’s create this stored procedure. There we go. So remember where we’re correctly using our replace function here to pass out single quotes, so let’s see if we can inject it. So if we pass in a regular username, valid value, works great. If we have an injection attack that is using a single quote, an escaped single quote, our SQL injection attack is foiled, our hacker can’t get our data because our customer place function here works great. No problems. Now, this statement here look similar to the one above it, but if you look really closely, at this first character, instead of our single quote which we have up here, this is actually an apostrophe character. And so that is technically a different character. That’s a Unicode value. That’s like a little curly apostrophe, it is different from a single quote. So what happens though when SQL needs to convert from a Unicode data type to a non-Unicode data type, it tries to look up what are called homoglyphs, which are lookalike characters. And so this is where this injection attack fails because if we execute this, suddenly our entire table of data is returned and our clever replace function that’s replacing single quotes doesn’t work. And so…
Daniel Hutmacher: Excuse me. I’ll have to read about after two billion lines of code.
Bert Wagner: Yeah, so this is a problem, right? So what’s happening is if we have these casts here, so if we cast our apostrophe as a Unicode value and if we cast it as a non-Unicode value, you’ll see – this is going to be tough to see, but you can see our Unicode character is this slightly slanted apostrophe character. Our non-Unicode has now converted to a straight single up and down quote. So basically, this apostrophe is getting converted to a regular up and down quote.
However, that conversion is only happening when we concatenate it to our varchar variable. So our replace function runs before that, so our curved apostrophe character gets through, SQL Server actually converts that, then apostrophe character to just the single quote, and we now have an injectable query. That’s what’s known as a homoglyph attack. And so now you might be saying, “Okay, well I’m just going to add this apostrophe character to my replace function. I’ll just replace it.” Let me just run this code, I’ll explain what it does here in a second. And while yes, that is one thing you can definitely do, and then you know, it’ll work and you’ll be safe, the problem is is that’s not the only character. So this script that I kicked off because it runs for a few seconds here is looping through the entire Unicode character’s address space to find lookalike characters for – in this example, I only am looking for single quotes and I’m looking for semicolons because those are pretty typical in your SQL abjection queries or attempts. And so you’ll see, these first two rows are our actual single quote and semicolon characters that we’re typically used to using, but all these other characters here are characters that will convert to a single quote or a semicolon when there’s an implicit conversion that happens from a Unicode data type to a non-Unicode data type. And so you know, what that means is your replace function then would have to cover all these instances. And by the way, this list of Unicode characters is still growing. That full address space isn’t used up, there’s new characters added all the time, so this is now a maintenance problem of you having to constantly maintain and check for new characters. And this is only the character table list for semicolons and single quotes. It’s not looking at all other characters that might be injectable.
So for example, let’s say you’re going to do something like I’m going to filter out the word select from any user input and just replace it with a blank. And that’ll fail in this kind of homoglyph injection attack because your malicious user might be able to replace their query with all Unicode characters, none of them are the typical 255 ASCII characters that you’re used to. And they’ll be able to just get around your clever function by just passing in Unicode values. So that’s why getting into the business of writing these custom security functions is just a really bad idea because it’s very hard to do and it’s very likely that you will forget some edge case that a dedicated hacker wants to get into your data will probably think of. They’ll probably spend more time thinking about this than you will.
So stick to those solutions that work. I’ve outlined – definitely if you download the demos, I’ve included kind of – if you saw them while I was presenting, there’s notes about the pros and cons and the caveats in the notes so just refer to those to know which ones work well, and not – the last thing I want to show you before we wrap up, give you the opportunity for any questions is I’ve included these two scripts here. We’re not going to go in detail of what they do. But let’s say you’re a DBA, you’re taking over a new server, or maybe you’re a consultant going in and you’re unfamiliar with the database or a server, and you want to see are there SQL injection vulnerabilities. So these two scripts I wrote to kind of do that.
This first script is looking at our execution – sorry, it’s not looking at our execution plan in cache. It’s looking at all our system objects, so things like stored procedures, functions, and it’s checking for keywords like execute and execute and sp execute SQL. I don’t have any data on this machine, it’s not going to return anything for us, but basically, this is just searching for any queries that look like they’re dynamically executing code. So if you’re inheriting a new server and you don’t want to be responsible for a security breach, you want to know are there injection vulnerabilities somewhere, this is maybe a good place to start. It’s not going to catch 100% necessarily of your queries, but you know, whittling down your list from 100 to 10 is a good place to start.
And this second query here does a very similar kind of lookup, looking for injectable query parameters, but it’s checking your execution plan cache. So if your developers are writing ad hoc queries, and they run them frequently enough and they hit your plan cache, this will search the plan cache for those types of injectable values. So just a good way to kind of protect yourself if you’re inheriting something so you’re not leaving these – you’re not responsible for these injection vulnerabilities.
So with that said, just a quick recap. There’s no easy single approach solution, I said that in my very first slide. The take away here is that you need to kind of layer, you need to do security in layers. You need to do your validation, you need to limit your access, you need to do – use functions correctly and don’t write your own custom security functions if you can help it because it’s very hard to make them foolproof. You know, everyone’s responsible and that’s basically it. One thing I didn’t talk about is using other software. I did talk about SQLmap. You use SQLmap, you can use that so it’s not just for hackers to hack into other people’s systems, you can incorporate that into your own system to test for vulnerabilities. I know some people include it in like a built process to test, to see if they’ve created any injection vulnerabilities with new code that they’re introducing. So that’s pretty cool there. And obviously, if you’re running things on Azure SQL, there’s a SQL injection option there to prevent SQL injection attacks from happening or to log them. So if you’re using something like that – I’ve only played around with that feature, I haven’t used it in a production setting so I don’t want to recommend it, but from what I’ve read, from what I’ve kind of tested out, it seems like it works really well. So if you have other tools available, definitely use them. So that’s it, thank you very much.
Brent Ozar: Very cool, thanks Bert. Lots of dialogue happening over in Slack. Lots of people are going, “Oh my god, I had no idea it worked that way. oh my god, I’m a bit scared now.” Everyone is like, “I vow to your greatness,” J.D. says. All kinds of stuff going on in there.
Bert Wagner: Cool, thanks, everyone.
Brent Ozar: Adam asked, “Are there any tools that you would recommend to also scan for SQL vulnerabilities, or do you just usually use the script you presented there?”
Bert Wagner: Yeah, so I mean, it’s hard for me to say. I typically use the scripts, but I am not a DBA and I am not a security analyst. This is not my day-to-day job to check for these things. I’m approaching it from a standpoint of I want to be a better developer, I want to try to wipe my hands clean, that’s right, Drew. You know, so yeah, there’s tools out there, there’s monitoring tools that I know like, banks use, and things like that where they’ll look at your data hitting your server and if there’s some anomalous pattern or anomalous input value coming, they’ll just kill the query. So there’s lots of tools available. I can’t recommend any of them because I don’t use them.
Daniel Hutmacher: I store a Pluralsight course by security MVP, Troy Hunt, and he uses a freely downloadable tool, which just automates the whole process. It’s basically a script tool. It’ll map out what tables you have in your database, the column name, the values, and the columns. It will do everything for you. it’ll even figure out what kind of database you’re running. So yeah.
Bert Wagner: Yeah, Troy Hunt’s definitely a great guy, another great resource to go look into this. And he gets into the real detail of injection vulnerabilities. He’s got a great website out that you can kind of test your knowledge to see if you can hack into it, which is really fun, to really see if you know, you understand how injection vulnerabilities work and stuff.
Daniel Hutmacher: The homoglyph thing was a complete eye opener to me. I went into this session thinking, “Yeah, I know something about SQL injection, I’m covered, that’s fine.” And now I think I’m going to have to revisit some of my code. So thanks a lot, it was really, really valuable.
Bert Wagner: Yeah, no problem.
Brent Ozar: This is also one of those things where people go, “Security’s everybody’s job,” and then you look at what real security people do and you’re like, “Oh my god, I don’t know even remotely. The more I read I’m just terrified. There’s so much work that’s involved in this.”
Bert Wagner: Yeah, it’s definitely a tough – you know, and this is just we’re looking at SQL injection, which is just one tiny little piece of the whole security puzzle. So yeah, it’s hard.
Brent Ozar: If you’re watching this too on YouTube or a podcast, or even live, I guess, if you liked this session, Andre Melancia had another session at GroupBy. Andre’s out in Portugal and he talked about hacking SQL Server by doing man in the middle attacks, intercepting network communications between things like D apps and SQL Server itself. Changing the data that users see, it was pretty crazy. Well, thank you Bert. Lots of comments. Let’s see, Adam says, “So should I check out Pluralsight?” Yeah, absolutely. Troy’s course is on Pluralsight, and he says where will this video be posted? At GroupBy.org. Also, on our YouTube channel. So thanks for hanging out with us today, Bert. Nice presentation, nicely done.
Latest posts by Bert Wagner (see all)
- SQL Injection Attacks: Is Your Data Secure? - November 27, 2017
- DBAs vs Developers: JSON in SQL Server 2016 - June 9, 2017
- High-Performance SQL Server Reporting Services: Lessons Learned - January 13, 2017