Enjoy the Podcast?
For developers and DBAs that need to deploy database changes to multiple environments reliably.
DevOps is a hot topic in today’s software development world, however most of the knowledge and experience with DevOps is based around application software and ignores the database. We will examine how the concepts and principles of DevOps can be applied to database development by explaining the core concepts of managing scripts, automated building, testing, and deployment of database changes. Once you understand the principles of a DevOps database development process, you can generate scripts used for direct deployment on your instances or sent to customers for them to execute.
In this session we will cover tracking the changes made to your database and capturing them in version control. We will show automated building and testing of your code on a separate environment and the generation of an upgrade script that a DBA and developer can review and deploy to QA, test, staging, production, or any other environment.
This session will use Redgate Software tools to accomplish specific actions, with explanations on how the same actions can be alternatively built by anyone.
Why I Want to Present This Session:
I’d like to help more developers and DBAs understand the process of streamlining their deployment process through a DevOps mentality and automated tools. By explaining the general process and concepts, and showing specific implementations, DBAs and developers can gain confidence in experimenting with a proof of concept in their environment that can help them build a repeatable and reliable database software deployment pipeline.
Brent Ozar: So in the next session at GroupBy, Steve Jones will be talking about bringing DevOps to the database, so take it away Steve.
Steve Jones: Thanks, Brent, so welcome everybody, hopefully you’re having a good Friday here and you’re interested in DevOps to database and I’m here to talk about some of that. So I’ve definitely given this talk a few times, different places and it’s been fairly popular and I’ve always learned things and I always encourage people to ask questions because it’s always interesting to see how people view the world of development and what works well for them.
So the agenda really today is to kind of talk a little bit about kind of DevOps and software development and specifically database development, just to kind of level set us all in the same spot and then spend most of the time kind of in a demo and try some things and show you some things.
One thing I want to emphasize really is that what I’m showing you is just a particular implementation of how you can easily move database development and have it go faster, smoother, easier or anything you want. But I’m not trying to say you should do it this way or I’m not saying you should use the tools, the platform, the technique, anything that I’m doing. The platform doesn’t matter, whether it’s SQL Server, Oracle or Neo4j or Mongo. It doesn’t really matter and source control doesn’t really matter and all that stuff, but they’re principles and ideas that we kind of want to follow.
So who am I? I’m Steve Jones and I run SQL Server Central. I founded that site about 15, 16 years ago and it’s been my full time job for about a decade really, and then I worked for Redgate Software as an evangelist, but I’ve been working with SQL Server for almost 26 years now. I started back in 1991 on version 4.2 and actually ran SQL Server on OS2 so I’m now on my third platform running SQL Server. I’ve run it on OS2, on Windows and now on Linux. So I’ve had a lot of experience in that time with just all the different versions upgrading you know, typically you know, every few years and some companies, some clients, in something I’m doing.
I’ve worked in big and small companies, I’ve worked in large 20,000 person companies, Fortune 100 companies and then I’ve worked in small startups that were 2, 3, 4 people and a variety of companies between. I’ve had experience in different industries as well, I’ve worked in finance and education and utilities and import/export, different industries across my career. So all my contact information is up here, please feel free to get a hold of me and I’ll have this at the end as well, so if you’ve got questions get a hold of me and you know, ask and I’ll try to respond as best I can.
So let’s talk briefly about what DevOps is, and really it’s this kind of Morphis thing that – it somewhat defies definition and it’s kind of like saying Cloud, right? What does that mean and it’s different things to different people. On some sense, parts of DevOps aren’t different than things people have been doing for years, matter of fact, I would say a lot of the software development I’ve done throughout my career, you know, 15, 20 years ago, was what we would call DevOps today. But really DevOps tries to encompass and just give us a way to talk about some of these good practices, this way to smooth things out. Now, the … gravity Agile movement, really, and in 2008, John Alsplugh wanted to do a talk where he had been looking at all this agile stuff and he wanted to do this agile operations talk because he was an operations guy and he went to a conference and they accepted his talk and he went there, walked into the room and one guy showed up to watch him. So in some sense it was a little bit of a failure but in some sense it was also interesting because he realized he wasn’t the only person that wanted to know. So in 2009, Flickr gave this seminal talk, John Alspaugh was part of this and they had spent the last couple of years trying to become better at how they build and deploy code. And they gave this talk and said they deployed ten times a day or more and they do it by having Dev and Ops work together. Now, I’ve linked the talk in the deck, you can go and watch it because it’s interesting.
But you know, in 2009, ten times a day was stunning, phenomenal. Nobody could really believe that people deployed that much code every day and you know, since then, we’ve seen a change, right? In 2010, Etsy actually did talks where they deployed 50 times a day and various other companies have come out over the years, usually these oppressive high tech software or servers or all internet companies that deploy many times, and just a few months ago, Amazon actually released statistics for 2016 and they deploy actually every 12 seconds on average. So throughout the year, if you look at all the deploys it did, it’s 12 seconds.
Now, doesn’t mean every developer deploys every day because they’ve got thousands of developers at Amazon, doesn’t mean that these are all you know, big changes or small changes, it’s just a variety of stuff that happens. It could be somebody alters a CSS style slightly, somebody changes an image on a page, somebody makes a database change or somebody updates a messaging queue, updates the contract that’s going to flow through the messaging queue, any number of things. The whole point is that they want to be lean and agile and flexible to be able to deploy when they’re ready to deploy. So if all of the people – so a bunch of us were just on talking, if Erik decides he wanted to deploy today because he’s going to update the main page on WordPress he can certainly do that, might be something he’s worked on for a day. Brent might have a new LAN to function that fixed a bug he saw this morning and he deploys it 20 minutes after he creates it. Could be that Tara has a database change that she had been working on for you know, three weeks and she finally deploys it today and dark launches it. The idea here is that we want to be able to move at the pace and at the schedule that works for us, whatever that is. Is it once a year, is it once a day, whatever.
The other thing to know about DevOps is that it’s not a specific thing; it’s not prescriptive. So DevOps doesn’t say everybody stand up in a meeting everyday that we only have teams of ten people, doesn’t say any of that stuff. DevOps means that we try to follow these principles and ideas that help us build better software and the way that we that, the actual implementations could vary from company to company, from project to project, to person to person. But it’s similar – not similar to but kind of encompasses agile and strong and lean and kanban and waterfall and everything else we’ve talked about throughout our entire careers, but doing it in a way that helps us get better.
Now, it’s kind of been best expressed by Gene Kim in these three principles, and Gene Kim is a researcher, he was a CTO for Tripwire, he’s been an entrepreneur and tech guy for lots of years, but he’s been researching DevOps and software development for the last 6, 7 years pretty heavily. And he wrote The Phoenix Project, which is kind of a DevOps story disguised as a novel and certainly I would recommend it if you want to get an idea of some things that you could maybe see going right or wrong inside of different companies or different organizations. It’s kind of an easy, fun read. He’s also written The DevOps Handbook and a few other things as well. He postulates that – we’ve got these three principles that should guide us when we build a software pipeline, which goes everywhere from I say, development on the left, I may have QA, I certainly could have architects and project managers and all these people on this side and I have operations, I could have security people, I have my clients obviously.
That whole group of people where we want to think about them holistically, so we want to think about the entire system, we don’t want to optimize development, we’re not trying to just get development stuff done and thrown over the wall, we’re not trying to optimize operations so that security is perfect. We’re trying to optimize the slow of software so that everything moves smoothly to our clients at the speed that makes sense for our organization.
The other thing I want to do is amplify feedback loops. One of the problems I’ve seen throughout my career in development typically is that feedback is difficult. Clients get upset, they complain, doesn’t always get back to developers or doesn’t get back to developers in time if it’s not smoothly, operations people have issues and all you do is complain when developers release code or when they have to deploy it, developers complain and architects don’t listen or you know, all these issues and what we’re trying to do is we want to make sure that we get feedback loops that make sense.
So when QA sees issues with software, they can easily let development know and help them get better and development accepts that, and operations can let people know how they need software built so that they can deploy it and so that it makes sense and you know, clients obviously have their own things. And over time, what we want to do is get better and better, so it’s not just these big feedback loops, but as we become better, we want in-flight smaller feedback loops so is this feature actually being used? Does this query perform well? We want those smaller and smaller feedback loops to get back to different groups as we improve our process.
And the last one is we want a culture of experimentation learning. I’m sure if we were all in a room, and I wish we were, we could talk, we have all had issues with software, right? Building it, deploying it, something else. We have problems, we make mistakes. That happens right now. We don’t necessarily want to prevent mistakes or get so uptight about them but we want to try things, see how they work, do minimal experiments, fail fast and then fix things, we want to learn, we want to get better. One of the things I think is sad is that so often developers, operations, everybody, we realize things don’t work well, we fix them but we don’t adjust our habits.
We don’t get better and better at what we’re trying to do, so we make the same mistakes over and over again. See it constantly in security, we see it constantly in code, and server configurations where we’re not improving a lot, and I think people here in GroupBy and people that go to SQL Saturdays in the past summit, you are way ahead of everybody else because you are starting to think about things and you do make sense where you try to get better over time but so many people in our industry. So often, we might try things, we might break and we get worried, we get upset, we don’t learn, we don’t get better and better and better over time, and so what we really want to do is make lots of small experiments. We want these minimal viable products not as an entire product but you know, maybe that minimal viable thing is just a section of our database. We want to change our indexing, we want to add a few feels for auditing, we want to add some more instrumentation and extended events or some other way to better understand what we’re doing and we want to make small experiments, see how they work then start to maybe improve them, change them, grow them to the rest of our application, the rest of our system.
Donovan Brown works at Microsoft, he’s the DevOps PM, which I’m not even sure what that means really, but I like Donovan, he’s a great guy, I had the opportunity to present with him last year and in all the work that he’s done with DevOps and trying to change Microsoft’s world, he came out with this blog post and he put this definition down, which I think really is one of the best ones I’ve ever seen. And ultimately, the goal here in this quote is that we have to deliver value to our end users; that’s what we’re trying to do with software. We want software to work for the people that are doing it. We don’t build software for the sake of software, we don’t build software so we have perfect elegant code and minimum number of lines or the fewest reads in a query. We have to deliver value to our end users and that’s what’s important.
Now, we can do it with people, process, products, the proportions of it could vary, the choices we make, which people, how we change our process, which platforms, products, tools, et cetera we use, that’s completely up to us. But we have to put those together to make sure we’re delivering value to our end users, and ultimately that’s the goal and that’s what I want to see is that we focus on delivering something to our users that works well.
One of the things I like with this GroupBy, Brent’s trying to change the conference world because we want it to be better. We want to deliver something to people, it doesn’t really matter who’s speaking, how, how are we making money, it’s about let’s talk to people and that was our goal with SQL Saturday, that was our goal with SQL Server Central, that we wanted to deliver something that people could use and in DevOps, what we’re trying to do is do that in the software world. So if that means bigger servers, we scale our, we change platforms, we buy some tools, whatever it is, we have to deliver value. Now, Microsoft has come up with also some practices here, these are kind of ways that you might look at DevOps in practice. We do CICD, we certainly instrument telemetry and stuff. We manage configurations so that we’re sure what servers look like, all this kind of stuff and testing. I always hesitate to ask people how they test their code or do they test their code because most of the time I’m sorely disappointed in how people test their code and it shows, right?
We all see it in the end product that gets to end users that’s delivered, we certainly see the times from Microsoft and other vendors when they deliver to us that we’re like, god, please test your code more. It’s important.
There’s also habits. There’s habits of how we as individuals, we culturally have to work together, we have to be sure that things start to align so that we’re all trying to deliver that value to customers, and these are kind of examples of things that you might do and certainly it’s not just developers, not just operations or that group, it has to go to everybody, right? Managers and project managers have to be trying to deliver value, not just get something done or not just deploy something. We have to kind of make sure that we are actually delivering value throughout the organization, and certainly this requires buy in from management.
I know DevOps is a hot topic now and lots of managers want DevOps and they have no idea what they’re trying to do, but ultimately as we try to move to a DevOps culture, we have to change the way we work, the way we manage people, the way we put thing together and I’ll say as I talk with clients and work with people at Redgate, the hardest thing for most people is the culture. The technology stuff is easy, it really is easy; the cultural changes, building trust and acceptance and sharing information and communicating and taking ego out of it, all those things are incredibly difficult.
Today, I really want to talk in practices – the first three things are the technical side. We want to do automation, I want to show you some automation, certainly testing is a part of this as well. On instrument things and there’s all sorts of ways to instrument, I don’t really want to get too detailed on that because that’s a whole other talk and that just goes you know, sideways. But those last three items are the important things. We have to communicate transparency and we have to share information and we can’t try to make my job better than your job or more important. We have to ensure that everything works well.
So let me talk briefly about software development here. Some of you are application developers, those of you who are raise your hands, Brent can count them, I’m not going to. But you know, as we’re developers on the left side of this, developers typically work in version control. We’re in Visual Studio or Eclipse or you know, IntelliJ, whatever it is and we write code, we store it, commit it to version control, we pull it back, we share code and then the more advanced developers, and I wish this was most of the world but I don’t think it is, should be doing continuous integration. They all actually commit their code and have a system automatically build and test it.
Now, continuous integration grew out of this idea that a long time ago, back in the 60s when IBM was building system 360 and Vax was building their systems, we realized that if we took really more than like three people and we put them in a room and you know, had them write code for a month or two months or three months and then we tried to put it together, it never works. It just never works.
Some of you have probably read The Mythical Man-Month and if you haven’t you should, because that really highlights that we can’t build software better when we’re working separately and not coordinated. So some companies in the 80s and 90s started to get a daily build where they built their software every day because there’s only so much damage each of you as a developer can do in a day. And so at the end of the day, we compile our code, if it doesn’t work we fix it and, you know, nobody goes home until it’s fixed.
Well, as hardware has become more powerful way of virtualization and containers and all these changes, people came up with the idea of why don’t we build on every commit and I think ThoughtWorks and Martin Fowler were some of the first to do this, where they – it said you know, every time we commit to version control, let’s pull all the source code out. Let’s compile it, let’s run automated tests against it because we can’t trust developers to run tests and then let’s let the developers know if there’s a problem and if not, let’s package that as a potential release candidate. Let’s publish out an artifact that is an XC or DLL or Jar file or whatever that we can potentially send out to production. So the ideas, we’ve introduced some automation and some consistency in our software process. And then what we want to do is we want to use that artifact and we want to put it into a release management system so that we deploy it in the same way every time, that if QA pulls a build for a software. Whether that’s web software or backend software or client software, they get it the same way that we’ll deploy it to beta test or the same way we’ll deploy it to production. We have everything working consistently, and that’s ultimately proven to be one of these ways that companies can move faster and product better software, and Microsoft does this. Regardless of how you feel about SQL Server, it’s stunning, they’re going to release a new version in about a year, June 1 was SQL Server 2016, so we’ve got a few weeks, they could actually hit a year. But the reason that they can do that is that they are consistently building and testing hundreds of thousands of tests on that platform, and deploying this out in a smooth fashion so that they can actually deliver software faster.
And they started this back in 2000 where they realized that problems with patching and they started to do regular patching and certainly Windows does this, lots of companies are starting to do this but it’s having a smooth pipeline means that I can build and deliver software better and faster. So what about database people? Now, typically, we have databases we work with, maybe we each have our individual one, maybe we share database, hopefully we communicate with other developers. I think the worst thing I can see is database developers kind of working independently and not talk to other people about the changes that they’re making because ultimately the database is kind of that central point where we all meet. Often applications or systems meet, so we have to communicate. But we typically haven’t had version control, I mean, I’ve worked with Oracle developers and DB2 developers and MySQL developers and SQL Server developers and so few of them have version control, and they certainly don’t have any continuous integration, they don’t publish any packages out. But what they do is at some point, everything is in some shared database, which could be development or could be QA or something else, and then the DBA is responsible for figuring out how to deploy that to test systems, and then they have to refigure out how to deploy it to user acceptance or beta systems and maybe they refigure and they deploy it to staging before they deploy it, and maybe they had to fix that one little thing you know, and make this work a little better. And then they can do scripts or they alter their scripts and they deploy it to production and it’s inconsistent, and many of you are very good at this. I don’t mean to disparage you and say that this doesn’t work because most of my career I’ve done things like this and I’ve had a lot of successful deployments.
I get stuff done, we deploy software, it’s just not the greatest thing where we make mistakes you realize I forgot this script, we forgot an object, we didn’t do this thing, we didn’t set permissions, any number of things happen when we’re dealing with databases and those problems lead us to be nervous about deployments. Most companies I’ve work in, most database developers I’ve worked with, they try to limit how often they deploy to the database because it’s stressful. If it goes down, it causes problems, everybody is upset, typically we’re bundling in more objects, more changes because we’re waited a while and then that makes it more complex and more risky to deploy that. It doesn’t work so then the next time we try to be more careful, we try to put more process and paperwork on it and we end up delaying our deployments more and creating more and more risk all the time. We don’t want to do that.
So what we really want to do is we want to evolve database deployment to be more like application developer because the application developers have gotten better and better and better at building software over the years. Still make mistakes, but I’m stunned that all the apps that I have on this thing here, they update regularly, not all of them but some of them, across all these apps, some of them will update and they get pushed out they relatively rarely have bugs. I remember deploying client-server apps and treat their apps and they constantly had bugs. So we’ve gotten better and we want to do that in the database as well. We want to get better.
Now, I’m going to start with version control, so I really start from the left and move over, so I’m going to go from developers to test operations, pushing the code that way. But ultimately we build this process the other way. We’ve got to understand what do I need to actually deploy software? How do I get these changes into my database? So I’ve got to think about that process and then that drives what I need to output from my CI system or from an intermediate system.
I’ve got to know how do I put that together so that I can deploy it and then of course, knowing how I’m going to put that together will drive how I store things in version control. Now, most of us who try this will start with version control, we’ll realize it doesn’t quite work, we’ll change something, we’ll get that working, we go to deploy it, realize it doesn’t work, we have to then go back and forth and that’s totally fine. The idea is we just want to keep getting better and better every time. So we want to use version control and I’m going to start there, and ultimately that means I’ve got to get my code into version control and that’s challenging because unlike an application, right, if I have a c# or a Java class, I’ve got the class, I’ve got the properties, I’ve got the various methods in there and they’re all in one file. And I have multiple files here that I store, but in a database, everything is separate and it’s on the platform. If I’m in Oracle or in SQL Server or MySQL, the table code is kind of one way and it’s really metadata stored somewhere and my views and stored procedures are code somewhere else so I’ve got to manage this together.
I’ll talk SQL Server because that’s where we are but the ideas are simple in other platforms – sorry, similar in other platforms as well. I need some way to script and gather any DDL that I have out there and potentially DML as well because part of our deployments, part of our software can involve DML, so we’ll talk about that. On SQL Server, I can use DSMO interfaces and do scripting, right click, generate scripts, connect with PowerShell or something else and generate scripts. Certainly that works, there’s a few third party tools out there, Redgate makes one of them, I’m going to show that today but ultimately all of these tools or the scripting really boils down to it’s doing a file save and a file open, so I want to file save all of my object code into a file that I can put into version control because version control deals with files, not with database platforms. And the same thing, I could certainly pull code from a version control system, do a file open and edit it rather than right clicking something in Management Studio and saying modify.
I could certainly just file save and file open all my code and I could teach any of you to do that in about three months, and I’ve done it before. I can take developers, database developers, app developers, whoever, operations people. And it takes about three months to build a habit that I always file save and file open my code. But that’s ultimately what we’re trying to do, and however you want to make that happen is up to you. Like I said, I’m going to show one way but you do whatever you want.
The only thing you have to think about is how do I organize this in my version control system. Now, we have some clients and customers at Redgate that use separate repost for their application or database and some use the same rebuild and it doesn’t matter ultimately, okay? It doesn’t matter what you want to choose to do. Some people want application changes and database changes to commit together so that they know this is a deployment unit that I’m going to send out, some people want separate – I prefer separate repost because I want the database to be a first class citizen. Because if I’ve got a web app connected to my database, it’s not going to be long before I have reports or have Excel or have some ETL extract, something that’s going to be there as well, which means I’m going to have two applications on one database and rather than just get confused. I like to have a separate rebuild, but you can make your own choice, whatever works for you. The other thing is don’t put all your files in one folder, I know if you scripted a SQL Server, they’ll all be in one folder. The database of any size gets incredibly confusing and cumbersome, so I just separate them out. Whether you use SSDT in Visual Studio, you use the Redgate system, something else, usually we separate things out to tables, we use stored procedures, whatever, and when I’ve worked manually and built my own methods of scripting things out and building packages and deploying them, I did that as well. I have table code, I have view code, I have stored procedure code, all that stuff.
The last thing is, use pull request. Database changes can be disruptive, they’re inherently more risky than application changes, so please don’t develop in your main branch, and I’ll show you that today. Have a branch where you do some development and you test things and you try them and when you think things are okay, use a pull request to merge things back together or the equivalent process in your system. I’ll use Git today but you know, even if you’re in SourceSafe, you can essentially create your own pull request and have people move code to a different branch and you know, manage it that way. Alright, so let’s look at some demo stuff. We’re doing alright Brent, no questions?
Brent Ozar: Yes, totally doing good.
Steve Jones: Alright, so let me orient you a little bit here on what I want to try to do so we have some idea. I’ve got this database here that I’m going to do development in and apparently I can’t click my control key, but this is essentially my development database so this could be on my laptop, my work station, could be on a shared server, whatever it is. But this is where I’m doing development, and my recommendation is typically every developer has their own spot they can do this so that they don’t mess with anybody else. It’s driven me crazy in the past when somebody changed the table and my queries break and it’s disruptive and annoying.
So I’m going to make changes here and then what I want to do is I’ve got a – I’ve got this integration database and what I want to have happen is once we pass all of our automation testing, I want to push those changes into this integration database, so typically if Grant and myself and Arnie and other people break it, we’re all developing in our own databases. At some point we’ll want to put all that code together so that we can see it with the application and make sure that we haven’t fundamentally broken anything in development, and so typically we have some sort of integration database that has all of our code together. So this is the second place we’ll put stuff and then I’ve got a QA database. Boy, this Vaio keyboard does not – I think I broke one of the control keys.
Brent Ozar: It’s the poor musician that blames his instrument.
Steve Jones: Exactly. So the third place is we want to send stuff to QA, right, where humans can look at it because there are certain things that you can’t write tests for easily and it’s good for humans to do it, so we want a different place that you know, we can just deploy database code to and somebody can then look at the application or whatever they want to do.
So those are the three things, and the other thing I want to point out is this is on 2014 that I’m developing, but I’m using generic codes so it doesn’t matter.
Once I’ve deployed it there, I’ve actually got down here, a 2016 instance. This is my staging place and this is where I want to perform one final deployment test before I go to production, so I’m going to take this code that was developed in one platform and we’re going to shoot it over to this other platform automatically. I got no sleeves on today so I’m not going to change that code, it’s going to go there automatically. And then if the gods are with me, we’re actually going to deploy that code out to Azure, to database.windows.net, which roughly 2016 and a half right now I think. 2017.8 whatever, however you want to view the world. But we want to send this stuff out there and actually – I went too soon, this database is where we’ll shoot this stuff, this demo database. So that’s where we are, so let’s start here. So I’ve got this development database, I’m on my local machine here, let me create some table here, that’s good and let me do some stuff – actually, let’s change this stuff here.
So I’ve got this table of country codes, it’s got all the country codes, typical kind of lookup reference data stuff, let’s add a value here and let’s call this – since I have Brent hosting, POR, that’s probably not a country right?
Brent Ozar: 911.
Steve Jones: 911’s a good code. That’s a good one. So live and in color we’re doing this and then a lot of times when I’m testing things, I actually have a client there who wants to – they went to weekly deployments to the database, so they went from literally like every two to three months to weekly deployments and they don’t always deploy stuff, they don’t always have database changes but the DBA really wanted to make sure that they could just deploy something because it builds confidence and people can see things are changing and that way. They at least know that the process still works, so one of the things we talked about is just deploying – make some complex code but something that’s not too crazy that’s not going to affect the application. So a lot of times what I do is I make these really interesting stored procedures that let me just test if stuff is working. Is my deployment working? So if I run this, this should work, and let’s change this to say country code is equal to 911. So that exists there, so that kind of stuff exists there and let’s just change this to – sorry, I know I should type in demos but I want to.
Brent Ozar: It works.
Steve Jones: It works. So I’ve got this code here and if I jump over here, here’s my integration database and let’s execute some stuff here. I don’t have that row and obviously that stored procedure doesn’t exist and in my QA system that doesn’t exist either, got all kinds of issues in production, production’s always red right, that doesn’t exist there.
So I haven’t deployed any of this stuff out there, no magic here, I’m not going to do anything, everything is going to happen in front of your eyes. So let’s commit our code to version control, so again, this is a Redgate tool that just plugs in to Visual Studio, but all it’s doing is it’s grabbing my code, so in my country codes, I’ve added this new row, we have Porchland here, my incredibly impressive stored procedures here, my ratings tables here. All we’re doing is just taking this code that you see on the right hand side, I don’t know, whichever way you’re looking at it, on the one side, and we’re going to put it in version control. We’re just going to put in into a file. So the comments that you write in version control are there because next week when I look at this, I’ll have some idea of what I did. I have these changes that I’ve committed in there and all I’m doing is sticking them into a file and they actually exist – let’s come here and do something … So I’ve got this stuff here and one of the things I do is I created this ratings table, it’s right there. So we just did that, that’s now – it’s the 28th of April, wherever we are, I don’t know, it’s 10:38am in the mountain time, that’s just the file we just created.
That’s all we did, and we did it in our auditing branch – let’s come over here, auditing branch is there, clicking on the wrong thing – and I’m going to push this stuff out. So I’m going to push this out to a remote repository, version control system, I’m using Git and this stuff actually is going to exist out here in my Git system, and I’m running on Visual Studio team services at visualstudio.com, here is my URL, which none of you will get access to but that’s alright.
So I’ve just committed a minute ago, I’ve got my stored procedure, this amazing change that I made right here, my Git 7, my table, my data changes, all this stuff is there. You’ll notice that I just realized I committed a branch change, do I want to create a pull request, and I actually do. Now, I should test this, I should do some stuff but you know – it’s live, we’re trying to get through this in an hour so we do have clients that do this exact process, some of them use GitHub, some of them use Bitbucket, some of them have internal Git, I don’t know if I have anybody personally using Visual Studio team services but Visual Studio team services is fantastic.
Once they get to TFS2015 and better, update two and better, this is the first time I’ve seen a Microsoft development tool I thought was excellent. Most of the time I think they’re – this is fantastic. So like I said, we have clients who do this, they create pull request, and what they typically do is then they have – a bunch of them have two people review the code, so they would come in here and they would look at this code and they would say, alright, I can approve it. Since I’m in charge here and I’m just going to go and approve it and I’m completing it, you’ll notice this is related to a work item I’ve got here and stuff, I could delete to branch, I could do all the crazy stuff that I want to do, but I’m not going any of that stuff. So I’ve created this and I’ve actually merged this into master, so if I were to switch to master we would actually see that I’ve dropped this code there and it’s available for everybody to see now.
Those of you who are application developers might be familiar with this, you know, create a branch, do some stuff, test it, get somebody to review it, then merge it back somewhere else. Typical flow in the development world. Alright, so we’ve gotten our code there, any questions so far?
Brent Ozar: Yes, it’s stuff that’s not necessarily related. There’s – for example, you can only do so much damage in a day sounds like a challenge, but Scott says, “Does Redgate recommend putting your agent jobs instance configurations data like when you make sp_configure changes as part of a deployment? Would you also include that as part of your source control?”
Steve Jones: Yes, so from the Redgate perspective, we don’t really work outside the database. Everything we do is self-contained in the database and I wish we did more. The answer for me as a software developer is yes, put all that stuff in there. So as an operations guy, when I used to manage in a finance area, it was really important that stuff worked. You know, if we changed – so back in – we went from 65 to 7 in 2000 so in the 65 world, if we grew a database, it was ultra critical that we knew what the additional file and size that we added matter because I had to recreate that on restores, so we put that stuff in version control. It’s like dude, that database, I have to know every single change that ever happens, and if we do something like we’re going to change parallelism or we’re going to change query time outs, nobody ever gets to change that in the
GUI. As a matter of fact, if I were to come over here and say on my server, that I want to do something crazy – well, let’s do something easy then I won’t break hopefully stuff – like I wanted to come over here and I wanted to go I’m going to change this to 11, which is – if you guys want to do this and try it in your production servers, BrentOzar.com is the place you’re going to call for help. All performance goes down, they’ll fix that for you, but this button right here, again, it takes about three months. I taught all my DBAs, if you press this button, you’re fucking fired, you’re gone. You get to press this button right here.
Brent Ozar: It teaches you so much.
Steve Jones: Yes, you press this button and then you go okay, what exactly am I doing? And then the first thing you do is you come up here and you press this button right here, because when you screw things up and stuff stops working, I want to know what you did. Believe me, it takes a few months of me screaming at people for them to understand that I’m serious about this, that you log everything that you do in the world and part of DevOps is instrumentation and telemetry to understand exactly what happened at any point in time. Not so that we blame people right, if you make this change, we can talk about why it’s a good or bad idea. But if you make this change without telling anybody, without logging it, without anybody notifying it, that is the problem, and that’s what I try to do.
So anything that’s a change, if I do agent jobs, if I do like standard event sessions, if I change queue addresses, you know, if I’m doing containment, if I’m – parallelism, adding file groups, all that stuff gets logged in version control. Always, always, always.
Now, think about how you’re going to deploy that and I might recommend that the Ops people have a separate pipeline for non-database changes they want to deploy. Mostly so we’re just aware that A, they at least went to one other environment so we can test that they work, whether or not they have a performance impact, just the fact that running the code works, and then we have a log that says we deployed this production on this day and this is what ran. So that’s my answer there.
Brent Ozar: Perfect. There’s a few others but they’re not quite related so I’m going to hold on to them until the end.
Steve Jones: Alright, so let’s keep going. So let’s talk continuous integration now. Now, the idea here is that we want to compile our software, we want to test it, we want to publish out some package and we want to let developers know if something’s broken. Now, there’s many ways to do this, but we have things called CI servers or build servers, and these are server processes that are designed to build software. That’s really it, they’re like fancy SQL agents that – they just execute steps in order and they build software. So to understand how to run Ant or Nant or Maven or Make or anything else to build software, so you can use whatever you want. Bamboo, TeamCity, CruiseControl, Hudson, Jenkins, TFS, whatever you want.
I don’t recommend one over the other, I’ll say that I’ve used TeamCity quite a bit, I’ve used Jenkins and I’ve used TFS and I think Jenkins is just – it’s painful. It works perfectly, tons of people use it, it’s free, it’s just painful for me to use. TeamCity I really like, and I like Visual Studio team services or TFS. Once you get to really 2017, fantastic. So pick one of those. And the process of compiling in code right for databases is we have to get all the code from our version control system and then execute on the database and execute it in order. So I can’t build indexes before I build tables. I can’t typically alter tables after I change a view to add to column or have something in stored procedure, so there’s an order to how I execute database code.
For a particular application, for example, if you have one database and one application, this is trivial code to write. I mean, honestly, you could just say run table one, run foreign key two, run stored procedure seven, that’s trivial code. To do it for multiple applications and kind of generically, it’s more complex but there’s tons of tools out there that will help you do this stuff. So you know, that’s what you have to do; however you want to do it.
I’ll show you, Redgate has a tool that does this for me, but I’m not advocating that. I’d love you to buy Redgate products but there’s plenty of ways you can do this. To test your software, we’ll talk about that in just a second but you got to test your software. Please, please, please test your software, and then you have to publish out some package that’s potentially an upgrade script.
Now, in the database world, we don’t get an XE or a DLL in assembly or Jar file or module or anything else, we don’t get that thing because when we compile code. It exists on the server platform and we can’t extract that binary code out. We just can’t do that, so the way we publish the script in the database world typically is I want to grab the state of all my code in version control because that’s what my database has to look like. And then I have to figure out how I execute that in the future, and there’s a couple ways to do it but we’ll talk about that when we get to release. So that’s what we want to do here.
Now, for testing, if you’ve ever had somebody send you a bug report and say that something doesn’t work, you should be testing your code. Even if – I’m not advocating you sit here and start a project write test for all your code, please don’t do that, that’s silly, but any time anybody tells you something doesn’t work, you’ve got a foreign key, you don’t have the default, you don’t have anything else, write a test. It’s that simple, just start writing tests for the things that you don’t do well and your users and your QA, they will tell you what you don’t do well. So write those tests. Again, there’s plenty of frameworks to do that, some of these frameworks are built for databases and so they make things easier to do in a database world.
I’m actually using T-SQLT here, this is an open source – I think it’s open source, it’s certainly an open free project, but you can use DB unit, T-SQL unit or a PL unit or you know, there’s any number of ways. You can write your own test, the only thing is they have to be automated. I have to be able to run one command or push a button and it runs all my tests. That’s the only thing, because I know a lot of you want to go I write a query, I insert data in table and I write another query and I’ve tested my code, and that’s not a good test because it’s not repeatable, it’s not reliable and you’ll forget. Use a framework of some sort, X unit, DB unit, Microsoft, MS test, I don’t care, pick something and use it.
Now, we have plenty of customers that use curated data sets and they have fantastic results. I’ve got – we’ve got a customer that deploys code to multiple countries and before they started to do a process like this and before they started to have curated data, they constantly had issues. Stuff would work in English and it wouldn’t work in Italian. And what they went through is they created a curated data set in all their localized languages and for them it was really go to production and take ten rows from every table. I mean, list the countries, take them all, list of postal codes. But for transactional data, take ten rows that represent your application. Everybody probably knows – I’ve worked with accountants and they can go, there’s these seven clients that they have all the weird accounting stuff we have to go in the GL, so if your software handles these guys, great.
And then what you do is you realize, well, there’s really an eighth customer, we should add those rows of data, and it’s a project. I mean, it’s something you do as part of development, it’s maintenance work but it really does help. We have customers that write between four and ten times the amount of test code as they do database code and that includes maintenance, but they almost never have deployment issues and they almost never have bugs reported by clients because those tests define the requirements the clients ask for.
Alright enough, we could talk testing all day but let’s do it. So let’s talk about continuous integration. So one of the things that happen you know, I deployed that code and somewhere I have an integration thing here, we ran this a little while ago, right Brent? We ran it well and this code wasn’t there but now it is because I have continuous integration running in the background, now it’s not in my QA area, wherever QA is, because that’s still broken, but it certainly got deployed to my integration environment. So how did that happen? Well, what happened is in the background, is I had continuous integration – those are big words for me, I had continuous integration process running. And so I’m actually going to kick that off again so we can see it run and again, we’re going to use – if you don’t understand this code, message me later on Twitter and we’ll explain it to you. But we’ll make another change, I did that there, so let’s commit this change, did I change databases, I did [mumbles] – so I’ve done the stored procedure, I just added it in there, I can alter stuff, do whatever if you want to see that but we’ll do that. We’ll push this code out and again, we’ll pop back over here and we’ll do pull request because one just created. Nice thing about TFS is it kind of thinks ahead for me, some ways tells me the bad thing about Visual Studio team services is that they change stuff. So like this stuff on here moves every three or four months and so if you’re not in here regularly, it gets confusing. So I just completed this pull request, I just merged this into production in the master and actually what I have is I have a build actually running here, so we’ll see something happen in a split second here. If not, I’ll come over here and look.
So I’ve got these builds set up and you can see in fact, one is in progress here but I’ve had numerous others ones that have been running in my system, and if I were to look at this, I could actually see this running. So I have an agent process running that I could have on any server in my data center lab or whatever, and the agent would – it’s downloading the code from Visual Studio team services and it’s performing the build and the tests and it’s going to then – it sends results back up here and so this stuff that’s happening here is exactly what I’m seeing happening on the web.
So I could have this happening on one of my other servers here and running, but in this case, I’ve just got it running all on this one machine because it’s just slightly simpler for me to demo it and show it all happening. So this is actually running, and if I look back at my builds, you’ll see I’ve had some good success lately, but these are just steps in assembling software. So I go get my code from repository, I’m getting it from the master branch. I could get it from release branch, Dev branch, whatever. I build it, and this is that simple thing that I said. I execute my dot SQL code in order. My tables before my stored procedures, maybe some functions have to come first, then views later, other stored procedures later, indexes, I just do it in order.
In this case, I’m building on my local system here, I’m going to run tests here. My testing step is actually fairly simple, so here’s my original database, let’s just create a new window. This is actually what my testing step does, I have a lot of configurations I do this and I’m running tests and in this case I actually have a test failure. Hopefully you all know that tables have primary keys, they should all have primary key, always, always, always, always, always, always, unless you can prove to me they shouldn’t, but in general always. Then I’m going to sync my integration database, that’s right here. So what I’m doing is I’m just going to upgrade it with all the code, if I’ve passed test, then I’m going to publish some artifacts out. And so when I look at my builds, I can see what’s going on, for 446 just succeeded, and when I look here I get all the telemetry, right, I see all the steps ran correctly, I can see what’s going on. In this case – maybe I didn’t have a test here that’s failing on my Dev database – but I’ve got all my tests here, and I can see what’s going on, so I can follow through as I look through all my builds and see what’s been happening. Did something break?
And typically if something breaks, this is where I want to message my developers or message one developer and say, hey, something’s broken, go fix it, because part of the idea in continuous integration is I want my main branch of development always releasable because if we find a bug, got to be able to fix it today. So I can’t have like 12 things that are half done from other developers in there. This is why I work in branches, but you know, I have the ability to – if I find something broken, I either got to fix it immediately, and one philosophy is everybody goes and helps fix that bug right away, or I go back and I revert that commit and I get the old code in there so things are working and then I test it further. So in this case, I’ve got this running, I’ve got my build running, it succeeded. So I’ve actually upgraded that database and so if I were to come to my integration database, and let’s copy this code because that’s always better, I should have another stored procedure in there. That’s automatically deployed there.
Alright, so I’ve got networking, I didn’t really implement it, it was already done but essentially I just went through here and I said give me steps, so if I wanted to add something else, you’ll notice here on the right I’ve got all kinds of tasks, so I could add a .NET core task if I was trying to build this with an application.
I could do just my own batch scripts, some kind of command thing, if wanted to use Make I could be sure they’ll be using Make or Maven I could use Docker, all kinds of things out here are available, and I just add these one at a time. For example, if I had a Shell script, I could just add this over here and it appears here, then I just have to configure it to do whatever I need it to do. I’m going to discard this change because I’m not that brave to do this in a demo, but I certainly could have a Visual Studio build or something else if I wanted to. Alright, any questions?
Brent Ozar: Yes, there’s one – inside the Redgate tools, how does it differentiate the difference between lookup data that you’re adding for like your own personal test versus things that needs to get pushed up into source control? Is there a way to say these changes I don’t want to have as part of my pull request or what goes into source control?
Steve Jones: Okay, so let’s look at that two ways real quick. So let’s talk about generically, so let’s take Redgate out of it because really it’s a generic process. How would you differentiate what stuff needs to be tracked and what doesn’t? So if I look at this list of tables, and I look through here, I have to have some knowledge of my application to say well, articles and blogs, these are not look up data. This is transactional data that will be generated in production by the clients and the workload and all that stuff. Auditing data will certainly be stuff generated in production, I want to track that separately. But then I can go you know, these country codes, my postal codes, my list of currencies, this is look up data, so how would I track this stuff and get this from development to production? Well, there’s plenty of ways to do that, I mean certainly I could say well, I know that we did something like this, this is the wrong table, right? This is exactly what you do, you sit there and go well that was wrong. You do this and I’d say this is Porchland and this is 911 and then I’d save this and I’d say update it country codes, 2017. And then what I do is I’d have to then have a script in my process somewhere that says grab this and execute in next environment, and if I change the name then I’d certainly have to go well, really, what I meant was update…
Brent Ozar: If exist.
Steve Jones: Right, and as Brent said, I don’t want – if I was going to do this, or if I had the insert, if I was back here, I don’t want to run this multiple times so you know, I’d have to make this item potent so if not exist select – then do this, so that’s exactly how you have to do this stuff.
Now, in the Redgate tool specifically, what I’ve done is I’ve just linked things as static data, so we have the ability to say if I have a primary key, I’ve got two tables linked as static data and all the tool does is go look at version control, look at the table, decide what data is different and generate the script I need to do that.
Brent Ozar: Perfect.
Steve Jones: Right, that’s what it’s doing but it’s – all the Redgate tools or really any tool you buy or doing this, they’re doing this stuff you would do manually, so think about how you would do it manually and decide, is it something I can write, and do I want to be in the software development world of checking data or do I just buy a tool that does it for me, but if you buy a tool, you better understand how it works because it’s kind of important. Every tool is going to have holes and it will break, you want to know how it works, but that’s how you do in data. Where are we with time? Let me keep going because I’ll run out of time if we do more questions on this stuff.
Brent Ozar: Yes, go for it.
Steve Jones: So let me keep going, so the next part of this is the release side of things. I’ve got code in version control, I’ve tested it, built it, how do I get it to other environments? And really at this point in the database world, we have to generate an upgrade script and this is complex because the SQL language is horribly broken. My alter table code looks fundamentally different from create table code, completely opposite from Java or c#. My class code and my methods and stuff, that always looks the same right, there is no alter class, alter method stuff right, always the same. In SQL, we’re horribly broken, now, we have create alter in 2016 and later, which helps, but it doesn’t help tables and there’s still weirdness and my table isn’t a thing because the index is kind of separate but kind of not and my foreign keys require links to two things and it’s difficult, so you have to think about this.
Now for code, we typically get the latest version from version control somehow, make sure the create alter is there, write an item potent script, even if it exists, whatever I want to do, and execute that and in this case, it’s easier because I can kind of go forward and back if I have to. For tables, I have to – I really need some way to transition the table because I have to maintain state. Philosophically, there are two ways of doing it. I can use comparison technology, which is like SQL compare from Redgate or what SSDT does with Dacpacs and I look at the before and after I figure out what the differences are and I generate a script. Or I can use what’s called a migration base philosophy, which Liquibase, FlywayDB, ReadyRoll from Redgate, rails migrations, any framework migrations. All these things do is they tend to track the scripts as we make them, so if I do an alter able, we track that as script 47 and then if I do a create stored procedure we track that as script 48 and then we execute those in order and we track them and all that stuff,. So those are just – they’re two philosophical ways of doing things. Neither one is better than the other one, they both have good points and bad points, the advantages and disadvantages. I prefer migration frameworks but comparison stuff works great, use whatever you want.
For deploying the code, what we want is just like CI, I want a consistent process and so I want to automate as much as possible because you as humans are not repeatable and reliable, can’t trust you at all. You may do something perfectly 100 times and 101 you’ll screw it up, and that’s the day we have the Black Friday sale and everybody’s upset. So I want a process, I want an automated process, I can have manual steps if I have to but I try to avoid them. And then I constantly improve my process, because I’ll find I may deploy the same way all these times and all of a sudden I need to do something different or I need to change security. I got a client who just went from SQL to Windows security and they had to change all kinds of things in their process and figure those out so they had to improve their process to handle that change.
I say use a release management server because this is a server process that just understands how to deploy software. It understands copies and deploys and log ins and all that kind of nonsense. Again, there’s tons of them out there, use what you want, but we’re just trying to do that. People always ask about rollback, I don’t want to get into it too much but in general, for code you just get the previous version of your code and you can deploy it. Logically it still has to work right, but you can just kind of move code. For tables, you’re going to have to have custom code because some things, like if I’ve done a merge, I might not be able to split data. If I’ve added a column and any amount of time has passed, especially in an online world, I may not be able to just drop that column, I may have to do something different.
In general, my recommendation, I try to dark deploy stuff as much as possible, so I want to deploy it without the application using it or needing it or worrying about it and then turn that on. Even if it’s five minutes later, I want to be able to deploy a database separately, which means architecturally, I just have to think about software a little bit, but you can do it. The other thing is somebody asked a question about what about other data that’s outside the database or stuff that might vary by environment. You can use kind of pre-post scripts, deploy one’s scripts, ultimately there’s no great way to do this stuff. There’s variables in release systems, there’s tokens I can use but really at this point, I’m depending on somebody to type something correctly into a place, a variable or something and then I’m hoping they haven’t typo-ed it.
Now, I can test the process that says in QA, here’s the email server name and did it replace the token and deploy that correctly, and in production I assume that process will be the same, but I’m not actually testing the data piece of that what’s going on, so that’s tricky. So let’s do a release real quick while we’ve got time and hopefully this works and let’s go to releases.
Brent Ozar: You know you’ve got up to 90 minutes if you want it, right?
Steve Jones: Okay, good, so then I don’t have to worry. Alright, so here’s all my releases, so I’ve actually got two pipelines here and you’ll see I’ve got different numbers and environments here and sometimes it works, sometimes it doesn’t, sometimes I skip environments, sometimes I deploy all the way through, it’s up to me what I’m doing. Let me start a deployment here while I’ve got a second. So here I’m picking up build, which this tracks back to these builds – let me see what our latest build was so we can see – so 446 was the build here that we just did, so I’m picking 446, you’ll notice my QA department has an automatic deployment, my other ones don’t, those are manual. So I’ll create this and this will start deploying. As a matter of fact, when I look at this, I should see blue here somewhere in a second and I do, and in fact, it’s going to start deploying here. It’s downloading, once again, it’s going up to Visual Studio team services pulling down code, it’s looking at that package that came out of CI and it’s starting to unpack it and look at it and figure out all the stuff I have to deploy it.
That’s what’s happening here, and the way this works is very similar to a release – sorry, a build process. I’ve got environments so I’ve got multiple environments here, in this case I’ve got two steps in here and I configure them but these are just steps that I could add. So if I wanted to add some steps in here like if I was going to deploy this to an Azure app service, I could do a deploy there, I could do a RM deploy if I wanted to, a resource group deploy, I could do whatever I want really. I could upload to a mobile center, it’s crazy what you can do, and the nice thing about Visual Studio team services and really all the platforms is they’ll deploy to Linux and OSX and do iOS and Android deploys while they’re doing your web deploy to AWS or Azure or whatever, to your own thing on site. I’m deploying these database on site here, that’s what’s actually running back up. And each environment can separate, right, like this one has a restore from production here, I’m just doing a SQL command here, doing a restore, nothing magic here. My Azure one, know how to log in to Azure and do that and I’ve got old steps and new steps here that I’ve – these are disabled and these are enabled and I just configure it, right?
I’ve got variables here so in this case, I think I’ve got variables here so I have a build password and I have a deploy password for two different systems that I’m going to try to deploy to. I could have variables here if I wanted to, so I could override a variable and say here is my deploy password and give it some value, I could make this secret so that it can’t ever be retrieved, so other admins could come in here and use this but they would never know that the value is. It’s amazing what I can do here.
So if we come back to our release, that actually worked, let’s check it before we go anywhere else. So QA, so we did this – right, so this failed last time, we got an error and there was no results here, but if I execute it now, it actually works. Again, you didn’t see me touch this particular database, everything just deployed automatically. And if I were to go look at this, I would see this stuff out there that – well, I don’t want to look there but if I look here, I’ll see this stored procedure.
There’s Git 8 and Git 7, but those don’t exist in my staging database, these don’t exist here and if I went to my staging database still, there it is, so we did – actually just go paste that code, it’s probably easier, so if we execute this right, there’s no results and I get two errors here because that doesn’t exist, because I haven’t deployed it. Let’s deploy it, so here I’ll just say deploy, and the same process that I use essentially is happening here. Now, slightly different steps because I want to do that restore, but that’s going to happen. And as soon as I do that, what I actually get here is I’ve programmed this for an approval.
I’ve got a pre-approval here, which I like to have in staging because I want the DBA or operation person to be aware we’re potentially going to deploy this stuff and they have to go look at this and they get the chance to approve this. I could defer this, I could do whatever I want, but this allows me – I could go look at what this is related to on work items, I could go see the build that this is related to, to see what’s going on here, I could do some performance and due diligence here to understand what’s going on, or I could just let it run, and it’s going to run now. It’s going to run in the background.
I could certainly do things if I wanted to, I could have post-deployment scripts, so in this case, if I look at this, I have an approver here and I’m the only approver but I could add you know, like Stephanie is in the Redgate, I could add Stephanie and approve this, I could set more options so that either of us could approve it, I could have post-deployment approvers.
That’s a great one to say hey, did somebody actually acknowledge they went to look at this deployment and made sure it worked by having a post-deployment approval, and I could have a condition that says you can’t deploy to production until somebody’s actually approved the post-deployment approval. I could send email notifications; I can do all the crazy stuff I want to do. Anything I want to do can happen, and meanwhile, this is all happening in the background. And if we come over here, I probably have code deployed at this staging. Again, I didn’t do anything, restored from a previous production backup and deployed without me having to manually go you know, create procedure and insert into country and all the stuff that I’ve done as a production DBA, because I realized the developer forgot to tell me something, or their package is wrong or the QA guy fixed something in QA because he realized it was broken and it was just quicker to fix it then redeploy again. But now I could just redeploy again, right, so I could just say – let me go back to releases. I could come back here and I always have the option to redeploy stuff, so I could say, hey rebuild a package or just redeploy it because the QA guy screwed everything up, they could just go redeploy it off the code on top of it again, reset my environment and let’s try it again. I have the option to do that.
And of course I don’t have an approval here because it just worked in staging, I’m just going to shoot it out to Azure and cross my fingers, and hopefully that’s running in the background. It is, and I feel it here you know, again, I could be looking at this as it’s running and see what’s going on, but the idea is that I have an understanding of what’s going on, and I have some consistency and repeatability in my process so that I have the ability to understand what we’ve actually deployed and when and how and when something breaks, we can go figure out. It’s easier to troubleshoot, it’s easier to look back and decide what to do, and I have complete control.
We have a customer using this stuff on Redgate that deploys once a year because they do HR software, so they deploy once a year when all the regulations change. Now, they have bugs, they may have to deploy three or four times in a week, because they realized that they didn’t do something correctly, but they use this process you know, all throughout October and November and December to deploy the test systems over and over and over again and people look at it and try to decide it’s working but clients never test completely so there’s always like one thing that’s broken.
So when they deploy on January 1st or 2nd or 3rd or whatever day it is, and they realize there’s a bug, they’ve got that whole process, so they can go to development real quick, let’s implement the fix, let’s deploy it, back out to QA, have somebody look at it, did we fix the thing that’s not in production and then push it to production right away. Right that day, next day, whatever works for you. Let’s see, we’re finishing our release, so let’s go – where’s my red production screen? Somewhere…
Brent Ozar: Lots of people saying in Slack that they love the Octopus, they’ve been using Octopus for deployment and their huge fans.
Steve Jones: Yes, Octopus is a fantastic – I mean, I really liked Octopus, the thing with Octopus I just have to run a server and Visual Studio team services I don’t. I mean, the nice thing is here, like because I travel and I do things remotely, like I just connect and I just say run agent and it just connects in to Visual Studio and I’m running stuff.
I don’t have a server process here, I’ve got this really tiny agent process that’s running here. But Octopus is fantastic and as a matter of fact, if I wanted to, in the market place. I can never spell Octopus, but I can add this to my account, I can say I’m going to install the Octopus to deploy integration and this essentially allows me to configure a tentacle on this agent that will connect to the Octopus server inside my local system and run the package there and shoot it out there and tell Octopus to deploy to do something there. When I look at the market place, there is freaking everything in the world here. I can send to AWS if I want, I can send messages to Slack, we have people do this like when deployments happen to QA, it goes to the QA Slack channel so that everybody knows QA was updated and when it goes to production we have an update here. There’s HockeyApp if you’re mobile guy or gal, there’s everything in the world on here, it’s absolutely fantastic and you know, Octopus is great too. Octopus has millions of plug ins and it works fantastic. I’m not saying use this, use something though. Use Octopus, use this, just use something that allows you to automate what’s going on and tell what’s going on.
So I’ve gotten this code deployed out here, and the only place I ever changed code was on this development system here, on 2014. We deployed it at 2016 into Azure without having to run code manually anywhere. So we’ll wrap up and I’ll take questions, happy to take questions, but really the idea is these principles in DevOps of holistically thinking about everything, right, don’t think about development, don’t think about ops, think about the whole thing from architects to clients and optimize the process, the flow.
Some things may be harder, some easier but optimize the flow, use feedback loops so you understand where things are getting better and worse, and then try things. Try small things, get better, write better code, do better testing, build a better process over time. The technical side of this in most places in a week we could set it up but it would be really hard for your culture to understand. You could have to start to trust and start to figure out what all these inherent steps that you just do that you don’t think about documenting or you don’t realize how many places somebody just fixes code for you. You learn and get better, and ultimately, I want to treat the database as much like an application as I can.
In the SQL world, our language is broken, it’s hard, it’s goofy right, it’s just there’s difficulties here but as much as possible I want to treat it there and I want to work around those issues like alter tables and splits and mergers and renames and all that kind of stuff. And then I’ve worked in a number of environments where we were socks bound or iso bound or something else. Ultimately, everything that you do that boils down to any process, Six Sigma, whatever it is, it comes down to listen, do more of what works, do less of what doesn’t. And understand and document what you’re doing right, understand what works, understand what doesn’t and why and just get better at that, and that’s really the philosophy here. And so you know, if you’re a SQL Server person or that you want to write your own code, you want to use Redgate, you want to use Microsoft, you want to do whatever, build a process that just gets better over time and once you get this working and you can just deploy stuff, it’s fantastic.
I did this stuff manually with Visual SourceSafe and SQL 2000 and you know, we did that and when we started working, we were deploying every Wednesday. We deployed code every Wednesday for 18 months because we were a startup. We had to go and we started, me and a lead developer and a couple of other developers would show up in the office Wednesday night at eight o clock and we would put stuff together and I would run database code and they would run application code and we would test it and we spend two or three hours every week doing this and after about six months we got better, we learned to automate it. All our own automation, but eventually it was – you know, I called the lead developer up on Wednesday night, sitting at home and I go and I run one command it deploys all my database code. He runs a command, deploys all his app code, 30 minutes later we’re like, you good, you good, we’re done, we walk away, go back to our lives. You can do this stuff, I’m not saying you have to use any particular tool or method but just do something. So there’s my contact information, there’s some Redgate information but there’s certainly plenty of other ways to do this stuff and then I’ll take questions.
Brent Ozar: Wonderful, so let’s see here, Teddy asks, “We maintain like Dev test stage production, is there a canononical way or canonical way to structure our databases like when we think DevOps, what are the normal stages that people go through?”
Steve Jones: So there’s no normal thing because I’ve seen – well, I’ve seen people do it in as few as two environments right, share Dev production, of course I see people in production too, that’s one way I would say don’t do it. I like to say at least three because however I’m going to do database development, I want to deploy it at least once and test at least once separately from production just to iron out any issues. Ideally what I like to have is five, so I want to have developers with their individuals, so this is one per developer, an integrated place where development can look at all the code and we can run our basic performance tests or we can just see everything together with the app people. Then I want a QA place for sure, that somebody outside of development, whether that’s clients or dedicated QA or whatever it is can just see stuff separate from where developers might be mucking around, because at each stage, I’m moving a little bit slower and so that gives me just a tiny bit of stability.
I want a staging environment. Now, I may have customers that can look at this or something else, but ultimately I want to be able to restore from production, which means this may have to be a secure environment that very few people can see, but I just want to be able to restore from production and deploy it to make sure it works. I don’t have to restore like restore database, I could just restore schema but I just have to make sure my code works because we always have drift in production. We have hot fixes, we have things that have to be fixed today and I can’t run it through the pipeline for whatever reason. Now if somebody screams at me to change the spelling for Porchland, I got a change that right now. I can’t tell them it’s going to take me three minutes to run it through a pipeline, it’s unacceptable. So ultimately I want to be able to restore so that if there’s any weird things that are happening in production, I catch it right there and then in production, so I like to have five environments. I know some people go larger because they may have dedicated perf environment where they literally do restore production. Sometimes they increase the size of production like double it or something and then they just hammer that with the workload and they want a dedicated environment. Some people do a training environment, so they deploy to a training environment, some people deploy to a DR at the same time, I would say at least three, I prefer five and then you can go crazy. Whatever your budget is, go crazy.
Brent Ozar: Which is one of the things I love about the Cloud is that you can stand those things up, just to go and get as many as you want for short periods of time.
Steve Jones: Exactly, now in the Cloud, the one thing to be careful database DevOps stuff is that if you’ve ever tried to like create a database in Azure, it could go like – probably like that, not super fast but seconds, it could take minutes or tens of minutes, so you may need some time outs for things. You have to be careful.
Brent Ozar: C2 says, “How about changing column order? Is it easy enough to – using stuff like the Redgate tools or anybody’s tools, can I change column order or should I just always add new columns to the end of a table?”
Steve Jones: Okay, so that gets into a database design philosophy and you know, I’d be curious like maybe like Karen Lopez might say but I think if you’re worried about column order you’re a moron. I mean, I just – sorry. It’s – you’re getting caught up in the detail that’s irrelevant to the world. You shouldn’t care about the physical order of columns. If you ever do anything like I’m addressing columns in a certain order or something, you’re fundamentally misusing the database there. Now, if you care about columns for report writers or something, make a view. Not terribly difficult right, but for crud applications for stuff, you should never worry about column order. If you ever address columns as bracket one, bracket two, bracket three, you are digging a hole that you will never get out of.
Brent Ozar: Joe has a point that I’ll give him in Slack and I hadn’t thought of this, which is replication, so it can matter deterministically if you’re creating objects on either side in replication, but I just don’t…
Steve Jones: That’s disorganization right, so you’ve got to have control over your application environment in some sense. Now, replication has all kinds of issues, it’s – to me, that’s like the huge failing in the SQL Server platform because it’s so brittle. Phenomenal technology, like I love replication but it’s just too brittle in SQL Server. And so I’m an add guy in general, like we add things and then we clean up later, so there are changes that take place across time. There’s a domain of problems like a column spilt or column merge or table split, table merge, horizontal partition. All these problems that really should take place across time because you don’t want to split a column one into two, right like address into street name and street number and delete the address because I guarantee you there’s eight places in your code you forgot you talked to that column.
So what you do is you add – I have a trigger that syncs things right, like one or two triggers and then across days, weeks. I figure out where things are broken, eventually I turn that off and I delete it. And the same thing with replication. Like replication is you’ve got to have slow change and you’ve got to be careful, but this is a great way for application because what do people do when they deploy replication a lot of times right, pause things. People just go you know what, I’m scrapping an application and we’re going to script it and set it up when we’re done because it’s easier and so the way to do that is to build that into your process so that it always works the way that works in your environment.
Brent Ozar: So the great question in Slack, the one other place where it matters is if I have developers who do insert table and they don’t name the fields and then select…
Steve Jones: This is how you fix that right…
No, there’s another – I like Jorge Segarra, SQLChicken had a great way of fixing that, he said, “Add a computed field with one divided by zero or something like that in it so then they can’t do a select start, they can’t do inserts or…
Steve Jones: Yes, [crosstalk] I mean to me, that’s a technical debt issue that you don’t want to have because that prevents you from making changes around, so if you write insert table select, then I don’t deploy your code. Then we can go and have the argument in front of somebody else about why this is just fundamentally bad and if you can’t figure out how to use SQL prompts or to drag the columns from the left side of Management Studio into the column window and store that code, then maybe you’re not cut out to be a database developer. Sometimes I think you got to just – you got to draw lines because you’re just making bad decisions.
Brent Ozar: SQL Gene asks, he says – we discussed this in Slack but I think it should be on the recording – “How do you write good unit tests for SQL?” and he had never seen T-SQLT – he said, “Good unit testing should avoid side effects and state mutation, how do I unit test things like stored procedures?”
Steve Jones: Yes, so it’s hard I mean testing it is hard and the thing you’ll do is you’ll write tests that are good and you’re write tests that are bad and one of the things I see with developers, all developers and Richie is probably is a good one to ask is that, it’s so hard to get used to throwing away tests because you realize I wrote a bad test. But once you’ve written it and you’ve started to use it, you’re like I don’t know, so here’s what test I have. In this test is I’m testing that I’m calling a function and I’m updating a table in this test and so one of the things I do is I reset, I create an expected result table right, so this is kind of like what I expect to get in my assertion.
Here I’m going to add some data into a table so one of the reasons what T-SQL2 allows me to do is copy a table and empty it and fake it so that I can muck with it however I want so in this case, I add data to it, I add data to my expected result because what I want to do is eventually compare two tables. Here, this is a function that is in my stored procedure but I don’t want to call a stored procedure that calls a function in test deck because I want isolation.
I want a stub in mock and isolate things down and so one of the things I do here is I – this allows me to replace this function call with my own function call, so in essence, this function is just a return to scalar-value in this case, but I’m taking all of the calculation stuff out of the function here and then I call my stored procedure, right? And if I were to look at this code, here what I’m doing is I get a variable, I get a parameter, I’m testing some error handling, I call this function, get a value back and I do an update. So that’s my function – sorry, my stored procedure, and here what I’m doing is just testing it and I’m just creating an assertion and once you learn about testing, we do a lot of assertions right, we do assertions all sort of places in computer science.
The OS does assertions all the time because it goes you know, if chuck some value read doesn’t equal what I think it is, I get a blue screen of death. Blue screens of death are assertion failures in the operating system where it goes I don’t know what’s going on, I’m not doing any more work and process failures that we have sometimes when processes just explode, it’s because they hit an assertion point. Now, if they’re bad developers, it’s because something just exploded in the memory but good developers have assertion points all over, and that’s what we’re doing with testing here.
I have other tests like this test is I’m looking for my table is my metadata so here what I’m doing is – sorry, my table is my API. Here I’m saying this is what the table has to look like and if the fundamental table changes right, here I’m doing a select star because I’m just comparing metadata, if that changes I fail this test.
Not because we can’t change the table, because if somebody changes this table, this has a billion rows in it, I want every single person to know that this table changed and it’s potentially going to cause us issues and so we have a conversation and there’s ways to write tests, like you know, one of the things I have like we have SQL cop tests, like I have tables that are primary key.
I look to see if there’s a primary key on this table right, we look at sys.objects, sys.schemas and if there’s no primary key, we fail a test because you always, always, always, always, always, always, always, always have primary key but there are very rare cases you don’t have a primary key. If you can prove to me that’s the case, I’ll add an extended property with an exception to my test. So I have some ways of writing tests that allow me to do things.
In the database world, we often want to test zero rows, one row or multiple rows because those are common issues we may have with queries or inserts or things. We may want to write tests that look at you know, I have to do something outside like I call a CLR procedure, I send an email, whatever, I want to mock those out and there are ways to mock those out. We may want to do boundary conditions; so in e-commerce, we have discounts at 5% and 10%, so we have to decide is it 5% or 5.001% and we want to test those boundaries and make sure that we’re implementing code according to what is expected. So when we get to the production and we get to the test and the client says no it’s 5.01% not 5%, we can go look at the test and say well you said 5% so that’s the test. That’s the documentation of the spec, let’s – which one do we change right? Do we change the test, change the code?
We figure that stuff out, so those are kind of the ways you do testing but testing is an art and you got to practice writing tests so you got to – like I said, somebody tells you you did something wrong, write a test.
Brent Ozar: Especially now you see how – the level of complexity involved with it with SQL Server code because like you said, it’s all state mutation, which is…
Steve Jones: The other thing is you like to work on limited rows of data right so almost all tests could probably be done with less than 30 rows of data, and I love that because the last thing I want to do is I want to test an island function with windowing aggregates across a million rows of data. I know you love doing that as developers, but trying to figure out if I’ve actually calculated stuff correctly and then waiting for that to run when I’ve written a cross join and made a mistake, god, that’s maddening.
So I write tests, like I write tests all the time. Somebody says how do I find a gap in this set of identities? Well write a test that creates a test, run the test and it’s going to fail and then you run a query that goes alright, let’s calculate that, it should match your expected results. That’s what it is, that’s what I’m doing here is I’m saying here’s what I expect. I expect one row of data from the result of this query. I could expect five rows, 12 rows, whatever, but I just put those down, all I do is formalize it into a test so that I’m not trying to scroll through Management Studio and go did I get the count balance correct.
Brent Ozar: Yes, or you can look at some of the new cutting-edge development type stuff that’s doing continuous testing in the background as you’re typing, you know, I look at some of the stuff Richie – yes, I’m like auto testing while you work, I’m like dang, that’s just another whole level.
Steve Jones: Yes it is, I mean Visual Studio 2017 does that and to me it’s slightly annoying because a lot of times I’m mucking with code like I’m trying a method and I’m going no, what do I think I want to do here, so when it’s constantly going you failed, you failed, it’s slightly maddening.
Brent Ozar: [crosstalk]
Steve Jones: Yes, but I mean I just do development right and then I just say here, let’s just run these periodically. It helps if I spell things correctly, that usually makes things work better but [inaudible] and then I can just see – Redgate has a plug in to do this right that all it does is it runs that command that you know, I just come in here and say well, I just want to run this one test and it’ll run it and tell me if I failed and I think the primary key on will fail.
But I can do all this stuff on a command line, I can do it any number of ways, I just want a way to repeat it over and over again, so as I’m testing my query and trying to figure it out, I just want it to run. I’m sure you do this with sp_Blitz, right, like do we actually get the DMV that gets the degree of parallelism or whatever right.
Brent Ozar: You know it’s really funny, I’m a huge believer in unit testing and T-SQL testing now but all the sp_Blitz, sp_Blitzcache, none of those have any testing at all, and I feel the pain every time we go do a new version. I know there’s checks in there that don’t work. I’m absolutely positive.
Steve Jones: And the thing is like I wouldn’t go write all the tests right, like I’ll go right we’re going to add a function for x, so let me just – I’ll just create a new test you know, and I’ll say test x, and then I’ll just write the code right. I’ll just go and say alright you know, select star from sys.dm_ blah blah, and then you know I’m just doing an insert. I’ll insert this equals some value. You know and then I never have to go check that particular thing again because it’s codified, this is what I’m trying to assume works.
Brent Ozar: And it’s the same with any new stuff that we develop and Richie I’m sure is watching this, he was talking in Slack, like I’m a huge believer of start with the tests and he is a visionary on that.
Steve Jones: He can do the sp_configure, I’m going to change it, I can look for negative tests, I can look for exceptions, I mean I don’t love testing in development because to me a lot of times I’m not quite sure exactly how the spec should work and sometimes I have to futz with it. But I often write some code, you know like if I’m writing this code – let me find that function real quick. You know, what I may do is come in here and go this is how I think this should work, which is actually horrible code but it’s there for a reason and then I may write the test that says this works and then let’s do something like let’s change this to you know 20.
And I’ll recompile this and run the test and the test should fail and I’ll come back here and I’ll go alright, do that, and then eventually I’ll go well really this is dumb and what I really want to do is something like – we’ll take the LAN of this and you know, this is more efficient code right, but the test should still pass. If I refactor this code the test has to still pass. That’s what it is, testing is often about avoiding regressions, but breaking stuff that worked last time.
Brent Ozar: JD says, “Without validated requirements, what are we testing? This is why I’m skeptical about test-driven development.”
Steve Jones: So JD, you always have requirements right, whether they’re imaginary in your own head or somebody wrote them down and gave it to me, you’ve got requirements. I mean, the classic one, this is the thing I see all the time where somebody will go I need to insert into country codes. And I think this is valid right, like I think this always works, and so what I’ll do is I’ll do select star from country codes and then I’ll do this and I’ll do select star from and I’ll go this is the test, because in my head this is perfectly valid.
The problem is whatever I put in here, if I take that example we had and I run this test in a work rate and I can say my code works and then tomorrow somebody goes, well are you sure this works? If I run this test again, it doesn’t work because now I’ve duplicated things, I don’t know I’m not testing. How hard is it for me to just go you know, really let’s get rid of this, let me spend eight minutes writing the test that tells me in my head I expect an insert and then go you know, dot run. And I’m going to say this will be you know, country test dot test insert into country codes or however you spell that, and that’s my test now and instead of having this stuff, all I’ve done is I’ve codified that according to what I believe.
So it doesn’t matter if somebody else gives me requirements, I’ve still got to make decisions as a developer how I think it works, and for me to write tests to go this is how I think it works, then later on when somebody yells at me I can go well, this is what I thought you said and I’ve got some documentation, rather than some random set of test stuff that I ran last Thursday and I think I remember what I did. That’s all I’m doing is codifying what you believe.
Brent Ozar: Well perfect, great session sir, lots of interactivity in Slack, you should go through and read the messages, lots of people were talking back and forth about things like Octopus. Thank you for speaking today.
Steve Jones: Pleasure, thank you.