Enjoy the Podcast?
You’re responsible for the care and feeding of production SQL Servers, including index maintenance.
DBAs, developers, and database architects considering using In-Memory OLTP, but not sure if it might benefit their workloads.
You’ve heard that In-Memory OLTP is fast, but how do you know if your workload can take advantage of this awesome feature?
We will explore In-Memory OLTP workload evaluation, use cases, architecture, capacity planning, data migration, hot/cold data, potential impact on disk-based workloads, and more.
By the end of the session, you will have an understanding of how In-Memory OLTP works, and whether your workload is likely to see an increase in performance.
Why I Want to Present This Session:
I’m an In-Memory OLTP evangelist, striving to bring clarity to the misconceptions that I see in forums, blog posts, and in the field in general.
In my quest for expertise, I have read every page of the Microsoft documentation for In-Memory OLTP, verified differences in the behavior between SQL 2014 and SQL 2016, and blogged about many aspects of In-Memory OLTP.
Ned Otter: Today we will discuss, amongst other things, how to evaluate if SQL 2016 in-memory OLTP is right for your workload. I will be your SQL strategist for today. A little bit about me: I’ve been a DBA since dinosaurs roamed the earth and I am obsessed with in-memory and just was recently made the co-leader of the In-memory Virtual chapter for PASS; where I hope everybody who’s watching this presentation will sign up for future presentations.
The key takeaways for today are workload evaluation, of course, use cases and architecture. And I realize that everybody who’s watching this session is likely on different levels, so I hopefully will have something for everyone. Yes, we’re going to answer this question, how to evaluate, but another question we’re going to answer is what is the impact of deploying in-memory in your database ecosystem because it changes things that you might not expect to change. And especially from the DBA perspective, there are things to be aware of that are different.
First, I want to mention that if your application depends on a number of features that are not yet supported for in-memory, you might be able to find a workaround, but it might be a showstopper for you, so we’ll discuss a few of them. Some of the items on this list will be supported in 2017; I have a slide on that in a moment. But, for instance, if you need partitioning, in-memory doesn’t support it; so, you’d have to live without it or not be able to migrate to in-memory. Or event notifications, luckily it’s not critical for your application but maybe for monitoring. But on this slide, you’ll see towards the bottom, you have cross-database queries and transactions are not supported and that would definitely be a showstopper. So if your app depends on that, then you can’t use in-memory, at least at this time. I don’t know if there are plans to address that. And at the bottom of the slide, I have FileStream; there’s some confusion about FileStream, so I want to drill into that a bit.
You do not have to enable the FileStream feature in order to use in-memory. Some people are, when I look at the forms, are confused about that. 2014 used FileStream to do file management for the files that persist durable memory-optimized data. That was changed in 2016 because it was a performance bottleneck. So now the in-memory engine actually manages the files; and it does use a component of FileStream in the background, but again, FileStream does not have to be enabled to use in-memory. Also, as we said, FileStream as a feature is not supported for in-memory tables but it is supported for on-disk tables. Which means you can have a memory-optimized database with memory-optimized tables and on-disk tables, and the on-disk tables can use FileStream. So this stuff can coexist and there’s a relation to in-memory just in terms of what the engine uses in the background, but you don’t have to use FileStream to use in-memory. I just wanted to be clear about that.
There are some data types that are not yet supported. SQL_variant – I believe Microsoft has a blog post about a workaround and I’m sure that clever developers could find a workaround for the lack of datetimeoffset support. The others are just plain not supported, so that could be an issue for migration. But there are tons and tons of improvements relative to SQL 2014, now in 2016 and further enhancement in 2017. Some of them are listed on this slide, there’s just too many to list. It is a tsunami of changes for in-memory surface area and integration. With new features in 2016 like temporal integration with availability groups, et cetera. So there’s a lot to be excited about. And one of the things I see out in the field is that people still have a bias against in-memory because it was more limited in 2014. So if that was the case, if you had a negative experience in 2014, you should revisit in 2016 and definitely 2017 because things have improved a lot.
2017 gives us greater than eight indexes per table, which is a great new enhancement. And also, because now you can have so many more indexes, there’s a rebuild speed improvement, which is, I’m sure, necessary. Computed columns come in 2017, plus indexes on computed columns. You can finally rename tables and native modules. Until 2017 you could not do that. sp_spaceused now works and you get parallel redo for AGs; so, these are all welcome improvements in 2017, meaning that Microsoft is continuing to invest in this feature. Native compilation finally gets a few things it was missing, especially case statements; so developers should be happy to see that.
Before we get into the nitty-gritty, I want to go over what happened in SQL 2016 Service Pack 1, because it’s a paradigm shift, at least historically, with Microsoft, where many formally Enterprise only features are now available in non-Enterprise editions. But there are some specifics to in-memory that I want to go over. The way it’s listed is that there’s a max in-memory limit and column store limit per edition. So we’re going to drill down to the Standard Edition as one example. Some of the imitations are per database and some are per instance. You can have a total of 32GB of memory optimized data, and that is per database in Standard, not per instance. But column store, it depends on what type of table the column store is applied to; that determines the limit. So for in-memory tables, the limit counts towards the per database cap, not the per-instance cap. But column store for on-disk tables counts towards the per-instance cap. So another way to read this slide would be that you cannot have 32GB each, of memory optimized data and column store index for memory-optimized table. You can only have a total of 32GB – I just wanted to be clear about that.
For those who don’t know it, we’ll just briefly cover the basic architecture in terms of file layout for memory-optimized databases. So we start with, of course, the database. And you have to add to that database a special type of file group, which is a memory-optimized file group. That file group is logically associated with one or more containers. Now a container is just a fancy name for a folder. It’s literally a folder. And within the folder reside the files that persist your durable memory-optimized data. These are known as data and delta files or checkpoint file pairs; it’s the same thing. This data is streaming, it is not pages on extents like an on-disk table. It is a different type of data, like FileStream, it is a streaming type of data. And the key thing to remember is: it is not possible to limit the size of these files on disk or per container or via the file group. The volume that your containers reside upon must have enough free space for these data and delta files to grow. You have to have that; otherwise you will essentially reach an out-of-memory condition. You won’t be able to checkpoint files, et cetera. So you’ll have to make sure you have enough monitoring in place to make sure you have free space for this stuff to grow.
The way it works is that everything is first written to the transaction log, just like for on-disk tables. And for durable memory-optimized tables, there’s an offline checkpoint throughout the thread process that reads the transaction log and harvests events from that that are pertaining to data and delta files, and it writes those events to the data and delta files in the relevant container. This is the general way that memory optimized data is persisted to disk.
Data file in blue here, delta file in red. A data file has a distinct range of transactions that it contains, and it’s much more efficient to logically delete an entry in that file rather than physically deleting an entry, and that’s what the delta file, the red file, is for. It contains key values of data entries that are logically deleted. And over time, there can be some inefficiencies if you have many logically deleted rows per data table; and we’ll talk about how that is handled.
In fact, it looks more like this. You have many pre-created and ongoing pre-created data and delta file sets within a given container. You have no control over these files; you should not ever touch them. You probably shouldn’t even go into the folder – only bad things can happen. So these are multiple sets of adjacent data and delta files, and this is what persists your durable memory-optimized data. A data file can be shared by more than one memory-optimized table, there’s not assignment of a table to a file. Entry one can be for table A in the data file, entry two can be for table B in the data file. There’s no order to it. The engine handles all of that, you don’t have to worry about it. But you do have to worry about instant file initialization being turned off, because if it’s turned off, then as these data and delta files are created, they have to be zeroed out; and you don’t want that performance penalty. So IFI is still the best practice, even with in-memory.
Recovery model – now some of what we know an d love from the on-disk universe is different in the in-memory universe with regard to recover model. Recover model is actually irrelevant for in-memory OLTP. No matter what you do, no matter what it is set to, DML for fully durable memory-optimized tables is fully logged always. And so whether it’s simple, bulk logged or full, if you’re making changes to your durable memory-optimized tables with inserts, updates or deletes, then all of that activity is fully logged; no way to change that. the only thing you might do is decide that you did not need durability and create a schema only table.
I just wanted to compare, quickly, the different table facets for on-disk and in-memory. From a durability perspective, for on-disk tables, durability is mandatory. But for memory-optimized tables, you have an option to have it not be durable, so it’s optional. Storage is on pages and extents for disk tables, but is streaming, as we said, for memory-optimized tables. tSQL on-disk tables is always interpreted, but can be either interpreted or compiled for memory-optimized tables. Isolation is enforced through locking for on-disk tables but is enforced through validation for memory-optimized tables. Deadlocks are possible because they’re a byproduct of locking for on-disk tables, but they are not possible for memory-optimized tables because there’s no locking; which is one of the reasons you’re interested in in-memory – there’s no locking.
These are the types of objects that can be memory-optimized: tables, table variables, functions, stored procedures and triggers. I wanted to be clear that it is possible to combine referencing on-disk and in-memory tables within the same select statement; assuming that isolation level requirements are met, this is a perfectly valid statement. So I don’t want people to think that if you have memory-optimized tables in a database you can only select from them that reference disk-based tables. You can mix – with interpreted SQL, you can mix. This feature is obviously optimized for a very highly concurrent and write intensive environment, which would mean OLTP, data loading, data ingestion and transient data. It’s not going to make your reads any faster. Your analytical queries probably will not see much benefit, even though you can create a column store index on memory-optimized tables. This feature is optimized for massive amounts of concurrency and writing.
I’ve seen, on the various forums that I frequent, that often, when people do a proof of concept for in-memory, they don’t quite have it right and they’re disappointed with the results. So I wanted to briefly review things to do or things not to do. You need a reasonable number of cores to verify if in-memory might benefit. You need a lot of concurrency. You can’t just test with a single threaded process and say, hey this is no faster or may even be slower than my disk-based table. You need 100, 500, multiple hundreds of concurrent threads doing things, because that’s the problem for the traditional engine, that it won’t scale; but an in-memory engine will scale with high levels of concurrency.
And if you’re testing native compilation, you can’t just have a single select statement that selects one row from one table, because it’s not complex enough to benefit from being natively compiled. There has to be enough logic in the stored procedure to justify migrating to native compiled. And if you use something like O Stress, which is very popular to test, because it can fire off hundreds of concurrent connections, you should not use – I should say, you should use it with a –q parameter, which is a lower case Q, so that’s quiet mode; so it does not write to output files on disk, because that will slow your benchmark. And I had to correct one gentleman who was having tremendous amounts of problems getting in memory to be faster because he was slowed down by writing to the output files. So you have to be smart about how you do your proof of concept to give in-memory a chance to shine, I should say.
On to use cases. There’s way more use cases in 2016 than we had in 2014. If your application needs high throughput and low latency, in-memory could really help you out in this way. If you want to do caching or track session state, it’s an excellent use case. If you have devices that send tons of data to SQL Server, then in-memory can help because it can handle a concurrency. Let’s say cars or other devices are sending telemetry data to a database, SQL Server might fall over with an on-disk table, but it should handle it with in-memory. There are types of tempdb contention that cannot be resolved by changing the file layout on disk. So tempdb contention can be because of GAM, SGAM and PFS pages, which are allocation pages, and that would require you to spread things out to multiple tempdb files. But the actual system tables in tempdb can be a source of contention and in-memory can help with that because you can literally have memory resident only temporary objects. ETL data loading is another use case. And I’m sure there are others, but these are the ones that I’ve come up with and they may be enough to justify migrating or considering migrating some of your tables to in-memory.
On to workloads. Now, these are the things, the characteristics of workloads that would be interesting to someone who’s considering migrating some tables, some data, to in-memory. If you have high levels of concurrency, intensive write levels, latency is localized – an example of localized latency would be the traditional best practice of clustering on an identity column means that all inserts go to the last page of the clustered index; as Brent said in his prior presentation. So that becomes a hotspot and it prevents scaling your write workload. So in-memory can deal with this problem in a fantastic way because there’s no locking. There’s nothing that’s physically stored – there’s no order to the data. So there’s no last page concept and there’s no actual clustered index other than column store; but even that works differently.
So if your latency is localized, like for the traditional data access pattern for on-disk tables, it could benefit from going to in-memory. If your hot data is doubted, meaning not all over your database, then it’s a better characteristic to have. And if you want to use native compilation, then your business logic has to be in the database, it can’t be in a calc engine in c# that’s external to the database. You can still realize a lot of performance benefit without native compilation, but if you really want all that you could get, then you would like to try and natively compile some of your procedures. And if your temporary object use is high – so if your workload has all of these characteristics, you have a really good chance to benefit from migrating some of your data to in-memory. But if you don’t have all of these characteristics, it doesn’t mean that you won’t benefit. It just means that you might not benefit as much and you’d have to try and figure out if the tradeoffs, because there are always tradeoffs, are worth migrating.
If the latency that’s slowing you down – if your bottleneck is the actual engine itself, then in memory helps because there are no locks, no latches and no spinlocks. They do not exist for memory-optimized tables and indexes. If you have high CPU because the tSQL used to send to the engine is constantly being reinterpreted and it’s very long and complicated, then in-memory can help because native compilation will reduce the number of cycles that are required to do the same amount of work.
If your bottleneck is in some part, or tremendously, due to logging, ten in-memory helps because there is no write-ahead logging as there is with disk-based workloads. You don’t write anything ahead to the transaction log for memory-optimized tables, you only write a commit time. What is logged is logged more efficiently for memory-optimized tables and index changes are not logged. I have an asterisk next to that because that’s 99% true; but for today’s presentation, we’re going to assume that there are no changes to indexes that are logged.
The basic durability workflow works something like this. You would update a row and commit, and at that time, write to the transaction log, the offline checkpoint thread grabs that event and writes it to the data and delta files. This is the, kind of, 10,000-foot view of how it works. As we said before, if your latency is due to tempdb, then in-memory helps because these temporary objects can be thoroughly memory resident.
So just to sum up, if you need better performance, then in-memory can help because of these things. If you need to scale your workload or are having scalability issues then in-memory helps because of these things. So this is – basically, you see it’s all about writing and not so much about reading. We said that tables and indexes are lock-free, and they are, but that does not mean that the in-memory engine can’t wait on things. What could it wait on? It could be waiting on writing to the transaction log. Now, imagine today your workload is running just fine with on-disk, but you can’t scale. And now you migrate data to in-memory and you can handle much more throughput, so that’s going to put more pressure on the transaction log. So write log can become a new, or increased, bottleneck for your in-memory workload for durable memory-optimized data.
It’s possible, in the in-memory universe, to have transaction A depend on transaction B. and until transaction B commits, transaction A cannot commit. So there can be some waiting because of that, but it’s usually very fast. I’ve not seen this be a problem in my travels, but it is something that can happen.
So which tables should you consider migrating? This is one of the main questions to address. First of all, when you check the amount of latches that occur for that table, are there a lot of them? Because if there’s not a lot of latching, then you won’t be able to benefit as much from in-memory, because one of the best bottlenecks to get rid of with in-memory is the latching. The top waits, are they related to latching? So if you test inserting or updating a table in a vacuum, let’s say with no other activity, what are the top waits that occur when you check? If your wait stats look like this, where write log is at that top – this is before you do anything, you’re just checking what your current wait stats are, what are your current bottlenecks – and this is what you see, then in-memory is probably not going to help you or won’t help you nearly as much as it could because you don’t have any page latching at the top of this stack; or even in this stack.
So write log I/O completion, async network I/O, SOS scheduler yield, which is CPU pressure, you’d have to address all of these types of waits before you considered migrating to in-memory if you want to realize the full benefit of in-memory.
And you would ultimately like to see wait stats that look more like this; where you have page latching at the top and write log either not in there at all or way down relative to page latching. This was the basic concept, that you check latches and you check waits and you need latch type waits and latching to get a good chance of benefiting from in-memory.
Now I want to talk about this TPA overview; transaction performance analysis overview. So I’m going to go to my SSMS – I hope you folks can see this. So, if you right-click a database and go to reports, there’s this transaction performance analysis overview; and we’re going to go to that. and there, you have the ability to check tables and stored procedures. So let’s go to tables, and you will see that, in fact, I have only one table that this report considers a contender. So we have this table called Lots of Defaults, which is a table that I artificially created a lot of latch contention for; we’ll see that in a moment. And this magic quadrant attempts to describe to you the amount of gain you would receive from migrating this table to in-memory versus the amount of migration work required.
So, on the vertical axis, I agree. I should say, I have traced the query that is issued in the background and analyzed it and I agree with what it does to determine the vertical axis; meaning how much it might benefit from migration. What I don’t agree with is the query that is executed in the background to determine the amount of work, and we’ll see why in just a moment. The query – this is just my version of the query that is executed – so first we’re going to check latches. We’re going to go against index operational stats and order by page latch wait count, descending; that’s the first query that’s run. And we will see that, in my case, we have hundreds of thousands of latch waits for this Lots of Defaults table. We also have locks, and the percentage of the wits for the entire database is 100%, so this table is looking like a great candidate to migrate to in-memory, just based on percentage of waits for the whole database and the latch wait count themselves.
But this is just the first part. So you’re checking latching, and we have latching, and that’s good. But the next question you have to answer is – I’m going to make this a little bigger so everybody can see it – is how difficult will the migration be? And what this query does in the background is it counts the number of these issues. Some of them are completely blocking; you cannot migrate to in memory if you have, for instance, FileStream columns or if your identity is not 1, 1; things like that. but other things that are counted, which I assume are used for a waiting factor for how easy or difficult it might be, are fully supported in SQL 2016. So it doesn’t make sense to me that they’re tallied at all; they’re complete non-entities for migration. So what I’ve done is kind of broken these out. So nullable, LOB and UNIQUE are fully supported without any issues, and then other items are supported with varying degrees of migration difficulty. And this is one of the issues where this migration portion, the difficult portion, is that there’s no difference, and maybe it’s not possible to see a difference between how difficult it is to migrate a default constraint, which is really easy, versus a foreign key, which might be very difficult because of a long chain of tables that are connected. And then we have the unsupported blocking.
So I’ve taken this query and, kind of, bucketize it to count the number of issues. So Lots of Defaults has 73 issues, it has 49 default constraints and 24 LOB columns. Now, I want you to remember that number 24, because we’re going to see that in a moment. So this allows me to make an intelligent decision about how difficult or easy it will be to migrate, because I don’t trust that part of the transaction performance analysis report. If I saw 50 foreign keys for this table, I’d know that’s going to be difficult; really difficult, unless I’m also moving those tables to in-memory.
So let’s go back to the transaction performance analysis report and click and get more detail on Lots of Defaults. I hope you folks can see this – I’ll try and zoom in. so, Lots of Defaults, apparently, has 24 migration blocking issues. Now, that is, coincidentally, the exact number of LOB columns in my table, which are fully supported in SQL 2016. So that’s why I don’t trust this portion of the report; I think you should either use my query or your own version of the query that you might trace when the TPA overview comes up and make your own decision; but don’t entirely trust this report for telling you how difficult it will be. That’s my statement about the TPA overview.
Let me go back to the slideshow…
Pinal Dave: Quick question would be, where did you get that report by default? If you could just walk one more time for people, because a lot of people are in and out, so it would be very helpful.
Ned Otter: So you right-click a database and you go to reports, and you can either choose, from in there, transaction performance analysis overview or I believe it’s also on standard reports, where you can find it there. So that’s how you get to the TPA overview.
Pinal Dave: Fantastic, thank you.
Ned Otter: Sure. One other thing to consider when migrating to in-memory is, if there’s any form of compression, data or index compression that is used for your on-disk tables, because if so you will not have an exact sense of how much resources you need to migrate. Because, let’s say, you have a clustered column store index on a disk table, it’s 10% to 20% of the original row store size, now you have to figure how much memory will I need to store that table? And there’s only one way to find out. You have to migrate it and see; but it could be a lot more memory resource than is represented by the on-disk storage portion that you would see because of compression. There’s no way to compress anything in memory, so that’s a consideration.
I wanted to briefly describe the differences and concurrency between on-disk and in-memory; and it has ramifications for other things, for resources. The concurrency model for disk-based tables is pessimistic and is done through locking. You have an optional database setting to implement either recommitted snapshot isolation or snapshot isolation, which creates row versions, which are stored in tempdb, and this solves read write blocking but does not solve write write blocking, which is one of the reasons that in-memory was invented, because for in-memory, you have an optimistic concurrency model, which is using mandatory row versions, which are stored in-memory so you need additional memory for those row versions. It solves read write blocking and write write blocking. But there is a – I wouldn’t say a gotcha – but there’s something to be aware of, which is that if multiple processes running concurrently attempt to update the same row, only one of them can win. The others will fail and need to retry, and that means that if your data access pattern is like this, there will be excessive conflicts and you won’t be able to get the throughput that in-memory is capable of.
In-memory is great at scaling writes to different rows; updates to different rows. Not the same row. And if you’re using stored procedures that are really short, you probably won’t see any benefit from native compilation. As they say, if your application is chatty, constantly calling many short procs, you probably won’t be able to get benefit from that.
For conflicts, you would need to – just like 1205 which is deadlock for on-disk tables – you would need the suggestions to retry when you get a 1205 error. You will need to retry based on these errors, which you’d have to be tracking in the application. So, someone committed before you or there are isolation validation failures or transaction dependency failure, so these are new things with the in-memory world that you’d have to be aware of and track and respond to.
You wouldn’t necessarily think that – yes?
Pinal Dave: Sorry, one more question. One very interesting question is Jay Walker is asking that if he’s understanding the situation correctly, then if the database in this question is only being read from and not written to, that being in-memory isn’t going to help. Is it true?
Ned Otter: I would say it’s pretty much true. There may be some benefits – in other words, is it faster to read pages from the buffer pool, which is for an on-disk workload, because the data that’s in memory for the in-memory feature does not reside in the buffer pool. It’s a separate portion of memory. Is it any faster to read from this part of memory than this part of memory? Well, there will be no shared locks trying to read this in-memory data that’s in-memory. There will be shared locks trying to read data, so there may be a slight advantage from just reading data. The only exception to that might be if you’re doing an analytical workload on memory optimized data with a clustered column store index in-memory. That might be faster because you have the compressed data and row group elimination. So, I can’t say uniformly that it will always – that there’s no performance benefit for read only workload. You’d have to test, but this feature is designed to scale writes, not designed to scale reads. But hey, you may have to do analytical types of work on your memory-optimized data so they’ve given us some ways to do that. Hope that answers…
Pinal Dave: That’s fantastic. So yes, so this is a very popular question. A lot of people just think because of the earlier when in-memory was introduced, they were just thinking, “Oh yes, we just … this thought that it’s synonyms of like think table or something like that. Oh, we’ll just put our database in the memory and then let people access from a memory so it gives up faster. Like one of the questions we get is, “Is it like a RAM drive or is it you can just get it faster?” And that is very interesting when you see this thing loudly saying, not only for read but basically it is for write-optimized. I think that tells the power of in-memory OLTP. That’s a one-line message. I think we all can walk away from what we just learned right now. But please continue. Fantastic, thank you.
Sunil Agarwal: And this is Sunil, let me add one more point. The querying on in-memory table is not in parallel. So it is single threaded. So, if you’re running a query on in-memory table, it does not take … to be okay; except when you’re querying the column store on memory-optimized table. So, going back to the point that you’re making, it is optimized for transactional workload, not for query workload, in a [crosstalk] sense.
Ned Otter: Alright now, that’s from the people who created in-memory. That’s Sunil talking, so I would trust what he says. Okay, moving on. It might not be intuitive to think that in-memory optimized workload has storage and IOPS; potential resource issues with storage and IOPS, but if your data is durable, which means you’re writing to disk as a safety for your memory-optimized data, then you need enough resources to do that fast enough to keep up with your memory-optimized workload. So memory, storage and IOPS are key factors in deploying the in-memory feature, and as I said before, and I’m going to keep hammering it home. Right lock can be a bottleneck. We will discuss a potential solution for that, which is based on hardware in a moment.
Now, let’s discuss a little bit about memory itself. So I’ve seen some confusion on the forums. People think that the memory that the in-memory feature works with is somehow external to SQL Server. It is not. You need memory for your operating system of course, and then we’ll have the SQL Server max memory setting. And from max memory, you have all kinds of buckets in there for different things. One is the buffer pool for the disk space workload, other caches, et cetera, and a new thing that’s added to SQL Server max memory when the in-memory feature was added is a place for memory-optimized data. So everything is carved from the max memory setting. This has some implications depending on which edition you’re on, but I just want to be clear that in-memory data lives underneath the max memory umbrella.
If you’re not running enterprise, so you therefore could not use a resource group and not create a resource pool to segregate the amount of in-memory data that is – that memory is allocated for. If you’re not running enterprise, you’re running standard or any other version below enterprise, then everything comes out of the default pool. You’re not able to create a separate pool.
If you add rows to in-memory tables, then that footprint will want to grow from in-memory and it will put pressure on the buffer pool and the buffer pool will respond by shrinking. And if you then deleted row in-memory and the buffer pool wanted to grow, it will dance back over. So this is what I call the memory dance.
Now, this works great until the in-memory footprint grows and will not shrink. Let’s say you’re not deleting any rows and the buffer pool wants to move back to where it was and it can’t, and this can affect your disk space workloads. Some people think that you can use buffer pool extension to solve this problem and theoretically you could, but I think that if I’m not mistaken, buffer pool extension is a single threaded process that reads one 8K page at a time. It is only available in standard, and so it might raise the ceiling, but it might not perform nearly as well. You’d have to test, but it’s something that you can do with standard edition.
And in case you’re wondering, as we said before, we talked about the fact that compression is not available for any of the memory that is occupied by the in-memory feature, so also you cannot page out that memory under any circumstance. You need enough memory for this stuff to exist, otherwise your workload – you can self-query your data and memory, but you can’t make certain kinds of changes to it, you’re essentially out of memory even though you have memory for other parts of the workload like on disk. So you need to have enough resources.
Speaking of resources, modern versions of Windows servers support gigantic amounts of memory. So 24TB in Windows Server 2016, that’s a lot of memory, which might cause people to think, “Why don’t I just place my old database in-memory?” And I’ve had some discussions with Microsoft about this behind the scenes. It’s supported apparently, but I think no one in their right mind would do it at this time. It’s not been tested – my understanding is that Microsoft has tested workloads with all tables being in-memory. Not a migration of cold data from on-disk also being in-memory.
From a DBA perspective, because my background is as a DBA, there are things that happen in the restore process that are different once you include durable memory-optimized data. So, I’ll briefly talk – this is a 10,000-foot view about the restore phases.
When you restore database, the first thing it does is create the files on-disk, and this will be in this case, your data and delta files, and all of the files for your on-disk workload. And then it copies data from the backup files to the files that were just created on-disk, and if you restore a database with no recovery, you have completed these two steps. But for memory-optimized databases with durable data, durable memory-optimized data, there is one additional step, which is that you have to stream that data being into memory. If you restore database with recovery, then you have completed all three of these steps.
Your disk-based tables are not available until all of your memory-optimized data has been streamed. So this has implications for recovery time objective, for RTO. When you restore a database, we said it creates the files on-disk and it makes sure before it creates those files that you have enough free space to do so. So it short-circuits the restore if you don’t have enough storage to create the files. But there is no such short-circuiting to determine if you have enough memory. So you can spend time restoring a database, creating the files, let’s say it takes an hour to restore all the files and copy the data to those files and now you begin the re-stream and you’re 1MB short in terms of how much available memory you have. You will have spent all that time restoring the database and streaming the data only to fail at the end with not enough memory.
So I have opened a connect dot with Microsoft about doing this kind of check. I’m not sure if it’s being addressed, but DBAs would like to see something like this. A short-circuit if there are not enough resources, which is what you do for on-disk tables, whereas I should say for storage on disk.
Sunil Agarwal: One quick comment. This is a DBA problem that he just pointed out. So one of the things we thought about when you backup an in-memory database, a weekend write down in the backup header how much memory is needed to restore it, but that has not been done but there are a couple of options on the table. Thank you.
Ned Otter: Great, I’m glad to hear that. See, I didn’t even know that my connect dotting was being discussed, or maybe it wasn’t related to my connect dot but thanks Sunil for the update.
So without what Sunil just mentioned being part of the database backup at the moment, a clever DBA might think they can restore file list only and have a look at what’s in the backup file and determine how much resources they need to restore this dataset. And that’s true. One potential gotcha is if you’re not good at identifying how the components of your database are named, you just use let’s say, FG1, FG2, things like that, which are not descriptive, they don’t tell what it’s really related to, in this case this is from a database backup that I created on my local machine. It’s not possible to tell on this particular slide if you need 30GB of memory or 60GB of memory. Both of these types are type S. They are both streaming; one is file stream and one is in-memory. So, you as a DBA, or as a developer need to create file names – you need to name things so that you know, so that DBAs know how much resources they need to restore. Because if this was named properly, let’s say that file group two was really for a container that represented in-memory, I would know I need 60GB. And if I don’t have 60GB I know I can’t restore the database.
So with proper naming and interrogating the backup itself, a DBA can determine how much resources they need only if things are named in a way that describes what they are.
Sunil Agarwal: Can I make a quick comment on that one? This is a very clever technique. The thing is that if the file group for memory-optimized table is the 60GB but if you have a lot of deleted rows, you will know about that. So this is like an upper bound of memory that you would need that will guarantee the restore. Thank you.
Ned Otter: Right, so you might not need as much memory because you might have many logically deleted records, yes?
Sunil Agarwal: And then it does not account for the indexes that you will create and…
Ned Otter: Correct. That’s a very good point, which you don’t really address in this – so this is basically since changes to indexes are not logged, they are not part of the backup, and they have to be recreated on the fly. So you could need now in SQL Server 2017, you can have hundreds on indexes on one table. Nobody in their right mind would do that, but it’s possible.
So if you just want to see the ballpark of some memory that you might need to restore database, then you’d have to take this and hope that you don’t have gigantic indexes. So really, we need a Microsoft solution to this problem. And Sunil, I’m not sure what you folks have discussed includes a calculation for how much indexes would occupy, but that’s an excellent point.
Sunil Agarwal: That’s right, and I think this is nowhere close to being baked, but this has been one of the issues many customers have talked about. And so yes…
Ned Otter: Okay, we will discuss further I hope.
Sunil Agarwal: Sure.
Ned Otter: So when you restore a database that has durable memory-optimized data and you have to re-stream that data, if you have a single container and you have a lot of durable memory-optimized data, this streaming will take probably way longer than you would like. So it is smart to spread the streaming out across multiple containers, which should be on multiple volumes, different volumes, because one volume is capable only of some upper limit of IOPS. You need to spread the IOPS around so that recovery – sorry, streaming occurs in parallel.
No one can tell you how many containers and volumes you need, you have to test with your own database. And of course, over time, just like any dataset, as it grows, you may be beyond your RTO window. So this is something that you would have to verify every so often to make sure that you’re still within RTO.
There is no performance benefit for your workload from breaking out to multiple containers and volumes. It’s strictly for recovery time, reducing recovery time I should say. So the bottom line with this is that deploying in-memory changes the way you would architect your database because now you need to figure out how many or test and figure out how many containers and volumes you need, which also affects your storage design and number of containers.
You would test restore with your whole dataset, and while you – like you would know of course which RTO is supposed to be and while you don’t make your RTO, you would add more containers and volumes. Again, test with the restore with the full dataset and you repeat until you can make your RTO. And again, that’s your RTO for today with your current set of data.
If you’re deleting data, let’s say archiving or literally deleting it, then maybe your set of memory that’s required – the footprint of memory required, maybe it remains constant. But if it grows, because everybody saves data, they don’t get rid of it, then your RTO needs to be verified going forward.
We discussed that…
Pinal Dave: Just to break a moment, I’ll ask you three questions while I’ve been collecting together, which I see Slack, on Twitter as a list of questions, so then I think we can continue the next part. So one of the questions was that does the choice of the disk matter when we are going with in-memory tables? So what it means is choice of the disk means like should be putting this one in a different physical drive or should we put this on a SAN, or is it like a – is it okay to have a SSD or does it even matter? That’s like the question. So that’s question one.
Ned Otter: The first thing I would respond to that with is that this is for durable memory-optimized data. This data, in my opinion, should be completely separate from the volume that stores your on-disk files. MDFs, NDFs, LDFs. Otherwise, you’re competing for IOPS, yes? So that’s the first thing I would do is segregate onto completely new sets of volumes, the memory optimized footprint for your containers. So I think these days if you’re not running on an SSD, you’re getting ten times slower performance from the transaction log, so I am assuming that need speed who are considering migrating to in-memory because of speed issues or scalability issues, that they’re running on SSD. That would be my guess. I don’t know if I answered the other part of the question.
Pinal Dave: No, I think that just says a lot. So the second question is, which I keep on getting, “Hey, what kind” – which I see also in the questions, so I’m just going to read as verbatim, “Is there any workaround to work with in-memory OLTP table which involved across database queries?” So I assume cross database join or in-set update, something. Any workaround for that?
Ned Otter: To my knowledge no, but maybe Sunil can answer that. But to my knowledge, there is no workaround.
Sunil Agarwal: Yes, Ned you’re right. I think Pinal, the basic problem is that in-memory table needs a snapshot to provide a consistent view, and that is not available across database.
Ned Otter: Now, just to clarify, let’s say you have a database that can change all disk tables and you want to migrate some of those tables to a separate database. It is possible if you jump through an extra hoop or two to get data out of database A and into database B. And even you can copy things to a global temp table in tempdb, so it is possible to, for migration purposes, to reference things across database. But you need to create a memory-optimized table variable that could be seen by multiple databases. That’s not the same as querying and transactional, but from a data migration perspective, it is possible to go cross-database using a memory-optimized table variable or a global temp table in tempdb. I have a blog post about that if people want to research that. But for your operational workload, it’s not possible.
Pinal Dave: And, third and final question which I just received on Twitter when I Tweeted that we are all together – the question is that if I try to restore our database where in-memory is enabled and another database, even though it’s SQL Server 2016, but does not have, somehow, in-memory enabled or something, will it fail? What will happen or – that question is on Twitter, so it’s a little short. So what do you – just interpret from this and what do you think? What is your reaction when you hear this?
Ned Otter: So I’m not sure I’ll address this exact question, but I want to say that when you restore a database that has durable memory-optimized data, and because indexes don’t exist in the backup, they have to be recreated – recovery can become CPU bound. Also, if you have a lot of LOB columns, recovery can be CPU bound; and this might affect your disk-based workload in another database because the server now is experiencing this CPU pressure. So that’s one thing to be aware of. That’s not an exact answer to that question, but if I understood the question correctly, a database with durable memory-optimized data will either succeed or fail based upon the resources available to SQL Server, not anything to do with another database.
Pinal Dave: Alright, fantastic. I think this is a good stop for a moment. We’ve answered all the questions which we have received so far, but amazing learning. And thank you, Sunil, for helping us out with all this knowledge. I mean, this session is going to be awesome. So continue Ned, thank you Sunil.
Ned Otter: Thank you very much. I depend on Microsoft for a lot and I’m glad Sunil is on the call, as they say. Okay, so talking about streaming data, re-streaming based on recovery might give you a false sense that data is only streamed or re-streamed when you restore a database/recover a database; and that’s not true. A database, if you offline it and online it, the data in memory must be removed from memory and must be placed back in memory; just from simply offlining and onlining. That makes sense because it has to put the data back in memory once it’s offline.
Also, if you set read committed snapshot isolation on or off, your data is re-streamed. So your database is online and you’re not restoring it, but re-streaming occurs for this also. I’m testing this in 2016. If you set a database from read-only to read write, or read-write to read-only, you will re-stream your data. So you need to be aware of that when you do things, because if you have a lot of data and you haven’t spread out your containers properly, then you could be waiting a long time to get access to all of your data, not just memory optimized data. Of course, restore will re-stream SQL restart, service restart that is, and SQL Server boot.
So the last two, SQL service restart and SQL Server boot are obvious, and restore as well; but the first three on the left, maybe not so obvious to people. I have a blog post on this as well. So DBAs need to be aware of what will happen when they change some database options to your streaming data.
We said a little while ago that there is a potential solution for write log waits that you might experience once you migrate to in-memory. And what we’re going to say here is not specific to in-memory, but can greatly help once you migrate and see that it is an issue. The solution is called persistent memory; otherwise known as server class memory. And it is implemented with a special type of memory chip that is known as NVDIMM-N in this case. I know HP has a solution for this, for their 380, 580 – the current generation, generation nine and generation ten servers.
The way it works is that you have a chip that has one side DRAM and the other side is Flash and there’s a power source that connects them, either a super cap or a battery; there’s something that can give power in the event of an outage. What is residing on this NVDIMM are log blocks, which take up about 20MB per database. And NVDIMM, I think the smallest size is 8GB, so you can put log blocks for dozens or hundreds of databases on this NVDIMM, if you wanted to. When there’s a power failure, at the datacenter level or the power supply blows up in the server, then the log blocks are copied to the flash side of the NVDIMM. And, of course, when power’s restored, the log blocks are copied back to DRAM. My understanding from having researched this is that the log blocks can reside on the flash side for years; it’s very durable. Of course, you don’t expect to wait that long.
So, using NVDIMMs, using server class memory can affect the speed and capacity of the server that you place it in because it doesn’t play well with some other kinds of memory that you may be using, like LRDIMMs. If you have a motherboard and you put – let’s say a HP server, you put all LRDIMMs in, which if you do that then you can still maintain the top memory speed, the maximum memory speed, with this special type of LRDIMM memory, but NVDIMMs don’t work with LRDIMMs. So now you have to use other kinds of memory, which can lower the speed and or the capacity. So it’s – this kind of stuff is really in its infancy, but I’m sure it will get better over time, will play better with other kinds of memory perhaps, but it’s an option for write log waits, for addressing write log waits in general. But there is one gotcha, which is that if you do use persistent memory, you cannot use TDE and if you use TDE, you cannot use persistent memory. They are mutually exclusive. So just because of the way TDE works, it’s I/O based encryption, things work in a different way when you write log blocks to that NVDIMM; so just something to be aware of if you’re thinking about this kind of solution.
Capacity planning. So we talk about capacity planning; capacity planning might not be quite what you think, so we will hopefully open your eyes. You need memory for all these kinds of things. Data, indexes and the growth of data indexes, row versions, depending on your workload, table variables, you have now memory-optimized table variables, and temporal. I have temporal on here and we’re going to drill down into that so you can see what the potential issue is from a resource perspective.
The temporal feature is available for on-disk and memory optimized tables, and the basic concept is that as you change rows in the temporal memory-optimized table, the older versions are written to a history table on disk. That is a correct statement for on-disk tables, it is a correct statement for memory-optimized tables, but for memory-optimized tables, there is one additional step in between which is that the rows are first written to a hidden staging in-memory table. So the old versions go there first, and then when the staging table reaches 8% of the size of the temporal table in-memory, there’s an async flush to the history table on-disk.
Now, the 8% size can occupy a lot of memory. You can manually flush this before it reaches 8%, but if you don’t, 8% of a lot of temporal memory optimized data could be a lot of memory that you’re using that is a kind of hidden memory consumer. So it’s something for DBAs,in particular, to be aware of if you use temporal and in-memory together. And of course people are querying the history table, the engine is smart enough to know that it has to look in both places. So just want to be clear.
Capacity planning from an IOPS perspective; now let’s say your workload IOPS is – you need 500 IOPS to do your in-memory workload. Because data and delta files can become inefficient due to a significant portion of the data rows being logically deleted, there is a background process known as file merge, it’s a garbage collection process basically for data and delta files, which attempts to identify adjacent sets of data and delta files that could be merged to one set. So you need read IOPS to read these files and you need additional write IOPS to write to the new set of files. So that’s why Microsoft recommends three times your workload IOPS to be able to handle file merge.
This is assuming that you’re in a write-intensive environment. That’s why you’re using in-memory. So these distinct ranges of transaction IDs 101 to 200 and 201 to 300 will be merged to a single new set of files that comprise 101 to 300. It’s done for efficiency purposes, from a querying perspective so it doesn’t have to constantly filter through so many logically deleted rows and also to optimize the storage footprint. But I will warn you that it takes a number of transaction log backups and checkpoints to occur in order for these files to be completely removed – merged and removed. And so the on-disk footprint can be much larger than what’s in memory. So it’s something that might surprise people who are not used to – didn’t expect to see that.
You need storage for your initial durable memory-optimized footprint and the growth of it, and because of the fact that the state changes for data and delta files doesn’t occur immediately as we just discussed, your backups can be larger as well and you could need more storage for larger backups.
The guidelines from Microsoft from a memory perspective, you need 2x your in-mem size, 3x your workload IOPS and 4 times your durable in-mem size. This stuff changes over time, I think there was some differences between what they proposed for 2014 and 2016. So this is how I understand it, I believe it is correct. I’m sure Sunil will jump in if it’s not correct. But anyway, it could change in future versions. You need to keep up on what the requirements are.
Sunil Agarwal: This is correct. What we have, the 4x for storage is a very pessimistic estimate, but you’ve got to make sure even though…
Ned Otter: Okay, thanks, Sunil. I wanted to walk through a sample migration. Now, this migration is assuming that you have tables on-disk, and you won’t want to migrate some, but not all of your tables to in-memory, which I think is a typical scenario. It may also be typical to have all tables in-memory from perspective of a new application. But if you have a bunch of cold data in your database and you want to take only the recent data and migrate it to in-memory, that could be a good use case, but there are things to be aware of.
So we will assume in this very simple sample migration that we have just these two tables, everybody is smeared with customers and orders. We want hot orders in-memory and cold orders on-disk, just like any OLTP system we don’t want any orphan child records if we delete, or attempt to delete a parent record. And if we are inserting or updating records to the orders table, whether it’s hot or cold, we must validate the parent ID, and this is traditionally done since the dawn of time with relational database systems.
The main question that we will try to address is where should the customers table reside, and again, this is an overly simplified sample, but it can illustrate some points. So above the horizontal line – black horizontal line – is memory, and below the horizontal line is disk. We have at the moment customers in-memory, of course hot orders in-memory and cold orders on-disk. And the three things we need to validate are in those boxes towards the upper right-hand corner. Can we validate the parent ID from the hot and cold tables and will we leave any orphans if we do bad things to a customer?
So the first thing we will do is create a foreign key between these two memory-optimized tables and that works. That’s a new thing in 2016, we can do that. So we validated one of the conditions. We will try to do that from the cold order side, and that will fail because foreign keys can only be – I should say, if any part of the foreign parent key relationship is memory optimized then both sides need to be memory-optimized. Foreign keys cannot span between memory and disk, so this particular approach will not work.
So we will try a different approach, which would be, okay, we’re going to start again with customers in-memory. We know we can create this foreign key, that’s supported, so we have one checkbox. We will try trigger this time, and the trigger from this table can reference in-memory optimized table, so we’re good. We got two out of three checkmarks. Now we try trigger the other way, now we can’t do a foreign key the other way. And the trigger also is not supported from – between memory and disk space tables. So this approach also does not work for us.
So the third iteration would be to move the customer table to disk, or maybe we left it there originally. You create a foreign key as is done by DBAs since the beginning of time, so we have one checkmark. Then you can create a trigger from the disk space table to hot orders in-memory and that works, so we know we have solved the orphans problem completely. But we can’t use over here the hot orders table, we cannot use a foreign key and we cannot use a trigger. We proved that in the prior scenarios. But we can use a stored procedure; one that is not natively compiled and interrupt stored procedure, and we can do our DML through that procedure. Let’s say one procedure for update, one procedure for insert, and as long as you stick to doing it – the DML changes only through these stored procs, then you can achieve all three checkmarks.
I would ultimately like to see an engine enforced way of doing this because this could make a lot of work for DBAs and developers, but if you have to do it, this is one way to do it. There may be other ways, but this is just one way that I came up with that is possible. Maybe is Sunil wants to jump in with some suggestions, please feel free. But this is my understanding about the limitations that foreign keys and triggers cannot currently span memory and disk. I don’t know if they ever will, so you need to figure out a way around it.
So this is what we covered today. Workload evaluation use cases and architecture. I have tons of blog posts on my website which I’ll be showing in a moment about in-memory. This is proof of my obsession, and that’s my contact info if anyone wants to contact me directly. I’m open to more questions Pinal if there are any. But that’s the end of my presentation.
Pinal Dave: I have one question which is very essential. So anytime the new feature which we discussed to the any of our customer, I’m sure you must be facing, so does Brent and Sunil as well. Like if I say, “Hey, this is the new feature and you should consider using it” and yes, I totally agree. People say, “Yes, is it right for me or not?” So is there some kind of consolidated checklist or something which can guide them – I’m just thinking loudly okay; this is nobody ask me question, this is my question at the end of this presentation like – I understand everything you said but like if I think about it, is it good for my database? I think I’m scared at this point of time. I just don’t want to touch in-memory. I’m being super honest here.
Ned Otter: So why are you here?
Pinal Dave: [crosstalk] I have no idea I exist in my database.
Sunil Agarwal: Except that reporting part which we have seen first.
Ned Otter: Okay, so my first question is why are you scared to try this feature? If you’ve proved through a proof of concept that it can be a great benefit for your write intensive workload, why would you be scared? People are using this in production already for years. I’m sure Sunil has many gigantic customers, Bwin would be one of them, I’m sure, that are using this in production and it works for them great. There are other issues…
Pinal Dave: Let’s not talk about the one person people who are very, very smart. Let’s talk about general people who are like me who are not sure what exist in their database too, but they wanted to go to Enterprise or they might already have Enterprise due to MSDN and would like to go for it. So they would like to go for okay, we have a lot of learn, our own knowledge is limited window, we wanted to do VOC but we do not know where to start. So the question is that is there any valid checklist which I should follow?
Ned Otter: So my response would be that you should do – I’m sure Sunil may have his own response, but you should verify what we did in this presentation, that you have latch contention and that it could be addressed by migrating it to in-memory. That should be easy to check, and that your wait stats that your top waits are page latched type waits and even if that doesn’t work for you and you don’t want to do that work, you could be getting killed in tempdb – I mean, I know a number of gigantic places that use in-memory for resolving their tempdb contention only. It has made tremendous difference in their write-intensive workloads. So you need to figure out what your bottleneck is and then determine if in-memory might help. This is what you have to do with any feature, not specific to in-memory.
Pinal Dave: Thank you.
Sunil Agarwal: So Pinal, I think the question that you asked is a very important question. How do people find if this feature is useful to me or not, so as Ned showed, in the Management Studio, there is a tool embedded that can analyze the workload and recommend certain tables that can benefit from memory-optimized tables, similar to the stored procedures, and I noted down a couple of pearls that Ned identified. I’m going to follow up on that. And the same thing we did for column store. So I, in fact, wrote a couple of queries that we have blogged, and we have embedded that knowledge or logic into a tool called Database Migration Assistant, and what it can do is it can analyze the workload and recommend features that can potentially benefit your workload. So that’s how we are doing it, and what I have seen in many cases, and even being a very important example, people try memory-optimized table when they are pushed to a corner because they’re running into a bottleneck where they cannot find a way out. And here comes memory optimizes tables that can give them a big relief. So that’s where we have seen people are much more open to trying and using the feature. If everything is running fine, I mean, why would you use…
Brent Ozar: I think the problem you run into is that everybody believes their server is running poorly. Everyone believes that it could be better and how do you know when, what features you add in to make it go better. And a problem with some of the repeatable systems, like you can run an assistant against your workload; in development then, say that the feature says, yes you should try this feature; you’ve got to be able to replay your workload somehow in another environment. For me, one of the big things about – if you’re thinking about a new feature is you’ve got to be able to rerun your load in another environment at the same kind of scale that you have in production. Man, that’s really hard for a lot of shops.
How about you guys? I mean, when you work with consultants, or work with clients, what percentage of clients do you think are able to rerun their exact same workloads over in another environment? And all says not often.
Pinal Dave: My customer just works directly on production.
Brent Ozar: Right, don’t even have development environments. Yes, how about you Ned? I mean, when you work with shops, are they able to run load tests efficiently and you know…
Ned Otter: No. You know, they – I’ve talked to people who wanted to test in-memory with a single person doing work in the QA database, and I said, no point in doing that. You need to fire up, and just like production, so yes, it’s difficult.
Erik Darling: Well, prods the best place to learn anyway. What’s a mistake if it’s not expensive, right?
Ned Otter: [crosstalk] If there’s one thing that people might take away from this session – hopefully, more than one, but there’s one; that in memory is a study unto itself, just like column store, and that it can change the database ecosystem in ways you might not expect. And you need to thoroughly test, not just your workload but what happens in the rest of your database system as a result of deploying this or any other feature. So, as you would say, Brent, it’s not a go faster button. It can be a go faster feature, but it requires a fair amount of effort to verify that that’s the case and to, let’s say, train staff on what to do. Let’s say – let’s give a for instance; I have a blog post about this. Let’s say you have five volumes and all five volumes run out of space, what do you do? What will your DBAs do? You can still read, you can still delete, but you can’t update and you can’t insert at that point. Will your DBAs know what to do? The answer is probably not; so that’s why I made a blog post about how to do that. So there are things about that…
Pinal Dave: Let me interrupt you and answer one of my questions which I asked earlier to you – so I have a very beautiful answer for you. Where do people start about figuring out if in-memory is good for you? Is there a good checklist of good consolidated information? Well, I think – I shared it on Twitter, I also shared in Slack, I think you have an amazing write up about SQL 2017 in-memory roundup. By just reading those nine points, ten points, that you’ve written, I particularly – I think I’m a little better consultant at this point of time because I knew those things, but I never knew all of them together. So Ned has a very good ten points write-up on SQL 2017 in-memory roundup, a blog post by you. I think that would be a good start for quite a many people. Like [inaudible] that they are confused about, that is [inaudible] applies with the case, cross applies or top end with ties, right? If you are using this one, should I be going with in-memory? They don’t have a – you know, only production, they can’t try this out. But now, by just reading, at least have a confidence that products support this feature – if does not work, I think they need to learn more. So I think it’s a good start. I think that’s on the screen too, so I just wanted to interrupt and bring it out that the one here. So thank you…
Erik Darling: Sunil, while you’re here, I just had a quick question. Is there ever going to be a quick way to get rid of Hekaton when you add it? Rather than dropping the – is there ever going to be, like, a remove Hekaton button? [crosstalk] try it, but a lot of folks don’t want to have to drop a database to get rid of it.
Sunil Agarwal: Erik, I think you won’t believe how hard I tried. This is 2013, the feature was being designed and I did not have a way to remove Hekaton, and I said this was unacceptable because I don’t think there’s anything in Microsoft SQL Server where you can create something and cannot remove it. But I think there’s a complexity with it – dropping a database or [inaudible] the data out is insane. I mean, nobody can do it. So what we are trying to do is find a really controlled environment, for example like a single user mode or something like that where you can delete it and get back to it. I mean, many customers are saying that is acceptable to us versus, you know, dropping the database. So we are working on it.
Erik Darling: There’s tons of stuff, you know, like what do you call it, RCSI where you have to be the only person doodling around in a database… [crosstalk]
Sunil Agarwal: Brent, just give me a second, can I request you something? This is something that we are evaluating in a very big way; if you can send me a mail why, in your experience, this would be very useful, it will help us build a stronger case. So if you don’t mind sending me a mail.
Brent Ozar: We had a production customer where they had no idea someone had turned it on. They were like, wait, what?
Sunil Agarwal: So Brent, why don’t you send me an email as well. I’m not kidding – feedback is so important.
Brent Ozar: Gary Dixon points out that, “It’s also hard to remove MDW.” Management data warehouse, that one’s tough to remove too.
Sunil Agarwal: So I wanted to make one quick comment to Ned about the RTO. I think you covered it very, very well. One of the things that I found that people don’t realize, that the transaction log in – the checkpoint that happens, right, automatic checkpoint happens for SQL Server, but the automatic checkpoint for memory-optimized tables happens after 1.5GB of log records. We have seen people who, in our [inaudible] 1 GB of log record, and checkpoint did not happen and they restart the SQL Server. It takes a long time because those logs are being applied in a sequential, until 2016. And I think – so that is extremely important to…
Ned Otter: It’s just a blog post that somebody from Microsoft created about that the other day. That before you shut down a database, to do a checkpoint. Just for anyone that might be viewing this, now or in the future, that 1.5 GB of the transaction log that must occur before a checkpoint, that’s not specific to memory-optimized activity, it’s any activity totaling 1.5 GB.
Sunil Agarwal: Absolutely right. And one more point I want to mention about RTO, if running always on, the readable secondary or the secondary replica is keeping all the data in memory. So when you do a failover, it is instantaneous.
Ned Otter: So that’s a big advantage over FCIs, because FCIs will have to restream all data. AGs, whether it’s readable or not, yeah, the data’s already there; assuming that your redo had been able to keep up, yes? So that’s a big advantage. I talk about that – I mean, this is one of my presentations and I’ve talked about recovery in that regard and the difference between HA and DR with in-memory and other features. So I do mention that.
Erik Darling: Is that only for a readable secondaries or any secondary?
Ned Otter: Any secondary.
Erik Darling: So does it work with mirroring too?
Sunil Agarwal: Mirroring is not supported with memory-optimized tables.
Ned Otter: [crosstalk] duplicated feature.
Erik Darling: What about log shipping?
Sunil Agarwal: Log shipping it does, right?
Ned Otter: Yeah.
Brent Ozar: The other think – Pinal asked an interesting question which is, where is – when you go to learn something new, where should you go and learn whether or not – what the best practices should be, whether or not it makes sense for you. I want to say one other thing too, especially while Sunil’s here. When you go to conferences, this is why you go to Microsoft attendee sessions. When there’s a brand new feature that no one else has out yet, they’re not out in the wild, nobody can talk about it; there are no community sessions on it. So it’ll be the same thing at PASS this year when, if you want to learn anything about Python, SQL Server on Linux, then you really need to be in the Microsoft employee sessions. Sunil, do you know yet if you’ve got a session at Summit this year; if you’re going to be in session?
Sunil Agarwal: So session in PASS Conference? So I have a session at PASS Conference, but the thing that is new is mostly how do you get extra miles out of the column store. Because everybody knows, when you run column store queries, it runs faster, but are you getting the best performance? So that is one of the sessions. And the second thing that I’m doing is the data loading to column store. This is actually the point Ned was talking about because when you use a new feature, there are a few gotchas that you need to be aware of. So loading data to column store is tricky, and we want to load that data fast. So that’s in a session that I have, how do you load data into column store in a faster way.
Ned Otter: So, Sunil, since we have the pleasure of your company on this call, I wanted to ask you – because Niko Neugebauer and I talk about in-memory a lot, and of course column store and memory optimized tables comes up. And we are both wondering, I should say – I’m going to put this the most diplomatic way possible, when will memory optimized column store get the same amount of love that on-disk column store gets? Or will it get it? Because there’s a lot of restrictions for memory-optimized column store.
Sunil Agarwal: That’s right. As you know, in 2017 – no log writes…
Ned Otter: I was being diplomatic.
Sunil Agarwal: I think the challenge that I’m seeing right now, I think we have not seen a significant eruption of memory-optimized tables with column store…
Ned Otter: Maybe it’s because it’s…
Sunil Agarwal: You’re right, maybe if we gave it more love, customers will love it a bit more too. So this is not something, but again, let’s do this Ned. I think this is a great way for me to learn about you. We should have a conversation, and Niko and I work very, very closely.
Ned Otter: I know.
Sunil Agarwal: Yes.
Erik Darling: One last question, what’s your favorite secret trace flag?
Sunil Agarwal: You’re asking me? My favorite trace flag is 9481, which is go back to the old cardinality estimate. And I know [Kara] will kill me when I saw that, but…
Erik Darling: Poor guys.
Sunil Agarwal: I have seen many, many cases where old cardinality estimator gives you much better query performance. But we are working very hard to fix it in that adaptive query processing that is happening in 2017 as a way to solve that problem.
Ned Otter: Yes, looking forward to that. Just migrated someone to 2016 and we had to use – persistently use the old CE. And then one query worked better with the new CE, so we had to change that. So that’s the dance you have to do, but at least there are the things you can do. You have a lot of options to mix and match, which is good. But the new CE I’m sure is great but can be painstaking to make sure everything works well with it.
Sunil Agarwal: Right, and I think Ned, the point being that adaptive query processing will bring us to the next level.
Ned Otter: Yes, so just – the only thing I wanted to say – you and I will have our own offline conversation, but for column store for memory optimized, I would imagine that one of the main reasons people don’t experiment with it is because it basically makes it so that you cannot add new indexes or change the schema of your table, so I would imagine that’s a showstopper to most clients. And it used to be the case for on-disk, but you guys changed that. So I’m just guessing that once if you change that for memory optimized, people will consider to use it more.
Sunil Agarwal: Right, and I think…
Ned Otter: Even if the rest isn’t quite up to the same level as on-disk.
Sunil Agarwal: Right. And one of the things you know, it is very confusing about memory optimized with column store, we call it a clustered column store index. It should have been a non-clustered column store index and the history behind it, I mean, I tried my best to change that to non-clustered column store index, and also to use the columns; not all the columns, right? One of the big things, but you’re right. Those kinds of limitations are unacceptable. Let’s have a conversation offline. I think I’ll include [inaudible] and Kevin Farley on this one because…
Ned Otter: We’ll get Niko in, we’ll have a party.
Sunil Agarwal: You’re coming to the PASS conference right?
Ned Otter: Yes.
Sunil Agarwal: Okay. And Niko’s coming, so we will do that.
Ned Otter: Alright. We shall collaborate.
Sunil Agarwal: We can collaborate.
Ned Otter: Okay, great.
Brent Ozar: I have to feel that it’s got to be tough being a Microsoft employee at conferences because I bet people continuously come to you and go, “Why does the product work this way? You need to change this thing.”
Ned Otter: Yes, they beat Microsoftees up I’m sure.
Sunil Agarwal: I think the point is like, you have – I mean you have x amount of resources, and the question is when you put it, and one of the things we are trying to do, this is actually pretty interesting; that you know connect items, is like a four letter word at some level, right? Because hey, I did the connect item, I mean Ned, you mentioned a few times, I mean, what happens to it? And what we’re trying to do now is working to provide a much more business impact kind of information with the connect so that you know, when the product team looks at it, it knows that this is a significant impact because otherwise, item you’re just guessing.
Ned Otter: So for instance, Brent, you and I both recently commented on a blog post written by Erin Stellato, where she was asking, why aren’t people using in-memory, yes? And I believe your comment was along the lines of, you can’t go to a client and tell them they have to do a separate process to verify their database outside of checkDB and so you know, it’s a tough sell for [crosstalk] option.
Sunil Agarwal: As a DBCC something that we’re looking at and – but you’re right. Absolutely right.
Ned Otter: So a few things like that and some of the basic DBA stuff probably will go a long way to furthering adoption, I’m guessing. I’m not a marketing guy, but the less reasons you give someone to reject a feature, the better I think.
Sunil Agarwal: You’re right, you’re right.
Brent Ozar: But – and it costs money. I mean, it costs money for the developers, costs money for the testers, the documenters and all that. I remember one of the most eye-opening experiences I had in development was went to some kind of Microsoft meeting where they’d laid out a bunch of features on tables and we were given Monopoly money and said, you know, vote with money on which features you want the most, and that really opened my eyes as to how much it costs to do development and you can’t just ask for everything. I’d like all of these things, they sound amazing.
Ned Otter: You have to prioritize. Of course, the connect items that I have created that got attention, loved right away, where if there was a stack dump as a result. Of course, that’s something that’s really bad, right? The other stuff – unless you know, the feedback that I sometimes get from Microsoft is, is it blocking anyone from deploying in-memory. Like, if that notification says if you’re monitoring your environment with event notifications and now you adopt in-memory, you can’t use it for that. So you’re going to be missing something, so it’s tough to sell, like your monitoring is not going to be as effective as it was. So some things like that, but it’s not going to block someone from deploying from an application perspective. It just you know, doesn’t have the same feng shui in some cases.
Sunil Agarwal: I think in this case it is definitely a blocker, right? Because if I can’t monitor, I’m not going to deploy.
Ned Otter: Yes, I know one guy in particular who lives and breathes event notifications, and he uses in-memory because he has to and he wants to, but there’s now a hole in his monitoring. He likes to roll his own monitoring. But even maybe a third party product might use event notifications, I’m not sure, but – and maybe there are other ways to skin that cat, but he has his home grown monitoring which now he can’t extend to the entire surface area of his environment, so for him he’s still deployed, but it’s an issue for him.
Sunil Agarwal: Absolutely.
Brent Ozar: I disagree. I got lots of customers who will deploy things, they don’t even – they don’t monitor. They don’t even know if their server’s up, that doesn’t block…
Ned Otter: But those guys are good for business for you Brent. That’s a cash cow baby.
Brent Ozar: Do it live. Alright, well thanks a lot, Ned, Sunil, and Pinal for hanging out with us.
Sunil Agarwal: Thank you so much.
Ned Otter: My pleasure.