By: Allan Hirt on December 13, 2018 in Always On, AlwaysOn, Availability Groups, Read Only Filegroups, SQL Server | 2 Comments
A question came across my inbox this week which I decided to investigate: what happens if you have read only filegroups as part of your database, and you want to use Always On Availability Groups? Let’s find out.
First, I have a database with two filegroups: one read write (PRIMARY) and one read only (ROFG).
Figure 1. Filegroups
I also have a table (ROTable) created on ROFG.
Figure 2. Table on a read only filegroup
I then created an AG named ROFGAG. I selected automatic seeding (more on that in a second) to initialize my secondary replica. According to the Wizard, all is good … or is it?
Figure 3. Successful AG creation
I checked the status of the AG and it wasn’t synchronizing. Seeing as this is not my first rodeo, I started looking and lo and behold, the database was not joined to the AG even though the Wizard said everything was OK at the end of creation.
Figure 4. DB not joined to the AG
So there appears to be a bug in the AG creation where if you have a read only filegroup and select automatic seeding, it didn’t initiate the seeding for some reason. Looking in the SQL Server log on ALEX,it gave me the telltale permissions issue.
Figure 5. Seeding permission not granted
Granted, this is SQL Server 2016 where there’s the 3 minute “issue”, but that generally worked OK through the Wizard. I manually issued the ALTER AVAILABILITY GROUP ROFGAG GRANT CREATE ANY DATABASE and lo and behold …
Figure 6. Seeding initiated
You can also manually back up the database, copy it to the other server, and restore it to the instance that is the secondary replica. Either way, once the database was on the secondary replica, with seeding , it joined to the AG just fine. If you manually restore the database with NORECOVERY, the database has to be joined to the AG.
Figure 7. Join the DB to the AG
I inserted a few rows into a table on the read write filegroup, and everything is synchronized as expected. Keep in mind that on the secondary replica you have the same files as the primary and both filegroups. So having a read only filegroup works just fine with an AG.
Figure 8. AG status
I now deviated from the plan. What if you just want the read write filegroup on the secondary replica, and not everything? Does that work? To do this, I removed ALEX as a secondary replica and then deleted ROFGTestDB from that instance. On ALEX, I then restored ROFGTestDB, but just the read write filegroup. That means I wouldn’t get ROFG or its corresponding .ndf file.
Figure 9. Restoring just the read write filegroup
Since I restored the database, in the Wizard I just selected the Join only option for initial data synchronization. SQL Server thinks all is fine … or is it?
Figure 10. Adding the replica with no read only filegroup
Unfortunately, the AG is not synchronizing even though it joined supposedly with no issues. The AlwaysOn_health extended events trace shows the join, but no other information indicating a problem. The same holds true for the SQL Server log. I was inserting rows into a table, and that DDL executed fine, so the issue is apparently not on the primary.
Figure 11. All is not well
At this point, I removed the secondary replica from the AG. That completed successfully, but after doing that, ROFGTestDB on ALEX (where the restore without the read only filegroup happened) is showing as Suspect.
Figure 12. ROFGTestDB is suspect
The only difference is that before when things worked, all of the filegroups were restored. I deleted ROFGTestDB from ALEX, and then restored it from a backup WITH NORECOVERY with both the read write and read only filegroups.
Figure 13. ROFGTestDB restored
I then joined ALEX to ROFGAG, inserted a few rows, and everything works as expected.
Figure 14. AG working
It looks like AGs work with DBs that have read only filegroups, but you have to make sure you have all files and filegroups. If you restore the database with only the read write filegroup(s), it appears the AG will not synchronize. I plan on following up with Microsoft to see if this is expected behavior, but it is definitely interesting because it reports success when it really is failing. That also means that at least the way things work today, you can’t have a smaller footprint on a secondary replica – it’s all or nothing.
Hope this helps some of you!
By: Allan Hirt on November 21, 2018 in Availability Groups, FCI, SQLHAU, Training | No Comments
Where has this year gone? It seems like it’s flown by. I’ve been on the road for the most part it seems since about mid-September with a mix of some personal, client work, and speaking engagements. However, one of the things I’ve been working on for quite some time is finally here – online training! Right now it is going to be live, instructor led but once I find some time in 2019, there will be more options. For more information on our approach to live, instructor led online training, click here.
There are seven class options/dates listed on our Events page, with a mix of US and EU/UK times. The times and all details are in each individual link on our Events page. Don’t worry – I’m eying other parts of the world for future deliveries. The seven offerings are span four different courses:
All feature our signature labs or some sort of hands on, which is something I have been doing for years when others thought labs were too much work. Yes, they are a lot of work but I know with complex topics like the availablity features, you need that hands on experience in some way, shape, or form.
To celebrate the “grand opening” of online training, we have two awesome sales you don’t want to miss: our Black Friday deal, which is 50% off everything through November 30th. If you somehow miss the Black Friday sale, all of the courses will be 30% throughout the month of December. Those who have signed up for our newsletter will get the first one in early December with a special deal, so if you haven’t signed up for some awesome content from SQLHA, what are you waiting for? Sign up today.
All sales are coded discounts which can be found in the individual links on the Event page.
If you need further justification for the classes, here’s a PDF to show why we have best-in-class offerings.
You can now stop asking me “When will you have online classes?” Go sign up for one!
If you’re in the US, have a Happy Thanksgiving if you are celebrating.
By: Allan Hirt on October 18, 2018 in Hyper-V, Linux, SQL Server 2017, SQL Server 2019, Windows Server, Windows Server 2019 | 2 Comments
If you’ve been in hibernation, today you woke up to a world where Microsoft has embraced open source and Linux. What was once unthinkable is now happening. What is going on? Why am I even talking about this?
Since the introduction of SQL Server 2017 and the support for Linux-based deployments, I’ve had a steady stream of questions from C-levels on down to DBAs asking in essence this: “Do I need to abandon SQL Server on Windows Server and learn Linux?” I would use something stronger if this was a casual conversation, but the answer is an emphatic “NO!” SQL Server still runs just fine and is supported on Windows Server (including Windows Server 2019, which is just released). Support is not ending any time soon. Linux is just another option and there may be enhancements specific to each platform because of their differences. It’s not an “either/or” thing. So breathe, OK? If you have a use case for Linux, by all means deploy SQL Server on it.
Just last month at Ignite while working the SQL Server area, I heard a lot of these same statements, but this time from a largely non-SQL Server-centric crowd. Sure, SQL Server 2019 deepens things with things like AGs on containers and the Big Data Cluster story which is right now based on Linux-, not Windows-based containers. I am hoping they work out things so Windows-based containers are supported for SQL Server in these advanced configurations, but remember that Windows and containers is much newer than Linux and containers. Let’s see how this shakes out before you start having nervous sweats. The Windows Server story for containers is much better now than it was.
This isn’t the only thing I’ve been hearing. There’s a rumbling in some corners that many feel Microsoft is abandoning Windows Server and it’s all about Azure. That’s just simply not true. Do I think Microsoft’s marketing has been amiss the past few years with regards to Windows Server which has had an unfortunate effect of de-emphasizing it? Yes. I may be a Microsoft MVP, but that’s my honest opinion. Windows Server, quite frankly, hasn’t seen a lot of love and I want them to correct that. There are some awesome features such as Storage Spaces Direct (which I’ve blogged about and has its challenges with SQL Server, but it’s still a big win for Windows Server). People just don’t konw about them.
Let’s be honest: there has been a LOT of marketing/push for Azure, and I think people are hearing Azure in a way that says to them Microsoft does not care about Windows Server. Nothing could be further from the truth. As a Cloud and Datacenter Management MVP (subcategory High Availability, i.e. Cluster), I’m a Windows Server MVP in addition to my SQL Server MVP award so I see both sides of the coin. I can tell you the Windows Server PMs give a hoot and there are a LOT of improvements in Windows Server 2019. Here’s a link to a blog post that links to many of the Ignite presentataions that talk about what’s new in Windows Server 2019.
While we’re at it, there’s also a negative impression around Hyper-V. As I am also a VMware vExpert, I will be the first to tell you that I see more vSphere than I do Hyper-V.. I am fluent in both hypervisros and have helped customers implement them. I conducted an informal poll yesterday on Hyper-V usage and by golly, there are people using it in the wild for on premises virtualization of SQL Server. That said, the predominant vendor I see is VMware. I’d be lying if I said otherwise. Hyper-V is a “just another” feature of WIndows Server, while vSphere is a product of VMware. I think it makes a fundamental difference in perception.
At the same time, I can see where these perceptions are coming from. I hope SQL Server ups its Windows Server game to not make people think support on it is going away, and Windows Server needs to be talked about in general more. I love Microsoft, but I’ll praise when necessary, and criticize when they need it. I’m doing the latter here, but it comes from a place of caring.
Do you feel SQL Server is de-emphasizing its use of Windows Server? Do you see that as a problem if it winds up being the case a few years down the road? Do you think Microsoft is abandoning Windows Server or due to the over emphasis on Azure, or has made it seem irrelevant? Do you think Hyper-V is still viable? Let me know your thoughts below.
By: Allan Hirt on September 25, 2018 in Availability Groups, AWS, Azure, FCI, GCP, IaaS, Linux, PaaS, Public Cloud, RDMA, SQL Server, Windows Server | No Comments
Hello from Microsoft Ignite on day two.
Yesterday was a big day between all of the Windows Server 2019, Azure, and SQL Server 2019 announcements. Others have covered things like new features at a broad glance (here’s the official list from Microsoft). I’ll get into some of those things over the next few weeks when I get some time to play with the bits for SQL Server 2019 and discuss things like the SQL Server Big Data Clusters. However, now that SQL Server 2019 CTP 2.0 has been officially announced, there’s something I want to address: the network transport for Always On Availability Groups (AGs) and how it must be improved.
Let’s Talk Disks
One of the keys to success for any AG configuration is the speed of the disks on all replicas – primary or secondary. If you want synchronous data movement, any secondary replicas have to be as fast or faster than the primary to keep up (network speed matters, and I’ll address that here in a minute). If you add read only workloads to a seconary replica, that increases disk usage, so again, speed matters.
We are beyond simple SSDs. The real speed is no longer there. Most people are looking at NVMe drives these days, which are faster flash-based drives than “traditional” SSDs. However there is a new (yet old) kid in town: persistent memory aka storage class memory aka PMEM. “Straight” memory is always going to be faster than going to disk due to the way systems are architected internally. Back in the day we had things like RAM SANs, but the idea of using memory for storage is coming back around again in the form of PMEM. SQL Server 2016 initially supported PMEM (NVDIMM only, and I think just NVDIMM-N) specifically for the tail of the log caching (see Bob Dorr’s blog post). Capacity for persistent memory was a bit small, so it made sense. SQL Server 2016 (and later) on Windows Server also now supports PMEM if the PMEM was formatted as block-based storage (i.e. it was configured like a normal disk to Windows Server).
There are two PMEM enhancements in SQL Server 2019:
- Support for Intel Optane
- For Linux, SQL Server data, transaction log, and In-Memory OLTP checkpoint files can now be placed on PMEM in enlightened mode
Many newer physical servers have slots for persistent memory; it’s not a passing fad.
What’s the point here, Allan? Whether done right with newer NVMe drives, PMEM, or both, you can get blazing fast IOPS for SQL Server. This is good news for busy systems that want to use AGs. However, there is a looming problem especially with these speeds: the network.
Why Is Networking Your Next Big Problem?
The stage is set: you’ve got blazing fast storage and a busy database (or databases) in an AG, but your network is as slow as two tin cans connected by a string. It won’t matter if your disks came straight from setting a a record at the Nürburgring. A slow network pipe will choke the ability to keep an AG that is synchronous in a synchronized state even with compression enabled. It’s that simple. The same could be said of a large database using seeding for the replicas.
I’ve talked about Remote Direct Memory Access (RDMA) in the past in two different blog posts (New SQL Server Benchmark – New Windows Server Feature and Windows Server 2012 R2 for the DBA: 10 Reasons It Matters), so I’m not going to rehash it in any kind of depth. TL;DR it’s really, really fast networking that at least on Windows Server, is lit up automatically when you have everything in place. However, not everything can use RDMA. Things such as SQL Server’s tabular data stream (TDS) need to be enabled for use on RDMA, just like Live Migration traffic in Hyper-V and SMB 3.0 (SMB Direct) was. SMB Direct can be used with FCIs, and has been supported for some time. It’s part of that benchmark linked.
Some good news, though:
- Windows Server and Linux both support RDMA (I’m not sure about containers, though … I’m guessing not, but I’d need to dig more)
- Both Hyper-V (Build 1709 or later) and ESXi (6.5 or later) now support RDMA inside guest VMs. The bad news: ESXi only supports it for Linux.
My Call(s) to Action
1. The Windows Server and SQL Server development teams need to work together to enable RDMA for AG traffic on Windows Server (which would most likely be Windows Server 2019 in a patch, or later; don’t hold your breath for Windows Server 2016), and SQL Server needs to get RDMA working on Linux.
2. VMware needs to support Windows Server workloads with their PVRDMA adapters. VMware really is missing an opportunity here.
3. We need RDMA for IaaS VMs in the public cloud that can be used with SQL Server. This is for two reasons: a) Storage Spaces Direct for FCIs b) AGs if RDMA traffic is enabled. For Azure, this would be enabled by the Azure compute and/or networking teams. Azure has some IaaS SKUs with RDMA networking so it’s possible, but they are for HPC and not general use such as D- and G-class VMs. There’s no RDMA that I can see in EC2 or GCP, so I think those are pipe dreams, but for those who want FCIs, it sure would be great to be able to deploy S2D right now and have it work well, and then also work for AGs down the road. Azure is our best hope here.
4. Assuming #1, Azure needs to enable RDMA so that Azure SQL Database and Azure SQL Database Managed Instance can take advantage of RDMA, and make sure it does things like work across Availability Zones in a region.
That’s it. I’m not asking for the sun, moon, and stars. Most, if not all, of this is doable. There’s already precedence for supporting RDMA for SQL Server via FCIs on Windows Server, and that also needs some cloud love if you want to use S2D up there. RDMA needs to be brought over the finish line for all of the SQL Server availablity scenarios regardless of platform. In a cloud first option, we should not be saddled by slow inter-server connectivity.
By: Allan Hirt on September 21, 2018 in Conference, Ignite | No Comments
Next week (September 24 – 28) I’ll be attending Microsoft Ignite in Orlando. I’ll be around Monday – Thursday, and will be working the Data & AI solution area in the Microsoft Showcase. Here’s when I’m scheduled to be there:
- Monday, September 24 4:45 – 7:30 PM
- Tuesday, September 25 12:00 – 2:00 PM and 5:00 – 6:00 PM
- Wednesday, September 26 9:30 AM – 2:00 PM and 5:00 to 6:00 PM
- Thurday, September 27 11:30 AM – 1:15 PM and 3:15 – 5:15 PM
Chances are I will also probably be hanging around the Data & AI or WIndows Server booths even when not scheduled, so you’ll find me. Also feel free to contact me via the SQLHA website or the Ignite app.
I look forward to seeing some of you next week.