By: Allan Hirt on April 1, 2019 in SQLHAU, Training | No Comments
Can you believe it is April already? I know today will be full of people trying to be witty and pull the wool over your eyes. This is not that at all. Later this month, on April 22 and 23, I’ll be delivering SQLHAU’s SQL Server Availability Solutions in a Cloudy, Virtual World for the first time as an online, live class. It not only has instructional content but also features our famous labs.
Through Friday, April 12, there is a 20% discount available so you can grab one of the remaining seats. If you are looking to quickly get up to speed on modern SQL Server availability solutions, this is a class you can’t miss. Use the code APRILFOOLS at checkout.
By: Allan Hirt on March 25, 2019 in Availability Groups, Transact-SQL | No Comments
I teased this blog post a little while back at SQLBits, and I’m finally getting a chance to finish it. No stale, scheduled content here – it’s always hot off the presses.
Anyway, at Bits I did a few sessions, one of which was on AG and FCI troubleshooting (I’m also doing one tomorrow night at Boston SQL). As I was putting the talk together, one of the things I came up with was around endpoints for AGs. How an endpoint is fully configured is not exposed in SSMS, but you can query SQL Server’s dynamic management views (DMVs) which is easy enough if you know what you are looking for.
I did some of the initial documentation for Microsoft a few years ago for distributed availability groups. Since its publication, Microsoft has opened it up and anyone can suggest edits/comments via GitHub. So far so good – all of this plays into the bigger story.
Flash back to last summer. I saw this on Twitter from Glenn Berry, which spawned a whole thread (and to see the whole thread, click on the date).
When I wrote that documentation (and it was before 2018, if you care about such things), I used … gasp … a WHERE clause, not JOIN ON syntax. I’ve always written queries like that going back to the 90s. The gist of that thread was that some preferred the “newer” syntax like INNER JOIN, and subsequently, had the query changed to show that. You get the same results, but whatever. I didn’t take it personally and the thread was good natured.
Before you all start screaming at me that they are right (no, they were not … more on that in a minute): yes, I’m old. Discrete math was a required course for me for my Computer Science degree. It wound up being fate that things like sets, joins, etc., became the foundation for my career. I’m not ignorant. These days I’m much more of an infrastructure guy, but I do write infrastructure-related queries (hence the AG stuff).
Pedro Lopes, who is now on the SQL Server dev team, wrote the blog post “T-SQL Misconceptions – JOIN ON vs. WHERE” which does a good job of explaining the “it depends” around the newer style of joins. Why link this article? Microsoft deprecated the OUTER JOIN operators in SQL Server 2008. Two other sources of information on this are here and here (the first is Ward Pond’s old technet blog, and sadly will probably go away soon). If you’re keeping score at home, WHERE clauses are not deprecated unless you’re using *= and =*. The changes people made were wholly unnecessary and, as the author, I find the newer stuff harder to decipher than what I originally did. They were putting their own biases onto things.
As an experiment, I wrote the endpoint DMV query in two variants: my preferred way (WHERE clause) and with the JOIN ON syntax. First up is the query execution plan for the WHERE clause.
Figure 1. Query using a WHERE clause
Next up is the INNER JOIN.
Figure 2. Query using an INNER JOIN clause
Spoiler alert: they have the same execution plan. There is no difference between the two.
Figure 3. The actual WHERE clause
Figure 4. The actual INNER JOIN clause
From a results standpoint, you get the same output. In other words, zero difference all around.
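For anyone who wants to try the comparison without a SQL Server instance handy, here is a minimal sketch in Python using the standard library's sqlite3 module. SQLite is not SQL Server, and the two toy tables (endpoints and tcp_endpoints, with made-up rows) are only loosely modelled on endpoint metadata, so treat this as an illustration of the syntax equivalence and not as the query from the documentation.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with two hypothetical toy tables."""
    conn = sqlite3.connect(":memory:")
    # Table names, columns, and rows are assumptions made up for this demo.
    conn.executescript(
        """
        CREATE TABLE endpoints (endpoint_id INTEGER PRIMARY KEY, name TEXT, type_desc TEXT);
        CREATE TABLE tcp_endpoints (endpoint_id INTEGER PRIMARY KEY, port INTEGER);
        INSERT INTO endpoints VALUES (1, 'Dedicated Admin Connection', 'TSQL');
        INSERT INTO endpoints VALUES (2, 'Hadr_endpoint', 'DATABASE_MIRRORING');
        INSERT INTO tcp_endpoints VALUES (1, 1434);
        INSERT INTO tcp_endpoints VALUES (2, 5022);
        """
    )
    return conn


# Older style: comma-separated tables, join condition in the WHERE clause.
WHERE_STYLE = """
SELECT e.name, e.type_desc, t.port
FROM endpoints e, tcp_endpoints t
WHERE e.endpoint_id = t.endpoint_id
  AND e.type_desc = 'DATABASE_MIRRORING'
ORDER BY e.endpoint_id
"""

# Newer style: explicit INNER JOIN ... ON, filter left in the WHERE clause.
JOIN_STYLE = """
SELECT e.name, e.type_desc, t.port
FROM endpoints e
INNER JOIN tcp_endpoints t ON e.endpoint_id = t.endpoint_id
WHERE e.type_desc = 'DATABASE_MIRRORING'
ORDER BY e.endpoint_id
"""


def run(conn, sql):
    """Return all result rows for a query."""
    return conn.execute(sql).fetchall()


def plan(conn, sql):
    """Return the textual detail column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = build_demo_db()
where_rows = run(conn, WHERE_STYLE)
join_rows = run(conn, JOIN_STYLE)

print("Same rows:", where_rows == join_rows)
# SQLite's planner is a different engine from SQL Server's; this only shows
# whether SQLite itself treats the two forms alike.
print("Same plan:", plan(conn, WHERE_STYLE) == plan(conn, JOIN_STYLE))
```

The same side-by-side approach applies on SQL Server: run both variants, then compare the result sets and the actual execution plans.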
Preferences are just that. Preferences. Don’t foist them on others. Deal?
By: Allan Hirt on March 12, 2019 in SQLHA, Webinar | No Comments
As we first announced in the inaugural issue of our newsletter, Mission Critical Update, we will be holding free 30 minute webinars every other month starting in April. The first one will be on Wednesday, April 24 at 11AM Eastern/8AM Pacific/3PM GMT. We will send out details on how to join as the webinar gets closer.
Here is the information about the webinar:
The Only Constant Is Change
Whether you are at the top of the corporate ladder or an administrator, everyone has to manage change. That change comes in many forms including:
- Applying patches to resolve issues and improve reliability as well as security updates to keep your systems and business data safe
- Upgrading/migrating to new major versions of Windows Server, Linux, and SQL Server
- Taking advantage of evolving hardware-based options such as hyperconverged solutions on premises and zones in the public cloud as well as new features to improve your systems and applications
- Planning and deploying an updated architecture for the solution to make everything work in harmony
Each of these areas is a moving target. Strong organizations embrace change with well thought out plans and make the most of it with the support of management. With all of that in place, the disruptions to the business are minimized. Want to be that type of organization? Sign up for this free 30 minute webinar from SQLHA® to hear mission critical experts Allan Hirt and Max Myrick talk about the effect of change and how to stay on top of it. You will also get a sample high level plan to see how all of this can be achieved.
Want to get notified of upcoming webinars? Don’t want to miss an issue of Mission Critical Update? Looking for the latest on training and need to know when a Mission Critical Moment is published? Subscribe today and choose what you would like to hear from us. Mission Critical Update #2 is going out later this week – don’t miss out!
By: Allan Hirt on March 8, 2019 in Conference, SQLbits, SQLHAU, Training | No Comments
Happy Friday. Things are finally calming down over here at SQLHA HQ. Last week I was in the UK at SQLBits 2019 where I delivered a Training Day on Thursday, February 28 and a regular session on Friday, March 1. I’ve always loved hopping across the pond to speak at Bits; it’s one of my favorite conferences. The atmosphere is always great, and the organizers do such a good job. It also doesn’t hurt that I love going to the UK. It was my first time in Manchester. Unfortunately I didn’t get much time to explore. I arrived in London on Monday night and took the train up to Manchester on Tuesday. I spent the next two days working (the inside of my hotel room was lovely), did the Training Day, my session, and headed back down to London before flying home.
The Bits venue was really lovely – it was in the old (and converted) Manchester Central train station. I’m a train guy, so it thrilled me to no end. You can see it in the picture later in this blog post. The Training Day went smoothly – I had nearly a full room. I never take it for granted that after all these years and conferences around the world, people still want to show up and hear me speak. So thank you one and all who came out to the Training Day which was not free. I know you have a lot of choice, and am honored that you selected me.
SQLBits this year made the decision to have 50 minute sessions. Anything under an hour can be … challenging, but I felt really good about my regular session “Common Troubleshooting Techniques for AGs and FCIs”. The Bits team already has the session uploaded, so feel free to view it at the link above. The audience was great, and I had a bunch of questions after. Luckily it was break time so people could ask me as I was tearing down. I also wanted to get off the stage quickly so the next presenters could get set up. I’m considerate like that!
The sign for the room
I hope to be able to go back in 2020, but if you ever get the chance, I highly recommend attending at least one SQLBits conference in your career.
Getting home was a bit of an adventure. There was snow back home in MA, and for the first time in a long time, I did not fly direct from London. I went via JFK. My flight from JFK to Boston was cancelled, so I wound up staying the night in New York City and taking Amtrak home. It was not that big of a deal, and as someone who is on the road a lot, par for the course. These things happen. ProTip: if this happens to you, be nice to agents at the airport or on the phone. They’re having a crappy/stressful day, too, since they’re dealing with everyone else who is being affected. You get more with honey than vinegar.
The Training Day featured an all new lab that I put together, and was the first time in a few years I’ve done a Training Day with a lab. You may be wondering why I sometimes have preconference sessions (nee Training Day in Bits-speak) without labs. I was one of the first (if not the first) to pioneer the use of labs on this scale in the SQL Server community, and it’s nice to see some others doing it now, but it’s never a slam dunk. Why?
First, before I even think of putting a lab together, I work with the conference organizers to see if the venue can support approximately up to 100 people doing labs. Since that many people banging on WiFi will consume a lot of bandwidth, it would be a miserable experience for all involved (including yours truly) if connectivity sucked.
Second is the cost involved. There’s a cost associated with each student not only for the backend lab stuff, but also to the conference organizers for the bandwidth. If it’s ultimately going to be too expensive, we won’t do it. Nobody is in the business of losing their shirts on these things. It’s much more economical now than the first time I did this at Bits and Summit five or six years ago.
Assuming the cost and the infrastructure is there, is it worth doing a lab? Let me say this: when others have said that labs were a lot of work and not worth it (some of whom have changed their minds since …), I did them. I’ve always believed in the power of hands on learning. I did labs for private and public classes when I had to bring (or send) an external hard drive with VMs and load them onto PCs with hours and hours of setup. That does not scale for so many reasons, not the least of which is the size of laptop needed to do my labs that way isn’t a low end spec. I think back to some of the classes I delivered in Australia where I arrived days earlier, had to load PCs up, test them, etc. Doing labs was, and still is, a big commitment.
For the past 5 or 6 (maybe 7?) years, my labs have been done through a browser. Everyone still gets their own set of virtual machines, but they no longer need to be loaded on everyone’s PC and use a hypervisor locally. Soon I’ll be offering Azure- and AWS-based labs in addition to VM-based ones for my classes and possibly precons … stay tuned!
There’s the whole instructional design part of this which is putting together the VMs (and getting them all to the right point …) as well as writing the lab manual. For one day of training, it’s hard to get a lab that works in 60 – 90 minutes knowing people are at different skill levels, but you want to do something that’s meaningful and not just “click-click-click you built something but you didn’t learn anything”. Needless to say, labs were and still are a big time investment on my end, and I feel they are worth the effort. It’s gratifying to see people loving them as they are working their way through.
Finally, with a one day class, you have to balance instructional content to put a lab in, so you need to design that part of the day even better since you’re giving up roughly 25% of your time to give that hands on experience.
I will say after all these years, it never gets old watching a ton of people all access VMs (across 100 people it’s easily 400 VMs or more) simultaneously. That’s a lot of horsepower, and also why I need to know well in advance if it’s going to happen because the folks running the backend need to ensure there’s enough horsepower reserved for the day. It’s not just “show up and run labs”.
Attendees doing labs at my SQLBits 2019 Training Day
If you want the best training and labs, sign up for one of the upcoming SQLHAU classes that are being delivered live online, or come see me in person during the Chicago dates in August. Use the code BLOG20 to get 20% off the April online class “SQL Server Availability Solutions in a Cloudy, Virtual World” which does have a lab. The discount is good through March 31. You can also subscribe to not only get our newsletter, but also get notified when we have new training or other training-related items. Sometimes we offer subscriber-only discounts 🙂
VMware vExpert 2019
I’m pleased to announce that yesterday I was re-awarded vExpert for 2019. Like people showing up for things like the SQLBits Training Day, I never assume that being renewed is automatic. Thank you, VMware!
By: Allan Hirt on February 11, 2019 in Always On, AlwaysOn, Automation, Availability Groups, Azure, FCI, SQL Server, Template, Windows Server Failover Cluster | 3 Comments
I was working on some things this past weekend for a few upcoming projects, and one of those involves Azure and automation. Anyone who knows me knows that I will praise when necessary, and call out when something is not quite right. Microsoft’s Azure, Windows Server, and SQL Server teams earned my “What Were They Thinking?” badge.
What Is This About, Allan?
Microsoft has published a few solutions to automagically build AG solutions for you (there are none for FCIs right now) up in Azure using IaaS VMs. The one below has been around for some time and is easily found in the Portal.
Figure 1. Azure Template for Creating a Full AG Solution
Let me digress for a moment and say how this solution in my opinion is not quite kosher for a production deployment:
- AlwaysOn is not the feature name. Always On Availability Groups, or just Availability Groups, is the name of the feature. Always On has had a space now going on nearly five years. I may have a few blog posts about this somewhere 😉 (here is one example)
- No customer I know is going to build separate Active Directory Domain Services (AD DS) servers just for an AG; they’re going to have existing ones that they will use.
- The template only allows you to select Developer or Enterprise Editions of SQL Server, not Standard. Since this is a two-node only configuration, I’m not sure why this was not updated for SQL Server 2016 and later.
- You cannot choose what kind of load balancer gets created for the AG listener.
- No load balancer is created for the WSFC.
- Some regions now have Availability Zones (AZs) which are better than Availability Sets (AS). The template has not been updated to reflect that.
- Cloud witness! ’nuff said. Building a FSW here is totally valid, but this is an Azure solution. This was built pre-Windows Server 2016 which is when cloud witness was introduced.
I ran the template and it took 1 hour, 1 minute, and 52 seconds to complete.
In theory, this particular template is an ok(ish) solution if you want to kick the tires on AGs in a non-production way and see what they are all about purely from an AG perspective. However, this solution they put together is VERY old (I think about five years at this point) and outdated, not to mention people generally deploy AGs after they have databases. If Microsoft wants people to use this, they should update it to reflect a more modern architecture and have the ability to use things like Standard Edition, AZs, and cloud witness.
More recently, Microsoft released a few new things they blogged about the past few months: Automate Always On availability group deployments with SQL Virtual Machine resource provider from December 2018 and Simplify Always On availability group deployments on Azure VM with SQL VM CLI from February 2019.
The workflows for the last two links are a bit … odd. It’s just so much easier to create the WSFC in guest and it solves the major problem I’m about to describe below which prompted this post. If I’m already in the guest, outside of needing to do any load balancer stuff, why would I do stuff in Azure? It’s not really easier and you probably already have PowerShell, T-SQL, or other scripts to do most of this. Some of this feels like a solution looking for a problem that doesn’t really exist. Choice is good but …
Sound off in the comments if you agree or disagree. I’m curious to see what people think.
The Real Problem
I had a good look at the Desired State Configuration (DSC) module for Windows Server Failover Clusters (WSFCs) which is called xFailOverCluster. This is mostly the heart of the matter. The latest version as of this blog post is 18.104.22.168. Specifically, I was seeing what it could and could not do, and there is one major chunk missing from it: validation. The big Azure template I complain about above also does not run validation. Why is this a bad thing?
Look at Microsoft KB327518 “The Microsoft Support Policy for Clustered Configurations of SQL Server with Windows Server”. That links to KB2775067 “The Microsoft support policy for Windows Server 2012 or Windows Server 2012 R2 failover clusters”. That KB also applies to Windows Server 2016 and 2019. Focus on this line:
“The fully configured failover cluster passes all required failover cluster validation tests. To validate a failover cluster, run the Validate a Configuration Wizard in the Failover Cluster Manager snap-in, or run the Windows PowerShell cmdlet Test-Cluster.”
What does this mean? To have a supported WSFC-based configuration (doesn’t matter what you are running on it – could be something non-SQL Server), you need to pass validation. xFailOverCluster does not allow this to be run. You can create the WSFC, you just can’t validate it. The point from a support view is that the WSFC has to be vetted before you create it. Could you run it after? Sure, but you still have no proof you had a valid configuration to start with which is what matters. This is a crucial step for all AGs and FCIs, especially since AGs do not check this whereas the installation process for FCIs does.
If you look at MSFT_xCluster, you’ll see what I am saying is true. It builds the WSFC without a whiff of Test-Cluster. To be fair, this can be done in non-Azure environments, too, but Microsoft gives you warnings not to do that for good reason. I understand why Microsoft did it this way. There is currently no tool, parser, or cmdlet to examine the output of Test-Cluster results. This goes back to why building WSFCs is *very* hard to automate.
Knowing this, I would change all of this to build the AG (or FCI) VMs with the Failover Clustering feature enabled, then validate and build the WSFC inside similar to what is in the workflow for building the AG on your own. So it’s still a mix of automation and some minor human intervention.
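As a rough illustration of that ordering (validate first, create only afterwards), here is a minimal sketch in Python. Everything in it is an assumption made for the example: ValidationResult, build_wsfc, and the two stand-in callables are hypothetical names, the "at least two nodes" rule is a placeholder, and nothing here parses real Test-Cluster output. In practice the validate step would be a person running and reviewing Test-Cluster before the WSFC is created.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ValidationResult:
    """Hypothetical summary of a validation run, as judged by whoever reviewed it."""
    passed: bool
    notes: str = ""


def build_wsfc(
    nodes: List[str],
    validate: Callable[[List[str]], ValidationResult],
    create: Callable[[List[str]], None],
) -> ValidationResult:
    """Run validation first and create the cluster only if validation passed."""
    result = validate(nodes)
    if not result.passed:
        # Stop here so an unvalidated configuration is never built.
        raise RuntimeError(f"Validation did not pass; WSFC not created: {result.notes}")
    create(nodes)
    return result


# Stand-in callables for the demo. They do not touch any real cluster.
created = []


def fake_validate(nodes: List[str]) -> ValidationResult:
    # Placeholder rule only: pretend validation passes with two or more nodes.
    ok = len(nodes) >= 2
    return ValidationResult(passed=ok, notes="" if ok else "need at least two nodes")


def fake_create(nodes: List[str]) -> None:
    created.append(tuple(nodes))


build_wsfc(["node1", "node2"], fake_validate, fake_create)
print(created)
```

The point of passing validate and create in as callables is that the human-reviewed step and the automated step stay separate, while the order between them is enforced in one place.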
MSFT_xCluster also has another issue in my mind in parsing the code: it seems like it only handles Active Directory Domain Services (AD DS)-based WSFCs. If you wanted to build a Workgroup Cluster variant of a WSFC that does not require AD DS, you are out of luck. This is acknowledged in that MS blog post from February I link above, and at least they call it out. Kudos.
We only support AD domain joined Windows Failover Cluster definition. The FQDN is a must have property and all AG replicas should already be joined to the AD domain before they are added to the cluster.
All of this feels a bit like a case of fire, ready, aim, or more specifically – deploy, understand supportability, automate.
Can You Still Automate AG Deployments Using What MS Provided?
If you are looking at non-production environments such as development and QA, use anything and everything I criticize above since supportability generally is not an issue there. You’re not deploying production systems in the truest sense (i.e. end user/customer facing), but keep in mind they are production systems for your developers and testers.
If you build the base Windows Server IaaS VMs and get through validation and want to automate beyond that, you’d have a fully supported solution if building the WSFC and AG portions are fully automated.
That said, if you know what you’re doing, building all of this yourself won’t take much more time and may even take less time – especially the WSFC piece. You can automate it yourself in different ways. Building a WSFC really does just work these days when it’s done right (kudos to the Windows Server dev team and the HA PMs). Do what works for you; if what Microsoft provides works for you, go nuts. Just know there’s more than one way to approach this problem.
The Bottom Line and What Microsoft Needs to Do
Automation has come a long way but we’re not fully there yet for clustered and supported configurations of SQL Server running on Windows Server up in Azure or any of the public clouds for that matter. Here’s what needs to happen:
- Fix things so that Test-Cluster is run and the output is checked before building the WSFC and the AG.
- Should Microsoft deem it acceptable to support these automated methods already out there for production builds, they need to say that somewhere other than a blog post officially AND update KB2775067 accordingly that the validation requirement is waived. Otherwise there will be conflicting information out there which is bad for everyone including Microsoft. Microsoft needs to stop that nonsense right in its tracks.
- Update any templates and Wizards accordingly.
When and if these things happen, by all means, automate away in Azure even for production!
Need help with your availability solutions, especially if you are looking at any of the public clouds? Contact us today and we can kickstart your projects into high gear.