Blog

Configure a WSFC in Azure with Windows Server 2019 for AGs and FCIs

By: on July 1, 2020 in Always On, AlwaysOn, Availability Groups, Failover Cluster Manager, FCI, Windows Server 2019, Windows Server Failover Cluster | No Comments

Over the past month or two, I’ve seen a lot of people run into a problem with Azure-based IaaS configurations that isn’t well documented, and I want to address it in this blog post. For those creating Always On Availability Groups (AGs) or Always On Failover Cluster Instances (FCIs) with a Windows Server Failover Cluster (WSFC) using Windows Server 2019 as the underlying OS, things have changed. There are other “problems”, but I’m specifically going to address one pain point: the distributed network name (DNN).

Introduced in Windows Server 2016, a DNN is another kind of name resource in a WSFC. It differs from a standard network name resource because it does not require an IP address. In Windows Server 2019, the name of the WSFC, which is also the Cluster Name Object (CNO) in Active Directory Domain Services (AD DS), can be created with a DNN. A DNN cannot be used to create the name resource for an FCI or an AG’s listener.

Figure 1 shows what a DNN looks like in Failover Cluster Manager (FCM).

Figure 1. Distributed network name in FCM

The new Windows Server 2019 DNN functionality has a side effect that affects Azure-based configurations. When creating a WSFC, Windows Server 2019 detects that the VM is running in Azure and will use a DNN for the WSFC name. This is the default behavior.

FCI Behavior

During Setup, you’ll get to the step cluster_failover_cluster_name_cluster_config_Cpu64 shown in Figure 3 where it’s configuring the network name resource.

Figure 3. Making progress, right?

Unfortunately, the error message in Figure 4 will also pop up. This error is fatal – you can’t create the FCI.

Figure 4. Not so fast …

The fix? Recreate the WSFC without a DNN. Before you do that, make sure you uninstall SQL Server cleanly; otherwise, that could be unpleasant.

Listener Behavior

With both SQL Server 2017 and 2019, creating a listener when the WSFC uses a DNN seems to work. I have no reason to believe it will not work with SQL Server 2016. However, there’s no official support statement from Microsoft on using the DNN-based WSFC with AGs, nor do I believe it was tested with them. Just because I have not encountered issues in my limited testing does not mean it is good or bad – I’m just relating my experience.

The Core of the Problem with Windows Server 2019

If you require a static IP address for a WSFC in Azure, the only way to create it is with PowerShell. This is the typical syntax for an AG. For an FCI, you would want storage, so leave out the -NoStorage parameter.

Figure 5 shows the output of the command.

The WSFC creation process creates a WSFC with a DNN despite specifying a name and IP address. The value for resource type is Distributed Network Name, not Distributed Server Name as shown back in Figure 1. Once again, as with group/role, things are different in the UI vs. PowerShell. Get your act together, Windows!

Figure 5. It didn’t create what you thought it would

How Do You Create the WSFC without a DNN?

The answer is in this obscure blog post from Microsoft. That blog post links to a bunch of YouTube videos created by John Marlin, who is on the HAS team of Windows Server. What you need is buried in Part 6. It’s a new switch/parameter in the New-Cluster PowerShell cmdlet called ManagementPointNetworkType that is not even documented there. I created a pull request to update the documentation yesterday.

ManagementPointNetworkType can have one of three values: Automatic (the default value, which detects whether you are on premises, using another cloud provider, or running in Azure), Singleton (use a traditional WSFC name and IP address), and Distributed (use a distributed network name for the WSFC name).

To create a WSFC using a static IP address and a traditional WSFC name and IP address with no shared storage, here’s the syntax:

Figure 6 shows the output, which is what we want to see – a traditional WSFC with a name and an IP address.

Figure 6. Much better

You also get a much more familiar display in FCM. I really wish Microsoft would not use Server Name (or Distributed Server Name) under Core Cluster Resources since you don’t really access the WSFC directly. I get why it needs a name, but wouldn’t WSFC Name or Distributed WSFC Name be better here?

Figure 7. Traditional WSFC with a name and IP address

Postscript

Before anyone sends nastygrams or leaves comments, I was just showing the WSFC creation. There are more steps. You still need to add a witness resource, which up in Azure should be cloud witness. You may also need to create an ILB (or use an existing one) depending on your configuration. Both of those are outside the scope of what I’m covering in this post.

It’s my hope that the SQL Server dev team officially tests and certifies DNNs for use with AGs and FCIs. FCIs appear to require a fix to Setup. That means FCIs may not work until a CU or possibly vNext (if at all).

The TL;DR

Using a DNN right now may work for you depending on what you are deploying (AG or FCI). No matter what path you take, at least you now know how to deploy things. Happy clustering!

Configure SQL Server Backup to URL Using Azure Cloud Shell

By: on May 18, 2020 in Azure, Backups, SQL Server | No Comments

Happy Monday, everyone.

I had to do something this weekend that involved the SQL Server feature backup to URL, which allows you to back up databases in a SQL Server instance directly to Azure blob storage. The PowerShell for it is quite the adventure … and cumbersome. There’s a much easier way to get this done using Cloud Shell up in Azure. I used Bash, but I’m sure there’s an equivalent Azure CLI PowerShell way as well.

Backup to URL/Restore from URL works from Transact-SQL and the SQL Server PowerShell cmdlets. There is no UI option in SSMS. Figure 1 shows what happens when I tried to restore a valid backup stored in a URL in version 18.4. I hope this eventually gets fixed. For all graphics in this post, click on them to enlarge.

SSMS cannot use URL for restore

Figure 1 – Thanks for playing … don’t try again.

Storage Accounts and Access Keys

This post assumes you already have a storage account in Azure. To do nearly anything, you need one. You can create a special one just for backups if desired, but it’s not necessary. Work with your IT admins to see what the general direction is here if you already have a presence in Azure.

Whatever storage account you use, you’ll need three things: the name of the resource group that contains the storage account if using Cloud Shell, the storage account name, and the access key for the storage account. You can get the storage account name and the key from the Azure Portal. Both can be found on the Access keys blade for the Storage Account as shown in Figure 2.

As an aside, the fact that the menu items do not capitalize the first letter of every word completely drives me NUTS. Yes, I get that’s how things are today so get off my lawn. But if people stuck to it or used it consistently, I’d be fine – annoyed, but none the worse for wear. Look at the screen shot below. Storage Explorer (which is a specific service in Azure) versus everything else (i.e. Access keys). Boy, that shift key is really, really hard for those extra words, no? Anyway …

Access Keys Example in Azure

Figure 2 – Access Keys in Azure.

If you don’t want to use Bash to put the storage account key into a variable, you can just cut and paste from the Azure Portal. If you want, stick it in a Notepad (or the equivalent on your OS) document that you don’t save; it’s just a file to keep open so you don’t have to keep navigating back to that blade.

To get the access key in Bash and set it to a variable, execute the following command:
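The original command did not survive on this page, so here is a minimal sketch of what it could look like with the Azure CLI. The function name get_storage_key and the names in the usage comment are placeholders I made up, not values from this post; the --query and --output tsv switches are there to return only the key value without quotes.

```shell
# Print the first access key for a storage account.
# $1 = resource group name, $2 = storage account name (both placeholders).
get_storage_key() {
  az storage account keys list \
    --resource-group "$1" \
    --account-name "$2" \
    --query "[0].value" \
    --output tsv
}

# Example use in Cloud Shell (hypothetical names):
#   key=$(get_storage_key MyResourceGroup mystorageaccount)
```

Wrapping the call in a function is just a convenience; running the az command directly and assigning its output to a variable works the same way.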

Tip: you could also put the storage account name into a variable since you’re going to use it a few times, but if you have it handy, it’s just as easy to cut and paste. Don’t overthink things.

Containers and the Shared Access Signature

The backup file you generate in SQL Server or restore from is stored in a container as a blob.

If there is already a container you will use, you can skip this step. If a container is not created already or you want to use a different one, create a container in the storage account. Below is the syntax for Bash:
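Since the original syntax is missing here, the following is a hedged sketch of creating the container with the Azure CLI. The function name create_backup_container and the container and account names in the usage comment are assumptions for illustration only.

```shell
# Create a blob container in an existing storage account.
# $1 = container name, $2 = storage account name, $3 = storage account key.
create_backup_container() {
  az storage container create \
    --name "$1" \
    --account-name "$2" \
    --account-key "$3"
}

# Example use (hypothetical names; $key holds the storage account access key):
#   create_backup_container sqlbackups mystorageaccount "$key"
```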

Now you have to generate the shared access signature (SAS). This is how you will set up the secure access to the blob up in Azure. -o tsv is the same as --output tsv which takes care of the double quotes at the beginning and end of the output. Remember to set the value for the expiration date (--expiry) well into the future. The date is in UTC format.
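The command itself is missing from this page, so what follows is a sketch rather than the exact line I ran. It assumes the SAS is generated at the container level with az storage container generate-sas, and it assumes read, write, delete, and list permissions (rwdl); adjust those to your own policy. The function name, container name, account name, and expiry date in the usage comment are all placeholders.

```shell
# Generate a SAS token for a container and print it without quotes.
# $1 = container name, $2 = storage account name,
# $3 = storage account key, $4 = expiry in UTC (for example 2030-12-31T23:59Z).
generate_container_sas() {
  az storage container generate-sas \
    --name "$1" \
    --account-name "$2" \
    --account-key "$3" \
    --permissions rwdl \
    --expiry "$4" \
    --output tsv
}

# Example use (hypothetical values; $key holds the storage account access key):
#   sas=$(generate_container_sas sqlbackups mystorageaccount "$key" 2030-12-31T23:59Z)
```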

A big long string of text like the one below is generated. After that happens, you use it in the Transact-SQL used to create the credential in SQL Server.

Create the SQL Server Credential

Creating the credential in SQL Server is a simple Transact-SQL statement.
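The statement itself is not shown on this page. As a sketch, the small Bash helper below prints the Transact-SQL for a SAS-based credential so you can paste it into a query window. It assumes the usual pattern for backup to URL: the credential name is the container URL and the identity is the literal string SHARED ACCESS SIGNATURE. The function name and the values in the usage comment are hypothetical.

```shell
# Print a CREATE CREDENTIAL statement for backup to URL using a SAS.
# $1 = storage account name, $2 = container name,
# $3 = SAS token (no leading "?" and no surrounding quotes).
print_credential_sql() {
  printf "CREATE CREDENTIAL [https://%s.blob.core.windows.net/%s]\n" "$1" "$2"
  printf "WITH IDENTITY = 'SHARED ACCESS SIGNATURE',\n"
  printf "SECRET = '%s';\n" "$3"
}

# Example use (hypothetical names; $sas holds the generated SAS token):
#   print_credential_sql mystorageaccount sqlbackups "$sas"
```

Printing the statement instead of executing it keeps the secret handling in your hands; run the output wherever you normally run Transact-SQL.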

That’s it. You should be able to back up to and restore from a blob in Azure. Figure 3 shows a successful database backup of WideWorldImporters to Azure using Transact-SQL.

Backup to URL is successful

Figure 3 – It really works!

The backup file WideWorldImporters.bak can be seen in Figure 4.

Figure 4 – WideWorldImporters in Azure blob storage

The error message below appears if something is configured wrong.

Chances are the problem was between the brain and the keyboard, such as fat fingering the name of the container or storage account. Some examples include extra characters at the beginning of the SAS such as a question mark (?) or having double quotes before and after the SAS.
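If you want to guard against those two SAS mistakes, a small Bash helper along these lines can strip them before the value goes into the credential. This is only a sketch; the function name clean_sas is my own invention, and it only handles a leading question mark and one pair of surrounding double quotes.

```shell
# Strip surrounding double quotes and a leading "?" from a SAS token.
# $1 = raw SAS string as copied or generated.
clean_sas() {
  local sas="$1"
  local dq='"'
  sas=${sas#"$dq"}   # drop one leading double quote, if present
  sas=${sas%"$dq"}   # drop one trailing double quote, if present
  sas=${sas#"?"}     # drop a leading question mark, if present
  printf '%s\n' "$sas"
}

# Example use (hypothetical token):
#   sas=$(clean_sas '"?sv=2019-02-02&sig=abc"')
```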

Why Use Cloud Shell and Bash?

In my experience, it is far fewer lines of code. If you do not have to create the container and want to just cut and paste the storage account name and access key, it is literally one line of code you need to write to generate the SAS required for the credential in SQL Server.

Hope this helps some of you out there.

Online Event Fatigue

By: on May 7, 2020 in Conference, PASS Summit, Pre-conferece, SQLbits | 2 Comments

Hello, everyone – it’s been a while. Like everyone else, I’ve been affected by the current state of the world and I’m finally able to take a few moments to blog. I’ve never been someone to just put out content for the sake of it. I had ideas over the past month, but did not want to seem like I was taking advantage of the situation or do a lot of “me too” type posts (i.e. “So you’re working from home … now what?”). My whole career has been based on helping people prepare for worst case scenarios. To me it would have felt a bit like what we call here in the US an “ambulance chaser”. I’m not going to prey on fears, but SQLHA is here for our customers new and old. If you do need help with anything, please reach out.

Everyone’s situation is different during this pandemic. Where there is a quarantine/stay-at-home/shelter-in-place or similar order, some of us are by ourselves while others are at home with family. That includes juggling work, maybe home schooling, and more.

Having said that, it seems like there’s been an uptick in the number of online events. I see two trends. The first is that most major conferences until mid-2021 have been converted to online instead of in person. I did not want to use the word cancelled because it’s not true if the event is still happening. Second, it seems like everyone and their aunt, uncle, cousins, and distant relatives is now doing a ton of online events. It’s like “Everyone is home, let’s give them something to do.”

The Rise of the Online Event

I was supposed to be in the UK at the end of March/early April to speak at SQLBits. Clearly that didn’t happen. It’s been rescheduled to the fall, but given the current world situation – especially with travel – as much as I want it to happen and to be there, my gut says Bits as an in person event may not happen in 2020. While I hope I’m wrong, part of me knows I’m not. Will we ever see rooms filled like the picture below in the near future?

Room filling up for my half day session at PASS Summit a few years ago

I’m also slated to speak at PASS Summit in Houston this year as I have a pre-con. They’ve already said there will be a virtual event even if the in person one can’t happen, which is forward thinking. But let’s talk reality for a moment: even if it happens as in person, how many people are going to show up? Most companies are not going to pay for people to travel anywhere right now. It may be hard and bordering on impossible to get into and out of the US from other countries. There’s the bigger question … do people really want to be in a place with lots of others where it may be hard to do social distancing? I’ve been speaking at conferences a long time and for the first time in over 20 years, this has me pondering if in person events will come back, and when/if they do, what they will look like.

There’s another aspect to all of this that is a bit of an elephant in the room: conference cost. Microsoft, Red Hat, and VMware among others turned their large, in person conferences (Ignite, Build, VMworld, Red Hat Summit), for which they charge attendees good amounts of money, into free online events. Within reason, big companies can absorb the financial hit of cancelling a conference that would have had thousands or tens of thousands of people and turning it into a free online one.

What does that mean for smaller conferences such as PASS Summit, SQLBits, and Live 360? Many smaller conferences have tight budgets and are largely run by volunteers who have day jobs and lives. Can they afford to do that? I’m guessing the answer is no. And what would people be willing to pay to attend an online event when so much is out there for free? Again, I have no answers, but I’m posing the question because I can’t be the only one to ponder this. We’ll certainly find out sooner rather than later. All of the free stuff may set a dangerous precedent as if that hasn’t happened already with how people view training (i.e. paid training versus a free YouTube video).

As a speaker, I hope major events still happen, but free for everything forever does not seem like a sustainable model. That doesn’t mean free is bad and I’m happy to support things like local user groups. In fact, I’ll be speaking at the Jacksonville SQL Server User Group in a few weeks at their virtual meeting. I’m looking forward to it. Giving back is a big part of who I am.

One of the biggest aspects missing from many online events is the interaction and networking that happens. For vendors in exhibition halls, it’s how they get in front of customers. How do you replace those face-to-face experiences in a virtual way that works for everyone?

Fatigue

This brings me to a more practical and very real issue for a lot of us – attendees, speakers, and organizers alike. As I mentioned above, it seems like since the pandemic, there have been a million webinars and online conferences announced. Some are multiple days and many hours.

As someone who has done training online, I know very few people can sit for hours on end glued to the screen watching technical content. I’m going to be announcing some online classes very soon, but the model I use spreads the content out, balancing the fact that you need to get work done with the fact that getting things in bite-sized chunks is better. It took me a while to come up with what works.

Add to that the pandemic: with everyone at home, this is not a normal situation. Can you even find the time with everything else on your plate to sit through all of this content – free or not? Some scheduled SQL Saturdays are transitioning to the online model, but do people really want to also spend their Saturday now in front of a screen?

Even before COVID-19, people have said to me (paraphrasing), “Well, if I’m not physically at a conference, they’re expecting me to work.” Even when I do teach – including in person classes – people have to duck out for work interruptions. Now that events are online and ubiquitous, why should a company allow you time to just sit and watch content, not answer mails or tickets, etc.? It’s a very real problem.

Truth be told, I was talking with some folks about doing an online thing that will be cool if we do it, but right now just doesn’t seem to be the time to introduce it. I’m seeing and hearing the fatigue. People are already starting to tune these online events out or have them on as background noise in the way some of us listen to music. The little separation they have between work/life balance is nearly gone. I think we’ll do it, but it has to be at the right time.

Where Do We Go From Here?

I don’t know the answer. If I did, I should buy a lottery ticket. What do you think about all of this? Will you attend an in-person event/conference again? What influences that decision for you? Do you have event/conference/webinar fatigue? What would you pay for an online event or training with great content? How long are you willing to sit and watch sessions in one sitting?

I would love to hear from you because we are all just trying to navigate a brave new world none of us have encountered.

Business Continuity – We’re Here

By: on March 13, 2020 in Business Continuity, COVID-19, Disaster Recovery | No Comments

If I think about it, my entire career has been based on business continuity (BC) – making sure that things are up and running after something happens. In our technology world, we generally break this down to two categories: high availability, which means you can survive a relatively local event, and disaster recovery, where you can survive something more catastrophic. The reasons we’ve all fretted about – server failure, flooding, tornadoes, tsunamis, ransomware, and things like them – now have a new player at the table, COVID-19. COVID-19 does not respect borders, how much is (or is not) in your bank account, etc.

Quite literally everyone on the planet has been affected by COVID-19 both personally and professionally. BC is in action, but not in a traditional way. Business is, for the most part, still happening. How it is being done is different and will be for the foreseeable future.

Travel has morphed into essential only for most of us. Conferences, concerts, and sporting events are postponed or outright cancelled. Nearly every company is asking employees to work remotely/from home – even ones that have shunned it in the past. The ones that are still requiring employees to come into the office will be changing their tune soon and show how prepared – or not – they were if something happened that wasn’t COVID-19. In essence, businesses are exercising, to a degree, their BC plans. Nothing is good until you test it, right? Some are failing. A longtime friend’s company is still saying everyone needs to be in the office. Read the damn tea leaves. This isn’t rocket science or something just for tech companies. Get your head out of your proverbial rear end.

What people forget sometimes in our industry is that BC is largely about people, too. It’s not just about servers (physical or virtual) and their associated tech. This big shift in how we will all be interacting and working raises some interesting things to think about as well as questions.

  • Will broadband providers be able to provide good connectivity and throughput now that more people are home during the day? I know they see it at night, but we’re talking about 24×7 load now.
  • Will broadband providers enforce data caps or lift them? That will certainly hamper some who use data a lot for their daily needs (and I’m not talking streaming Disney+, Netflix, etc.).
  • How will this affect supply chains for everything? We’re in a global economy. What happens, for example, if your SAN has a failure and you can’t get a disk? Can someone even get to your data center to fix it even if you do have a spare?
  • As my friend Joey D’Antoni speculates, what does this mean for cloud providers like Microsoft Azure, Amazon AWS, and Google Cloud? Right now their capacity is fine, but what about six months from now? Would it be cheaper to provision a bit more now? I can’t say yes, but you do not want to run into what I saw at my local Target last night. Hoarding is not good (yes, even for toilet paper or paper towels) but being prepared is. There’s a big difference.

Empty shelves of toilet paper and other essentials at Target in Watertown, MA, on March 12, 2020.

  • More importantly, after we’re on the other side of this – whatever that looks like – what will things look like both in our daily lives and our professional lives? Will it go back to the way it was or something completely different? Will in-person events such as conferences go back to the normal way of doing things? I will be at the rescheduled SQLBits this fall and hope to see you there.

Business continuity is all about planning and being prepared. SQLHA is poised to help you do that today and tomorrow. It’s what we do. COVID-19 has not affected us working with our customers new and old. Remote working has always been in our DNA. Don’t get me wrong – I know I love going onsite; my status with American Airlines and Marriott hotels can attest to that. But that doesn’t mean I don’t equally enjoy working with our customers remotely. It’s still satisfying knowing you’ve helped someone out and they’re pleased as punch at the end result. Don’t hesitate to reach out if you are looking for some help.

Now to the human side of things …

I know the next little while will put a strain on everyone in different ways. Be safe, smart, and not a superhero. There’s a reason extreme measures like social distancing are being put in place. Be kind and understanding. Even if people are not affected from an illness standpoint directly, some businesses – especially small, local ones like restaurants, bars, and stores – will be affected. If there’s some place you like, find a way to support them, too. Have patience if you are going out locally to a store, especially to places like supermarkets and Target. The people there are stocking shelves as fast as they can. Please and thank you should be in your vocabulary. If you are still going out to eat or picking up food, leave a generous tip if you can. These workers will be hit hard if they are not already.

When not working, I play music. All of my rehearsals and the one concert I had upcoming have been postponed or cancelled (and for good reason). Same with concerts I have tickets for. Similar to my local business comment, the arts will be greatly affected during all of this. Please support your local arts community; they’ll need it. Not going to get on my high horse, but pay for a download or CD of your favorite artist. Buy or rent a movie; don’t torrent. Now more than ever the arts community will need your support.

EDIT: Also – be polite to reps on the phone if you are calling to cancel/change travel plans (hotel, car, air, whatever). The reps didn’t make the policies; they want to help you. Luckily most companies have altered policies for this extraordinary circumstance. Not only should you say please and thank you, but if you normally skip the end-of-call survey, please take it. The reps are dealing with a very difficult situation and it helps them. Understand they are facing a challenge; have empathy.

I could not have predicted a global event such as this; most are fairly regional in nature. Maybe this will wake up some companies that what I’ve been saying for so long was not hyperbole. We’ll see.

Designing and Implementing Robust Solutions

By: on February 4, 2020 in High Availability, Scalability, Testing | No Comments

No matter where you live, you have most likely heard about the Iowa Caucus fiasco here in the USA this week. This is not going to be a political post, but what happened is something I have seen too many times in all my years working both on the dev and IT sides.

There are numerous topics that could be discussed in greater detail – the failure of redundancy (the availability guy in me is screaming to talk about this), the lack of testing (my QA background is also yelling loudly), but all of this points to a larger culprit: designing the solution properly to have the performance, availability, reliability, scalability, and security required. Do not forget the solution has to be usable by end users and easily maintained, not to mention tested. The word “robust” is a good single word to sum it all up. According to this 9to5Mac post that has a good, concise summary of what is known to date (and a link to a bigger New York Times article), most of that didn’t happen or just flat out failed. Then there is accounting for data inconsistencies, which according to the New York Times, is a coding problem in the app. ArsTechnica also did a good writeup.

The Iowa Democratic Party issued a statement. It highlights ALL of what I said above, but especially around testing. The fact the data collected was right but the reporting was incorrect is damning. As a data person that makes me angry. Did the reports actually meet requirements and did they test with real (or at least simulated real) data? So many things …

You may ascertain this is not my first rodeo. Whether you buy a packaged application, roll your own, or customize an off-the-shelf package, what worked then may not work now. There’s a difference in scale with few users or a very different workload. How you handle 100 rows is very different than millions or billions. That is not necessarily a failure of design because at the time many applications were designed with what was known. Sometimes things grow organically. Other times some aspects are overlooked with the “we’ll deal with that later” mentality. It is always better to get it right (or as close to it as possible …) the first time around. Retrofitting, or worse, replacing existing solutions is expensive in every sense of the word and may cause outages to do so. Hire smart people (FTEs or consultants) to help you. It is not a sign of weakness to admit you may not know what you don’t know and you need assistance. What does right look like? It would include, but is not limited to:

  • Proper application design
  • Proper schema design
  • Accounting for RTOs and RPOs for all components
  • Testing using production (or close to production-like) data and masked if necessary – functional, at scale (works on my machine with no load is not testing at scale …), and full business continuity
  • Securing everything but not hampering functionality and scalability while doing so (if possible)
  • A monitoring and overall management strategy that works including backup and restore
  • Taking any lessons learned and driving them into improvements

All of this is valid whether you are fully on premises (physical or virtual), in a public cloud, or using a hybrid which bridges both. Choice of technology or platform does not matter. All of the fundamental principles are the same. Crap in, crap out. The cloud will not fix your poorly designed application nor will throwing more resources often resolve all – if any – issues. At some point you need to take a holistic approach; touching one thing has tentacles elsewhere. It is rare that is not the case.

The right design is always an end-to-end approach, no matter if it is an existing implementation or something new. The application matters just as much as the backend. Over the years, we’ve helped customers achieve their performance, availability, and security goals by helping them implement solutions that work. SQLHA has worked with some of the largest systems on the planet. If you are not a large company, bank, or government, that does not mean your solution needs any less robustness or aspects of mission criticality. We do this with companies of ALL sizes.

If you want to avoid scenarios like the Iowa Caucus, contact us today. We’ve been there, done that and will roll up our sleeves to get you where you need to go.