Blog

December 29, 2022

A Lesson for IT – Don’t Be Southwest Airlines

Whether you live in the United States or not, by now you have probably heard about what is going on (or not, as the case may be) with Southwest Airlines (SWA). I was away for the holiday weekend visiting a friend and on the way back, even the employees of the other airline I was flying were talking about it and how the systems had basically melted down. As I say often, you want to be reading the news, not making the headlines and certainly not drawing the attention and ire of the US Department of Transportation (See their Twitter posts starting around December 26. Some examples: this thread, this thread, this Tweet, and this thread.).

What Happened?

From the outside looking in as someone who is a business continuity expert, this seems like it was a perfect storm of bad things converging at the same time. In the past week they have cancelled over 5,000 flights leaving passengers stranded, angry, and often in bad scenarios. Why are/were things so bad? Three quick points:

Mother Nature and the storms/weather that hit the US around the holiday and affected some of their busiest airports.
Not having a hub and spoke model for planes and people to be able to easily move the chess pieces around for things like weather events. This speaks to process and configuration as we think about it in IT.
Fragile, legacy IT systems still involved in day-to-day operations. In the case of SWA, there are systems that deal with flight and crew management. This problem is our good friend technical debt.

I feel bad for everyone involved – the customers affected, the employees who have to deal with the situation (especially the frontline ones who will feel the brunt of the customer wrath), and everyone in between.

Let me be clear: I don’t think SWA set out to ruin people’s travel around a holiday nor was it their goal to draw the attention of the US Government. This is the reality of business and IT – things happen, often at inconvenient times. People are affected by said event. Hilarity does not ensue.

Let’s Talk Technical Debt

You’re never a hero proverbially saving $1 now when it will cost you $10 to deal with whatever that problem is later. Kicking the can down the road is a flawed, dangerous IT strategy. I’ve addressed tech debt and other related issues before (selected posts: “Technical Debt – The (Not So) Silent Crisis”, “Outages In An Increasingly Connected World”, “Security Is An Availability Problem”, and “Another Day, Another Outage”) so if you want to know the basics in more detail, read those.

SWA did upgrade some systems a few years back to give “the carrier more flexibility to improve the Customer Experience and enhance revenue performance.” Clearly the “Customer Experience” has been top notch over the past week. When you’re not flying and have to reimburse customers and figure things out, you LOSE money, not enhance revenue performance.

Availability goals should always be based in reality with real world data. How much does downtime cost the business – literally? What penalties – financial or otherwise – will be incurred? Does our solution mitigate those risks? It seems as if SWA either did not properly assess risk or, worse, did not care. If it ain’t broke, don’t fix it, right? Wrong. According to this CNN report, SWA underinvested in its operations. Basic communications – including phone systems – were not working. Communication is crucial when the excrement hits the fan.

Andrew Watterson, SWA’s Chief Operating Officer, blamed the outdated scheduling software in a company call. The quotes from the call in the CNN article are telling.

I get that for large companies it’s hard to rip out existing systems, especially when you cannot tolerate much – if any – downtime. I spent the better part of the past 25+ years helping customers architect solutions (and will continue to do so at Pure) that perform well and are secure and resilient/highly available. Choices have consequences and customers need to make the right ones, especially when sunsetting older solutions that are very important. Always be looking forward.

How Do You Avoid Technical Debt?

I have worked with enough customers over the years to know that most people reading this blog have at least one legacy system hanging around. You know the one. It’s that system that if you look at it sideways, it acts up. That’s the one (or ones) you need a plan for sooner rather than later.

Being honest, tech debt is hard to avoid 100% of the time but you need to try. Be proactive, not reactive. Know when things like SQL Server, Windows Server, and other third party software are out of support. There are many nuances to dealing with technical debt which also includes ensuring that all staff has training and their skills are modernized. Technical debt is a people issue, too.

Know your core functionality and what you need to achieve. Getting lost in whiz-bang, fancy features and analytics does not mean a hill of beans if your company’s core goals are not met. In the case of SWA, they cannot move people from Point A to Point B. Let’s not even get into the potential hit to their reputation and bottom line that comes along with a failure of this magnitude.

Don’t become the next headline. Planning for obsolescence as soon as a system is brought online is really the exercise that needs to happen. If you do not bake obsolescence in as a feature from day one, you may be the next SWA or worse; events like this can take the business out permanently, too. Unemployment is not the goal.

What are your thoughts? Have you been in similar situations and if so, how did you get past the issue(s)?

December 20, 2022

The Only Constant Is Change

I hope everyone is having a good holiday season.

Since the beginning of my career in the 1990s, one tenet is still true: the only constant is change. Many of us joke that X number of years of experience is not the same year of experience X times. Careers are a journey, not a destination. You learn. You grow. Some things in life and work are cyclical and yes, some of the challenges we face are the same or similar. Heaven knows “everything old is new again” is true when you have been around as long as I have, but how we approach and deal with challenges is often completely different as times and technology have changed.

When I went independent in 2007 with Megahirtz and later also founded SQLHA, each was and is a unique experience. I’m proud of both companies. In 2007, I had been a consultant for more than seven years as an employee at both Microsoft and Avanade. By the end, I was basically doing my own business development and decided it was “now or never”. In my mid-30s, I took a leap of faith to invest in myself. What I thought might be a few years at most turned into over 15. I exceeded my expectations and smile when I think of the success I’ve achieved.

This year, I took another leap of faith. I made the difficult decision to step back from consulting to become an employee again. It is never an easy decision to make when you have a company, customers, and a business partner. Once again, I’m betting on myself to succeed in a new chapter of my life.

What Will I Be Doing?

My role is a Technical Evangelist for FlashArray at Pure Storage. My day-to-day tasks speak to everything I bring to the table – not just my technical SQL Server expertise. Being able to use the depth and breadth of my skills was important to me wherever I landed as I’m more than a SQL Server availability guy. I’m thrilled to be joining Pure and so far, they seem happy I’m there, too.

How Did It Happen?

The process was organic. Anthony Nocentino (Blog | Twitter) and I had a conversation at an event this fall. The topic of me looking was not an agenda item; it just came up naturally. Turns out it was perfect timing for everyone involved. I owe Anthony a huge thank you. Never underestimate the power of networking at events. At one point I may blog about my job search and all its ups and downs.

What About SQLHA?

Max and I talked with customers before I published this blog post. I strive to be open and honest in my dealings with people. They know I’m stepping back from consulting and the day-to-day management of SQLHA’s business. Max continues to work with our existing customers. There’s a bit of a transition period that I cleared with Pure.

What About My Relationship with Max?

I’ve known Max for over 20 years going back to when we were both blue badge employees at Microsoft. We are still good friends and that has not changed. There was no need to go through this process with two fingers held high in the air. I did not find the job at Pure and spring it on him. It was important Max was on board before I started to look for the right position as this impacted him, too.

What Changes?

Honestly, not much. I’ll still be that business continuity, infrastructure, virtualization, cloud, and more pedant (and smartphone curmudgeon) you all know and love – and sometimes love to hate. I’ll continue to speak at events including SQLBits in Wales in March 2023 where I have a Training Day “Architecting Scalable, Available, and Manageable SQL Server Deployments” on Wednesday, March 16. Not all of my speaking will be under the Pure banner. My book and public training are still on the radar map for 2023. SQLHA (the website and the company) will be around. SQLHA.com has been my Internet home for over 15 years and will continue to be. It will be the home to my blog and I’ll probably post the occasional video to the SQLHA YouTube channel. You can also find me on Twitter (for now, at least), LinkedIn, Mastodon (@SQLHA@techhub.social), and Counter Social (@SQLHA@counter.social).

I wish everyone the best as the curtain draws on 2022, and hope 2023 is even brighter for everyone.

September 13, 2022

Contained SQL Server Agent Jobs in SQL Server 2022

This blog post is part of T-SQL Tuesday #154.

I previously wrote about Contained AGs in SQL Server 2022 and demonstrated how to create a contained login. In this blog post, I’m going to talk about contained SQL Server Agent jobs because just like logins, they are a bit confusing from an administrative standpoint in their current pre-release implementation (this blog post was written using SQL Server 2022 RC0 and SSMS 19 Preview 3).

Create a Contained SQL Server Agent Job

As of the writing of this blog, contained SQL Server Agent jobs can only be created using Transact-SQL. If this changes, I will update the blog post accordingly.

Remember per my other blog post, the key is to do this in context of the contained AG. An example is shown in Figure 1.

Figure 1. In context of the contained AG.

The job I created inserts a random character every minute into the table TestTbl.

USE ContainedRobotoAG_msdb;
GO

EXEC sp_add_job
   @job_name = N'Insert Value Into ContainedRobotoAG TestTbl'; 
GO

EXEC sp_add_jobstep
   @job_name = N'Insert Value Into ContainedRobotoAG TestTbl',
   @step_name = N'Insert Value',
   @step_id = 1,
   @subsystem = 'TSQL',
   @database_name = N'ContainedAGDB1',
   -- Insert one random printable ASCII character (codes 33 through 126)
   @command = N'INSERT INTO TestTbl (TestVal) VALUES (CHAR (ROUND(RAND() * 93 + 33,0)))',
   @on_success_action = 1,   -- quit the job reporting success
   @on_fail_action = 2,      -- quit the job reporting failure
   @retry_attempts = 5,
   @retry_interval = 5;      -- minutes between retry attempts
GO

DECLARE @schedule_id int;
EXEC sp_add_schedule
   @schedule_name = N'Every Minute',
   @enabled = 1,
   @freq_type = 4,               -- daily
   @freq_interval = 1,           -- every 1 day
   @freq_subday_type = 4,        -- subday unit is minutes
   @freq_subday_interval = 1,    -- every 1 minute
   @freq_relative_interval = 0,
   @freq_recurrence_factor = 1,
   @active_start_date = 20220913,
   @active_end_date = 99991231,
   @active_start_time = 0,       -- 00:00:00
   @active_end_time = 235959,    -- 23:59:59
   @schedule_id = @schedule_id OUTPUT;
SELECT @schedule_id;
GO

EXEC sp_attach_schedule
   @job_name = N'Insert Value Into ContainedRobotoAG TestTbl',
   @schedule_name = N'Every Minute';
GO

EXEC sp_add_jobserver
   @job_name = N'Insert Value Into ContainedRobotoAG TestTbl';
GO

The Challenge of Contained SQL Server Agent Jobs

Just like with logins, the SQL Server Agent job is not visible in SSMS. Like creation right now, it is just Transact-SQL. An example is shown in Figure 2.

Figure 2. Contained SQL Server Agent job

This has some nasty side effects.

1. Someone could create a duplicate job at the instance (not AG) level. I tested this – see Figure 3. It’s not only doable, but the possibilities of how that can go wrong are endless.
Figure 3. Duplicate jobs – one in the contained AG, one at the instance level
2. The only way right now to see if a contained job ran successfully (one time or recurring) is via T-SQL. See Figure 4 for an example. Many admins have scripts as part of their processes (hello DevOps folks!), but I know just as many or more who rely on SSMS to see the status of a job. What if a backup job created in context of a contained AG is failing and no one catches it because it’s not visible?
Figure 4. Job status
3. Somewhat associated with #1, I see a potential security hole here. What if someone creates a nefarious SQL Server Agent job in a contained database which does nasty things and no one knows about it and what it is doing? The implications – especially because Agent jobs can be more than just T-SQL – are not good.
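
Since T-SQL is the only window into a contained job's status right now, a query along the following lines is one way to check run history. This is a minimal sketch, not the exact query behind Figure 4: it assumes the contained msdb is reachable as ContainedRobotoAG_msdb (as in the creation script above) and that it exposes the standard dbo.sysjobs and dbo.sysjobhistory tables the same way a regular msdb does.

```sql
-- Sketch: most recent history rows for the contained job.
-- Assumes you are in the context of the contained AG, as with the creation script.
USE ContainedRobotoAG_msdb;
GO

SELECT TOP (20)
   j.name AS job_name,
   h.step_id,
   h.step_name,
   h.run_date,      -- int, YYYYMMDD
   h.run_time,      -- int, HHMMSS
   CASE h.run_status
      WHEN 0 THEN 'Failed'
      WHEN 1 THEN 'Succeeded'
      WHEN 2 THEN 'Retry'
      WHEN 3 THEN 'Canceled'
      WHEN 4 THEN 'In Progress'
   END AS run_status_desc,
   h.message
FROM dbo.sysjobs AS j
JOIN dbo.sysjobhistory AS h
   ON h.job_id = j.job_id
WHERE j.name = N'Insert Value Into ContainedRobotoAG TestTbl'
ORDER BY h.instance_id DESC;
GO
```

Dropping the WHERE clause lists history for every job in the contained msdb, which may also help spot jobs nobody knew were there.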

The Bottom Line

I still like contained AGs, but this post highlights the biggest blind spot and challenge with them: administration. My hope is that SSMS is updated by the time SQL Server 2022 is released to make them not only easier to create and manage, but make SQL Server more secure.

June 1, 2022

So What About That Book, Allan?

This is a post I’ve been meaning to write for some time and one of the hardest. I’m not sure where to begin, so I’ll start somewhere.

Some of you may know I’ve been working on the book Mission Critical SQL Server for a long time … way too long. Sometimes the most well intentioned plans take a left turn at Albuquerque. Let me explain.

Change, Change, Change

I wanted to write the spiritual successor to my SQL Server 2005 book – one big honkin’ volume that would be the reference for all things business continuity. In theory it was an awesome idea.

At the time I started writing there was no Linux, virtualization was still on the rise, and not many were using any cloud provider. We all know the landscape today; so much has changed. That has had its own negative as well as positive impact. There are completed parts that I’ve probably written over again or revised four or five times but have never seen the light of day because every time I thought I was there, something else needed to be added. Multiply that over multiple operating systems, SQL Server versions, and environments (physical, virtual, and cloud) … it became paralyzing.

I can’t tell you how many hours I’ve stared at a screen with Word up and nothing was coming out. Perfection is the enemy of done.

Being Honest

At some point in the process, the book became an albatross and quite frankly, I felt a lot of shame around it not being done. The more time went on, the bigger the shame. I felt like a failure. The book became a mythical, larger than life “thing” overshadowing so many other things in my life. I started to think it would never live up to expectations – real or perceived – which made me shut down. Disappointing people is a terrible feeling and on top of it all, I also gave my detractors plenty of ammo for it not being done. Any time someone would ask me about it, I had a sense of dread. All of this felt like a 10,000 pound weight on my shoulders.

Last year I decided to get some help – especially with the shame. For the first time in a long time, I can say I am in a good head space with the book. Recently, I’ve been able to start writing again which has been very freeing. I’ve approached someone to help me with the book, too. One thing I’ve realized as I’ve gotten older is that sometimes you just can’t do it all. This book has been one big life lesson in how not to go about things. If I could screw it up, boy have I with this one.

I’m also learning to be kinder to myself. I have impossibly high standards that can be unrealistic from time to time. It’s ok to be human, not superhuman.

The New Timeline

I know many of you are wondering: is the book ever going to be finished? Yes.

The realistic and, most importantly, achievable goal is to have it done as soon as possible after the release of SQL Server 2022. Since there is no set release date, I can’t say when that is but realistically if it hits the shelves later this year, I’m looking at sometime early in 2023. It won’t just be about SQL Server 2022 but ensuring the latest release is covered makes a lot of sense and keeps it relevant for quite some time.

One thing that I am still debating: instead of one big honkin’ volume, release it in smaller, digestible chunks. Is this something you would want or prefer? It may also allow me to get sections out earlier. Let me know below.

That’s it. I have felt awful about not saying anything or answering questions but understand it was not because I didn’t want to; the shame prevented me from doing so. This blog is me finally putting the shame behind me to move forward and remove the weight from my shoulders.

May 26, 2022

Contained Availability Groups in SQL Server 2022

Microsoft announced the public preview of SQL Server 2022 the other day. Along with it comes SSMS 19 Preview 2 which you will need as well. There is no set release date for SQL Server 2022 but with the public preview, we are definitely closer.

One of the features in that announcement is why I am writing this blog post: contained availability groups, yet another AG variant.

For those among you who will call them CAGs, it is not the official abbreviation – it is contained AG. I get it is easier to say or type CAG. Don’t do it unless you want the Allan stink eye (I can already see the memes …).

A Bit of History

The idea of containment is not new to SQL Server. Anyone remember contained databases (or as I like to call them, partially contained databases)? These have been around since SQL Server 2012 but very few have implemented them in the wild. It was a nice concept that basically died on the vine.

Just under two years ago, I got an e-mail from Kevin Farlee (Twitter) asking if I wanted to participate in a private preview for contained AGs. I could not reply fast enough. I played with it and provided some feedback. As is the way these things go, things were silent for a long time. Fast forward to a few months ago when I learned that the contained AG feature was alive and well. It has been very hard for me to not say anything. Kudos to Kevin for pushing this forward and finally getting it into the product.

It shouldn’t have to be said, but don’t bug Kevin. I am fortunate enough to have this type of NDA relationship as well as the trust of both the SQL Server and Windows dev teams. I do what I can behind the scenes to hopefully assist Microsoft to develop better products and experiences for those of you reading this before you even see these features. Some of the features I evaluate or provide feedback on never see the light of day and I will take them to the grave.

Why Do We Need a Contained Availability Group?

All SQL Server availability features except Always On Failover Cluster Instances (FCIs) have a “problem”: when a secondary replica/warm standby/mirror (the term is different for each feature …) takes over as the new boss, some items such as SQL Server Agent jobs, instance level logins, etc., are not there. Going back to the early days of SQL Server when log shipping was not even in the product, this was always a manual process. There are multiple ways to approach this challenge and I am not going to detail them. This “problem” is a longstanding pain point with those who are responsible for managing SQL Server.

Contained AGs solve this issue by having their own master and msdb databases synchronized as part of the AG mechanism.

A Look at Contained Availability Groups

DISCLAIMER All screen shots here are using SSMS 19 Preview 2. Things may change so if what you are seeing in the future is different, that is to be expected.

The contained AG feature is not Enterprise Edition only; it will work in Standard Edition (née Basic AGs) as well.

Right now you can create a contained AG with SSMS or Transact-SQL. In the Availability Group Wizard, Figure 1 shows you two new options.

Figure 1. Contained database options in SSMS

For a new AG, you would only have to select Contained. More on the second option in #6 of the gotchas section.
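
For those who prefer Transact-SQL over the wizard, the rough shape of the statement is sketched below. Treat it as an illustration under assumptions, not the script behind these screen shots: the AG name ExampleContainedAG, the replica names NODE1 and NODE2, and the endpoint URLs are all hypothetical, and the syntax may shift between preview builds. The CONTAINED option in the WITH clause is the piece that corresponds to the Contained check box.

```sql
-- Sketch: run on the intended primary replica.
-- ExampleContainedAG, NODE1, NODE2, and the endpoint URLs are hypothetical placeholders.
CREATE AVAILABILITY GROUP [ExampleContainedAG]
   WITH (CONTAINED)
   FOR DATABASE [ContainedAGDB1]
   REPLICA ON
      N'NODE1' WITH (
         ENDPOINT_URL = N'TCP://node1.example.com:5022',
         AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
         FAILOVER_MODE = AUTOMATIC,
         SEEDING_MODE = AUTOMATIC
      ),
      N'NODE2' WITH (
         ENDPOINT_URL = N'TCP://node2.example.com:5022',
         AVAILABILITY_MODE = SYNCHRONOUS_COMMIT,
         FAILOVER_MODE = AUTOMATIC,
         SEEDING_MODE = AUTOMATIC
      );
GO

-- Then, on each secondary replica (NODE2 here), join the AG and allow
-- automatic seeding to create the databases:
-- ALTER AVAILABILITY GROUP [ExampleContainedAG] JOIN;
-- ALTER AVAILABILITY GROUP [ExampleContainedAG] GRANT CREATE ANY DATABASE;
```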

After the contained AG is created, you get two additional databases per AG:

AGNAME_master
AGNAME_msdb

Figure 2 shows you what the contained databases look like in SSMS.

Figure 2. Contained databases

Figure 3 shows you what a contained AG looks like in the AG properties dialog. Note that Contained is greyed out and not selectable. See #1 of the gotchas section for more details.

Figure 3. Contained AG properties

You can also see that the AG is contained using the catalog view sys.availability_groups as shown in Figure 4 with the new column is_contained.

Figure 4. New is_contained column
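
A query along these lines returns that column. It is a minimal sketch rather than the exact statement used for Figure 4; the column list is trimmed to the essentials.

```sql
-- Sketch: list each AG on the instance and whether it is contained.
-- is_contained is new in SQL Server 2022; 1 means the AG is contained.
SELECT
   name,
   is_contained
FROM sys.availability_groups
ORDER BY name;
```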

Create Contained Objects

The contained system databases start out empty except for master which has the administration accounts. That means if you have a lot of logins, jobs, linked servers, etc., they do not automatically move into the contained system databases. You need to create them. How?

Connect to the contained AG as an administrator.
Switch to the contained master or msdb.
Create the object.

The example in Figure 5 creates a login in the contained master database for the MRROBOTO AG and then its corresponding user in the database ContainedAGDB1.

Figure 5. Creating a contained login and the corresponding user in the database
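
In Transact-SQL, the steps above look roughly like the following. This is a sketch of the pattern rather than a copy of Figure 5: the password is a placeholder, and it assumes the session is already connected to the contained AG as an administrator so that the login lands in the contained master rather than the instance's master.

```sql
-- Sketch: run while connected to the contained AG as an administrator.
-- In that context the login is created in the AG's contained master.
CREATE LOGIN JChance
   WITH PASSWORD = N'<ReplaceWithAStrongPassword>';   -- placeholder, not a real password
GO

-- Map the login to a user in one of the AG's databases.
USE ContainedAGDB1;
GO

CREATE USER JChance FOR LOGIN JChance;
GO
```

If the same statements are run while connected directly to the instance, the login is created at the instance level instead and does not travel with the AG.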

Connect to a Contained Availability Group

Using a contained AG is all about context. If you connect to the instance directly, you only see and can use what is in the instance’s master and msdb databases. Heed gotcha #7 in the next section. Figure 6 shows a connection to a default instance named DENNIS using sqlcmd.

Figure 6. Logins in the instance

Figure 7 shows a connection, as an administrator, to the contained AG through a listener named DOMOARIGATO. In that context, you see a very different set of logins from what was shown in Figure 6. Note that JChance, which was created earlier, is listed.

Figure 7. Contained logins
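
One way to make this comparison yourself is to run the same query twice: once connected to the instance (DENNIS) and once connected through the listener (DOMOARIGATO). This is a sketch and an assumption about how the figures were produced; the source does not show the exact query.

```sql
-- Sketch: list SQL logins, Windows logins, and Windows groups visible in the current context.
-- Connected to the instance, this reads the instance's master.
-- Connected to the contained AG, it should reflect the contained master instead.
SELECT
   name,
   type_desc,
   create_date
FROM sys.server_principals
WHERE type IN ('S', 'U', 'G')   -- S = SQL login, U = Windows login, G = Windows group
ORDER BY name;
```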

From an application perspective, you can log into the contained AG via the listener with a contained login as shown in Figure 8. JChance gets right in. The syntax used for sqlcmd is:

sqlcmd -Slistenername -Uuserincontaineddatabase -Ppassword

Figure 8. Connect to Listener with Contained User

You can also connect using a combination of a contained user, one of the databases in the contained AG, and the instance (not listener) name. An example is shown in Figure 9 using sqlcmd. The syntax is:

sqlcmd -Sinstancename -Uuserincontaineddatabase -Ppassword -ddatabaseincontainedag

Figure 9. Connecting to a contained database properly

Currently Known Contained Availability Group Gotchas

1. An AG can be configured as either contained or not contained (i.e. “regular”). You cannot change an AG’s configuration from not contained to contained or contained to not contained after it is created. To do so, destroy the AG and recreate it the other way. I do not see this ever being changed.
2. Contained AGs are not currently supported with distributed AGs. I can think of a few reasons why it would break contained AGs. That means if you rely on distributed AGs today, contained AGs may not be in your future if you want to continue that strategy. My hope is that both contained AGs and distributed AGs work together. If I had to guess, I think it would happen after SQL Server 2022 is released and possibly another major version. I plan on doing some testing to see what does/does not work and if there are workarounds.
3. Contained AGs also do not currently work with replication. If you need replication, you may not be able to use a contained AG.
4. Contained AGs assume automatic seeding for the AG’s databases including the contained system ones. If you want to create databases via manual seeding (i.e. backup/copy/restore WITH NORECOVERY), use Transact-SQL. The creation process for a contained AG with manual seeding is not documented yet.
5. This feature is SQL Server 2022 only. I do not see this being backported to SQL Server 2019. See #1 for implications in an upgrade.
6. If you delete the contained AG, it does not delete the contained system databases. If you want to create another AG with the same name, you can choose to reuse those contained system databases already created. That is the second check mark in Figure 1. Otherwise, you will need to manually delete them from all replicas.
7. A contained AG is not a security boundary. You can still potentially see everything else in the instance (including the real master and msdb) if you have the access.
8. If you do not have a baseline knowledge (not expert) to be able to create objects without using SSMS, right now contained AGs would be a challenge for you to implement. I am hoping that changes at some point.
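
For the leftover contained system databases mentioned in the gotchas, manual cleanup after the AG has been deleted could look like the sketch below. ExampleContainedAG is a hypothetical AG name used only to illustrate the AGNAME_master and AGNAME_msdb naming pattern; substitute your own and run it on every replica that hosted the AG.

```sql
-- Sketch: remove leftover contained system databases after the AG itself is gone.
-- ExampleContainedAG is a hypothetical AG name; repeat on each former replica.
-- Only do this if you do NOT plan to recreate the AG and reuse these databases.
IF DB_ID(N'ExampleContainedAG_master') IS NOT NULL
   DROP DATABASE [ExampleContainedAG_master];
GO

IF DB_ID(N'ExampleContainedAG_msdb') IS NOT NULL
   DROP DATABASE [ExampleContainedAG_msdb];
GO
```

If you would rather keep the logins and jobs, the alternative is to recreate the AG under the same name with the reuse option (the second check mark in Figure 1) instead of dropping these databases.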

My Thoughts

I see contained AGs as a compelling reason to consider upgrading to SQL Server 2022 when it is released. In my opinion, the contained AG feature is the most significant improvement to AGs since the introduction of distributed AGs in SQL Server 2016.

Will contained AGs create world peace? No. Does it fix the external objects outside the database problem for features like log shipping? No; that is still a manual process and a pain point. Does this feature solve an issue that should have been solved with the introduction of AGs in SQL Server 2012 especially for scenarios like automatic failover? Absolutely. Can the feature and its tooling use some improvement? Sure.

The limitation around distributed AGs is more of an understandable disappointment than a dealbreaker for me. The value of this feature is too high not to recommend people kicking the tires on it. I also love distributed AGs so it’s a hard call.

What are your thoughts about contained AGs? Do you have questions about this new feature? Let me know below.
