Bug – Combining Failover Clustering & Log Shipping When Programs Installed On Another Drive
A customer of mine contacted me this week about a problem they were having. They have a two-node failover cluster on Windows Server 2008 R2 running an instance of SQL Server 2008 SP1. They installed SQL Server’s program files to another drive, not the main system drive (C). They then configured log shipping. Everything worked fine until they tested failover. When the instance was failed over to the node that was not the first one installed, log shipping stopped working. When they failed it back, everything worked. After a WebEx session, it looked like Setup didn’t put everything in the right place on the other node. Before I rushed to any conclusions, I needed to reproduce the problem to see whether something had simply gone wrong in their setup. Here’s what I did.
1. I created a two-node W2K8 R2 cluster with a slipstreamed SQL Server 2008 SP1 instance.
Here is where I installed the programs to:
and on the next dialog
So far, so good.
2. I configured log shipping to an instance on another server while the instance was on the original (first) node it was installed on. Everything worked great.
3. I failed the instance over to the other node, and log shipping failed.
4. I failed the instance back to the original node. Lo and behold, log shipping worked again.
So what happened?
If you look at the job step for the transaction log backup job, here’s what it is calling:
"Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe" -Backup 6B81BF42-4AA8-4DE3-8349-5E54EE0C52ED -server KILROY
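Since the job step hard-codes a full path to the executable, a quick way to see whether a given node can run it is to pull that path out of the command and test whether the file exists. Here is a minimal sketch in Python; the function names are my own for illustration, and it assumes the command starts with the executable path (quoted or unquoted), as the one above does. It would need to run on the node that currently owns the instance.

```python
import os
import re


def executable_from_job_step(command: str) -> str:
    """Return the executable path at the start of a job step command line."""
    command = command.strip()
    if not command:
        raise ValueError("empty job step command")
    # Quoted path: take everything between the first pair of double quotes.
    quoted = re.match(r'"([^"]+)"', command)
    if quoted:
        return quoted.group(1)
    # Unquoted path: take the first whitespace-delimited token.
    return command.split(None, 1)[0]


def executable_is_present(command: str) -> bool:
    """True if the executable named by the job step exists on this machine."""
    return os.path.isfile(executable_from_job_step(command))


if __name__ == "__main__":
    # The backup job step from this post (backslashes assumed in the path).
    step = (
        r'"Z:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe"'
        r" -Backup 6B81BF42-4AA8-4DE3-8349-5E54EE0C52ED -server KILROY"
    )
    exe = executable_from_job_step(step)
    print(exe, "-> present" if os.path.isfile(exe) else "-> MISSING on this node")
```

Running this on each node in turn should show whether the path in the job step resolves there.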
Here is what the programs look like on the first (original) install node for Drive Z. Note the shared tools directory with SqlLogShip.exe.
Here is what the programs look like on the second node (add node) install:
C drive showing the shared tools directory with SqlLogShip.exe.
Z drive showing no shared tools directory.
So it’s pretty clear to me that the Add Node operation is not putting the files in the same place even though the other node has the same drive structure, thus causing log shipping to stop working in a failover.
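If you want to check your own nodes for the same mismatch, something along these lines can report which drive actually holds the shared tools. This is only a sketch: the helper name is hypothetical, and it assumes the tools sit under the same Program Files layout on whichever drive Setup picked, which may not hold in every install.

```python
import os

# Location of the log shipping tool relative to a drive root, based on the
# job step path in this post. Assumed to be the same layout on every drive.
TOOL_RELATIVE_PATH = os.path.join(
    "Program Files", "Microsoft SQL Server", "100", "Tools", "Binn", "sqllogship.exe"
)


def roots_containing_tool(roots, relative_path=TOOL_RELATIVE_PATH):
    """Return the subset of drive roots under which the tool file exists."""
    return [
        root for root in roots
        if os.path.isfile(os.path.join(root, relative_path))
    ]


if __name__ == "__main__":
    # Candidate drives: the system drive and the drive chosen during Setup.
    candidates = ["C:\\", "Z:\\"]
    found = roots_containing_tool(candidates)
    print("sqllogship.exe found under:", found if found else "none of " + str(candidates))
```

Run it on both nodes. If the first node reports Z and the second reports only C, you are looking at the same behavior described here.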
If you install everything to the system drive (such as C), log shipping keeps working after a failover, so that is a workaround. But I know some of you like to put program files in places other than the system drive.
I have written this up over on the Connect site here, so if you want to try to get this fixed, vote for it!
The funny thing about this is that on the initial install I selected everything to go to the Z drive, but it still installed some files to C. Interesting.