— by Lori Brown @SQLSupahStah
Since I manage several SQL clusters, I came up with a way to get a notification that the cluster may have failed over. This failover notification was created in self-defense because often times no one from the system administrators team will tell the DBA when work is being done or if there are issues. Many times I have logged onto a SQL cluster only to find that it failed over and have no explanation as to why. It is okay for clusters to failover, after all that is what we build them for, but it would be good to know why in case there is some problem that should be fixed. The failover notification is done by having the SQL Agent run a job that queries the current computer name and compares that value to the last known computer name. You will need an administrative database for the tables.
— Create tables
USE [ADMIN]
GO
CREATE TABLE [dbo].[CurrentNode](
[cnode] [varchar](100) NULL) ON [PRIMARY]
GO
CREATE TABLE [dbo].[PreviousNode](
[pnode] [varchar](100) NULL) ON [PRIMARY]
GO
Once the tables are created they need to be populated with the current and previous node table values. These will be the same at first so that if the CurrentNode table is updated after a failover then the comparison query in the job will know it has moved to a different node.
— Prep – Put the same value into both tables so that if the current changes on failover a notice is sent
INSERT INTO CurrentNode (cnode) VALUES (CONVERT(VARCHAR(100),SERVERPROPERTY(‘ComputerNamePhysicalNetBIOS’)))
INSERT INTO PreviousNode (pnode) VALUES (CONVERT(VARCHAR(100),SERVERPROPERTY(‘ComputerNamePhysicalNetBIOS’)))
Here is the code that is used in the job. The logic compares the values in the current node query to the pervious node table and will notify of either a failover or restart of the instance. An e-mail will be sent to notify anyone who needs it of the event.
/******************************************************/
— create job that only runs when the SQL Agent starts that uses the code below.
— replace all <<instancename>> & <<profile>> placeholders with correct values
SET NOCOUNT ON
DECLARE @curnodename VARCHAR(100)
DECLARE @prevnodename VARCHAR(100)
DECLARE @body1 NVARCHAR(MAX)
SET @curnodename = CONVERT(VARCHAR(100),SERVERPROPERTY(‘ComputerNamePhysicalNetBIOS’))
SET @prevnodename = (SELECT pnode FROM PreviousNode)
IF @curnodename <> @prevnodename — Failover
BEGIN
UPDATE CurrentNode SET cnode = @curnodename
— instructions
SET @body1=‘<p> A failover has occurred for the <<InstanceName>> cluster.</p>’
— table attributes
SET @body1=@body1+‘<table border=”2″ cellspacing=”2″ cellpadding=”2″>’
— column headers
SET @body1=@body1+ ‘<tbody align=”left” style=”font-family:Arial; font-size: 11;” <TR><TH>Current Node</TH><TH>Previous Node</TH></TR></tbody>’
— data
SET @body1=@body1 +‘<tbody align=”left” style=”font-family:Arial; font-size: 11;” <TR><TD>’+@curnodename+'</TD><TD>’+@prevnodename+'</TD></TR></tbody>’
— Send an html formatted e-mail to notify of failover
EXEC msdb.dbo.sp_send_dbmail
@profile_name = ‘<<profile>>’,
@recipients = ‘DBA@company.com’,
@subject = ‘Failover on <<instancename>> cluster’,
@body = @body1,
@body_format = ‘HTML’;
UPDATE PreviousNode SET pnode = @curnodename
END
IF @curnodename = @prevnodename — Restart
BEGIN
— Send an html formatted e-mail to notify of restart
EXEC msdb.dbo.sp_send_dbmail
@profile_name = ‘<<profile>>’,
@recipients = ‘DBA@company.com’,
@subject = ‘Restart of <<InstanceName>> SQL Services’,
@body = ‘The <<InstanceName>> cluster services may have been restarted but not failed over.’;
END
SET NOCOUNT OFF
/******************************************************/
And finally here is the actual job that I created. It will only run if the SQLAgent is started.
— Create job
USE [msdb]
GO
/****** Object: Job [Failover Notification] ******/
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/****** Object: JobCategory [[Uncategorized (Local)]]] ******/
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]’ AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N’JOB’, @type=N’LOCAL’, @name=N'[Uncategorized (Local)]’
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
END
DECLARE @jobId BINARY(16)
EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N’Failover Notification’,
@enabled=1,
@notify_level_eventlog=0,
@notify_level_email=2,
@notify_level_netsend=0,
@notify_level_page=0,
@delete_level=0,
@description=N’On restart of SQL Agent this job will run and notify if the instance has failed over.’,
@category_name=N'[Uncategorized (Local)]’,
@owner_login_name=N’sa’,
@notify_email_operator_name=N’DBA’, @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object: Step [FailoverCheck] ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N’FailoverCheck’,
@step_id=1,
@cmdexec_success_code=0,
@on_success_action=1,
@on_success_step_id=0,
@on_fail_action=2,
@on_fail_step_id=0,
@retry_attempts=0,
@retry_interval=0,
@os_run_priority=0, @subsystem=N’TSQL’,
@command=N’SET NOCOUNT ON
DECLARE @curnodename VARCHAR(100)
DECLARE @prevnodename VARCHAR(100)
DECLARE @body1 NVARCHAR(MAX)
SET @curnodename = CONVERT(VARCHAR(100),SERVERPROPERTY(”ComputerNamePhysicalNetBIOS”))
SET @prevnodename = (SELECT pnode FROM PreviousNode)
IF @curnodename <> @prevnodename — Failover
BEGIN
UPDATE CurrentNode SET cnode = @curnodename
— instructions
SET @body1=”<p> A failover has occurred for the <<InstanceName>> cluster.</p>”
— table attributes
SET @body1=@body1+”<table border=”2″ cellspacing=”2″ cellpadding=”2″>”
— column headers
SET @body1=@body1+ ”<tbody align=”left” style=”font-family:Arial; font-size: 11;” <TR><TH>Current Node</TH><TH>Previous Node</TH></TR></tbody>”
— data
SET @body1=@body1 +”<tbody align=”left” style=”font-family:Arial; font-size: 11;” <TR><TD>”+@curnodename+”</TD><TD>”+@prevnodename+”</TD></TR></tbody>”
— Send an html formatted e-mail to notify of failover
EXEC msdb.dbo.sp_send_dbmail
@profile_name = ”<<profile>>”,
@recipients = ”DBA@company.com”,
@subject = ”Failover on <<InstanceName>> cluster”,
@body = @body1,
@body_format = ”HTML”;
UPDATE PreviousNode SET pnode = @curnodename
END
IF @curnodename = @prevnodename — Restart
BEGIN
— Send an html formatted e-mail to notify of restart
EXEC msdb.dbo.sp_send_dbmail
@profile_name = ”<<profile>>”,
@recipients = ”DBA@company.com”,
@subject = ”Restart of <<InstanceName>> SQL Services”,
@body = ”The <<InstanceName>> cluster services may have been restarted but not failed over”;
END
SET NOCOUNT OFF’,
@database_name=N’ADMIN’,
@flags=0
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N’failover check sched’,
@enabled=1,
@freq_type=64,
@freq_interval=0,
@freq_subday_type=0,
@freq_subday_interval=0,
@freq_relative_interval=0,
@freq_recurrence_factor=0,
@active_start_date=20150101,
@active_end_date=99991231,
@active_start_time=0,
@active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)’
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:
GO
The actual notification looks like this:
Now I know that when I receive the notification that I need to research why this happens and troubleshoot if necessary. Enjoy!
For more information about blog posts, concepts and definitions, further explanations, or questions you may have…please contact us at SQLRX@sqlrx.com. We will be happy to help! Leave a comment and feel free to track back to us. We love to talk tech with anyone in our SQL family!