{"id":908,"date":"2022-09-12T19:00:38","date_gmt":"2022-09-13T00:00:38","guid":{"rendered":"https:\/\/brentec.ca\/blog\/?p=908"},"modified":"2022-10-13T21:21:28","modified_gmt":"2022-10-14T02:21:28","slug":"script-showing-successfully-failing-sql-server-agent-jobs","status":"publish","type":"post","link":"https:\/\/brentec.ca\/blog\/2022\/09\/script-showing-successfully-failing-sql-server-agent-jobs\/","title":{"rendered":"Script showing successfully failing SQL Server agent jobs"},"content":{"rendered":"<p style=\"font-size: 13px; letter-spacing: 1px;\"><span style=\"color: #3366ff;\"><strong>Problem: <\/strong><\/span>One of your tasks as a production DBA is to ensure that all SQL agent jobs run successfully 100% of the time. Whether this is for end-user reporting or your backup and restore jobs, it is imperative that jobs succeed without any failed steps within them. When a job succeeds but has failed steps within it, that&#8217;s what I call <em>successfully failing<\/em>.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">Now comes your daily routine. A quick check within SSMS&#8217; Job Activity Monitor shows that all jobs show a last run outcome of <em>Succeeded.<\/em> Everything is good right? However, job outcome is not as expected, or end users complain (I mean politely mention) that they are missing some specified data. Looking deeper into the jobs indicate that steps have failed but the overall outcome is a success? What is going on? This could potentially be disastrous. The kicker here is, and you have to ask yourself, do you know how long this job has been <em>successfully failing<\/em>?<\/p>\n<figure id=\"attachment_920\" aria-describedby=\"caption-attachment-920\" style=\"width: 771px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-SSMS.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-920 size-full\" src=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-SSMS.png\" alt=\"\" width=\"771\" height=\"154\" srcset=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-SSMS.png 771w, https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-SSMS-300x60.png 300w, https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-SSMS-768x153.png 768w\" sizes=\"(max-width: 771px) 100vw, 771px\" \/><\/a><figcaption id=\"caption-attachment-920\" class=\"wp-caption-text\"><span style=\"font-size: 12px; letter-spacing: 1px;\">An innocent SQL Server agent job, reporting that is was successful, but was it?<\/span><\/figcaption><\/figure>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">You cannot always trust the output of SSMS&#8217; Job Activity Monitor as we can see from the snippet above. Looking under the hood shows a different story.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">Now, some will say to check the job history for the job. But let&#8217;s be realistic. You don&#8217;t have just one or two jobs to check daily (or multiple times per day), you can have hundreds, and are you going to check each individual job? No, that is a waste of your time.<\/p>\n<figure id=\"attachment_919\" aria-describedby=\"caption-attachment-919\" style=\"width: 300px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-log-file-viewer2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-919 size-medium\" src=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-log-file-viewer2-300x282.png\" alt=\"\" width=\"300\" height=\"282\" srcset=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-log-file-viewer2-300x282.png 300w, https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-log-file-viewer2.png 687w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><figcaption id=\"caption-attachment-919\" class=\"wp-caption-text\"><span style=\"font-size: 12px; letter-spacing: 1px;\">SQL Server&#8217;s log viewer, finding a failed job step is like looking for a needle in a haystack.<\/span><\/figcaption><\/figure>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">Others will say, review the job history, but check all jobs to show every single history event in the log file viewer. Again, you have hundreds of jobs, why would you scroll through a massive list of job histories to find a single job (or potentially not find anything) that has that yellow warning triangle icon in the left? You wouldn&#8217;t. You could filter the list, but you are going to spend a lot of time attempting to ascertain the correct filter, if you can come up with one at all (the filtering in SSMS&#8217; log viewer is not all that intuitive). Again, that is a big waste of your valuable DBA time.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\"><span style=\"color: #3366ff;\"><strong>Solution: <\/strong><\/span>Below is my solution to this problem hopefully saving you some time in your daily routine.<\/p>\n<pre class=\"brush: sql; title: ; notranslate\" title=\"\">\r\n\/* \r\nAuthor: Brent Sodtke\r\nCreated: 2022-09\r\nDescription: Shows all SQL Agent jobs that are successful but contain failed job steps.\r\n----------\r\nLicense: CC BY-SA (Creative Commons Share-alike)\r\nYou are free to:\r\nShare - copy and redistribute the material in any medium or format.\r\nAdapt - remix, transform, and build upon the material for any purpose, even commercially.\r\n\r\nUnder the following terms:\r\nAttribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.\r\nShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.\r\n\r\nPROVIDED &quot;AS IS&quot;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED\r\n*\/\r\nuse master\r\ngo\r\n\r\nset nocount on\r\n\r\ndeclare @history int = 14 -- number in days from script run to look for entries in sysjobhistory table.\r\n\r\n\/*\r\nattempt # 1 : get all successful job outcomes from sysjobhistory, then gather failed steps matching successful job.\r\nattempt # 2 : get all failed job steps from sysjobhistory, then gather successful job outcomes matching the step(s).\r\n*\/\r\n\r\nif OBJECT_ID(N'tempdb.dbo.#failedjobsteps') is not null\r\ndrop table tempdb.dbo.#failedjobsteps\r\ncreate table #failedjobsteps\r\n(\r\ninstance_id int null\r\n, job_id uniqueidentifier null\r\n, job_name sysname null\r\n, rundatetime datetime null\r\n)\r\ninsert into #failedjobsteps\r\nselect\r\n   sjh.instance_id\r\n   , sjh.job_id\r\n   , sj.name\r\n   , msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) as rundatetime\r\nfrom msdb.dbo.sysjobhistory sjh\r\njoin msdb.dbo.sysjobs sj\r\non sjh.job_id = sj.job_id\r\nwhere\r\n   sjh.step_id &amp;amp;amp;amp;lt;&amp;amp;amp;amp;gt; 0\r\n   and sjh.run_status = 0\r\n   and msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) &amp;amp;amp;amp;gt;= getdate()-@history -- putting in a timeframe here since this can take a long time with large job history\r\norder by sjh.instance_id asc\r\n\r\n-- ##################################################\r\n\r\nif OBJECT_ID(N'tempdb.dbo.#successfuljobrunswithstepfailures') is not null\r\n   drop table tempdbo.dbo.#successfuljobrunswithstepfailures\r\ncreate table #successfuljobrunswithstepfailures\r\n(\r\n   job_name sysname null\r\n   , job_id uniqueidentifier null\r\n   , rundatetime datetime null\r\n)\r\nwhile (select count(1) from #failedjobsteps) &amp;amp;amp;amp;gt; 0\r\nbegin\r\n   insert into #successfuljobrunswithstepfailures\r\n   select distinct(sj.name), sjh.job_id , msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time)\r\n   from msdb.dbo.sysjobhistory sjh\r\n   join msdb.dbo.sysjobs sj\r\n   on sjh.job_id = sj.job_id\r\n   -- we know that of all the rows from within the sysjobhistory table, job runs are grouped by\r\n   -- (1) the job_id matching\r\n   -- (2) the instance_id is less than or equal to the instance_id of the row containing the job outcome\r\n   -- (3) the run_date &amp;amp;amp;amp;amp; run_time are greater than or equal to the run_date &amp;amp;amp;amp;amp; run_time of the row containing the job outcome\r\n   where sjh.job_id = (select top 1 job_id from #failedjobsteps order by instance_id asc)\r\n   and sjh.instance_id &amp;amp;amp;amp;gt;= (select top 1 instance_id from #failedjobsteps order by instance_id asc)\r\n   and msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time) &amp;amp;amp;amp;lt;= (select top 1 rundatetime from #failedjobsteps order by instance_id asc)\r\n   and sjh.run_status = 1 -- only concerned with successes\r\n   and sjh.step_id = 0 -- only concerned with job outcome steps\r\n   group by sj.name, sjh.job_id , msdb.dbo.agent_datetime(sjh.run_date, sjh.run_time)\r\n\r\n   delete from #failedjobsteps where instance_id = (select top 1 instance_id from #failedjobsteps order by instance_id asc)\r\nend\r\n\r\n-- select * from #successfuljobrunswithstepfailures;\r\n\r\n-- (1) show unique job runs that have at least one failed step. remember, a job can have more than one failed step, be successful and thus\r\n-- have more than one entry within #successfuljobrunswithstepfailures.\r\n-- this will show a list of successful jobs in sysjobhistory having at least one failed job step.\r\n\r\nif OBJECT_ID(N'tempdb.dbo.#unique') is not null\r\n   drop table tempdbo.dbo.#unique\r\ncreate table #unique\r\n(\r\n   job_name sysname null\r\n   , job_id uniqueidentifier null\r\n   , rundatetime datetime null\r\n)\r\ninsert into #unique\r\nselect\r\n   distinct(job_name)\r\n   , job_id\r\n   , rundatetime\r\nfrom #successfuljobrunswithstepfailures\r\ngroup by job_name, job_id, rundatetime\r\norder by rundatetime asc\r\n\r\nselect * from #unique order by rundatetime desc -- return data back to client for later use; let client determine the count.\r\n-- select count(1) numberofuniquejobruns from #unique\r\n\r\n-- (2) show the list of successful outcome jobs having at least one failed step and their most recent execution.\r\n-- this list will be different than (1) since no job_id should occur more than once.\r\n-- this will also show the user a more immediate list (most recent) of jobs that should require attention.\r\n\r\nselect job_name, job_id, rundatetime -- return data back to client for later use; let client determine the count.\r\nfrom\r\n(\r\n   select u.*\r\n   , row_number() over (partition by job_name order by rundatetime desc) as rownum\r\n   from #unique u\r\n) u2\r\nwhere u2.rownum = 1\r\norder by u2.rundatetime desc\r\n\r\n<\/pre>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">The output of the above script will produce something similar to the following:<\/p>\n<p><a href=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-script-output2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-927\" src=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-script-output2.png\" alt=\"\" width=\"567\" height=\"256\" srcset=\"https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-script-output2.png 567w, https:\/\/brentec.ca\/blog\/wp-content\/uploads\/2022\/09\/successful-job-with-failed-step-script-output2-300x135.png 300w\" sizes=\"(max-width: 567px) 100vw, 567px\" \/><\/a><\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">As the script notes, there are going to be 2 output windows within SSMS for this.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">The first, showing all occurrences of jobs with failed steps based on the @history variable, i.e., the number of days back that you want to check for. In this example, all job runs in the past 14 days that have at least one failed job step, but &#8220;successful&#8221; job outcome are shown. This can be useful in determining if a particular job has been <em>successfully failing<\/em> for a period of time. Also note that this first output block will contain the same job multiple times should it have executed and <em>successfully failed<\/em> in that time period.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">The second, showing a single row per &#8220;successful&#8221; job with at least one failed step for its most recent job execution. This is useful to get a quick list of actionable jobs to look into and resolve.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">In a future script we&#8217;ll explore expanding this concept to include the specific job step IDs and job step names that failed to further pinpoint trouble areas.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">And there you have it, a means for checking your entire SQL agent job history for <em>successfully failing<\/em> jobs.<\/p>\n<p style=\"font-size: 13px; letter-spacing: 1px;\"><span style=\"color: #3366ff;\"><strong>External Links<\/strong><\/span>: Here are some external links to system tables I reference within the script above:<br \/>\n<a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/system-tables\/dbo-sysjobhistory-transact-sql?view=sql-server-ver16\" target=\"_blank\" rel=\"noopener\">msdb.dbo.sysjobhistory<\/a>; <a href=\"https:\/\/docs.microsoft.com\/en-us\/sql\/relational-databases\/system-tables\/dbo-sysjobs-transact-sql?view=sql-server-ver16\" target=\"_blank\" rel=\"noopener\">msdb.dbo.sysjobs<\/a>.<br \/>\nUnfortunately I could not find any official documentation for scalar function <em>msdb.dbo.agent_datetime<\/em> within the Microsoft Docs website, however, I will reference this one by <a href=\"https:\/\/blog.sqlauthority.com\/2015\/03\/13\/sql-server-interesting-function-agent_datetime\/\" target=\"_blank\" rel=\"noopener\">Pinal Dave on SQL Authority<\/a> for your reading pleasure.<\/p>\n<hr \/>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">Did you find this useful? Did it help you solve an immediate need within your SQL Server agent estate? If so, you might want to check out <a href=\"https:\/\/brentec.ca\/sql-agent-insight\/downloads\/\" target=\"_blank\" rel=\"noopener\">SQL Agent Insight<\/a>, a product I&#8217;ve been working on during spare time to assist in determining problem areas of SQL Server agent instances.<\/p>\n<hr \/>\n<p style=\"font-size: 13px; letter-spacing: 1px;\">Did you find this useful? Do you have a better way to achieve the desired result using T-SQL? Let me know in the comments below. Cheers.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Problem: One of your tasks as a production DBA is to ensure that all SQL agent jobs run successfully 100% of the time. Whether this is for end-user reporting or your backup and restore jobs, it is imperative that jobs succeed without any failed steps within them. When a job succeeds but has failed steps &hellip;<\/p>\n<p class=\"read-more\"> <a class=\"\" href=\"https:\/\/brentec.ca\/blog\/2022\/09\/script-showing-successfully-failing-sql-server-agent-jobs\/\"> <span class=\"screen-reader-text\">Script showing successfully failing SQL Server agent jobs<\/span> Read More &raquo;<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"site-sidebar-layout":"default","site-content-layout":"default","ast-global-header-display":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","footnotes":""},"categories":[3,56,45],"tags":[],"_links":{"self":[{"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/posts\/908"}],"collection":[{"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/comments?post=908"}],"version-history":[{"count":31,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/posts\/908\/revisions"}],"predecessor-version":[{"id":1010,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/posts\/908\/revisions\/1010"}],"wp:attachment":[{"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/media?parent=908"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/categories?post=908"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/brentec.ca\/blog\/wp-json\/wp\/v2\/tags?post=908"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}