Friday, December 28, 2012

sp_server_diag_event_parser - Make SP_Server_Diagnostics EVENTS human-readable!

(update 2013/01/08 - just made another speed fix - now runs in under 20 seconds!)

This is my third post about sp_server_diagnostics, in which I've written two parsers to look at the various data returned from it.  This post is about the events component_type. Next post: query_processing.

50,000-foot view: the code in this post takes the data from the "events" component type and returns each event type as its own human-readable result set.

For more info on sp_server_diagnostics in general, since it's going to be the replacement for the system_health session (SQL 2008) and the black box trace (SQL 2005), I'd recommend reading:


My first post on sp_server_diagnostics covered two of the rows and was pretty simple - it just returned the memory info from "resource" and the waits from "query_processing" - which showed sp_server_diagnostics' power.

However, while 3-4 of the rows are fairly readable, even in XML, the component_name "events" is a bit trickier - there are multiple types of event, each one with its own different elements:

  • connectivity_ring_buffer_recorded
  • error_reported
  • resource_monitor_ring_buffer_recorded
  • scheduler_monitor_system_health_ring_buffer_recorded
  • security_error_ring_buffer_recorded
  • scheduler_monitor_non_yielding_ring_buffer_recorded
  • ...and even more

My thought was, rather than hard-code it (which could mean I miss something), I would parse the XML dynamically, in two passes.  The first pass gets the various elements for each event type; the second uses that info to write another query that pulls the info from each event type.  As a side effect, it also lets me format it better.  This is a modified version of the code from my original post, which had some flaws.

What flaws?  Mostly that it ran really slowly on servers with a lot of info, and it didn't return the full error message in some cases.  Fortunately, I was able to speed it up substantially through a trivial change that took me the better part of an afternoon to find - I went from pulling the XML in derived sub-queries to pulling it into a variable.  From that one change, it now runs in between 6 seconds and 2 minutes.  I also tweaked it to return more data in long error strings.

You will see cryptic fields like "callstack : : " in the results.
It comes from XML like this (sample formatted strangely so that it shows up here):

<data name="call_stack">
  <type name="callstack" package="package0" />
  <value />
</data>

As you can see, there's not really anything to report.  At the same time, I'm not sure why it's there, and I'd rather leave it in case things change in the future.  The nice side-effect is that it converts from:
<data name="tds_flags">
  <type name="connectivity_record_tds_flag" package="sqlserver" />
  <text>DisconnectDueToReadError, NetworkErrorFoundInInputStream, ErrorFoundBeforeLogin, SessionIsKilled, NormalDisconnect</text>
</data>

into:

connectivity_record_tds_flag : DisconnectDueToReadError, NetworkErrorFoundInInputStream, ErrorFoundBeforeLogin, SessionIsKilled, No : 0x0000003e

If you do see a field with a value like "int8 : : 15" (where there are two colons), please let me know; while I've trapped all the various event datatypes I could find, I literally found a new one (int16) while writing this post.

To use: copy/paste the entire code below, create the SP, then run it (or just run all the code except the CREATE).  It can return several sets of data.  You'll also see where they reference each other: a row in connectivity_ring_buffer_recorded can reference a TDS error, which can then show up in error_reported as a network error message.

I hope this helps you troubleshoot SQL Server 2012.

Feel free to share, feel free to send me changes and enhancements.  Tell me what you like and dislike!

(Note that when copy/pasting, you may see "WHILE @min <= @max;" in the code, which SSMS can't use, and the proc won't compile. That should be a LESS-THAN symbol, followed by an EQUALS sign.  This should be fixed now - converted it to a GIST on 2014/02/24)

Thursday, December 13, 2012

Powershell - run a query against multiple servers in parallel, saving the data to a central table

(update 2013/06/13: I've superseded this with a new framework that will run almost any piece of code, powershell or sql, saving the results to a table, and running with multiple threads. Please give it a look! )

Update at 8:30pm: tweaked to add a throttled scheduler!  Runs 8 simultaneous threads to prevent too much workload on the box running the PS script.

I have a repository server.
Said repository server has a list of servers I monitor.
I want to be able to run a piece of code against all of them, AT THE SAME TIME, and save those results to a table on my repository.  The code currently takes an hour to run; I'd like to shrink that.  Looking online, this looks like fairly basic code, but I hadn't seen it put together like this.

First, we need a table to save the results to.  Simple results, simple table.  Proof of concept, here, so only two fields.

CREATE TABLE myserverinfo (servername sysname, thedate datetime)

The next part needs two existing Powershell scripts written by "Hey Scripting Guy", invoke-sqlcmd2 and write-datatable.  You will need to get those.  I save (and reference) them in c:\sql_tools.

The powershell script.  Feel free to drag, copy, cut, paste and save as a .ps1 file:

#Run SQL query against multiple servers in parallel, saving results to a central table
# 1.00 2012/12/13 mdb / TBD
# 1.10 2012/12/13 mdb / TBD.  Adding throttling scheduler
# 1.11 2012/12/13 mdb adding comment on WHY you want the wait-job
# Code cribbed and lessons learned courtesy of: Hey Scripting Guy, 
# Aaron Bertrand (the bit with $args[0], his "sad panda face" post)
# Kendra Little.  Throttling scheduler is from "start-automating" on 
# Stackoverflow.  Errors are mine, not theirs.  
# Please keep this header and let me know how it works.
. C:\sql_tools\invoke-sqlcmd2.ps1;
$serverlist = invoke-sqlcmd2 -serverinstance "myrepositoryserver" -query "SELECT server FROM dba.dbo.myserverlist WHERE active = 1" #get the list of your servers to monitor
foreach ($server in $serverlist)
{
    $jobsrunning = @(Get-Job | Where-Object { $_.JobStateInfo.State -eq 'Running' })
    if ($jobsrunning.Count -ge 8) #keeps 8 jobs going simultaneously
    {
        $jobsrunning | Wait-Job -Any | Out-Null  #as soon as any job finishes, do the next
    }
    start-job -argumentlist $server.server -scriptblock `
    {
        #have to reimport functions here due to the scriptblock
        . C:\sql_tools\invoke-sqlcmd2.ps1;
        . C:\sql_tools\write-datatable.ps1;
        $quer = invoke-sqlcmd2 -serverinstance $args[0] -database "master" `
            -query "waitfor delay '00:00:10'; select @@servername as servername, getdate() as thedate" -As 'DataTable'
        Write-DataTable -ServerInstance "myrepositoryserver" -Database "dba" -TableName "myserverinfo" -Data $quer
    }
}

get-job | wait-job -timeout 10         #wait 10 seconds or until all jobs are 
# done, whichever comes first.  Gives us a timeout before killing the slowpokes.
#additionally, if you schedule a job to run this, not setting a -timeout allows 
#it to wait for all the jobs to finish before quitting.  They wont run in the 
#background like you think they should.  

get-job|Remove-Job -force #cleanup and remove all jobs.  Kills stragglers.

SQLSentry - new feature in 7 (7.1?) that old customers need to know about

So, we lost our SQL Sentry database server last week - thing took a dive.  And, oddly enough, we didn't get a notification from SQL Sentry, even though the database server was being monitored.


However, from talking with their tech support (who, honestly, are pretty awesome), there's a new notification that can be enabled, that fixes that.  It's not on by default (though I told them they need to change that), so you SHOULD enable it.  

In the Navigator Pane, go to Shared Groups (Global)
Open Failsafe Actions (on mine, it's a tab on the right side of the app)
Add "Monitoring Service: SQL Sentry Database Offline", Send Email
Add "Monitoring Service: SQL Sentry Database Online", Send Email

It also seems like they've improved the way the monitoring services work these days - it was scary easy to add a secondary site that uses our main SQL Sentry database, but monitors only servers at that site.  It's also really easy to add a backup/load-balancing monitoring service in general - install the service on another box, point it at your main SQL Sentry database, and away you go.

May my pain be your gain.

Monday, November 26, 2012

[Active Directory] iCloud, Proxy Servers, & locked out account

Came back from vacation and my account kept locking after I logged in.  After a couple hours of digging, I finally gave up, ran Ethereal (packet sniffer) and looked for bad messages.

Lo and behold: iCloud was pummeling the proxy server up to 10 times a second, trying to log in via the proxy. It somehow had cached an old AD password and was sending that to get out, getting rejected, and trying again & again, locking my account.

Friday, November 9, 2012

[SQL Server 2012 SSIS] Getting a useful email on job failure

(updated 2013/02/07 - stupid blogspot ate my HTML tags. Not helpful. fixed.)
(updated 2013/02/14 - making email more bulletproof)
(updated 2013/07/31 - adding info about SSIS "cancelled")
(update 2013/09/05 - adding info about KB 2829948) 

So... SQL Server 2012 has horribly useless emails for SSIS job failures.  If someone is at PASS, please find out whose idea this was, walk up to them, and punch them for me.
If your job fails, here's the error message you get in job history (and therefore to your blackberry or equivalent at 3am):

Executed as user: mydomain\myuser. Microsoft (R) SQL Server Execute Package Utility  Version 11.0.2100.60 for 64-bit  Copyright (C) Microsoft Corporation. All rights reserved.    Started:  5:00:25 PM  Package execution on IS Server failed. Execution ID: 701, Execution Status:4.  To view the details for the execution, right-click on the Integration Services Catalog, and open the [All Executions] report  Started:  5:00:25 PM  Finished: 5:01:43 PM  Elapsed:  77.938 seconds.  The package execution failed.  The step failed.

As my daughter would say (with amazing derision for someone so young):

So my job has failed, and in order to figure out why (and if it's worth getting up for) I have to:
  1. Start up my desktop
  2. VPN to work (since with my luck it's 3am)
  3. RDP into my work machine
  4. Open SSMS 
  5. Connect to that server
  6. Open the SSISDB dashboard report. (oh, and it only holds data for the last 24 hours)
Punching's too good for them.  (And is part of why I disdain SSIS, which seems to actively hate Production DBAs)

Fortunately, the details are stored in a table: [ssisdb].[catalog].[event_messages].

According to SQL Sentry (who we use for a lot of jobs), we could also do this:
I don't use it, because I don't like the changes necessary to put it in place.
The changes appear fairly minor - it changes each job step to invoke SSIS via the DTS Execute Package Utility directly, changes the job to "Log To Table" for the job step history, and other stuff.  However, it also means we can't easily change some of the configurations, among other things.

(However, I did make a version that works with SQL Sentry - instead of adding job steps to SSIS jobs, add a setting to SQL Sentry to run SQL code on failure.)

So as a proof of concept fix that I use in production now, I wrote a stored procedure that uses the event_messages table and Job Tokens to send an email with the failure info.  Sounds imposing, but fairly easy... and should be scriptable (we're also looking at ways to do it without having to add a job step).

At its core, here's the code we're running.  Run this to get the most recent failure info.

SELECT [message_time], [package_name], [message_source_name],
       [subcomponent_name], [package_path], [execution_path], LEFT([message], 400) AS [message]
  FROM ssisdb.[catalog].[event_messages]
 WHERE event_name = 'OnError'
   AND operation_id IN (SELECT MAX(operation_id)
                          FROM ssisdb.[catalog].[event_messages]
                          WHERE event_name = 'OnError')
 ORDER BY message_time ASC

1. Run the below stored procedure in your "DBA tools" database, whatever it's called, remembering to change the email settings.
2. In SSMS 2012, create a job step with the following code, and make both the On Success and On Failure actions be "Quit the job reporting failure", so your other alerting methods will still pick up on it.

EXEC DBA_tools.dbo.ssis_job_failure_info @job_id = $(ESCAPE_SQUOTE(JOBID)), @current_step_id = $(ESCAPE_SQUOTE(STEPID))

While it looks complex, all it's doing is using Token Replacement (which is enabled by default) to send the job_id and the step ID.  The SP then uses that to find the most recently failed job step that ISN'T itself. (though on writing this I realize the job step doesn't actually fail until the code has run - doh!)

3. Change all your job steps' "On Failure" from "Quit the job reporting failure" to "run step X", where X is the job step created in step 2 above.

4. Change the next-to-last step's "On Success" to "Quit the job reporting success", so a successful run skips the email step.

5. Accept the "The On Success action of the last Step will be changed... to Quit With Success" 

6. Test!

One note: if the job fails because of an SSIS timeout, no email is sent.  That's because, while the job failed, technically the package doesn't know it failed - it thinks it was cancelled... or something like that.  Read the Connect item for more details.  It's supposedly fixed in SP1, and I haven't had it happen since, but if it happens to you, there are a handful of fixes available.

  1. Add the 5 indexes as per
  2. Change Retention period as per: 
    1. Don't just set it to 90 if you have a lot of history! You need to iterate through days, running the maint job between each.
  3. Change DB size settings as per same post
  4. Ensure Database is in SIMPLE mode.
  5. (possibly turn on snapshot isolation)
  6. (I also turn on Async Statistics Update) 
  7. CHECK THE EVENT LOGS on the server for details.  It won't be in the SQL Server's ERRORLOG.
  8. There is a fix from MS!  SP1 CU4 has a fix.  It says it just adds indexes; I suspect it changes one of the delete SPs as well, since adding the indexes alone did NOT fix it for me.
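For item 2.1, the day-by-day iteration can be sketched roughly like this. This is a sketch, not the definitive fix - the 90-day target, the WAITFOR delay, and kicking off the maintenance job between steps are assumptions to adapt to your environment:

```sql
-- step the SSISDB retention window down one day at a time,
-- letting the maintenance job clean up between each step
DECLARE @days int = (SELECT CAST(property_value AS int)
                     FROM SSISDB.catalog.catalog_properties
                     WHERE property_name = 'RETENTION_WINDOW');
WHILE @days > 90
BEGIN
    SET @days = @days - 1;
    EXEC SSISDB.catalog.configure_catalog
         @property_name = N'RETENTION_WINDOW', @property_value = @days;
    EXEC msdb.dbo.sp_start_job N'SSIS Server Maintenance Job'; --run the cleanup between steps
    WAITFOR DELAY '00:10:00'; --crude pause so the cleanup can finish; adjust as needed
END
```

Dropping straight from a large value to 90 makes the cleanup delete a huge swath in one transaction, which is exactly what the warning above is about.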

Enjoy - TBD

CREATE PROCEDURE ssis_job_failure_info
    @job_id UNIQUEIDENTIFIER,
    @current_step_id INT
AS
DECLARE @full_ssis_command VARCHAR(4000),
        @job_step_id INT,
        @package_name VARCHAR(4000),
        @tableHTML NVARCHAR(MAX),
        @MailSubject VARCHAR(200),
        @job_name VARCHAR(100)

--token replacement happens in the job.
--select @job_id = $(ESCAPE_SQUOTE(JOBID)), @current_step_id = $(ESCAPE_SQUOTE(STEPID))

SELECT @job_name = name FROM msdb.dbo.sysjobs WHERE job_id = @job_id

--determine which job_step failed.
SELECT TOP 1 @job_step_id = step_id
FROM msdb.dbo.sysjobhistory
WHERE run_status <> 1
  AND step_id > 0
  AND job_id = @job_id
  AND step_id <> @current_step_id
ORDER BY instance_id DESC

--now find the package name
SELECT @full_ssis_command = command
FROM msdb.dbo.sysjobsteps
WHERE job_id = @job_id AND step_id = @job_step_id

IF @full_ssis_command LIKE '%.dtsx%'
BEGIN
    SELECT @package_name = RIGHT(LEFT(@full_ssis_command, CHARINDEX('.dtsx', @full_ssis_command) - 1),
           CHARINDEX('\', REVERSE(LEFT(@full_ssis_command, CHARINDEX('.dtsx', @full_ssis_command) - 1))) - 1) + '.dtsx'
END

--goes in the error log, if you have one
SELECT [message_time], [extended_info_id], [package_name], [message_source_name],
       [subcomponent_name], [package_path], [execution_path], LEFT([message], 400)
FROM ssisdb.[catalog].[event_messages]
WHERE [package_name] = @package_name
  AND event_name = 'OnError'
  AND operation_id IN (SELECT MAX(operation_id)
                       FROM ssisdb.[catalog].[event_messages]
                       WHERE [package_name] = @package_name)
ORDER BY message_time ASC

SELECT @MailSubject = 'Job Failure on ' + @@servername + ': ' + @job_name
FROM msdb.dbo.sysjobs
WHERE job_id = @job_id

SET @tableHTML =
    N'<H3>Error for job ' + @job_name + '</H3>' +
    N'<table border="1">' +
    N'<th>Message_Time</th>' + N'<th>Extended_info_id</th>' + N'<th>Package_Name</th>' +
    N'<th>Message_Source_Name</th>' + N'<th>subcomponent_name</th>' + N'<th>package_path</th>' +
    N'<th>execution_path</th>' + N'<th>message</th>' +
    CAST((SELECT td = CONVERT(VARCHAR(24), message_time, 121), '',
                 td = CONVERT(VARCHAR(10), ISNULL(extended_info_id, '')), '',
                 td = CONVERT(VARCHAR(50), RTRIM(LTRIM(ISNULL(package_name, '')))), '',
                 td = CONVERT(VARCHAR(50), RTRIM(LTRIM(ISNULL([message_source_name], '')))), '',
                 td = CONVERT(VARCHAR(50), RTRIM(LTRIM(ISNULL([subcomponent_name], '')))), '',
                 td = CONVERT(VARCHAR(50), RTRIM(LTRIM(ISNULL([package_path], '')))), '',
                 td = CONVERT(VARCHAR(50), RTRIM(LTRIM(ISNULL([execution_path], '')))), '',
                 td = CONVERT(VARCHAR(400), RTRIM(LTRIM(LEFT(ISNULL([message], ''), 400))))
          FROM ssisdb.[catalog].[event_messages]
          WHERE [package_name] = @package_name
            AND event_name = 'OnError'
            AND operation_id IN (SELECT MAX(operation_id)
                                 FROM ssisdb.[catalog].[event_messages]
                                 WHERE [package_name] = @package_name
                                   AND event_name = 'OnError')
          ORDER BY event_messages.message_time
          FOR XML PATH('tr'), TYPE) AS NVARCHAR(MAX)) +
    N'</table>';

--PRINT @tableHTML
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = 'the_baking_dba',
    @recipients = '',
    @subject = @MailSubject,
    @body = @tableHTML,
    @body_format = 'HTML';
GO

Wednesday, October 17, 2012

Master of all I survey - using Event Notifications to track code changes and more across multiple SQL servers

Please check here if running my code.  I will post updates as I find/need them.  Google docs is current.
Latest version: 11/14/2012.

(last update 2013/07/31 - looks like it's alive and well in 2014!)



I specifically went with a grandiose title to get people interested in this - Event Notifications is a wonderful feature that very few people use (or even know about!), and now that I use it I find it absolutely vital.  I hope I get across how great this feature is, especially since it works from SQL Server 2005 through 2014.

Below is a link to my slide deck, demos, scripts and reports.

The end result is a table that keeps track of code changes across potentially hundreds of servers.  I've got it running on more than a dozen servers, and am slowly adding it to every server in our environment.

One of the takeaways: once you have it running, you can run the following report in SSMS - right-click on an object, choose the RDL above via "Custom Report...", and it'll show details.  Changes, index maint (if a table), etc.  Works on SP/fxns, tables, databases, and servers - click on an object, it'll provide details on that object.  Click on a DB, it'll give you all changes for that DB.  Click on Server...same thing.
This works in SSMS 2005/2008/2012.  And let me state for the record, it sucks that you have to develop it in BIDS 2005 in order for SSMS 2008 to view it properly.

I'll be tweaking this report; it's bare-bones and not particularly pretty, but is a really handy way to track down changes.

A couple examples: (apologies for the size, but otherwise you can't easily see the commandtext which shows the actual commands executed - click to enlarge)

In order for the reports to work in SSMS, you will need to create a linked server on each monitored server, pointing at the server that holds EventNotificationRec, with the name ENRepository (collation compatible true and RPC OUT on).  This way when you right-click on an object it will send the server/database/objectname across on each server - this is needed because SSMS does not let you use a different data source.
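The linked-server setup described above can be sketched as follows; 'MyRepositoryServer' is a placeholder for whichever server holds EventNotificationRec:

```sql
--create the ENRepository linked server on each monitored server
EXEC master.dbo.sp_addlinkedserver
     @server = N'ENRepository', @srvproduct = N'',
     @provider = N'SQLNCLI', @datasrc = N'MyRepositoryServer';
--the two options the reports need: collation compatible and RPC OUT
EXEC master.dbo.sp_serveroption @server = N'ENRepository',
     @optname = N'collation compatible', @optvalue = N'true';
EXEC master.dbo.sp_serveroption @server = N'ENRepository',
     @optname = N'rpc out', @optvalue = N'true';
```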

The version of the report shown this evening is EN_ReportSP.  I will be tweaking it, as well as the other code.  Watch this space.

Please feel free to use, share, improve.  Please contact me for commercial use (hey, a guy can hope).

Wednesday, October 3, 2012

[SSMS 2012] Getting my CTRL+SHIFT+C back for comment

I'm sorry, I just can't CTRL+K, CTRL+C.  Too many years using Query Analyzer (which, to this day, still does things easily that SSMS doesn't).

How do you get it back?

  • First, Tools->Options->Keyboard.  
  • Show commands containing -> comment
  • Click in "Press shortcut keys" and press CTRL+SHIFT+C.  It will say "Shortcut currently used by:" - write all of them down if there's more than one (it's a drop-down box).
  • Under "Shortcuts for selected command", remove each one that isn't CTRL+SHIFT+C.
  • Now go to each of the commands you wrote down earlier and remove CTRL+SHIFT+C from each.

Click OK, and you're done.

Wednesday, September 26, 2012

[Replication] IDENTITY, failed inserts due to primary key, and no-longer-replicated tables

We recently had a problem where we had to break replication to several tables, but keep data flowing to them from another data source.  So we built a process that inserted new records into the no-longer-replicated tables.

However, our inserts failed, telling us there was a problem with the primary key - a "Violation of PRIMARY KEY constraint" error, because the key wouldn't accept NULL.  Which was odd, because when you look at the tables they clearly have an identity column.  Eventually we (okay, it was JS) thought to check the identity seed:


Which returned NULL(!).  Simply doing


Pushed the values back to a real number, at which point the inserts were able to occur.

Note that you cannot RESEED an identity back to NULL.  I have no doubt there's a way, because SQL does it for replication, but if you want to go back to replicating that table, you'll probably need to snapshot it.
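For reference, the check and the reseed described above can be sketched like this; 'dbo.mytable' is a placeholder, and the reseed value should be your table's actual MAX key:

```sql
--report the current identity value; this is what returned NULL for us
DBCC CHECKIDENT ('dbo.mytable', NORESEED);

--push the seed back to a real number so inserts can generate keys again
DECLARE @max bigint = (SELECT MAX(id) FROM dbo.mytable);
DBCC CHECKIDENT ('dbo.mytable', RESEED, @max);
```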

Tuesday, September 25, 2012

sp_server_diagnostics - dynamic EVENTS parser

(deprecated - see newer post for a much faster parser with more info)

Here's part 2 - parsing the error logs out of sp_server_diagnostics.  Copy and paste the entire article; code overflows the column but is still copyable.

First, get the data. I use a global temp table for it, and I don't drop it until I'm done troubleshooting.
sp_server_diagnostics should run continuously and you should be able to pull it that way, but I can't remember from where.  Thus, I just run it for 5 seconds to populate the table.
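On the "runs continuously" point: in 2012 the built-in system_health Extended Events session also captures sp_server_diagnostics component results, so something along these lines may let you read recent results from its ring buffer instead of running the proc yourself. This is a sketch - verify what your system_health session actually captures on your build:

```sql
--pull the system_health ring buffer; sp_server_diagnostics_component_result
--events should be in the XML on 2012
SELECT CAST(t.target_data AS XML) AS diag_xml
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t
  ON s.address = t.event_session_address
WHERE s.name = 'system_health'
  AND t.target_name = 'ring_buffer';
```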

Feel free to share, and feel free to send me changes and enhancements.  FYI - if you have a lot of errors, the XML parsing is SLOW.  This runs in under 20 seconds on a normal box, but on one of my trouble boxes it can take 2.5-5 minutes.  I have no doubt someone GOOD at XQuery could make this go considerably faster - and I'd love to know how.  If there are no errors in the results, nothing will be returned.

I hope this helps you troubleshoot SQL Server 2012.

--takes 5 seconds to run, only valid on 2012 servers
if object_id('tempdb..##SpServerDiagnosticsResult') is null 
CREATE TABLE ##SpServerDiagnosticsResult 
(
  create_time DateTime,
  component_type varchar(128),
  component_name varchar(128),
  state int,
  state_desc varchar(20),
  data varchar(max)
)

INSERT INTO ##SpServerDiagnosticsResult
EXEC sys.sp_server_diagnostics

--SP_SERVER_DIAGNOSTICS Dynamic Event Parser V 1.11
--You may use this at will, you may share it provided this header remains.  
-- Written 2012 Michael Bourgon
--Commercial use prohibited without permission - if you charge money for using this, I want a cut.
-- If you're just running this as your job as a DBA, enjoy.
-- Due to the XML parsing this takes about 4 minutes to run, of which over half is the parsing to generate the selects
-- Please feel free to share, and feel free to send corrections or enhancements -
-- Thanks to Marc_S on Stackoverflow for the help on parsing XML!
DECLARE @min int, @max INT, @eventtype VARCHAR(100), @sql NVARCHAR(MAX)
DECLARE @events TABLE (id int IDENTITY(1,1), EventName varchar(100))

--get a list of event types, then walk through each separately; columns won't match 
INSERT INTO @events (EventName)
SELECT DISTINCT EventName = Evt.value('(@name)[1]', 'varchar(100)')
FROM 
(
SELECT CAST(data AS XML) AS xml_data 
FROM ##SpServerDiagnosticsResult 
WHERE component_name = 'events'
) a
CROSS APPLY xml_data.nodes('/events/session/RingBufferTarget/event') Tbl(Evt)

--Loop - for each event type, generate a SQL script for those columns
SELECT @min = MIN(id), @max = MAX(id) FROM @events
WHILE @min <= @max
BEGIN
SET @sql = NULL 
SELECT @eventtype = EventName FROM @events WHERE id = @min

--header for the query
SELECT @sql = N'select 
EventName = Evt.value(''(@name)[1]'', ''varchar(100)'')
,OriginalTime = Evt.value(''(@timestamp)[1]'', ''varchar(100)'')' + CHAR(10) + CHAR(9)

--meat of the query - get the data for each unique TYPE, if a normal value.
-- if the subdatatype is not a "normal" type, we assume we want a name/text/value
-- we use varchar(100) for that, rather than a separate CASE, for speed
-- SO, don't just add varchar(100) to this CASE without understanding why.
SELECT @sql = @sql + 
N' ,' + SubEventName 
+ CASE SubDataType
when N'int' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value' + ')[1]'', ''' + SubDataType + ''')' + CHAR(10) + CHAR(9)
WHEN N'bigint' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value'+ ')[1]'', ''' + SubDataType + ''')'  + CHAR(10) + CHAR(9)
WHEN N'unicode_string' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value' + ')[1]'', ''' + SubDataType + ''')' + CHAR(10) + CHAR(9)
WHEN N'uniqueidentifier' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value' + ')[1]'', ''' + SubDataType + ''')' + CHAR(10) + CHAR(9)
WHEN N'nvarchar(100)' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value' + ')[1]'', ''' + SubDataType + ''')' + CHAR(10) + CHAR(9)
WHEN N'bit' THEN N' = Evt.value(''(data[@name="' + SubEventName + '"]/value' + ')[1]'', ''' + SubDataType + ''')' + CHAR(10) + CHAR(9)
ELSE N' = isnull(Evt.value(''(data[@name="' + SubEventName + '"]/type/@name)[1]'', ''varchar(100)''),'''') + '' : ''
    + isnull(Evt.value(''(data[@name="' + SubEventName + '"]/text)[1]'', ''varchar(100)''),'''') + '' : ''
    + isnull(Evt.value(''(data[@name="' + SubEventName + '"]/value)[1]'', ''varchar(100)''),'''')' + CHAR(10) + CHAR(9)
END
--break out each event type for the larger query; could just use nvarchar/varchar, but returning the right data type is cleaner
-- (and we need to know when it's a non-standard type for the name/text/value)
FROM
(select distinct
--EventName = Evt.value('(../@name)[1]', 'nvarchar(100)'), --disabled since it's not actually used
SubEventName = Evt.value('(@name)[1]', 'nvarchar(100)'),
SubDataType = CASE Evt.value('(type/@name)[1]', 'nvarchar(100)') 
WHEN 'int32' THEN N'int'
WHEN 'uint16' THEN N'int'
WHEN 'boolean' THEN N'bit'
WHEN 'unicode_string' THEN N'nvarchar(100)'
WHEN 'uint32' THEN N'bigint'
WHEN 'uint64' THEN N'nvarchar(100)'
WHEN 'guid' THEN N'uniqueidentifier'
WHEN 'ansi_string' THEN N'nvarchar(100)'
ELSE N'varchar(100)' END --if unknown, then probably name/text/value. 
FROM 
(
SELECT CAST(data AS XML) AS xml_data FROM ##SpServerDiagnosticsResult WHERE component_name = 'events'
) a
CROSS APPLY xml_data.nodes('/events/session/RingBufferTarget/event/data') Tbl(Evt)
WHERE Evt.value('(../@name)[1]', 'varchar(100)') = @eventtype
) b

--and the footer for our query; might be able to do a dual CROSS APPLY, but this is more readable
SELECT @sql = @sql + N'
from (
SELECT CAST(data AS XML) AS xml_data FROM ##SpServerDiagnosticsResult WHERE component_name = ''events''
) a
CROSS APPLY xml_data.nodes(''/events/session/RingBufferTarget/event'') Tbl(Evt)
WHERE Evt.value(''(@name)[1]'', ''varchar(100)'') = ''' + @eventtype + ''''

EXEC sp_executesql @sql
-- PRINT @sql

SET @min = @min + 1
END

sp_server_diagnostics - parsing (part 1 - RESOURCE)

SQL Server 2012 offers a new way to view problems on your server.  It's light-weight, it's already running, and it captures all sorts of information.

It's called sp_server_diagnostics.

There's just one problem - the results are in XML and are hard to read.  We can fix that.  We have the Xquery.

Here's part 1, the results from the memory and waits.
Part 2 is the tricky part - events.  Forthcoming.

First, let's catch the data.

--takes 5 seconds to run, only valid on 2012 servers
if object_id('tempdb..##SpServerDiagnosticsResult') is null 
CREATE TABLE ##SpServerDiagnosticsResult 
(
 create_time DateTime,
 component_type varchar(128),
 component_name varchar(128),
 state int,
 state_desc varchar(20),
 data varchar(max)
)

INSERT INTO ##SpServerDiagnosticsResult
EXEC sys.sp_server_diagnostics

--now to parse
--idea pilfered from Neil Hambly, written by M Bourgon
--Memory Info
SELECT a.b.value('@description','varchar(100)') AS descript, a.b.value('@value','bigint') AS val 
FROM 
(
SELECT CAST(data AS XML) AS xml_data FROM ##SpServerDiagnosticsResult
) x
CROSS APPLY x.xml_data.nodes('/resource/memoryReport/*') a(b)

--top 10 wait types
SELECT a.b.value('@waitType','varchar(100)') AS WaitType,
  a.b.value('@waits','bigint') AS WaitType_Waits,
  a.b.value('@averageWaitTime','bigint') AS WaitType_Avg,
  a.b.value('@maxWaitTime','bigint') AS WaitType_Max
FROM 
(
SELECT CAST(data AS XML) AS xml_data FROM ##SpServerDiagnosticsResult
) x
CROSS APPLY x.xml_data.nodes('/queryProcessing/topWaits/nonPreemptive/byCount/*') a(b)

Wednesday, September 12, 2012

[Powershell] Building a dm_os_wait_stats repository

(update 2013/06/13: I've superseded this with a new framework that will run almost any piece of code, powershell or sql, saving the results to a table, and running with multiple threads. Please give it a look! )

Inspired by Paul Randall's talk at PluralSight on Waits & Queues (highly recommended), I've built a repository of dm_os_wait_stats by cobbling together some other people's code.

Lots of code here, but for you it's a matter of copy & pasting 4 files, a table, and a job.  Call this my 1.0 release. For 1.0, I'm only doing one server at a time; yes, I plan on multithreading it, but it does 80 servers in under 3 minutes.  And yes, if it can't reach a server it will throw a message (that I don't currently capture), but it does continue.

What we're doing:

  • grab a list of servers that you already have stored in a database somewhere
  • for each server
    • run Paul Randall's code that aggregates the overall wait stats
    • save results to a central server (probably where you have your server list)
Powershell code pilfered from Chad Miller, SQL code from Paul Randall.  

First, grab the scripts for invoke-sqlcmd2 and write-datatable, and save them to files named invoke-sqlcmd2.ps1 and write-datatable.ps1, respectively.  Everything goes in c:\sql_scripts.

Next, the actual query from Paul Randall; save this as dm_os_wait_stats_agg.sql.  This gets useful info from the DMV.

WITH [Waits] AS
(
    SELECT wait_type,
        wait_time_ms / 1000.0 AS WaitS,
        (wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
        signal_wait_time_ms / 1000.0 AS SignalS,
        waiting_tasks_count AS WaitCount,
        100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,
        ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (
--mdb 2012/09/12 adding 2012-specific waits to ignore
        --the exclusion list from Paul Randall's original post goes here
        )
)
SELECT
    @@servername as server_name, getdate() as insert_datetime, W1.wait_type AS WaitType, 
    CAST (W1.WaitS AS DECIMAL(14, 2)) AS Wait_S,
    CAST (W1.ResourceS AS DECIMAL(14, 2)) AS Resource_S,
    CAST (W1.SignalS AS DECIMAL(14, 2)) AS Signal_S,
    W1.WaitCount AS WaitCount,
    CAST (W1.Percentage AS DECIMAL(4, 2)) AS Percentage,
    CAST ((W1.WaitS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgWait_S,
    CAST ((W1.ResourceS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgRes_S,
    CAST ((W1.SignalS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgSig_S
FROM Waits AS W1
    INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum
GROUP BY W1.RowNum, W1.wait_type, W1.WaitS, W1.ResourceS, W1.SignalS, W1.WaitCount, W1.Percentage
HAVING SUM (W2.Percentage) - W1.Percentage < 95; -- percentage threshold

Now, create the table for the results.  The identity column is at the end so that write-datatable doesn't balk.  Technically the ID is not needed, but I can more easily see how the process is doing.

CREATE TABLE [dbo].[dm_os_wait_stats_info](
      [server_name] [sysname] NOT NULL,
      [insert_datetime] [datetime] NOT NULL,
      [WaitType] [varchar](120) NOT NULL,
      [Wait_S] [decimal](14, 2) NULL,
      [Resource_S] [decimal](14, 2) NULL,
      [Signal_S] [decimal](14, 2) NULL,
      [WaitCount] [bigint] NULL,
      [Percentage] [decimal](4, 2) NULL,
      [AvgWait_S] [decimal](14, 4) NULL,
      [AvgRes_S] [decimal](14, 4) NULL,
      [AvgSig_S] [decimal](14, 4) NULL,   
--ID at the end, otherwise the write-datatable chokes
      [id] [bigint] IDENTITY(1,1) NOT NULL
)

Next, save the following code to a file named get_dm_os_wait_stats.ps1.  I put it in c:\sql_scripts.  The first two lines "dot source" the scripts so that their functions can be called.  The third line does the actual heavy lifting: for each server in your server list, it runs the wait-stats query and writes the results back to the central table.

. c:\sql_scripts\invoke-sqlcmd2.ps1
. c:\sql_scripts\write-datatable.ps1
invoke-sqlcmd2 -serverinstance "serverwithserverlist" -query "SELECT server_names FROM yourdatabase.dbo.yourserverlist WHERE active = 1" | foreach-object {$dt = invoke-sqlcmd2 -erroraction silentlycontinue -serverinstance $_.server_names -inputfile c:\sql_scripts\dm_os_wait_stats_agg.sql -As 'Datatable'; write-datatable -serverinstance "serverwithserverlist" -DATABASE "targetdb" -tablename "dm_os_wait_stats_info" -DATA $dt}

Finally, create the job.  It only needs one job step, set as Operating system (CmdExec).  Schedule it however often you want - I'd say either hourly or daily.  You'll also want another job to delete old records (I'll leave that as an exercise for the reader).
powershell "& c:\sql_scripts\get_dm_os_wait_stats.ps1"
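That cleanup job could be a single T-SQL step along these lines - a minimal sketch, assuming a 30-day retention (the table and column names match the DDL above; the retention period is my own choice):

```sql
-- purge collected wait stats older than 30 days; schedule this daily
DELETE FROM dbo.dm_os_wait_stats_info
WHERE insert_datetime < DATEADD(DAY, -30, GETDATE());
```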

And that's pretty much it!  
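Once a few days of snapshots have accumulated, a query along these lines (a sketch against the table above; the wait type is just an example) shows how a given wait trends per server over time:

```sql
-- history of one wait type per server, most recent snapshot first
SELECT server_name, insert_datetime, WaitType, Wait_S, Percentage
FROM dbo.dm_os_wait_stats_info
WHERE WaitType = 'PAGEIOLATCH_SH'   -- pick a wait you care about
ORDER BY server_name, insert_datetime DESC;
```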

Wednesday, September 5, 2012

[SSIS] fun with SSIS 2012 - config UNC paths in jobs

Ran into a problem with jobs invoking SSIS packages, where you override the configuration in the job (job step -> general -> configuration, value).
For some reason, this type of UNC path doesn't work:
but, this does:

Tuesday, September 4, 2012

[statistics] Find last update time for all statistics for a table

(update 2015/09/09 - added truncate table to the loop.  Doh!)

Statistics matter. So you want to find when they were last updated.  You usually see this one:

SELECT name AS index_name,
STATS_DATE(OBJECT_ID, index_id) AS StatsUpdated
FROM sys.indexes

Which works - but only for statistics tied to an index.  SQL Server has more than just those: there are also the _WA_Sys stats, which begin with that prefix and are auto-created by the engine.  (IIRC, you can also see DTA stats, built by the Database Tuning Advisor.) 
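The same STATS_DATE approach works against sys.stats, which covers index stats and the auto-created _WA_Sys stats alike - a sketch; swap in your own table name:

```sql
-- last update time for ALL statistics on a table, not just index-backed ones
SELECT s.name AS stat_name,
       STATS_DATE(s.object_id, s.stats_id) AS StatsUpdated,
       s.auto_created
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('yourtablenamehere')
ORDER BY StatsUpdated DESC;
```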

In theory you shouldn't have many of them, because they're only generated if you're querying a field that isn't part of an index.  Theoretically you should clean them out (that may be another post), since a stat could be in there twice - once for an index and once because you queried the table (creating the stat) before you created the index.

This should list all of them.  

DECLARE @sql NVARCHAR (1000), @MIN INT, @MAX INT, @statname NVARCHAR(128)
declare @listofstats TABLE (id int identity, NAME sysname)
INSERT INTO @listofstats (name) 
SELECT name FROM sys.stats WHERE object_id = object_id('yourtablenamehere')
SELECT @min = MIN(id), @max = MAX(id) FROM @listofstats
if object_id('tempdb..#stats_info') is not null
    DROP TABLE #stats_info
CREATE TABLE #stats_info (updated DATETIME, [Table cardinality] BIGINT, [snapshot ctr] BIGINT, steps INT, density DECIMAL(19,16), [rows above] INT, [rows below] INT, [squared variance error] DECIMAL (19,16), [inserts since last update] MONEY, [deletes since last update] MONEY, [leading column type] VARCHAR(50))

if object_id('tempdb..#stats_info2') is not null
    DROP TABLE #stats_info2
CREATE TABLE #stats_info2 (stat_name VARCHAR(128), updated DATETIME, [Table cardinality] BIGINT, [snapshot ctr] BIGINT, steps INT, density DECIMAL(19,16), [rows above] INT, [rows below] INT, [squared variance error] DECIMAL (19,16), [inserts since last update] MONEY, [deletes since last update] MONEY, [leading column type] VARCHAR(50))

-- trace flag 2388 (undocumented) makes SHOW_STATISTICS return the update-history columns captured above
DBCC TRACEON (2388)

WHILE @min <= @max
BEGIN
    TRUNCATE TABLE #stats_info
    SELECT @sql = NULL, @statname = NULL
    SELECT @statname = NAME FROM @listofstats WHERE id = @min
    SELECT @sql = N'DBCC SHOW_STATISTICS (''yourdatabasenamehere..yourtablenamehere'','''+ @statname +''')'
    INSERT INTO #stats_info 
    EXEC master..sp_executesql @sql
    INSERT INTO #stats_info2 SELECT @statname, * FROM #stats_info
    SET @min = @min + 1
END

DBCC TRACEOFF (2388)


SELECT #stats_info2.stat_name, #stats_info2.updated 
FROM #stats_info2 
INNER JOIN (SELECT stat_name, MAX(updated) AS max_date 
            FROM #stats_info2 
            GROUP BY stat_name) a 
    ON #stats_info2.stat_name = a.stat_name 
   AND #stats_info2.updated = a.max_date
ORDER BY #stats_info2.stat_name
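The same "latest row per stat" result can also be had with a window function, which avoids the self-join - a sketch against the same #stats_info2 table:

```sql
-- most recent update per statistic, via ROW_NUMBER instead of a self-join
SELECT stat_name, updated
FROM (SELECT stat_name, updated,
             ROW_NUMBER() OVER (PARTITION BY stat_name ORDER BY updated DESC) AS rn
      FROM #stats_info2) x
WHERE rn = 1
ORDER BY stat_name;
```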