Wednesday, September 12, 2012

[Powershell] Building a dm_os_wait_stats repository

(update 2013/06/13: I've superseded this with a new framework that will run almost any piece of code, powershell or sql, saving the results to a table, and running with multiple threads. Please give it a look!
http://thebakingdba.blogspot.com/2013/04/servers-extensible-powershell.html )

Inspired by Paul Randall's talk at PluralSight on Waits & Queues (highly recommended), I've built a repository of dm_os_wait_stats by cobbling together some other people's code.

Lots of code here, but for you it's a matter of copy & pasting 4 files, a table, and a job.  Call this my 1.0 release. For 1.0, I'm only doing one server at a time; yes, I plan on multithreading it, but it does 80 servers in under 3 minutes.  And yes, if it can't reach a server it will throw a message (that I don't currently capture), but it does continue.

What we're doing:

  • grab a list of servers that you already have stored in a database somewhere
  • for each server
    • run Paul Randall's code that aggregates the overall wait stats
    • save results to a central server (probably where you have your server list)
Powershell code pilfered from Chad Miller, SQL code from Paul Randall.  
and

First, grab the scripts for invoke-sqlcmd2 (http://gallery.technet.microsoft.com/ScriptCenter/en-us/7985b7ef-ed89-4dfd-b02a-433cc4e30894) and write-datatable (http://gallery.technet.microsoft.com/ScriptCenter/en-us/2fdeaf8d-b164-411c-9483-99413d6053ae) and save to files named invoke-sqlcmd2.ps1 and write-datatable.ps1, respectively.  Everything goes in  c:\sql_scripts.

Next, the actual query from Paul Randall; save this as get_dm_os_wait_stats.ps1. This gets useful info from the DMV.

WITH Waits AS
    (SELECT
        wait_type,
        wait_time_ms / 1000.0 AS WaitS,
        (wait_time_ms - signal_wait_time_ms) / 1000.0 AS ResourceS,
        signal_wait_time_ms / 1000.0 AS SignalS,
        waiting_tasks_count AS WaitCount,
        100.0 * wait_time_ms / SUM (wait_time_ms) OVER() AS Percentage,
        ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS RowNum
    FROM sys.dm_os_wait_stats
    WHERE wait_type NOT IN (
        'CLR_SEMAPHORE', 'LAZYWRITER_SLEEP', 'RESOURCE_QUEUE', 'SLEEP_TASK',
        'SLEEP_SYSTEMTASK', 'SQLTRACE_BUFFER_FLUSH', 'WAITFOR', 'LOGMGR_QUEUE',
        'CHECKPOINT_QUEUE', 'REQUEST_FOR_DEADLOCK_SEARCH', 'XE_TIMER_EVENT', 'BROKER_TO_FLUSH',
        'BROKER_TASK_STOP', 'CLR_MANUAL_EVENT', 'CLR_AUTO_EVENT', 'DISPATCHER_QUEUE_SEMAPHORE',
        'FT_IFTS_SCHEDULER_IDLE_WAIT', 'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'BROKER_EVENTHANDLER',
        'TRACEWRITE', 'FT_IFTSHC_MUTEX', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP',
        'BROKER_RECEIVE_WAITFOR', 'ONDEMAND_TASK_QUEUE', 'DBMIRROR_EVENTS_QUEUE',
        'DBMIRRORING_CMD', 'BROKER_TRANSMITTER', 'SQLTRACE_WAIT_ENTRIES',
        'SLEEP_BPOOL_FLUSH', 'SQLTRACE_LOCK'
--mdb 2012/09/12 adding 2012-specific waits to ignore
,'DIRTY_PAGE_POLL','HADR_FILESTREAM_IOMGR_IOCOMPLETION'
)
    )
SELECT
    @@servername as server_name, getdate() as insert_datetime, W1.wait_type AS WaitType, 
    CAST (W1.WaitS AS DECIMAL(14, 2)) AS Wait_S,
    CAST (W1.ResourceS AS DECIMAL(14, 2)) AS Resource_S,
    CAST (W1.SignalS AS DECIMAL(14, 2)) AS Signal_S,
    W1.WaitCount AS WaitCount,
    CAST (W1.Percentage AS DECIMAL(4, 2)) AS Percentage,
    CAST ((W1.WaitS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgWait_S,
    CAST ((W1.ResourceS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgRes_S,
    CAST ((W1.SignalS / W1.WaitCount) AS DECIMAL (14, 4)) AS AvgSig_S
FROM Waits AS W1
    INNER JOIN Waits AS W2 ON W2.RowNum <= W1.RowNum
GROUP BY W1.RowNum, W1.wait_type, W1.WaitS, W1.ResourceS, W1.SignalS, W1.WaitCount, W1.Percentage
HAVING SUM (W2.Percentage) - W1.Percentage < 95; -- percentage threshold


Now, create the table for the results.  The identity column is at the end so that write-datatable doesn't balk.  Technically the ID is not needed, but I can more easily see how the process is doing.


CREATE TABLE [dbo].[dm_os_wait_stats_info](
      [server_name] [sysname] NOT NULL,
      [insert_datetime] [datetime] NOT NULL,
      [WaitType] [varchar](120) NOT NULL,
      [Wait_S] [decimal](14, 2) NULL,
      [Resource_S] [decimal](14, 2) NULL,
      [Signal_S] [decimal](14, 2) NULL,
      [WaitCount] [bigint] NULL,
      [Percentage] [decimal](4, 2) NULL,
      [AvgWait_S] [decimal](14, 4) NULL,
      [AvgRes_S] [decimal](14, 4) NULL,
      [AvgSig_S] [decimal](14, 4) NULL,   
--ID at the end, otherwise the write-datatable chokes
      [id] [bigint] IDENTITY(1,1) NOT NULL,
 CONSTRAINT [PK_dm_os_wait_stats_info] PRIMARY KEY CLUSTERED
(
      [server_name],
      [insert_datetime],
      [WaitType]
)
)



Next, save the following code to a file named get_dm_os_wait_stats.ps1.  I put it in c:\sql_scripts.  The first two lines "dot source" the scripts so that their functions can be called.  The third is the actual heavy lifter.

. c:\sql_scripts\invoke-sqlcmd2.ps1
. c:\sql_scripts\write-datatable.ps1
invoke-sqlcmd2 -serverinstance "serverwithserverlist" -query "SELECT server_names FROM yourdatabase.dbo.yourserverlist WHERE active = 1" | foreach-object {$dt = invoke-sqlcmd2 -erroraction silentlycontinue -serverinstance $_.server -inputfile c:\sql_scripts\dm_os_wait_stats_agg.sql -As 'Datatable'; write-datatable -serverinstance "serverwithserverlist" -DATABASE "targetdb" -tablename "dm_os_wait_stats_info" -DATA $dt}



Finally, create the job.  Only needs one job step, set as Operating system (CmdExec).  Schedule that however often you want - I'd say either hourly or daily.  You'll want another job to delete old records (I'll leave that as an exercise for the reader)
powershell "& c:\sql_scripts\get_dm_os_wait_stats.ps1"

And that's pretty much it!  

No comments: