Thursday, February 23, 2023

Severity 10 - using Extended Events to get code errors

I wish I knew whom to thank for this.  An absolute genius idea. Probably Ozar, possibly Bertrand, could be White or a bunch of other #SQLFamily.  Warning: once you use this, you'll want it on all your servers. 

Premise: run an Extended Event looking for all errors over Severity 10, saving in a 10mb memory buffer within the SQL Server instance. Keep the rolling last errors so you have a decent amount of history. I believe the original was Severity 10 and above, but that one was too chatty for me, whereas 11+ seems to be ideal. But I refer to it as my "Sev10" code, so here we are. This includes all sorts of stuff you won't normally see - errors from stored procedures, errors in replication, errors in SSMS, etc. 

Testing it is easy: create session & start it, run "select 1/0", then run the query.

I do have a version that I run that saves logs out, but honestly, that's a different post. This Xevent uses 10mb from the Ring Buffer, constantly filling and keeping the last 10mb of errors. Performance hit seems minimal, and we're doing millions of transactions a day.

Performance has always been the kicker on this - using Xquery directly against the ring buffer is slow as heck, and even after solving that via temp tables, I still ran into problems with one particular server. That made me realize I don't appear to have posted this. By splitting it via NODE to multiple rows to a temp table, then querying the temp table to shred the XML, the performance is vastly improved.  In my case, from 8 minutes to 16 seconds. And on most servers, under 5. 

Other thoughts - you might consider adding another WHERE clause to the ADD EVENT; maybe filter end-users' machines or SSMS. I like having it, but you do you. Turns out SSMS is damn chatty sometimes, also. "View server state permission was denied " and "Cannot drop the table '#SVer'" abound.

Want to run this in Azure SQL DB? Sure, just do it against each database, replacing the "ON SERVER" with "ON DATABASE". The table names you query against change slightly (it's commented), and the time zone calculation doesn't work yet. Booooooooo.

I'm working on a collector, just haven't finished yet.

Bonus! Frames! Take the Frames provided in the Event_Data, and go use their amazing code to tell you exactly what piece of code in what function/stored procedure, and what line, caused the issue. Is it in a stored procedure calling a stored procedure calling a function? This will tell you where and what. Genius, and kudos to Tom.  https://straightforwardsql.com/posts/investigating-errors-with-extended-events/

Cheers!


Friday, February 17, 2023

Azure SQL Managed Instance - a simple Extended Event to track logins

 This is a barebones Xevent (Extended Event) . I'm not using blob storage, just the existing "eh, you have a couple hundred meg worth of ring buffer you can use". But I needed to see who was using one of our dev instances. I thought we'd moved everything over to the Azure SQL DBs, since there's a ton of reasons to do so.

Here's some quick off-the-cuff reasons why I prefer Azure SQL DB to Managed Instance:

  1. Spins up/down FAST.  Seconds to minutes. Have a 20gb database? Takes 2 minutes. 30gb? 3.
  2. More availability! Technically more servers!
  3. Serverless options! Allows you to just pay for the price of storage, not compute. Doubly useful for QA/Dev boxes that aren't always used. Also allows us to auto scaleup/down as traffic changes.
  4. Works with our super-limited networking.

Yes, there are downsides. CLR, Replication, Agent, Mail, others. But the wins of using Azure SQL DB are very tangible.  < / soapbox > 




Anyhow, script:

@@

/* 


CREATE EVENT SESSION [Logins] ON SERVER 

ADD EVENT sqlserver.login(

ACTION(sqlserver.client_hostname,sqlserver.client_app_name,sqlserver.database_name,sqlserver.username))

ADD TARGET package0.ring_buffer(SET max_memory = 4096)  --max_event_limit threw error!

go

ALTER EVENT SESSION [Logins] 

ON SERVER 

STATE = start ;

GO*/



if object_id('tempdb..#xevent') is not null

    DROP TABLE #xevent

CREATE TABLE #xevent (target_data XML)

INSERT INTO #xevent (    target_data)

SELECT target_data

 FROM sys.dm_xe_session_targets        AS st

    INNER JOIN sys.dm_xe_sessions       AS se

         ON CAST(se.address AS BINARY(8)) = CAST(st.event_session_address AS BINARY(8))

WHERE

se.name = 'Logins'


DECLARE @count VARCHAR(10) 

SELECT @count = target_data.value('/RingBufferTarget[1]/@eventCount','varchar(10)') 

FROM #xevent --null means it's not on or not "installed"

PRINT 'eventCount = ' + @count


;with events_cte as(

SELECT 

ed.c.value('(@timestamp)[1]', 'DATETIMEOFFSET') at TIME ZONE 'Central Standard Time' AS err_timestamp,

--xevents.event_data.value('(RingBufferTarget/event/@timestamp)[1]', 'datetime2')) AS [err_timestamp],

ed.c.value('(action[@name="client_app_name"]/value)[1]', 'nvarchar(max)') AS client_app_name,

ed.c.value('(action[@name="client_hostname"]/value)[1]', 'nvarchar(max)') AS client_hostname,

ed.c.value('(action[@name="database_id"]/value)[1]', 'int') AS database_id,

ed.c.value('(action[@name="database_name (Action)"]/value)[1]', 'nvarchar(max)') AS nt_username,

ed.c.value('(action[@name="username"]/value)[1]', 'nvarchar(max)') AS username

, ed.c.query('.') AS event_data  --we leave this off since it makes this run SO MUCH LONGER.

FROM #xevent a

cross apply (select CAST(target_data as XML) as event_data) AS xevents

CROSS APPLY event_data.nodes('/RingBufferTarget/event') AS ed(c)

)

SELECT *

--, events_cte.event_data

from events_cte 


order by err_timestamp desc;

@@


Wednesday, February 8, 2023

Azure SQL Database Auditing gotchas.

A couple of notes when using Azure SQL DB Auditing 
1) this is actually kinda cool, overall. I plan on mining this in the future for code changes, etc. 
2) You can filter it! predicateexpression is the term you're looking for, available via the powershell module 3) if you're trying to add predicates, it uses SQL Audit syntax. In our case, we wanted to ignore a particular stored procedure call, so we used: 
"(NOT [statement] like '%myspnamehere%')" 
4) BEFORE YOU DO THIS and add the predicate: turn all the audits off on that "server" first. 
 The database AND the server-level. The regular AND the microsoft. 

We had a heck of a time figuring out why it wasn't working right. 
Between this and making sure the Database-Level Auditing was off, this worked. 

Disconnect-AzAccount myguidhere
Connect-AzAccount -Tenant myguidhere

Set-AzContext -SubscriptionId Get-AzSqlServerAudit -ResourceGroupName "ourRGname" -Servername "ourSERVERname" 

Set-AzSqlServerAudit -ResourceGroupName "ourRGname" -ServerName "ourSERVERname" -PredicateExpression "(NOT [statement] like '%myspnamehere%')" -WorkspaceResourceId "ourworkspaceidforourLogAnalyticsTarget" -LogAnalyticsTargetState ENABLED

Set-AzSqlServerMSSupportAudit -ResourceGroupName "ourRGname" -ServerName "ourSERVERname" -WorkspaceResourceId "ourworkspaceidforourLogAnalyticsTarget" -LogAnalyticsTargetState ENABLED