Tuesday, May 7, 2019

[Table Partitioning] Does WAIT_AT_LOW_PRIORITY work with Standard Edition?

(TL;DR - yes.  Yes, it does.)

As you hopefully know, back with SQL Server 2016 SP1, Microsoft opened the floodgates of formerly-only-enterprise-features (https://sqlperformance.com/2016/11/sql-server-2016/big-deal-sp1). 

But with caveats.

For instance, IMOLTP (in-memory online transactional processing, aka "in-memory") tables are limited in the amount of RAM used. 

But how about WAIT_AT_LOW_PRIORITY?  That was introduced in 2014 to make table partitioning deal better with That-Dude-From-Accounting-Who-Kicks-Off-A-Massive-Query-On-Friday-at-5pm, which causes partitioning to hang on Saturday when you're trying to add and remove partitions.

Specifically, the notes don't say anything about it.  The closest they get is:
"WAIT_AT_LOW_PRIORITY used with ONLINE=ON only."

And ONLINE is an Enterprise-only feature.

So, does it or doesn't it?

For a simple example I actually used Aaron Bertrand's answer to a StackOverflow question:
https://dba.stackexchange.com/questions/155495/table-partitioning-in-sp1-for-sql-server-2016



So, what do we get?
Well, as pointed out by Joe Sack back in 2013(!), that feature will log to the Errorlog when it works. 
https://sqlperformance.com/2013/09/sql-indexes/lock-priority-sql2014

Let's try it out.  In the first window create the objects.  Once created, start a second window and open the transaction.  Then back to the first and run the SWITCHes and see what happens!

...Yup! It works in Standard!

Here are the messages I got:

Date 5/7/2019 3:41:03 PM
Log SQL Server (Current - 3/23/2019 1:06:00 AM)

Source spid61

Message
An ALTER TABLE SWITCH statement was executed on database 'eif_workspace', table 'partitionexample' by hostname 'CH004547', host process ID 30812 with target table 'staging_FactInternetSales' using the WAIT_AT_LOW_PRIORITY options with MAX_DURATION = 1 and ABORT_AFTER_WAIT = BLOCKERS. Blocking user sessions will be killed after the max duration of waiting time.

, followed a minute later by: 

Date 5/7/2019 3:42:03 PM
Log SQL Server (Current - 3/23/2019 1:06:00 AM)

Source spid61

Message
An ABORT_AFTER_WAIT = BLOCKERS lock request was issued on database_id = 7, object_id = 1182627256. All blocking user sessions will be killed.

and 

Date 5/7/2019 3:42:03 PM
Log SQL Server (Current - 3/23/2019 1:06:00 AM)

Source spid61

Message
Process ID 63 was killed by an ABORT_AFTER_WAIT = BLOCKERS DDL statement on database_id = 7, object_id = 1182627256.


I was figuring/hoping it'd work, but that example using ONLINE = ON gave me pause.
Proof, and a blog post!


Thursday, May 2, 2019

[Linked Servers] Kerberos Double Hop - SPNs and Trusted for Delegation

Yesterday I ran into the dread Kerberos Double-Hop when trying to set up a linked server.  Thought it was the standard "Add an SPN using the Microsoft Kerberos Configuration tool".  Which didn't fix it.

What the....

Turns out there's another potential setting you have to fix. 

In active directory, the service account that SQL Server is running under may need this setting:


Tuesday, February 26, 2019

[Powershell] Tracking down that failing app on another server using WMI's win32_process and Profiler

(Yes, I'm still using profiler in some cases)

As we continue to move off SQL Server 2008, we've found instances where we change all the application configs, yet something on ServerA is still querying ServerB.  And since our internal application doesn't have the connection string set to give an ApplicationName, we have no idea which one is misconfigured - or if one was missed entirely!

1) Set up a profiler trace on one side, making sure to grab ClientProcessID



2) Run this WMI query on ServerA, using Powershell. 


$loopcount = 1
$futuretime = (get-date).addseconds(60)
$processeses = gwmi -Query "select caption, commandline, processid, name, executablepath from win32_process" # run locally on the server making the calls for speed)
do {
$processeses += gwmi -Query "select caption, commandline, processid, name, executablepath from win32_process" -ComputerName "mem-qa-trdb-17"
$loopcount++while ((get-date-lt $futuretime)
$processeses | ogv

(using $loopcount will tell you how many times it ran in that minute)

You could run it remotely if you had the proper WMI rights set up (which is probably the case on WS2016+), but my fear is that it'd be slow enough that it wouldn't "catch" the app as it's running, if it's spinning up on its own, failing to connect, then quitting. 

If the app is just running and not starting/stopping, then it'll keep the same PID (process_id) and you can just run the first $processeses and filter it. 


3) When the ogv (out-gridview) runs, it'll pop up an interactive window.  Type the actual number for ClientProcessID you got from the profiler trace, in the "filter" box, and it should filter down to just the Process (and location of the executable) that you need to fix.  


Thursday, February 14, 2019

[EPR] Making the Extensible Powershell Repository work on a box with SQL Server 2017 since add-sqltable fails with Microsoft.SqlServer.Dmf

We're migrating the EPR (http://thebakingdba.blogspot.com/2013/04/servers-extensible-powershell.html) to a new box.  After 6 years, and it's still one of my best tools.  (Just drop a SQL or PS1 script in a folder, and it'll create a table and run it against all your servers).

Alas, Chad Miller's excellent add-sqltable started failing.  To the internet!

Could not load file or assembly 'Microsoft.SqlServer.Dmf"

Looking around, it appears that it's a problem with SMO.  On 2017, they split out SMO and SSMS.  Which means that installing SSMS 2017 doesn't get you SMO.  Supposedly installing it (install-module sqlserver) will do that, but I'd already did that and it wasn't working.

So, after fighting with it for a little bit, I decided to "get it working" and revisit later.

1) Install SSMS 2016 (which is when they started separating out SSMS)
2) Change the version from 11 to 13 in the script, as below: 

try {add-type -AssemblyName "Microsoft.SqlServer.ConnectionInfo, Version=13.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" -EA Stop}
catch {add-type -AssemblyName "Microsoft.SqlServer.ConnectionInfo"}

try {add-type -AssemblyName "Microsoft.SqlServer.Smo, Version=13.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" -EA Stop} 
catch {add-type -AssemblyName "Microsoft.SqlServer.Smo"} 

Sunday, December 16, 2018

[Patching] Beware of SQL Server 2016 SP2 CU3 bug if running Replication and SA is not named SA!

TL;DR – if the SA account is not named SA (because, as per MS and Corporate, we rename it), this patch fails with this error and the server shuts down. 
(note the spots in bold).  Fix is to start the service with a T910 trace flag, rename that formerly-known-as-SA account to SA, then stop the service and start service back up via SERVICES.


Worth reading:
https://blog.sqlauthority.com/2018/04/04/sql-server-upgrade-error-alter-database-statement-not-allowed-within-multi-statement-transaction/

DESCRIBES THE PROBLEM AND SOLUTION:
https://feedback.azure.com/forums/908035-sql-server/suggestions/35805787-sql-server-2016-sp2-cu3-patch-breaks-if-replicatio

We just spent a good 30+ minutes trying to solve this.  Fortunately, after reading Pinal Dave’s post about it (top post), was able to figure out that I needed to google different terms, and found the above post which is far more applicable. (2nd link)

PREVENT:
Rename the SA account to SA before patching, at least for this specific patch. (2016 SP2 CU3)

THE FIX, if you patched and it doesn’t start back up:
1) From a command window, running AS ADMINISTRATOR, run this:
NET START MSSQLSERVER /T902
2) Connect via SSMS. Security->Logins, find the long-random-string-of-letters.  (Alternatively, find it in syslogins, but this appears simpler)
3) Right-click on the former-SA-account, and choose “Rename”
4) Rename to SA. 
5) Go back to the command window, run
NET STOP MSSQLSERVER /T902
6) Go back to the SERVICES, and start the SQL Service.  It should come up.

Thursday, August 16, 2018

[Query Plan] Getting out the query execution plan when SP_whoisactive gives you a big fat NULL.

Ran into a problem where I couldn't get the execution plan for a query that was taking forever. 

Found a smarter man than me who'd build a powershell cmdlet to capture it and save it to disk.
http://www.patrickkeisler.com/2013/09/the-case-of-null-queryplan.html

Here's my wrapper for it:

$SPname = "yourspnamehere"
$servername = "yourserverhere"


. c:\powershell_scripts\invoke-sqlcmd2.ps1

$query = @"
SELECT master.dbo.fn_varbintohexstr(plan_handle) as plan_handle
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
CROSS APPLY sys.dm_exec_query_plan(plan_handle)
WHERE 
text LIKE '%$SPname%'
AND objtype = 'Proc'
ORDER BY usecounts DESC;
"@

$queryresults = invoke-sqlcmd2 -serverinstance $servername -query $query
$planhandle = $queryresults.plan_handle
. z:\mydocs\get-queryplan.ps1 -SqlInstance "ods_scd" -planhandle "$planhandle"

Tuesday, June 26, 2018

[AWS] Powershell to figure out the encryption status of your EC2 drives.

since this turned out to be more difficult than I'd thought, posting it here.

$dbserver_instance = (Get-EC2Instance -Filter @{Name="tag:Name";Value="yourservernamehere"}  -Region us-east-1).Instances
get-ec2volume -filter @{Name="attachment.instance-id";value="$($dbserver_instance.instanceid)"} -region us-east-1

The tricky part is the tag:Name, which isn't obvious.  Also odd, if you just run the get-ec2volume with the tag:name and value, you get the exact same message as if you ran get-ec2instance!  I don't pretend to understand that.

Wednesday, June 6, 2018

[AWS RDS] Sending an email when there's a new version of RDS available

Something I needed recently.  Difficulty level: Get-RDSPendingMaintenanceAction ONLY returns info when there's something to be done.  If you're current, there's no way to find out what it looks like.

This will email you a HTML-formatted table with the details of each pending maintenance action for every instance in a region.


Monday, May 7, 2018

[AWS] Reading Aurora audit logs in Cloudwatch for DDL changes

Another form of log-watching, this time the Audit logs that Aurora can push to Cloudwatch.  It's not all stuff by any means; heck, the example they use is just about failed logins.

In this case, we're reading the Cloudwatch logs for our RDS clusters.
Since Cloudwatch offers filters, in this example we're looking for QUERY commands, since we've turned Server Audit on, and are uploading connects + queries to Cloudwatch.  (More information on that in: https://aws.amazon.com/blogs/database/monitoring-amazon-aurora-audit-events-with-amazon-cloudwatch/).  Remember to set a cap on it, though then interesting things could be buried.

Why are we doing this?  In our case, this gives us a few bits of cool info.  Specifically, we can log all the DDL commands that come across our cluster, making sure everything is behaving as expected.


[AWS] Aurora - reading the errorlog files with powershell

Been working on monitoring.  For some reason, when you tell Aurora to send errorlogs to Cloudwatch, all it sends are the Audit Logs, which will tell you that code had changed, etc, but doesn't (!?!?!??!!!) actually put your logs in Cloudwatch.  I don't understand it, so I built this process to look through logs and return the data.  The next step would be to format it and either upload to Cloudwatch manually, or log it, or send email.  Whatever works for you.  You be you.  :D