Sunday, December 16, 2018

[Patching] Beware of SQL Server 2016 SP2 CU3 bug if running Replication and SA is not named SA!

TL;DR: if the SA account is not named SA (because, as per MS and Corporate, we rename it), this patch fails with this error and the server shuts down.
(Note the spots in bold.) The fix is to start the service with trace flag 902 (/T902), rename that formerly-known-as-SA account to SA, then stop the service and start it back up via SERVICES.

Worth reading:


We just spent a good 30+ minutes trying to solve this.  Fortunately, after reading Pinal Dave's post about it (top post), I was able to figure out that I needed to google different terms, and found the second post (2nd link), which is far more applicable.

Rename the SA account to SA before patching, at least for this specific patch. (2016 SP2 CU3)

THE FIX, if you patched and it doesn’t start back up:
1) From a command window, running AS ADMINISTRATOR, run this:
2) Connect via SSMS. Security->Logins, find the long-random-string-of-letters.  (Alternatively, find it in syslogins, but this appears simpler)
3) Right-click on the former-SA-account, and choose “Rename”
4) Rename to SA. 
5) Go back to the command window, run
6) Go back to the SERVICES, and start the SQL Service.  It should come up.
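
The steps above can be sketched as a script. This is a minimal sketch, not a tested procedure. It assumes a default instance (service name MSSQLSERVER), that sqlcmd is on the path, and that trace flag 902 is the flag that skips the upgrade scripts at startup. New-SaRenameStatement is a hypothetical helper name, and the renamed login is looked up by SID 0x01, which is the SID the built-in sa login keeps after a rename.

```powershell
# Hypothetical helper: builds the T-SQL that renames a login back to sa.
function New-SaRenameStatement {
    param([Parameter(Mandatory)][string]$CurrentName)
    # Bracket-quote the name, doubling any ] so odd names stay valid T-SQL.
    $quoted = '[' + $CurrentName.Replace(']', ']]') + ']'
    return "ALTER LOGIN $quoted WITH NAME = [sa];"
}

# Usage, from an elevated prompt (commented out so nothing runs by accident).
# MSSQLSERVER is an assumption: use your own service name for a named instance.
# net start MSSQLSERVER /T902
# sqlcmd -S . -E -Q "SELECT name FROM sys.server_principals WHERE sid = 0x01"
# sqlcmd -S . -E -Q (New-SaRenameStatement -CurrentName 'name-returned-above')
# net stop MSSQLSERVER
```

After the stop, start the service normally from SERVICES as in step 6.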

Thursday, August 16, 2018

[Query Plan] Getting out the query execution plan when SP_whoisactive gives you a big fat NULL.

Ran into a problem where I couldn't get the execution plan for a query that was taking forever. 

Found a smarter man than me who'd built a powershell cmdlet to capture it and save it to disk.

Here's my wrapper for it:

$SPname = "yourspnamehere"
$servername = "yourserverhere"

# Dot-source Invoke-Sqlcmd2 so the function is available in this session.
. c:\powershell_scripts\invoke-sqlcmd2.ps1

# Find the cached plan handle(s) for the proc, most-used first.
$query = @"
SELECT master.dbo.fn_varbintohexstr(plan_handle) as plan_handle
FROM sys.dm_exec_cached_plans 
CROSS APPLY sys.dm_exec_sql_text(plan_handle) 
CROSS APPLY sys.dm_exec_query_plan(plan_handle)
WHERE text LIKE '%$SPname%'
AND objtype = 'Proc'
ORDER BY usecounts DESC;
"@

$queryresults = invoke-sqlcmd2 -serverinstance $servername -query $query
# More than one cached plan can match; take the most-used one.
$planhandle = $queryresults | Select-Object -First 1 -ExpandProperty plan_handle
. z:\mydocs\get-queryplan.ps1 -SqlInstance $servername -planhandle "$planhandle"

Tuesday, June 26, 2018

[AWS] Powershell to figure out the encryption status of your EC2 drives.

Since this turned out to be more difficult than I'd thought, I'm posting it here.

$dbserver_instance = (Get-EC2Instance -Filter @{Name="tag:Name";Value="yourservernamehere"}  -Region us-east-1).Instances
get-ec2volume -filter @{Name="attachment.instance-id";value="$($dbserver_instance.instanceid)"} -region us-east-1

The tricky part is the tag:Name filter, which isn't obvious.  Also odd: if you run get-ec2volume with just the tag:Name filter and value, you get the exact same message as if you ran get-ec2instance!  I don't pretend to understand that.
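
To get from the volume list to an actual encryption answer, something like the following could work. It is a sketch: Get-VolumeEncryptionSummary is a hypothetical helper name, and it only assumes the volume objects expose VolumeId, Size and Encrypted, which the objects returned by Get-EC2Volume do.

```powershell
# Hypothetical helper: reduce volume objects to the fields that answer
# "is this drive encrypted?", with unencrypted volumes listed first.
function Get-VolumeEncryptionSummary {
    param([Parameter(Mandatory)][object[]]$Volume)
    $Volume |
        Select-Object VolumeId, Size, Encrypted |
        Sort-Object Encrypted, VolumeId
}

# Usage against AWS (needs the AWS PowerShell module and credentials):
# $instance = (Get-EC2Instance -Filter @{Name="tag:Name";Value="yourservernamehere"} -Region us-east-1).Instances
# $volumes  = Get-EC2Volume -Filter @{Name="attachment.instance-id";Value=$instance.InstanceId} -Region us-east-1
# Get-VolumeEncryptionSummary -Volume $volumes | Format-Table
```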

Wednesday, June 6, 2018

[AWS RDS] Sending an email when there's a new version of RDS available

Something I needed recently.  Difficulty level: Get-RDSPendingMaintenanceAction ONLY returns info when there's something to be done.  If you're current, there's no way to find out what it looks like.

This will email you an HTML-formatted table with the details of each pending maintenance action for every instance in a region.
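
A minimal sketch of that idea follows. The helper name ConvertTo-MaintenanceRow, the region, the email addresses and the SMTP server are all placeholders of mine, not the original script. The helper flattens each resource's PendingMaintenanceActionDetails list into one row per action, which ConvertTo-Html can then turn into a table.

```powershell
# Hypothetical helper: one output row per pending action per resource.
function ConvertTo-MaintenanceRow {
    param([Parameter(Mandatory)][AllowEmptyCollection()][object[]]$Pending)
    foreach ($resource in $Pending) {
        foreach ($detail in $resource.PendingMaintenanceActionDetails) {
            [pscustomobject]@{
                ResourceIdentifier = $resource.ResourceIdentifier
                Action             = $detail.Action
                Description        = $detail.Description
                CurrentApplyDate   = $detail.CurrentApplyDate
            }
        }
    }
}

# Usage (placeholders throughout; only sends mail when something is pending):
# $pending = @(Get-RDSPendingMaintenanceAction -Region us-west-1)
# if ($pending.Count -gt 0) {
#     $body = ConvertTo-MaintenanceRow -Pending $pending | ConvertTo-Html -Fragment | Out-String
#     Send-MailMessage -To 'dba@example.com' -From 'rds@example.com' `
#         -Subject 'RDS maintenance pending' -Body $body -BodyAsHtml -SmtpServer 'smtp.example.com'
# }
```

Because the helper only reads properties, you can feed it hand-built objects to see the table layout even when nothing is actually pending.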

Monday, May 7, 2018

[AWS] Reading Aurora audit logs in Cloudwatch for DDL changes

Another form of log-watching, this time the Audit logs that Aurora can push to Cloudwatch.  It's not all stuff by any means; heck, the example they use is just about failed logins.

In this case, we're reading the Cloudwatch logs for our RDS clusters.
Since Cloudwatch offers filters, in this example we're looking for QUERY commands, since we've turned Server Audit on, and are uploading connects + queries to Cloudwatch.  (More information on that in:  Remember to set a cap on it, though then interesting things could be buried.

Why are we doing this?  In our case, this gives us a few bits of cool info.  Specifically, we can log all the DDL commands that come across our cluster, making sure everything is behaving as expected.
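
As a rough sketch of the DDL check: the helper below decides whether one audit message looks like a DDL statement. Test-DdlAuditMessage is a hypothetical name, and the pattern assumes the comma-separated Aurora MySQL audit layout in which the operation field (QUERY) is followed by the database name and then the statement in single quotes. Adjust the pattern if your messages look different.

```powershell
# Hypothetical helper: true when an audit message is a QUERY whose statement
# starts with a DDL verb. -match is case-insensitive by default.
function Test-DdlAuditMessage {
    param([Parameter(Mandatory)][AllowEmptyString()][string]$Message)
    return $Message -match ",QUERY,[^,]*,'\s*(CREATE|ALTER|DROP|TRUNCATE|RENAME)\b"
}

# Usage (log group name is a placeholder; Limit is the cap mentioned above):
# $resp = Get-CWLFilteredLogEvent -LogGroupName '/aws/rds/cluster/yourcluster/audit' `
#     -FilterPattern 'QUERY' -Limit 1000
# $resp.Events | Where-Object { Test-DdlAuditMessage -Message $_.Message }
```

The CloudWatch filter narrows things down to QUERY rows on the server side, and the regex then does the finer DDL-only cut locally.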

[AWS] Aurora - reading the errorlog files with powershell

Been working on monitoring.  For some reason, when you tell Aurora to send errorlogs to Cloudwatch, all it sends are the Audit Logs, which will tell you that code had changed, etc, but doesn't (!?!?!??!!!) actually put your logs in Cloudwatch.  I don't understand it, so I built this process to look through logs and return the data.  The next step would be to format it and either upload to Cloudwatch manually, or log it, or send email.  Whatever works for you.  You be you.  :D

Thursday, May 3, 2018

[AWS] powershell to patch all Aurora clusters

Pretty basic, but took longer than I figured it would.  The catch was figuring out how to look inside the results.

set-awscredentials -accesskey youraccesskey -secretkey yoursecretkey

Get-RDSPendingMaintenanceAction | ForEach-Object {
Submit-RDSPendingMaintenanceAction -ResourceIdentifier $_.ResourceIdentifier -applyaction $_.PendingMaintenanceActionDetails.action -OptInType immediate }

So when you get the results back, it looks like:

PendingMaintenanceActionDetails               ResourceIdentifier                                         
-------------------------------               ------------------                                         
{Amazon.RDS.Model.PendingMaintenanceAction}   arn:aws:rds:us-west-1:xxxxxxxx:cluster:xxxxxx

How do you view what's in that Amazon.RDS object?  I have no doubt there's some way to unpack it with powershell, but I'm not sure what that is.

What I did:

Looked at the PoSH module documentation for this cmdlet (Get-RDSPendingMaintenanceAction) to see what it returned:

Which says:

Which, indeed, is what it returned to us.

Now, clicking on the object info from the documentation:

takes us to:

And THAT page says it has Action, AutoAppliedAfterDate, etc.
When I run

$DataSet = Get-RDSPendingMaintenanceAction
$DataSet.PendingMaintenanceActionDetails

here's what I get:

Action               : system-update
AutoAppliedAfterDate : 1/1/0001 12:00:00 AM
CurrentApplyDate     : 5/2/2018 4:41:00 PM
Description          : Aurora 1.17.2 release
ForcedApplyDate      : 1/1/0001 12:00:00 AM
OptInStatus          : immediate

So, now we have the fields we need: what kind of action to take (not optional, and it can be db-upgrade or system-update), and the ResourceIdentifier for the cluster.

Friday, April 13, 2018

[Statistics] Last 4 stats redux - which stats did I update just now that all recent maintenance didn't do?

This is an addendum to my other "last 4 stats" article.  We had a query that had a bad plan, either due to bad statistics or sniffing or whatnot.  Normally rerunning a stats update would fix it, but not today.

Here's our normal stats maint (separate from index maint).

EXECUTE dba_utils.dbo.IndexOptimize
@Databases = 'eif_cde',
@FragmentationLow = NULL,
@FragmentationMedium = NULL,
@FragmentationHigh = NULL,
@UpdateStatistics = 'ALL',
@OnlyModifiedStatistics = 'Y'  

The last time I looked, Ola's code used the built-in rule for stats maintenance to determine whether a statistic should be updated, same as sp_updatestats.  So I ran a full update (@OnlyModifiedStatistics = 'N').  But I was unsure what I had updated that hadn't been updated before.  Fortunately, thanks to the code from my other "last 4 stats" article, I was able to get the last 4 stats updates.  So, let's figure out which ones were JUST updated, and hadn't been since our last update window at 11pm.  Then we get the columns involved in those stats, in hopes that we can look at our "hung" query and see that one of the fields in the WHERE clause hadn't been updated.  That gives us an idea where to tweak, or possibly just to force a rebuild next time.

Step 1: run the code from the previous blog post, making sure it doesn't drop the table when done.
Step 2: run the below code, substituting your own windows.  In my case, I wanted to see which newly updated stats may have fixed the issue.

--and now to return which particular stats hadn't been updated.
SELECT stats.the_schema_name,
       stats.table_name,
       stats.stat_name,
       stats.updated,
       columns.name AS column_name
--       stats.[inserts since last update],
--       stats.[deletes since last update]
FROM #stats_info2 stats
INNER JOIN sys.stats ss ON ss.name = stats.stat_name
INNER JOIN sys.stats_columns sstatc ON ss.object_id = sstatc.object_id AND sstatc.stats_id = ss.stats_id
INNER JOIN sys.columns ON columns.column_id = sstatc.column_id AND columns.object_id = ss.object_id
WHERE stats.updated >= '20180413 14:00'
-- assumed: NOT EXISTS, i.e. keep only stats with no earlier update in the captured history
AND NOT EXISTS
( SELECT 1 FROM #stats_info2 stats2 WHERE
stats2.updated < '20180413 14:00' --AND stats2.updated > '20180412 22:59:00' but gives the same results
AND stats.the_schema_name = stats2.the_schema_name
AND stats.table_name = stats2.table_name
AND stats.stat_name = stats2.stat_name
)
ORDER BY stats.the_schema_name, stats.table_name, stats.stat_name

Tuesday, March 13, 2018

[AWS] Querying CloudWatch logs using powershell to find DDL changes in your Aurora instance.

Did I get enough buzzwords in that post title? 

So, I've been working on an AWS Aurora MySQL project lately (yes, you need to qualify Aurora, because these days there's a Postgres variant).

One problem we had was getting info out of the logs.  Querying the logs is possible from within MySQL, but time consuming.  Ditto querying the logs that are being saved to disk.  We used the AWS directions to get our logs to dump into CloudWatch, which offers a Powershell cmdlet to filter them (although it limits each response to 1 MB, hence the DO WHILE with a low rowcount).  Unfortunately, I didn't find any actual working examples. 

So while this may not be the best, it does work. 

NOTE: results currently go to an out-gridview window.  Obviously, change that to what you need, or:
* remove the write-host
* remove the ogv line
* wrap the entire DO WHILE in a variable, then return that at the end. BE CAREFUL DOING THIS. I tried it with a decent-sized result set (30k) and got a weird error back:

Get-CWLFilteredLogEvent : Error unmarshalling response back from AWS. Request ID:
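
For reference, the DO WHILE paging pattern described above can be sketched like this. It is not the original script: Get-AllLogEvent is a hypothetical helper, the log group name is a placeholder, and the page cap is my own addition to stop a runaway loop. It collects events into a list one page at a time, and I make no promise that this avoids the unmarshalling error on large result sets.

```powershell
# Hypothetical pager: calls $GetPage with the previous NextToken until no
# token comes back or the page cap is reached, and returns every event seen.
function Get-AllLogEvent {
    param(
        [Parameter(Mandatory)][scriptblock]$GetPage,
        [int]$MaxPages = 100
    )
    $events = New-Object System.Collections.Generic.List[object]
    $token = $null
    $page = 0
    do {
        $response = & $GetPage $token
        foreach ($e in $response.Events) { $events.Add($e) }
        $token = $response.NextToken
        $page++
    } while ($token -and $page -lt $MaxPages)
    return $events.ToArray()
}

# Usage with CloudWatch Logs (a low Limit keeps each response small):
# $all = Get-AllLogEvent -GetPage {
#     param($token)
#     $p = @{ LogGroupName = '/aws/rds/cluster/yourcluster/audit'; FilterPattern = 'QUERY'; Limit = 1000 }
#     if ($token) { $p.NextToken = $token }
#     Get-CWLFilteredLogEvent @p
# }
```

Passing the page fetch in as a scriptblock keeps the loop itself free of AWS calls, so it can be tried out against fake pages first.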

Monday, February 12, 2018

[Replication] The row was not found at the Subscriber when applying the replicated (null) command for Table '(null)' with Primary Key(s): (null)

Had this come in courtesy of one of my replication monitors (see other posts).

DESCRIPTION:   Replication-Replication Distribution Subsystem: agent my-ser-ver-here-main-Main_big-my-other-server-here-35 failed. The row was not found at the Subscriber when applying the replicated (null) command for Table '(null)' with Primary Key(s): (null)

Yes, it actually says "(null)".

What was it?  We run multi-tier replication.  That way each datacenter only gets one stream of changes, and we don't run into problems due to desktop heap (see other posts, again, on this blog).

After running a trace, we finally figured it out - the second tier had a foreign key where it shouldn't have had one.  The FKs are handled on the "front" server, the one that everything replicates from.  Because the FK was there, the rows were being deleted out of order, so it had a fit and gave us "(null)" instead of a valid name.  

Note that this is only a problem with multi-tier replication; normal repl handles it, but it somehow choked on this.  

In other news: I've been doing a lot of AWS Aurora work that I need to post about here, including both MySQL and Powershell, but I don't have enough usable yet.  Coming Soon!