Friday, April 22, 2016

[WAT] The weird intersection between Foreign Keys, Replication, Unique Indexes, set-based updates, and missing rows.

We are in process of doing a migration from an ancient creaky server to a shiny new VM.  Rather than just rebuild it and restore everything, we're taking the (painful) opportunity to clean things up and improve several systems.

As part of this, we're replicating data from the old server to the new server, so that we can migrate processes piecemeal, so that rollback is not "OH CRAP TURN IT OFF TURN IT OFF ROLL BACK TO THE OLD SERVER".

But we ran into a weird problem.  On the target server, we had a many-to-many table that sits between, let's say, stores and orders.  We have a stores table, we have an orders table, and this one (call it STORE_ORDERS for simplicity) is just a linking table between the two.  ID, stores_id, orders_id.  Everything scripted identically between the two databases (aside from the NOT FOR REPLICATION flag).

We found, while double-checking everything, that there were records missing from the stores_orders table on our new box, that were just fine on the source table.  What. The. Hell.  After looking for a few, we determined that it had to be related to one of our users doing a bunch of changes.... but just to the stores table.

What could it be?  They're updating the END_ASOF field in STORES, which is part of a unique constraint

CREATE UNIQUE NONCLUSTERED INDEX [STORE_NUMBER_ENDASOF_IDX] ON [dbo].[STORE] ([STORE_NUMBER], [END_ASOF])






The table affected does have a FOREIGN KEY constraint on STORES.  Because we're trying to migrate over, the DDL is the same.

ALTER TABLE [dbo].[STORES_ORDERS]
ADD CONSTRAINT [FK_STORES_ORDERS_STORES] FOREIGN KEY ([STORES_ID]) REFERENCES [dbo].[STORES] ([ID]) ON DELETE CASCADE

And just to make it more fun, there are some AFTER UPDATE triggers involved.  (But those are unrelated).


Any ideas?  Replication, UNIQUE constraint, Foreign Keys.



Now, go read this article: https://blogs.msdn.microsoft.com/repltalk/2011/10/04/sql-server-transaction-replication-bounded-update-or-why-my-update-was-transformed-into-an-deleteinsert-pair/

I'll wait.



In the end, we figured out what had to be happening, then confirmed via trace.  Care to guess?





So our source table, when you update that unique constraint (AS PART OF A SET; doing a singleton update, or "update stores where ID IN (1,2)" doesn't do it, we tried), replication will push a BOUNDED UPDATE, aka sp_MSdel/sp_MSins.

data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAJ8AAABJCAIAAACsHmOFAAADWUlEQVR4nO2aUZaDIAxF3S7ryVrYB9vgrztwPlQIJIC2dqQ5737MqRZDyANkZt6yCl6vl7wJfpFF3oK6ZoC6loG6lllewC5Yu5aBupaBupaZVN3o3bLhfHwsi0C97t/LMcWM3n19bPOqS+GuYIGYAMVFFkgVqa/ulUZK8yvq1nmG4pplHyhdU7iqbn54q32g9HGN3rFh1i2rm8tAO0Vd9vDRTfTOeU+jkIGcOyrJPzfqm0pJRKMRrSfVVWJG75hMKaZSo848CMSyid4VcZy/oC7rhMUJtCwUygzUlsWiGSHUVRdz9G5hebRL4LynfW6njzl3EZMNTU6Eoob9jrsxo3cpfIrBgskaq0PLX1STIHrnzqsbq92g2t5YYLVlXZQ+Usx9Thd3i/G0qxzI+Vj9ELlmGav5H9fO2Lv98j7UmCL5IlbxWJ1nfk5GZk/IZDrqqvKIuqstP1W36Cy/DIoCNeJvcgbaa1Ooy+NSaE2YXvKn1NVjiuTb6tZ5KoP+aO02ttY868p3h3ZEeX9njlFbLOxVX771656PE0fanbf9MOQuUilZgVnGneRP7MxqTDV5dWdW80yRb3rvrsUrX2jKX4KyZX1zPN3rPVh/Jzii+lyiZF30ltVlOWrHp9ap6urOrMeM3pEXhyrtVNXIc/+qGPdHZ+Z/49xvRPIwWRSirtLXebZ3hXnVPbHI/+MPAj/NpOqCW4C6loG6loG6loE3wzJYu5aBupaBupaZVF14M25hXnXhzcjAmwFvBrwZ8GbAm1GHhDcD3ozOYODN4MCbAW8GvBnwZsCbwZlE3ROLHN6MAZOqC24B6loG6loG6loG3gzLYO1aBupaBupaZlJ1n/Zm9P5s/RnwZszizfgG8GY86c3IhWTxtpZHBu/9V2QPCm/Gk96M9Eylbqp2GtH5/QXeDNHZM94M1p1Yu/VXWkqNwcCbUbV+xpvBrofqaik1BgNvBuc5b0buZKCunlIjAXgzOE96M45UR+qqKTVTOE5K8GbAm3ET86p7YlXAmzFgUnXBLUBdy0Bdy0Bdy8CbYRmsXctAXctAXctMqi68Gbcwr7rwZmTgzYA3A94MeDPgzYA3g/dQDhLeDDEYeDM48GbAmwFvBrwZ8GZwJlH3xKqAN2PApOqCW/gDsoUojtVSPNYAAAAASUVORK5CYII=

Which normally wouldn't be a big deal.  

Except for that foreign key.  Which has a CASCADE DELETE.

So, to recap: user does an UPDATE on the source table.  Because they updated a field in the unique constraint, the delete becomes a BOUNDED UPDATE.  Then, replication pushes this update, via a DEL/INS pair, to the target server.  That DEL hits the foreign key's CASCADE DELETE, which then DELETEs the stores_orders row.  Voila!  We updated Table A, and Table B loses records!

Moral of the story?  Beware complications and interactions, and test even when you know everything matches.

Wednesday, April 13, 2016

SSISDB - making a more useful email (additional code so you don't need SQL Sentry)

As you may know by reading http://thebakingdba.blogspot.com/2012/11/sql-server-2012-ssis-getting-useful.html, I built a process a while ago that will scan the SSISDB history, when passed a job, and send out an email with the details of the most recent failure.  I love it and it means you get a useful email instead of a "go run the Integration Services Dashboard and start hunting" that it currently sends.

However, we run a LOT of SSIS jobs, multiple per minute, and so the query now takes over 30 seconds to return.  Which sucks, because SQL Sentry's "execute SQL" has a hard cut-off of 30 seconds.  So we stopped getting the useful emails.  I wanted to fix that.

First attempt: go look in sysjobhistory for failed jobs, and then walk through the jobs that failed and run my SSISDB code explicitly.

Enjoy!



Thursday, April 7, 2016

#BacktoBasics - a list of my basic precepts/tenets/rules

So, last December, Tim Ford (aka @SQLAgentMan) posted this challenge:

tim_challenge

We're now in April and I've "been meaning to" for 3 months now.  So let's get some #backtobasics.

Here's my personal list of rules.  My goal is to expound upon them, 1 post per month.  We'll see if I hit that.

Rule 0: Backup
Rule 1: UPDATE STATS
Rule 2: Data/Log ALWAYS separated
Rule 3: Complexity is the enemy.
Rule 4: Layers of Abstraction
Rule 5: Question Assumptions
Rule 6: NULL is NULL
Rule 7: Once, Adhoc. Twice, Automate.
Rule 8: If you have time, build the framework. (See #3)

Some of the stuff to cover, off the top of my head:
0: Ola, testing, tweaks, diff/full/log
1: Why it matters, why it works, when it doesn't, alternatives, flushing the cache
2: if you can't grow one or the other...
3: find some stories. The 5 hops for replication come to mind.
4: not just views. Synonyms, CNAMEs, views, etc.
5: "it works this way in SQL 200X" "well, SP1" "well, CU5 but not SP1 CU4"
6: Find some examples about getting burned by it.
7: pretty explanatory.  You get one whack at the pie for free.
8: contradicts 3, but reinforces 7. The EPR, partitioning, EN's blacklist.