We are in process of doing a migration from an ancient creaky server
to a shiny new VM. Rather than just rebuild it and restore everything,
we're taking the (painful) opportunity to clean things up and improve
several systems.
As part of this, we're replicating
data from the old server to the new server, so that we can migrate
processes piecemeal, so that rollback is not "OH CRAP TURN IT OFF TURN
IT OFF ROLL BACK TO THE OLD SERVER".
But we ran into a
weird problem. On the target server, we had a many-to-many table that
sits between, let's say, stores and orders. We have a stores table, we
have an orders table, and this one (call it STORE_ORDERS for simplicity)
is just a linking table between the two. ID, stores_id, orders_id.
Everything scripted identically between the two databases (aside from
the NOT FOR REPLICATION flag).
We found, while
double-checking everything, that there were records missing from the
stores_orders table on our new box, that were just fine on the source
table. What. The. Hell. After looking for a few, we determined that it
had to be related to one of our users doing a bunch of changes.... but
just to the stores table.
What could it be? They're updating the END_ASOF field in STORES, which is part of a unique constraint
CREATE UNIQUE NONCLUSTERED INDEX [STORE_NUMBER_ENDASOF_IDX] ON [dbo].[STORE] ([STORE_NUMBER], [END_ASOF])
The table affected does have a FOREIGN KEY constraint on STORES. Because we're trying to migrate over, the DDL is the same.
ALTER TABLE [dbo].[STORES_ORDERS]
ADD CONSTRAINT [FK_STORES_ORDERS_STORES] FOREIGN KEY ([STORES_ID]) REFERENCES [dbo].[STORES] ([ID]) ON DELETE CASCADE
And just to make it more fun, there are some AFTER UPDATE triggers involved. (But those are unrelated).
Any ideas? Replication, UNIQUE constraint, Foreign Keys.
Now, go read this article: https://blogs.msdn.microsoft.com/repltalk/2011/10/04/sql-server-transaction-replication-bounded-update-or-why-my-update-was-transformed-into-an-deleteinsert-pair/
I'll wait.
In the end, we figured out what had to be happening, then confirmed via trace. Care to guess?
So
our source table, when you update that unique constraint (AS PART OF A
SET; doing a singleton update, or "update stores where ID IN (1,2)"
doesn't do it, we tried), replication will push a BOUNDED UPDATE, aka
sp_MSdel/sp_MSins.
Which normally wouldn't be a big deal.
Except for that foreign key. Which has a CASCADE DELETE.
So,
to recap: user does an UPDATE on the source table. Because they
updated a field in the unique constraint, the delete becomes a BOUNDED
UPDATE. Then, replication pushes this update, via a DEL/INS pair, to
the target server. That DEL hits the foreign key's CASCADE DELETE,
which then DELETEs the stores_orders row. Voila! We updated Table A,
and Table B loses records!
Moral of the story? Beware complications and interactions, and test even when you know everything matches.
Friday, April 22, 2016
Wednesday, April 13, 2016
SSISDB - making a more useful email (additional code so you don't need SQL Sentry)
As you may know by reading http://thebakingdba.blogspot.com/2012/11/sql-server-2012-ssis-getting-useful.html, I built a process a while ago that will scan the SSISDB history, when passed a job, and send out an email with the details of the most recent failure. I love it and it means you get a useful email instead of a "go run the Integration Services Dashboard and start hunting" that it currently sends.
However, we run a LOT of SSIS jobs, multiple per minute, and so the query now takes over 30 seconds to return. Which sucks, because SQL Sentry's "execute SQL" has a hard cut-off of 30 seconds. So we stopped getting the useful emails. I wanted to fix that.
First attempt: go look in sysjobhistory for failed jobs, and then walk through the jobs that failed and run my SSISDB code explicitly.
Enjoy!
However, we run a LOT of SSIS jobs, multiple per minute, and so the query now takes over 30 seconds to return. Which sucks, because SQL Sentry's "execute SQL" has a hard cut-off of 30 seconds. So we stopped getting the useful emails. I wanted to fix that.
First attempt: go look in sysjobhistory for failed jobs, and then walk through the jobs that failed and run my SSISDB code explicitly.
Enjoy!
Thursday, April 7, 2016
#BacktoBasics - a list of my basic precepts/tenets/rules
So, last December, Tim Ford (aka @SQLAgentMan) posted this challenge:
We're now in April and I've "been meaning to" for 3 months now. So let's get some #backtobasics.
Here's my personal list of rules. My goal is to expound upon them, 1 post per month. We'll see if I hit that.
Rule 0: Backup
Rule 1: UPDATE STATS
Rule 2: Data/Log ALWAYS separated
Rule 3: Complexity is the enemy.
Rule 4: Layers of Abstraction
Rule 5: Question Assumptions
Rule 6: NULL is NULL
Rule 7: Once, Adhoc. Twice, Automate.
Rule 8: If you have time, build the framework. (See #3)
Some of the stuff to cover, off the top of my head:
0: Ola, testing, tweaks, diff/full/log
1: Why it matters, why it works, when it doesn't, alternatives, flushing the cache
2: if you can't grow one or the other...
3: find some stories. The 5 hops for replication come to mind.
4: not just views. Synonyms, CNAMEs, views, etc.
5: "it works this way in SQL 200X" "well, SP1" "well, CU5 but not SP1 CU4"
6: Find some examples about getting burned by it.
7: pretty explanatory. You get one whack at the pie for free.
8: contradicts 3, but reinforces 7. The EPR, partitioning, EN's blacklist.
We're now in April and I've "been meaning to" for 3 months now. So let's get some #backtobasics.
Here's my personal list of rules. My goal is to expound upon them, 1 post per month. We'll see if I hit that.
Rule 0: Backup
Rule 1: UPDATE STATS
Rule 2: Data/Log ALWAYS separated
Rule 3: Complexity is the enemy.
Rule 4: Layers of Abstraction
Rule 5: Question Assumptions
Rule 6: NULL is NULL
Rule 7: Once, Adhoc. Twice, Automate.
Rule 8: If you have time, build the framework. (See #3)
Some of the stuff to cover, off the top of my head:
0: Ola, testing, tweaks, diff/full/log
1: Why it matters, why it works, when it doesn't, alternatives, flushing the cache
2: if you can't grow one or the other...
3: find some stories. The 5 hops for replication come to mind.
4: not just views. Synonyms, CNAMEs, views, etc.
5: "it works this way in SQL 200X" "well, SP1" "well, CU5 but not SP1 CU4"
6: Find some examples about getting burned by it.
7: pretty explanatory. You get one whack at the pie for free.
8: contradicts 3, but reinforces 7. The EPR, partitioning, EN's blacklist.
Subscribe to:
Posts (Atom)