Say you have, like most of us, a table with an "id INT IDENTITY PRIMARY KEY," so your PK is the ID, but the table also has a date column. Nobody cares about the date. Until they do, months later. So now you have a busy table that you can't easily add an index on, and an ADHOC OMGWTFBBQ project comes in, where you need to query a date range, and no idea how to get there quickly. .
*raises hand* Yup, that'd be me. So I wrote this.
What it does: cycle down through a table, using the ID column. Start by decrementing the max ID from the table by 10 million. Does that go below the date you need? Yes? Okay, try 1 million. Yes? 100k. No? How about 200k? Yes. 110k? No. 120k? Yes. 111k? No. 112k? Yes. 111100? (and so on).
It'll pop down even a terabyte table pretty quickly, since it only does a handful of queries, and they're all against the PK. Granted, I could make it faster by doing a (min+max)/2, then (newmin+max)/2 etc, but this works. I've nicknamed it "Zeno's Arrow", although technically my code doesn't go halfsies - but it is fast and direct.
Also, (importantly!) it may not cope with missing rows (which could happen due to stuff like, for instance, a failed transaction). I don't have a good fix for that, yet. Maybe in V2
Hope this helps.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
--table must have at least 10000 rows, so that it doesn't get lost in NULLs. I think. | |
DECLARE @table_name sysname, @date_column sysname, @sql NVARCHAR(4000), @lowest_date VARCHAR(20) | |
SET @table_name = 'mytablename' | |
SET @date_column = 'mydatefield' | |
SET @lowest_date = '20150415' | |
SELECT @sql = ' | |
DECLARE @tempdate DATETIME, @target_id bigint, @max_id bigint | |
SELECT @target_id = MAX(id) FROM ' + @table_name + ' with (NOLOCK) | |
set @max_id = @target_id | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 10000000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 10000000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 1000000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 1000000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 100000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 100000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 10000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 10000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 1000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 1000 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 100 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 100 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 10 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SELECT @target_id = @target_id + 10 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
WHILE @tempdate >''' + @lowest_date + ''' | |
BEGIN | |
SELECT @target_id = @target_id - 1 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
IF @@ROWCOUNT = 0 SET @tempdate = ''2000/01/01'' | |
END | |
SET @target_id = @target_id + 1 | |
SELECT @tempdate = ' + @date_column + ' FROM ' + @table_name + ' with (NOLOCK) WHERE id = @target_id | |
SELECT @max_id as max_id, @target_id as lowest_id_for_date, @tempdate as value_for_LowID' | |
EXEC dbo.sp_executesql @sql | |
1 comment:
Thanks for that, nice approximation technique. Please do an update post when you have a V2 version (gaps are always an issue)
Post a Comment