Wednesday, January 7, 2015

[BULK INSERT] importing a fixed-width file that has header rows ala SQL, but without a format file.

Needed this for today.  Basically someone saved out a query as fixed width, but we weren't able to import it (Excel, BIDS & Import/Export Wizard all failed).  So I decided to build something that could also do it.

A basic file would look like this.  Note that the dashes and field names line up.  So what we do is really basic but still effective - look for those singleton spaces, then use that to pull out the details from the header row. 

STORE_CITY                          STORE_STATE STORE_ZIP
----------------------------------- ----------- ---------
RICHMOND                            VA          23229
SPRINGFIELD                         MO          65807

The one down side is that I don't have time to tweak it to get it working with sp_executesql (since my particular query is over 8000 characters), so I print out each line, toss that into my SELECT, strip out the last comma and add the tablename. 

CREATE TABLE bulk_insert_raw 

    (id INT IDENTITY PRIMARY KEY, resultant VARCHAR(8000))
--view is needed in order to do a BULK INSERT with an inline IDENTITY - using staging table would NOT guarantee order.
CREATE VIEW bulk_insert_view 

SELECT resultant FROM bulk_insert_raw

BULK INSERT bulk_insert_view
   FROM '\\myserver\myfile.csv'
         FIELDTERMINATOR = '|',
         ROWTERMINATOR = '0x0a'  --if a UNIX-type file

DECLARE @min INT, @max INT, @header VARCHAR(8000), @dashes VARCHAR(8000), @last_dash INT, @last_field VARCHAR(200), @sql VARCHAR(max)
SET @min = 1

--The offset is because our particular data file had a bad first row. 

--We know line 1 and 2 are the header/dashes.
SET @header = (SELECT SUBSTRING(resultant,4,8000) FROM bulk_insert_raw WHERE id = 1)
SET @dashes = (SELECT resultant FROM bulk_insert_raw WHERE id = 2)

SET @max = LEN(@dashes) +2 --so we get the last field.
SET @last_dash = 1

SET @sql = 'select '
WHILE @min < @max
    IF (SELECT SUBSTRING(@dashes,@min,1)) IN (' ', '')
            SELECT @last_field = QUOTENAME(RTRIM(SUBSTRING(@header,@last_dash, (@min)-@last_dash    )))
            PRINT @last_field + CONCAT(' = substring(resultant,', @last_dash, ',', (@min)-@last_dash,')', ',')
            SET @last_dash = @min + 1

    SET @min = @min + 1

No comments: