Thursday, January 15, 2009

SQL Server, extracting UTF-16BE data from an image field

If like me you made a choice back in the mists of time to store UTF16-BE data in an image field in SQL Server, you may find the following functions useful.

As a bit of back-story, it seems that SQL Server stores data as UCS2 (roughly UTF16-LE) data, and the convert -> NVARCHAR approach does not recognise any form of binary data other than this :(

To that end the following code swaps the byte-order of a set of bytes stored in an 'image' field.

It will only work on SQL 2005 as I was far to lazy to sort out TEXTPTRs and TEXTREAD etc.

Hope you find it useful:


CREATE FUNCTION dbo.my_convert_image_to_text( @value AS VARBINARY(MAX) ) RETURNS nvarchar(MAX)
BEGIN
RETURN cast( @value as NVARCHAR(MAX) )
END
GO

CREATE FUNCTION dbo.swap_byte_order( @value AS VARBINARY(MAX) ) RETURNS VARBINARY(MAX)
BEGIN
DECLARE @b varbinary(MAX)
DECLARE @i bigint
DECLARE @count bigint
SET @i= 1
SET @b = CAST('' AS varbinary(1))
SET @count= datalength(@value)
WHILE (@i <= @count)
BEGIN
SET @b = @b + CAST( SUBSTRING(@value, @i+1, 1) as binary(1) )
+ CAST( SUBSTRING(@value, @i, 1) as binary(1) )
SET @i = @i + 2
END
RETURN @b
END
GO

CREATE FUNCTION dbo.image_as_string (@value AS VARBINARY(MAX) ) RETURNS nvarchar(MAX)
BEGIN
DECLARE @dataSize bigint
DECLARE @result nvarchar(MAX)

SET @dataSize= datalength(@value)
IF @dataSize >2 AND @dataSize%2 = 0
BEGIN
SET @result= dbo.my_convert_image_to_text(dbo.swap_byte_order(@value))
END
ELSE
SET @result= dbo.my_convert_image_to_text(@value)
RETURN @result
END
GO

select dbo.test_image_as_string(someImageColumn) from someTable