Alphanumeric Sorting In MSSQL

I recently had a requirement to sort alphanumeric values in an MS SQL Server stored procedure. This was something that I was previously doing in C# code, but it was part of a much bigger operation that needed to be moved to a stored procedure for performance reasons. My first instinct was Google. But, after quite a bit of searching, writing and testing, I was beginning to lose faith. It seemed most folks didn’t quite need to sort the same way I did. Most of them just wanted to ignore the alpha portions of the strings. And those that didn’t, didn’t consider certain (perhaps more infrequent) cases (such as A 1 vs A 100 vs A 2).

I was trying to sort room numbers. And those room numbers could be like any of the following:

1
10
2
1 A
2 A
10 A
1 B
1C
A1
B1
A100
A2
A 3

The sorted result needed to look like this:

1
1 A
1 B
1C
2
2 A
10
10 A
A1
A2
A 3
A100
B1

I put this portion of the procedure aside for some time, to clear my head and accomplish some other tasks. When I went back to it, I was resolved to writing a CLR procedure for it. But, I figured I should hit Google one more time just in case. This time I got lucky. I found the exact thing I was looking for on an MS Access programming board. I adapted it for SQL, and it works just as I need. The adapted code follows (written and optimized specifically for 50 character or less strings — which is all my case would ever need). Hopefully, this will save you the time it cost me.

CREATE FUNCTION [dbo].[fn_CreateAlphanumericSortValue]
(
	@ItemToSort varchar(50)
)
RETURNS varchar(100)
AS
	--==========================================================================================
	-- This function takes an alphanumeric string and encodes it so that it can be properly sorted
	--    against other alphanumeric strings
	-- The encoding will insert a two digit string before each numeric portion of the item to sort
	--    The two digits represent the number of digits in the numeric portion that it will precede (zero-padded)
	-- The encoding will also account for leading zeros in each numeric portion by adding a two digit
	--    string at the end of the item to sort, for each numeric portion.  Those two digits will
	--    represent the number of leading zeros in the numeric portion (zero-padded)
	-- Examples:
	-- ABC1 =     ABC011 00
	-- ABC1ABC1 = ABC011ABC011 00
	-- ABC12    = ABC0212 00
	-- ABC012   = ABC0212 01
	--==========================================================================================
BEGIN
	declare @WorkingItem varchar(50) = @ItemToSort
	declare @DigitCount int = 0
	declare @LeadingZeroCount int = 0
	declare @CurrentNumber varchar(50) = ''
	declare @Leftmost varchar(1) = ''
	declare @LeadingZeroString varchar(50) = ''

	--==========================================================================================
	-- With 50 character input, the worst case output should be 100 characters
	--==========================================================================================
	declare @SortValue varchar(100) = ''	

	--==========================================================================================
	-- We will work thru the input string one character at a time
	--==========================================================================================
	while (len(@WorkingItem) > 0)
	begin
		select @Leftmost = left(@WorkingItem, 1)

		--==========================================================================================
		-- Is the first character a number?
		--==========================================================================================
		if (isnumeric(@Leftmost) = 1)
		begin
			while (isnumeric(@Leftmost) = 1)
			begin
				--==========================================================================================
				-- Parse out all of the consecutive digits to get the current number
				--==========================================================================================
				if (@Leftmost = '0' and @DigitCount = 0)
				begin
					--==========================================================================================
					-- Leading zero -- just count how many we have in this set of digits
					--    We'll add the string for it to the end of our output below
					--==========================================================================================
					select @LeadingZeroCount = @LeadingZeroCount + 1
				end
				else
				begin
					--==========================================================================================
					-- Not a leading zero, so increment the digit count, and remember the current number value
					--==========================================================================================
					select @DigitCount = @DigitCount + 1
					select @CurrentNumber = @CurrentNumber + @Leftmost
				end

				--==========================================================================================
				-- Trim off the character we just checked, get the next character to check and continue the inner loop
				--==========================================================================================
				select @WorkingItem = substring(@WorkingItem, 2, 50)
				select @Leftmost = left(@WorkingItem, 1)
			end -- while (isnumeric(@Leftmost) = 1)

			--==========================================================================================
			-- We now have the current number from our input string
			--    Add the current number's leading zero string to the entire leading zero string
			--==========================================================================================
			if (@LeadingZeroCount < 10)
				select @LeadingZeroString = @LeadingZeroString + '0' + cast(@LeadingZeroCount as varchar)
			else
				select @LeadingZeroString = @LeadingZeroString + cast(@LeadingZeroCount as varchar)

			--==========================================================================================
			-- Add the current number's sort code, along with the current number, to the returned sort value
			--==========================================================================================
			if (@DigitCount < 10)
				select @SortValue = @SortValue + '0' + cast(@DigitCount as varchar) + @CurrentNumber
			else
				select @SortValue = @SortValue + cast(@DigitCount as varchar) + @CurrentNumber

			--==========================================================================================
			-- Reset for the next iteration
			--==========================================================================================
			select @DigitCount = 0
			select @CurrentNumber = ''
			select @LeadingZeroCount = 0
		end -- if (isnumeric(@Leftmost) = 1)

		--==========================================================================================
		-- The character we are currently working with is not a number, just tag it onto our return value
		--    Ignoring whitespace
		--==========================================================================================
		if (@Leftmost != ' ')
			select @SortValue = @SortValue + @Leftmost

		--==========================================================================================
		-- Trim off the character we just checked and continue the main loop
		--==========================================================================================
		select @WorkingItem = substring(@WorkingItem, 2, 50)

	end -- while (len(@WorkingItem) > 0)

	--==========================================================================================
	-- Finally, tag on the leading zero value and return our sort value
	--==========================================================================================
	select @SortValue = @SortValue +  ' ' + @LeadingZeroString

	return @SortValue
END


Comments are closed.

Rss Feed