Part 1: Cleaning up parentheses in a string Part 2: Replacing alternate operators Part 3: Splitting the search string Part 4: Wrapping it up

A few years ago I wanted to have a crack at improving one of our websites’ search by providing “search operator” functionality.  These are basically special words or characters (i.e. operators) which provide Boolean search capability; e.g. AND, OR, NOT.

I looked around but couldn’t find anything implemented using T/SQL.  There were a few .Net solutions , but I’m not a developer, and can only handle very basic .Net on a good day.  So I had a crack at writing my own. I started off with the Google cheat-sheet as my initial list of requirements, which was basically as follows:

  • Must support the operators AND, OR, NOT, and NEAR

  • Must allow for “exact-phrase” searches

  • Should respect parentheses (for precedence control)

  • Should be able to perform wildcard searches

  • Operators must be case-sensitive (i.e. OR != or)

To attack this problem, I decided to split it into multiple chunks, and write separate blocks of code to meet each requirements.  In this post I’m going to introduce the first piece of the puzzle (which is number 3 from the list above; handling parentheses).  What this means is that given a string containing parentheses, I need to output the same string with valid parentheses intact (and obviously invalid ones removed).

Below is the function. I won’t explain it line by line since its pretty well commented, and relatively simple anyway.

IF OBJECT_ID('[dbo].[fn_search_clause_parentheses_handler]') IS NULL
EXEC('CREATE FUNCTION [dbo].[fn_search_clause_parentheses_handler] () RETURNS INT AS BEGIN RETURN(0); END ');
GO
/***************************************************************************************************************************
This is a supporting function, which is called by the function [fn_search_clause_get].
It accepts a string input, and outputs the same string, but with any invalid parentheses removed, and all remaining
valid parentheses buffered (by a space on each side) so that they are split correctly by the calling function.
Author : David Curlewis
Date : 07/2009

This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License
http://creativecommons.org/licenses/by-nc/3.0/
****************************************************************************************************************************/
ALTER FUNCTION [dbo].[fn_search_clause_parentheses_handler] (
@search_text NVARCHAR(4000)
) RETURNS NVARCHAR(4000)
AS
BEGIN
-------------------------------------------------------------------------------------------------
-- PARENTHESES HANDLER
-------------------------------------------------------------------------------------------------
-- IF parentheses exist in the string
IF (PATINDEX('%[()]%', @search_text) > 0)
BEGIN
DECLARE @pos_o INT, @pos_x INT, @charpos INT;
DECLARE @tt_char_split TABLE (id SMALLINT IDENTITY(1,1), [char] VARCHAR(2) NOT NULL);
SET @charpos = 1;

-- split the string apart into a temp table
WHILE @charpos 0
SET @search_text = REPLACE(@search_text, ' ( ) ', '');
END

RETURN(@search_text);
END
GO

So, its a scalar function which accepts a string, and returns that same string, but with it’s parentheses tidied up and “buffered” (i.e. I wrap them in spaces to make it easier for the calling function to then split the string into its component parts).

Now, as I sit here watching the latest straight-to-dvd masterpiece that is “Barbie: A fashion fairy-tale ” (hey, its a Sunday afternoon and I have 3 daughters – you’d lose the battle too!), I’m having a hard time getting the code to post properly, so if you have any issues, flick me a note in the comments below and I’ll sort something else out.

Give it a try and let me know if you find any bugs (or make any improvements!). I’ll follow up soon with posts describing the rest of the process, but for now I’ve had enough of Barbie, and think I deserve a whisky. 😉

Cheers

DB Dave