Regular Expressions are very powerful when working with text, and in my case when detecting blog spam. I didn't feel like modifying .TEXT so I modified the DB to match comments against a list of "bad words". I replaced MT Blacklist with regular expressions after it went offline
The user function below matches a string against a regular expression. It works in all flavors of SQL Server that I have tried it on (SQL Server 2000 MSDE, SQL Server 2005 standard, Express and Express with Advanced Services). It has caught more than 12.800 spam comments since it went online :-) But more on the anti spam later, this is the function:
CREATE FUNCTION evalRegEx
(
@source varchar(5000),
@regexp varchar(1000),
@ignorecase bit = 0
)
RETURNS bit
AS
BEGIN
DECLARE @hr integer
DECLARE @objRegExp integer
DECLARE @objMatches integer
DECLARE @objMatch integer
DECLARE @count integer
DECLARE @results bit
SET @results = 0
EXEC @hr = sp_OACreate 'VBScript.RegExp', @objRegExp OUTPUT
IF @hr = 0 BEGIN
EXEC @hr = sp_OASetProperty @objRegExp, 'Pattern', @regexp
IF @hr = 0 BEGIN
EXEC @hr = sp_OASetProperty @objRegExp, 'Global', false
IF @hr = 0 BEGIN
EXEC @hr = sp_OASetProperty @objRegExp, 'IgnoreCase', @ignorecase
IF @hr = 0 BEGIN
EXEC @hr = sp_OAMethod @objRegExp, 'Test', @results OUTPUT, @source
END
END
END
EXEC @hr = sp_OADestroy @objRegExp
END
RETURN @results
END
Example calls:
- SELECT dbo.evalRegEx( 'this is a test', 'IS', 0)
Returns 0
- SELECT dbo.evalRegEx( 'this is a test', 'IS', 1)
- Returns 1
Feel free to drop a few cents in the
tip jar if this post saved you time and money