Using regular expression within a stored procedure - sql

What is the regular expression pattern that matches any input other than letters? So far this is what I have:
CREATE PROCEDURE Paging_Movies
    @alphaChar char(1)
AS
IF @alphaChar = '#'
    SELECT * FROM Movies WHERE movies LIKE '[0-9]%'
ELSE
    SELECT * FROM Movies WHERE movies LIKE @alphaChar + '%'

If you want true regular expression pattern matching you will need to roll your own CLR UDF. This link goes over how to do that:
http://msdn.microsoft.com/en-us/magazine/cc163473.aspx
Keep in mind that you can only do this in SQL Server 2005 or higher.
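If you just want values that start with a non-alphabetic character, the character-class syntax of LIKE is enough (LIKE has no grouping, so parentheses would be matched literally):
LIKE '[^a-z]%'
Here is the documentation for SQL Server LIKE: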
http://msdn.microsoft.com/en-us/library/ms179859.aspx
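
To see the branching outside the database, here is a minimal Python sketch of the same paging idea. The function name filter_titles and the sample titles are hypothetical, and the regex is only an analogue of the LIKE pattern '[^a-z]%', not something SQL Server itself runs.

```python
import re

def filter_titles(titles, alpha_char):
    """Return the titles that belong to the bucket named by alpha_char.

    '#' selects titles whose first character is not a letter;
    any other value selects titles starting with that character
    (case-insensitively).
    """
    if alpha_char == "#":
        # Analogue of LIKE '[^a-z]%': first character is not a letter.
        pattern = re.compile(r"^[^a-zA-Z]")
    else:
        # Analogue of LIKE @alphaChar + '%'; escape in case the
        # character is special in regex syntax.
        pattern = re.compile("^" + re.escape(alpha_char), re.IGNORECASE)
    return [t for t in titles if pattern.search(t)]

titles = ["12 Angry Men", "Alien", "amelie", "(500) Days of Summer", "Brazil"]
print(filter_titles(titles, "#"))
print(filter_titles(titles, "a"))
```

Note that this "#" bucket is wider than the '[0-9]%' pattern in the question: it also picks up titles that start with punctuation.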

SQL Server 2008 R2 has some regular expression functions built-in.
Here's a link explaining how to extract them and use them in your own database.

Related

Using Regex to determine what kind of SQL statement a row is from a list?

I have a large list of SQL commands such as
SELECT * FROM TEST_TABLE
INSERT .....
UPDATE .....
SELECT * FROM ....
etc. My goal is to parse this list into a set of results so that I can easily determine a good count of how many of these statements are SELECT statements, how many are UPDATES, etc.
so I would be looking at a result set such as
SELECT 2
INSERT 1
UPDATE 1
...
I figured I could do this with regex, but I'm a bit lost beyond simply looking at every string and comparing it against 'SELECT' as a prefix, which can run into multiple issues. Is there any other way to do this using regex?
You can add the SQL statements to a table and run them through a SQL query. If the SQL text is in a column called SQL_TEXT, you can get the SQL command type using this:
upper(regexp_substr(trim(regexp_replace(SQL_TEXT, '\\s', ' ')),
'^([\\w\\-]+)')) as COMMAND_TYPE
You'll need to do some cleanup to create a column that indicates the type of statement you have. The rest is just basic aggregation:
with cte as
(select *, trim(lower(split_part(regexp_replace(col, '\\s', ' '),' ',1))) as statement
from t)
select statement, count(*) as freq
from cte
group by statement;
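
If the list lives in a file or in application code rather than in a table, the same first-keyword idea can be sketched in Python. This is a rough illustration only: the function name count_statement_types and the sample statements are made up, and statements that begin with a comment or a WITH clause would be counted under whatever their first word happens to be.

```python
import re
from collections import Counter

def count_statement_types(statements):
    """Count statements by their first keyword, case-insensitively."""
    counts = Counter()
    for sql in statements:
        # First run of word characters after any leading whitespace.
        match = re.match(r"\s*(\w+)", sql)
        if match:
            counts[match.group(1).upper()] += 1
    return counts

statements = [
    "SELECT * FROM TEST_TABLE",
    "INSERT INTO t VALUES (1)",
    "  update t set x = 1",
    "select * from other_table",
]
for keyword, freq in count_statement_types(statements).most_common():
    print(keyword, freq)
```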
SQL is a language and needs a parser to turn it from text into a structure. Regular expressions can only do part of the work (such as lexing).
Regular Expression Vs. String Parsing
You will have to limit your ambition if you want to restrict yourself to using regular expressions.
Still you can get some distance if you so want. A quick search found this random example of tokenizing MySQL SQL statements using regex https://swanhart.livejournal.com/130191.html

SQL Server 2008 XML Attribute search is case sensitive

I store user preferences in an XML column which looks like this:
<tags>
<user>
<tag name="AB"/>
</user>
</tags>
When I use the query below,
select *
from company
where CAST(tags.query('tags/user/tag[fn:contains(@name,"Ab")]') as varchar(2000) ) <> ''
it does not return any results, because the search value is in a different case than the one stored in the XML column.
Any ideas on making the attribute search case insensitive?
Thanks
With SQL Server 2008 you can make use of the lower-case and upper-case functions like so:
select * from company where CAST(tags.query('tags/user/tag[fn:contains(lower-case(@name),"ab")]') as varchar(2000) ) <>''
see:
New XQuery functions introduced in SQL Server 2008: upper-case() and lower-case()
lower-case Function (XQuery) (MSDN)
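
The underlying trick is simply to fold both sides of the comparison to the same case. A small Python sketch of that idea, outside SQL Server, is shown here; the helper name has_tag_containing is hypothetical and the document is a trimmed copy of the XML from the question.

```python
import xml.etree.ElementTree as ET

def has_tag_containing(xml_text, needle):
    """True if any tags/user/tag has a name attribute containing needle,
    ignoring case on both sides."""
    root = ET.fromstring(xml_text)  # root element is <tags>
    needle = needle.lower()
    return any(
        needle in tag.get("name", "").lower()
        for tag in root.findall("./user/tag")
    )

doc = '<tags><user><tag name="AB"/></user></tags>'
print(has_tag_containing(doc, "Ab"))
```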

Use LIKE in T-SQL to search for words separated by an unknown number of spaces

I have this query:
select * from table where column like '%firstword[something]secondword[something]thirdword%'
What do I replace [something] with to match an unknown number of spaces?
Edited to add: % will not work as it matches any character, not just spaces.
Perhaps somewhat optimistically assuming "unknown number" includes zero.
select *
from table where
REPLACE(column_name,' ','') like '%firstwordsecondwordthirdword%'
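
A quick way to check what the REPLACE approach does and does not match is to run it against a few sample rows. The sketch below uses Python's sqlite3 module purely as a stand-in for SQL Server (an assumption for illustration; the table name t and the rows are made up). It shows one side effect worth knowing: because every space is stripped, a space in the middle of a word is ignored too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (column_name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [
        ("firstword secondword   thirdword",),  # variable spacing: matches
        ("firstwordsecondwordthirdword",),      # zero spaces: matches
        ("first word secondword thirdword",),   # space inside a word: also matches
        ("firstword other thirdword",),         # different middle word: no match
    ],
)

# Same predicate as the answer above: strip spaces, then compare.
rows = [
    r[0]
    for r in conn.execute(
        "SELECT column_name FROM t "
        "WHERE REPLACE(column_name, ' ', '') "
        "LIKE '%firstwordsecondwordthirdword%' "
        "ORDER BY rowid"
    )
]
for row in rows:
    print(row)
```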
The following may help: http://blogs.msdn.com/b/sqlclr/archive/2005/06/29/regex.aspx
as it describes using regular expressions in SQL queries in SQL Server 2005
I would definitely suggest cleaning the input data instead, but this example may work when you call it as a function from the SELECT statement. Note that this will potentially be very expensive.
http://www.bigresource.com/MS_SQL-Replacing-multiple-spaces-with-a-single-space-9llmmF81.html

Make an SQL request more efficient and tidy?

I have the following SQL query:
SELECT Phrases.*
FROM Phrases
WHERE (((Phrases.phrase) Like "*ing aids*")
AND ((Phrases.phrase) Not Like "*getting*")
AND ((Phrases.phrase) Not Like "*contracting*"))
AND ((Phrases.phrase) Not Like "*preventing*"); //(etc.)
Now, if I were using RegEx, I might bunch all the Nots into one big (getting|contracting|preventing), but I'm not sure how to do this in SQL.
Is there a way to render this query more legibly/elegantly?
Just by removing redundant stuff and using a consistent naming convention your SQL looks way cooler:
SELECT *
FROM phrases
WHERE phrase LIKE '%ing aids%'
AND phrase NOT LIKE '%getting%'
AND phrase NOT LIKE '%contracting%'
AND phrase NOT LIKE '%preventing%'
You talk about regular expressions. Some DBMS do have it: MySQL, Oracle... However, the choice of either syntax should take into account the execution plan of the query: "how quick it is" rather than "how nice it looks".
With MySQL, you're able to use regular expression where-clause parameters:
SELECT something FROM table WHERE column REGEXP 'regexp'
So if that's what you're using, you could write a regular expression string that is possibly a bit more compact than your four LIKE criteria. It may not be as easy for other people to see what the query is doing, however.
It looks like SQL Server offers a similar feature.
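
For comparison, this is roughly what the "one big alternation" from the question looks like as an actual regular expression. It is a Python sketch, not database code; the names include, exclude, and keep, and the sample phrases, are hypothetical.

```python
import re

# One positive pattern and one alternation replacing the three NOT LIKEs.
include = re.compile(r"ing aids")
exclude = re.compile(r"getting|contracting|preventing")

def keep(phrase):
    """True if the phrase contains 'ing aids' and none of the excluded words."""
    return bool(include.search(phrase)) and not exclude.search(phrase)

phrases = [
    "hearing aids",
    "getting hearing aids",
    "contracting aids",
    "preventing aids",
    "band aid",
]
kept = [p for p in phrases if keep(p)]
print(kept)
```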
Since it sounds like you're building this as you go to mine your data, here's something that you could consider:
CREATE TABLE Includes (phrase VARCHAR(50) NOT NULL)
CREATE TABLE Excludes (phrase VARCHAR(50) NOT NULL)
INSERT INTO Includes VALUES ('%ing aids%')
INSERT INTO Excludes VALUES ('%getting%')
INSERT INTO Excludes VALUES ('%contracting%')
INSERT INTO Excludes VALUES ('%preventing%')
SELECT
*
FROM
Phrases P
WHERE
EXISTS (SELECT * FROM Includes I WHERE P.phrase LIKE I.phrase) AND
NOT EXISTS (SELECT * FROM Excludes E WHERE P.phrase LIKE E.phrase)
You are then always just running the same query and you can simply change what's in the Includes and Excludes tables to refine your searches.
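
To try the table-driven idea without touching a real server, here is a small sketch using Python's sqlite3 module as a stand-in database (an assumption for illustration; the sample rows in Phrases and the helper name search are made up). The query is the one above, unchanged in shape, and only the contents of Excludes change between the two runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Phrases  (phrase VARCHAR(50) NOT NULL);
    CREATE TABLE Includes (phrase VARCHAR(50) NOT NULL);
    CREATE TABLE Excludes (phrase VARCHAR(50) NOT NULL);
    INSERT INTO Phrases  VALUES ('hearing aids');
    INSERT INTO Phrases  VALUES ('getting hearing aids');
    INSERT INTO Phrases  VALUES ('preventing aids');
    INSERT INTO Includes VALUES ('%ing aids%');
    INSERT INTO Excludes VALUES ('%getting%');
    INSERT INTO Excludes VALUES ('%contracting%');
    INSERT INTO Excludes VALUES ('%preventing%');
""")

QUERY = """
    SELECT P.phrase
    FROM Phrases P
    WHERE EXISTS (SELECT 1 FROM Includes I WHERE P.phrase LIKE I.phrase)
      AND NOT EXISTS (SELECT 1 FROM Excludes E WHERE P.phrase LIKE E.phrase)
"""

def search(connection):
    """Run the fixed query and return the matching phrases."""
    return [row[0] for row in connection.execute(QUERY)]

before = search(conn)
# Refine the search by editing data, not the query text.
conn.execute("INSERT INTO Excludes VALUES ('%hearing%')")
after = search(conn)
print(before, after)
```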
Depending on which database server you are using, it may support regular expressions itself. For example, Google searches show that Oracle and MySQL support them natively, and that SQL Server can do so through CLR functions.
You could push all your negative criteria into a short-circuiting CASE expression (works in SQL Server; not sure about MS Access).
SELECT *
FROM phrases
WHERE phrase LIKE '%ing aids%'
AND CASE
WHEN phrase LIKE '%getting%' THEN 2
WHEN phrase LIKE '%contracting%' THEN 2
WHEN phrase LIKE '%preventing%' THEN 2
ELSE 1
END = 1
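On the "more efficient" side, you need to find some criteria that allow you to avoid reading the entire Phrases column. A pattern with a leading wildcard ('%text%') cannot use an index seek, so every row has to be scanned. A pattern with only a trailing wildcard ('text%') can use an index on the column.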

CONTAINSTABLE and CONTAINS, which string to pass to match all records?

We have a Single Statement FUNCTION in SQL Server 2005 which uses CONTAINSTABLE().
All works fine when we pass a non empty search string. Is there a wildcard string we can pass to CONTAINSTABLE() so that it matches all records in a table.
Kind regards,
You have to use logic within the stored procedure to run a SQL statement without the CONTAINSTABLE predicate if there isn't a full text phrase to search by.
I don't think there is; you'd have to do something like this (pseudocode):
IF @searchterm='*'
SELECT * FROM YOURTABLE
ELSE
SELECT * FROM YOURTABLE INNER JOIN CONTAINSTABLE etc
END IF