Regular Expression Pattern for Search in SQL - sql

I want to search a table which has file name(s) with a {Numerical Pattern String}.PDF.
Example: 1.PDF, 12.PDF, 123.PDF, 1234.PDF, etc.
select * from web_pub_subfile where file_name like '[0-9]%[^a-z].pdf'
But the above query also returns files like these:
1801350 Ortho.pdf
699413.processing2.pdf
15-NOE-301.pdf
Could anyone help me see what I am missing here?

One way to do it is getting the substring before the file extension and checking if it is numeric. This solution only works well if there is only one . character in the file name.
select * from web_pub_subfile
where isnumeric(left(file_name,charindex('.',file_name)-1)) = 1
Note:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
To handle file names with multiple . characters, provided there is always a .filetype extension, use
select * from web_pub_subfile
where isnumeric(left(file_name,len(file_name)-charindex('.',reverse(file_name)))) = 1
and charindex('.',file_name) > 0
As suggested by @Blorgbeard in the comments, to avoid the use of isnumeric, use
select * from web_pub_subfile
where left(file_name,len(file_name)-charindex('.',reverse(file_name))) NOT LIKE '%[^0-9]%'
and len(left(file_name,len(file_name)-charindex('.',reverse(file_name)))) > 0
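The same check can be sketched outside the database to make the logic easier to see. This is only an illustration in Python, not part of the original answer: the function name `has_numeric_stem` and the sample list are assumptions, and the sketch includes the "must contain a dot" guard from the earlier variant.

```python
DIGITS = set("0123456789")

def has_numeric_stem(file_name: str) -> bool:
    """Return True when everything before the last '.' is one or more ASCII digits."""
    stem, dot, _extension = file_name.rpartition(".")
    if not dot:
        # No '.' at all, so there is no extension to strip.
        return False
    # Non-empty stem with no character outside 0-9,
    # like NOT LIKE '%[^0-9]%' combined with LEN(...) > 0.
    return stem != "" and all(ch in DIGITS for ch in stem)

names = [
    "1.pdf",
    "1234.pdf",
    "1801350 Ortho.pdf",
    "699413.processing2.pdf",
    "15-NOE-301.pdf",
    ".pdf",
]
matches = [name for name in names if has_numeric_stem(name)]
print(matches)
```

Like the SQL it mirrors, this does not check that the extension is actually pdf; it only checks the part before the last dot.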

You can't really do what you are trying to do using plain, out-of-the-box SQL. The reason you are seeing those results is that the % character matches any character, any number of times. It's not like * in a regex, which matches the previous character 0 or more times.
Your best option would probably be to create some CLR functions that implement regex functionality on the SQL Server side. You can take a look at this link to find a good place to start.
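To see the difference between % and a regex quantifier, the LIKE pattern from the question can be translated by hand into a regular expression and compared with what was probably intended. This is a rough Python sketch, not SQL Server behavior verified against a real instance; it assumes a case-insensitive collation, and the function names are made up for illustration.

```python
import re

# Hand-translated equivalent of LIKE '[0-9]%[^a-z].pdf':
# one digit, then anything at all, then one non-letter, then ".pdf".
like_equivalent = re.compile(r"[0-9].*[^a-z]\.pdf", re.DOTALL)

# What the question asks for: digits only, then the extension.
intended = re.compile(r"[0-9]+\.pdf")

def matches_like(name: str) -> bool:
    # Lower-casing stands in for a case-insensitive collation (an assumption).
    return like_equivalent.fullmatch(name.lower()) is not None

def matches_intended(name: str) -> bool:
    return intended.fullmatch(name.lower()) is not None

for name in ["1234.pdf", "699413.processing2.pdf", "15-NOE-301.pdf", "1.pdf"]:
    print(name, matches_like(name), matches_intended(name))
```

Because % swallows any run of characters, the first pattern accepts 699413.processing2.pdf and 15-NOE-301.pdf, and it rejects 1.pdf because it needs at least two characters before the extension.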

If you are on SQL Server 2012 or later, you could use Try_Convert():
select * from web_pub_subfile where Try_Convert(int,replace(file_name,'.pdf',''))>0
Declare @web_pub_subfile table (file_name varchar(100))
Insert Into @web_pub_subfile values
('1801350 Ortho.pdf'),
('699413.processing2.pdf'),
('15-NOE-301.pdf'),
('1.pdf'),
('1234.pdf')
select * from @web_pub_subfile where Try_Convert(int,replace(file_name,'.pdf',''))>0
Returns
file_name
1.pdf
1234.pdf

Related

How to run a SELECT query with LIKE on thousands of rows

Newbie here. I've been searching for hours now but I can't seem to find the correct answer or properly phrase my search.
I have thousands of rows (order IDs) that I want to put in an IN clause, but I have to run a LIKE on these values at the same time, since the column contains JSON and there is no dedicated table that holds only the order_id value. I am running the query in BigQuery.
Sample Input:
ORD12345
ORD54376
Table I'm trying to Query: transactions_table
Query:
SELECT order_id, transaction_uuid,client_name
FROM transactions_table
WHERE JSON_VALUE(transactions_table,'$.ordernum') LIKE IN ('%ORD12345%','%ORD54376%')
Just doesn't work especially if I have thousands of rows.
Also, how do I add the order id that I am querying so that it appears under an order_id column in the query result?
Desired Output:
Option one
WITH transf as (Select order_id, transaction_uuid,client_name , JSON_VALUE(transactions_table,'$.ordernum') as o_num from transactions_table)
Select * from transf where o_num like '%ORD12345%' or o_num like '%ORD54376%'
Option two
Split o_num using "-" as the separator, create a table of orders such as
(select 'ORD12345' as num
union all
select 'ORD54376' as num)
and inner join it with transf.o_num.
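The join idea in option two can be sketched as follows. This is a hedged illustration only: it uses Python's built-in sqlite3 as a stand-in for BigQuery, the column name `payload`, the table `wanted_orders`, and the helper `json_ordernum` (standing in for JSON_VALUE) are all assumptions, and the sample rows are invented. It joins with a LIKE condition instead of splitting on "-", and it also shows one way to get the searched order id into the result.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical helper standing in for JSON_VALUE(payload, '$.ordernum').
conn.create_function("json_ordernum", 1, lambda doc: json.loads(doc).get("ordernum"))

conn.execute(
    "CREATE TABLE transactions_table (transaction_uuid TEXT, client_name TEXT, payload TEXT)"
)
conn.executemany(
    "INSERT INTO transactions_table VALUES (?, ?, ?)",
    [
        ("t-1", "Acme", json.dumps({"ordernum": "X-ORD12345-01"})),
        ("t-2", "Globex", json.dumps({"ordernum": "ORD54376"})),
        ("t-3", "Initech", json.dumps({"ordernum": "ORD99999"})),
    ],
)

# One row per order id to look for; this table can hold thousands of ids.
conn.execute("CREATE TABLE wanted_orders (order_id TEXT)")
conn.executemany("INSERT INTO wanted_orders VALUES (?)", [("ORD12345",), ("ORD54376",)])

# The join condition does the LIKE; selecting w.order_id puts the searched id in the output.
rows = conn.execute(
    """
    SELECT w.order_id, t.transaction_uuid, t.client_name
    FROM transactions_table AS t
    JOIN wanted_orders AS w
      ON json_ordernum(t.payload) LIKE '%' || w.order_id || '%'
    ORDER BY t.transaction_uuid
    """
).fetchall()
print(rows)
```

In BigQuery the same shape would use its own string concatenation and JSON functions; the point here is only that the list of ids lives in a table rather than in a long chain of OR conditions.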
One method uses OR:
WHERE JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD12345%' OR
JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD54376%'
An alternative method uses regular expressions:
WHERE REGEXP_CONTAINS(JSON_VALUE(transactions_table, '$.ordernum'), 'ORD12345|ORD54376')
According to the documentation, the LIKE operator works as follows:
Checks if the STRING in the first operand X matches a pattern
specified by the second operand Y. Expressions can contain these
characters:
A percent sign "%" matches any number of characters or
bytes.
An underscore "_" matches a single character or byte.
You can escape "\", "_", or "%" using two backslashes. For example, "\\%". If
you are using raw strings, only a single backslash is required. For
example, r"\%".
Thus, the syntax would look like the following:
SELECT
order_id,
transaction_uuid,
client_name
FROM
transactions_table
WHERE
JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD12345%'
OR JSON_VALUE(transactions_table, '$.ordernum') LIKE '%ORD54376%'
Notice that we specify two conditions connected with the OR logical operator.
As a bonus, when querying large datasets it is good practice to select only the columns you need in your output (either in a temp table or in the final view) instead of using *. BigQuery is columnar, which is one of the reasons it is fast.
As an alternative for using LIKE, you can use REGEXP_CONTAINS, according to the documentation:
Returns TRUE if value is a partial match for the regular expression, regex.
Using the following syntax:
REGEXP_CONTAINS(value, regex)
However, it also works if you pass a plain STRING in single or double quotes instead of a regular expression. In addition, when you have more than one expression to search for, you can use the pipe operator (|) to OR the search terms together, as follows:
where regexp_contains(email,"gary|test")
I hope it helps.
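With thousands of order ids, the alternation string for REGEXP_CONTAINS would normally be generated rather than typed. Below is a small Python sketch of that step, using Python's `re` module to try the pattern locally; BigQuery uses a different regex engine, so treat this as an approximation. The function name `build_order_pattern` and the sample values are assumptions.

```python
import re

def build_order_pattern(order_ids):
    """Join order ids into one alternation, escaping any regex metacharacters."""
    return "|".join(re.escape(order_id) for order_id in order_ids)

order_ids = ["ORD12345", "ORD54376"]
pattern = build_order_pattern(order_ids)
print(pattern)

# Try the pattern locally against a few sample values (invented for illustration).
regex = re.compile(pattern)
values = ["X-ORD12345-01", "ORD54376", "ORD99999"]
hits = [value for value in values if regex.search(value)]
print(hits)
```

The resulting string can then be passed as the second argument of REGEXP_CONTAINS. Escaping matters once an id contains characters such as "." or "+".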

SQL : REGEX MATCH - Character followed by numbers inside quotes

I have a column in sql which holds value inside double quotes like "P1234567" , "P1234" etc..
I need to identify only the values which start with the letter P followed by exactly seven digits. I tried where column like '"P[0-9][0-9][0-9][0-9][0-9][0-9][0-9]"' but it doesn't seem to work.
Can someone please correct me or point me to a thread which can help me out?
Thanks
Standard SQL has no regex support, but most SQL engines have regex extensions added to them on top of the standard SQL. So, for example, if you're using MySQL then you'd do this:
... WHERE column REGEXP '^"P[0-9]{7}"'
And if you're using Postgres then that would be:
... WHERE column ~ '^"P[0-9]{7}"'
(updated to match the double-quote part of the question, I'd misunderstood that to begin with)
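The pattern can be tried out quickly before putting it in a query. This is a Python sketch with `re`, not MySQL or Postgres itself; the sample values are invented, and a trailing $ has been added here (it is not in the answer above) so that nothing may follow the closing quote.

```python
import re

# Opening quote, P, exactly seven digits, closing quote, end of string.
pattern = re.compile(r'^"P[0-9]{7}"$')

values = ['"P1234567"', '"P1234"', '"P12345678"', '"Q1234567"']
matching = [value for value in values if pattern.search(value)]
print(matching)
```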
How about using length and isnumeric:
Select
*
from
mytable
where
mycolumn like '"P%'
and len(mycolumn) = 10 --2 chars for quotes + 1 for 'P' + 7 for the digits
and isnumeric(substring(mycolumn, 3, 7))=1
This answer is for SQL Server; other DBMSs may use a different function for length.

Select query that displays Joined words separately, not using a function

I require a select query that adds a space to the data based on the placement of the capital letters, i.e. 'HelpMe' would be displayed as 'Help Me'. Note that I cannot use a stored function to do this; it must be done in the query itself. The data is of variable length and the query must be in SQL. Any help will be appreciated.
Thanks
You need to use a user-defined function for this until MS gives us support for regular expressions. The solution would be something like:
SELECT col1, dbo.RegExReplace(col1, '([A-Z])',' \1') FROM Table
Note that this would produce a leading space, which you can remove with LTRIM.
Replace regular expresion function:
http://connect.microsoft.com/SQLServer/feedback/details/378520
About dbo.RegexReplace you can read at:
TSQL Replace all non a-z/A-Z characters with an empty string
If you are using Oracle, you can use REGEXP_REPLACE:
SELECT REGEXP_REPLACE('ILikeToWatchCSIMiami',
'([A-Z.])', ' \1')
AS RX_REPLACE
FROM dual
;
This inserts a space before every capital letter (SQLFiddle demo), so it does not handle acronyms such as CSI well: each letter ends up separated.
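The acronym problem can be reproduced, and one possible workaround tried, in Python. This is a sketch using Python's `re` module rather than Oracle; the lookaround-based variant is an assumption of mine, not something from the answers above, and whether a given database's regex engine supports lookarounds would need checking.

```python
import re

text = "ILikeToWatchCSIMiami"

# Same idea as the REGEXP_REPLACE call: a space before every capital letter.
naive = re.sub(r"([A-Z])", r" \1", text).lstrip()

# Alternative: insert a space only at word boundaries, that is
# between a lowercase letter and a capital, or
# between a capital and a capital that starts a new word (capital + lowercase).
boundary = re.compile(r"(?<=[a-z])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")
acronym_aware = boundary.sub(" ", text)

print(naive)
print(acronym_aware)
```

The first result splits CSI into single letters, while the second keeps it together.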

SQLite string contains other string query

How do I do this?
For example, if my column is "cats,dogs,birds" and I want to get any rows where column contains cats?
Using LIKE:
SELECT *
FROM TABLE
WHERE column LIKE '%cats%' --case-insensitive
While LIKE is suitable for this case, a more general-purpose solution is to use instr, which doesn't require characters in the search string to be escaped. Note: instr is available starting from SQLite 3.7.15.
SELECT *
FROM TABLE
WHERE instr(column, 'cats') > 0;
Also, keep in mind that LIKE is case-insensitive, whereas instr is case-sensitive.
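The case-sensitivity difference is easy to check with Python's built-in sqlite3 module. This is a minimal sketch; the table name `pets`, the column name `animals`, and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pets (animals TEXT)")
conn.executemany(
    "INSERT INTO pets VALUES (?)",
    [("cats,dogs,birds",), ("Cats,fish",), ("dogs,birds",)],
)

# LIKE ignores ASCII case by default, so it also finds 'Cats,fish'.
like_rows = [
    row[0]
    for row in conn.execute(
        "SELECT animals FROM pets WHERE animals LIKE '%cats%' ORDER BY animals"
    )
]

# instr compares exactly, so only the lowercase 'cats' row is found.
instr_rows = [
    row[0]
    for row in conn.execute(
        "SELECT animals FROM pets WHERE instr(animals, 'cats') > 0 ORDER BY animals"
    )
]

print(like_rows)
print(instr_rows)
```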

MS Access: searching a column for a star/asterisk

I'm looking for a way to search a column of string datatype which contains a * - the problem is that the star or asterisk is a reserved symbol. The following query doesn't work properly:
select * from users where instr(pattern,"*")
How can you write an Access query to search a column for an asterisk?
You can search for reserved characters in Access by using square brackets:
select * from users where pattern like "*[*]*"
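The bracket trick is not specific to Access; other glob-style matchers treat a one-character class the same way. As an analogy only (this is Python's `fnmatch`, not Access, and the sample values are invented), the same pattern finds strings containing a literal asterisk:

```python
from fnmatch import fnmatchcase

values = ["a*b", "plain", "*", "50% off"]

# "*[*]*" reads as: anything, then a literal asterisk, then anything.
with_star = [value for value in values if fnmatchcase(value, "*[*]*")]
print(with_star)
```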
yay, found it out by myself:
select * from users where instr(pattern,chr(42))
Just use
select * from users where instr(pattern,"*") > 0
From Access: Instr Function
In Access, the Instr function returns
the position of the first occurrence
of a string in another string.
Use the ALIKE operator, because its wildcard characters do not include *, e.g.
SELECT * FROM Users WHERE pattern ALIKE '%*%';
(Edit by DWF: see @onedayone's useful explanation of ALIKE)