SQL : REGEX MATCH - Character followed by numbers inside quotes - sql

I have a column in SQL which holds values inside double quotes, like "P1234567", "P1234", etc.
I need to identify only the rows whose value starts with the letter P and is followed by exactly seven digits. I tried where column like '"P[0-9][0-9][0-9][0-9][0-9][0-9][0-9]"' but it doesn't seem to work.
Can someone please correct me or point me to a thread which can help me out?
Thanks

Standard SQL has no regex support, but most SQL engines have regex extensions added to them on top of the standard SQL. So, for example, if you're using MySQL then you'd do this:
... WHERE column REGEXP '^"P[0-9]{7}"'
And if you're using Postgres then that would be:
... WHERE column ~ '^"P[0-9]{7}"'
(updated to match the double-quote part of the question, I'd misunderstood that to begin with)
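To sanity-check the pattern outside the database, here is a small Python sketch. Python's re module is used only for illustration and is not part of the answer; the sample values and the helper name is_p_code are made up, and a trailing $ is added on the assumption that nothing should follow the closing quote.

```python
import re

# Same pattern as in the answer, plus a trailing $ (assumption: nothing may
# follow the closing double quote).
P_CODE = re.compile(r'^"P[0-9]{7}"$')

def is_p_code(value: str) -> bool:
    """Return True if value is a double-quoted P followed by exactly seven digits."""
    return P_CODE.search(value) is not None

# Hypothetical sample values, modelled on the ones in the question.
samples = ['"P1234567"', '"P1234"', '"P12345678"', '"Q1234567"']
matches = [s for s in samples if is_p_code(s)]
print(matches)
```

Because the closing quote is part of the pattern, eight digits are rejected just like four.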

How about using length and isnumeric:
Select
*
from
mytable
where
mycolumn like '"P%'
and len(mycolumn) = 10 --2 chars for quotes + 1 for 'P' + 7 for the digits
and isnumeric(substring(mycolumn, 3, 7))=1
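This answer is for SQL Server; other DBMSs may use a different syntax for length. Note that isnumeric also returns 1 for some input that is not purely digits (signs, currency symbols), so a stricter check is substring(mycolumn, 3, 7) not like '%[^0-9]%'.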

Related

Impala Regex - find specific sequence of characters, with delimiters between them, some are not letters, digits or underscore

I am new to regex and need to search a string field in Impala for multiple matches of this exact sequence of characters: ~FC* followed by 11 more * that may or may not have letters/digits between them (they are basically delimiters in this string field). The 12th * (counting the one in ~FC* as #1) should be immediately followed by Y~.
Since the asterisks are not letters or digits, I am unsure how to search for these delimiters properly.
This is my SQL so far:
select
regexp_extract(col_name, '(~FC\\*).*(\\*Y~)', 1) as "pattern_found"
from db.table
where id = 123456789
limit 1
data returned:
pattern_found
--------------
~FC*
(~FC\\*) in Impala SQL returns ~FC*, which is great (got it from my other question).
I've been trying (~FC\\*).*(\\*Y~), which obviously isn't counting the number of asterisks, but it is also not picking up the Y.
This is a test string, it has 2 occurrences:
N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~
The results should be these 2, which share an overlapping ~ between them, but I will settle for at least the first being found if both cannot be.
~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y~
~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~
I figured out a solution, but I'm happy to learn of a better way to accomplish this.
This is what worked in Impala SQL; it needed parentheses and double-escaped backslashes for all the asterisks:
(~FC\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*Y)
Full SQL:
select
regexp_extract(col_name, '(~FC\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*[^\\*]*\\*Y)', 1) as "pattern_found"
from db.table
where id = 123456789
limit 1
and here is the RegexDemo without the additional syntax needed for Impala SQL
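For comparison, here is a more compact way to express the same idea, sketched in Python rather than Impala. It replaces the eleven hand-written repetitions with a counted group and uses a lookahead so both overlapping records are found. Whether Impala's regex engine accepts the counted group or the lookahead is not something this sketch verifies, so treat it as a Python-only illustration; the variable names are made up.

```python
import re

# (?:[^*]*\*){11} repeats "optional field, then asterisk" 11 times.
# (?=~) checks for the closing ~ without consuming it, so the next
# record can start on that same ~.
RECORD = re.compile(r'~FC\*(?:[^*]*\*){11}Y(?=~)')

# Test string from the question, split over two lines for readability.
text = ('N4*CITY*STATE*2155446*2120~FC*C*IND*30*MC*blah blah fjdgfeufh*27*0*****Y'
        '~FC*Z*IND*39*MC*jhlkfhfudfgsdkufgkusgfn*23*0*****Y~')

# Re-attach the ~ that the lookahead left out of each match.
records = [m + '~' for m in RECORD.findall(text)]
for r in records:
    print(r)
```

The lookahead is what allows the shared ~ to close the first record and open the second.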

How to find a row where col have special characters or numbers (except hyphen,apostrophe and space) in Oracle SQL

I need to find rows where a column contains special characters or numbers (except hyphen, apostrophe and space) in Oracle SQL.
I am doing it like below:
SELECT *
FROM test
WHERE Name_test LIKE '%[^A-Za-z _]%'
But it is not working, and I also need to exclude any apostrophe.
Kindly help.
If you need to find all rows where the column contains ONLY numbers and special characters (and you can list all of the required special characters):
SELECT *
FROM test
WHERE regexp_like(Name_test, q'[^[0-9'%##]+$]')
As you can see, you just need to add your special characters after 0-9.
^ - start
$ - end
About format q'[SOMETHING]' please see TEXT LITERALS here: https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Literals.html#GUID-1824CBAA-6E16-4921-B2A6-112FB02248DA
If you need to find all rows where the column contains no alphabetic characters:
SELECT *
FROM test
WHERE regexp_like(Name_test, '^[^a-zA-Z]*$');
or
SELECT *
FROM test
WHERE regexp_like(Name_test, '^\W*$');
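Note that \W matches any character that is not a letter, digit or underscore, so this variant also rejects digits and underscores. About \W, please see "Table 8-5 PERL-Influenced Operators in Oracle SQL Regular Expressions" here: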
https://docs.oracle.com/database/121/ADFNS/adfns_regexp.htm#ADFNS235
I need to find rows where col have special characters or numbers (except hyphen, apostrophe and space [and presumably single quotes]) in Oracle SQL.
You can use double single quotes to put a single quote in:
WHERE Name_test LIKE '%[^-A-Za-z _'']%'
However, this is not Oracle syntax. If the above works, then I would guess you are using SQL Server. In Oracle (with the hyphen placed last in the bracket expression so it is taken literally):
WHERE REGEXP_LIKE(Name_test, '[^A-Za-z _''-]')
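The negated character class can be tried out quickly in Python. This is only a sketch of the matching logic, not Oracle code; the sample names and the helper has_disallowed are invented for illustration, and the underscore is kept in the allowed set only because it appears in the question's pattern.

```python
import re

# Anything that is not a letter, space, underscore, apostrophe or hyphen.
# The hyphen is placed last in the class so it is taken literally.
DISALLOWED = re.compile(r"[^A-Za-z _'-]")

def has_disallowed(name: str) -> bool:
    """Return True if name contains at least one character outside the allowed set."""
    return DISALLOWED.search(name) is not None

# Hypothetical sample data.
names = ["Anne-Marie", "O'Brien", "John Smith", "R2D2", "Jane!"]
flagged = [n for n in names if has_disallowed(n)]
print(flagged)
```

A single character outside the class is enough to flag a row, which is why no anchors or quantifiers are needed.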

Regular Expression Pattern for Search in SQL

I want to search a table for file names of the form {numeric string}.PDF.
Example: 1.PDF, 12.PDF, 123.PDF, 1234.PDF, etc.
select * from web_pub_subfile where file_name like '[0-9]%[^a-z].pdf'
But the above query also returns files like these:
1801350 Ortho.pdf
699413.processing2.pdf
15-NOE-301.pdf
Could anyone tell me what I am missing here?
One way to do it is getting the substring before the file extension and checking if it is numeric. This solution only works well if there is only one . character in the file name.
select * from web_pub_subfile
where isnumeric(left(file_name,charindex('.',file_name)-1)) = 1
Note:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
To handle file names with multiple . characters, provided there is always a .filetype extension, use
select * from web_pub_subfile
where isnumeric(left(file_name,len(file_name)-charindex('.',reverse(file_name)))) = 1
and charindex('.',file_name) > 0
Sample demo
As suggested by @Blorgbeard in the comments, to avoid the use of isnumeric, use
select * from web_pub_subfile
where left(file_name,len(file_name)-charindex('.',reverse(file_name))) NOT LIKE '%[^0-9]%'
and len(left(file_name,len(file_name)-charindex('.',reverse(file_name)))) > 0
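The double negative in NOT LIKE '%[^0-9]%' ("contains no non-digit") can be easier to follow in a procedural sketch. The Python below mirrors that logic on the file names from the question; the function name is_numeric_pdf is made up, and the check that the extension is pdf is an extra assumption that the SQL above does not make.

```python
import re

def is_numeric_pdf(file_name: str) -> bool:
    """Strip the last extension, then require a non-empty stem
    that contains no non-digit character."""
    stem, dot, ext = file_name.rpartition('.')
    # Assumption beyond the SQL above: also require a .pdf extension.
    if not dot or ext.lower() != 'pdf':
        return False
    return len(stem) > 0 and re.search(r'[^0-9]', stem) is None

files = ['1801350 Ortho.pdf', '699413.processing2.pdf', '15-NOE-301.pdf',
         '1.PDF', '1234.pdf']
numeric = [f for f in files if is_numeric_pdf(f)]
print(numeric)
```

Splitting on the last dot plays the same role as the charindex/reverse combination in the query.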
You can't really do what you are trying to do using plain, out-of-the-box SQL. The reason you are seeing those results is that the % character matches any sequence of zero or more characters. It's not like * in a regex, which matches the previous element 0 or more times.
Your best option would probably be to create some CLR functions that implement regex functionality on the SQL Server side. You can take a look at this link to find a good place to start.
If you are on SQL Server 2012 or later, you could use Try_Convert():
select * from web_pub_subfile where Try_Convert(int,replace(file_name,'.pdf',''))>0
Declare @web_pub_subfile table (file_name varchar(100))
Insert Into @web_pub_subfile values
('1801350 Ortho.pdf'),
('699413.processing2.pdf'),
('15-NOE-301.pdf'),
('1.pdf'),
('1234.pdf')
select * from @web_pub_subfile where Try_Convert(int,replace(file_name,'.pdf',''))>0
Returns
file_name
1.pdf
1234.pdf

Sybase to Teradata inquiry LIKE '[0-9]'

CASE
WHEN <in_data> LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]' THEN SUBSTR(<in_data>,1,3)
ELSE '000'
END
We're doing a migration project from Sybase to Teradata and are having a problem figuring this one out :) I'm still new to Teradata.
I would like to ask for the equivalent Teradata code for this:
LIKE '[0-9][0-9][0-9][0-9][0-9][0-9]'
Basically, it just checks whether the value consists of six digits.
Can someone give me a hint on this?
You can also use REGEXP_SUBSTR to directly extract the three digits:
COALESCE(REGEXP_SUBSTR(in_data,'^[0-9]{3}(?=[0-9]{3}$)'), '000')
This looks for the first three digits and then does a lookahead for three following digits without adding them to the overall match.
^ indicates the beginning of the string and $ the end, so there are no other characters before or after the six digits. (?=...) is a so-called "lookahead", i.e. those three digits are checked but not returned.
If there's no match the regex returns NULL which is changed to '000'.
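The same lookahead pattern can be exercised in Python to see which inputs fall through to '000'. This is only a sketch of the regex behaviour, not Teradata code; the function name first_three and the sample inputs are invented.

```python
import re

def first_three(in_data: str) -> str:
    """Return the first three digits of a six-digit string, else '000'."""
    # Same pattern as above: three digits, then a lookahead that requires
    # exactly three more digits up to the end of the string.
    m = re.search(r'^[0-9]{3}(?=[0-9]{3}$)', in_data)
    return m.group(0) if m else '000'

for value in ['123456', '12345', '1234567', '12A456']:
    print(value, first_three(value))
```

Only the exactly-six-digit input yields its own prefix; the fallback plays the role of COALESCE.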
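You need to use a regular expression function instead of LIKE, since Teradata's LIKE does not support bracket patterns such as [0-9][0-9][0-9][0-9][0-9][0-9].
To do an exact match, you need to add anchors, i.e. to match a string that consists of exactly six digit characters. In Teradata this can be written with REGEXP_SIMILAR:
REGEXP_SIMILAR(in_data, '^[0-9]{6}$') = 1
or
REGEXP_SIMILAR(in_data, '^[[:digit:]]{6}$') = 1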

Matching exactly 2 characters in string - SQL

How can I query a column with names of people to get only the names that contain exactly 2 "a" characters?
I am familiar with the % symbol that's used with LIKE, but when I write %a it finds all names, even those with 1 a, and I need to find only those that have exactly 2.
Please explain - Thanks in advance
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for exactly two a characters, search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and names not like '%a%a%a%'
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
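To check that the approaches above agree, here is a Python sketch that re-implements three of them side by side. None of this is Oracle code; the sample names and function names are made up, and the LIKE patterns are translated to regular expressions by hand.

```python
import re

def two_a_like(name: str) -> bool:
    # LIKE '%a%a%' AND NOT LIKE '%a%a%a%'
    at_least_two = re.search(r'a.*a', name, re.S) is not None
    at_least_three = re.search(r'a.*a.*a', name, re.S) is not None
    return at_least_two and not at_least_three

def two_a_length(name: str) -> bool:
    # length(Names) - 2 = length(replace(Names, 'a', ''))
    return len(name) - 2 == len(name.replace('a', ''))

def two_a_regex(name: str) -> bool:
    # length(regexp_replace(names, '[^a]', '')) = 2
    return len(re.sub(r'[^a]', '', name)) == 2

# Hypothetical sample data; note that uppercase A is not counted.
names = ['Sarah', 'Anna', 'Barbara', 'Adam', 'aa', 'Bob']
for check in (two_a_like, two_a_length, two_a_regex):
    print(check.__name__, [n for n in names if check(n)])
```

All three are case-sensitive, which is why the [^aA] variant is needed to count uppercase As as well.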
select * from People where names like '__'; matches names that are exactly two characters long (any two characters), in case that is what was meant.