How to identify combination of number and character in SQL - sql

I have a requirement where I have to find number of records in a special pattern in the field ref_id in a table. It's a varchar column. I need to find all the records where 8th, 9th and 10th character are numeric+XX. That is it should be like 2XX or 8XX. I tried using regexp :digit: but no luck. Essentially I am looking for all records where 8th-10th characters are 1XX, 2XX, 3XX… etc

Using REGEXP_LIKE, replace table with Yours:
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[0-9]XX');
.{7} whatever seven characters
[0-9] 8th character digit
XX 9th and 10th characters X
Or with [:digit:] class as You are mentioning, You may use:
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[[:digit:]]XX');

This can also be achieved using standard non-regex SQL functions
select * from t where s like '________XX%' -- any 8 characters and then XX
AND translate( substr(s,8,1),'?0123456789','?') is null; --8th one is numeric
DEMO

No need for a regexp:
select * from mytable where substr(ref_id, 8, 3) in ('0XX','1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
or
select * from mytable where substr(ref_id, 8, 3) in ('1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
I don't know if '0XX' is a valid match or not.
Regexp's tend to be slow.

Related

SQL Server - Regex pattern match only alphanumeric characters

I have an nvarchar(50) column myCol with values like these 16-digit, alphanumeric values, starting with '0':
0b00d60b8d6cfb19, 0b00d60b8d6cfb05, 0b00d60b8d57a2b9
I am trying to delete rows with myCol values that don't match those 3 criteria.
By following this article, I was able to select the records starting with '0'. However, despite the [a-z0-9] part of the regex, it also keeps selecting myCol values containing special characters like 00-d#!b8-d6/f&#b. Below is my select query:
SELECT * from Table
WHERE myCol LIKE '[0][a-z0-9]%' AND LEN(myCol) = 16
How should the expression be changed to select only rows with myCol values that don't contain special characters?
If the value must only contain a-z and digits, and must start with a 0 you could use the following:
SELECT *
FROM (VALUES(N'0b00d60b8d6cfb19'),
(N'0b00d60b8d6cfb05'),
(N'0b00d60b8d57a2b9'),
(N'00-d#!b8-d6/f&#b'))V(myCol)
WHERE V.myCol LIKE '0%' --Checks starts with a 0
AND V.myCol NOT LIKE '%[^0-9A-z]%' --Checks only contains alphanumerical characters
AND LEN(V.myCol) = 16;
The second clause works as the LIKE will match any character that isn't an alphanumerical character. The NOT then (obviously) reverses that, meaning that the expression only resolves to TRUE when the value only contains alphanumerical characters.
Pattern matching in SQL Server is not awesome, and there is currently no real regex support.
The % in your pattern is what is including the special characters you show in your example. The [a-z0-9] is only matching a single character. If your character lengths are 16 and you're only interested in letters and numbers then you can include a pattern for each one:
SELECT *
FROM Table
WHERE myCol LIKE '[0][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9][a-z0-9]';
Note: you don't need the AND LEN(myCol) = 16 with this.

Pull 3 digits from middle of id and sort by even odd

I have file ids in my database that start with:
a single character prefix
a period
a three digit client id
a hyphen
a three digit file number.
Example F.129-123
We have several ids for each client.
I need to be able to strip out the three digit file number and then pull them based on even or odd so that I can assign specific data to each result population.
One added issue. Some of the ids have characters added at the end.
Example: F.129-123A or F.129-123.NF
So I need to be able to just use the three digit file number without any other characters, because the added characters create errors while conversion.
If you are using SQL SERVER,
you can use CHARINDEX() to find the index of - and then
get 3 digits after - using SUBSTRING()
SELECT substring('F.123-234',charindex('-','F.123-234')+1, 3)
If you are using MySQL,
you can use POSITION() to find the index of - and then get 3 digits after - using SUBSTRING()
SELECT SUBSTRING('F.123-234',POSITION( '-' IN 'F.123-234' )+1,3);
If you are using Oracle,
you can use INSTR() to find the index of - and then get 3 digits after - using SUBSTR()
UPDATES:
Based on the requirements in comments, you can use a query like below achieve what you need.
SELECT
SUBSTRING(MatterID,CHARINDEX('-',MatterID)+1, 3) as FileNo
FROM
Matters
WHERE
MatterID LIKE'f.129%'
AND MatterID NOT LIKE '%col%'
AND substring( MatterID, CHARINDEX('-',MatterID)+1, 3) % 2 = 0
If you are working with Microsoft SQL Server, then you could use of patindex() function with substring() function to get the only 3 digits file number
select left(substring(string, PATINDEX('%[0-9][-]%', string)+2, LEN(string)), 3)
Note that if you have other period (i.e. -, /) then you will need to modify chars like PATINDEX('%[0-9][/]%')
In Postgres you can use split_part() to get the part after the hyphen, then cast it to an integer:
select *
from the_table
order by split_part(file_id, '-', 2)::int;
This assumes that there is always exactly one - in the string. I understand your question that this is the case as the format is fixed.
Is this helpful
Create table #tmpFileNames(id int, FileName VARCHAR(50))
insert into #tmpFileNames values(1,'F.129-123')
insert into #tmpFileNames values(2,'F.129-125')
insert into #tmpFileNames values(3,'F.129-124')
insert into #tmpFileNames values(4,'F.129-123A')
insert into #tmpFileNames values(5,'F.129-124B')
insert into #tmpFileNames values(6,'F.129-125.PQ')
insert into #tmpFileNames values(7,'F.129-123.NF')
select SUBSTRING(STUFF(FileName, 1, CHARINDEX('-',FileName), ''),0,4), * from #tmpFileNames
Order by SUBSTRING(STUFF(FileName, 1, CHARINDEX('-',FileName), ''),0,4),id
Drop table #tmpFileNames

SQL : REGEX MATCH - Character followed by numbers inside quotes

I have a column in sql which holds value inside double quotes like "P1234567" , "P1234" etc..
I need to identify only columns which start with letter P and is followed by seven digits (numbers) only. I tried where column like'"P[0-9][0-9][0-9][0-9][0-9][0-9][0-9]"' but it doesn't seem to work.
Can someone please correct me or point me to a thread which can help me out?
Thanks
Standard SQL has no regex support, but most SQL engines have regex extensions added to them on top of the standard SQL. So, for example, if you're using MySQL then you'd do this:
... WHERE column REGEXP '^"P[0-9]{7}"'
And if you're using Postgres then that would be:
... WHERE column ~ '^"P[0-9]{7}"'
(updated to match the double-quote part of the question, I'd misunderstood that to begin with)
How about using length and isnumeric:
Select
*
from
mytable
where
mycolumn like '"P%'
and len(mycolumn) = 10 --2 chars for quotes + 1 for 'P' + 7 for the digits
and isnumeric(substring(mycolumn, 3, 7))=1
This answer is for SQL Server, other DBMS's may have a different syntax for length

Parsing subfields in SQL

I have the following table
DRIVER_GID DRIVER_REFNUM_QUAL_GID
SDL2/C001.100000 SDL2.486900 CURRENT DISTRICT
SDL2/C001.100000 SDL2.486900 PERMANENT DISTRICT
SDL2/C001.100000000 SDL2.486900 CURRENT DISTRICT
SDL2/C001.100000000 SDL2.486900 PERMANENT DISTRICT
SDL2.600119036 SDL2.436001 CURRENT DISTRICT
SDL2.600119036 SDL2.436001 PERMANENT DISTRICT
I need to extract the numeric value after the string "SDL2." from DRIVER_REFNUM_QUAL_GID column. could anyone please recommend a query.
Use regexp_substr function of ORACLE:
select regexp_substr(DRIVER_REFNUM_QUAL_GID, '[[:digit:]]{6}') from YOURTABLE
This will extract adjacent 6 digits from DRIVER_REFNUM_QUAL_GID column of your yourtable.
If you prefer to extract all digits following period then use the following code:
select regexp_substr(DRIVER_REFNUM_QUAL_GID, '(\.)([[:digit:]]+)',1,1,'i',2) from YOURTABLE
To eliminate NULLS, you can use NVL function.
For example,
select NVL(regexp_substr(DRIVER_REFNUM_QUAL_GID, '(\.)([[:digit:]]+)',1,1,'i',2),999999) from YOURTABLE
So, if the result of the regexp_substr function is NULL, the result will be 999999.
Considering you only need numbers after SDL2. and considering that the number of digits could go up to a max of 20 digits, you can use something like-
select REGEXP_SUBSTR('SDL2.486900 CURRENT DISTRICT', '[[:digit:]]{2,20}') numval from dual;
You can edit the number 20 to suit your needs.
So it would look like-
select REGEXP_SUBSTR(DRIVER_REFNUM_QUAL_GID,'[[:digit:]]{2,20}') from your_table;

How to get rightmost 10 places of a string in oracle

I am trying to fetch an id from an oracle table. It's something like TN0001234567890345. What I want is to sort the values according to the right most 10 positions (e.g. 4567890345). I am using Oracle 11g. Is there any function to cut the rightmost 10 places in Oracle SQL?
You can use SUBSTR function as:
select substr('TN0001234567890345',-10) from dual;
Output:
4567890345
codaddict's solution works if your string is known to be at least as long as the length it is to be trimmed to. However, if you could have shorter strings (e.g. trimming to last 10 characters and one of the strings to trim is 'abc') this returns null which is likely not what you want.
Thus, here's the slightly modified version that will take rightmost 10 characters regardless of length as long as they are present:
select substr(colName, -least(length(colName), 10)) from tableName;
Another way of doing it though more tedious. Use the REVERSE and SUBSTR functions as indicated below:
SELECT REVERSE(SUBSTR(REVERSE('TN0001234567890345'), 1, 10)) FROM DUAL;
The first REVERSE function will return the string 5430987654321000NT.
The SUBSTR function will read our new string 5430987654321000NT from the first character to the tenth character which will return 5430987654.
The last REVERSE function will return our original string minus the first 8 characters i.e. 4567890345
SQL> SELECT SUBSTR('00000000123456789', -10) FROM DUAL;
Result: 0123456789
Yeah this is an old post, but it popped up in the list due to someone editing it for some reason and I was appalled that a regular expression solution was not included! So here's a solution using regex_substr in the order by clause just for an exercise in futility. The regex looks at the last 10 characters in the string:
with tbl(str) as (
select 'TN0001239567890345' from dual union
select 'TN0001234567890345' from dual
)
select str
from tbl
order by to_number(regexp_substr(str, '.{10}$'));
An assumption is made that the ID part of the string is at least 10 digits.