Parsing subfields in SQL - sql

I have the following table:
DRIVER_GID            DRIVER_REFNUM_QUAL_GID
SDL2/C001.100000      SDL2.486900 CURRENT DISTRICT
SDL2/C001.100000      SDL2.486900 PERMANENT DISTRICT
SDL2/C001.100000000   SDL2.486900 CURRENT DISTRICT
SDL2/C001.100000000   SDL2.486900 PERMANENT DISTRICT
SDL2.600119036        SDL2.436001 CURRENT DISTRICT
SDL2.600119036        SDL2.436001 PERMANENT DISTRICT
I need to extract the numeric value after the string "SDL2." from the DRIVER_REFNUM_QUAL_GID column. Could anyone please recommend a query?

Use Oracle's regexp_substr function:
select regexp_substr(DRIVER_REFNUM_QUAL_GID, '[[:digit:]]{6}') from YOURTABLE
This extracts the first run of six adjacent digits from the DRIVER_REFNUM_QUAL_GID column of your table.
If you prefer to extract all the digits following the period, use the following code:
select regexp_substr(DRIVER_REFNUM_QUAL_GID, '(\.)([[:digit:]]+)',1,1,'i',2) from YOURTABLE
To eliminate NULLs, you can use the NVL function.
For example:
select NVL(regexp_substr(DRIVER_REFNUM_QUAL_GID, '(\.)([[:digit:]]+)',1,1,'i',2),999999) from YOURTABLE
So if regexp_substr returns NULL, the result will be 999999.
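The capture-group idea above can be tried outside the database. This is only a Python sketch of the same pattern using the re module, not Oracle code; the third row and the helper name digits_after_period are made up for illustration.

```python
import re

# Sample values copied from the question's DRIVER_REFNUM_QUAL_GID column.
rows = [
    "SDL2.486900 CURRENT DISTRICT",
    "SDL2.436001 PERMANENT DISTRICT",
    "NO NUMBER HERE",  # hypothetical row with nothing to extract
]

# Group 1 captures the run of digits that follows the first period.
AFTER_PERIOD = re.compile(r"\.([0-9]+)")

def digits_after_period(value, default="999999"):
    """Return the digits after the period, or the default (like NVL) if none."""
    match = AFTER_PERIOD.search(value)
    return match.group(1) if match else default

extracted = [digits_after_period(v) for v in rows]
print(extracted)
```

The default argument plays the role that NVL plays in the query: it is returned only when the pattern finds nothing.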

Considering that you only need the number after SDL2. and that it could have up to 20 digits, you can use something like:
select REGEXP_SUBSTR('SDL2.486900 CURRENT DISTRICT', '[[:digit:]]{2,20}') numval from dual;
The minimum of 2 skips the single digit in the SDL2 prefix. You can change the 20 to suit your needs.
So it would look like:
select REGEXP_SUBSTR(DRIVER_REFNUM_QUAL_GID,'[[:digit:]]{2,20}') from your_table;
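One thing to keep in mind with the {2,20} approach is that it returns the first run of two or more digits, wherever it is. The Python sketch below (re module, not Oracle) compares it with a pattern anchored on the literal prefix; the value in tricky is a hypothetical prefix with two digits, which is not in the question's data.

```python
import re

value = "SDL2.486900 CURRENT DISTRICT"

# First run of 2 to 20 digits anywhere in the string.
first_run = re.search(r"[0-9]{2,20}", value).group(0)

# Digits that directly follow the literal text "SDL2.".
anchored = re.search(r"SDL2\.([0-9]+)", value).group(1)

# Hypothetical value whose prefix itself contains two digits.
tricky = "SDL25.486900 CURRENT DISTRICT"
tricky_run = re.search(r"[0-9]{2,20}", tricky).group(0)

print(first_run, anchored, tricky_run)
```

For the data shown in the question both patterns agree; they only differ if a prefix ever contains two or more adjacent digits.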

Related

How to return rows that contain numbers or decimals

I have a VARCHAR column called description in the table test_table. How do I only return rows that contain a number or decimal number in the description column?
So for example these would be considered valid rows to return:
I love the number 3424
434 is cool
when can 23 be the best age
My sweet16today
when there is 0.143secs left
I love 0.314 because its pi
As long as there is any number in the description, it's considered a valid row to return.
I've tried:
SELECT * FROM test_table
WHERE REGEXP_LIKE(X, '^[[:digit:]]+$');
You can use bracket wildcards that match only digits (LIKE supports this bracket syntax in SQL Server, not in Oracle):
SELECT * FROM test_table
WHERE description LIKE '%[1234567890]%';
^[[:digit:]]+$ only matches strings made up entirely of digits. To match a digit anywhere in the string, drop the anchors and use '[[:digit:]]+' or '[0-9]'.
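The difference between the anchored and unanchored patterns is easy to see on the sample rows. This is a Python sketch with the re module rather than REGEXP_LIKE; the last two rows are added here for illustration and are not from the question.

```python
import re

descriptions = [
    "I love the number 3424",
    "434 is cool",
    "My sweet16today",
    "when there is 0.143secs left",
    "no digits at all",  # hypothetical row that should not be returned
    "3424",              # hypothetical row made only of digits
]

# Anchored: the whole string must consist of digits.
anchored = [d for d in descriptions if re.search(r"^[0-9]+$", d)]

# Unanchored: a single digit anywhere is enough.
unanchored = [d for d in descriptions if re.search(r"[0-9]", d)]

print(anchored)
print(unanchored)
```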

selecting date with regexp_extract

I need to extract the year from a column. My query returns null values along with every value that has the same number of characters as a date. I want to extract only the dates, but a few other values in the column have the same number of characters.
Sample data in the table:
10020020
1172053041
597246141
3339110821
26590621
192133643
20190203
20180109
20170204
20190904
I have tried this,
select regexp_extract((colname), '([0-9]{8})', 1) from tablename
However, it returns every value with the same number of characters, plus null values. I wish to extract only the dates, which are 20170109, 20190204, etc. What is the best approach, and where did I go wrong?
10020020
26590621
20190203
20180109
20170204
20190904
I have tried using a wildcard, select regexp_extract((maxvalue), '([0-9]{8})', 1) like '%2019%' from profilingoverviewreport, but it returns a boolean instead.
If you want to match values which have exactly 8 digits, and those 8 digit values correspond to dates, then I suggest the following pattern:
^(20|19)[0-9]{6}$
Your updated SQL code:
SELECT *
FROM tablename
WHERE colname IREGEXP '^(20|19)[0-9]{6}$';
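To see which of the sample values the pattern keeps, here is a Python sketch (re and datetime, not the SQL engine from the question). The regex only checks the shape of the value, so the sketch also adds an optional calendar check; the helper name is_real_date and that extra check are assumptions added for illustration, not part of the answer above.

```python
import re
from datetime import datetime

# Sample values copied from the question.
values = [
    "10020020", "1172053041", "597246141", "3339110821", "26590621",
    "192133643", "20190203", "20180109", "20170204", "20190904",
]

# Same shape as ^(20|19)[0-9]{6}$: exactly eight digits starting with 19 or 20.
DATE_SHAPE = re.compile(r"(20|19)[0-9]{6}")

def is_real_date(value):
    """Return True only if value has the date shape and is a valid calendar date."""
    if not DATE_SHAPE.fullmatch(value):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True

date_like = [v for v in values if DATE_SHAPE.fullmatch(v)]
years = [v[:4] for v in date_like]  # the year is the first four characters

print(date_like)
print(years)
```

A value such as 20190230 has the right shape but is not a real date, which is why the second check may be worth having if the column can contain arbitrary numbers.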

How to identify combination of number and character in SQL

I have a requirement where I have to find number of records in a special pattern in the field ref_id in a table. It's a varchar column. I need to find all the records where 8th, 9th and 10th character are numeric+XX. That is it should be like 2XX or 8XX. I tried using regexp :digit: but no luck. Essentially I am looking for all records where 8th-10th characters are 1XX, 2XX, 3XX… etc
Using REGEXP_LIKE (replace table with your own table name):
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[0-9]XX');
.{7} matches any seven characters
[0-9] matches a digit as the 8th character
XX matches X as the 9th and 10th characters
Or, with the [:digit:] class that you mention, you may use:
SELECT COUNT(*)
FROM table
WHERE REGEXP_LIKE(ref_id,'^.{7}[[:digit:]]XX');
This can also be achieved using standard non-regex SQL functions
select * from t where s like '________XX%' -- any 8 characters and then XX
AND translate( substr(s,8,1),'?0123456789','?') is null; --8th one is numeric
No need for a regexp:
select * from mytable where substr(ref_id, 8, 3) in ('0XX','1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
or
select * from mytable where substr(ref_id, 8, 3) in ('1XX','2XX','3XX','4XX','5XX','6XX','7XX','8XX','9XX')
I don't know if '0XX' is a valid match or not.
Regexps tend to be slow.
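The regex and the substring approaches above should agree on every value. A small Python sketch (not SQL) that checks this; the ref_id values and both function names are invented for illustration.

```python
import re

# Any seven characters, then a digit, then XX (positions 8 to 10).
PATTERN = re.compile(r"^.{7}[0-9]XX")

def by_regex(ref_id):
    return PATTERN.match(ref_id) is not None

def by_slicing(ref_id):
    # Characters 8 to 10 in 1-based terms are indexes 7, 8 and 9.
    chunk = ref_id[7:10]
    return len(chunk) == 3 and chunk[0] in "0123456789" and chunk[1:] == "XX"

# Hypothetical ref_id values.
samples = ["ABCDEFG2XX123", "ABCDEFGAXX123", "ABCDEF2XX1234", "SHORT"]
regex_hits = [s for s in samples if by_regex(s)]
slice_hits = [s for s in samples if by_slicing(s)]

print(regex_hits, slice_hits)
```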

Get total number of users where usernames have different case

I have a SQL table where usernames have different cases, for example "ACCOUNTS\Ninja.Developer" or "ACCOUNTS\ninja.developer".
I want to find how many records have a username in which both the first and last names are capitalized. How can I use a regex to find the total?
Table x:
User
"ACCOUNTS\James.McAvoy"
"ACCOUNTS\michael.fassbender"
"ACCOUNTS\nicholas.hoult"
"ACCOUNTS\Oscar.Isaac"
Do you want something like this?
select count(*)
from t
where name rlike 'ACCOUNTS.[A-Z][a-z0-9]*[.][A-Z][a-z0-9]*' -- the . stands in for the backslash, which would otherwise escape the [
Of course, different databases implement regular expressions differently, so the actual comparator may not be rlike.
In SQL Server, you can do:
select count(*)
from t
where name like 'ACCOUNTS\[A-Z]%[.][A-Z]%';
You might need to be sure that you have a case-sensitive collation.
In most cases in MS SQL, string collation is case-insensitive, so we need a trick. Here is an example:
declare @accts table(acct varchar(100))
--sample data
insert @accts values
('ACCOUNTS\James.McAvoy'),
('ACCOUNTS\michael.fassbender'),
('ACCOUNTS\nicholas.hoult'),
('ACCOUNTS\Oscar.Isaac')
;with accts as (
select
--cleanup and split values
left(replace(acct,'ACCOUNTS\',''),charindex('.',replace(acct,'ACCOUNTS\',''),0)-1) frst,
--everything after the period: total length minus the position of the period
right(replace(acct,'ACCOUNTS\',''),len(replace(acct,'ACCOUNTS\',''))-charindex('.',replace(acct,'ACCOUNTS\',''),0)) last
from @accts
)
,groups as (--add comparison columns
select frst, last,
case when CAST(frst as varbinary(max)) = CAST(lower(frst) as varbinary(max)) then 'lower' else 'Upper' end frstCase, --circumvent the case-insensitive collation
case when CAST(last as varbinary(max)) = CAST(lower(last) as varbinary(max)) then 'lower' else 'Upper' end lastCase
from accts
)
--and gather fruit
select frstCase, lastCase, count(frst) cnt
from groups
group by frstCase,lastCase
Your question is a little vague, but you might be looking for the DISTINCT keyword.
I don't think you need regex.
Maybe do something like:
Get distinct names from Table X as Table A
Use inputs table A as where clause on Table X
count
union
I hope this helps,
Rhys
Given your example set you can use a combination of techniques. First if the user name always begins with "ACCOUNTS\" then you can use substr to select the characters that start after the "\" character.
For the first name:
Then you can use a regex function to see if it matches against [A-Z] or [a-z] assuming your username must start with an alpha character.
For the last name:
Use the instr function on the substr and search for the character '.' and again apply the regex function to match against [A-Z] or [a-z] to see if the last name starts with an upper or a lower character.
To total:
Select all matches where both first and last match against upper and do a count. Repeat for the lower matches and you'll have both totals.
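The steps in this last answer can be sketched as follows. This is Python rather than SQL, shown only to make the logic concrete; the function name name_case and the "mixed" bucket are assumptions added here, and the sample rows are the ones from the question.

```python
import re
from collections import Counter

users = [
    r"ACCOUNTS\James.McAvoy",
    r"ACCOUNTS\michael.fassbender",
    r"ACCOUNTS\nicholas.hoult",
    r"ACCOUNTS\Oscar.Isaac",
]

PREFIX = "ACCOUNTS\\"

def name_case(user):
    """Classify a username by the case of the first letters of both names."""
    # Step 1: drop the "ACCOUNTS\" prefix (the substr step).
    name = user[len(PREFIX):] if user.startswith(PREFIX) else user
    # Step 2: split at the first period (the instr step).
    first, _, last = name.partition(".")
    # Step 3: test the leading character of each part.
    if re.match(r"[A-Z]", first) and re.match(r"[A-Z]", last):
        return "upper"
    if re.match(r"[a-z]", first) and re.match(r"[a-z]", last):
        return "lower"
    return "mixed"

totals = Counter(name_case(u) for u in users)
print(totals)
```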

using oracle sql substr to get last digits

I have a query result and need to get the final digits of one column, say 'term'.
The value of column term can be like:
'term'     'number' (output)
---------------------------
xyz012     12
xyz112     112
xyz1       1
xyz02      2
xyz002     2
xyz88      88
Note: Not limited to the above scenarios; the requirement is that the last 3 or fewer characters can be digits.
Function I used: to_number(substr(term.name,-3))
(Initially I assumed that the last 3 characters are always digits, but I was wrong.)
I am using to_number because if the last 3 digits are '012' then the number should be 12.
But as one can see, some specific cases (like 'xyz88' or 'xyz1') would give
ORA-01722: invalid number
How can I achieve this using substr or regexp_substr?
I have not explored regexp_substr much.
Using REGEXP_SUBSTR:
select column_name, to_number(regexp_substr(column_name,'\d+$'))
from table_name;
\d matches a digit. Combined with +, it matches a run of one or more digits.
$ matches the end of the string.
Putting it together, this regex extracts a group of digits at the end of a string.
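The same trailing-digits pattern can be checked against the sample terms. This is a Python sketch with the re module, not Oracle; the helper name trailing_number is made up, and returning None stands in for the NULL that the query would produce when there are no trailing digits.

```python
import re

def trailing_number(term):
    """Return the digits at the end of term as an int, or None if there are none."""
    match = re.search(r"[0-9]+$", term)
    return int(match.group(0)) if match else None

# Sample terms from the question, in the same order as the table.
terms = ["xyz012", "xyz112", "xyz1", "xyz02", "xyz002", "xyz88"]
numbers = [trailing_number(t) for t in terms]

print(numbers)
```

Converting to int drops the leading zeros, which is the job to_number does in the query.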
Oracle has the function regexp_instr(), which can do what you want:
select term, cast(substr(term, 1-regexp_instr(reverse(term),'[^0-9]')) as int) as num
from table_name;
select SUBSTRING(acc_no,len(acc_no)-1,len(acc_no)) from table_name;