Pattern Matching with SQL Like with first part letters and second part numbers of varying length

Is there a way to use pattern matching with SQL LIKE to match a leading run of letters followed by a variable number of digits?
For example, I want to select only values such as ABC1002, ABC23, ABC569, and CDE48569.

Here is one method, assuming a dialect such as SQL Server whose LIKE supports bracketed character classes:
where col like '[A-Z][A-Z][A-Z][0-9]%' and
col not like '[A-Z][A-Z][A-Z]%[^0-9]%'
The logic says:
The column starts with three letters and a digit.
Nothing other than a digit follows the three letters.
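As a rough cross-check of that logic outside the database, the same two conditions can be mirrored with Python's `re` module. This is only a sketch: the regular expression and the helper name `matches_letters_then_digits` are illustrative assumptions, not part of the original answer, and Python's regex engine is not the SQL LIKE matcher.

```python
import re

# Three uppercase letters, then one or more digits, and nothing else.
# This folds the LIKE and NOT LIKE conditions into a single pattern.
PATTERN = re.compile(r"[A-Z]{3}[0-9]+")


def matches_letters_then_digits(value: str) -> bool:
    """Return True when value is three letters followed only by digits."""
    return PATTERN.fullmatch(value) is not None


samples = ["ABC1002", "ABC23", "ABC569", "CDE48569", "ABC", "ABC12X", "AB123"]
selected = [s for s in samples if matches_letters_then_digits(s)]
print(selected)
```

Values such as ABC (no digit), ABC12X (a non-digit after the letters) and AB123 (only two letters) are rejected, which is what the two LIKE conditions are meant to achieve.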

Related

Using regexp_like in Oracle to match on multiple string conditions using a range of values

I have a field in my Oracle DB that contains codes and from this I need to pull multiple values using a range of values.
As an example I need to pull all codes in the range C00.0 - C39.9 i.e. begins with C, the second character can be 0-3, third character is 0-9, followed by a "." and then the last digit is 0-9 e.g.
CODES
-----
C00.0
C10.4
C15.8
C39.8
The example above is for one pattern, I have multiple patterns to match on, here is another example
C50.011-C69.92
Again, starts with C, second character is 5-6, third is 0-9, fourth is ".", fifth is 0-9, sixth is 1-2 etc.
I have tried the following, but the pipe (alternation) operator doesn't appear to pick up the second condition, so I am only getting results for the first condition '^[C][0-3][0-9][.][0-9]':
SELECT DISTINCT CODES
FROM
TABLE
WHERE REGEXP_LIKE (CODES, '^[C][0-3][0-9][.][0-9]|
^[C][4][0-3][.][0-9]|
^[C][4][A][.][0-9]|
^[C][4][4-9][.][0-9]|
^[C][4][9][.][A][0-9]|
^[C][5-6][0-9][.][0-9][1-9]|
^[C][7][0-5][.][0-9]|
^[C][7][A-B][.][0-8]')
ORDER BY CODES
I would be very grateful if anyone could make a suggestion on how I can pull the additional patterns.

You have newlines in the pattern -- in other words, your attempt at readability is causing the problem. You can just remove them, although I would probably factor out common elements:
WHERE REGEXP_LIKE (CODES, '^[C]([0-3][0-9][.][0-9]|[4][0-3][.][0-9]|[4][A][.][0-9]|[4][4-9][.][0-9]|[4][9][.][A][0-9]|[5-6][0-9][.][0-9][1-9]|[7][0-5][.][0-9]|[7][A-B][.][0-8])')
I think you also want $ at the end.
If you want readability, you could use or:
SELECT DISTINCT CODES
FROM TABLE
WHERE REGEXP_LIKE (CODES, '^[C][0-3][0-9][.][0-9]') OR
REGEXP_LIKE (CODES, '^[C][4][0-3][.][0-9]') OR
. . .
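The effect of the stray newlines can be reproduced with most regex engines. The sketch below uses Python's `re` module rather than Oracle, with a shortened two-branch pattern chosen for illustration: when the pattern literal spans two lines, the second branch begins with a newline character and so cannot match a code that starts with C.

```python
import re

# Pattern written across two lines: the line break becomes part of the regex,
# so the second alternative is really "\n^C4[0-3][.][0-9]".
broken = re.compile("^C[0-3][0-9][.][0-9]|\n^C4[0-3][.][0-9]")

# Same alternatives with the newline removed.
fixed = re.compile("^C[0-3][0-9][.][0-9]|^C4[0-3][.][0-9]")

codes = ["C00.0", "C39.8", "C41.2", "C43.9"]
broken_hits = [c for c in codes if broken.search(c)]
fixed_hits = [c for c in codes if fixed.search(c)]
print(broken_hits)
print(fixed_hits)
```

Only the first branch ever matches with the broken pattern, which is the symptom described in the question.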

Here is a regex pattern for what you want to match here:
^C[0-3][0-9][.][0-9]$
This would match the range of C00.0 - C39.9. If you want to match other ranges, then you would need an alternation with another pattern to cover those ranges.
Applying this to your current query:
SELECT DISTINCT CODES
FROM yourTable
WHERE REGEXP_LIKE (CODES, '^C[0-3][0-9][.][0-9]$');
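To see why the trailing `$` matters, here is a small Python sketch (Python's `re`, not Oracle; the sample codes are made up for illustration). Without the end anchor, a longer code such as C39.95 still matches because only its prefix is checked.

```python
import re

# Same pattern with and without the end-of-string anchor.
prefix_only = re.compile(r"^C[0-3][0-9][.][0-9]")
anchored = re.compile(r"^C[0-3][0-9][.][0-9]$")

codes = ["C00.0", "C39.9", "C39.95", "C40.1"]
prefix_hits = [c for c in codes if prefix_only.search(c)]
anchored_hits = [c for c in codes if anchored.search(c)]
print(prefix_hits)
print(anchored_hits)
```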

Consider a query to find details of research fields where the first two parts of the ID are D and 2 and the last part is one character (digit)

The ID of research fields have three parts, each part separated by a period.
Consider a query to find the details of research fields where the first two parts of the ID are D and 2, and the last part is a single character (digit).
IDs like D.2.1 and D.2.3 are in the query result whereas IDs like D.2.12 or D.2.15 are not.
The SQL query given below does not return the correct result. Explain the reason why it does not return the correct result and give the correct SQL query.
select *
from field
where ID like 'B.1._';
I have no idea why it doesn't work.
Can anyone help with this? Many thanks.

D.2.1 and D.2.3 are in the query result whereas IDs like D.2.12 or D.2.15 are not.
An underscore matches any single character in a LIKE filter, so B.1._ matches exactly five characters: a B, a period, a 1, a period, and then any single character. The shape of the pattern is right for a one-character last part; the problem is that it looks for the prefix B.1. instead of D.2., so it returns the wrong rows.
You could use:
SELECT *
FROM field
WHERE ID LIKE 'D.2._';
Do not append % after the underscore: % matches any number of characters (including zero), so 'D.2._%' would also match IDs such as D.2.12 and D.2.15, which must be excluded. Note that the underscore accepts any character, not only a digit.
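The LIKE wildcard rules described above can be sketched in Python by translating a pattern into a regular expression. The helpers `like_to_regex` and `like` are hypothetical and ignore ESCAPE clauses and dialect extensions; they only illustrate that `_` stands for exactly one character and `%` for any run of characters.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a basic SQL LIKE pattern into a regex (illustrative sketch)."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")            # exactly one character
        elif ch == "%":
            parts.append(".*")           # zero or more characters
        else:
            parts.append(re.escape(ch))  # literal character, e.g. the period
    return re.compile("".join(parts), re.DOTALL)


def like(value: str, pattern: str) -> bool:
    """Return True when the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


ids = ["D.2.1", "D.2.3", "D.2.12", "D.2.15", "B.1.4"]
print([i for i in ids if like(i, "D.2._")])
print([i for i in ids if like(i, "D.2._%")])
print([i for i in ids if like(i, "B.1._")])
```

With these sample IDs, only the pattern 'D.2._' keeps D.2.1 and D.2.3 while leaving out D.2.12 and D.2.15.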

How can I extract a substring from a character column without using SUBSTR()?

I have a question regarding the data below.
As you can see, each EMP_IDENTIFIER starts with its EMP_ID.
I need to pull out only the 10-character identifier after the dash so that I can insert it into another column.
How would I do that?
I did it the traditional way, using INSTR and SUBSTR.
I just want to know whether there is any other way to do it without using INSTR and SUBSTR.
EMP_ID (VARCHAR2)   EMP_IDENTIFIER (VARCHAR2)
62049               62049-2162400111
6394                6394-1368000222
64473               64473-1814702333
61598               61598-0876000444
57452               57452-0336503555
5842                5842-0000070666
75778               75778-0955501777
76021               76021-0546004888
76274               76274-0000454999
73910               73910-0574500122
I am using Oracle 11g.

If you want the second part of the identifier and it is always 10 characters:
select t.*, substr(emp_identifier, -10) as secondpart
from t;

Here is one way:
REGEXP_SUBSTR (EMP_IDENTIFIER, '-(.{10})',1,1,null,1)
That returns the first 10-character string that follows a dash ("-") in your string. Thanks to mathguy for the improvement.
Beyond that, you'll have to provide more details on the exact logic for picking out the identifier you want.
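For comparison, the same capture-group idea can be sketched with Python's `re` module (not Oracle). The pattern mirrors '-(.{10})' from the answer; the function name `second_part` is an assumption made for this example.

```python
import re
from typing import Optional


def second_part(emp_identifier: str) -> Optional[str]:
    """Return the 10 characters captured after a dash, or None if no match."""
    match = re.search(r"-(.{10})", emp_identifier)
    return match.group(1) if match else None


rows = ["62049-2162400111", "6394-1368000222", "5842-0000070666"]
for row in rows:
    print(row, "->", second_part(row))
```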

Since apparently this is for learning purposes... let's say the assignment was more complicated. Let's say you had a longer input string, and it had several groups separated by -, and the groups could include letters and digits. You know there are at least two groups that are "digits only" and you need to grab the second such "purely numeric" group. Then something like this will work (and there will not be an instr/substr solution):
select regexp_substr(input_str, '(-|^)(\d+)(-|$)', 1, 2, null, 2) from ....
This searches the input string for one or more digits (\d means any digit, + means one or more occurrences) between a - or the beginning of the string (^ means beginning of the string; (a|b) means match a OR b) and a - or the end of the string ($ means end of the string). It starts searching at the first character (the third argument, 1); it looks for the second occurrence (the fourth argument, 2); it does not apply any special matching such as ignoring case (the fifth argument, null); and when the match is found, it returns the fragment of the match enclosed in the second set of parentheses (the last argument, 2). The second fragment is the \d+, that is, the sequence of digits without the leading or trailing dash.
This solution will work in your example too, it's just overkill. It will find the right "digits-only" group in something like AS23302-ATX-20032-33900293-CWV20-3499-RA; it will return the second numeric group, 33900293.
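The same task can also be approached without a regular expression. The following Python sketch is a different technique from the Oracle call above: it splits the string on dashes and keeps the groups made up only of digits. The function name `nth_numeric_group` is hypothetical.

```python
from typing import Optional


def nth_numeric_group(input_str: str, n: int) -> Optional[str]:
    """Return the n-th (1-based) dash-separated group that is all ASCII digits."""
    numeric = [g for g in input_str.split("-") if g.isascii() and g.isdigit()]
    return numeric[n - 1] if 1 <= n <= len(numeric) else None


print(nth_numeric_group("AS23302-ATX-20032-33900293-CWV20-3499-RA", 2))
print(nth_numeric_group("62049-2162400111", 2))
```

Splitting first avoids any question of how neighbouring matches share a delimiter, at the cost of only handling a single-character separator.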

Display certain sequence only in VARCHAR

I have a column error_desc with values like:
Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX.
What SQL query can I use to display only the number 111232 from that column? The number starts at the 66th position of the VARCHAR column and ends at the 71st.

SELECT substr(ERROR_DESC,66,6) as ABC FROM bl1_cycle_errors where error_desc like '%FAN%'
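Fixed positions only work while every message has the same prefix length. As a hedged alternative, sketched in Python rather than SQL, the number can be located by the "Fan id" label that precedes it in the sample message. The regular expression and the function name `fan_id` are assumptions for illustration.

```python
import re
from typing import Optional


def fan_id(error_desc: str) -> Optional[str]:
    """Return the digits that follow the label 'Fan id', or None if absent."""
    match = re.search(r"Fan id (\d+)", error_desc)
    return match.group(1) if match else None


message = (
    "Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. "
    "Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX."
)
print(fan_id(message))
```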

This solution uses regular expressions.
The challenge I faced was dealing with alphanumeric tokens. To detect the standalone number, we have to keep only pure numbers and filter out words, alphanumeric tokens, and punctuation.
Pure strings and words that contain no digits can easily be filtered out using
[^[:digit:]]
Possible combinations of alphanumerics are:
1. Begins with letters, contains digits, may end with letters or punctuation:
[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*
2. Begins with digits and then contains letters, may contain punctuation:
[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*
3. Begins with digits followed by punctuation, may contain letters:
[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*
Combining these regular expressions using the | operator, we get:
select trim(REGEXP_REPLACE(error_desc,'[^[:digit:]]|[a-zA-Z]+[0-9]+[[:punct:]]*[a-zA-Z]*[[:punct:]]*|[0-9]+[[:punct:]]*[a-zA-Z]+[[:punct:]]*|[0-9]+[a-zA-Z]*[[:punct:]]+[a-zA-Z]*',' '))
from error_table;
This will work in most cases.
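Where the regex engine supports word boundaries, the standalone number can be picked out directly instead of replacing everything else. The sketch below uses Python's `re` module, which is a swapped-in technique and not the Oracle expression above; `\b` is used here as Python syntax.

```python
import re

message = (
    "Failure occurred in (Class::Method) xxxxCalcModule::endCustomer. "
    "Fan id 111232 is not Effective or not present in BL9_XXXXX for date 20160XXX."
)

# \b\d+\b matches runs of digits that are not attached to letters,
# other digits, or underscores on either side.
standalone = re.findall(r"\b\d+\b", message)
print(standalone)
```

The digits inside BL9_XXXXX and 20160XXX are skipped because they touch letters, so no boundary exists around them.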

MSAccess Query a string matching a pattern

I have a table with a string field containing location information. I want to be able to query this table and retrieve all of the tags matching the format xxxxxxAA where xxxxxx is a 6-digit number and AA is two alphabetic characters.
Is there a method of querying this using SQL or is this something that I need to do in VBA?
Sample data:
BGS5 PM RGP5
022051PM
022201PM
030539PM
WAS3N
179546MM
And I want to return the following without knowing the values:
022051PM
022201PM
030539PM
179546MM
thanks in advance
Jason

You can use a query with a Like comparison in the WHERE clause.
SELECT y.text_field
FROM YourTable AS y
WHERE y.text_field Like '######[A-Z][A-Z]'
The # matches a digit.
[A-Z] matches one character from a character class consisting of only letters. That character class is actually upper case letters. However, the comparison is case-insensitive, so will match lower case letters, too.
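For readers who want to test the pattern outside Access, here is a rough Python equivalent. It is a sketch, not Access syntax: `#` is written as `[0-9]`, and `re.IGNORECASE` stands in for the case-insensitive comparison described above.

```python
import re

# Six digits followed by two letters, matching the whole value.
TAG = re.compile(r"[0-9]{6}[A-Z]{2}", re.IGNORECASE)

sample = ["BGS5 PM RGP5", "022051PM", "022201PM", "030539PM", "WAS3N", "179546MM"]
tags = [s for s in sample if TAG.fullmatch(s)]
print(tags)
```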
