query for substring formation - sql

I want to take the 01 part of a string abcd_01 using SQL. What should be the query for this, where the length before the _ varies? That is, it may be abcde_01 or ab_01. Basically, I want part after the _.

This is one of those examples of how there's similar functionality between SQL and the various extensions, but are just different enough that you can not guarantee portability between all databases.
The SUBSTRING keyword, using PostgreSQL syntax (no mention of pattern matching) is ANSI-99. Why this took them so long, I don't know.
The crux of your need is to obtain a substring of the existing column value, so you need to know what the database substring function(s) are called.
Oracle
SELECT SUBSTR('abcd_01', -2) FROM DUAL
Oracle doesn't have a RIGHT function, with is really just a wrapper for the substring function anyway. But Oracle's SUBSTR does allow you to specify a negative number in order to process the string in reverse (end towards the start).
SQL Server
Two options - SUBSTRING, and RIGHT:
SELECT SUBSTRING('abcd_01', LEN('abcd_01') - 1, 2)
SELECT RIGHT('abcd_01', 2)
For brevity, RIGHT is ideal. But for portability, SUBSTRING is a better choice...
MySQL
Like SQL Server, three options - SUBSTR, SUBSTRING, and RIGHT:
SELECT SUBSTR('abcd_01', LENGTH('abcd_01') - 1, 2)
SELECT SUBSTRING('abcd_01', LENGTH('abcd_01') - 1, 2)
SELECT RIGHT('abcd_01', 2)
PostgreSQL
PostgreSQL only has SUBSTRING:
SELECT SUBSTRING('abcd_01' FROM LENGTH('abcd_01')-1 for 2)
...but it does support limited pattern matching, which you can see is not supported elsewhere.
SQLite
SQLite only supports SUBSTR:
SELECT SUBSTR('abcd_01', LENGTH('abcd_01') - 1, 2)
Conclusion
Use RIGHT if it's available, while SUBSTR/SUBSTRING would be better if there's a need to port the query to other databases so it's explicit to others what is happening and should be easier to find equivalent functionality.

If it's always the last 2 characters then use RIGHT(MyString, 2) in most SQL dialects

to get 01 from abcd_01 you should write this way (assuming column name is col1)
SELECT substring(col1,-2) FROM TABLE
this will give you last two chars.

select substring(names,charindex('_',names)+1,len(names)-charindex('_',names)) from test

Related

T-SQL - How to pattern match for a list of values?

I'm trying to find the most efficient way to do some pattern validation in T-SQL and struggling with how to check against a list of values. This example works:
SELECT *
FROM SomeTable
WHERE Code LIKE '[0-9]JAN[0-9][0-9]'
OR Code LIKE '[0-9]FEB[0-9][0-9]'
OR Code LIKE '[0-9]MAR[0-9][0-9]'
OR Code LIKE '[0-9]APRIL[0-9][0-9]
but I am stuck on wondering if there is a syntax that will support a list of possible values within the single like statement, something like this (which does not work)
SELECT *
FROM SomeTable
WHERE Code LIKE '[0-9][JAN, FEB, MAR, APRIL][0-9][0-9]'
I know I can leverage charindex, patindex, etc., just wondering if there is a simpler supported syntax for a list of possible values or some way to nest an IN statement within the LIKE. thanks!
I think the closest you'll be able to get is with a table value constructor, like this:
SELECT *
FROM SomeTable st
INNER JOIN (VALUES
('[0-9]JAN[0-9][0-9]'),
('[0-9]FEB[0-9][0-9]'),
('[0-9]MAR[0-9][0-9]'),
('[0-9]APRIL[0-9][0-9]')) As p(Pattern) ON st.Code LIKE p.Pattern
This is still less typing and slightly more efficient than the OR option, if not as brief as we hoped for. If you knew the month was always three characters we could do a little better:
Code LIKE '[0-9]___[0-9][0-9]'
Unfortunately, I'm not aware of SQL Server pattern character for "0 or 1" characters. But maybe if you want ALL months we can use this much to reduce our match:
SELECT *
FROM SomeTable
WHERE (Code LIKE '[0-9]___[0-9][0-9]'
OR Code LIKE '[0-9]____[0-9][0-9]'
OR Code LIKE '[0-9]_____[0-9][0-9]')
You'll want to test this to check if the data might contain false positive matches, and of course the table-value constructor could use this strategy, too. Also, I really hope you're not storing dates in a varchar column, which is a broken schema design.
One final option you might have is building the pattern on the fly. Something like this:
Code LIKE '[0-9]' + 'JAN' + '[0-9][0-9]'
But how you find that middle portion is up to you.
The native TSQL string functions don't support anything like that.
But you can use a workaround (dbfiddle) such as
WHERE CASE WHEN Code LIKE '[0-9]%[^ ][0-9][0-9]' THEN SUBSTRING(Code, 2, LEN(Code) - 3) END
IN
( 'JAN', 'FEB', 'MAR', 'APRIL' )
So first of all check that the string starts with a digit and ends in a non-space character followed by two digits and then check the remainder of the string (not matched by the digit check) is one of the values you want.
The reason for including the SUBSTRING inside the CASE is so that is only evaluated on strings that pass the LIKE check to avoid possible "Invalid length parameter passed to the LEFT or SUBSTRING function." errors if it was to be evaluated on a shorter string.

Use REGEXP_SUBSTR to extract string of varied length

I want to extract alphanumeric text of varied length from a string between the second occurrence of a specific characters.
I have tried various forms of substr and regexp_substr but can't seem to get the syntax right. This is for use in Teradata SQL assistant. In the past I would have to create a temp table and use substr twice before trimming down the string to what I need. I want to do it all in one go.
SELECT regexp_substr('Channel:DF GB, Order Num:12345T6, Order Date:01/01/2019, Charge Codes:TAXES,,GBRAX', 'Num\\:+(\\:+)',1,2, ':') as RESULTING_STRING
My desired result is to return ONLY what is between "Num:" and the next "," in this case "12345T6". The length of the order number can vary so it is not a fixed length. When I run my code the actual output is a '?' returned by Teradata. What am I doing wrong?
This seems to work:
SELECT regexp_substr('Channel:DF GB, Order Num:12345T6, Order Date:01/01/2019, Charge Codes:TAXES,,GBRAX', 'Num:(\w*)', 1, 1, NULL, 1) as RESULTING_STRING from dual
Finds Num: and then captures as many word characters (, is not a word char) as there are available. The last parameter - subexpr - specifies which subexpression (aka capture group) you want, without it the whole thing will be matched (Num:12345T6).
Assuming you use Teradata SQL Assistant to query a Teradata system (but why do you tag Oracle then) the RegEx syntax is slightly different (both use a different RegEx dialects):
Teradata's RegExp_Substr doesn't support the subexpression parameter, you can either switch to the (I really don't know why) undocumented RegExp_Substr_gpl
RegExp_Substr_gpl(x, 'Num:([^,]*)', 1, 1, 'i', 1)
or tell the RegEx to forget the previous match using \K:
RegExp_Substr(x, 'Num:\K[^,]*', 1,1, 'i')
You can give a try to the below pattern search !
SELECT REGEXP_REPLACE ((REGEXP_SUBSTR('Channel:DF GB, Order Num:12345T6, Order Date:01/01/2019, Charge Codes:TAXES,,GBRAX', 'Num:[A-Za-z0-9]*',1,1, 'i')),'Num:','',1,1,'i') AS RESULTING_STRING
Regexp_substr pattern search ['Num:[A-Za-z0-9]*'], will first filter out the alphanumeric characters that follow the pattern 'Num:',astriek, helps to find out zero or more occurrences of the specified pattern.
For eg:, in this 'Num:12345T6' will be filtered out of the string provided, also note the last parameter in the regexp_substr is 'i', which ensures case in-specific search.
Lastly, Regexp_replace will replace the pattern 'Num:' from the output of the regexp_substr with an empty string,resulting in a final string as '12345T6'.

substr in Oracle from column

What is the Syntax to substr in Oracle to subtract a string
i have "123456789 #073"
I only want what after the #
substr (table.col, 17,3)
is that ok ?
Most likely the simplest (and most performant) way of doing this would be to use the base string functions:
SELECT SUBSTR(col, INSTR(col, '#') + 1)
FROM yourTable;
Demo
We could also try using REGEXP_REPLACE here:
SELECT REGEXP_REPLACE(col, '.*#(.*)', '\1')
FROM yourTable;
The regex option would in general not perform as well as the first query. The reason for this is that invoking a regex incurs a performance overhead. You might want to consider a regex option if you expect that the string logic might change or get more complicated in the future. Otherwise, go with base string functions wherever possible.
I think the most direct method might be regexp_substr():
select regexp_substr('123456789 #073', '[^#]+$')
from dual;
The regular expression says: "get me all non-hash characters at the end of the string".
If you happen to know that there are 3 characters and really want the last three characters of the string:
select substr('123456789 #073', -3)

Query to take the value after '/'

Suppose there is a value 842545/003. I need to take the part after '/'.
84454/02. I want a query to take the only 02 from here
You can try below substr() and instr() function
select substr('84454/02',instr('84454/02','/')+1,
length('84454/02')-instr('84454/02','/')) as val
from dual
You can use a regex (with the usual warnings about regex performance - the simple string functions like instr and substr are faster if you are processing millions of rows).
regexp_replace(yourcolumn, '^.*/')
This removes everything up to and including the / character (or the final one if there is more than one).
If your are using SQL, you can use this.
SELECT SUBSTRING('84454/02',PATINDEX('%/%', '84454/02')+1,LEN('84454/02'));
'84454/02'= Column name

How to select all the string characters preceding a . in oracle

I am using Oracle 11 G and have the following set of data:
12.0
4.2
Version.1
7.9
abc.72
I want to return all string characters before the period. What sort of query would I run in order to achieve this? Any help would be greatly appreciated, thanks!
You can try a combination of instr and substr.
Something like this:
select substr(field, 1, instr(field, '.') - 1)
from your_table;
Assuming field always contains a . character on it.
You can also deal with strings without a . by using case, if or any other similar valid conditional function on Oracle's SQL language implementation.
Of course, you can always put this on a function to make it look nicer on your query.