How to retrieve specific chars after decimal using REGEX in Oracle SQL?

I'm using RegEx in a View in Oracle 11g and I need to display certain codes that have an 'S' in the 8th position.
Using https://regexr.com/2v41h,
I was able to display these results with
REGEXP_SUBSTR(code, '\S{8}')
Y38.9X2S
Y38.9X2D
Y38.9X2A
Y38.9X1S
My issue is that I need to return only the values that have an 'S' in the last position, which is the 8th position when the decimal point is counted. What expression should I use?
Example:
Y38.9X2S
Y38.9X1S
I have tried:
REGEXP_SUBSTR(code, '\b[S]*[8]\b') AS CODE
Thank you in advance for your help.

I am thinking:
select substr(code, -8)
from t
where code like '%_______S'
If the code is always long enough, just use '%S'.
Or, as a case expression:
select (case when code like '%_______S' then substr(code, -8) end)
Regular expressions are only needed if code can contain other characters that have to be excluded; otherwise LIKE is sufficient.
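To make the logic of the LIKE '%_______S' filter concrete, here is a minimal Python sketch of the same check outside the database. The sample list and the helper name last8_ending_in_s are assumptions for illustration only; they are not part of the original answer.

```python
import re

# Sample values taken from the question.
codes = ["Y38.9X2S", "Y38.9X2D", "Y38.9X2A", "Y38.9X1S"]

# Any seven characters followed by a literal S, applied to the last
# eight characters; this mirrors LIKE '%_______S' with substr(code, -8).
PATTERN = re.compile(r".{7}S", re.DOTALL)

def last8_ending_in_s(code):
    """Return the last 8 characters if they end in 'S', else None."""
    tail = code[-8:]
    return tail if PATTERN.fullmatch(tail) else None

matches = [c for c in codes if last8_ending_in_s(c)]
print(matches)
```

Values shorter than eight characters are rejected because the pattern must match exactly eight characters.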

Related

How to extract numeric values from a column in SQL

I am trying to extract only the numeric values from a column that contains cells that are exclusively numbers, and cells that are exclusively letter values, so that I can multiply the column with another that contains only numeric values. I have tried
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME)
and also
SELECT trim(INTENT_VOLUME)
from A
WHERE ISNUMERIC(INTENTVOLUME) = 1
and neither works. I get the error Function ISNUMERIC(VARCHAR) does not exist. Can someone advise? Thank you!
It depends heavily on the DBMS.
SQL Server has only limited built-in features for this, so the following query may not work with all variants of your data:
select CAST(INTENT_VOLUME AS DECIMAL(10, 4))
from A
where INTENT_VOLUME LIKE '%[0-9.-]%'
and INTENT_VOLUME NOT LIKE '%[^0-9.-]%';
In Oracle you can use regex in a normal way:
select to_number(INTENT_VOLUME)
from A
where REGEXP_LIKE(INTENT_VOLUME,'^[-+]?[0-9]+(\.[0-9]+)?$');
MySQL also has built-in regex support.
Try this, which tests whether the text value can be cast as numeric (the ~ operator is PostgreSQL-style regular expression matching):
select intent_volume
from a
where (intent_volume ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$') = 't'
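The regex-based filters above all follow the same idea: keep a value only if the whole string looks like a number, then convert it. A rough Python sketch of that idea, reusing the pattern from the Oracle answer; the sample rows and the helper name to_number are made up for illustration.

```python
import re
from decimal import Decimal

# Optional sign, digits, optional fractional part (same shape as the
# pattern used with REGEXP_LIKE above).
NUMERIC = re.compile(r"[-+]?[0-9]+(\.[0-9]+)?")

def to_number(value):
    """Return a Decimal if the trimmed text is fully numeric, else None."""
    text = value.strip()
    return Decimal(text) if NUMERIC.fullmatch(text) else None

# Hypothetical column contents: some numeric cells, some not.
rows = ["12", " 3.50 ", "abc", "-7", "1.2.3", ""]
numbers = [n for n in map(to_number, rows) if n is not None]
print(numbers)
```

Matching the whole string (rather than searching for a digit anywhere) is what keeps values such as "1.2.3" out of the result.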

Using decode for parameter positioning cases

I have a table which contains a list of 13 digit numbers.
I want to use informatica to break these numbers down and separate them based on cases.
For example, I have the number 1196804120316.
For the first case, I wish to only take the two digits after the 68. In our example, I extract the number 04 and store it in a column.
The SQL Code for it is:
CASE WHEN ODS_CI_RPT.ADMIN.REGEXP_LIKE(DEC_REGISTRN_NBR,'^(19|20)?[0-9]{2}-[0-9]{2}-[0-9]{5,6}$')
THEN
ODS_CI_RPT.ADMIN.REGEXP_REPLACE(DEC_REGISTRN_NBR,'.*-([0-9]{2})-.*','\1',1,1)
ELSE '05'
END
AS
STATE_CODE
The next case is to take the number after 19 and store it. In this case the 68.
The SQL is:
CASE WHEN ODS_CI_RPT.ADMIN.REGEXP_LIKE(DEC_REGISTRN_NBR,'^(19|20)?[0-9]{2}-[0-9]{2}-[0-9]{5,6}$') THEN
ODS_CI_RPT.ADMIN.REGEXP_REPLACE(DEC_REGISTRN_NBR,'^([0-9]{2,4})-.*','\1',1,1)
ELSE ODS_CI_RPT.ADMIN.REGEXP_REPLACE(DEC_REGISTRN_NBR,'^([0-9]{4})-.*','\1',1,1)
END
AS
D_BIRTH_YEAR,
How would I implement this using decode in informatica?
Could you try:
WITH
input(literal) AS (
SELECT '1196804120316'
)
SELECT
-- use below in PowerCenter
MONTH(TO_DATE(SUBSTR(literal,2),'YYYYMMDDHHMI'))
-- use above in PowerCenter
AS the_month
FROM input;
the_month
4
PowerCenter's expression language offers functions similar to Oracle's, so the formula shown above should carry over, though individual function names may need adapting.
My solution to this was to use SUBSTR() in an expression. After importing the source from the table, I used:
SUBSTR(COLUMN_NAME,6,2)
This tells Informatica which position of the string to extract (here, two characters starting at position 6). The expression then captures the extracted value in a variable.
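As a quick check of the positions involved, here is a small Python sketch of 1-based substring extraction on the example number. The substr helper is a stand-in written for this illustration, not an Informatica function.

```python
def substr(text, start, length):
    """1-based substring, like SUBSTR(text, start, length) with a positive start."""
    return text[start - 1:start - 1 + length]

number = "1196804120316"

state_code = substr(number, 6, 2)  # the two digits after "68"
birth_year = substr(number, 4, 2)  # the two digits after "19"
print(state_code, birth_year)
```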

Convert Access Left & Mid to SQL

I am currently trying to convert this query from Access into proper SQL.
The Left and Mid functions in the statement have me kind of baffled.
SELECT
name,
entnum,
IIf(Left(Mid([entnum],4,3),1)=0,Mid([entnum],5,2),Mid([entnum],4,3)) as AGENCYCODE
FROM CUSTFILE
the entnum field's type is varchar 15
Any help with trying to understand this would be greatly appreciated.
You can use SUBSTRING in place of both MID and LEFT. Conditional IIF expressions exist in some dialects of SQL, but you might be safer with a CASE expression.
Looking at your statement, I think it can be reduced to the following:
SELECT
name,
entnum,
CASE
WHEN SUBSTRING(entnum,4,1) = '0' THEN SUBSTRING(entnum,5,2)
ELSE SUBSTRING(entnum,4,3)
END agencycode
FROM CUSTFILE
Try this instead:
SELECT
name,
entnum,
CASE
WHEN LEFT(SUBSTRING([entnum], 4, 3), 1) = '0'
THEN SUBSTRING([entnum], 5, 2)
ELSE SUBSTRING([entnum], 4, 3)
END as AGENCYCODE
FROM CUSTFILE
SUBSTRING is used exactly as MID. The CASE statement allows you to specify multiple WHEN..THEN conditions as well as an ELSE.
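If it helps to see what the IIf/Left/Mid expression actually computes, here is a small Python sketch of the same rule. The mid helper and the sample values are assumptions made for illustration.

```python
def mid(text, start, length):
    """1-based substring, like Access Mid(text, start, length)."""
    return text[start - 1:start - 1 + length]

def agency_code(entnum):
    """Take characters 4-6; if the first of them is '0', drop it."""
    if mid(entnum, 4, 1) == "0":
        return mid(entnum, 5, 2)
    return mid(entnum, 4, 3)

print(agency_code("ABC045XYZ"))
print(agency_code("ABC145XYZ"))
```

In other words, the expression strips a single leading zero from a three-character code.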

How can I SELECT DISTINCT on the last, non-numerical part of a mixed alphanumeric field?

I have a data set that looks something like this:
A6177PE
A85506
A51SAIO
A7918F
A810004
A11483ON
A5579B
A89903
A104F
A9982
A8574
A8700F
And I need to find all the ENDings where they are non-numeric. In this example, that means PE, SAIO, F, ON and B.
In pseudocode, I'm imagining I need something like
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL,[SOME_CLEVER_LOGIC]) AS X FROM TABLE);
Any ideas? Can I solve this without learning regexp?
EDIT: To clarify, my data set is a lot larger than this example. Also, I'm only interested in the part of the string AFTER the numeric part. If the string is "A6177PE" I want "PE".
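Disclaimer: I don't know Oracle SQL. But, I think something like this should work:
SELECT DISTINCT X FROM
(SELECT SUBSTR(COL, REGEXP_INSTR(COL, '[[:alpha:]]+$')) AS X FROM TABLE
WHERE REGEXP_LIKE(COL, '[[:alpha:]]$'));
REGEXP_INSTR(COL, '[[:alpha:]]+$') should return the position of the first of the letters at the end of the field. When nothing matches it returns 0, which SUBSTR treats as position 1, so the WHERE clause filters out rows that do not end in a letter.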
For readability, I'd recommend using the REGEXP_SUBSTR function (if there are no performance issues of course, as this is definitely slower than the accepted solution).
It is similar to REGEXP_INSTR, but instead of returning the position of the substring, it returns the substring itself:
SELECT DISTINCT REGEXP_SUBSTR(MY_COLUMN, '[a-zA-Z]+$') FROM MY_TABLE;
([:alpha:] is also supported, as @Audun wrote.)
Also useful: Oracle Regexp Support (beginning page)
A solution without regular expressions, for example:
SELECT SUBSTR(col,INSTR(TRANSLATE(col,'A0123456789','A..........'),'.',-1)+1)
FROM table;
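The TRANSLATE/INSTR query works by locating the last digit and keeping whatever follows it. A minimal Python sketch of that same idea, run against the sample data from the question; the function name trailing_suffix is hypothetical.

```python
DIGITS = "0123456789"

def trailing_suffix(value):
    """Return whatever follows the last digit ('' if the value ends in a digit)."""
    # rfind returns -1 for a missing digit, so the maximum is the
    # position of the last digit of any kind (or -1 if there is none).
    last_digit = max(value.rfind(d) for d in DIGITS)
    return value[last_digit + 1:]

data = ["A6177PE", "A85506", "A51SAIO", "A7918F", "A810004", "A11483ON",
        "A5579B", "A89903", "A104F", "A9982", "A8574", "A8700F"]

# Distinct non-empty endings, like SELECT DISTINCT on the suffix.
endings = sorted({trailing_suffix(v) for v in data} - {""})
print(endings)
```

As with the SQL version, a value containing no digits at all comes back unchanged.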

What is the best way to select string fields based on character ranges?

I need to add the ability for users of my software to select records by character ranges.
How can I write a query that returns all widgets from a table whose name falls in the range Ba-Bi for example?
Currently I'm using greater than and less than operators, so the above example would become:
select * from widget
where name >= 'ba' and name < 'bj'
Notice how I have "incremented" the last character of the upper bound from i to j so that "bike" would not be left out.
Is there a generic way to find the next character after a given character based on the field's collation or would it be safer to create a second condition?
select * from widget
where name >= 'ba'
and (name < 'bi' or name like 'bi%')
My application needs to support localization. How sensitive is this kind of query to different character sets?
I also need to support both MSSQL and Oracle. What are my options for ensuring that character casing is ignored no matter what language appears in the data?
Let's skip directly to localization. Would you say "aa" >= "ba"? Probably not, but that is where it sorts in Danish, where "aa" is treated as "å" and comes after "z". Also, you simply can't assume that you can ignore casing in any language. Casing is explicitly language-dependent, with the most common example being Turkish: uppercase i is İ, and lowercase I is ı.
Now, your SQL DB defines the result of <, == etc by a "collation order". This is definitely language specific. So, you should explicitly control this, for every query. A Turkish collation order will put those i's where they belong (in Turkish). You can't rely on the default collation.
As for the "increment part", don't bother. Stick to >= and <=.
For MSSQL see this thread: http://bytes.com/forum/thread483570.html .
For Oracle, it depends on your Oracle version, as Oracle 10 now supports regex(p) like queries: http://www.psoug.org/reference/regexp.html (search for regexp_like ) and see this article: http://www.oracle.com/technology/oramag/webcolumns/2003/techarticles/rischert_regexp_pt1.html
HTH
Frustratingly, the Oracle substring function is SUBSTR(), whilst in SQL Server it's SUBSTRING().
You could write a simple wrapper around one or both of them so that they share the same function name + prototype.
Then you can just use
MY_SUBSTRING(name, 2) >= 'ba' AND MY_SUBSTRING(name, 2) <= 'bi'
or similar.
You could use this...
select * from widget
where name Like 'b[a-i]%'
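This will match any row where the name starts with b, the second character is in the range a to i, and any other characters follow. Note that bracketed character ranges in LIKE are a SQL Server extension; Oracle's LIKE does not support them.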
I think that I'd go with something simple like appending a high-sorting string to the end of the upper bound. Something like:
select * from widget
where name >= 'ba' and name <= 'bi'||'~'
I'm not sure that would survive EBCDIC conversion though.
You could also do it like this:
select * from widget
where left(name, 2) between 'ba' and 'bi'
If your criteria length changes (as you seemed to indicate in a comment you left), the query would need to have the length as an input also:
declare @CriteriaLength int
set @CriteriaLength = 4
select * from widget
where left(name, @CriteriaLength) between 'baaa' and 'bike'
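The prefix comparison can be sketched outside the database as well. The following Python snippet is only an illustration under a strong assumption: it compares case-folded strings by code point, which is not the same as a language-specific collation, so it says nothing about how Turkish or Danish data would sort. The function name in_prefix_range and the sample names are made up.

```python
def in_prefix_range(name, low, high):
    """True if the name's prefix lies between low and high, inclusive.

    Assumption: plain code-point comparison after casefold(), not a
    locale-aware collation.
    """
    key = name.casefold()
    return low.casefold() <= key[:len(low)] and key[:len(high)] <= high.casefold()

widgets = ["Bike", "bat", "bjork", "apple", "Bible"]
selected = [w for w in widgets if in_prefix_range(w, "ba", "bi")]
print(selected)
```

Truncating the name to the length of each bound is what lets "bike" fall inside the range ending at "bi" without incrementing the last character.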