Search for substring, return another substring - sql

I need to search for and display a part of a string field. The string value from record to record may be different. For example:
Record #1
String Value:
IA_UnsafesclchOffense0IA_ReceivedEdServDuringExp0IA_SeriousBodilyInjuryN
Record #2
String Value:
IA_ReasonForRemovalTIA_Beh_Inc_Num1392137419IA_RemovalTypeNIA_UnsafesclchOffense0IA_ReceivedEdServDuringExp0IA_SeriousBodilyInjuryN
Record #3
String Value:
IA_UnsafesclchOffense0IA_RemovalTypeSIA_ReasonForRemovalPIA_ReceivedEdServDuringExp0IA_Beh_Inc_Num1396032888IA_SeriousBodilyInjuryN
In each case, I need to search for IA_Beh_Inc_Num. Assuming it's found, and IF it's followed by numeric data, I want to RETURN the numeric portion of that data. The numeric data, when present, will always be 10 characters.
In other words, record #1 should return no value, record #2 should return 1392137419 and record #3 should return 1396032888
Is there a way to do this within a select statement without having to write a full function with PL/SQL?

This should be easy with a Regular Expression: find a search string and check if it's followed by 10 digits:
REGEXP_SUBSTR(col, '(?<=IA_Beh_Inc_Num)([0-9]{10})')
but Oracle doesn't seem to support regex lookbehind, so it's a bit more complicated. Capture the digits in a second group and ask for that subexpression:
REGEXP_SUBSTR(value, '(IA_Beh_Inc_Num)([0-9]{10})',1,1,'i',2)
Remarks: the 'i' flag makes the search case-insensitive, and NULL is returned if IA_Beh_Inc_Num is followed by fewer than 10 digits.
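To check the pattern against the three sample strings outside the database, here is a minimal Python sketch. It assumes Python's re module (which does support lookbehind) rather than Oracle's regex engine, and the function name extract_incident_number is hypothetical. Unlike the Oracle call above, this version is case-sensitive.

```python
import re

# Lookbehind: match 10 digits only when directly preceded by the marker.
PATTERN = re.compile(r"(?<=IA_Beh_Inc_Num)[0-9]{10}")

def extract_incident_number(value):
    """Return the 10 digits after IA_Beh_Inc_Num, or None if absent."""
    match = PATTERN.search(value)
    return match.group(0) if match else None

# The three sample records from the question.
records = [
    "IA_UnsafesclchOffense0IA_ReceivedEdServDuringExp0IA_SeriousBodilyInjuryN",
    "IA_ReasonForRemovalTIA_Beh_Inc_Num1392137419IA_RemovalTypeNIA_UnsafesclchOffense0"
    "IA_ReceivedEdServDuringExp0IA_SeriousBodilyInjuryN",
    "IA_UnsafesclchOffense0IA_RemovalTypeSIA_ReasonForRemovalPIA_ReceivedEdServDuringExp0"
    "IA_Beh_Inc_Num1396032888IA_SeriousBodilyInjuryN",
]

results = [extract_incident_number(r) for r in records]
print(results)
```

Record #1 yields no value and the other two yield their 10-digit numbers, matching what the question asks for.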

This would work:
SELECT
CASE WHEN instr(value, 'IA_Beh_Inc_Num') > 0
THEN substr(substr(value, instr(value, 'IA_Beh_Inc_Num'), 25),15,10)
ELSE 'not found'
END AS result
FROM example

Angelo's answer is correct for Oracle, as the question asked.
For those from SQL Server coming across this, the below would work:
SELECT CASE
WHEN CHARINDEX('IA_Beh_Inc_Num', StringColumn) = 0
THEN NULL
ELSE SUBSTRING(StringColumn, CHARINDEX('IA_Beh_Inc_Num', StringColumn) + LEN('IA_Beh_Inc_Num'), 10)
END AS unix_time
,*
FROM MyTable

EDIT:
Modified query to select all rows. The query prints "NOT A TIMESTAMP" if IA_Beh_Inc_Num does not exist within the string or if it is not followed by 10 numbers.
SELECT
DECODE
(
REGEXP_INSTR (value, 'IA_Beh_Inc_Num[0-9]{10}'),
0,
'NOT A TIMESTAMP',
SUBSTR(value, INSTR(value, 'IA_Beh_Inc_Num')+14, 10)
) timestamp
FROM example;

Related

REPLACE not doing what I need in SQL

I've got a query pulling data from a table. In one particular field, there are several cases where it is a zero, but I need the four digit location number. Here is where I'm running into a problem. I've got
SELECT REPLACE(locationNbr, '0', '1035') AS LOCATION...
Two issues -
Whoever put the table together made all fields VARCHAR, hence the single quotes.
In the cases where there already is the number 1035, I get 1103535 as the location number because it's replacing the zero in the middle of 1035.
How do I select the locationNbr field and leave it alone if it's anything other than zero (as a VARCHAR), but if it is zero, change it to 1035? Is there a way to somehow use TO_NUMBER within the REPLACE?
SELECT CASE WHEN locationNbr='0' THEN '1035' ELSE locationNbr END AS LOCATION...
REPLACE( string, string_to_replace , replacement_string )
REPLACE looks for every occurrence of string_to_replace inside string and replaces it with replacement_string. That is why you get the undesired behaviour: it is the wrong function for this job.
CASE WHEN condition THEN result1 ELSE result2 END
CASE checks a condition; if it is true it returns result1, otherwise it returns result2. This is a simple example; a CASE expression can contain more than one WHEN condition.
Don't use replace(). Use case:
(case when locationNbr = '0' then '1035' else locationNbr end)
You can make use of length in Oracle:
select case when length(loacation) = 1 then REPLACE(loacation, '0', '1035') else loacation end as location
from location_test;
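The difference between the two approaches can be seen with a few rows. The sketch below runs the two expressions in an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for the asker's database here, which is an assumption, and the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE location_test (locationNbr TEXT)")
# Hypothetical sample rows: one zero, one value that contains a zero, one other.
conn.executemany(
    "INSERT INTO location_test VALUES (?)", [("0",), ("1035",), ("2040",)]
)

# REPLACE substitutes every '0' character, wherever it appears.
replace_rows = [
    row[0]
    for row in conn.execute(
        "SELECT REPLACE(locationNbr, '0', '1035') FROM location_test ORDER BY rowid"
    )
]

# CASE compares the whole value, so only an exact '0' is swapped.
case_rows = [
    row[0]
    for row in conn.execute(
        "SELECT CASE WHEN locationNbr = '0' THEN '1035' ELSE locationNbr END "
        "FROM location_test ORDER BY rowid"
    )
]

print(replace_rows)  # ['1035', '1103535', '2103541035']
print(case_rows)     # ['1035', '1035', '2040']
conn.close()
```

The REPLACE output reproduces the 1103535 value described in the question, while the CASE output leaves non-zero values alone.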

Find a character index in string in spark sql

I am a SQL person and new to Spark SQL.
I need to find the position of the character '-' in a string. If it is at a given position, I need to return a fixed length; otherwise the length should be zero.
string name = 'john-smith'
If '-' is at character position 4, return 10; otherwise return 0.
I have done this in SQL Server but now need to do it in Spark SQL.
select
case
when charindex('-', name) = 4 then 10
else 0
end
I tried in Spark SQL but failed to get results.
select find_in_set('-',name)
Please help. Thanks
You can use the instr function as shown next. instr checks whether the second string argument occurs in the first one; if so, it returns the 1-based position of the first occurrence, otherwise it returns 0.
//first create a temporary view if you don't have one already
df.createOrReplaceTempView("temp_table")
//then use instr to check if the name contains the - char
spark.sql("select if(instr(name, '-') = 4, 10, 0) from temp_table")
The arguments for the if function are:
instr(name, '-') = 4 the condition to check
10 the result when the condition is true
0 the result when the condition is false
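Because instr counts from 1, it is worth checking which position the hyphen actually has in the sample value. The following Python sketch emulates the 1-based behaviour described above; the helper names instr and fixed_length are hypothetical stand-ins, not Spark APIs.

```python
def instr(source, search):
    """1-based position of search in source, or 0 when it is absent."""
    return source.find(search) + 1

def fixed_length(name, expected_pos=4):
    """Mirror of: if(instr(name, '-') = 4, 10, 0)."""
    return 10 if instr(name, "-") == expected_pos else 0

print(instr("john-smith", "-"))   # 5: the hyphen is the fifth character
print(fixed_length("john-smith")) # 0
print(fixed_length("joh-smith"))  # 10
print(fixed_length("johnsmith"))  # 0
```

For 'john-smith' the hyphen sits at position 5 when counting from 1, so a comparison against 4 returns 0 for that value; compare against 5 if that row is meant to match.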

Prevent ORA-01722: invalid number in Oracle

I have this query
SELECT text
FROM book
WHERE lyrics IS NULL
AND MOD(TO_NUMBER(SUBSTR(text,18,16)),5) = 1
sometimes the string is something like this $OK$OK$OK$OK$OK$OK$OK, sometimes something like #P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0
I would like to know whether it is possible to prevent ORA-01722: invalid number, because in some cases the characters at that position are not a number.
I run this query inside a procedure to process all the rows in a cursor; if the substring in even one row is not a number, I can't process any rows.
You could use VALIDATE_CONVERSION if you are on Oracle 12c Release 2 (12.2) or later:
WITH book(text) AS
(SELECT '#P,351811040302663;E,101;D,07112018134733,07012018144712;G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0'
FROM DUAL
UNION ALL SELECT '$OK$OK$OK$OK$OK$OK$OK'
FROM DUAL
UNION ALL SELECT '12I45678912B456781234567812345671'
FROM DUAL)
SELECT *
FROM book
WHERE CASE
WHEN VALIDATE_CONVERSION(SUBSTR(text,18,16) AS NUMBER) = 1
THEN MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
ELSE 0
END = 1 ;
Output
TEXT
12I45678912B456781234567812345671
Assuming the condition should be true if and only if the 16-character substring starting at position 18 is made up of 16 digits, and the number is equal to 1 modulo 5, then you could write it like this:
...
where .....
and case when translate(substr(text, 18, 16), 'z0123456789', 'z') is null
and substr(text, 33, 1) in ('1', '6')
then 1 end
= 1
This will check that the substring is made up of all-digits: the translate() function will replace every occurrence of z in the string with itself, and every occurrence of 0, 1, ..., 9 with nothing (it will simply remove them). The odd-looking z is needed due to Oracle's odd implementation of NULL and empty strings (you can use any other character instead of z, but you need some character so no argument to translate() is NULL). Then - the substring is made up of all-digits if and only if the result of this translation is null (an empty string). And you still check to see if the last character is 1 or 6.
Note that I didn't use any regular expressions; this is important if you have a large amount of data, since standard string functions like translate() are much faster than regular expression functions. Also, everything is based on character data type - no math functions like mod(). (Same as in Thorsten's answer, which was only missing the first part of what I suggested here - checking to see that the entire substring is made up of digits.)
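The same two checks (all digits, last digit 1 or 6) can be tried on the sample strings from this thread. This is a Python sketch of the logic only, not Oracle code; the function name qualifies is hypothetical, and it assumes the substring must be exactly 16 characters long.

```python
def qualifies(text):
    """True when characters 18..33 (1-based) are 16 digits ending in 1 or 6."""
    sub = text[17:33]  # same span as SUBSTR(text, 18, 16)
    all_digits = len(sub) == 16 and sub.isascii() and sub.isdigit()
    # A number is congruent to 1 modulo 5 exactly when its last digit is 1 or 6.
    return all_digits and sub[-1] in "16"

samples = [
    "#P,351811040302663;E,101;D,07112018134733,07012018144712;"
    "G,4908611,50930248,207,990;M,79379;S,0;IO,3,0,0",
    "$OK$OK$OK$OK$OK$OK$OK",
    "12I45678912B456781234567812345671",
]

flags = [qualifies(s) for s in samples]
print(flags)  # [False, False, True]
```

Only the third sample passes, which agrees with the VALIDATE_CONVERSION output shown earlier in the thread.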
SELECT text
FROM book
WHERE lyrics IS NULL
AND case when regexp_like(SUBSTR(text,18,16),'^[0-9]+$') then MOD(TO_NUMBER(SUBSTR(text,18,16)),5)
else null
end = 1;

Translate function not returning relevant string in amazon redshift

I am trying to use a simple TRANSLATE function to remove "-" from a 23-character string. An example of one such string is "1049477-1623095-2412303". The expected outcome of my query is 104947716230952412303.
All such strings are stored in a single column named "data" in the table "table1".
My query is
Select TRANSLATE(t.data, '-', '')
from table1 as t
However, it is returning 104947716230952000000 as the output.
At first, I thought it was an overflow error, since the resulting integer has 21 digits, so I also tried the following:
SELECT CAST(TRANSLATE(t.data,'-','') AS VARCHAR)
from table1 as t
but this does not work either.
Please suggest a way to get the desired output.
This is too long for a comment.
This code:
select translate('1049477-1623095-2412303', '-', '')
is going to return:
'104947716230952412303'
The return value is a string, not a number.
There is no way that it can return '104947716230952000000'. I could only imagine that happening if somehow the value is being converted to a numeric or bigint type.
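One way to see how a numeric conversion could produce exactly that value: a double-precision float displayed with 15 significant digits loses the trailing digits. The Python sketch below is only an illustration of that effect. Whether the asker's client tool or a cast actually did this is an assumption, not something the question confirms.

```python
from decimal import Decimal

# Remove the hyphens, as TRANSLATE(data, '-', '') would.
digits = "1049477-1623095-2412303".replace("-", "")

# Treat the result as a floating-point number and show 15 significant digits.
as_float = float(digits)
shown = format(as_float, ".15g")

# Expand the rounded display value back to a plain integer.
rounded = int(Decimal(shown))

print(digits)   # 104947716230952412303
print(shown)    # 1.04947716230952e+20
print(rounded)  # 104947716230952000000
```

If the string stays a string end to end, none of this rounding happens, which is consistent with the point above that TRANSLATE itself returns the full value.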
Try regexp_replace()
Taking your own example, execute:
select regexp_replace('[string / column_name]','-');
It can be achieved with RPAD; try the code below.
SELECT RPAD(TRANSLATE(CAST(t.data as VARCHAR),'-','') ,20,'00000000000000000000')

DB2 SQL Anything left of a /

I've been working on this for days and can't seem to work it out. Basically I need to return the digits from a field that come before a forward slash, e.g. if the field was 1234/TEXT I want to return 1234. I can't just take the leftmost 4 characters of the field, as the digits vary in length, e.g. 12345/TEXT, so it needs to be anything left of the forward slash. In the world of MS Access, it is something like this, and it works:
Left(TABLE!FIELD,InStr(1,TABLE!FIELD,"/")-1)
However, how do I convert this to be used in an IBM\DB2 system? The DB2 SQL seems somewhat different to 'normal' SQL.
Thanks!
Rather than INSTR, maybe LOCATE
LOCATE(char, string)
char is the search term
string is the string being searched
You can achieve this by combining LOCATE with SUBSTR;
Locate information
Substring information
Cheat sheet (for this example);
SUBSTRING('FIELD','START POSITION', 'LENGTH')
LOCATE('SEARCH STRING', 'SOURCE STRING')
SUBSTRING lets you retrieve specific characters from a string, i.e.;
AFIELD = 'Hello'
SUBSTRING(AFIELD,4,2)
Result = 'lo' (position 4 and 5 of Hello)
LOCATE returns the position of the first character of the search string it finds as a number, i.e.;
AFIELD = 'Hello'
LOCATE('ello', AFIELD)
Result = 2 (it starts at position 2)
So you can combine these to do what you want, example;
XTABLE has 1 column called ACOL with the following values in it;
123467/ABCD
1321/ABDD
1123467/ABCD
To just retrieve the numbers;
SELECT SUBSTRING(ACOL,1, LOCATE('/',ACOL)-1)
FROM XRDK/XTABLE
Result;
123467
1321
1123467
What are we doing?
SUBSTRING(
ACOL,
1,
LOCATE('/',ACOL)-1
)
SUBSTRING(
Field ACOL,
Starting at position 1,
Length; use LOCATE to find where the '/' is and subtract 1 from the
resulting position (without the -1 you'd have the / on the end)
)
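The walkthrough above can be replayed on the three sample values. This Python sketch emulates 1-based LOCATE and SUBSTRING with hypothetical helpers; it is not DB2 code. It also assumes that a value without a '/' should yield None, a case the answer does not cover.

```python
def locate(search, source):
    """1-based position of search in source, or 0 when it is absent."""
    return source.find(search) + 1

def substring(source, start, length):
    """Characters of source from 1-based start, for length characters."""
    return source[start - 1 : start - 1 + length]

def left_of_slash(value):
    """Everything before the first '/', or None when there is no '/'."""
    pos = locate("/", value)
    if pos == 0:
        return None  # assumption: no slash means no result
    return substring(value, 1, pos - 1)

values = ["123467/ABCD", "1321/ABDD", "1123467/ABCD"]
results = [left_of_slash(v) for v in values]
print(results)  # ['123467', '1321', '1123467']
```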
Try this
SELECT SUBSTRING(CAST (ROUND(COLUMN,2) AS DECIMAL(6,2)), 0, locate('/',CAST (ROUND(COLUMN,2) AS DECIMAL(6,2))))
FROM TABLE