Impala - How to get the third to last occurrence of a character within a string

I have the following string:
SELECT '00000-AAA1-1111-BBBB1-010101-CCCC1' as Word
FROM Table_A
From this string I only want to extract the substring between the third-to-last and the second-to-last occurrence of '-'.
Basically, I want to extract the following string:
'BBBB1'
I tried the following, but it gives me the wrong result:
substr(Word, 1, length(Word) - length(reverse(split_part(reverse(Word),'-',1))))
"00000-AAA1-1111-BBBB1-010101-"
How can I extract only BBBB1 from the above string?
Thanks!

If you want to split by - and get the fourth element, you can do it as follows:
SELECT split_part('00000-AAA1-1111-BBBB1-010101-CCCC1', '-',4);
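Index 4 only works here because the string has exactly six fields. The Python sketch below is just a way to check the logic outside the database: it compares "fourth from the left" with "third from the right". The helper nth_field is hypothetical and only mimics a 1-based split_part. Whether split_part accepts a negative index in your Impala version is an assumption to verify against its documentation.

```python
# Illustration only: Python stands in for the SQL engine here.
word = "00000-AAA1-1111-BBBB1-010101-CCCC1"


def nth_field(text: str, delimiter: str, index: int) -> str:
    """Hypothetical helper: 1-based field lookup; a negative index counts from the end."""
    parts = text.split(delimiter)
    if index > 0:
        position = index - 1           # 1-based from the left
    else:
        position = len(parts) + index  # -1 is the last field
    # Out-of-range positions (including index 0) yield an empty string.
    return parts[position] if 0 <= position < len(parts) else ""


fourth = nth_field(word, "-", 4)           # counted from the left
third_from_end = nth_field(word, "-", -3)  # counted from the right
print(fourth, third_from_end)
```

Both lookups return the same field for this sample. They would diverge if a row had a different number of '-' separated parts, so it matters which end you count from.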

Related

MariaDB: How to get a part of string

How do I get part of a string in MariaDB?
I would like to update one table from another, but my problem is that I don't know how to get only part of a string.
I would like to get the part before the first "/".
For example, in SQL Server I could do:
select substring(lesson,1,CHARINDEX('/', lesson) -1) from lesson
In MariaDB I can't find a function like CHARINDEX.
I found the INSTR function in MariaDB!
Returns the position of the first occurrence of substring substr in string str. This is the same as the two-argument form of LOCATE(), except that the order of the arguments is reversed.
The solution is:
SELECT SUBSTRING(lesson, 1, INSTR(lesson, '/') - 1) FROM lesson;
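The intent is to take everything before the first "/". As a quick check of the position arithmetic, here is a small Python sketch using a 1-based position the way INSTR reports it. The function name and sample values are hypothetical. Returning an empty string when there is no "/" is a choice made for this sketch, not a statement about what MariaDB does.

```python
def before_first(text: str, separator: str) -> str:
    """Return the part of text before the first separator."""
    # str.find is 0-based and returns -1 when absent; adding 1 gives a
    # 1-based position where 0 means "not found", like INSTR.
    position = text.find(separator) + 1
    if position == 0:
        return ""  # sketch-level choice for rows without a separator
    # Keep characters 1 .. position-1 (1-based), i.e. text[0:position-1].
    return text[: position - 1]


print(before_first("math/algebra/1", "/"))
```

The point to note is the start position of 1 together with a length of position - 1. Passing position - 1 as the start would return the tail of the string instead.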

REGEXP_EXTRACT value from left between 4th and 5th underscore

I have a string column that contains either 7 or 8 elements that are always separated by underscores:
AAA_BBB_CCC_DDD_EEE_FFF_GGG_HHH
AAA_BBB_CCC_DDD_EEE_FFF_GGG
Values between underscores can vary in length and may contain other characters, such as +.
How do I extract only the value between the 4th and 5th underscore? That is, for both of these strings, I would get EEE?
The code I am trying to use is:
SELECT
REGEXP_EXTRACT("AAA_BBB_CCC_DDD_EEE_FFF_GGG_HHH", r'.+_.+_.+_.+_(.+)_.+_.+_.+') AS a
If it is the longer string (ending with HHH), I get the value EEE, but if it is the shorter string, I get null. What am I doing wrong?
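Your pattern requires eight underscore-separated fields, so it cannot match the seven-field string and returns NULL. Anchoring at the start and capturing only the fifth field with REGEXP_EXTRACT should work for both:
SELECT REGEXP_EXTRACT(col, r'^[^_]+_[^_]+_[^_]+_[^_]+_([^_]+)') AS a
FROM yourTable;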
An alternative is to split your string into an array and select its fifth element (offset 4, counting from 0):
WITH test AS
(SELECT "AAA_BBB_CCC_DDD_EEE_FFF_GGG_HHH" as letter_group
UNION ALL
SELECT "AAA_BBB_CCC_DDD_EEE_FFF_GGG" as letter_group)
SELECT letter_array[OFFSET(4)] FROM (SELECT SPLIT(letter_group, "_") as letter_array FROM test) T;
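To see why the question's pattern fails on the shorter string, the sketch below runs both patterns and the split approach over the two sample values. Python's re module is used as a stand-in for the engine's regex library, which is an assumption. The two behave the same for these simple patterns, but this is not the database itself.

```python
import re

rows = [
    "AAA_BBB_CCC_DDD_EEE_FFF_GGG_HHH",
    "AAA_BBB_CCC_DDD_EEE_FFF_GGG",
]

# Pattern from the question: needs seven underscores, i.e. eight fields.
original = re.compile(r".+_.+_.+_.+_(.+)_.+_.+_.+")
# Anchored pattern: only the first five fields are constrained.
anchored = re.compile(r"^[^_]+_[^_]+_[^_]+_[^_]+_([^_]+)")


def extract(pattern, text):
    """Return capture group 1, or None when the pattern does not match."""
    match = pattern.search(text)
    return match.group(1) if match else None


original_results = [extract(original, row) for row in rows]
anchored_results = [extract(anchored, row) for row in rows]
split_results = [row.split("_")[4] for row in rows]  # zero-based index 4

print(original_results, anchored_results, split_results)
```

The anchored pattern and the split both ignore whatever follows the fifth field, so they do not depend on whether the string has seven or eight elements.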

Replace all occurrences of a substring between 2 characters using SQL

Input string: ["1189-13627273","89-13706681","118-13708388"]
Expected Output: ["14013627273","14013706681","14013708388"]
What I am trying to achieve is to replace the digits up to and including the '-' in each item with hard-coded text such as '140'.
SELECT replace(value_to_replace, '-', '140')
FROM (
VALUES ('1189-13627273-77'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
I found the right way to achieve that using the below regular expression.
SELECT REGEXP_REPLACE (string_to_change, '\\"[0-9]+\\-', '140')
You don't need a regexp for this. It's as easy as concatenating 140 with the substring after the - (or with the second part when you split by -):
select '140'||substring('89-13706681' from position('-' in '89-13706681')+1 for 1000)
select '140'||split_part('89-13706681','-',2)
Also, it's important to consider whether some values might not contain a -, and what the output should be in that case.
Use the regexp_replace(text, text, text) function, passing the pattern to match and the replacement string.
The first argument is the source string, the second is the POSIX regular expression, and the third is the replacement text.
Example
SELECT regexp_replace('1189-13627273', '.*-', '140');
Output: 14013627273
Sample data set query
SELECT regexp_replace(value_to_replace, '.*-', '140')
FROM (
VALUES ('1189-13627273'), ('89-13706681'), ('118-13708388')
) t(value_to_replace);
Caution! Because .* is greedy, the pattern .*- matches everything up to and including the last occurrence of -, and all of it is replaced with 140.
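The effect of that greedy match shows up on the value with two hyphens from the question's own VALUES list. The sketch below uses Python's re module as a stand-in for the database regex engine (an assumption) and compares '.*-' with a pattern restricted to the leading digits. The pattern '^[0-9]+-' is a suggested alternative and is not taken from the answers above.

```python
import re

values = ["1189-13627273-77", "89-13706681", "118-13708388"]

# Greedy: '.*-' runs to the LAST hyphen; count=1 replaces only the first match.
greedy = [re.sub(r".*-", "140", value, count=1) for value in values]

# Anchored: only the leading run of digits and the first hyphen are replaced.
leading = [re.sub(r"^[0-9]+-", "140", value) for value in values]

print(greedy)
print(leading)
```

For values with a single hyphen the two patterns agree. They differ only when a second hyphen appears later in the string.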

Trying to create a regular expression [ORACLE]

Hello,
I need help creating a regular expression that extracts just the file name and extension from the following paths.
/home/user/work/file1.dbf
/opt/user/file2.dfb
I am trying to create an expression in Oracle 12c to only output "file1.dbf" and "file2.dbf".
I am currently trying to do the regular expression on the next page and reading the following documentation.
Thanks in advance and I hope I have explained correctly.
You don't need a regular expression to do this. A combination of substr and instr would be sufficient.
instr(colname,'/',-1) returns the position of the last occurrence of / in the string. The substring after that position is the filename, as per the data shown.
The filter instr(colname,'/') > 0 excludes the rows that don't have a / in them.
select substr(colname,instr(colname,'/',-1)+1) as filename
from tablename
where instr(colname,'/') > 0
A regular expression for the same would be
select regexp_substr(colname,'(.*/|^)(.+)$',1,1,null,2) as filename
from tablename
(.*/|^) - All the characters up to the last / occurrence in the string, or the start of the string if there are no / characters.
(.+)$ - All the characters after the last / if it exists in the string, or the full string if / doesn't exist.
They are extracted as 2 groups and we are interested in the 2nd group, hence the argument 2 at the end of regexp_substr.
Read about the arguments to REGEXP_SUBSTR here.
An alternative regex approach would be to strip everything up to the last /:
with demo as
( select '/home/user/work/file1.dbf' as path from dual union all
select '/opt/user/file2.dfb' from dual )
select path
, regexp_replace(path,'^..*/') as filename
from demo;
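Both ideas in this thread (find the last / and take what follows, or strip everything up to the last /) can be checked quickly outside the database. The Python sketch below does that with hypothetical helper names. It illustrates the logic only and is not Oracle code.

```python
import re


def filename_by_position(path: str) -> str:
    """Take everything after the last '/'."""
    # rfind returns -1 when there is no '/', so the slice starts at 0
    # and the whole string is returned unchanged.
    return path[path.rfind("/") + 1:]


def filename_by_regex(path: str) -> str:
    """Strip everything up to and including the last '/'."""
    # '.*' is greedy, so '^.*/' extends to the last slash.
    return re.sub(r"^.*/", "", path)


paths = ["/home/user/work/file1.dbf", "/opt/user/file2.dfb", "plain.dbf"]
by_position = [filename_by_position(p) for p in paths]
by_regex = [filename_by_regex(p) for p in paths]
print(by_position, by_regex)
```

In this sketch a value without any / comes back unchanged from both helpers. Whether such rows should be returned or filtered out is a decision for the query.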

How do I extract a substring starting from end of string until the second 0 is encountered, oracle?

I have a table client_requests with the column number_request.
In the column number_request I have the following records:
20130000000008,
20130000000010,
20130000000503
I want to extract only the end of the string, without the zeros that precede it,
example:
for 20130000000008 I want to get 8
for 20130000000010 I want to get 10
for 20130000000503 I want to get 503
I think I need to use regexp_substr but do not know how.
One possible solution:
select replace('201500000010',regexp_substr('201500000010','.{4}0*'),'') from dual;
Essentially, I'm using the regexp_substr function to extract the unwanted digits (the first four characters plus the zeros that follow), then using the replace function to replace them with '', an empty string.
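The same idea can be checked against the three values from the question. The sketch below uses Python's re module as a stand-in for the database regex functions and does the removal in one anchored step. It assumes, as the answer's .{4} does, that the first four characters are a prefix to discard. The function name is hypothetical.

```python
import re


def strip_prefix_and_zeros(number_request: str) -> str:
    """Drop the first four characters and the run of zeros that follows."""
    return re.sub(r"^.{4}0*", "", number_request)


samples = ["20130000000008", "20130000000010", "20130000000503"]
results = [strip_prefix_and_zeros(sample) for sample in samples]
print(results)
```

One edge case to decide on: a value that is all zeros after the prefix comes back as an empty string.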