How to extract a substring from a column in Oracle? - sql

I have a column in Oracle that stores key/value conditions. For example:
Column_name
((key1="value1" AND key2='value1') OR (key1="value1" AND key2='value2'))
((key1="null" AND key2='value3') OR (key1="value1" AND key2='value4'))
I want to extract only the value of key2 that appears before the OR clause (every row of this column contains key2 twice).
Expected result:
Column_name | Value
((key1="value1" AND key2='value1') OR (key1="value1" AND key2='value2')) | value1
((key1="null" AND key2='value3') OR (key1="value1" AND key2='value4')) | value3
Can somebody give me a rough idea of how to do this?

Assuming we can describe your logic as extracting the first key2 value, we can try using REGEXP_SUBSTR with a capture group:
SELECT col, REGEXP_SUBSTR(col, 'key2=''(.*?)''', 1, 1, NULL, 1) AS key
FROM yourTable;
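To check the pattern outside the database, here is a minimal Python sketch of the same idea. It is only an approximation of the Oracle call (Python's re module is a different regex engine), and the helper name first_key2 is made up for illustration. The non-greedy (.*?) stops at the first closing quote, so only the key2 value before the OR is captured.

```python
import re

# Rough Python counterpart of REGEXP_SUBSTR(col, 'key2=''(.*?)''', 1, 1, NULL, 1):
# search from the start, take the first match, return capture group 1.
KEY2_PATTERN = re.compile(r"key2='(.*?)'")


def first_key2(value):
    """Return the first single-quoted key2 value, or None when nothing matches."""
    match = KEY2_PATTERN.search(value)
    return match.group(1) if match else None


# The two sample rows from the question.
rows = [
    """((key1="value1" AND key2='value1') OR (key1="value1" AND key2='value2'))""",
    """((key1="null" AND key2='value3') OR (key1="value1" AND key2='value4'))""",
]

for row in rows:
    print(first_key2(row))
```

Like REGEXP_SUBSTR, which returns NULL when there is no match, the sketch returns None for rows without a key2.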

Related

Parsing a string in postgresql

Let's say I have a column of datatype varchar that contains values similar to these:
'My unique id [john3 UID=123]'
'My unique id [henry2 UID=1234]'
'My unique id [tom2 UID=56]'
'My unique id [jerry25 UID=98765]'
How can I get only the numbers after UID= in these strings using PostgreSQL?
For example, from the string 'My unique id [john3 UID=123]' I want only 123; similarly, from 'My unique id [jerry25 UID=98765]' I want only 98765.
Is there a way to do this in PostgreSQL?
We can use REGEXP_REPLACE here:
SELECT col, REGEXP_REPLACE(col, '.*\[\w+ UID=(\d+)\].*$', '\1') AS uid
FROM yourTable;
Edit:
If a given value might not match the above pattern, and you want to return the entire original value in that case, we can use a CASE expression:
SELECT col,
CASE WHEN col LIKE '%[%UID=%]%'
THEN REGEXP_REPLACE(col, '.*\[\w+ UID=(\d+)\].*$', '\1')
ELSE col END AS uid
FROM yourTable;
You can also use regexp_matches with a shorter regular expression (note that it returns a text array rather than a plain string):
select regexp_matches(col, '(?<=UID\=)\d+') from t;
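As a quick way to try the pattern on sample values, here is a small Python sketch of the "extract the UID, otherwise keep the original value" behaviour described above. It mirrors the logic, not the PostgreSQL functions themselves, and extract_uid is a hypothetical helper name.

```python
import re

# Same shape as the bracketed part of the SQL pattern: "[<word> UID=<digits>]".
UID_PATTERN = re.compile(r"\[\w+ UID=(\d+)\]")


def extract_uid(value):
    """Return the digits after UID=, or the unchanged value when the pattern is absent."""
    match = UID_PATTERN.search(value)
    return match.group(1) if match else value


samples = [
    "My unique id [john3 UID=123]",
    "My unique id [jerry25 UID=98765]",
    "no uid in this value",
]

for sample in samples:
    print(extract_uid(sample))
```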

How to split string based on column length and insert into table

I have a string that I need to split and create table from it.
00001 00000009716496000000000331001700000115200000000000
I know the exact length of each column:
Col1 = 5
Col2 = 7
Col3 = 23
etc...
I need something like this (empty values are NULLs)
Can you direct me to the right way of doing that?
Use substring():
select substring(col, 1, 5) as col1,
substring(col, 6, 7) as col2,
. . .
You can also use a computed column to improve performance (see https://www.sqlservertutorial.net/sql-server-basics/sql-server-computed-columns/).
Use the function below to fill the column:
SUBSTRING(string, start, length)
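The start positions follow from the running total of the column widths, which is easy to get wrong by hand. Below is a minimal Python sketch of that arithmetic, assuming only the three widths given in the question (5, 7, 23); the sample record and the split_fixed helper are made up for illustration. Note that SUBSTRING is 1-based while Python slices are 0-based.

```python
from itertools import accumulate

# Col1, Col2, Col3 widths from the question; the remaining widths are not given there.
WIDTHS = [5, 7, 23]


def split_fixed(record, widths=WIDTHS):
    """Slice record into consecutive fixed-width fields; blank fields become None."""
    ends = list(accumulate(widths))   # 5, 12, 35
    starts = [0] + ends[:-1]          # 0, 5, 12  (SQL start positions are these + 1)
    fields = []
    for start, end in zip(starts, ends):
        field = record[start:end].strip()  # strip() also drops padding around a value
        fields.append(field or None)
    return fields


# Hypothetical record: 5 digits, 7 blanks (an empty Col2), then 23 digits.
sample = "00001" + " " * 7 + "0" * 7 + "9716496" + "0" * 9

print(split_fixed(sample))
```

So Col2 starts at SQL position 6 and Col3 at position 13, and an all-blank field maps to None, standing in for NULL.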

How do I remove a string from a column in SQL?

Here is my column :
column
abc1234
abc5678
abc4567
Now I need to remove the abc only from the column. Please help me write a query.
You might want to use REGEXP_REPLACE here:
UPDATE yourTable
SET col = REGEXP_REPLACE(col, '^abc', '')
WHERE col LIKE 'abc%';
If you don't care about the particular position of abc, and accept removing all occurrences of it anywhere, then we can do without regex:
UPDATE yourTable
SET col = OREPLACE(col, 'abc', '')
WHERE col LIKE 'abc%';
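The difference between the two updates is easiest to see on a value that contains abc more than once. Here is a small Python sketch of both behaviours; it is not the database functions themselves, and the helper names are hypothetical.

```python
import re


def strip_prefix(value, prefix="abc"):
    """Remove prefix only when it appears at the start (like the '^abc' regex)."""
    return re.sub("^" + re.escape(prefix), "", value)


def strip_all(value, token="abc"):
    """Remove every occurrence of token, wherever it appears."""
    return value.replace(token, "")


for value in ["abc1234", "abc12abc"]:
    print(value, strip_prefix(value), strip_all(value))
```

For the sample data in the question both approaches give the same result; they only differ once abc can also appear later in the value.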

Get group maxima from combined strings

I have a table with a column code containing multiple pieces of data like this:
001/2017/TT/000001
001/2017/TT/000002
001/2017/TN/000003
001/2017/TN/000001
001/2017/TN/000002
001/2016/TT/000001
001/2016/TT/000002
001/2016/TT/000001
002/2016/TT/000002
There are 4 items in 001/2016/TT/000001: 001, 2016, TT and 000001.
How can I extract the max for every group formed by the first 3 items? The result I want is this:
001/2017/TT/000002
001/2017/TN/000003
001/2016/TT/000002
002/2016/TT/000002
Edit
The subfield separator is /, and the length of subfields can vary.
I use PostgreSQL 9.3.
Obviously, you should normalize the table and split the combined string into 4 columns with proper data types. The function split_part() is the tool of choice if the separator '/' is constant in your string and the length of the subfields can vary.
CREATE TABLE tbl_better AS
SELECT split_part(code, '/', 1)::int AS col_1 -- better names?
, split_part(code, '/', 2)::int AS col_2
, split_part(code, '/', 3) AS col_3 -- text?
, split_part(code, '/', 4)::int AS col_4
FROM tbl_bad
ORDER BY 1,2,3,4; -- optionally cluster data.
Then the task is trivial:
SELECT col_1, col_2, col_3, max(col_4) AS max_nr
FROM tbl_better
GROUP BY 1, 2, 3;
Related:
Split comma separated column data into additional columns
Of course, you can do it on the fly, too. For varying subfield length you could use substring() with a regular expression like this:
SELECT max(substring(code, '([^/]*)$')) AS max_nr
FROM tbl_bad
GROUP BY substring(code, '^(.*)/');
Related (with basic explanation for regexp pattern):
Filter strings with regex before casting to numeric
Or to get only the complete string as result:
SELECT DISTINCT ON (substring(code, '^(.*)/'))
code
FROM tbl_bad
ORDER BY substring(code, '^(.*)/'), code DESC;
About DISTINCT ON:
Select first row in each GROUP BY group?
Be aware that data items cast to a suitable type may behave differently from their string representation. The max of 900001 and 1000001 is 900001 for text and 1000001 for integer ...
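A short Python sketch of both points, in case it helps to see them run: the text-versus-integer comparison, and the "max per prefix" grouping on the sample data from the question. The group_max helper is made up for illustration, and it casts the last subfield to an integer, which is an assumption about the data.

```python
def group_max(codes):
    """Map each prefix (everything before the last '/') to its largest numeric suffix."""
    best = {}
    for code in codes:
        prefix, _, suffix = code.rpartition("/")
        number = int(suffix)  # compare as integers, not as text
        if prefix not in best or number > best[prefix]:
            best[prefix] = number
    return best


codes = [
    "001/2017/TT/000001", "001/2017/TT/000002", "001/2017/TN/000003",
    "001/2017/TN/000001", "001/2017/TN/000002", "001/2016/TT/000001",
    "001/2016/TT/000002", "001/2016/TT/000001", "002/2016/TT/000002",
]

print(max(["900001", "1000001"]))               # text comparison picks 900001
print(max(int(s) for s in ["900001", "1000001"]))  # integer comparison picks 1000001
print(group_max(codes))
```

With zero-padded suffixes of equal length, as in the sample, text and integer ordering agree; they diverge as soon as the lengths differ.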
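Use the LEFT and RIGHT functions. Note that this only works while the subfields have the fixed widths shown in the sample data (12 characters for the prefix, 6 for the number).
SELECT MAX(RIGHT(code,6)) AS MAX_CODE
FROM yourtable
GROUP BY LEFT(code,12);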
Check this out, it may be helpful:
select
-- one row per group of the first 3 items; ordering by tab[4] desc keeps the largest number
distinct on (tab[1],tab[2],tab[3]) tab[4],tab[3],tab[2],tab[1]
from
(
select
string_to_array(exe.x,'/') as tab,
exe.x
from
(
select
unnest
(
array
['001/2017/TT/000001',
'001/2017/TT/000002',
'001/2017/TN/000003',
'001/2017/TN/000001',
'001/2017/TN/000002',
'001/2016/TT/000001',
'001/2016/TT/000002',
'001/2016/TT/000001',
'002/2016/TT/000002']
) as x
) exe
) exe2
order by tab[1],tab[2],tab[3],tab[4] desc;

pgsql parse string to get a string after certain position

I have a table column that has data like
NA_PTR_51000_LAT_CO-BOGOTA_S_A
NA_PTR_51000_LAT_COL_M_A
NA_PTR_51000_LAT_COL_S_A
NA_PTR_51000_LAT_COL_S_B
NA_PTR_51000_LAT_MX-MC_L_A
NA_PTR_51000_LAT_MX-MTY_M_A
I want to parse each column value so that I get the values in column_B. Thank you.
COLUMN_A COLUMN_B
NA_PTR_51000_LAT_CO-BOGOTA_S_A CO-BOGOTA
NA_PTR_51000_LAT_COL_M_A COL
NA_PTR_51000_LAT_COL_S_A COL
NA_PTR_51000_LAT_COL_S_B COL
NA_PTR_51000_LAT_MX-MC_L_A MX-MC
NA_PTR_51000_LAT_MX-MTY_M_A MX-MTY
I'm not sure of the PostgreSQL syntax, and I can't get SQL Fiddle to accept the schema build...
substring and length may vary...
Select Column_A, substr(columN_A,18,length(columN_A)-17-4) from tableName
Ok how about this then:
http://sqlfiddle.com/#!15/ad0dd/56/0
Select column_A, b
from (
Select Column_A, b, row_number() OVER (ORDER BY column_A) AS k
FROM (
SELECT Column_A
, regexp_split_to_table(Column_A, '_') b
FROM test
) I
) X
Where k%7=5
Inside out:
The innermost select simply splits the data into multiple rows on _.
The middle select adds a row number so that we can use the mod operator to find every row whose number leaves a remainder of 5.
This ASSUMES that the section of data you're after is always the 5th segment AND that there are always 7 segments...
Use regexp_matches() with a search pattern like 'NA_PTR_51000_LAT_([^_]+)_'
This returns everything after NA_PTR_51000_LAT_ up to the next underscore, which is the value you are looking for. A greedy (.+) would instead run up to the last underscore in the string.
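Whether the capture stops at the next underscore or the last one depends on the pattern, so it is worth checking against the sample values. Here is a minimal Python sketch comparing a greedy (.+) with a negated class ([^_]+); Python's regex engine is not PostgreSQL's, but both treat these two constructs the same way. The segment helper is a hypothetical name.

```python
import re

PREFIX = "NA_PTR_51000_LAT_"
GREEDY = re.compile(PREFIX + r"(.+)_")          # runs up to the LAST underscore
NEXT_SEGMENT = re.compile(PREFIX + r"([^_]+)_")  # stops at the NEXT underscore


def segment(value, pattern):
    """Return capture group 1 of pattern in value, or None when it does not match."""
    match = pattern.search(value)
    return match.group(1) if match else None


values = [
    "NA_PTR_51000_LAT_CO-BOGOTA_S_A",
    "NA_PTR_51000_LAT_COL_M_A",
    "NA_PTR_51000_LAT_MX-MTY_M_A",
]

for value in values:
    print(segment(value, GREEDY), segment(value, NEXT_SEGMENT))
```

Only the negated-class version yields CO-BOGOTA, COL and MX-MTY; the greedy version keeps the following segment as well (for example CO-BOGOTA_S).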