Extract city from the address column - sql

What happens if ship 3 and 4 are null but ship2 is not null? In that case ship2 should be the city and state.
Here is sample data in the picture.

I prefer the old-fashioned SUBSTR + INSTR combination. Compared with Gordon's and Barbaros' suggestions it seems somewhat better, because their queries also return strings that don't contain a comma at all, while the OP says
extract city from 1 letter until 1 comma
Here's a comparison:
SQL> with tab (addr) as
       (
         select 'RALEIGH, NC 27604-3229' from dual union all
         select 'SUITE A' from dual union all
         select 'COEUR D ALENE, ID 83815-8652' from dual union all
         select '*O/S CITY LIMITS*' from dual
       )
     select addr,
            substr(addr, 1, instr(addr, ',') - 1) littlefoot,
            --
            regexp_substr(addr, '[^,]+', 1, 1) gordon,
            regexp_substr(addr,'[^,]+') barbaros
     from tab;
ADDR                         LITTLEFOOT      GORDON               BARBAROS
---------------------------- --------------- -------------------- --------------------
RALEIGH, NC 27604-3229       RALEIGH         RALEIGH              RALEIGH
SUITE A                                      SUITE A              SUITE A
COEUR D ALENE, ID 83815-8652 COEUR D ALENE   COEUR D ALENE        COEUR D ALENE
*O/S CITY LIMITS*                            *O/S CITY LIMITS*    *O/S CITY LIMITS*
SQL>
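If the regular-expression version should also return NULL for rows without a comma, one possible sketch is to require the comma in the pattern and return only the captured group. This assumes Oracle 11g or later, where REGEXP_SUBSTR accepts a sub-expression argument; the alias city_or_null is made up for illustration.

```sql
with tab (addr) as
  (
    select 'RALEIGH, NC 27604-3229' from dual union all
    select 'SUITE A'                from dual
  )
select addr,
       -- '^([^,]+),' matches only when a comma follows the leading text;
       -- the last argument (1) returns just the first parenthesised group,
       -- so rows without a comma yield NULL
       regexp_substr(addr, '^([^,]+),', 1, 1, null, 1) as city_or_null
from tab;
```

With this pattern 'SUITE A' gives NULL, the same behaviour as the SUBSTR + INSTR version.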

If you want the part before the first comma, you can use regexp_substr():
select regexp_substr(addr, '[^,]+', 1, 1)

Just use regexp_substr with the [^,]+ pattern, as below:
select regexp_substr(address,'[^,]+') as city
from tab;
Or, alternatively, with the sample data supplied through a WITH clause:
with tab as
(
select 'RALEIGH, NC 27604-3229' as str from dual union all
select 'SALINAS, CA 93901' from dual union all
select 'DEPEW, NY 14043-2603' from dual
)
select regexp_substr(str,'[^,]+') as city
from tab;

If you don't want to use regexp, you can just use:
select substr(city,1,(instr(city,',')-1))
from mytable;
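Note that INSTR returns 0 when there is no comma, so the SUBSTR length becomes -1 and the result is NULL. If you would rather keep the whole value in that case, a minimal sketch could look like this (the sample rows and the alias city_name are assumptions for illustration):

```sql
with mytable (city) as
  (
    select 'SALINAS, CA 93901' from dual union all
    select 'SUITE A'           from dual
  )
select city,
       -- NULLIF turns "comma not found" (0) into NULL,
       -- NVL then falls back to one position past the end of the string
       substr(city, 1,
              nvl(nullif(instr(city, ','), 0), length(city) + 1) - 1) as city_name
from mytable;
```

Whether NULL or the full string is the better answer for comma-less rows depends on what the data means, so treat this as one option rather than a fix.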

Related

how to get the number after '-' in Oracle

I have some strings in my table, such as 1101-1, 1101-2, 1101-10, 1101-11, pulse, shock, abc, 1104-2, 1104-11, 2201-1, 2202-4. I want to sort them like this:
1101-1
1101-2
1101-10
1101-11
1104-2
1104-11
2201-1
2202-4
abc
pulse
shock
But I can't get the sort order right. Below is my code:
select column from table
order by regexp_substr(column, '^\D*') nulls first,
to_number(substr(regexp_substr(column, '\d+'),1,4)) asc
Sort numbers as numbers:
first the ones in front of the hyphen (line #16),
then the ones after it (line #17),
then the rest (line #18).
Mind the to_number function! Without it, you'll be sorting strings and get the wrong result.
SQL> with test (col) as
2 ( select '1101-1' from dual union all
3 select '1101-2' from dual union all
4 select '1101-10' from dual union all
5 select '1101-11' from dual union all
6 select 'pulse' from dual union all
7 select 'shock' from dual union all
8 select 'abc' from dual union all
9 select '1104-2' from dual union all
10 select '1104-11' from dual union all
11 select '2201-1' from dual union all
12 select '2202-4' from dual
13 )
14 select col
15 from test
16 order by to_number(regexp_substr(col, '^\d+')),
17 to_number(regexp_substr(col, '\d+$')),
18 col;
COL
-------
1101-1
1101-2
1101-10
1101-11
1104-2
1104-11
2201-1
2202-4
abc
pulse
shock
11 rows selected.
SQL>
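To see why to_number matters, here is a small sketch that sorts the part after the hyphen as text. The aliases suffix_as_text and suffix_as_number are only for illustration.

```sql
with test (col) as
  (
    select '1101-2'  from dual union all
    select '1101-10' from dual
  )
select col,
       regexp_substr(col, '\d+$')            as suffix_as_text,
       to_number(regexp_substr(col, '\d+$')) as suffix_as_number
from test
-- string comparison: '10' sorts before '2' because '1' < '2'
order by regexp_substr(col, '\d+$');
```

Wrapping the ORDER BY expression in to_number puts 1101-2 before 1101-10, as intended.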
For your examples, this should do:
order by regexp_substr(column, '^[^-]+'), -- everything before the hyphen
length(column), -- Oracle has LENGTH, not LEN
column
To get the number after '-' specifically:
with ttt (col) as (
  select cast(column_value as varchar2(10))
  from table(sys.dbms_debug_vc2coll
         ( '1101-1'
         , '1101-2'
         , '1101-10'
         , '1101-11'
         , '1104-2'
         , '1104-11'
         , '2201-1'
         , '2202-4'
         , 'abc'
         , 'pulse'
         , 'shock'
         ))
)
select col
     , regexp_substr(col, '(^\d+-)(\d+)', 1, 1, '', 2) as second_str
from ttt
order by second_str, col;
COL        SECOND_STR
---------- ----------
1101-1     1
2201-1     1
1101-10    10
1101-11    11
1104-11    11
1101-2     2
1104-2     2
2202-4     4
abc
pulse
shock
11 rows selected
This treats the text string as two values, (^\d+-) followed by (\d+), and takes the second substring (the final '2' parameter). As only positional parameters are allowed for built-in SQL functions, you also have to specify position (1), occurrence (1) and match param (null, as we don't care about case etc).
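As a quick illustration of that last parameter, the sketch below runs the same pattern against a single literal and asks for each group in turn. The aliases are hypothetical.

```sql
select regexp_substr('1101-10', '(^\d+-)(\d+)', 1, 1, null, 1) as first_group,  -- 1101-
       regexp_substr('1101-10', '(^\d+-)(\d+)', 1, 1, null, 2) as second_group, -- 10
       regexp_substr('1101-10', '(^\d+-)(\d+)')                as whole_match   -- 1101-10
from dual;
```

The result is still a string, so wrap it in to_number if it is meant to be sorted or compared numerically.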

How can I get a natural numeric sort order in Oracle?

I have a column with a letter followed by either numbers or letters:
ID_Col
------
S001
S1001
S090
SV911
SV800
Sfoofo
Szap
Sbart
How can I order it naturally with the numbers first (ASC) then the letters alphabetically? If it starts with S and the remaining characters are numbers, sort by the numbers. Else, sort by the letter. So SV911 should be sorted at the end with the letters since it also contains a V. E.g.
ID_Col
------
S001
S090
S1001
Sbart
Sfoofo
SV800
SV911
Szap
I see this solution uses regex combined with the TO_NUMBER function, but since I also have entries with no numbers this doesn't seem to work for me. I tried the expression:
ORDER BY
TO_NUMBER(REGEXP_SUBSTR(ID_Col, '^S\d+$')),
ID_Col
/* gives ORA-01722: invalid number */
Would this help?
SQL> with test (col) as
       (select 'S001' from dual union all
        select 'S1001' from dual union all
        select 'S090' from dual union all
        select 'SV911' from dual union all
        select 'SV800' from dual union all
        select 'Sfoofo' from dual union all
        select 'Szap' from dual union all
        select 'Sbart' from dual
       )
     select col
     from test
     order by substr(col, 1, 1),
              case when regexp_like(col, '^[[:alpha:]]\d') then to_number(regexp_substr(col, '\d+$')) end,
              substr(col, 2);
COL
------
S001
S090
S1001
Sbart
Sfoofo
SV800
SV911
Szap
8 rows selected.
SQL>
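As for the ORA-01722 in the question: REGEXP_SUBSTR(ID_Col, '^S\d+$') returns the whole match, including the leading S, and TO_NUMBER cannot convert 'S001'. A possible alternative, sketched here under the assumption of Oracle 11g or later, is to capture only the digits with a sub-expression. The upper() call is my own addition so that upper-case and lower-case letters sort together regardless of session NLS settings.

```sql
with test (id_col) as
  (
    select 'S001'   from dual union all
    select 'S1001'  from dual union all
    select 'S090'   from dual union all
    select 'SV911'  from dual union all
    select 'SV800'  from dual union all
    select 'Sfoofo' from dual union all
    select 'Szap'   from dual union all
    select 'Sbart'  from dual
  )
select id_col
from test
order by
  -- digits only when the value is exactly S followed by digits, NULL otherwise;
  -- NULLs sort last in ascending order, so the numeric rows come first
  to_number(regexp_substr(id_col, '^S(\d+)$', 1, 1, null, 1)),
  upper(id_col);
```

With the sample rows this should give S001, S090, S1001 followed by Sbart, Sfoofo, SV800, SV911, Szap.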

How to trim out letter in the column

I don't know an effective way to trim letters from a name. For example, the f_name column contains Jenny, Johnny, Doe, Ken, Smith.
I want to trim these names so that each consists of only its first 2 letters, giving Je, Jo, Do, Ke, Sm as the output for the new column.
But the names don't all have the same number of letters: Johnny has 6 letters and John has 4.
Is there an effective way to trim names of uneven length without counting the characters in f_name and writing a condition for every length, like the one below?
CASE WHEN LENGTH(f_name) > 4 THEN LTRIM(f_name, 2)
For Oracle use substr():
with data (f_name) as (
select 'Jenny' from dual union all
select 'Johnny' from dual union all
select 'Doe' from dual union all
select 'Ken' from dual union all
select 'Smith' from dual
)
select substr(f_name, 1, 2)
from data
Returns:
SUBSTR(F_NAME,1,2)
------------------
Je
Jo
Do
Ke
Sm
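No length check is needed, because SUBSTR simply returns whatever characters are available when the string is shorter than the requested length. A small sketch, with the names 'Al' and 'J' invented for illustration:

```sql
with data (f_name) as (
  select 'Johnny' from dual union all
  select 'Al'     from dual union all
  select 'J'      from dual
)
select f_name,
       substr(f_name, 1, 2) as first_two   -- Jo, Al, J
from data;
```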
Use SUBSTR:
CASE WHEN LENGTH(f_name) > 4 THEN SUBSTR(f_name,1, 2)
If you want to get the least acronym across all names, you may write something like:
with s as (select level as lvl from dual connect by level < (select max(length(f_name)) from your_table))
select f_name,
       max(sub_f_name) keep (dense_rank first order by cnt, lvl desc) as least_acronym
from (select t.f_name
           , substr(t.f_name, -s.lvl) as sub_f_name
           , s.lvl
           , count(*) over (partition by substr(t.f_name, -s.lvl)) as cnt
      from your_table t
         , s)
group by f_name
NB: just an idea, not tested yet.

SQL Query to show string before a dash

I would like to execute a query that will only show all the string before dash in the particular field.
For example:
Original data: AB-123
After query: AB
You can use substr:
SQL> WITH DATA AS (SELECT 'AB-123' txt FROM dual)
     SELECT substr(txt, 1, instr(txt, '-') - 1)
     FROM DATA;
SUBSTR(TXT,1,INSTR(TXT,'-')-1)
------------------------------
AB
or regexp_substr (10g+):
SQL> WITH DATA AS (SELECT 'AB-123' txt FROM dual)
     SELECT regexp_substr(txt, '^[^-]*')
     FROM DATA;
REGEXP_SUBSTR(TXT,'^[^-]*')
---------------------------
AB
You can use regexp_replace.
For example
WITH DATA AS (
SELECT 'AB-123' as text FROM dual
UNION ALL
SELECT 'ABC123' as text FROM dual
)
SELECT
regexp_replace(d.text, '-.*$', '') as result
FROM DATA d;
will lead to
RESULT
------------------------------------------------------
AB
ABC123
I found this simple
SELECT distinct
regexp_replace(d.pyid, '-.*$', '') as result
FROM schema.table d;
pyID column contains ABC-123, DEF-3454
SQL Result:
ABC
DEF
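The three approaches differ when the value contains no dash at all. The sketch below puts them side by side on one row with a dash and one without; the aliases are made up for illustration.

```sql
with data (txt) as (
  select 'AB-123' from dual union all
  select 'ABC123' from dual
)
select txt,
       substr(txt, 1, instr(txt, '-') - 1) as via_substr,         -- AB, then NULL
       regexp_substr(txt, '^[^-]*')        as via_regexp_substr,  -- AB, then ABC123
       regexp_replace(txt, '-.*$', '')     as via_regexp_replace  -- AB, then ABC123
from data;
```

So SUBSTR + INSTR is the one to pick if a missing dash should give NULL, and either regular-expression version if the whole value should be kept.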

SQL function REGEXP_SUBSTR: Regular Expression how to get the content between two characters but not include them

For these strings
RSLR_AIRL19_ID3454_T20030913091226
RSLR_AIRL19_ID3122454_T20030913091226
RSLR_AIRL19_ID34_T20030913091226
How do I get the number after ID?
Or how do I get the content between two characters without including them?
I used '/\_ID([^_]+)/' and got matches like Array ( [0] => _ID3454 [1] => 3454 )
Is this the right way?
To extract the number after ID, you could write a query like this:
SQL> with t1 as(
       select 'RSLR_AIRL19_ID3454_T20030913091226' as col from dual union all
       select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
       select 'RSLR_AIRL19_ID34_T20030913091226' from dual
     )
     select regexp_substr(col, '^([[:alnum:]]+_){2}ID([[:digit:]]+)_([[:alnum:]]+){1}$', 1, 1, 'i', 2) as ID
     from t1
     ;
ID
-------------
3454
3122454
34
Or, if you want to extract the digits from the first occurrence of the pattern without verifying that the entire string matches a specific format:
SQL> with t1 as(
       select 'RSLR_AI_RL19_ID3454_T20030913091226' as col from dual union all
       select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual union all
       select 'RSLR_AIRL19_ID34_T20030913091226' from dual
     )
     select regexp_substr(col, 'ID([[:digit:]]+)', 1, 1, 'i', 1) as ID
     from t1
     ;
ID
--------------
3454
3122454
34
With PCRE and Perl engines:
ID\K\d+
NOTE
\K "restarts" the match, so everything matched before it is dropped from the result.
See http://www.phpfreaks.com/blog/pcre-regex-spotlight-k (PHP uses PCRE)
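Oracle's regular expressions do not, as far as I know, support \K, so in REGEXP_SUBSTR the usual substitute is a capture group plus the sub-expression argument. Here is a sketch of the pattern from the question carried over to Oracle (assuming 11g or later; the alias id_value is hypothetical):

```sql
with t1 (col) as (
  select 'RSLR_AIRL19_ID3454_T20030913091226'    from dual union all
  select 'RSLR_AIRL19_ID3122454_T20030913091226' from dual
)
select col,
       -- match "_ID" followed by everything up to the next underscore,
       -- and return only the parenthesised part (sub-expression 1)
       regexp_substr(col, '_ID([^_]+)', 1, 1, null, 1) as id_value  -- 3454, 3122454
from t1;
```

This is the same idea as taking element [1] of the PHP match array in the question.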