Using REGEXP_SUBSTR in Oracle to select a string up to the last occurrence of a space within n characters - sql

We have an issue where a column in our Oracle database has a longer character length than a field in another system.
Therefore I am trying to use case statements along with substr to split strings that are more than 40 characters long. My case statements so far do what I want, in that they leave the first 40 characters of a string in column_a and put the remainder of the string in column_b.
However, the problem that I have is that by just using substr, the strings are being split midway through words.
So I was wondering if anybody knew of a couple of regular expressions that I could use with regexp_substr that will:
select a string UP TO the last space within 40 characters - for column_a
select a string AFTER the last space within 40 characters - for column_b
These are the case statements that I have so far with substr:
CASE WHEN Length(column_a) > 10 THEN SubStr(column_a, 0, 40) END AS column_a,
CASE WHEN Length(column_a) > 40 THEN SubStr(column_a, 41) END AS column_b
I am not familiar with regular expressions at all and so any help would be very much appreciated!

I've solved with instr/substr:
select substr(column_a,1,instr(substr(column_a,1,40), ' ', -1 )) column1,
substr(column_a,instr(substr(column_a,1,40), ' ', -1 )+1, 40) column2
from table1
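The edge cases of this split are easier to check outside the database: a value that already fits in 40 characters, or a value with no space in its first 40 characters. The following is a minimal Python sketch of the intended behaviour, not a translation of the query above. The function name `split_at_last_space` and the hard-cut fallback are assumptions made for illustration.

```python
def split_at_last_space(text, limit=40):
    """Split text into (head, tail) so that head has at most `limit` characters.

    Assumption: if there is no space to break on, fall back to a hard cut.
    """
    if text is None or len(text) <= limit:
        return text, None          # short enough: nothing to move to column_b
    # Look for the last space at index 0..limit; a space sitting right after
    # the limit-th character still leaves a head of exactly `limit` characters.
    cut = text.rfind(" ", 0, limit + 1)
    if cut == -1:
        return text[:limit], text[limit:]   # no space at all: hard cut
    return text[:cut], text[cut + 1:]       # drop the separating space


sample = ("One Hundred Sixty-Nine Thousand Eight Hundred "
          "Seventy-Four Dollars And Nine Cents")
head, tail = split_at_last_space(sample)
print(repr(head), len(head))
print(repr(tail))
```

Searching one position past the limit is the same idea as the `r_len + 1` used in the recursive query further down.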

A very similar problem was posted today on OTN. https://community.oracle.com/message/13928697#13928697
I posted a general solution, which will cover the problem proposed here as well. It may come in handy if there are similar needs in the future.
For the problem posted here on SO, the row_lengths table will have only one row, with r_id = 1 and r_len = 40. For demo purposes I am showing below an input_strings different from what I used on OTN.
Setup:
create table input_strings (str_id number, txt varchar2(500));
insert into input_strings values (1,
'One Hundred Sixty-Nine Thousand Eight Hundred Seventy-Four Dollars And Nine Cents');
insert into input_strings values (2, null);
insert into input_strings values (3, 'Mathguy rules');
create table row_lengths (r_id number, r_len number);
insert into row_lengths values (1, 40);
commit;
select * from input_strings;
STR_ID TXT
------- ---------------------------------------------------------------------------------
1 One Hundred Sixty-Nine Thousand Eight Hundred Seventy-Four Dollars And Nine Cents
2
3 Mathguy rules
3 rows selected
select * from row_lengths;
R_ID R_LEN
------- ----------
1 40
1 row selected.
Query and output: (NOTE: I include token length to verify that the first token is no more than 40 characters. OP did not answer if the SECOND token can be more than 40 characters; if it can't, one can add rows to the row_lengths table, perhaps with r_len = 40 for every row.)
with
r ( r_id, r_len ) as (
select r_id , r_len from row_lengths union all
select max(r_id) + 1, 4000 from row_lengths union all
select max(r_id) + 2, null from row_lengths
),
b (str_id, str, r_id, token, prev_pos, new_pos) as (
select str_id, txt || ' ', -1, null, null, 0
from input_strings
union all
select b.str_id, b.str, b.r_id + 1,
substr(str, prev_pos + 1, new_pos - prev_pos - 1),
b.new_pos,
new_pos + instr(substr(b.str, b.new_pos + 1, r.r_len + 1) , ' ', -1)
from b join r
on b.r_id + 2 = r.r_id
)
select str_id, r_id, token, nvl(length(token), 0) as len
from b
where r_id > 0
order by str_id, r_id;
STR_ID R_ID TOKEN LEN
------- ------- ------------------------------------------------ -------
1 1 One Hundred Sixty-Nine Thousand Eight 37
1 2 Hundred Seventy-Four Dollars And Nine Cents 43
2 1 0
2 2 0
3 1 Mathguy rules 13
3 2 0
6 rows selected.
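To make the recursion easier to follow, here is a rough Python sketch of the same idea: walk through a list of row lengths, cut each token at the last space that fits, and put whatever remains in one final token. The name `split_by_row_lengths` is hypothetical, and the hard cut for a word longer than its row is an assumption that the SQL above does not make.

```python
def split_by_row_lengths(text, row_lengths):
    """Break text into len(row_lengths) + 1 tokens on word boundaries.

    Token i holds at most row_lengths[i] characters; the final token takes
    whatever is left. Assumption: a word longer than its row is hard-cut.
    """
    rest = text or ""              # treat NULL/None like an empty string
    tokens = []
    for limit in row_lengths:
        if len(rest) <= limit:
            head, rest = rest, ""
        else:
            cut = rest.rfind(" ", 0, limit + 1)   # last space at index <= limit
            if cut == -1:
                head, rest = rest[:limit], rest[limit:]
            else:
                head, rest = rest[:cut], rest[cut + 1:]
        tokens.append(head)
    tokens.append(rest)            # remainder, possibly empty
    return tokens


amount = ("One Hundred Sixty-Nine Thousand Eight Hundred "
          "Seventy-Four Dollars And Nine Cents")
for token in split_by_row_lengths(amount, [40]):
    print(len(token), repr(token))
```

Passing `[40, 40]` instead of `[40]` corresponds to adding a second row to row_lengths, which caps the second token at 40 characters as well.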

Related

Oracle SQL: Merging multiple columns into 1 with conditions

I am new to SQL and don't really have a lot of experience. I need help on this where I have Table A and I want to write a SQL query to generate the result. Any help would be greatly appreciated! Thanks!
Table A
Name     Capacity A  Capacity B  Capacity C
-------  ----------  ----------  ----------
Plant 1  10                      20
Plant 2              10
Result Table
Name     Type  Capacity
-------  ----  --------
Plant 1  A,C   10,20
Plant 2  B     10
I know the listagg function might be able to combine a few columns into one, but is there any way for me to generate the additional column 'Type' so that it's smart enough to know which column I am taking my value from? Preferably without creating any additional views/tables.
Use NVL2 (or CASE) and concatenate the columns and trim any excess trailing commas:
SELECT Name,
RTRIM(
NVL2(CapacityA,'A,',NULL)
||NVL2(CapacityB,'B,',NULL)
||NVL2(CapacityC,'C',NULL),
','
) AS type,
RTRIM(
NVL2(CapacityA,CapacityA||',',NULL)
||NVL2(CapacityB,CapacityB||',',NULL)
||NVL2(CapacityC,CapacityC,NULL),
','
) AS capacity
FROM table_name;
Which, for the sample data:
CREATE TABLE table_name (name, capacitya, capacityb, capacityc) AS
SELECT 'Plant1', 10, NULL, 20 FROM DUAL UNION ALL
SELECT 'Plant2', NULL, 10, NULL FROM DUAL;
Outputs:
NAME    TYPE  CAPACITY
------  ----  --------
Plant1  A,C   10,20
Plant2  B     10
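The logic behind the NVL2 concatenation is simply "keep the label and the value of every non-null column, in column order, and join them with commas". A small Python sketch of that rule follows; the function name `merge_capacities` and the dictionary layout of a row are assumptions for illustration only.

```python
def merge_capacities(row):
    """Return (types, capacities) as comma-separated strings.

    `row` maps a capacity label ("A", "B", "C") to its value or None.
    Only non-null entries are kept, in the order the labels appear.
    """
    present = [(label, value) for label, value in row.items() if value is not None]
    types = ",".join(label for label, _ in present)
    capacities = ",".join(str(value) for _, value in present)
    return types, capacities


print(merge_capacities({"A": 10, "B": None, "C": 20}))
print(merge_capacities({"A": None, "B": 10, "C": None}))
```

Because the null entries are filtered before joining, no trailing or doubled commas have to be trimmed afterwards, which is what RTRIM takes care of in the SQL version.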
Here's one option:
sample data in lines #1 - 4
temp CTE simply - conditionally - concatenates types and capacities
final query (line #17)
removes double separators (commas) (regexp)
removes superfluous leading/trailing commas (trim)
SQL> with test (name, capa, capb, capc) as
2 (select 'Plant1', 10, null, 20 from dual union all
3 select 'Plant2', null, 10, null from dual
4 ),
5 temp as
6 (select name,
7 --
8 case when capa is not null then 'A' end ||','||
9 case when capb is not null then 'B' end ||','||
10 case when capc is not null then 'C' end as type,
11 --
12 case when capa is not null then capa end ||','||
13 case when capb is not null then capb end ||','||
14 case when capc is not null then capc end as capacity
15 from test
16 )
17 select name,
18 trim(both ',' from regexp_replace(type , ',+', ',')) as type,
19 trim(both ',' from regexp_replace(capacity, ',+', ',')) as capacity
20 from temp;
NAME TYPE CAPACITY
------ ---------- ----------
Plant1 A,C 10,20
Plant2 B 10
SQL>

REGEXP to validate a specific number

How can I search for a specific number in an array using REGEXP?
I have an array and need to verify if it has a specific number.
Ex: [5,2,1,4,6,19] and I am looking for number 1, but just the number 1 and not any number that contain the digit 1.
I had to do this:
case when REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][,]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][,]{1}')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[,]{1}[1][]]')<>0
or REGEXP_INSTR(JSON_QUERY(MY_JSON_COLUMN,'$.path') , '[[]{1}[1][]]') <>0
then 'DIGIT_ONE' else 'NO_DIGIT_ONE'
end
Is there anything simpler?
You can use
(^|\D)1(\D|$)
This will search for a 1 that is not enclosed by other digits.
Details
(^|\D) - start of string or non-digit
1 - a 1 char
(\D|$) - non-digit or end of string.
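The pattern can be tried out quickly in Python, whose `re` module treats `\D`, `^` and `$` the same way for this purpose. This is only a sketch for experimenting with the pattern; the helper name `has_standalone_one` is made up for the example.

```python
import re

# Start of string or a non-digit, then 1, then a non-digit or end of string.
ONE_NOT_IN_NUMBER = re.compile(r"(^|\D)1(\D|$)")


def has_standalone_one(text):
    """True if text contains a 1 that is not part of a longer run of digits."""
    return ONE_NOT_IN_NUMBER.search(text) is not None


for candidate in ["[5,2,1,4,6,19]", "[5,2,4,6,19]", "[11]", "[1]", "[1.5]"]:
    print(candidate, has_standalone_one(candidate))
```

Note that a decimal such as 1.5 also matches, because the dot counts as a non-digit. That may or may not matter for the data in question.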
Do NOT use regular expressions, use a proper JSON parser and then filter for the number you want:
SELECT my_json_column,
CASE
WHEN JSON_EXISTS( my_json_column, '$?(@.path[*] == 1)' )
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END AS has_one
FROM table_name;
or (if you are using Oracle 12.1 and cannot use path filter expressions with JSON_EXISTS, which is only available from Oracle 12.2):
SELECT my_json_column,
CASE
WHEN EXISTS(
SELECT 'X'
FROM JSON_TABLE(
t.my_json_column,
'$.path[*]'
COLUMNS (
value NUMBER PATH '$'
)
)
WHERE value = 1
)
THEN 'DIGIT ONE'
ELSE 'NO DIGIT ONE'
END
FROM table_name t;
Which, for the sample data:
CREATE TABLE table_name (
my_json_column CHECK ( my_json_column IS JSON )
) AS
SELECT '{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]}' FROM DUAL UNION ALL
SELECT '{"path":[11],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[2],"not_this_path":[1]}' FROM DUAL UNION ALL
SELECT '{"path":[1,11]}' FROM DUAL;
Both output:
MY_JSON_COLUMN | HAS_ONE
:-------------------------------------------------- | :-----------
{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]} | DIGIT ONE
{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]} | NO DIGIT ONE
{"path":[11],"not_this_path":[1]} | NO DIGIT ONE
{"path":[2],"not_this_path":[1]} | NO DIGIT ONE
{"path":[1,11]} | DIGIT ONE
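The same "parse first, then compare values" idea can be sketched in Python with the standard `json` module, using the sample documents from above. The function name `path_contains` and its parameters are assumptions for the example, not anything Oracle provides.

```python
import json


def path_contains(document, wanted=1, key="path"):
    """Parse the JSON text and test whether `wanted` is an element of document[key]."""
    values = json.loads(document).get(key, [])
    return wanted in values


samples = [
    '{"path":[5,2,1,4,6,19],"not_this_path":[1,2,3,4,5]}',
    '{"path":[5,2,4,6,19],"not_this_path":[1,2,3,4,5]}',
    '{"path":[11],"not_this_path":[1]}',
    '{"path":[2],"not_this_path":[1]}',
    '{"path":[1,11]}',
]
for doc in samples:
    print(doc, "DIGIT ONE" if path_contains(doc) else "NO DIGIT ONE")
```

Because the comparison is made on parsed numbers, 11 and 19 can never be mistaken for 1, and a 1 under a different key is ignored.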
Alternatively, with a little bit more typing (a little bit? Am I kidding?!), splitting the string into rows and comparing values to the search string:
SQL> with test (col) as
2 (select '[5,2,1,4,6,19]' from dual)
3 select t.col,
4 case when '&par_search_string' in
5 (select regexp_substr(substr(col, 2, length(col) - 2), '[^,]+', 1, level) val
6 from test
7 connect by level <= regexp_count(col, ',') + 1
8 )
9 then 'Search string exists'
10 else 'Search string does not exist'
11 end result
12 from test t;
Enter value for par_search_string: 1
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string exists
SQL> /
Enter value for par_search_string: 24
COL RESULT
-------------- ----------------------------
[5,2,1,4,6,19] Search string does not exist
SQL>

Regexp_substr expression

I have a problem with my REGEXP expression. I want to run it in a loop so that every iteration deletes the text after the last slash. My expression currently looks like this:
REGEXP_SUBSTR('L1161148/1/10', '.*(/)')
I'm getting L1161148/1/ instead of L1161148/1
You said you wanted to loop.
CAVEAT: Both of these solutions assume there are no NULL list elements (all slashes have a value in between them).
SQL> with tbl(data) as (
select 'L1161148/1/10' from dual
)
select level, nvl(substr(data, 1, instr(data, '/', 1, level)-1), data) formatted
from tbl
connect by level <= regexp_count(data, '/') + 1 -- Loop # of delimiters +1 times
order by level desc;
LEVEL FORMATTED
---------- -------------
3 L1161148/1/10
2 L1161148/1
1 L1161148
SQL>
EDIT: To handle multiple rows:
SQL> with tbl(rownbr, col1) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual
union
select 2, 'ALKDFJV1161148/123/456/789/1/2/3' from dual
)
SELECT rownbr, column_value substring_nbr,
nvl(substr(col1, 1, instr(col1, '/', 1, column_value)-1), col1) formatted
FROM tbl,
TABLE(
CAST(
MULTISET(SELECT LEVEL
FROM dual
CONNECT BY LEVEL <= REGEXP_COUNT(col1, '/')+1
) AS sys.OdciNumberList
)
)
order by rownbr, substring_nbr desc
;
ROWNBR SUBSTRING_NBR FORMATTED
---------- ------------- --------------------------------
1 7 L1161148/1/10/2/34/5/6
1 6 L1161148/1/10/2/34/5
1 5 L1161148/1/10/2/34
1 4 L1161148/1/10/2
1 3 L1161148/1/10
1 2 L1161148/1
1 1 L1161148
2 7 ALKDFJV1161148/123/456/789/1/2/3
2 6 ALKDFJV1161148/123/456/789/1/2
2 5 ALKDFJV1161148/123/456/789/1
2 4 ALKDFJV1161148/123/456/789
2 3 ALKDFJV1161148/123/456
2 2 ALKDFJV1161148/123
2 1 ALKDFJV1161148
14 rows selected.
SQL>
You can try removing the string after the last slash:
select regexp_replace('L1161148/1/10', '/([^/]*)$', '') from dual
You are trying to go as far as the last / and then "look back" and retain what was before it. With regular expressions you can do that with a subexpression, like this:
select regexp_substr('L1161148/1/10', '(.*)/.*', 1, 1, null, 1) from dual;
Here, as usual, the first argument "1" means where to start the search, the second "1" means which matching substring to choose, "null" means no special matching modifiers (like case-insensitive matching and such - not needed here), and the last "1" means return the first subexpression - the first thing in parentheses in the "match pattern."
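The difference between the whole match and the captured subexpression can be seen in a short Python sketch. Python's `re` module is used here only because greedy matching and capture groups behave the same way for these patterns; the variable names are illustrative.

```python
import re

s = "L1161148/1/10"

# Greedy .* runs to the last slash, and the whole match includes that slash.
whole_match = re.search(r".*(/)", s).group(0)

# Capturing what precedes the last slash returns the text without it.
captured = re.search(r"(.*)/.*", s).group(1)

# Alternative: delete the last slash and everything after it.
stripped = re.sub(r"/[^/]*$", "", s)

print(whole_match)
print(captured)
print(stripped)
```

The first pattern reproduces the result from the question (the trailing slash is kept), while the other two both give the string without its last segment.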
However, regular expressions should only be used when you can't do it with the standard substr and instr (and translate) functions. Here the job is quite easy:
instr(text_string, '/', -1)
will give you the position of the LAST / in text_string (the -1 means find the last occurrence, instead of the first: count from the end of the string). So the whole thing can be written as:
select substr('L1161148/1/10', 1, instr('L1161148/1/10', '/', -1) - 1) from dual;
Edit: In the spirit of Gary_W's solution, here is a generalization to several strings and stripping successive layers from each input string; still not using regular expressions (resulting in slightly faster performance) and using a recursive CTE, available since Oracle version 11; I believe Gary's solution works only from Oracle 12c on.
Query: (I changed Gary's second input string a bit, to make sure the query works properly)
with tbl(item_id, input_str) as (
select 1, 'L1161148/1/10/2/34/5/6' from dual union all
select 2, 'ALKD/FJV11/61148/123/456/789/1/2/3' from dual
),
r (item_id, proc_string, stage) as (
select item_id, input_str, 0 from tbl
union all
select item_id, substr(proc_string, 1, instr(proc_string, '/', -1) - 1), stage + 1
from r
where instr(proc_string, '/') > 0
)
select * from r
order by item_id, stage;
Output:
ITEM_ID PROC_STRING STAGE
---------- ---------------------------------------- ----------
1 L1161148/1/10/2/34/5/6 0
1 L1161148/1/10/2/34/5 1
1 L1161148/1/10/2/34 2
1 L1161148/1/10/2 3
1 L1161148/1/10 4
1 L1161148/1 5
1 L1161148 6
2 ALKD/FJV11/61148/123/456/789/1/2/3 0
2 ALKD/FJV11/61148/123/456/789/1/2 1
2 ALKD/FJV11/61148/123/456/789/1 2
2 ALKD/FJV11/61148/123/456/789 3
2 ALKD/FJV11/61148/123/456 4
2 ALKD/FJV11/61148/123 5
2 ALKD/FJV11/61148 6
2 ALKD/FJV11 7
2 ALKD 8
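The recursive step is "cut at the last slash, repeat while a slash remains". A minimal Python sketch of that loop is shown below; the function name `strip_layers` is hypothetical and the list index plays the role of the STAGE column.

```python
def strip_layers(value, sep="/"):
    """Return value followed by each shorter version with the last segment removed."""
    layers = [value]
    while sep in value:
        value = value[:value.rfind(sep)]   # keep everything before the last separator
        layers.append(value)
    return layers


for stage, text in enumerate(strip_layers("L1161148/1/10")):
    print(stage, text)
```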

Ignoring specific letters to find match in SQL Query

I want to query a table for all the values that are on a list on another table to find matches, but I know that some of the values in either table may be typed in incorrectly. One table may have '10Hf7K8' and another table may have '1OHf7K8' but I still want them to match.
Another example: if one table has 'STOP' but I know that in myTable, some of the fields may say '5T0P' or 'ST0P' or '5TOP', I want those to come up as results too. The same thing may occur for '2' and 'Z' if I want 'ZEPT' and '2EPT' to match.
So if I know to account for inconsistencies between '0' and 'O', '5' and 'S' and 'Z' and '2', and knowing that they will be in the same spot, but I do not know where exactly they will be in the word or how many letters the word will have, is it possible to make a query ignoring those letters?
Additional Information: These values are hundreds of serial keys, and I have no way of confirming which is the correct version between the two tables. I should not have used actual words for my example; these values can be any combination of letters and numbers in any order. There is no distinct pattern that I can hard code.
SOLUTION: Goat CO, Learning, and user3216429's answers contained the solution I needed. I was able to find matching values while keeping the underlying data.
Cleaning data is preferable, but could use nested REPLACE() statements if you can't alter the underlying data:
SELECT *
FROM Table1 a
JOIN Table2 b
ON REPLACE(REPLACE(REPLACE(a.field1,'2','Z'),'5','S'),'0','O') = REPLACE(REPLACE(REPLACE(b.field1,'2','Z'),'5','S'),'0','O')
Cleansing the data could be the same nested replace statement:
ALTER TABLE Table1 ADD cleanfield VARCHAR(25)
UPDATE Table1
SET cleanfield = REPLACE(REPLACE(REPLACE(dirtyfield,'2','Z'),'5','S'),'0','O')
Then you'd be able to join the tables on the clean field.
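The nested REPLACE calls amount to mapping each look-alike character to one canonical character before comparing. A small Python sketch of that normalization and of matching two lists on the normalized key is given below. The names `LOOKALIKES`, `normalize_key` and `match_keys` are assumptions for the example.

```python
# Map each digit to the letter it is commonly confused with.
LOOKALIKES = str.maketrans({"2": "Z", "5": "S", "0": "O"})


def normalize_key(key):
    """Replace 2, 5 and 0 with Z, S and O so that look-alike keys compare equal."""
    return key.translate(LOOKALIKES)


def match_keys(left, right):
    """Pair up values from two lists whose normalized forms are equal."""
    by_clean_key = {}
    for value in right:
        by_clean_key.setdefault(normalize_key(value), []).append(value)
    return [(a, b) for a in left for b in by_clean_key.get(normalize_key(a), [])]


print(match_keys(["10Hf7K8", "STOP", "ZEPT"], ["1OHf7K8", "5T0P", "2EPT", "OTHER"]))
```

The original values are left untouched; only the comparison uses the cleaned form, which corresponds to joining on the REPLACE expression rather than updating the data.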
What you can and should do is clean your data: replace all these 2, 0 and 5 characters with Z, O and S.
But if you want to try some other solution, you can try something like this:
select case when
REPLACE(REPLACE(REPLACE('stop','0','o'),'5','s'),'2','Z') = REPLACE(REPLACE(REPLACE('5t0p','0','o'),'5','s'),'2','Z') then 1 else 2 end
As previously said, if you have time, clean up the data.
If not, SQL Server supplies two string functions that might help.
The example is from my blog article. http://craftydba.com/?p=5211
The SOUNDEX() function turns a word into a 4 character value. The DIFFERENCE() function tells you how close two words are.
Your example seems to be one word. You might want to use a calculated column and index it so that the where clause is SARGABLE.
If you are using paragraphs, use a standard split function to turn your text paragraph into words. Use these functions to search the data. However, this will result in a non-SARGABLE expression.
-- Example returns 4, words are very close
select
soundex('Dog') as word_val1,
soundex('Dogs') as word_val2,
difference('Dog', 'Dogs') as how_close
-- Example returns 0, words are very different
select
soundex('Rattle-Snake') as word_val1,
soundex('Mongoose') as word_val2,
difference('Rattle-Snake', 'Mongoose') as how_close
output:
word_val1 word_val2 how_close
--------- --------- -----------
D200 D200 4
word_val1 word_val2 how_close
--------- --------- -----------
R340 M522 0
Last but not least, you can also look into full-text indexing for speed. This requires some extra overhead (FTI structure and process to update FTI).
http://craftydba.com/?p=1421
select REPLACE(REPLACE( REPLACE([column_name],'O','0'),'Z','2'),'5','S')
from [table_name]
1) To find all rows containing any form of the word STOP (STOP, 5TOP, ST0P, 5T0P), you could use the following query based on LIKE:
SELECT *
FROM (
SELECT 1, 'CocoJambo' UNION ALL
SELECT 2, '5T0P' UNION ALL
SELECT 3, ' 5TOP ' UNION ALL
SELECT 4, ' ST0P ' UNION ALL
SELECT 5, ' STOP ' UNION ALL
SELECT 6, 'ZTOP'
) x (ID, ColA)
WHERE x.ColA LIKE '%[5S]T[0O]P%';
Output:
ID ColA
----------- ---------
2 5T0P
3 5TOP
4 ST0P
5 STOP
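The character classes in the LIKE pattern correspond directly to regular-expression character classes, so the filter can be tried out in Python. This is a sketch under one assumption: matching here is case-sensitive, whereas LIKE in SQL Server follows the column's collation. The name `STOP_VARIANTS` is made up for the example.

```python
import re

# [5S] and [0O] accept either the digit or the letter in that position.
STOP_VARIANTS = re.compile(r"[5S]T[0O]P")

rows = [
    (1, "CocoJambo"),
    (2, "5T0P"),
    (3, " 5TOP "),
    (4, " ST0P "),
    (5, " STOP "),
    (6, "ZTOP"),
]
matching_ids = [row_id for row_id, text in rows if STOP_VARIANTS.search(text)]
print(matching_ids)
```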
2) Regarding your question:
For every table
first I would try to build a table with all patterns for every word and for every pattern I would store the proper/accurate word,
then I would try to replace every occurrence of pattern with the proper word
After preprocessing these two tables in this way, I would try to match them.
This script will replace only the first occurrence of the pattern:
SELECT x.*, oa.*,
CASE
WHEN oa.PatIx > 0 THEN STUFF( x.ColA , oa.PatIx , LEN(oa.Word), oa.Word )
ELSE x.ColA
END AS NewColA
FROM (
SELECT 1, 'CocoJambo' UNION ALL
SELECT 2, '5T0P' UNION ALL
SELECT 3, ' 5TOP ' UNION ALL
SELECT 4, ' ST0P ' UNION ALL
SELECT 5, ' STOP jambo jumbo 5TOP bOb ' UNION ALL
SELECT 6, 'ZTOP'
) x (ID, ColA)
OUTER APPLY (
SELECT *
FROM (
SELECT w.WordPattern, w.Word, PATINDEX( w.WordPattern , x.ColA ) AS PatIx
FROM #Words w
) y
WHERE y.PatIx > 0
) oa
Output:
ID ColA WordPattern Word PatIx NewColA
-- ----------------------------- ------------ ---- ----- ----------------------------
1 CocoJambo %b[o0]% bob 8 CocoJambob
2 5T0P %[5S]T[0O]P% STOP 1 STOP
3 5TOP %[5S]T[0O]P% STOP 2 STOP
4 ST0P %[5S]T[0O]P% STOP 3 STOP
5 STOP jambo jumbo 5TOP bOb %[5S]T[0O]P% STOP 4 STOP jambo jumbo 5TOP bOb
5 STOP jambo jumbo 5TOP bOb %b[o0]% bob 12 STOP jambobjumbo 5TOP bOb
6 ZTOP NULL NULL NULL ZTOP
Note: this solution is just a proof of concept. It needs development.
Or you could try this solution which replaces all wrong words with the proper form:
CREATE TABLE dbo.Words ( Id INT IDENTITY PRIMARY KEY, WordSource NVARCHAR(50) NOT NULL, Word NVARCHAR(50) NOT NULL );
INSERT dbo.Words ( WordSource , Word ) VALUES ( N'5T0P' , N'STOP' );
INSERT dbo.Words ( WordSource , Word ) VALUES ( N'5TOP' , N'STOP' );
INSERT dbo.Words ( WordSource , Word ) VALUES ( N'ST0P' , N'STOP' );
INSERT dbo.Words ( WordSource , Word ) VALUES ( N'b0b' , N'bob' );
INSERT dbo.Words ( WordSource , Word ) VALUES ( N'bOb' , N'bob' );
GO
CREATE FUNCTION dbo.ReplaceWords (@ColA NVARCHAR(4000), @Num INT)
RETURNS TABLE
AS
RETURN
WITH CteRecursive
AS
(
SELECT w.Id, w.WordSource, w.Word, REPLACE(@ColA, w.WordSource, w.Word) AS NewColA
FROM dbo.Words w
WHERE w.Id = 1
UNION ALL
SELECT w.Id, w.WordSource, w.Word, REPLACE(prev.NewColA, w.WordSource, w.Word) AS NewColA
FROM CteRecursive prev INNER JOIN dbo.Words w ON prev.Id + 1 = w.Id
WHERE prev.Id + 1 <= @Num
)
SELECT r.NewColA
FROM CteRecursive r
WHERE r.Id = @Num
GO
-- Testing
SELECT * FROM dbo.ReplaceWords(N' ST0P jambo 5TOP bOb jumbo ', 5) f;
Output
NewColA
----------------------------
STOP jambo STOP bob jumbo
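What the recursive CTE does is apply the replacements one after another, each on the result of the previous one. A minimal Python sketch of that chain follows, using the same word list as the dbo.Words table. The names `WORDS` and `replace_words` are illustrative, and Python's `str.replace` is case-sensitive, which may differ from REPLACE under a case-insensitive collation.

```python
# (wrong form, proper form), in the same order as the rows of dbo.Words
WORDS = [
    ("5T0P", "STOP"),
    ("5TOP", "STOP"),
    ("ST0P", "STOP"),
    ("b0b", "bob"),
    ("bOb", "bob"),
]


def replace_words(text, words=WORDS):
    """Apply every replacement in order, feeding each result into the next step."""
    for wrong, right in words:
        text = text.replace(wrong, right)
    return text


print(repr(replace_words(" ST0P jambo 5TOP bOb jumbo ")))
```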
You can use the previous function to replace all wrong words within every table, and then you can compare both tables:
DECLARE @Num INT;
SET @Num = (SELECT COUNT(*) FROM dbo.Words);
SELECT x.*, rpl.NewColA
FROM (
SELECT 1, N'CocoJambo' UNION ALL
SELECT 2, N'5T0P' UNION ALL
SELECT 3, N' 5TOP ' UNION ALL
SELECT 4, N' ST0P ' UNION ALL
SELECT 5, N' STOP jambo jumbo 5TOP bOb ' UNION ALL
SELECT 6, N'ZTOP' UNION ALL
SELECT 7, N'' UNION ALL
SELECT 8, NULL
) x (ID, ColA)
OUTER APPLY dbo.ReplaceWords(x.ColA, @Num) rpl
Output:
ID ColA NewColA
-- ----------------------------- ----------------------------
1 CocoJambo CocoJambo
2 5T0P STOP
3 5TOP STOP
4 ST0P STOP
5 STOP jambo jumbo 5TOP bOb STOP jambo jumbo STOP bob
6 ZTOP ZTOP
7
8 NULL NULL

Oracle custom sort

The query...
select distinct name from myTable
returns a bunch of values that start with the following character sequences...
ADL*
FG*
FH*
LAS*
TWUP*
Where '*' is the remainder of the string.
I want to do an order by that sorts in the following manner...
ADL*
LAS*
TWUP*
FG*
FH*
But then I also want to sort within each name in the standard order by fashion. So, an example, if I have the following values
LAS-21A
TWUP-1
FG999
FH3
ADL99999
ADL88888
ADL77777
LAS2
I want it to be sorted like this...
ADL77777
ADL88888
ADL99999
LAS2
TWUP-1
FG999
FH3
I initially thought I could accomplish this via an order by decode(blah) with some like trickery inside of the decode, but I've been unable to accomplish it. Any insights?
Goofy and verbose, but should work:
select name, case when substr (name, 1, 3) = 'ADL' then 1
when substr (name, 1, 3) = 'LAS' then 2
when substr (name, 1, 4) = 'TWUP' then 3
when substr (name, 1, 2) = 'FG' then 4
when substr (name, 1, 2) = 'FH' then 5
else 6
end SortOrder
from myTable
order by 2, 1;
Not sure if 6 is the correct place to sort the other items, but it is obvious how to fix that. At least it is clear what is going on, even if I have no idea why you are doing it this way.
EDIT: If these are the only values, you could change lines 4 and 5:
select name, case when substr (name, 1, 3) = 'ADL' then 1
when substr (name, 1, 3) = 'LAS' then 2
when substr (name, 1, 4) = 'TWUP' then 3
when substr (name, 1, 1) = 'F' then 4
else 6
end SortOrder
from myTable
order by 2, 1;
ANOTHER EDIT: And again, if these are the only values, you can simplify even more. Since the only one out of order is the F* series, you can force them to the end, and use the actual first letter for all the others. This is simpler, but relies too much on the exact values for my preference. On the other hand, it does remove many of the seemingly unnecessary calls to substr :
select name, case when substr (name, 1, 1) = 'F' then 'Z'
else name
end SortOrder
from myTable
order by 2, 1;
The problem is that your prefix contains a variable number of characters. This is a good time to deploy regular expressions (if you have 10g or higher).
SQL> select cola
2 from t34
3 order by decode( regexp_substr(cola, '[[:alpha:]]+')
4 , 'ADL' , 10
5 , 'LAS', 20
6 , 'TWUP', 30
7 , 'FG' , 40
8 , 'FH' , 50
9 , 60 )
10 , cola
11 /
COLA
----------
ADL77777
ADL88888
ADL99999
LAS-21A
LAS2
TWUP-1
FG999
FH3
8 rows selected.
SQL>
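The ordering rule is a two-part sort key: first a rank looked up from the leading run of letters, then the full value. A short Python sketch of the same key is shown below, using the sample values from the question. The names `PREFIX_RANK` and `sort_key` are assumptions for the example, and `[A-Za-z]+` stands in for `[[:alpha:]]+`.

```python
import re

# Same ranks as in the decode() call; unknown prefixes sort last with 60.
PREFIX_RANK = {"ADL": 10, "LAS": 20, "TWUP": 30, "FG": 40, "FH": 50}


def sort_key(name):
    """Rank by the first run of letters, then fall back to the full name."""
    match = re.search(r"[A-Za-z]+", name)
    prefix = match.group(0) if match else ""
    return PREFIX_RANK.get(prefix, 60), name


names = ["LAS-21A", "TWUP-1", "FG999", "FH3",
         "ADL99999", "ADL88888", "ADL77777", "LAS2"]
ordered = sorted(names, key=sort_key)
print(ordered)
```

Extracting the whole letter prefix avoids having to know its length in advance, which is the point made above about the variable number of characters.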
In earlier versions of Oracle we can use the OWA_PATTERN.AMATCH() function to the same effect:
SQL> select cola
2 from t34
3 order by decode( owa_pattern.amatch(cola, 1, '^[A-Z]+')
4 , 'ADL' , 10
5 , 'LAS', 20
6 , 'TWUP', 30
7 , 'FG' , 40
8 , 'FH' , 50
9 , 60 )
10 , cola
11 /
COLA
----------
ADL77777
ADL88888
ADL99999
FG999
FH3
LAS-21A
LAS2
TWUP-1
8 rows selected.
SQL>