Oracle - Extract numbers for comparison from varchar2 column - sql

I have encountered the following problem while solving a task:
In an Oracle database I have a table ENTITY_INFO that is fairly simple in structure. It contains 3 columns:
ENTITY_ID (VARCHAR2) - PK of the entity in database
NAME (VARCHAR2) - name of the information, e.g. "location", "cost", "last encounter"
VALUE (VARCHAR2) - a value of the information, e.g. "assets/music", "1500", "1.1.2000"
Currently, I need to select the entities whose "cost" is less than 1000.
A naive approach via
SELECT ENTITY_ID FROM ENTITY_INFO WHERE NAME = 'cost' AND TO_NUMBER(VALUE)<1000
does not work, because the VALUE column also contains values that are not numbers.
But all values in rows that match the filter NAME = 'cost' are numbers, so the comparison I need is valid.
I found the topic "Select string as number on Oracle", but the information there did not prove useful for solving this problem.
Due to the nature of ENTITY_INFO and the state of the project, changing the data model is not a viable solution either.
Thanks for any hints.

You could make the conversion to a number conditional:
SELECT ENTITY_ID
FROM ENTITY_INFO
WHERE NAME = 'cost'
AND TO_NUMBER(CASE WHEN NAME = 'cost' THEN VALUE ELSE NULL END) < 1000
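As a self-contained sketch of the conditional conversion above (the sample rows are invented for illustration, and the column-list CTE syntax assumes Oracle 11gR2 or later), only 'cost' values ever reach TO_NUMBER:

```sql
-- Hypothetical sample data standing in for ENTITY_INFO.
WITH entity_info (entity_id, name, value) AS (
  SELECT 'E1', 'cost',           '1500'         FROM dual UNION ALL
  SELECT 'E1', 'location',       'assets/music' FROM dual UNION ALL
  SELECT 'E2', 'cost',           '900'          FROM dual UNION ALL
  SELECT 'E2', 'last encounter', '1.1.2000'     FROM dual
)
SELECT entity_id
FROM entity_info
WHERE name = 'cost'
  -- CASE yields NULL for non-cost rows, so TO_NUMBER never sees 'assets/music'.
  AND TO_NUMBER(CASE WHEN name = 'cost' THEN value END) < 1000;
```

With this sample data the query should return only E2. On Oracle 12.2 or later, TO_NUMBER(value DEFAULT NULL ON CONVERSION ERROR) is another option: it returns NULL instead of raising an error for non-numeric strings.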

An alternative approach uses a WITH clause, on the presumption that all the records with NAME = 'cost' hold numbers.
In the tab1 part, apply the filter condition; then query tab1 with TO_NUMBER:
WITH tab1
AS (SELECT entity_id, name, VALUE
FROM entity_info
WHERE name = 'cost')
SELECT *
FROM tab1
WHERE TO_NUMBER (VALUE) < 1000
Having numbers and characters in one column is an accident waiting to happen. If adding another column to distinguish numeric from non-numeric values is not an option, I would recommend a constraint that prevents non-numeric values from being entered when the name is cost.
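Such a constraint might look like the sketch below. The constraint name and the regular expression (unsigned digits with an optional decimal part) are assumptions for illustration; adjust the pattern to whatever number format the cost values actually use.

```sql
-- Hypothetical constraint: rows named 'cost' must hold a plain unsigned number.
-- Rows with any other name are not restricted.
ALTER TABLE entity_info
  ADD CONSTRAINT entity_info_cost_numeric_chk
  CHECK (name <> 'cost' OR REGEXP_LIKE(value, '^[0-9]+(\.[0-9]+)?$'));
```

If existing rows already violate the rule, the ALTER TABLE fails unless the constraint is added with ENABLE NOVALIDATE, which only checks new and changed rows.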

In my environment, I see no problem with your code (or this equivalent of it):
SELECT ENTITY_ID
FROM ENTITY_INFO
WHERE NAME = 'cost'
AND VALUE < 1000
Example with data samples:
with ENTITY_INFO as (
select 1 as ENTITY_ID, 'cost' as name, '2000' as value from dual
union all
select 2 as ENTITY_ID, 'cost' as name, '900' as value from dual
union all
select 3 as ENTITY_ID, 'cost' as name, '3000' as value from dual
union all
select 4 as ENTITY_ID, 'cost' as name, '2500' as value from dual
union all
select 5 as ENTITY_ID, 'cost' as name, '700' as value from dual
union all
select 6 as ENTITY_ID, 'frf' as name, '250sasd0' as value from dual
union all
select 7 as ENTITY_ID, 'corfrst' as name, '70fa0' as value from dual
)
SELECT ENTITY_ID
FROM ENTITY_INFO
WHERE NAME = 'cost'
AND VALUE < 1000
Result:
ENTITY_ID
2
5
Alternatively, you can use a subquery, which ensures that all the values it returns are number-like strings:
SELECT ENTITY_ID
FROM (SELECT ENTITY_ID,
VALUE
FROM ENTITY_INFO
WHERE NAME = 'cost' )
WHERE TO_NUMBER(VALUE)<1000
I hope I helped!

Related

OracleSQL CASE Statement on Number in string

I have a table called TEST_TABLE with 1 column called COLUMN1. This table has 2 records:
V.WEEKLY_2020_15
V.WEEKLY_2020_16
I'm trying to write a CASE statement that maps these records to different Periods. e.g.
SELECT
CASE WHEN COLUMN1='V.WEEK_2020_ **MAXIMUM NUMBER** ' THEN 'CURRENT PERIOD'
ELSE 'HISTORICAL PERIOD 1' END
FROM TEST_TABLE
I'm not sure what the best way to do this is, though. I need to get the number from the end of the string, and then compare it to the other numbers in the table. Once it finds one number that is higher or lower it can stop the search, as there will always be only 2 numbers in this table.
You can get the number from the end of the string with a regular expression. This one gets the 3rd group of characters which don't include an underscore.
select column1, regexp_substr(column1,'[^_]+',1,3) from test_table;
Alternately you could get the 2nd group of numbers with regexp_substr(column1,'[0-9]+',1,2). The best regexp will depend on your knowledge of the possible string values. If you know the number will always be the last 2 characters, you could do substr(column1, -2)
And if you want to identify rows which have the highest/lowest/etc. value, adding a column which applies a window (analytic) function is a common pattern.
-- sample data
with test_table as (select 'V.WEEKLY_2020_15' as column1 from dual
union select 'V.WEEKLY_2020_16' from dual)
-- query
SELECT column1, regexp_substr(column1,'[^_]+',1,3) as regex, max_number,
CASE WHEN COLUMN1=max_number THEN 'CURRENT PERIOD'
ELSE 'HISTORICAL PERIOD 1' END as period
FROM (select test_table.*,
max(column1) over (order by regexp_substr(column1,'[^_]+',1,3) desc) as max_number
from test_table) T;
Usually the data will be more complicated than you're showing. For example, you might have 2 periods in the table for each primary key, and then you'll want to partition your window function.
-- sample data
with test_table as (select 1 as pk, 'V.WEEKLY_2020_15' as column1 from dual
union select 1, 'V.WEEKLY_2020_16' from dual
union select 2, 'V.WEEKLY_2021_1' from dual
union select 2, 'V.WEEKLY_2021_200' from dual)
-- query
SELECT pk, column1, regexp_substr(column1,'[^_]+',1,3) as regex,
CASE WHEN COLUMN1=max_number THEN 'CURRENT PERIOD'
ELSE 'HISTORICAL PERIOD 1' END as period
FROM (select test_table.*,
max(column1) over (partition by pk order by regexp_substr(column1,'[^_]+',1,3) desc) as max_number
from test_table) T;
Output:
PK COLUMN1           REGEX PERIOD
-- ----------------- ----- -------------------
1  V.WEEKLY_2020_16  16    CURRENT PERIOD
1  V.WEEKLY_2020_15  15    HISTORICAL PERIOD 1
2  V.WEEKLY_2021_200 200   CURRENT PERIOD
2  V.WEEKLY_2021_1   1     HISTORICAL PERIOD 1
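One caveat: the queries above order and compare the extracted week as a string, so a pair such as 9 and 10 would compare the wrong way round ('9' sorts after '10'). A hedged variant that converts the extracted part with to_number before taking the maximum is sketched below; the sample rows and the week_no alias are made up for illustration.

```sql
-- Hypothetical sample data where string and numeric order disagree.
with test_table as (
  select 1 as pk, 'V.WEEKLY_2020_9' as column1 from dual
  union all select 1, 'V.WEEKLY_2020_10' from dual
)
select pk, column1,
       to_number(regexp_substr(column1, '[^_]+', 1, 3)) as week_no,
       -- compare each row's week number with the numeric maximum per pk
       case when to_number(regexp_substr(column1, '[^_]+', 1, 3))
                 = max(to_number(regexp_substr(column1, '[^_]+', 1, 3)))
                   over (partition by pk)
            then 'CURRENT PERIOD'
            else 'HISTORICAL PERIOD 1'
       end as period
from test_table;
```

Here V.WEEKLY_2020_10 should be labelled CURRENT PERIOD and V.WEEKLY_2020_9 HISTORICAL PERIOD 1.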

Why this sql will cause type conversion error?

WITH tb_testl AS (
SELECT 1 AS id ,'hehe' AS value
UNION ALL
SELECT 1 AS id, '1' AS value
UNION ALL
SELECT 2 AS id, '2' AS value
UNION ALL
SELECT 2 AS id, '2' AS value
), tb_test2 AS (
SELECT CONVERT(INT , value) AS value FROM tb_testl WHERE id = 2
)
SELECT * FROM tb_test2 WHERE value = 2;
This SQL causes the error
Conversion failed when converting the varchar value 'hehe' to data type int.
but tb_test2 doesn't contain the value 'hehe', which exists only in the other table, tb_testl. I also found that this SQL works if I don't append the WHERE value = 2 clause. I've tried the ISNUMERIC function, but it didn't work.
Version: SQL Server 2008 R2
With respect to the why this occurs:
There is a Logical Processing Order, which describes the order in which clauses are evaluated. The order is:
FROM
ON
JOIN
WHERE
GROUP BY
WITH CUBE or WITH ROLLUP
HAVING
SELECT
DISTINCT
ORDER BY
TOP
You can also see the processing order when you SET SHOWPLAN_ALL ON. For this query, the processing is as follows:
Constant scan - this is the FROM clause, which consists of hard coded values, hence the constants.
Filter - this is the WHERE clause. While it looks like there are two WHERE clauses (WHERE id = 2 and WHERE value = 2), SQL Server sees this differently: it considers a single WHERE clause, WHERE CONVERT(INT , value) = 2 AND id = 2.
Compute Scalar - this is the CONVERT function in the SELECT.
Because both WHERE conditions are evaluated together, the hehe value is not filtered out before CONVERT is applied.
Effectively, the query is simplified to something like:
SELECT CONVERT(INT, tb_testl.value) AS Cvalue
FROM (
SELECT 1 AS id
, 'hehe' AS value
UNION ALL
SELECT 1 AS id
, '1' AS value
UNION ALL
SELECT 2 AS id
, '2' AS value
UNION ALL
SELECT 2 AS id
, '2' AS value
) tb_testl
WHERE CONVERT(INT, tb_testl.value) = 2
AND tb_testl.id = 2
Which should clarify why the error occurs.
With SQL, you cannot read code in the same way as imperative languages like C. Lines of SQL code are not necessarily (mostly not at all, in fact) executed in the order in which they are written. In this case, it's an error to think the inner WHERE is executed before the outer WHERE.
SQL Server does not guarantee the order of processing of statements (with one exception below). That is, there is no guarantee that WHERE filtering happens before the SELECT. Or that one CTE is evaluated before another. This is considered an advantage because it allows SQL Server to rearrange the processing to optimize performance (although I consider the issue that you are seeing a bug).
Obviously, the problem is in this part of the code:
tb_test2 AS (
SELECT CONVERT(INT, value) AS value
FROM tb_testl
WHERE id = 2
)
(Well, actually, it is where tb_test2 is referenced.)
What is happening is that SQL Server pushes the CONVERT() to where the values are being read, so the conversion is attempted before the WHERE clause is processed. Hence, the error.
In SQL Server 2012+, you can easily solve this using TRY_CONVERT():
tb_test2 AS (
SELECT TRY_CONVERT(INT, value) AS value
FROM tb_testl
WHERE id = 2
)
However, that doesn't work in SQL Server 2008. You can use the fact that CASE does have some guarantees on the order of processing:
tb_test2 AS (
SELECT (CASE WHEN value NOT LIKE '%[^0-9]%' THEN CONVERT(INT, value)
END) AS value
FROM tb_testl
WHERE id = 2
)
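Put together with the original query, the CASE-guarded version might look like the following sketch. It reuses the CTE names from the question; the guard only admits strings made up entirely of digits, which is an assumption about what counts as convertible here.

```sql
WITH tb_testl AS (
    SELECT 1 AS id, 'hehe' AS value
    UNION ALL SELECT 1, '1'
    UNION ALL SELECT 2, '2'
    UNION ALL SELECT 2, '2'
), tb_test2 AS (
    -- CONVERT is attempted only when value contains no non-digit character;
    -- any other string yields NULL instead of raising a conversion error.
    SELECT CASE WHEN value NOT LIKE '%[^0-9]%' THEN CONVERT(INT, value) END AS value
    FROM tb_testl
    WHERE id = 2
)
SELECT * FROM tb_test2 WHERE value = 2;
```

Because the guard travels with the CONVERT expression, it should still protect the conversion even if the optimizer merges the two WHERE conditions.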
The error is caused by this part of the statement:
), tb_test2 AS (
SELECT CONVERT(INT , value) AS value FROM tb_testl WHERE id = 2
value has type varchar, and the 'hehe' value cannot be converted to an integer:
WITH tb_testl AS (
SELECT 1 AS id ,'hehe' AS value
UPDATE: SQL Server tries to convert all the values to integer in your statement. To avoid the error, rewrite the statement as:
WITH tb_testl AS (
SELECT 1 AS id ,'hehe' AS value
UNION ALL SELECT 1 AS id, '1' AS value
UNION ALL SELECT 2 AS id, '2' AS value
UNION ALL SELECT 2 AS id, '2' AS value
), tb_test2 AS (
SELECT value AS value FROM tb_testl WHERE id = 2
),
tb_test3 AS (
SELECT cast(value as int) AS value FROM tb_test2
)
SELECT * FROM tb_test3

Override Max Function to allow strings SQL?

Hello, I have what I feel is a simple question, but I cannot figure it out. I am trying to find the max number in relation to another column and group by it; the issue is that one of the values is a string.
Name Value
Nate 0
Nate 1
Jeff 2
Nate 2
Nate 'Test'
For the data I actually want 'Test' to be equal to 1. However if I use the MAX() function here I will get:
Name Value
Nate 'Test'
Jeff 2
I can only think of either reading 'Test' as 1 and then using the max function (which I am not sure how to do), or possibly overloading MAX() with my own definition somehow.
Thank you for any help you can give.
Storing mixed data in a string column is generally a bad idea.
You can convert a specific string to a fixed value with a case expression:
select max(case when value = 'Test' then '1' else value end) from ...
But you are still dealing with strings, so you probably want to convert them to numbers, to prevent '10' sorting before '2' for instance:
select max(to_number(case when value = 'Test' then '1' else value end)) from ...
or
select max(case when value = 'Test' then 1 else to_number(value) end) from ...
Using a CTE for your sample data:
-- CTE for dummy data
with your_table (name, value) as (
select 'Nate', '0' from dual
union all select 'Nate', '1' from dual
union all select 'Jeff', '2' from dual
union all select 'Nate', '2' from dual
union all select 'Nate', 'Test' from dual
)
-- actual query
select name,
max(case when value = 'Test' then 1 else to_number(value) end) as value
from your_table
group by name;
NAME VALUE
---- ----------
Nate 2
Jeff 2
But you have to cover all values that cannot be explicitly or implicitly converted to numbers.
It would be slightly easier if you wanted to ignore non-numeric values, or treat them all as the same fixed value, rather than mapping individual strings to their own numeric values. Then you could write a function that attempts to convert any string and, if it gets any exception, returns null (or some other fixed value).
From 12cR1 you can even do this with a PL/SQL declaration rather than a permanent standalone or packaged function, if this is an occasional thing:
with
function hack_to_number(string varchar2) return number is
begin
return to_number(string);
exception
when others then
return 1;
end;
select name,
max(hack_to_number(value)) as value
from your_table
group by name;
NAME VALUE
---- ----------
Nate 2
Jeff 2
You'd probably be better off going back and redesigning the data model to prevent this kind of issue by using the correct data types.
As @DrYWit pointed out in a comment, from 12cR2 you don't even need to do that, as the to_number() function has this built in, if you call it explicitly:
select name,
max(to_number(value default 1 on conversion error)) as value
from your_table
group by name;
How about this regular expression "trick"?
with your_table (name, value) as (
select 'Nate', '0' from dual
union all select 'Nate', '1' from dual
union all select 'Jeff', '2' from dual
union all select 'Nate', '2' from dual
union all select 'Nate', 'Test' from dual
)
select name, max(to_number(value)) mv
from your_table
where regexp_like (value, '^\d+$')
group by name;
NAME MV
---- ----------
Nate 2
Jeff 2

Oracle Order By Sorting: Column Values with character First Followed by number

I have column values as
AVG,ABC, AFG, 3/M, 150,RFG,567, 5HJ
Requirement is to sort as below:
ABC,AFG,AVG,RFG,3/M,5HJ,150,567
Any help?
If you want to sort letters before numbers, then you can test each character. Here is one method:
order by (case when substr(col, 1, 1) between 'A' and 'Z' then 1 else 2 end),
(case when substr(col, 2, 1) between 'A' and 'Z' then 1 else 2 end),
(case when substr(col, 3, 1) between 'A' and 'Z' then 1 else 2 end),
col
This doesn't produce exactly the requested output, but for a lexicographic sort with digits after letters, TRANSLATE is a simple solution:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions196.htm
select value
from (
select 'AVG' as value from dual
union all
select 'ABC' as value from dual
union all
select 'AFG' as value from dual
union all
select '3/M' as value from dual
union all
select '150' as value from dual
union all
select 'RFG' as value from dual
union all
select '567' as value from dual
union all
select '5HJ' as value from dual
)
order by translate(upper(value), 'ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ')
;
This shifts all the letters down and numbers to the end.
Unfortunately, in the default sort order digits come before letters.
I would suggest adding a calculated column that prefixes 'ZZZ' to values starting with a digit, and then sorting by that virtual column.
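The same idea, a computed sort key, can also be written directly in ORDER BY without storing a column. The sketch below assumes a grouping rule inferred from the requested output (values starting with a letter first, then mixed values starting with a digit, then digit-only values in numeric order); the CTE name t is hypothetical.

```sql
-- Sample values taken from the question.
with t (value) as (
  select 'AVG' from dual union all
  select 'ABC' from dual union all
  select 'AFG' from dual union all
  select '3/M' from dual union all
  select '150' from dual union all
  select 'RFG' from dual union all
  select '567' from dual union all
  select '5HJ' from dual
)
select value
from t
order by
  case
    when regexp_like(value, '^[[:alpha:]]') then 1  -- starts with a letter
    when regexp_like(value, '^[0-9]+$')     then 3  -- digits only
    else 2                                          -- everything else, e.g. 3/M
  end,
  -- numeric order within the digits-only group; NULL for all other rows
  case when regexp_like(value, '^[0-9]+$') then to_number(value) end,
  value;
```

For the sample values this should return ABC, AFG, AVG, RFG, 3/M, 5HJ, 150, 567, which is the order requested in the question.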
If there aren't a large number of unique values, build a table that has the value and its artificial sort order, then order by the sort key. Something like:
create table sort_map
( value varchar2(35),
sort_order number(4)
);
insert into sort_map (value, sort_order) values ('ABC',10);
insert into sort_map (value, sort_order) values ('AFG', 20);
....
insert into sort_map (value, sort_order) values ('150', 70);
insert into sort_map (value, sort_order) values ('567', 80);
--example query
select t.my_col, s.sort_order
from my_table t
join sort_map s
on (t.my_col = s.value)
order by s.sort_order;
A) If you only want to change the order of fully numeric sequences, just create your own isNumeric function:
SELECT * FROM table WHERE isNumeric(field) ORDER BY FIELD
UNION ALL
SELECT * FROM table WHERE NOT isNumeric(field) ORDER BY FIELD
B) If you want to change the order of individual characters, create a function that adds a modifier digit before every character:
Number -> 0
Other -> 2
Letter -> 4
For example:
shortHelper("2FT/") => "024F4T2/"
shortHelper("AZ") => "4A4Z"
shortHelper("Z1") => "4Z01"
Then use "ORDER BY shortHelper(Field)".
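A minimal PL/SQL sketch of such a helper follows. The function name and the modifier digits come from the description above; the parameter and variable names, and the treatment of only A-Z (case-insensitive) as letters, are assumptions.

```sql
create or replace function shortHelper(p_str varchar2)
  return varchar2 deterministic
is
  v_out varchar2(32767);
  v_ch  varchar2(1 char);
begin
  -- Prefix every character with its class digit: 0 = digit, 4 = letter, 2 = other.
  for i in 1 .. nvl(length(p_str), 0) loop
    v_ch  := substr(p_str, i, 1);
    v_out := v_out ||
             case
               when v_ch between '0' and '9'        then '0'
               when upper(v_ch) between 'A' and 'Z' then '4'
               else '2'
             end || v_ch;
  end loop;
  return v_out;
end;
/
```

With this definition, select shortHelper('2FT/') from dual should give 024F4T2/, matching the example above. The result is twice as long as the input, so very long inputs can exceed the VARCHAR2 size limit that applies when the function is called from SQL.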

How do you make an Oracle SQL query to

This table is rather backwards from a normal schema, and I'm not sure how to get the data I need from it.
Here is some sample data,
Value (column) Info (column)
---------------------------------------------
Supplier_1 'Some supplier'
Supplier_1_email 'foo#gmail.com'
Supplier_1_rating '5'
Supplier_1_status 'Active'
Supplier_2 'Some other supplier'
Supplier_2_email 'bar#gmail.com'
Supplier_2_rating '4'
Supplier_2_status 'Active'
Supplier_3 'Yet another supplier'
...
I need a query to find the email of the supplier which has the highest rating and is currently of status 'Active'.
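select sup_email, sup_rating
from (
  -- the_table stands in for the real table name (TABLE itself is a reserved word)
  select m.sup_email, r.sup_rating
  from (select substr(value, 1, length(value) - length('_email')) as sup_name, info as sup_email
        from the_table where value like '%_email') m
  left join
       (select substr(value, 1, length(value) - length('_rating')) as sup_name, info as sup_rating
        from the_table where value like '%_rating') r
    on m.sup_name = r.sup_name
  -- Oracle has no LIMIT: sort in a subquery, then keep the first row
  order by to_number(r.sup_rating) desc nulls last
)
where rownum = 1;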
For a single pass solution, try:
select "email" from
(select
substr("value", 1, 8 + instr(substr("value", 10, length("value")-9),'_')) "supplier",
max(case when "value" like '%_status' then "info" end) as "status",
max(case when "value" like '%_rating' then cast("info" as integer) end) as "rating",
max(case when "value" like '%_email' then "info" end) as "email"
from "table" t
where "value" like '%_rating' or "value" like '%_email' or "value" like '%_status'
group by substr("value", 1, 8 + instr(substr("value", 10, length("value")-9),'_'))
having max(case when "value" like '%_status' then "info" end) = 'Active'
order by 3 desc
) where rownum = 1
(Column names are all double-quoted as some are reserved words.)
Expanding on Mike's excellent suggestion:
CREATE VIEW supplier_names AS
SELECT SUBSTR(Value,INSTR(Value,'_')+1) AS supplier_id
,Info AS supplier_name
FROM the_table
WHERE INSTR(Value,'_',1,2) = 0;
CREATE VIEW supplier_emails AS
SELECT SUBSTR(Value,INSTR(Value,'_')+1,INSTR(Value,'_',1,2)-INSTR(Value,'_')-1)
AS supplier_id
,Info AS supplier_email
FROM the_table
WHERE Value LIKE '%email';
CREATE VIEW supplier_ratings AS
SELECT SUBSTR(Value,INSTR(Value,'_')+1,INSTR(Value,'_',1,2)-INSTR(Value,'_')-1)
AS supplier_id
,Info AS supplier_rating
FROM the_table
WHERE Value LIKE '%rating';
CREATE VIEW supplier_statuses AS
SELECT SUBSTR(Value,INSTR(Value,'_')+1,INSTR(Value,'_',1,2)-INSTR(Value,'_')-1)
AS supplier_id
,Info AS supplier_status
FROM the_table
WHERE Value LIKE '%status';
The queries will perform like dogs, so I'd suggest you look into creating some virtual columns, or at least function-based indexes, to optimise these queries.