Oracle SQL: How to perform comparison by converting "Varchar" to "Number" - sql

I have read-only access to an Oracle database (I can only run SELECT statements).
I want to apply comparison conditions to a Varchar column by converting it to Number.
Reference Data:
ID | Price | Currency
-------------------------
548 | 6000 | USD
9784 | 7000 | EUR
254 | 5000 | USD
Query used:
select id, price, currency
from (select item_id id,
             to_number(item_price) price,
             item_currency currency
      from item
      where item_price is not null) A
where A.price <= 6000;
Expected Output:
ID | Price | Currency
-------------------------
548 | 6000 | USD
254 | 5000 | USD

"ORA-01722: invalid number" means what it says: you are attempting to cast a string to a number when the string contains a non-numeric value.
This is the danger of using weakly-typed columns. People always say, "our application will validate the input". But the one thing you can guarantee is that someone (or something) will stick a non-numeric value into that column.
Okay, so hindsight is a marvellous thing and you probably don't want a lecture from me about data integrity: what, practically, can you do? Basically you need to identify the values which won't cast to numbers and handle them somehow (change the value, filter them from the query, whatever).
Before Oracle 12.2 (which added VALIDATE_CONVERSION) there is no built-in to test for numberness, but it's easy to write one:
create or replace function is_number (p_str in varchar2)
  return varchar2
is
  n number;
begin
  -- to_number raises an exception when p_str is not a valid number
  n := to_number(p_str);
  return 'TRUE';
exception
  when invalid_number or value_error then
    return 'FALSE';
end;
/
Here's one way to use it.
with cte as (select item_id id, item_price price, item_currency currency
             from item
             where is_number(item_price) = 'TRUE')
select id, price, currency
from cte
where to_number(price) <= 6000;
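The "test first, then convert" idea above can also be sketched outside the database. The following Python snippet is only an illustration, not Oracle code: the helper names, the sample rows and the bad price 'N/A' are hypothetical, and Python's float() does not accept exactly the same strings as Oracle's to_number.

```python
def is_number(text):
    """Return True if text can be converted to a number, False otherwise."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def items_up_to(rows, limit):
    """Keep rows whose price string is numeric and not above limit."""
    return [
        (item_id, float(price), currency)
        for item_id, price, currency in rows
        if is_number(price) and float(price) <= limit
    ]


# Hypothetical sample data: the question's rows plus one non-numeric price.
ROWS = [
    (548, "6000", "USD"),
    (9784, "7000", "EUR"),
    (254, "5000", "USD"),
    (11, "N/A", "USD"),
]

if __name__ == "__main__":
    for row in items_up_to(ROWS, 6000):
        print(row)
```

The check and the conversion sit in the same expression, so a non-numeric value is never passed to the conversion.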

You can use the CAST() function. For example:
SELECT product_id, CAST(ad_sourcetext AS VARCHAR2(30)) FROM print_media;

Related

Query returns rows outside of `between` range?

I am querying a SQL Server database to get results from a table between two number values. Here is that statement:
select *
FROM [DATA].[dbo].[TableName] with (nolock)
where number between '1400' and '1500'
order by CAST(number as float);
For the most part, the results are within the range as expected. However, I do see some anomalies where a number that has the first four digits within the range is returned as a result. For example:
14550
In the result above, the first four digits are 1455 which would be within the range of 1400 to 1500. My guess is that this has to do with the CAST(number as float) part of the statement. Any suggestions on how I can update this statement to only return numbers between the stated values?
Here is the number info I get when running sp_help:
| Column_name | Type | Computed | Length | Prec | Scale | Nullable | TrimTrailingBlanks | FixedLenNullInSource | Collation |
=============================================================================================================================================================
| NUMBER | varchar | no | 4000 | | | yes | no | yes | SQL_Latin1_General_CP1_CI_AS |
Your comparison is being done as a string comparison, because the column named number is stored as a string and the comparison values are strings. You could easily fix this just by changing the comparison values to numbers:
select *
FROM [DATA].[dbo].[TableName]
where number between 1400 and 1500
order by CAST(number as float);
But this is a hacky solution -- and it will return an error if any of the number values are not numbers. The real solution is to fix the data model, so it is not storing numbers as strings:
alter table tablename alter column number int;
This uses int because all the referenced values in the question are ints.
If you cannot do this because the column is erroneously called number and contains non-numbers, then use a safe conversion function:
select *
FROM [DATA].[dbo].[TableName]
where try_cast(number as float) between 1400 and 1500
order by try_cast(number as float);
Note: I'm also not sure if this is the logic you really want, because it includes 1500. You might really want:
select *
FROM [DATA].[dbo].[TableName]
where try_cast(number as float) >= 1400 and
try_cast(number as float) < 1500
order by try_cast(number as float);
You have to cast the number as an int:
select *
FROM [DATA].[dbo].[TableName]
where CAST(number as int) between 1400 and 1500
order by CAST(number as int);
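To see why the string comparison described in the answers above lets 14550 through, here is a small Python sketch. The sample values are made up for illustration, and Python compares strings by code point rather than by a SQL Server collation, but for plain digit strings the ordering is the same.

```python
# Hypothetical sample of values stored as text
values = ["1400", "1455", "14550", "1500", "15", "2000"]

# Compared as strings: character by character, left to right
as_text = [v for v in values if "1400" <= v <= "1500"]

# Compared as numbers after an explicit conversion
as_numbers = [v for v in values if 1400 <= int(v) <= 1500]

if __name__ == "__main__":
    print(as_text)     # includes "14550" and "15"
    print(as_numbers)  # only values numerically between 1400 and 1500
```

In the text comparison, "14550" sorts between "1400" and "1500" because its third character "5" is greater than "0" and its second character "4" is less than "5"; the length of the string plays no role until a common prefix runs out.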

Achieving window function-like behavior using a PostgreSQL user defined function?

Let's say that given a table observations_tbl with attributes date (day) and value, I want to produce the new attribute prev_day_value to get the following table:
|---------------------|-------|----------------|
| date | value | prev_day_value |
|---------------------|-------|----------------|
| 01.01.2015 00:00:00 | 5 | 0 |
| 02.01.2015 00:00:00 | 4 | 5 |
| 03.01.2015 00:00:00 | 3 | 4 |
| 04.01.2015 00:00:00 | 2 | 3 |
|---------------------|-------|----------------|
I am well-aware that such an output can typically be obtained using a WINDOW function. But how would I achieve this through a PostgreSQL user defined function? I want to indicate that I am in a situation where I must use a function, difficult to explain why without going into detail - these are the restrictions I have and if anything, it is a technical challenge.
Take into consideration this template query:
SELECT *, lag(value,1) AS prev_day_value -- or lag(record,1) or lag(date,value,1) or lag(date,1) or lag(observations_tbl,1), etc.
FROM observations_tbl
I am using function lag with parameter 1 to look for a value which comes before the current row by 1 - a distance of 1 row. I don't care what other parameters the function lag can have (table name, other attributes) - what could the function lag look like to achieve such functionality? The function can be of any language, SQL, PL/pgSQL and even C using PostgreSQL API/backend.
I understand that one answer can be wrapping a WINDOW query inside lag user defined function. But I am thinking that would be a rather costly operation if I have to scan the entire table twice (once inside the lag function and once outside). I was thinking that maybe each PostgreSQL record would have a pointer to its previous record which is directly accessible? Or that I can somehow open a cursor at this specific row / row number without having to scan the entire table? Or is what I am asking impossible?
Your request cannot be solved with purely relational tools (window functions are a non-relational extension to SQL). In C you can write your own alternative to the lag function, and you can do the same in PL/v8 (JavaScript). Unfortunately, no API for window functions exists in PL/pgSQL, so you cannot write a simple PL/pgSQL function that has access to a row other than the one being processed.
One possible alternative (with some performance risk) is to write a table function. There you have control over the whole processed dataset, and the operation becomes simple.
CREATE OR REPLACE FUNCTION report()
RETURNS TABLE(d date, v int, prev_v int) AS $$
DECLARE r RECORD;
BEGIN
  prev_v := 0;
  FOR r IN SELECT date, value FROM observations_tbl t ORDER BY 1
  LOOP
    d := r.date; v := r.value;
    RETURN NEXT;
    prev_v := v;
  END LOOP;
END;
$$ LANGUAGE plpgsql;
There is no other usable alternative. Long ago these values were calculated with correlated self-joins, but that approach has pretty terrible performance.
What Pavel posted, just with fewer assignments. Should be faster:
CREATE OR REPLACE FUNCTION report()
RETURNS TABLE(d date, v int, prev_v int) AS
$func$
BEGIN
prev_v := 0;
FOR d, v IN
SELECT date, value FROM observations_tbl ORDER BY 1
LOOP
RETURN NEXT;
prev_v := v;
END LOOP;
END
$func$ LANGUAGE plpgsql;
The general idea can pay off if it actually replaces multiple scans over the table with a single one. Like here:
GROUP BY and aggregate sequential numeric values
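The single-pass pattern used by both table functions above (emit the current row together with the value remembered from the previous one) can be sketched in Python as a generator. This is only an illustration of the control flow, not PostgreSQL code: the function name and the ISO-formatted sample dates are assumptions.

```python
def with_previous(rows, default=0):
    """Yield (day, value, previous_value) in date order, scanning the rows once."""
    prev = default
    for day, value in sorted(rows):
        yield day, value, prev
        prev = value  # remember this row's value for the next iteration


# Hypothetical sample data matching the table in the question
OBSERVATIONS = [
    ("2015-01-03", 3),
    ("2015-01-01", 5),
    ("2015-01-04", 2),
    ("2015-01-02", 4),
]

if __name__ == "__main__":
    for row in with_previous(OBSERVATIONS):
        print(row)
```

As in the PL/pgSQL versions, the only state carried between rows is the previous value, so the data is read once after sorting.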

Oracle performance: query executing multiple identical function calls

Is it possible for Oracle to reuse the result of a function when it is called in the same query (transaction?) without the use of the function result cache?
The application I am working with is heavily reliant on Oracle functions. Many queries end up executing the exact same functions multiple times.
A typical example would be:
SELECT my_package.my_function(my_id),
my_package.my_function(my_id) / 24,
my_package.function_also_calling_my_function(my_id)
FROM my_table
WHERE my_table.id = my_id;
I have noticed that Oracle always executes each of these functions, not realizing that the same function was called just a second ago in the same query. It is possible that some elements in the function get cached, resulting in a slightly faster return. This is not relevant to my question as I want to avoid the entire second or third execution.
Assume that the functions are fairly resource-consuming and that they may call more functions, basing their result on tables that are reasonably large and frequently updated (a million records, with roughly 1000 updates per hour). For this reason it is not possible to use Oracle's Function Result Cache.
Even though the data is changing frequently, I expect the result of these functions to be the same when they are called from the same query.
Is it possible for Oracle to reuse the result of these functions and how? I am using Oracle 11g and Oracle 12c.
Below is an example (just a random non-sense function to illustrate the problem):
-- Takes 200 ms
SELECT test_package.testSpeed('STANDARD', 'REGEXP_COUNT')
FROM dual;
-- Takes 400ms
SELECT test_package.testSpeed('STANDARD', 'REGEXP_COUNT')
, test_package.testSpeed('STANDARD', 'REGEXP_COUNT')
FROM dual;
Used functions:
CREATE OR REPLACE PACKAGE test_package IS
FUNCTION testSpeed (p_package_name VARCHAR2, p_object_name VARCHAR2)
RETURN NUMBER;
END;
/
CREATE OR REPLACE PACKAGE BODY test_package IS
FUNCTION testSpeed (p_package_name VARCHAR2, p_object_name VARCHAR2)
RETURN NUMBER
IS
ln_total NUMBER;
BEGIN
SELECT SUM(position) INTO ln_total
FROM all_arguments
WHERE package_name = 'STANDARD'
AND object_name = 'REGEXP_COUNT';
RETURN ln_total;
END testSpeed;
END;
/
Add an inline view and a ROWNUM to prevent Oracle from rewriting the query into a single query block and executing the functions multiple times.
Sample function and demonstration of the problem
create or replace function wait_1_second return number is
begin
execute immediate 'begin dbms_lock.sleep(1); end;';
-- ...
-- Do something here to make caching impossible.
-- ...
return 1;
end;
/
--1 second
select wait_1_second() from dual;
--2 seconds
select wait_1_second(), wait_1_second() from dual;
--3 seconds
select wait_1_second(), wait_1_second() , wait_1_second() from dual;
Simple query changes that do NOT work
Both of these methods still take 2 seconds, not 1.
select x, x
from
(
select wait_1_second() x from dual
);
with execute_function as (select wait_1_second() x from dual)
select x, x from execute_function;
Forcing Oracle to execute in a specific order
It's difficult to tell Oracle "execute this code by itself, don't do any predicate pushing, merging, or other transformations on it". There are hints for each of those optimizations, but they are difficult to use. There are a few ways to disable those transformations; adding an extra ROWNUM is usually the easiest.
--Only takes 1 second
select x, x
from
(
select wait_1_second() x, rownum
from dual
);
It's hard to see exactly where the functions get evaluated. But these explain plans show how the ROWNUM causes the inline view to run separately.
explain plan for select x, x from (select wait_1_second() x from dual);
select * from table(dbms_xplan.display(format=>'basic'));
Plan hash value: 1388734953
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | FAST DUAL | |
---------------------------------
explain plan for select x, x from (select wait_1_second() x, rownum from dual);
select * from table(dbms_xplan.display(format=>'basic'));
Plan hash value: 1143117158
---------------------------------
| Id | Operation | Name |
---------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | COUNT | |
| 3 | FAST DUAL | |
---------------------------------
You can try the deterministic keyword to mark functions as pure. Whether or not this actually improves performance is another question though.
Update:
I don't know how realistic your example above is, but in theory you can always try to restructure your SQL so that it knows about repeated function calls (actually repeated values). Something like:
select x,x from (
SELECT test_package.testSpeed('STANDARD', 'REGEXP_COUNT') x
FROM dual
)
Use an in-line view.
with get_functions as(
SELECT my_package.my_function(my_id) as func_val,
my_package.function_also_calling_my_function(my_id) func_val_2
FROM my_table
WHERE my_table.id = my_id
)
select func_val,
func_val / 24 as func_val_adj,
func_val_2
from get_functions;
If you want to eliminate the call for item 3, instead pass the result of func_val to the third function.

Comparing Value Ranges Between 2 Tables

I have an Oracle 10g database that has 2 tables: a REBATES table, and an ORDERS table.
The REBATES table looks sort of like this:
| rebate_percentage | min_purchase |
------------------------------------
| 1.0 | 5000 |
| 1.5 | 7000 |
| 2.0 | 11000 |
| 5.0 | 20000 |
I'm trying to determine the rebate percentage to apply, based on total orders. I know how to find the sum of all orders for a particular customer, for a particular time range, but how do I also grab the rebate percentage, all in one query?
For example, if the order total is 16,000 then how can I construct a query that takes this value, compares it against the REBATES table, and returns 2.0?
In my opinion, the easiest way is to have both a min and a max purchase amount:
select rebate_percentage, min_purchase,
(lead(min_purchase, 1) over (order by min_purchase) - 1) as max_purchase
from rebates
Then you can do a simple between join, where the join condition looks like:
on totalorders between rebates.min_purchase and rebates.max_purchase
You can handle the final case (with NULLs) with a modified join condition:
on totalorders >= rebates.min_purchase and
(totalorders <= rebates.max_purchase or rebates.max_purchase is null)
Or, alternatively, by changing the original logic to have a coalesce() on the lead function with some very large value.
Use a function. For example:
CREATE OR REPLACE FUNCTION RebatePercentage(purchase NUMBER) RETURN NUMBER IS
  rebateVal NUMBER;
  minPurchase NUMBER;
BEGIN
  -- Highest threshold that the purchase amount reaches
  SELECT MAX(min_purchase)
  INTO minPurchase
  FROM REBATES
  WHERE min_purchase <= purchase;
  -- Rebate percentage attached to that threshold
  SELECT rebate_percentage
  INTO rebateVal
  FROM REBATES WHERE min_purchase = minPurchase;
  RETURN rebateVal;
END;
/
Now you can call this function in your query
SELECT RebatePercentage(purchase_amt) from orders;
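Both answers above come down to the same lookup: find the highest min_purchase that does not exceed the order total and return its percentage. A minimal Python sketch of that lookup, using the tiers from the question, might look like this. The function name and the choice to return None below the lowest tier are assumptions, not part of either answer.

```python
from bisect import bisect_right

# (min_purchase, rebate_percentage) pairs from the question, sorted by threshold
TIERS = [(5000, 1.0), (7000, 1.5), (11000, 2.0), (20000, 5.0)]


def rebate_percentage(total, tiers=TIERS):
    """Return the rebate for the highest threshold <= total, or None if none applies."""
    thresholds = [min_purchase for min_purchase, _ in tiers]
    # Number of thresholds that are <= total
    idx = bisect_right(thresholds, total)
    if idx == 0:
        return None  # total is below the lowest threshold
    return tiers[idx - 1][1]


if __name__ == "__main__":
    print(rebate_percentage(16000))
```

For the question's example total of 16,000 the highest threshold reached is 11,000, so the lookup yields 2.0.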

What is a simple way to combine grouped values in one field?

I mean:
Table PHONE_CODES:
ID CODE_NAME PHONE_CODE
1 USA 8101
2 USA 8102
3 PERU 8103
4 PERU_MOB 81031
5 PERU_MOB 81032
And I want via select to get something like this:
CODE_NAME ZONE_CODES
USA 8101; 8102;
PERU 8103
PERU_MOB 81031; 81032;
I could get it via the function below, but perhaps there is a better way:
select distinct(CODE_NAME) as CODE_NAME, get_code_names_by_ZONE(CODE_NAME) as ZONE_CODES from PHONE_CODES;
Function:
create or replace function get_code_names_by_ZONE
(
ZONE_CODE_NAME in varchar2
)
return varchar2
as
codes_list varchar2(4000);
cursor cur_codes_list is
select p.PHONE_CODE
from PHONE_CODES p
where p.CODE_NAME = ZONE_CODE_NAME;
begin
for codes_list_rec in cur_codes_list
LOOP
-- dbms_output.put_line('PHONE_CODE:[' || codes_list_rec.PHONE_CODE || ']');
codes_list := codes_list || codes_list_rec.PHONE_CODE || '; ';
end loop;
return codes_list;
EXCEPTION
when NO_DATA_FOUND then
return 'notfound';
WHEN others then
dbms_output.put_line('Error code:' || SQLCODE || ' msg:' || SQLERRM);
return null;
end get_code_names_by_ZONE;
/
Tim Hall has an excellent discussion on the various string aggregation techniques that are available in Oracle.
A function would be my preferred method of achieving what you want.
If you're on 11g, take a look at the new PIVOT extension to SQL - the best documentation looks to be in the Data Warehousing Guide section. I believe however that the target of the "... for in ..." clause cannot be a subquery and has to be a hard-coded list of values.
Good link, Justin. Tim Hall is awesome. I followed his advice and here it is:
SELECT CODE_NAME,
       LTRIM(MAX(SYS_CONNECT_BY_PATH(PHONE_CODES,';'))
       KEEP (DENSE_RANK LAST ORDER BY curr),';') AS PHONE_CODES
FROM (SELECT CODE_NAME,
             PHONE_CODES,
             ROW_NUMBER() OVER (PARTITION BY CODE_NAME ORDER BY PHONE_CODES) AS curr,
             ROW_NUMBER() OVER (PARTITION BY CODE_NAME ORDER BY PHONE_CODES) -1 AS prev
      FROM a)
GROUP BY CODE_NAME
CONNECT BY prev = PRIOR curr AND CODE_NAME = PRIOR CODE_NAME
START WITH curr = 1;
CODE_NAME PHONE_CODES
---------- --------------------------------------------------
PERU 8103
PERU_MOB 81031;81032
USA 8101;8102
dbBradley - I don't think the Pivot extension works here. The Pivot extension requires the use of an aggregate (sum, count, ...).
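For readers who want to check what the string aggregation in this question should produce, the grouping step can be sketched in Python. This is only a model of the expected result, not an Oracle technique: the function name is hypothetical and the rows are copied from the PHONE_CODES sample in the question.

```python
from collections import defaultdict


def aggregate_codes(rows, sep=";"):
    """Group phone codes by code name and join each group into one string."""
    grouped = defaultdict(list)
    for code_name, phone_code in rows:
        grouped[code_name].append(phone_code)
    # Sort names and codes so the result is deterministic
    return {name: sep.join(sorted(codes)) for name, codes in sorted(grouped.items())}


# Sample rows from the question: (CODE_NAME, PHONE_CODE)
PHONE_CODES = [
    ("USA", "8101"),
    ("USA", "8102"),
    ("PERU", "8103"),
    ("PERU_MOB", "81031"),
    ("PERU_MOB", "81032"),
]

if __name__ == "__main__":
    for name, codes in aggregate_codes(PHONE_CODES).items():
        print(name, codes)
```

The result has one entry per CODE_NAME, with that group's codes joined by the separator, which is the same shape as the SQL*Plus output shown above.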