CREATE TYPE nums_list AS TABLE OF NUMBER;
What is the maximum possible row count in an Oracle nested table?
UPDATE
CREATE TYPE nums_list AS TABLE OF NUMBER;
CREATE OR REPLACE FUNCTION generate_series(from_n NUMBER, to_n NUMBER)
RETURN nums_list AS
ret_table nums_list := nums_list();
BEGIN
FOR i IN from_n..to_n LOOP
ret_table.EXTEND;
ret_table(i) := i;
END LOOP;
RETURN ret_table;
END;
SELECT count(*) FROM TABLE ( generate_series(1,4555555) );
This gives the error ORA-22813: operand value exceeds system limits (Object or Collection value was too large).
The range of subscripts for a nested table is 1..2**31-1 (the upper limit of PLS_INTEGER), so you can have up to 2147483647 elements in the collection. That limit hasn't changed since at least 8.1.6 though, of course, it might change in the future.
Just as an additional observation, it isn't the nested table itself that is too large or using too much memory. With an exception handler you can see that the error is not being thrown by your function. You can populate the same thing in an anonymous block:
DECLARE
ret_table nums_list := nums_list();
BEGIN
FOR i IN 1..4555555 LOOP
ret_table.EXTEND;
ret_table(i) := i;
END LOOP;
dbms_output.put_line(ret_table.count);
END;
/
anonymous block completed
4555555
And you can call your function from a block too:
DECLARE
ret_table nums_list;
BEGIN
ret_table := generate_series(1,4555555);
dbms_output.put_line(ret_table.count);
END;
/
anonymous block completed
4555555
It's only when you use it as a table collection expression that you get an error:
SQL Error: ORA-22813: operand value exceeds system limits
22813. 00000 - "operand value exceeds system limits"
*Cause: Object or Collection value was too large. The size of the value
might have exceeded 30k in a SORT context, or the size might be
too big for available memory.
*Action: Choose another value and retry the operation.
The cause text refers to the SORT context, and a sort is being done by your query:
------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                   |                 |     1 |     2 |    29   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE                    |                 |     1 |     2 |            |          |
|   2 |   COLLECTION ITERATOR PICKLER FETCH| GENERATE_SERIES |  8168 | 16336 |    29   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
As @a_horse_with_no_name suggested, you can avoid the problem by making your function pipelined:
CREATE OR REPLACE FUNCTION generate_series(from_n NUMBER, to_n NUMBER)
RETURN nums_list PIPELINED AS
BEGIN
FOR i IN from_n..to_n LOOP
PIPE ROW (i);
END LOOP;
RETURN;
END;
/
SELECT count(*) FROM TABLE ( generate_series(1,4555555) );
COUNT(*)
----------
4555555
That still does a SORT AGGREGATE, but it no longer seems to mind. I'm not really sure why the sort matters in either case; perhaps someone else will be able to explain what it's doing. (I'm doing this in an 11gR2 instance, by the way; I don't have a 12c instance to verify that the behaviour is the same, but your symptoms suggest it will be.) Or maybe the SORT context isn't the issue at all and available memory is. In my environment your version consistently works up to 4177918 elements, which doesn't look like a significant number, so the limit is probably environment-related.
But it depends how you intend to use the collection; from a PL/SQL context your original version might be more suitable.
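For readers more at home outside PL/SQL, the difference between the two versions is loosely comparable to building a full list versus yielding values one at a time. The Python sketch below is only an analogy (the function names are made up, and it says nothing about Oracle's internal limits): the materialised version has to hold every element before the caller sees any of them, while the streamed version hands each value to the consumer as it is produced.

```python
def series_materialised(from_n, to_n):
    """Build the whole collection first, like the original nested-table function."""
    result = []
    for i in range(from_n, to_n + 1):
        result.append(i)
    return result


def series_streamed(from_n, to_n):
    """Yield one value at a time, loosely like PIPE ROW in a pipelined function."""
    for i in range(from_n, to_n + 1):
        yield i


# Counting the streamed values never requires the full collection to exist at once.
row_count = sum(1 for _ in series_streamed(1, 4555555))
print(row_count)
```

If the caller needs random access to the elements afterwards, the materialised form is the natural fit, which matches the remark above about PL/SQL contexts.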
I am a read-only user of a database and have the following problem:
Scenario:
Call center employees for a company submit tickets to me through our database on behalf of our clients. The call center includes alphanumeric lot numbers of an exact length in their message for me to troubleshoot. Depending on how many times a ticket is updated, there could be several messages for one ticket, each of them having zero or more of these alphanumeric lot numbers embedded in the message. I can access all of these messages with Oracle SQL and SQL Tools.
How can I extract just the lot numbers to make a single-column table of all the given lot numbers?
Example Data:
-- Accessing Ticket 1234 --
SELECT *
FROM communications_detail
WHERE ticket_num = 1234;
-- Results --
TICKET_NUM | MESSAGE_NUM | MESSAGE
------------------------------------------------------------------------------
1234 | 1 | A customer recently purchased some products with
| | a lot number of vwxyz12345 and wants to know if
| | they have been recalled.
------------------------------------------------------------------------------
1234 | 2 | Same customer found lots vwxyz23456 and zyxwv12345
| | in their storage as well and would like those checked.
------------------------------------------------------------------------------
1234 | 3 | These lots have not been recalled. Please inform
| | the client.
So-Far:
I am able to isolate the lot numbers of a constant string with the following code, but it gets put into standard output and not a table format.
DECLARE
msg VARCHAR2(200) := 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.';
cnt NUMBER := regexp_count(msg, '[[:alnum:]]{10}');
BEGIN
IF cnt > 0 THEN
FOR i IN 1..cnt LOOP
Dbms_Output.put_line(regexp_substr(msg, '[[:alnum:]]{10}', 1, i));
END LOOP;
END IF;
END;
/
Goals:
Output results into a table that can itself be used as a table in a larger query statement.
Somehow be able to apply this to all of the messages associated with the original ticket.
Update: Changed the example lot numbers from 8 to 10 characters long to avoid confusion with real words in the messages. The real-world scenario has much longer codes and very specific formatting, so a more complex regular expression will be used.
Update 2: Tried using a table variable instead of standard output. It didn't error, but it didn't populate my query tab... This may just be user error...!
DECLARE
TYPE lot_type IS TABLE OF VARCHAR2(10);
lots lot_type := lot_type();
msg VARCHAR2(200) := 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.';
cnt NUMBER := regexp_count(msg, '[[:alnum:]]{10}');
BEGIN
IF cnt > 0 THEN
FOR i IN 1..cnt LOOP
lots.extend();
lots(i) := regexp_substr(msg, '[[:alnum:]]{10}', 1, i);
END LOOP;
END IF;
END;
/
This is a regex format which matches the LOT mask you provided: '[a-z]{3}[0-9]{5}' . Using something like this will help you avoid the false positives you mention in your question.
Now here is a read-only, pure SQL solution for you.
with cte as (
select 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.' msg
from dual)
select regexp_substr(msg, '[a-z]{3}[0-9]{5}', 1, level) as lotno
from cte
connect by level <= regexp_count(msg, '[a-z]{3}[0-9]{5}')
;
I'm using the WITH clause just to generate the data. The important thing is the use of the CONNECT BY operator, which is part of Oracle's hierarchical query syntax but here generates a table from one row. The pseudo-column LEVEL allows us to traverse the string and pick out the successive occurrences of the regex pattern.
Here's the output:
SQL> r
1 with cte as ( select 'Same customer found lots xyz23456 and zyx12345 in their storage as well and would like those checked.' msg from dual)
2 select regexp_substr(msg, '[a-z]{3}[0-9]{5}', 1, level) as lotno
3 from cte
4 connect by level <= regexp_count(msg, '[a-z]{3}[0-9]{5}')
5*
LOTNO
----------
xyz23456
zyx12345
SQL>
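To make the logic of that query easier to follow, here is the same idea as a small Python sketch: find every occurrence of the mask in each message and collect the results into one flat list. It is only an illustration of the approach, not something a read-only Oracle user would run; the function name and the sample messages are made up, loosely modelled on the ticket in the question.

```python
import re

# Same mask as the regex above: three lowercase letters followed by five digits.
LOT_PATTERN = re.compile(r"[a-z]{3}[0-9]{5}")


def extract_lots(messages):
    """Return every lot number found in any message, in order of appearance."""
    lots = []
    for message in messages:
        lots.extend(LOT_PATTERN.findall(message))
    return lots


# Hypothetical messages for one ticket; the third contains no lot number.
messages = [
    "A customer recently purchased some products with a lot number of xyz12345.",
    "Same customer found lots xyz23456 and zyx12345 in their storage as well.",
    "These lots have not been recalled. Please inform the client.",
]
print(extract_lots(messages))
```

A message without any match simply contributes nothing, which is the behaviour you would want when applying the extraction to all messages of a ticket.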
I'm trying to create a transaction block inside a function. My goal is for the function to be used by one caller at a time: if someone is running the function and another caller wants to use it, the second caller has to wait until the first one has finished. I created this function:
CREATE OR REPLACE FUNCTION my_job(time_to_wait integer) RETURNS INTEGER AS $$
DECLARE
max INT;
BEGIN
BEGIN;
SELECT MAX(max_value) INTO max FROM sch_lock.table_concurente;
INSERT INTO sch_lock.table_concurente(max_value, date_insertion) VALUES(max + 1, now());
-- Sleep a while
PERFORM pg_sleep(time_to_wait);
RETURN max;
COMMIT;
END;
$$
LANGUAGE plpgsql;
But it doesn't seem to work; I get a syntax error at BEGIN;
Without BEGIN; and COMMIT the function runs. I use these queries to check:
-- First user should wait 10 seconds
SELECT my_job(10) as max_value;
-- Second user should wait 3 seconds
SELECT my_job(3) as max_value;
The result is:
+-----+----------------------------+------------+
| id | date | max_value |
+-----+----------------------------+------------+
| 1 | 2017-02-13 13:03:58.12+00 | 1 |
+-----+----------------------------+------------+
| 2 | 2017-02-13 13:10:00.291+00 | 2 |
+-----+----------------------------+------------+
| 3 | 2017-02-13 13:10:00.291+00 | 2 |
+-----+----------------------------+------------+
But the result should be :
+-----+----------------------------+------------+
| id | date | max_value |
+-----+----------------------------+------------+
| 1 | 2017-02-13 13:03:58.12+00 | 1 |
+-----+----------------------------+------------+
| 2 | 2017-02-13 13:10:00.291+00 | 2 |
+-----+----------------------------+------------+
| 3 | 2017-02-13 13:10:00.291+00 | 3 |
+-----+----------------------------+------------+
So the third row (id = 3) should have max_value = 3 and not 2. This happens because the first user selects max = 1 and waits 10 seconds, while the second user also selects max = 1 and waits 3 seconds before the insertion. The behaviour I want is that nobody can use the function until the first caller has finished, so I need something safe and protected.
My questions are:
How can I make a transaction block inside a function?
Do you have any suggestions for how to do this in a safe way?
Thank you.
Ok so you cannot COMMIT in a function. You can have a save point and roll back to the save point however.
Your smallest possible transaction is a single statement parsed and executed by the server from the client, so every function call runs inside a transaction. Within a transaction, however, you can have save points. In this case you would look at the exception handling portions of PostgreSQL to handle this.
However that is not what you want here. You want (I think?) data to be visible during a long-running server-side operation. For that you are kind of out of luck. You cannot really increment your transaction ids while running a function.
You have a few options, in order of what I would consider to be good practice (best to worst):
1. Break down your logic into smaller slices that each move the db from one consistent state to another, and run those in separate transactions.
2. Use a message queue (like pg_message_queue) in the db, plus an external worker, and something which runs a step and yields a message for the next step. The disadvantage is that this adds more maintenance.
3. Use a function or framework like dblink, pl/python, or pl/perlu to connect back to the db and run transactions there. Ick...
You can use dblink for this. Something like :
CREATE OR REPLACE FUNCTION my_job(time_to_wait integer) RETURNS INTEGER AS $$
DECLARE
max INT;
RES TEXT; -- receives the status text returned by each dblink call
BEGIN
SELECT INTO RES dblink_connect('con','dbname=local');
SELECT INTO RES dblink_exec('con', 'BEGIN');
...
SELECT INTO RES dblink_exec('con', 'COMMIT');
SELECT INTO RES dblink_disconnect('con');
END;
$$
LANGUAGE plpgsql;
I don't know whether this is a good approach or not, but what if we use LOCK TABLE, for example like this:
CREATE OR REPLACE FUNCTION my_job(time_to_wait integer) RETURNS INTEGER AS $$
DECLARE
max INT;
BEGIN
-- Lock the table so no one else can use it until the first caller has finished
LOCK TABLE sch_lock.table_concurente IN ACCESS EXCLUSIVE MODE;
SELECT MAX(max_value) INTO max FROM sch_lock.table_concurente;
INSERT INTO sch_lock.table_concurente(max_value, date_insertion) VALUES(max + 1, now());
PERFORM pg_sleep(time_to_wait);
RETURN max;
END;
$$
LANGUAGE plpgsql;
It gives me the right result.
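The reason the lock helps can be shown with a small Python sketch, offered purely as an analogy: a list stands in for the table and a threading.Lock stands in for LOCK TABLE (all names here are made up). The sleep is deliberately placed between reading the maximum and inserting, which is the window in which the second caller would otherwise read a stale maximum.

```python
import threading
import time

rows = [1]                      # stand-in for the max_value column
table_lock = threading.Lock()   # plays the role of LOCK TABLE


def my_job(time_to_wait):
    """Read the current maximum, wait, then insert max + 1, all under one lock."""
    with table_lock:
        current_max = max(rows)
        time.sleep(time_to_wait)   # a second caller must wait here instead of reading early
        rows.append(current_max + 1)
        return current_max


# Two concurrent callers: a slow one and a fast one.
workers = [threading.Thread(target=my_job, args=(delay,)) for delay in (0.2, 0.05)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(rows)
```

Without the lock, both callers could read the same maximum and insert the same value, which is the duplicate seen in the question.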
There is a procedure that fetches the details of one or more projects from the PROJECTS table.
The snippet goes here:
PROCEDURE GET_PROJECTS (
P_PROJECT_ID_LIKE IN VARCHAR2 DEFAULT '%',
P_SEPARATOR IN VARCHAR2 DEFAULT '-=-' )
AS
CURSOR PROJECTS_CURSOR IS
.....
WHERE
PROJECT_ID LIKE P_PROJECT_ID_LIKE
The concern is:
PROJECT_ID has a datatype - NUMBER.
P_PROJECT_ID_LIKE has a datatype - VARCHAR2.
I am wondering how LIKE can be used on PROJECT_ID ?
It is working perfectly fine for
GET_PROJECTS('%','-=-');
GET_PROJECTS('28','-=-')
Any insight would be a great help!
An implicit type conversion will take place. Notice 1 - filter(TO_CHAR("N") LIKE 'asdf%') in predicate information section.
15:13:51 (133)LKU@sandbox> create table t (n number);
Table created.
Elapsed: 00:00:00.10
15:14:22 (133)LKU@sandbox> select * from t where n like 'asdf%'
15:14:37   2
15:14:37 (133)LKU@sandbox> @xplan
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------
Plan hash value: 1601196873
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |     1 |    13 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| T    |     1 |    13 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1 / T@SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_CHAR("N") LIKE 'asdf%')
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - "N"[NUMBER,22]
Either way, it doesn't make much sense to filter identifiers with the LIKE operator. If you want to return all rows when a certain condition is met, you should probably do it along these lines:
where project_id = P_PROJECT_ID or P_PROJECT_ID = -1
using basically any numeric value that is not a valid project id.
As @be here now suggested, an implicit conversion will take place. This will work most of the time, but there are some caveats when you reach very large numbers.
Take this scenario for example,
SQL> select to_char(power(2,140)) from dual;
TO_CHAR(POWER(2,140))
----------------------------------------
1.3937965749081639463459823920405226E+42
The number was converted to char in exponential notation, so a LIKE pattern written against the plain digits would not match.
If you don't reach these numbers you should be fine.
Although this is an Oracle question, take some advice from the Zen of Python:
Explicit is better than implicit.
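The same pitfall can be reproduced outside the database. The sketch below uses Python's decimal module as a rough stand-in for NUMBER-to-string conversion; the exact digits and the point at which exponential notation kicks in differ from Oracle's TO_CHAR, and matches_prefix is a made-up helper that only mimics LIKE 'prefix%'.

```python
from decimal import Decimal


def matches_prefix(value, prefix):
    """Mimic `value LIKE 'prefix%'` after converting the number to text."""
    return str(value).startswith(prefix)


small = Decimal(28)
large = Decimal(2) ** 140       # same value as power(2,140) above

# A small id converts to plain digits, so the prefix test behaves as expected.
print(str(small), matches_prefix(small, "28"))

# The large value converts to exponential notation, so a digit prefix fails.
print(str(large), matches_prefix(large, "1393"))
```

Formatting the number explicitly before comparing removes the dependency on whatever notation the implicit conversion happens to pick.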
I didn't check this in Oracle, but in SQL Server it is possible to use LIKE against an INT column,
for example
Create table mytable(
id int not null ,
name varchar(50))
then you can select with like
select * from mytable where id like '123%'
This works because SQL Server implicitly converts the int value to a character string before applying LIKE.
So my conclusion is that this might be possible in Oracle too; please check.
Background
For a data entry project, a user can enter variables using a short-hand notation:
"Pour i1 into a flask."
"Warm the flask to 25 degrees C."
"Add 1 drop of i2 to the flask."
"Immediately seek cover."
In this case i1 and i2 are reference variables, where the number refers to an ingredient. The text strings are in the INSTRUCTION table, the ingredients in the INGREDIENT table.
Each ingredient has a sequence number for sorting purposes.
Problem
Users may rearrange the ingredient order, which adversely changes the instructions. For example, the ingredient order might look as follows, initially:
seq | label
1 | water
2 | sodium
The user adds another ingredient:
seq | label
1 | water
2 | sodium
3 | francium
The user reorders the list:
seq | label
1 | water
2 | francium
3 | sodium
At this point, the following line is now incorrect:
"Add 1 drop of i2 to the flask."
The i2 must be renumbered (because ingredient #2 was moved to position #3) to point to the original reference variable:
"Add 1 drop of i3 to the flask."
Updated Details
This is a simplified version of the problem. The full problem can have lines such as:
"Add 1 drop of i2 to the o3 of i1."
Where o3 is an object (flask), and i1 and i2 are water and sodium, respectively.
Table Structure
The ingredient table is structured as follows:
id | seq | label
The instruction table is structured as follows:
step
Algorithm
The algorithm I have in mind:
Repeat for all steps that match the expression '\mi([0-9]+)':
    Break the step into word tokens.
    For each token:
        If the numeric portion of the token matches the old sequence number, replace it with the new sequence number.
    Recombine the tokens and update the instruction.
Update the ingredient number.
Update
The algorithm may be incorrect as written. There could be two reference variables that must change. Consider before:
seq | label
1 | water
2 | sodium
3 | caesium
4 | francium
And after (swapping sodium and caesium):
seq | label
1 | water
2 | caesium
3 | sodium
4 | francium
Every i2 in every step must become i3; similarly i3 must become i2. So
"Add 1 drop of i2 to the flask, but absolutely do not add i3."
Becomes:
"Add 1 drop of i3 to the flask, but absolutely do not add i2."
Code
The code to perform the first two parts of the algorithm resembles:
CREATE OR REPLACE FUNCTION
renumber_steps(
p_ingredient_id integer,
p_old_sequence integer,
p_new_sequence integer )
RETURNS void AS
$BODY$
DECLARE
v_tokens text[];
BEGIN
FOR v_tokens IN
SELECT
t.tokens
FROM (
SELECT
regexp_split_to_array( step, '\W' ) tokens,
regexp_matches( step, '\mi([0-9]+)' ) matches
FROM
instruction
) t
LOOP
RAISE NOTICE '%', v_tokens;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
Question
What is a more efficient way to solve this problem (i.e., how would you eliminate the looping constructs), possibly leveraging PostgreSQL-specific features, without a major revision to the data model?
Thank you!
System Details
PostgreSQL 9.1.2.
You have to take care that you don't change ingredients and seq numbers back and forth. I introduce a temporary prefix for ingredients and negative numbers for seq for that purpose and exchange them for permanent values when all is done.
Could work like this:
CREATE OR REPLACE FUNCTION renumber_steps(_old int[], _new int[])
RETURNS void AS
$BODY$
DECLARE
_prefix CONSTANT text := ' i'; -- prefix, incl. leading space
_new_prefix CONSTANT text := ' ###'; -- temp prefix, incl. leading space
i int;
o text;
n text;
BEGIN
IF array_upper(_old,1) <> array_upper(_new,1) THEN
RAISE EXCEPTION 'Array length mismatch!';
END IF;
FOR i IN 1 .. array_upper(_old,1) LOOP
IF _old[i] <> _new[i] THEN
o := _prefix || _old[i] || ' '; -- leading and trailing blank!
-- new instruction are temporarily prefixed with new_marker
n := _new_prefix || _new[i] || ' ';
UPDATE instruction
SET step = replace(step, o, n) -- replace all instances
WHERE step ~~ ('%' || o || '%');
UPDATE ingredient
SET seq = _new[i] * -1 -- temporarily negative
WHERE seq = _old[i];
END IF;
END LOOP;
-- finally replace temp. prefix
UPDATE instruction
SET step = replace(step, _new_prefix, _prefix)
WHERE step ~~ ('%' || _new_prefix || '%');
-- .. and temp. negative seq numbers
UPDATE ingredient
SET seq = seq * -1
WHERE seq < 0;
END;
$BODY$
LANGUAGE plpgsql VOLATILE STRICT;
Call:
SELECT renumber_steps('{2,3,4}'::int[], '{4,3,2}'::int[]);
The algorithm requires ...
... that ingredients in the steps are delimited by spaces.
... that there are no permanent negative seq numbers.
_old and _new are arrays of the old and new ingredient.seq values of the ingredients that change position. The lengths of both arrays have to match, or an exception will be raised. The arrays can contain seq values that don't change; nothing will happen to those.
Requires PostgreSQL 9.1 or later.
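The reason for the temporary prefix is easiest to see on a single string. Here is a minimal Python sketch of the same two-phase idea (the function name is made up, and like the SQL version it assumes references are delimited by spaces and that ' ###' never occurs in a real step): each changed reference is first parked under the temporary prefix, so a later replacement in the same run cannot touch it again.

```python
PREFIX = " i"          # permanent prefix, including the leading space
TEMP_PREFIX = " ###"   # temporary prefix, assumed never to occur in real steps


def renumber_step(step, old, new):
    """Apply the old -> new sequence changes to one instruction string."""
    if len(old) != len(new):
        raise ValueError("Array length mismatch!")
    for o, n in zip(old, new):
        if o != n:
            # Phase 1: park each changed reference under the temporary prefix.
            step = step.replace(f"{PREFIX}{o} ", f"{TEMP_PREFIX}{n} ")
    # Phase 2: swap the temporary prefix back to the permanent one.
    return step.replace(TEMP_PREFIX, PREFIX)


before = "Add 1 drop of i2 to the flask, but absolutely do not add i3 here."
print(renumber_step(before, [2, 3], [3, 2]))
```

Replacing i2 with i3 directly and then i3 with i2 would turn both references into i2, which is exactly the back-and-forth problem mentioned at the top.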
I think your model is problematic... you should have the "real name (id)" (i1, o3 etc.) FIXED after creation and have a second field in the ingredient table providing the "sorting". The user enters the "sorting name" and you immediately replace it with the "real name" (id) on saving the entered data into the step table.
When you read it from the step table you just replace/map the "real name" (id) with the current "sorting name" for display purposes if need be...
This way you don't have to change the data already in the step table every time someone changes the sorting, which is a complex and expensive operation IMHO, and prone to concurrency problems too...
The above option reduces the whole problem to a mapping operation (table ingredient) on INSERT/UPDATE/SELECT (table step) for the one entry currently being worked on; it doesn't touch any other entries already there.
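A rough sketch of that mapping, in Python for brevity: the two helper functions, the {id} marker syntax, and the ingredient ids are all hypothetical, chosen only to show that a reorder changes the lookup table and nothing else.

```python
import re


def to_stored(step, seq_to_id):
    """On save: turn user-facing i<seq> references into fixed {id} markers."""
    return re.sub(r"\bi(\d+)\b", lambda m: "{%d}" % seq_to_id[int(m.group(1))], step)


def to_display(stored, id_to_seq):
    """On read: turn fixed {id} markers back into the current i<seq> names."""
    return re.sub(r"\{(\d+)\}", lambda m: "i%d" % id_to_seq[int(m.group(1))], stored)


# Hypothetical ingredient ids: 10 = water, 20 = sodium, 30 = francium.
stored = to_stored("Add 1 drop of i2 to the flask.", {1: 10, 2: 20, 3: 30})

# After the user moves francium to position 2, only the mapping changes.
print(to_display(stored, {10: 1, 30: 2, 20: 3}))
```

The stored step is never rewritten when the order changes; sodium is still id 20 and is simply displayed under its new position.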
In a stored procedure (which has a date parameter named 'paramDate' ) I have a query like this one
select id, name
from customer
where period_aded = to_char(paramDate,'mm/yyyy')
Will Oracle convert paramDate to a string for each row?
I was sure that Oracle wouldn't, but I was told that it will.
In fact, I thought that if the function's parameter is constant (neither a column nor a value calculated inside the query), the result should always be the same, which is why Oracle should perform the conversion only once.
Then I realized that I have sometimes executed DML statements inside functions, and perhaps that could cause the resulting value to change, even if it does not change for each row.
This would mean that I should convert such values before I add them to the query.
Anyway, perhaps well-known (built-in) functions are evaluated once, or perhaps even my own functions would be.
Anyway, again...
Will Oracle execute that to_char once, or will it do so for each row?
Thanks for your answers
I do not think this is generally the case, as it would prevent an index from being used.
At least for built-in functions, Oracle should be able to figure out that it could evaluate it only once. (For user-defined functions, see below).
Here is a case where an index is being used (and the function is not evaluated for every row):
SQL> select id from tbl_table where id > to_char(sysdate, 'YYYY');
--------------------------------------------------------------------------------
| Id  | Operation        | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |             |    35 |   140 |     1   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| SYS_C004274 |    35 |   140 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID">TO_NUMBER(TO_CHAR(SYSDATE@!,'YYYY')))
For user-defined functions check out this article. It mentions two ways to ensure that your function gets called only once:
Since Oracle 10.2, you can define the function as DETERMINISTIC.
On older versions you can re-phrase it to use "scalar subquery caching":
SELECT COUNT(*)
FROM EMPLOYEES
WHERE SALARY = (SELECT getValue(1) FROM DUAL);
Looking at write-ups on the DETERMINISTIC keyword (here is one, here is another), it was introduced to allow the developer to tell Oracle that the function will return the same value for the same input params. So if you want your functions to be called only once, and you can guarantee they will always return the same value for the same input params you can use the keyword DETERMINISTIC.
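As a loose analogy for what that promise buys you, here is a Python sketch using functools.lru_cache. It is not how Oracle implements DETERMINISTIC or scalar subquery caching, and get_value is a made-up stand-in for an expensive user function; the point is only that once a function is known to return the same result for the same input, repeated calls in a row filter can be answered from a cache.

```python
from functools import lru_cache

call_count = 0


@lru_cache(maxsize=None)
def get_value(key):
    """Hypothetical expensive function; counts how often its body really runs."""
    global call_count
    call_count += 1
    return key * 100


# The filter asks for get_value(1) once per row, but the body runs only once.
salaries = [100, 250, 100, 400]
matching = [s for s in salaries if s == get_value(1)]
print(matching, call_count)
```

The cache is only safe because the function has no hidden inputs, which is the same guarantee the developer gives Oracle when declaring a function DETERMINISTIC.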
With regards to built-in functions like to_char, I defer to those who are better versed in the innards of Oracle to give you direction.
The concern about to_char does not ring a bell with me. However, in your PL/SQL you could have
create or replace procedure ........
some_variable varchar2(128);
begin
some_variable := to_char(paramDate,'mm/yyyy');
-- and your query could read
select id, name from customer where period_aded = some_variable;
.
.
.
end;
/