Mutating Table in Oracle 11 caused by a function

We've recently upgraded from Oracle 10 to Oracle 11.2. After upgrading, I started seeing a mutating table error caused by a function rather than a trigger (which I've never come across before). It's old code that worked in prior versions of Oracle.
Here's a scenario that will cause the error:
create table mutate (
  x NUMBER,
  y NUMBER
);
insert into mutate (x, y)
values (1, 2);
insert into mutate (x, y)
values (3, 4);
I've created two rows. Now, I'll double my rows by calling this statement:
insert into mutate (x, y)
select x + 1, y + 1
from mutate;
This isn't strictly necessary to duplicate the error, but it helps with my demonstration later. So the contents of the table now look like this:
X  Y
1  2
3  4
2  3
4  5
All is well. Now for the fun part:
create or replace function mutate_count
return PLS_INTEGER
is
  v_dummy PLS_INTEGER;
begin
  select count(*)
  into v_dummy
  from mutate;
  return v_dummy;
end mutate_count;
/
I've created a function to query my table and return a count. Now, I'll combine that with an INSERT statement:
insert into mutate (x, y)
select x + 2, y + 2
from mutate
where mutate_count() = 4;
The result? This error:
ORA-04091: table MUTATE is mutating, trigger/function may not see it
ORA-06512: at "MUTATE_COUNT", line 6
So I know what causes the error, but I am curious as to the why. Isn't Oracle performing the SELECT, retrieving the result set, and then performing a bulk insert of those results? I would only expect a mutating table error if records were already being inserted before the query finished. But if Oracle did that, wouldn't the earlier statement:
insert into mutate (x, y)
select x + 1, y + 1
from mutate;
start an infinite loop?
UPDATE:
Through Jeffrey's link I found this in the Oracle docs:
By default, Oracle guarantees statement-level read consistency. The
set of data returned by a single query is consistent with respect to a
single point in time.
There's also a comment from the author in his post:
One could argue why Oracle doesn't ensure this 'statement-level read
consistency' for repeated function calls that appear inside a SQL
statement. It could be considered a bug as far as I'm concerned. But
this is the way it currently works.
Am I correct in assuming that this behavior has changed between Oracle versions 10 and 11?

Firstly,
insert into mutate (x, y)
select x + 1, y + 1
from mutate;
does not start an infinite loop, because the query does not see the data being inserted - only data that existed as of the start of the statement. The new rows are only visible to subsequent statements.
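A quick way to convince yourself (a sketch, using the table from above):
select count(*) from mutate;  -- 4 rows before

insert into mutate (x, y)
select x + 1, y + 1
from mutate;                  -- reads the 4-row snapshot, inserts exactly 4 rows

select count(*) from mutate;  -- 8 rows after; no loop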
This explains it quite well:
When Oracle steps out of the SQL-engine that's currently executing the
update statement, and invokes the function, then this function -- just
like an after row update trigger would -- sees the intermediate states
of EMP as they exist during execution of the update statement. This
implies that the return value of our function invocations heavily
depend on the order in which the rows happen to be updated.

The relevant concepts are "Statement-Level Read Consistency" and "Transaction-Level Read Consistency".
From the manual:
"If a SELECT list contains a function, then the database applies
statement-level read consistency at the statement level for SQL run
within the PL/SQL function code, rather than at the parent SQL
level. For example, a function could access a table whose data is
changed and committed by another user. For each execution of the
SELECT in the function, a new read consistent snapshot is
established".
Both concepts are explained in the "Oracle® Database Concepts" guide:
http://download.oracle.com/docs/cd/B19306_01/server.102/b14220/consist.htm#sthref1955
UPDATE (section added after the question was closed)
The rule
The technical rule, linked by Mr Kemp (#jeffrey-kemp) and well explained by Toon Koppelaars, is stated in "PL/SQL Language Reference - Controlling Side Effects of PL/SQL Subprograms" (your function violates RNDS, "reads no database state"):
When invoked from an INSERT, UPDATE, or DELETE statement, the function
cannot query or modify any database tables modified by that statement.
If a function either queries or modifies a table, and a DML statement
on that table invokes the function, then ORA-04091 (mutating-table
error) occurs.
PL/SQL Functions that SQL Statements Can Invoke
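Given that rule, one way to avoid the error in the demo above is to move the function call out of the DML statement, so the table is no longer mutating when the function queries it. A minimal sketch (this anonymous block is illustrative, not from the original post):
declare
  v_cnt PLS_INTEGER;
begin
  -- query the table before the insert starts, not during it
  v_cnt := mutate_count();
  if v_cnt = 4 then
    insert into mutate (x, y)
    select x + 2, y + 2
    from mutate;
  end if;
end;
/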


How to use a temp sequence within a Postgresql function

I have a few lines of SQL that take a set of IDs from the same GROUP_ID that are no longer contiguous (e.g. after some rows were deleted) and make them contiguous again. I wanted to turn this into a function for reusability. The lines work when executed individually, but when I try to create the function I get the error:
ERROR: relation "id_seq_temp" does not exist
LINE 10: UPDATE THINGS SET ID=nextval('id_se...
If I create a sequence outside of the function and use that sequence in the function instead then the function is created successfully (schema qualified or unqualified). However I felt like creating the temp sequence inside of the function rather than leaving it in the schema was a cleaner solution.
I have seen this question: Function shows error "relation my_table does not exist"
However, I'm using the public schema and schema qualifying the sequence with public. does not seem to help.
I've also seen this question: How to create a sql function using temp sequences and a SELECT on PostgreSQL8. I probably could use generate_series, but that adds complexity a sequence avoids, such as needing to know in advance how big a series to generate.
Here is my function, I anonymized some of the names - just in case there's a typo.
CREATE OR REPLACE FUNCTION reindex_ids(IN BIGINT) RETURNS VOID
LANGUAGE SQL
AS $$
  CREATE TEMPORARY SEQUENCE id_seq_temp
    MINVALUE 1
    START WITH 1
    INCREMENT BY 1;
  ALTER SEQUENCE id_seq_temp RESTART;
  UPDATE THINGS SET ID = ID + 2000 WHERE GROUP_ID = $1;
  UPDATE THINGS SET ID = nextval('id_seq_temp') WHERE GROUP_ID = $1;
$$;
Is it possible to use a sequence you create within a function later in the function?
Answer to question
The reason is that SQL functions (LANGUAGE sql) are parsed and planned as one. All objects used must exist before the function runs.
You can switch to PL/pgSQL (LANGUAGE plpgsql), which plans each statement on demand. There you can create objects and use them in the next command.
See:
Why can PL/pgSQL functions have side effect, while SQL functions can't?
Since you are not returning anything, consider a PROCEDURE. (FUNCTION works, too.)
CREATE OR REPLACE PROCEDURE reindex_ids(IN bigint)
  LANGUAGE plpgsql AS
$proc$
BEGIN
   IF EXISTS ( SELECT FROM pg_catalog.pg_class
               WHERE  relname = 'id_seq_temp'
               AND    relnamespace = pg_my_temp_schema()
               AND    relkind = 'S') THEN
      ALTER SEQUENCE id_seq_temp RESTART;
   ELSE
      CREATE TEMP SEQUENCE id_seq_temp;
   END IF;

   UPDATE things SET id = id + 2000 WHERE group_id = $1;
   UPDATE things SET id = nextval('id_seq_temp') WHERE group_id = $1;
END
$proc$;
Call:
CALL reindex_ids(123);
This creates your temp sequence if it does not exist already.
If the sequence exists, it is reset. (Remember that temporary objects live for the duration of a session.)
In the unlikely event that some other object occupies the name, an exception is raised.
Alternative solutions
Solution 1
This usually works:
UPDATE things t
SET    id = t1.new_id
FROM  (
   SELECT pk_id, row_number() OVER (ORDER BY id) AS new_id
   FROM   things
   WHERE  group_id = $1  -- your input here
   ) t1
WHERE  t.pk_id = t1.pk_id;
It also updates each row only once (the original updates every row twice), so roughly half the cost.
Replace pk_id with your PRIMARY KEY column, or any UNIQUE NOT NULL (combination of) column(s).
The trick is that the UPDATE typically processes rows according to the sort order of the subquery in the FROM clause. Updating in ascending order should never hit a duplicate key violation.
And the ORDER BY clause of the window function row_number() imposes that sort order on the resulting set. That's an undocumented implementation detail, so you might want to add an explicit ORDER BY to the subquery. But since the behavior of UPDATE is undocumented anyway, it still depends on an implementation detail.
You can wrap that into a plain SQL function.
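A minimal sketch of such a wrapper (hypothetical name; pk_id stands in for your primary key column, as above):
CREATE OR REPLACE FUNCTION reindex_ids_sql(IN bigint)
  RETURNS void
  LANGUAGE sql AS
$func$
UPDATE things t
SET    id = t1.new_id
FROM  (
   SELECT pk_id, row_number() OVER (ORDER BY id) AS new_id
   FROM   things
   WHERE  group_id = $1
   ) t1
WHERE  t.pk_id = t1.pk_id;
$func$;
This works as LANGUAGE sql because no object is created inside the function body.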
Solution 2
Consider not doing what you are doing at all. Gaps in sequential numbers are typically expected and not a problem. Just live with it. See:
Serial numbers per group of rows for compound key

Using table variables in Oracle Stored Procedure

I have lots of experience with T-SQL (MS SQL Server). There it is quite common to first select some set of records into a table variable or, say, a temp table t, and then work with this t throughout the whole SP body, using it just like a regular table (for JOINs, sub-queries, etc.).
Now I am trying the same thing in Oracle and it's a pain. I get errors all the way, and it keeps saying that it does not recognize my table (i.e. my table variable).
Error(28,7): PL/SQL: SQL Statement ignored
Error(30,28): PL/SQL: ORA-00942: table or view does not exist
I'm starting to wonder what is possible to do with this table variable at all, and what is not (in the SP body).
I have this declaration:
TYPE V_CAMPAIGN_TYPE IS TABLE OF V_CAMPAIGN%ROWTYPE;
tc V_CAMPAIGN_TYPE;
What on Earth can I do with this tc now in my SP?!
This is what I am trying to do in the body of the SP.
UPDATE ( SELECT t1.STATUS_ID, t2.CAMPAIGN_ID
FROM V_CAMPAIGN t1
INNER JOIN tc t2 ON t1.CAMPAIGN_ID = t2.CAMPAIGN_ID
) z
SET z.STATUS_ID = 4;
V_CAMPAIGN is a DB view, tc is my table variable
Presumably you are trying to update a subset of the V_CAMPAIGN records.
While in SQL Server it may be useful to define a 'temporary' table containing the subset and then operate on that, it isn't necessary in Oracle.
Simply update the table with the where clause you would have used to define the temp table.
E.g.
UPDATE v_campaign z
SET z.status_id = 4
WHERE z.column_name = 'a value'
AND z.status <> 4
I assume that the technique you are familiar with is to minimise the effect of read locks that are taken while selecting the data.
Oracle uses a different locking strategy so the technique is mostly unnecessary.
Echoing a comment above - tell us what you want to achieve in Oracle and you will get suggestions for the best way forward.
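For completeness: if you genuinely need to join a collection in SQL, the collection's type must be declared at schema level with CREATE TYPE; a TABLE OF ... %ROWTYPE declared inside PL/SQL is invisible to the SQL engine (hence the ORA-00942). A sketch with hypothetical names:
CREATE OR REPLACE TYPE campaign_id_tab AS TABLE OF NUMBER;
/

DECLARE
  v_ids campaign_id_tab;
BEGIN
  -- collect the subset of ids you care about
  SELECT campaign_id BULK COLLECT INTO v_ids
  FROM   v_campaign
  WHERE  status_id <> 4;  -- whatever defines your subset

  -- join the collection via the TABLE() operator
  UPDATE v_campaign z
  SET    z.status_id = 4
  WHERE  z.campaign_id IN (SELECT column_value FROM TABLE(v_ids));
END;
/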

How to pass a set of rows from one function into another?

Overview
I'm using PostgreSQL 9.1.14, and I'm trying to pass the results of a function into another function. The general idea (specifics, with a minimal example, follow) is that we can write:
select * from (select * from foo ...)
and we can abstract the sub-select away in a function and select from it:
create function foos()
returns setof foo
language sql as $$
select * from foo ...
$$;
select * from foos()
Is there some way to abstract one level farther, so as to be able to do something like this (I know functions cannot actually have arguments with setof types):
create function more_foos( some_foos setof foo )
language sql as $$
select * from some_foos ... -- or unnest(some_foos), or ???
$$;
select * from more_foos(foos())
Minimal Example and Attempted Workarounds
I'm using PostgreSQL 9.1.14. Here's a minimal example:
-- 1. create a table x with three rows
drop table if exists x cascade;
create table if not exists x (id int, name text);
insert into x values (1,'a'), (2,'b'), (3,'c');
-- 2. xs() is a function with type `setof x`
create or replace function xs()
returns setof x
language sql as $$
select * from x
$$;
-- 3. xxs() should return the contents of x, too
-- Ideally the argument would be a `setof x`,
-- but that's not allowed (see below).
create or replace function xxs(x[])
returns setof x
language sql as $$
select x.* from x
join unnest($1) y
on x.id = y.id
$$;
When I load up this code, I get the expected output for the table definitions, and I can call and select from xs() as I'd expect. But when I try to pass the result of xs() to xxs(), I get an error that "function xxs(x) does not exist":
db=> \i test.sql
DROP TABLE
CREATE TABLE
INSERT 0 3
CREATE FUNCTION
CREATE FUNCTION
db=> select * from xs();
1 | a
2 | b
3 | c
db=> select * from xxs(xs());
ERROR: function xxs(x) does not exist
LINE 1: select * from xxs(xs());
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
I'm a bit confused about "function xxs(x) does not exist"; since the return type of xs() is setof x, I'd have expected its result to be treated as setof x (or maybe x[]), not as a single x. Following the complaints about the type, I can get to either of the following definitions, but while with either one I can run select xxs(xs());, I can't run select * from xxs(xs());.
create or replace function xxs( x )
returns setof x
language sql as $$
select x.* from x
join unnest(array[$1]) y -- unnest(array[...]) seems pretty bad
on x.id = y.id
$$;
create or replace function xxs( x )
returns setof x
language sql as $$
select * from x
where x.id in ($1.id)
$$;
db=> select xxs(xs());
(1,a)
(2,b)
(3,c)
db=> select * from xxs(xs());
ERROR: set-valued function called in context that cannot accept a set
Summary
What's the right way to pass the results of a set-returning function into another function?
(I have noted that create function … xxs( setof x ) … results in the error: ERROR: functions cannot accept set arguments, so the answer won't literally be passing a set of rows from one function to another.)
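(For what it's worth, one workaround that respects that constraint, given the xxs(x[]) variant defined above, is to aggregate the set into an array at the call site and pass the array. A sketch:
-- ARRAY(SELECT xs()) collects the rows returned by xs() into an x[]
select * from xxs( array(select xs()) );
The function then unnests the array back into rows, as in the x[] definition.)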
Table functions
I perform very high speed, complex database migrations for a living, using SQL as both the client and server language (no other language is used), all running server side, where the code rarely surfaces from the database engine. Table functions play a HUGE role in my work. I don't use "cursors" since they are too slow to meet my performance requirements, and everything I do is result set oriented. Table functions have been an immense help to me in completely eliminating use of cursors, achieving very high speed, and have contributed dramatically towards reducing code volume and improving simplicity.
In short, you use a query that references two (or more) table functions to pass the data from one table function to the next. The select query result set that calls the table functions serves as the conduit to pass the data from one table function to the next. On the DB2 platform / version I work on, and it appears based on a quick look at the 9.1 Postgres manual that the same is true there, you can only pass a single row of column values as input to any of the table function calls, as you've discovered. However, because the table function call happens in the middle of a query's result set processing, you achieve the same effect of passing a whole result set to each table function call, albeit, in the database engine plumbing, the data is passed only one row at a time to each table function.
Table functions accept one row of input columns, and return a single result set back into the calling query (i.e. select) that called the function. The result set columns passed back from a table function become part of the calling query's result set, and are therefore available as input to the next table function, referenced later in the same query, typically as a subsequent join. The first table function's result columns are fed as input (one row at a time) to the second table function, which returns its result set columns into the calling query's result set. Both the first and second table function result set columns are now part of the calling query's result set, and are now available as input (one row at a time) to a third table function. Each table function call widens the calling query's result set via the columns it returns. This can go on an on until you start hitting limits on the width of a result set, which likely varies from one database engine to the next.
Consider this example (which may not match Postgres' syntax requirements or capabilities, as I work on DB2). It is one of many design patterns in which I use table functions, one of the simpler ones, and I think it is very illustrative. I anticipate it would have broad appeal if table functions were in heavy mainstream use (to my knowledge they are not, but I think they deserve more attention than they are getting).
In this example, the table functions in use are: VALIDATE_TODAYS_ORDER_BATCH, POST_TODAYS_ORDER_BATCH, and DATA_WAREHOUSE_TODAYS_ORDER_BATCH. On the DB2 version I work on, you wrap the table function inside "TABLE( place table function call and parameters here )", but based on quick look at a Postgres manual it appears you omit the "TABLE( )" wrapper.
create table TODAYS_ORDER_PROCESSING_EXCEPTIONS as (
  select TODAYS_ORDER_BATCH.*
        ,VALIDATION_RESULT.ROW_VALID
        ,POST_RESULT.ROW_POSTED
        ,WAREHOUSE_RESULT.ROW_WAREHOUSED
  from TODAYS_ORDER_BATCH
  cross join VALIDATE_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
             as VALIDATION_RESULT ( ROW_VALID )      --example: 1/0 true/false Boolean returned
  left join POST_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
            as POST_RESULT ( ROW_POSTED )            --example: 1/0 true/false Boolean returned
    on ROW_VALID = '1'
  left join DATA_WAREHOUSE_TODAYS_ORDER_BATCH ( ORDER_NUMBER, [either pass the remainder of the order columns or fetch them in the function] )
            as WAREHOUSE_RESULT ( ROW_WAREHOUSED )   --example: 1/0 true/false Boolean returned
    on ROW_POSTED = '1'
  where coalesce( ROW_VALID, '0' ) = '0'        --Capture only exceptions and unprocessed work.
     or coalesce( ROW_POSTED, '0' ) = '0'       --Or, you can flip the logic to capture only successful rows.
     or coalesce( ROW_WAREHOUSED, '0' ) = '0'
) with data
If table TODAYS_ORDER_BATCH contains 1,000,000 rows, then VALIDATE_TODAYS_ORDER_BATCH will be called 1,000,000 times, once for each row.
If 900,000 rows pass validation inside VALIDATE_TODAYS_ORDER_BATCH, then POST_TODAYS_ORDER_BATCH will be called 900,000 times.
If only 850,000 rows successfully post, then VALIDATE_TODAYS_ORDER_BATCH needs some loopholes closed LOL, and DATA_WAREHOUSE_TODAYS_ORDER_BATCH will be called 850,000 times.
If 850,000 rows successfully made it into the Data Warehouse (i.e. no additional exceptions were generated), then table TODAYS_ORDER_PROCESSING_EXCEPTIONS will be populated with 1,000,000 - 850,000 = 150,000 exception rows.
The table function calls in this example are only returning a single column, but they could be returning many columns. For example, the table function validating an order row could return the reason why an order failed validation.
In this design, virtually all the chatter between a HLL and the database is eliminated, since the HLL requestor is asking the database to process the whole batch in ONE request. This results in a reduction of millions of SQL requests to the database, in a HUGE removal of millions of HLL procedure or method calls, and as a result provides a HUGE runtime improvement. In contrast, legacy code which often processes a single row at a time, would typically send 1,000,000 fetch SQL requests, 1 for each row in TODAYS_ORDER_BATCH, plus at least 1,000,000 HLL and/or SQL requests for validation purposes, plus at least 1,000,000 HLL and/or SQL requests for posting purposes, plus 1,000,000 HLL and/or SQL requests for sending the order to the data warehouse. Granted, using this table function design, inside the table functions SQL requests are being sent to the database, but when the database makes requests to itself (i.e from inside a table function), the SQL requests are serviced much faster (especially in comparison to a legacy scenario where the HLL requestor is doing single row processing from a remote system, with the worst case over a WAN - OMG please don't do that).
You can easily run into performance problems if you use a table function to "fetch a result set" and then join that result set to other tables. In that case, the SQL optimizer can't predict what set of rows will be returned from the table function, and therefore it can't optimize the join to subsequent tables. For that reason, I rarely use them for fetching a result set, unless I know that result set will be a very small number of rows, hence not causing a performance problem, or I don't need to join to subsequent tables.
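In PostgreSQL specifically, one partial mitigation for that planning problem is the ROWS clause of CREATE FUNCTION, which tells the planner how many rows to expect from a set-returning function (the default estimate is 1000). A sketch against the xs() example from the question:
create or replace function xs()
returns setof x
language sql
rows 3  -- planner estimate of the number of rows returned
as $$
  select * from x
$$;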
In my opinion, one reason why table functions are underutilized is that they are often perceived as only a tool to fetch a result set, which often performs poorly, so they get written off as a "poor" tool to use.
Table functions are immensely useful for pushing more functionality over to the server, for eliminating most of the chatter between the database server and programs on remote systems, and even for eliminating chatter between the database server and external programs on the same server. Even chatter between programs on the same server carries more overhead than many people realize, and much of it is unnecessary. The heart of the power of table functions lies in using them to perform actions inside result set processing.
There are more advanced design patterns for using table functions that build on the above pattern, where you can maximize result set processing even further, but this post is a lot for most to absorb already.

SQL on-demand cache table (possibly using SQL MERGE)

I am working on implementing an on-demand SQL cache table for an application, so I have:
CacheTable with columns Type, Number, Value
Then I have a function called GetValue( Type, Number )
So I want to have a function that does the following
If (CacheTable contains Type, Number) then return value
Else call GetValue( Type, Number) and put that value into CacheTable and return the Value
Does anyone know the most elegant way to do this?
I was thinking of using a SQL merge.
Not sure how elegant one can get, but we might do it just the way you describe. Query the database:
select Value from Tab1 where Type=@type and Number=@num
and if no rows are returned, compute the value, then store it in the database for next time.
However, if the "compute the value" requires the database itself, and we can compute it in the database, then we can do the whole cycle with one database round trip -- more 'elegant' perhaps but faster at least than 3 round trips (lookup, compute, store).
declare @val int
select @val = Value from Tab1 where Type=@type and Number=@num
if @@ROWCOUNT = 0 BEGIN
    exec compute_val @type, @num, @val OUTPUT
    insert into Tab1 values (@type, @num, @val)
END
SELECT @val [Value]  --return
The only use for SQL MERGE is if you think there may be concurrent users, and the same (Type, Number) could be inserted between the select and the insert above, giving an error on the insert. I'd just catch the error and skip the insert (since by definition we can assume the value won't be different).
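A sketch of that catch-and-skip variant (assuming a unique constraint on (Type, Number); error numbers 2627/2601 are SQL Server's duplicate-key violations):
declare @val int
select @val = Value from Tab1 where Type=@type and Number=@num
if @@ROWCOUNT = 0 BEGIN
    exec compute_val @type, @num, @val OUTPUT
    BEGIN TRY
        insert into Tab1 values (@type, @num, @val)
    END TRY
    BEGIN CATCH
        -- a concurrent session inserted the same (Type, Number) first;
        -- the value is identical by definition, so swallow only duplicate-key errors
        IF ERROR_NUMBER() NOT IN (2627, 2601)
            RAISERROR('unexpected error caching value', 16, 1);
    END CATCH
END
SELECT @val [Value]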

postgresql -- currval not working, using PHP PDO

So I'm trying to run some SQL through PHP's PDO (which I don't believe should be the problem), like so:
INSERT INTO example (
d_id,
s_id
)
VALUES (
currval('d_id_seq'),
currval('s_id_seq')
);
I have two sequences called d_id_seq and s_id_seq (let's pretend I have a table named d and a table named s, each with an id column of serial type).
Now, obviously I'm doing this wrong, as I get an error about the sequence not being used in this session:
Object not in prerequisite state: 7 ERROR: currval of sequence "d_id_seq" is not yet defined in this session
So, how should I write this?
The problem can be solved via the following command:
SELECT last_value FROM d_id_seq;
Note that I'm using PostgreSQL 9.1.9; I do not know about other or older versions.
The error means you did not "use" the sequence in this session (postgres connection). For instance you did not do any INSERTs on the table d.
Perhaps you have a bug in your code and reconnect to postgres after each query?
A more convenient way to do it is to use INSERT RETURNING on your INSERTs. Then you get the ids.
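For example (a sketch; the column name is hypothetical):
-- RETURNING hands the generated id straight back to the client
INSERT INTO d (name) VALUES ('example') RETURNING id;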
From the fine manual:
currval
Return the value most recently obtained by nextval for this sequence in the current session. (An error is reported if nextval has never been called for this sequence in this session.) Because this is returning a session-local value, it gives a predictable answer whether or not other sessions have executed nextval since the current session did.
You use currval to get the last value that was pulled out of the sequence in the current session. The usual pattern is to do an INSERT that uses a sequence and then you call currval to figure out what value the INSERT used. If you haven't called nextval with the sequence in question in the current session then there is nothing for currval to return.
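A sketch of that usual pattern, all in one session (column names are hypothetical; the serial defaults call nextval for you):
INSERT INTO d (name) VALUES ('example d');  -- fires nextval('d_id_seq')
INSERT INTO s (name) VALUES ('example s');  -- fires nextval('s_id_seq')

-- currval is now defined for both sequences in this session
INSERT INTO example (d_id, s_id)
VALUES (currval('d_id_seq'), currval('s_id_seq'));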
Maybe you're actually looking for select max(id) from d and select max(id) from s:
INSERT INTO example (d_id, s_id)
SELECT MAX(d.id), MAX(s.id)
FROM d, s;
Or maybe you need to wrap your d and s inserts in a stored procedure that takes care of inserting in all three tables at once.