Delete on Oracle DB when joining with a pipelined table function - sql

I have a table called CUSTOMERS from which I want to delete all entries that are not present in VALUE_CUSTOMERS. VALUE_CUSTOMERS is a pipelined table function that is built on CUSTOMERS.
DELETE
(
  SELECT *
  FROM CUSTOMERS
  LEFT JOIN
  (
    SELECT
      1 AS DELETABLE,
      VC.*
    FROM
      (CUSTOMER_PACKAGE.VALUE_CUSTOMERS(TRUNC(SYSDATE) - 30)) VC
  )
  USING (FIRST_NAME, LAST_NAME, DATE_OF_BIRTH)
  WHERE DELETABLE IS NULL
)
;
When I try to execute the statement, I get the error:
ORA-01752: cannot delete from view without exactly one key-preserved table

It looks like your example has invalid syntax (the TABLE keyword is missing). I've tested it on Oracle 12c, so maybe it works on newer versions. Below are some ideas, based on Oracle 12c.
You've got multiple options here:
Save the result of CUSTOMER_PACKAGE.VALUE_CUSTOMERS to a temporary table and use it in your query
CREATE GLOBAL TEMPORARY TABLE TMP_CUSTOMERS (
FIRST_NAME <type based on CUSTOMERS.FIRST_NAME>
, LAST_NAME <type based on CUSTOMERS.LAST_NAME>
, DATE_OF_BIRTH <type based on CUSTOMERS.DATE_OF_BIRTH>
)
ON COMMIT DELETE ROWS;
Then in code:
INSERT INTO TMP_CUSTOMERS(FIRST_NAME, LAST_NAME, DATE_OF_BIRTH)
SELECT VC.FIRST_NAME, VC.LAST_NAME, VC.DATE_OF_BIRTH
FROM TABLE(CUSTOMER_PACKAGE.VALUE_CUSTOMERS(TRUNC(SYSDATE) - 30)) VC
;
-- and then:
DELETE FROM CUSTOMERS C
WHERE NOT EXISTS(
SELECT 1
FROM TMP_CUSTOMERS TMP_C
-- Be AWARE that NULLs are not handled here
-- so it's correct only if FIRST_NAME, LAST_NAME, DATE_OF_BIRTH are not nullable
WHERE C.FIRST_NAME = TMP_C.FIRST_NAME
AND C.LAST_NAME = TMP_C.LAST_NAME
AND C.DATE_OF_BIRTH = TMP_C.DATE_OF_BIRTH
)
;
-- If `CUSTOMER_PACKAGE.VALUE_CUSTOMERS` can return a lot of rows,
-- then you should create indexes on FIRST_NAME, LAST_NAME, DATE_OF_BIRTH
-- or maybe even 1 multi-column index on all of above columns.
Also, consider rewriting your query: insert just a customer ID into TMP_CUSTOMERS and then run the delete based on those IDs (a sketch follows below).
The main risk here is that the data could change between these two operations, so you should take that into account.
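A minimal sketch of that ID-based variant, assuming CUSTOMERS has a numeric primary key column ID and a staging table TMP_CUSTOMER_IDS(ID) (both names are hypothetical; the NULL caveat from above applies to the join columns here too):
INSERT INTO TMP_CUSTOMER_IDS (ID)
SELECT C.ID
FROM CUSTOMERS C
LEFT JOIN TABLE(CUSTOMER_PACKAGE.VALUE_CUSTOMERS(TRUNC(SYSDATE) - 30)) VC
       ON C.FIRST_NAME    = VC.FIRST_NAME
      AND C.LAST_NAME     = VC.LAST_NAME
      AND C.DATE_OF_BIRTH = VC.DATE_OF_BIRTH
WHERE VC.FIRST_NAME IS NULL -- no match: not a value customer
;
DELETE FROM CUSTOMERS C
WHERE EXISTS (SELECT 1 FROM TMP_CUSTOMER_IDS T WHERE T.ID = C.ID);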
You can save the result in a collection variable and then do a bulk delete with a FORALL loop.
If the number of rows to delete could be large, you should extend this example with the LIMIT clause (see the sketch after the block below). Even with LIMIT, you could still run into problems, like running out of UNDO space, so this solution is only good for small amounts of data. The risk here is the same as in the example above.
DECLARE
  type t_tab is table of number index by pls_integer;
  v_tab t_tab;
BEGIN
  SELECT CUSTOMERS.ID -- I hope you have some kind of primary key there...
  BULK COLLECT INTO v_tab
  FROM CUSTOMERS
  LEFT JOIN
  (
    SELECT
      1 AS DELETABLE,
      VC.*
    FROM TABLE(CUSTOMER_PACKAGE.VALUE_CUSTOMERS(TRUNC(SYSDATE) - 30)) VC
  )
  USING (FIRST_NAME, LAST_NAME, DATE_OF_BIRTH)
  WHERE DELETABLE IS NULL;

  FORALL idx IN 1..v_tab.COUNT()
    DELETE FROM CUSTOMERS
    WHERE CUSTOMERS.ID = v_tab(idx);
END;
/
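A hedged sketch of that LIMIT variant, using an explicit cursor (the batch size of 1000 is arbitrary):
DECLARE
  CURSOR c_deletable IS
    SELECT CUSTOMERS.ID
    FROM CUSTOMERS
    LEFT JOIN
    (
      SELECT 1 AS DELETABLE, VC.*
      FROM TABLE(CUSTOMER_PACKAGE.VALUE_CUSTOMERS(TRUNC(SYSDATE) - 30)) VC
    )
    USING (FIRST_NAME, LAST_NAME, DATE_OF_BIRTH)
    WHERE DELETABLE IS NULL;
  type t_tab is table of number index by pls_integer;
  v_tab t_tab;
BEGIN
  OPEN c_deletable;
  LOOP
    FETCH c_deletable BULK COLLECT INTO v_tab LIMIT 1000;
    EXIT WHEN v_tab.COUNT = 0;
    FORALL idx IN 1..v_tab.COUNT
      DELETE FROM CUSTOMERS
      WHERE CUSTOMERS.ID = v_tab(idx);
  END LOOP;
  CLOSE c_deletable;
END;
/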
Do it completely differently - that's preferable.
For example: move the logic from CUSTOMER_PACKAGE.VALUE_CUSTOMERS into a view and base the DELETE statement on that view. Remember to change CUSTOMER_PACKAGE.VALUE_CUSTOMERS to use that new view as well (DRY principle).
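A hedged sketch of that idea; the view body is purely hypothetical, since the real selection logic lives inside VALUE_CUSTOMERS, and LAST_ORDER_DATE is an invented column standing in for it:
CREATE OR REPLACE VIEW VALUE_CUSTOMERS_V AS
SELECT C.FIRST_NAME, C.LAST_NAME, C.DATE_OF_BIRTH
FROM CUSTOMERS C
WHERE C.LAST_ORDER_DATE >= TRUNC(SYSDATE) - 30; -- hypothetical selection logic

-- deleting from the base table directly avoids ORA-01752 entirely
DELETE FROM CUSTOMERS C
WHERE NOT EXISTS (
  SELECT 1
  FROM VALUE_CUSTOMERS_V V
  WHERE V.FIRST_NAME    = C.FIRST_NAME
    AND V.LAST_NAME     = C.LAST_NAME
    AND V.DATE_OF_BIRTH = C.DATE_OF_BIRTH
);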

Related

Postgres - How to find ids that are not used in multiple tables (inactive ids) - badly written query

I have a table towns which is the main table. This table contains very many rows and became so 'dirty' (someone inserted 5 million rows) that I would like to get rid of unused towns.
There are 3 referencing tables that use town_id as a reference to towns.
I know there are many towns that are not used in these tables; only if a town_id is not found in any of them do I consider it inactive, and then I would like to remove that town (because it's not used).
As you can see, towns is used in these 2 different tables:
employees
offices
and for the table vendors there is a vendor_id column in towns, since one vendor can have multiple towns.
So if vendor_id in towns is null and the town_id is not found in either of these 2 tables, it is safe to remove it :)
I created a query which might work, but it takes far too much time to execute; it looks something like this:
select count(*)
from towns
where vendor_id is null
and id not in (select town_id from banks)
and id not in (select town_id from employees)
So basically I said: if vendor_id is null, it means this town is definitely not related to vendors, and if at the same time the town is not in banks or employees, then it is safe to remove it. But the query took too long and never executed successfully, since towns has 5 million rows, which is the reason it is so dirty.
In fact I'm not able to execute the given query at all, since the server terminated abnormally.
Here is the full error message:
ERROR: server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
Any kind of help would be awesome
Thanks!
You can join the tables using LEFT JOIN and identify, in the WHERE clause, the town_id values for which there is no row in tables banks and employees:
WITH list AS
( SELECT t.town_id
FROM tbl.towns AS t
LEFT JOIN tbl.banks AS b ON b.town_id = t.town_id
LEFT JOIN tbl.employees AS e ON e.town_id = t.town_id
WHERE t.vendor_id IS NULL
AND b.town_id IS NULL
AND e.town_id IS NULL
LIMIT 1000
)
DELETE FROM tbl.towns AS t
USING list AS l
WHERE t.town_id = l.town_id ;
Before launching the DELETE, you can check the indexes on your tables.
Adding an index as follows can be useful:
CREATE INDEX town_id_nulls ON towns (town_id NULLS FIRST) ;
Last but not least, you can add a LIMIT clause in the CTE to limit the number of rows you delete per DELETE execution and avoid the unexpected termination. As a consequence, you will have to relaunch the DELETE several times until there are no more rows to delete (a sketch of such a loop follows below).
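A hedged sketch of automating those repeated runs in a single DO block (COMMIT inside DO requires PostgreSQL 11+; schema qualifiers omitted, names as in the answer above):
DO $$
DECLARE
  n bigint;
BEGIN
  LOOP
    WITH list AS
    ( SELECT t.town_id
      FROM towns AS t
      LEFT JOIN banks AS b ON b.town_id = t.town_id
      LEFT JOIN employees AS e ON e.town_id = t.town_id
      WHERE t.vendor_id IS NULL
        AND b.town_id IS NULL
        AND e.town_id IS NULL
      LIMIT 1000
    )
    DELETE FROM towns AS t
    USING list AS l
    WHERE t.town_id = l.town_id;

    GET DIAGNOSTICS n = ROW_COUNT; -- rows removed by this batch
    EXIT WHEN n = 0;
    COMMIT; -- keep each batch in its own transaction
  END LOOP;
END $$;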
You can try a JOIN on the big tables; it should be faster than the two IN clauses.
You could also try UNION ALL and live with the duplicates, as it is faster than UNION.
Finally, you can use a combined index on id and vendor_id to speed up the query (a sketch of the index follows after the example).
CREATE TABLE towns (id int, vendor_id int);
CREATE TABLE banks (town_id int);
CREATE TABLE employees (town_id int);

select count(*)
from towns t1
left join (select town_id from banks UNION select town_id from employees) t2
       on t1.id = t2.town_id
where t1.vendor_id is null
  and t2.town_id is null;

count
-----
0
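A sketch of that combined index (the index name is hypothetical):
CREATE INDEX idx_towns_vendor_id ON towns (vendor_id, id);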
The trick is to first make a list of all the town_ids you want to keep and then start removing those that are not in it.
By looking in 2 tables you're making life harder for the server, so let's just create 1 single list first.
-- build empty temp-table
CREATE TEMPORARY TABLE TEMP_must_keep
AS
SELECT town_id
FROM tbl.towns
WHERE 1 = 2;
-- get id's from first table
INSERT INTO TEMP_must_keep (town_id)
SELECT DISTINCT town_id
FROM tbl.banks;
-- add index to speed up the EXCEPT below
CREATE UNIQUE INDEX idx_uq_must_keep_town_id ON TEMP_must_keep (town_id);
-- add new ones from second table
INSERT INTO TEMP_must_keep (town_id)
SELECT town_id
FROM tbl.employees
EXCEPT -- auto-distincts
SELECT town_id
FROM TEMP_must_keep;
-- rebuild index simply to ensure little fragmentation
REINDEX TABLE TEMP_must_keep;
-- optional, but might help: create a temporary index on the towns table to speed up the delete
CREATE INDEX idx_towns_town_id_where_vendor_null ON tbl.towns (town_id) WHERE vendor_id IS NULL;
-- Now do actual delete
-- You can do a `SELECT COUNT(*)` rather than a `DELETE` first if you feel like it, both will probably take some time depending on your hardware.
DELETE
FROM tbl.towns as del
WHERE vendor_id is null
AND NOT EXISTS ( SELECT *
FROM TEMP_must_keep mk
WHERE mk.town_id = del.town_id);
-- cleanup
DROP INDEX tbl.idx_towns_town_id_where_vendor_null;
DROP TABLE TEMP_must_keep;
The idx_towns_town_id_where_vendor_null index is optional and I'm not sure if it will actually lower the total time, but IMHO it will help with the DELETE operation, if only because the index should give the query optimizer a better view of what volumes to expect.

Is it possible that a nested loop joins different data to the same id in different loops

We have an interesting phenomenon with a SQL query on an Oracle database that we could not reproduce. The example has been simplified; we believe it is not oversimplified, but it possibly is.
Main question: Given a nested loop where the inner (non-driving) table has an analytic function whose result is ambiguous (multiple rows could be the first row of the order by), is it feasible that said analytic function returns different results for different outer loops?
Secondary question: If yes, how can we reproduce this behaviour?
If no, do you have any other ideas why this query would produce multiple rows for the same company?
Not the question: Should the assumption about what is wrong be correct, correcting the SQL would be easy: just make the order by in the analytic function unambiguous, e.g. by adding the id column as a second criterion (a sketch follows below).
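For reference, a minimal sketch of the deterministic variant of the window function used further down:
SELECT company_id, street,
       row_number() over (partition by company_id order by prio asc, id asc) as row_num
FROM rau_address;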
Problem:
Company has a n:m relation to owner and a 1:n relation to address.
The SQL joins all the tables, reading only a single address per company by making use of the analytic function row_number(); it groups by company AND address and accumulates the owner names.
We use the query for multiple purposes; the other purposes involve reading the "best" address, the problematic one does not. We got multiple error reports with results like this:
Company A has owners N1, N2, N3.
The result was:

Company | Owner list
--------+-----------
A       | N1
A       | N2, N3
All cases that were reported involve companies with multiple "best" addresses, hence the theory that somehow the subquery that should deliver a single address is broken. But we could not reproduce the result.
Full Details:
(for smaller numbers, listagg() is the original function used, but it fails for bigger numbers; count(*) should be a suitable replacement)
--cleanup
DROP TABLE rau_companyowner;
DROP TABLE rau_owner;
DROP TABLE rau_address;
DROP TABLE rau_company;
--create structure
CREATE TABLE rau_company (
id NUMBER CONSTRAINT pk_rau_company PRIMARY KEY USING INDEX (CREATE UNIQUE INDEX idx_rau_company_p ON rau_company(id))
);
CREATE TABLE rau_owner (
id NUMBER CONSTRAINT pk_rau_owner PRIMARY KEY USING INDEX (CREATE UNIQUE INDEX idx_rau_owner_p ON rau_owner(id)),
name varchar2(1000)
);
CREATE TABLE rau_companyowner (
company_id NUMBER,
owner_id NUMBER,
CONSTRAINT pk_rau_companyowner PRIMARY KEY (company_id, owner_id) USING INDEX (CREATE UNIQUE INDEX idx_rau_companyowner_p ON rau_companyowner(company_id, owner_id)),
CONSTRAINT fk_companyowner_company FOREIGN KEY (company_id) REFERENCES rau_company(id),
CONSTRAINT fk_companyowner_owner FOREIGN KEY (owner_id) REFERENCES rau_owner(id)
);
CREATE TABLE rau_address (
id NUMBER CONSTRAINT pk_rau_address PRIMARY KEY USING INDEX (CREATE UNIQUE INDEX idx_rau_address_p ON rau_address(id)),
company_id NUMBER,
prio NUMBER NOT NULL,
street varchar2(1000),
CONSTRAINT fk_address_company FOREIGN KEY (company_id) REFERENCES rau_company(id)
);
--create testdata
DECLARE
TYPE t_address IS TABLE OF rau_address%rowtype INDEX BY pls_integer;
address t_address;
TYPE t_owner IS TABLE OF rau_owner%rowtype INDEX BY pls_integer;
owner t_owner;
TYPE t_companyowner IS TABLE OF rau_companyowner%rowtype INDEX BY pls_integer;
companyowner t_companyowner;
ii pls_integer;
company_id pls_integer := 1;
test_count PLS_INTEGER := 10000;
--test_count PLS_INTEGER := 50;
BEGIN
--rau_company
INSERT INTO rau_company VALUES (company_id);
--rau_owner,rau_companyowner
FOR ii IN 1 .. test_count
LOOP
owner(ii).id:=ii;
owner(ii).name:='N'||to_char(ii);
companyowner(ii).company_id:=company_id;
companyowner(ii).owner_id:=ii;
END LOOP;
forall ii IN owner.FIRST .. owner.LAST
INSERT INTO rau_owner VALUES (owner(ii).id, owner(ii).name);
forall ii IN companyowner.FIRST .. companyowner.LAST
INSERT INTO rau_companyowner VALUES (companyowner(ii).company_id, companyowner(ii).owner_id);
--rau_address
FOR ii IN 1 .. test_count
LOOP
address(ii).id:=ii;
address(ii).company_id:=company_id;
address(ii).prio:=1;
address(ii).street:='S'||to_char(ii);
END LOOP;
forall ii IN address.FIRST .. address.LAST
INSERT INTO rau_address VALUES (address(ii).id, address(ii).company_id, address(ii).prio, address(ii).street);
COMMIT;
END;
/
-- check testdata
SELECT 'rau_company' tab, COUNT(*) count FROM rau_company
UNION all
SELECT 'rau_owner', COUNT(*) FROM rau_owner
UNION all
SELECT 'rau_companyowner', COUNT(*) FROM rau_companyowner
UNION all
SELECT 'rau_address', COUNT(*) FROM rau_address;
-- the sql: NL with address as inner loop enforced
-- 'order BY prio' is ambiguous because all addresses have the same prio
-- => the single row in ad could be any row
SELECT /*+ leading(hh hhoo oo ad) use_hash(hhoo oo) USE_NL(hh ad) */
hh.id company,
ad.street,
-- LISTAGG(oo.name || ', ') within group (order by oo.name) owner_list,
count(oo.id) owner_count
FROM rau_company hh
LEFT JOIN rau_companyowner hhoo ON hh.id = hhoo.company_id
LEFT JOIN rau_owner oo ON hhoo.owner_id = oo.id
LEFT JOIN (
SELECT *
FROM (
SELECT company_id, street,
row_number() over ( partition by company_id order BY prio asc ) as row_num
FROM rau_address
)
WHERE row_num = 1
) ad ON hh.id = ad.company_id
GROUP BY hh.id,
ad.street;
Chris Saxon was so nice to answer my question: https://asktom.oracle.com/pls/apex/f?p=100:11:::::P11_QUESTION_ID:9546263400346154452
In short: As long as the order by is ambiguous (non-deterministic), there will always be a chance for different results even within the same sql.
To reproduce, add this to my test data:
ALTER TABLE rau_address PARALLEL 8;
and try the select at the bottom, it should deliver multiple rows.

Oracle Data Migration [data modification] - Data Tuning

I'm facing a data migration; my goal is to update 2.5M rows in less than 8 hours, because the customer has a limited window of time in which the service can be deactivated. Moreover, the table can't be locked during this execution because it is used by other procedures; I can only lock the records. The execution will be done through a batch process.
Probably in this case migration isn't the correct word; it would be better to say "altering data"...
System: Oracle 11g
Table Informations
Table name: Tab1
Tot rows: 520,000,000
AVG row len: 57
DESC Tab1;
Name Null? Type
---------------- -------- -----------
t_id NOT NULL NUMBER
t_fk1_id NUMBER
t_fk2_id NUMBER
t_start_date NOT NULL DATE
t_end_date DATE
t_del_flag NOT NULL NUMBER(1)
t_flag1 NOT NULL NUMBER(1)
f_falg2 NOT NULL NUMBER(1)
t_creation_date DATE
t_creation_user NUMBER(10)
t_last_update DATE
t_user_update NUMBER(10)
t_flag3 NUMBER(1)
Indexes are:
T_ID_PK [t_id] UNIQUE
T_IN_1 [t_fk2_id,t_fk1_id,t_start_date,t_del_flag] NONUNIQUE
T_IN_2 [t_last_update,t_fk2_id] NONUNIQUE
T_IN_3 [t_fk2_id,t_fk1_id] NONUNIQUE
I've thought of some possible solutions, and most of them I've already tested:
Insert + delete: select the existing data, insert the new records with the needed modification and delete the old ones [this turned out to be the slowest method, ~21h]
Merge: use the MERGE command to update the existing data [this turned out to be the fastest method, ~16h]
Update: update the existing data [~18h]
With the above solutions I've faced some issues: for example, if executed with the /*+ PARALLEL(x) */ option the table was locked, and the /*+ RESULT_CACHE */ hint seemed not to affect the selection time at all.
My last idea is to partition the table by a new column and use that to avoid table locking, then proceed with solution 1.
Here is the query used for the Merge option (for the other two it is more or less the same):
DECLARE
v_recordset NUMBER;
v_row_count NUMBER;
v_start_subset NUMBER;
v_tot_loops NUMBER;
BEGIN
--set the values manually for example purposes; I've used the same values
v_recordset := 10000;
v_tot_loops := 10000;
BEGIN
SELECT NVL(MIN(MOD(m_id,v_recordset)), 99999)
INTO v_start_subset
FROM MIGRATION_TABLE
WHERE m_status = 0; -- 0=not migrated , 1=migrated
END;
FOR v_n_subset IN v_start_subset..v_tot_loops
LOOP
BEGIN
MERGE INTO Tab1 T1
USING (
SELECT m.m_new_id, c2.c_id, t.t_id
FROM MIGRATION_TABLE m
JOIN Tab1 t ON t.t_fk_id = m.m_old_id
JOIN ChildTable c ON c.c_id = t.t_fk2_id
JOIN ChildTable c2 ON c.c_name = c2.c_name --c_name is a UNIQUE index of ChildTable
WHERE MOD(m.m_id,v_recordset) = v_n_subset
AND c.c_fk_id = old_product_id --value obtained from another subsystem
AND c2.c_fk_id = new_product_id --value obtained from another subsystem
AND t.t_del_flag = 0 --not deleted items
) T2
ON (T1.t_id = T2.t_id)
WHEN MATCHED THEN
UPDATE SET T1.t_fk_id = T2.m_new_id, T1.t_fk2_id = T2.c_id, T1.t_last_update = trunc(sysdate)
;
--Update the record as migrated and proceed
COMMIT;
EXCEPTION WHEN OTHERS THEN
ROLLBACK;
END;
END LOOP;
END;
In the above script I removed the parallel and cache options, but I had already tested with both and did not obtain any better result.
Anyone, please! Could you help me with this? In more than one week I haven't been able to reach the desired timing. Any ideas?
MIGRATION_TABLE
CREATE TABLE MIGRATION_TABLE(
  m_customer_from VARCHAR2(5 BYTE),
  m_customer_to VARCHAR2(5 BYTE),
  m_old_id NUMBER(10,0) NOT NULL,
  m_new_id NUMBER(10,0) NOT NULL,
  m_status VARCHAR2(100 BYTE),
  CONSTRAINT M_MIG_PK_1 PRIMARY KEY (m_old_id) ENABLE
);
CREATE UNIQUE INDEX M_MIG_PK_1 ON MIGRATION_TABLE (m_old_id ASC);
ChildTable
CREATE TABLE ChildTable(
  c_id NUMBER(10, 0) NOT NULL,
  c_fk_id NUMBER(10, 0),
  c_name VARCHAR2(100 BYTE),
  c_date DATE,
  c_note VARCHAR2(100 BYTE),
  CONSTRAINT C_CT_PK_1 PRIMARY KEY (c_id) ENABLE
);
CREATE UNIQUE INDEX C_CT_PK_1 ON ChildTable (c_id ASC);
CREATE UNIQUE INDEX C_CT_PK_2 ON ChildTable (c_name ASC, c_fk_id ASC);
Wow, 520 million rows! However, updating 2.5 million of them is only 0.5%, so that should be doable. Not knowing your data, my first assumption is that the self join of Tab1 x Tab1 inside the MERGE takes up most of the time. Possibly also the many joins to the migration and child tables. And the indexes T_IN_1, 2, and 3 need maintenance, too.
As you say the rows to be updated are fixed, I'd try to prepare the heavy work. This doesn't lock the table and doesn't count towards the downtime:
CREATE TABLE migration_temp NOLOGGING AS
SELECT t.t_id,
       m.m_new_id AS new_fk1_id,
       c2.c_id    AS new_fk2_id
  FROM MIGRATION_TABLE m
  JOIN Tab1 t        ON t.t_fk1_id = m.m_old_id
  JOIN ChildTable c1 ON c1.c_id = t.t_fk2_id
  JOIN ChildTable c2 ON c1.c_name = c2.c_name
 WHERE t.t_del_flag = 0;
I omitted the bit with the old/new product_ids because I didn't fully understand how it should work, but that is hopefully not a problem.
Method 1 would be a join via primary keys:
ALTER TABLE migration_temp ADD CONSTRAINT pk_migration_temp PRIMARY KEY(t_id);
EXEC DBMS_STATS.GATHER_TABLE_STATS(null,'migration_temp');
MERGE INTO Tab1 t USING migration_temp m ON (t.t_id = m.t_id)
WHEN MATCHED THEN UPDATE SET
  t.t_fk1_id = m.new_fk1_id,
  t.t_fk2_id = m.new_fk2_id,
  t.t_last_update = trunc(sysdate);
I'm not a fan of batched updates. As you have time estimates, it looks like you have a test system. I'd suggest giving it a go and trying it in one batch.
Method 2 is similar to Method 1, but it uses ROWIDs instead of a primary key. In theory, it should therefore be a bit faster.
CREATE TABLE migration_temp NOLOGGING AS
SELECT t.t_id,
       t.rowid    AS rid,
       m.m_new_id AS new_fk1_id,
       c2.c_id    AS new_fk2_id
  FROM MIGRATION_TABLE m
  JOIN Tab1 t        ON t.t_fk1_id = m.m_old_id
  JOIN ChildTable c1 ON c1.c_id = t.t_fk2_id
  JOIN ChildTable c2 ON c1.c_name = c2.c_name
 WHERE t.t_del_flag = 0
 ORDER BY t.rowid;
EXEC DBMS_STATS.GATHER_TABLE_STATS(null,'migration_temp');
MERGE INTO Tab1 t USING migration_temp m ON (t.rowid = m.rid)
WHEN MATCHED THEN UPDATE SET
  t.t_fk1_id = m.new_fk1_id,
  t.t_fk2_id = m.new_fk2_id,
  t.t_last_update = trunc(sysdate);
You could think of batching the MERGE based on blocks of ROWIDs. Those tend to be logically colocated, therefore it should be a bit faster (one way to implement this is sketched below).
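One way to implement that ROWID batching is DBMS_PARALLEL_EXECUTE (available since 11gR2). A hedged sketch, chunking migration_temp by its own ROWIDs; the task name and chunk size are arbitrary:
BEGIN
  DBMS_PARALLEL_EXECUTE.CREATE_TASK('migrate_tab1');

  -- split migration_temp into ROWID ranges of ~10,000 rows each
  DBMS_PARALLEL_EXECUTE.CREATE_CHUNKS_BY_ROWID(
    task_name   => 'migrate_tab1',
    table_owner => USER,
    table_name  => 'MIGRATION_TEMP',
    by_row      => TRUE,
    chunk_size  => 10000);

  -- run the MERGE once per chunk; :start_id and :end_id are bound by the package
  DBMS_PARALLEL_EXECUTE.RUN_TASK(
    task_name      => 'migrate_tab1',
    sql_stmt       => 'MERGE INTO Tab1 t
                       USING (SELECT * FROM migration_temp
                               WHERE rowid BETWEEN :start_id AND :end_id) m
                          ON (t.rowid = m.rid)
                        WHEN MATCHED THEN UPDATE SET
                          t.t_fk1_id = m.new_fk1_id,
                          t.t_fk2_id = m.new_fk2_id,
                          t.t_last_update = trunc(sysdate)',
    language_flag  => DBMS_SQL.NATIVE,
    parallel_level => 1); -- raise cautiously: the question reports locking with parallel DML

  DBMS_PARALLEL_EXECUTE.DROP_TASK('migrate_tab1');
END;
/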
If methods 1 and 2 are still too slow, you could follow your partitioning idea. For instance, introduce a column to distinguish the rows to be migrated. Because of DEFAULT ... NOT NULL this will be very fast:
ALTER TABLE Tab1 ADD (todo NUMBER DEFAULT 0 NOT NULL);
Now partition your table into two partitions, one with the migration data and one with the rest that you will not touch. I don't have much experience with introducing partitions while the application is running, but I think it is solvable, for instance with online redefinition or:
ALTER TABLE Tab1 MODIFY
PARTITION BY LIST (todo) (
PARTITION pdonttouch VALUES (0),
PARTITION pmigration VALUES (1)
) ONLINE UPDATE INDEXES (
T_ID_PK GLOBAL, T_IN_1 GLOBAL,
T_IN_2 GLOBAL, T_IN_3 GLOBAL
);
Now you can identify the rows to be moved. This can be done row by row; it doesn't affect the other processes and should not count towards your downtime. The migration rows will move from partition PDONTTOUCH to partition PMIGRATION, therefore you need to enable row movement:
ALTER TABLE Tab1 ENABLE ROW MOVEMENT;
UPDATE Tab1 SET todo=1 WHERE .... JOIN ...;
Now you can work on the partition PMIGRATION and update the data there. This should be much faster than on the original table, as the size of the partition is only 0.5% of the whole table. I don't know about the indexes, though.
Theoretically, you could create a table with the same structure and data as PMIGRATION, work on that table, and once done, swap the partition and the working table with EXCHANGE PARTITION (see the sketch below). Again, I don't know about the indexes.
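A hedged sketch of that final swap, assuming a working table TAB1_WORK with the same structure as Tab1 (the table name is hypothetical):
-- swap the segment of partition PMIGRATION with the worked-on table
ALTER TABLE Tab1
  EXCHANGE PARTITION pmigration WITH TABLE tab1_work
  INCLUDING INDEXES WITHOUT VALIDATION;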

SQL Queries instead of Cursors

I'm creating a database for a hypothetical video rental store.
All I need to do is a procedure that checks the availability of a specific movie (obviously the movie can have several copies). So I have to check if there is a copy available for rent, and get the number of that copy (because it'll affect other triggers later..).
I already did everything with cursors and it actually works very well, but I need (i.e. "must") to do it without using cursors, just with "pure SQL" (i.e. queries).
I'll explain briefly the scheme of my DB:
This procedure is going to use 3 tables: 'Copia Film' (Movie Copy), 'Include' (Includes), 'Noleggio' (Rent).
Copia Film Table has this attributes:
idCopia
Genere (FK references to Film)
Titolo (FK references to Film)
dataUscita (FK references to Film)
Include Table:
idNoleggio (FK references to Noleggio. Means idRent)
idCopia (FK references to Copia film. Means idCopy)
Noleggio Table:
idNoleggio (PK)
dataNoleggio (dateOfRent)
dataRestituzione (dateReturn)
dataRestituito (dateReturned)
CF (FK to Person)
Prezzo (price)
Every movie can have more than one copy.
A copy is available in two cases:
The copy ID is not present in the Include table (that means the specific copy has never been rented)
The copy ID is present in the Include table and the dataRestituito (dateReturned) is not null (that means the specific copy has been rented but has already been returned)
The query I've tried is the following, and it is not working at all:
SELECT COUNT(*)
FROM NOLEGGIO
WHERE dataNoleggio IS NOT NULL AND dataRestituito IS NOT NULL AND idNoleggio IN (
SELECT N.idNoleggio
FROM NOLEGGIO N JOIN INCLUDE I ON N.idNoleggio=I.idNoleggio
WHERE idCopia IN (
SELECT idCopia
FROM COPIA_FILM
WHERE titolo='Pulp Fiction')) -- Of course the title is just an example
Well, from the query above I can't tell whether a copy of the selected movie is available, AND I can't get the copy ID when a copy is available.
(If you want, I can paste the cursors lines that work properly)
------ USING THE 'WITH' SOLUTION ------
I modified your code a little bit into this:
WITH film
as
(
SELECT idCopia,titolo
FROM COPIA_FILM
WHERE titolo = 'Pulp Fiction'
),
copy_info as
(
SELECT N.idNoleggio, N.dataNoleggio, N.dataRestituito, I.idCopia
FROM NOLEGGIO N JOIN INCLUDE I ON N.idNoleggio = I.idNoleggio
),
avl as
(
SELECT film.titolo, copy_info.idNoleggio, copy_info.dataNoleggio,
copy_info.dataRestituito, film.idCopia
FROM film LEFT OUTER JOIN copy_info
ON film.idCopia = copy_info.idCopia
)
SELECT COUNT(*),idCopia FROM avl
WHERE(dataRestituito IS NOT NULL OR idNoleggio IS NULL)
GROUP BY idCopia
As I said in the comment, this code works properly if I use it as a plain query, but once I try to make a procedure out of it, I get errors.
The problem is the final SELECT:
SELECT COUNT(*), idCopia INTO CNT,COPYFILM
FROM avl
WHERE (dataRestituito IS NOT NULL OR idNoleggio IS NULL)
GROUP BY idCopia
The error is:
ORA-01422: exact fetch returns more than requested number of rows
ORA-06512: at "VIDEO.PR_AVAILABILITY", line 9.
So it seems the INTO clause is wrong because the query obviously returns more rows. What can I do? I need to get the copy ID (even just the first one in the list of rows) without using cursors.
You can try this -
WITH film
as
(
SELECT idCopia, titolo
FROM COPIA_FILM
WHERE titolo='Pulp Fiction'
),
copy_info as
(
select N.idNoleggio, N.dataNoleggio, N.dataRestituito, I.idCopia
FROM NOLEGGIO N JOIN INCLUDE I ON N.idNoleggio=I.idNoleggio
),
avl as
(
select film.titolo, copy_info.idNoleggio, copy_info.dataNoleggio,
copy_info.dataRestituito
from film LEFT OUTER JOIN copy_info
ON film.idCopia = copy_info.idCopia
)
select * from avl
where (dataRestituito IS NOT NULL OR idNoleggio IS NULL);
You should think in terms of sets, rather than records.
If you find the set of all the films that are out, you can exclude them from your stock, and the rest is rentable.
select copiafilm.* from #f copiafilm
left join
(
select idCopia from #r Noleggio
inner join #i include on Noleggio.idNoleggio = include.idNoleggio
where dataRestituito is null
) out
on copiafilm.idCopia = out.idCopia
where out.idCopia is null
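For completeness, a sketch of the same anti-join spelled with the question's actual table names (assumption as above: a copy is out while its dataRestituito is null):
SELECT cf.*
FROM COPIA_FILM cf
LEFT JOIN (
  SELECT i.idCopia
  FROM NOLEGGIO n
  JOIN INCLUDE i ON n.idNoleggio = i.idNoleggio
  WHERE n.dataRestituito IS NULL -- copies currently out
) rented ON cf.idCopia = rented.idCopia
WHERE rented.idCopia IS NULL -- never rented, or already returned
  AND cf.titolo = 'Pulp Fiction';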
I solved the problem by editing the last query into this one:
SELECT COUNT(*),idCopia INTO CNT,idCopiaFilm
FROM avl
WHERE (dataRestituito IS NOT NULL OR idNoleggio IS NULL) AND rownum = 1
GROUP BY idCopia;
IF CNT > 0 THEN
-- FOUND AVAILABLE COPY
END IF;
EXCEPTION
WHEN NO_DATA_FOUND THEN
-- NOT FOUND AVAILABLE COPY
Thank you #Aditya Kakirde! Your suggestion almost solved the problem.

Getting original and modified content from a table with an audit trail

I came across the following table structure and I need to perform a certain type of query upon it.
id
first_name
last_name
address
email
audit_parent_id
audit_entry_type
audit_change_date
The last three fields are for the audit trail. There is a convention that says: all original entries have the value "0" for "audit_parent_id" and the value "master" for "audit_entry_type". All modified entries have the value of their parent's id for "audit_parent_id" and the value "modified" for the "audit_entry_type".
Now what I want is to be able to get the original value and the modified value for a field, with as few queries as possible.
Any ideas? Thank you.
Assuming a simple case, where you want to get the latest address value change for the record with id 50, this query fits your needs:
select
  p.id,
  p.address as original_address,
  (select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1) as latest_address
from
  persons p -- Assuming it's the table name
where
  p.id = 50
But this assumes that the address value is carried over into each audit entry even when it doesn't change from one entry to the next.
Here's another example, showing all persons that had an address change:
select
  p.id,
  p.address as original_address,
  (select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1) as latest_address
from
  persons p
where
  p.audit_parent_id = 0
  and p.address not like (select p1.address from persons p1 where p1.audit_parent_id = p.id order by audit_change_date desc limit 1)
This can be solved with pure SQL in modern Postgres using WITH RECURSIVE.
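A hedged sketch of that recursive approach, reusing the persons table name from the answer above, first_name as the example field, and 123 as a placeholder id:
WITH RECURSIVE chain AS (
  SELECT id, first_name, audit_parent_id, 0 AS depth
  FROM persons
  WHERE id = 123 -- the row to inspect
  UNION ALL
  SELECT p.id, p.first_name, p.audit_parent_id, c.depth + 1
  FROM persons p
  JOIN chain c ON p.id = c.audit_parent_id -- climb towards the master row
)
SELECT (SELECT first_name FROM chain WHERE depth = 0)             AS first_name_curr,
       (SELECT first_name FROM chain ORDER BY depth DESC LIMIT 1) AS first_name_org;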
For PostgreSQL 8.3, this plpgsql function does the job while it is also a decent solution for modern PostgreSQL. You want to ..
get the original value and the modified value for a field
The demo picks first_name as the field:
CREATE OR REPLACE FUNCTION f_get_org_val(integer
, OUT first_name_curr text
, OUT first_name_org text) AS
$func$
DECLARE
_parent_id int;
BEGIN
SELECT INTO first_name_curr, first_name_org, _parent_id
first_name, first_name, audit_parent_id
FROM tbl
WHERE id = $1;
WHILE _parent_id <> 0
LOOP
SELECT INTO first_name_org, _parent_id
first_name, audit_parent_id
FROM tbl
WHERE id = _parent_id;
END LOOP;
END
$func$ LANGUAGE plpgsql;
COMMENT ON FUNCTION f_get_org_val(int) IS 'Get current and original values for id.
$1 .. id';
Call:
SELECT * FROM f_get_org_val(123);
This assumes that all trees have a root node with audit_parent_id = 0 and no circular references, or you will end up with an endless loop. You might want to add a counter and exit the loop after x iterations.