want to make query using running sum for postgres - sql

I have a problem creating a query for Postgres (strictly speaking, it's Redshift).
The table data is below.
Conceptually, the rows are to be processed PARTITION BY user_id ORDER BY created_at desc.
data
user_id | x | y | min | created_at
--------+---+---+-----+---------------------
      1 | 1 | 1 |   1 | 2015-01-15 17:26:53
      1 | 1 | 1 |   2 | 2015-01-15 17:26:54
      1 | 1 | 1 |   3 | 2015-01-15 17:26:55
      1 | 2 | 1 |  10 | 2015-01-16 02:46:21
      1 | 1 | 1 |  15 | 2015-01-16 02:46:22
      1 | 3 | 3 |  11 | 2015-01-16 03:01:44
      1 | 3 | 3 |   2 | 2015-01-16 03:02:06
      2 | 1 | 1 |   3 | 2015-01-16 03:02:12
      2 | 2 | 1 |   4 | 2015-01-16 03:02:15
      2 | 2 | 1 |   7 | 2015-01-16 03:02:18
and what I want is below:
ideal result
user_id | x | y | sum_min
--------+---+---+---------
      1 | 1 | 1 |       6
      1 | 2 | 1 |      10
      1 | 1 | 1 |      15
      1 | 3 | 3 |      13
      2 | 1 | 1 |       3
      2 | 2 | 1 |      11
If I simply group by user_id, x, y, the result will be
user_id | x | y | sum_min
--------+---+---+---------
      1 | 1 | 1 |      21
      : | : | : |       :
which is not good for me :(

Try this:
with cte as (
    select user_id, x, y, created_at,
           sum(min) over (partition by user_id, x, y, created_date) as sum_min
    from (
        select user_id, x, y, min,
               replace(created_at::date::text, '-', '') as created_date,
               created_at
        from usr
    ) t
)
select user_id, x, y, sum_min
from cte
group by user_id, x, y, sum_min
order by user_id

Maybe try grouping it by the creation date as well:
select user_id, x, y, sum(min) as sum_min, created_at::date
from test
group by user_id, x, y, created_at::date
order by user_id, x, y, created_at::date

It seems that what you want is to calculate an aggregate over each cluster of consecutive records that share the same values in three columns, with clusters separated only by a change in those values. A plain GROUP BY cannot do that, because grouping ignores the order of records; the fact that you order by date does not change which rows fall into which group.
One option is to create a plpgsql function with a cursor over your data relation (presumably a view, but it would work equally well with a table). You iterate over the records in order and, for each cluster encountered, sum up the min values and output a record with the clustering columns and the summed value.
CREATE FUNCTION sum_clusters()
RETURNS TABLE (user_id int, x int, y int, sum_min int) AS $$
DECLARE
    data_row data%ROWTYPE;
    cur CURSOR FOR SELECT * FROM data ORDER BY user_id, created_at;
    cur_user integer;
    cur_x integer;
    cur_y integer;
    running_sum integer;
BEGIN
    OPEN cur;
    FETCH NEXT FROM cur INTO data_row;
    LOOP
        EXIT WHEN NOT FOUND;
        -- data_row holds the first row of a new cluster
        cur_user := data_row.user_id;
        cur_x := data_row.x;
        cur_y := data_row.y;
        running_sum := data_row.min;
        LOOP
            FETCH NEXT FROM cur INTO data_row;
            EXIT WHEN NOT FOUND;
            IF (data_row.user_id = cur_user) AND (data_row.x = cur_x) AND (data_row.y = cur_y) THEN
                running_sum := running_sum + data_row.min;  -- same cluster: accumulate
            ELSE
                EXIT;  -- cluster boundary: data_row already holds the next cluster's first row
            END IF;
        END LOOP;
        user_id := cur_user;
        x := cur_x;
        y := cur_y;
        sum_min := running_sum;
        RETURN NEXT;
    END LOOP;
    CLOSE cur;
    RETURN;
END;
$$ LANGUAGE plpgsql;
That is a lot of code and not particularly fast, but it should work.
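For comparison, the same clustering can be expressed in plain SQL with the gaps-and-islands technique: the difference between a per-user row number and a per-(user, x, y) row number is constant within each consecutive run, so it can serve as a grouping key. A sketch (not taken from the answers above), assuming the table is named usr as in the first answer; it should work on both Postgres and Redshift, since only window functions are required:
SELECT user_id, x, y, SUM(min) AS sum_min
FROM (
    SELECT user_id, x, y, min, created_at,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at)
         - ROW_NUMBER() OVER (PARTITION BY user_id, x, y ORDER BY created_at) AS grp
    FROM usr
) t
GROUP BY user_id, x, y, grp
ORDER BY user_id, MIN(created_at);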

Related

insert extra rows in query result sql

Given a table with entries at irregular timestamps, "breaks" must be inserted at regular 5-minute intervals (the associated data can/will be NULL).
I was thinking of getting the start time, then making a subquery with a window function that adds 5-minute intervals to the start time, but the only way I could think of to increment the values was row_number.
WITH data as (
    select id, data,
           cast(date_and_time as double) * 1000 as time_milliseconds
    from t1
), -- original data
start_times as (
    select id, MIN(cast(date_and_time as double) * 1000) as start_time
    from t1
    GROUP BY id
), -- first timestamp for each id
boundries as (
    SELECT t1.id,
           (row_number() OVER (PARTITION BY t1.id ORDER BY t1.date_and_time) - 1) * 300000
               + start_times.start_time as boundry
    from t1
    INNER JOIN start_times ON start_times.id = t1.id
) -- increment the number of 5 min intervals added on each row; later, full join boundries with the original data
However, this limits me to the number of rows present for an id in the original data table; if the timestamps are spread out, those rows cannot cover the number of 5-minute intervals that need to be added.
sample data:
initial data:
| id | value | timestamp    |
|----|-------|--------------|
| 1  | 3     | 12:00:01.011 |
| 1  | 4     | 12:03:30.041 |
| 1  | 5     | 12:12:20.231 |
| 1  | 3     | 15:00:00.312 |
data after my query:
| id | value | timestamp (UNIX) |
|----|-------|------------------|
| 1  | 3     | 12:00:01         |
| 1  | 4     | 12:03:30         |
| 1  | NULL  | 12:05:01         | <-- Data from "boundries"
| 1  | NULL  | 12:10:01         | <-- Data from "boundries"
| 1  | 5     | 12:12:20         |
| 1  | NULL  | 12:15:01         | <-- Data from "boundries"
| 1  | NULL  | 12:20:01         | <-- Data from "boundries"
| 1  | 3     | 15:00:00         | <-- Jumping directly to 15:00:00 (WRONG! :( need to insert more 5 min breaks here)
I was thinking of creating a temporary table inside Hive and filling it with x rows representing 5-minute intervals from the start time to the end time of the data table, but I couldn't find any way of accomplishing that.
Is there any way of using "for loops"? Any suggestions would be appreciated.
Thanks
You can try calculating the difference between the current timestamp and the next one, dividing by 300 to get the number of ranges, producing a string of spaces of length num_ranges, and exploding it to generate the rows.
Demo:
with your_table as ( -- initial data example
    select stack(3,
        1, 3, '2020-01-01 12:00:01.011',
        1, 4, '2020-01-01 12:03:30.041',
        1, 5, '2020-01-01 12:20:20.231'
    ) as (id, value, ts)
)
select id, value, ts, next_ts,
       diff_sec, num_intervals,
       from_unixtime(unix_timestamp(ts) + h.i * 300) new_ts,
       coalesce(from_unixtime(unix_timestamp(ts) + h.i * 300), ts) as calculated_timestamp
from
(
    select id, value, ts, next_ts,
           (unix_timestamp(next_ts) - unix_timestamp(ts)) diff_sec,
           floor((unix_timestamp(next_ts) - unix_timestamp(ts)) / 300) num_intervals -- diff in seconds / 5 min
    from
    (
        select id, value, ts, lead(ts) over (order by ts) next_ts
        from your_table
    ) s
) s
lateral view outer posexplode(split(space(cast(s.num_intervals as int)), ' ')) h as i, x -- this generates the rows
Result:
id value ts next_ts diff_sec num_intervals new_ts calculated_timestamp
1 3 2020-01-01 12:00:01.011 2020-01-01 12:03:30.041 209 0 2020-01-01 12:00:01 2020-01-01 12:00:01
1 4 2020-01-01 12:03:30.041 2020-01-01 12:20:20.231 1010 3 2020-01-01 12:03:30 2020-01-01 12:03:30
1 4 2020-01-01 12:03:30.041 2020-01-01 12:20:20.231 1010 3 2020-01-01 12:08:30 2020-01-01 12:08:30
1 4 2020-01-01 12:03:30.041 2020-01-01 12:20:20.231 1010 3 2020-01-01 12:13:30 2020-01-01 12:13:30
1 4 2020-01-01 12:03:30.041 2020-01-01 12:20:20.231 1010 3 2020-01-01 12:18:30 2020-01-01 12:18:30
1 5 2020-01-01 12:20:20.231 \N \N \N \N 2020-01-01 12:20:20.231
Additional rows were added. I left all intermediate columns for debugging purposes.
A recursive query would help here, but Hive does not support them (more info).
You may consider creating the table outside of Hive or writing a UDF.
Either way this query can be expensive, so depending on how frequently you run it, materialized views/tables are recommended.
The example shows a UDF, inbetween, created using PySpark to run the query. It:
- generates the values between the min and max timestamp of the dataset, using CTEs and the UDF, to create a temporary table intervals
- generates all possible intervals using an expensive cross join in possible_records
- uses a left join to retrieve the records with actual values (for demonstration purposes I've represented the timestamp value as just the time string)
The code below shows how it was evaluated using Hive.
Example Code
from pyspark.sql.functions import udf
from pyspark.sql.types import IntegerType,ArrayType
inbetween = lambda min_value,max_value : [*range(min_value,max_value,5*60)]
udf_inbetween = udf(inbetween,ArrayType(IntegerType()))
sqlContext.udf.register("inbetween",udf_inbetween)
sqlContext.sql("""
WITH max_timestamp(t) as (
select max(timestamp) as t from initial_data2
),
min_timestamp(t) as (
select min(timestamp) as t from initial_data2
),
intervals as (
select explode(inbetween(unix_timestamp(mint.t),unix_timestamp(maxt.t))) as interval_time FROM
min_timestamp mint, max_timestamp maxt
),
unique_ids as (
select distinct id from initial_data2
),
interval_times as (
select interval_time from (
select
cast(from_unixtime(interval_time) as timestamp) as interval_time
from
intervals
UNION
select distinct d.timestamp as interval_time from initial_data2 d
)
order by interval_time asc
),
possible_records as (
select
distinct
d.id,
i.interval_time
FROM
interval_times i, unique_ids d
)
select
p.id,
d.value,
split(cast(p.interval_time as string)," ")[1] as timestamp
FROM
possible_records p
LEFT JOIN
initial_data2 d ON d.id = p.id and d.timestamp = p.interval_time
ORDER BY p.id, p.interval_time
""").show(20)
Output
+---+-----+---------+
| id|value|timestamp|
+---+-----+---------+
| 1| 3| 12:00:01|
| 1| 4| 12:03:30|
| 1| null| 12:05:01|
| 1| null| 12:10:01|
| 1| 5| 12:12:20|
| 1| null| 12:15:01|
| 1| null| 12:20:01|
| 1| null| 12:25:01|
| 1| null| 12:30:01|
| 1| null| 12:35:01|
| 1| null| 12:40:01|
| 1| null| 12:45:01|
| 1| null| 12:50:01|
| 1| null| 12:55:01|
| 1| null| 13:00:01|
| 1| null| 13:05:01|
| 1| null| 13:10:01|
| 1| null| 13:15:01|
| 1| null| 13:20:01|
| 1| null| 13:25:01|
+---+-----+---------+
only showing top 20 rows
Data Prep to replicate
from pyspark.sql import Row  # needed for the Row(**entry) construction below

raw_data1 = [
    {"id": 1, "value": 3, "timestam": "12:00:01"},
    {"id": 1, "value": 4, "timestam": "12:03:30"},
    {"id": 1, "value": 5, "timestam": "12:12:20"},
    {"id": 1, "value": 3, "timestam": "15:00:00"},
]
raw_data = [*map(lambda entry: Row(**entry), raw_data1)]
initial_data = sqlContext.createDataFrame(raw_data, schema="id int, value int, timestam string")
initial_data.createOrReplaceTempView('initial_data')
sqlContext.sql("create or replace temp view initial_data2 as select id, value, cast(timestam as timestamp) as timestamp from initial_data")

Mutating error on an AFTER insert trigger

CREATE OR REPLACE TRIGGER TRG_INVOICE
AFTER INSERT ON INVOICE
FOR EACH ROW
DECLARE
    V_SERVICE_COST FLOAT;
    V_SPARE_PART_COST FLOAT;
    V_TOTAL_COST FLOAT;
    V_INVOICE_DATE DATE;
    V_DUEDATE DATE;
    V_REQ_ID INVOICE.SERVICE_REQ_ID%TYPE;
    V_INV_ID INVOICE.INVOICE_ID%TYPE;
BEGIN
    V_REQ_ID := :NEW.SERVICE_REQ_ID;
    V_INV_ID := :NEW.INVOICE_ID;

    SELECT SUM(S.SERVICE_COST) INTO V_SERVICE_COST
    FROM INVOICE I, SERVICE_REQUEST SR, SERVICE S, SERVICE_REQUEST_TYPE SRT
    WHERE I.SERVICE_REQ_ID = SR.SERVICE_REQ_ID
      AND SR.SERVICE_REQ_ID = SRT.SERVICE_REQ_ID
      AND SRT.SERVICE_ID = S.SERVICE_ID
      AND I.SERVICE_REQ_ID = V_REQ_ID;

    SELECT SUM(SP.PRICE) INTO V_SPARE_PART_COST
    FROM INVOICE I, SERVICE_REQUEST SR, SERVICE S, SERVICE_REQUEST_TYPE SRT,
         SPARE_PART_SERVICE SRP, SPARE_PART SP
    WHERE I.SERVICE_REQ_ID = SR.SERVICE_REQ_ID
      AND SR.SERVICE_REQ_ID = SRT.SERVICE_REQ_ID
      AND SRT.SERVICE_ID = S.SERVICE_ID
      AND S.SERVICE_ID = SRP.SERVICE_ID
      AND SRP.SPARE_PART_ID = SP.SPARE_PART_ID
      AND I.SERVICE_REQ_ID = V_REQ_ID;

    V_TOTAL_COST := V_SERVICE_COST + V_SPARE_PART_COST;

    SELECT SYSDATE INTO V_INVOICE_DATE FROM DUAL;
    SELECT ADD_MONTHS(SYSDATE, 1) INTO V_DUEDATE FROM DUAL;

    UPDATE INVOICE
    SET COST_SERVICE_REQ = V_SERVICE_COST,
        COST_SPARE_PART = V_SPARE_PART_COST,
        TOTAL_BALANCE = V_TOTAL_COST,
        PAYMENT_DUEDATE = V_DUEDATE,
        INVOICE_DATE = V_INVOICE_DATE
    WHERE INVOICE_ID = V_INV_ID;
END;
I'm trying to calculate some columns after the user inserts a row.
Using the service_request_id I want to calculate the service/parts/total cost. Also, I would like to generate the creation and due dates. But, I keep getting
INVOICE is mutating, trigger/function may not see it
Not sure how the table is mutating after the insert statement.
"Not sure how the table is mutating after the insert statement."
Imagine a simple table:
create table x(
x int,
my_sum int
);
and an AFTER INSERT FOR EACH ROW trigger, similar to yours, which calculates the sum of all values in the table and updates the my_sum column.
Now imagine this insert statement:
insert into x( x )
select 1 as x from dual
connect by level <= 1000;
This single statement basically inserts 1000 records, each one with the value 1; see this demo: http://sqlfiddle.com/#!4/0f211/7
Since in SQL each individual statement must be atomic (more on this here: Statement-Level Read Consistency), Oracle is free to perform this query in any way as long as the final result is correct (consistent). It can save records in the order of execution, or maybe in reverse order, or it can divide the batch into 10 threads and load them in parallel.
Since the trigger is fired individually after inserting each row, and it cannot know the "final" result in advance, all of the results below are possible, depending on the "internal" method chosen by Oracle to execute the query. As you see, these results do not meet the definition of consistency, and Oracle prevents this by raising the mutating table error.
In other words: your assumptions are bad and your design is flawed; you need to change it.
| X | MY_SUM |
|---|--------|
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
...
...
or maybe :
| X | MY_SUM |
|---|--------|
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
| 1 | 1000 |
...
or maybe:
| X | MY_SUM |
|---|--------|
| 1 | 4 |
| 1 | 8 |
| 1 | 12 |
| 1 | 16 |
| 1 | 20 |
| 1 | 24 |
| 1 | 28 |
...
...
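If the derived columns must be kept, a common way out (not part of the original answer; a minimal sketch, assuming Oracle 11g or later and reduced here to the date columns) is a compound trigger: remember the affected IDs in the row-level section, then run the UPDATE once the statement has completed, when INVOICE is no longer mutating:
CREATE OR REPLACE TRIGGER TRG_INVOICE
FOR INSERT ON INVOICE
COMPOUND TRIGGER
    TYPE t_id_tab IS TABLE OF INVOICE.INVOICE_ID%TYPE;
    g_ids t_id_tab := t_id_tab();

    AFTER EACH ROW IS
    BEGIN
        -- only record the key; no queries against INVOICE at row level
        g_ids.EXTEND;
        g_ids(g_ids.LAST) := :NEW.INVOICE_ID;
    END AFTER EACH ROW;

    AFTER STATEMENT IS
    BEGIN
        -- the statement has finished, so INVOICE can be read and updated freely;
        -- the cost subqueries from the question's trigger could be folded in here as well
        FORALL i IN 1 .. g_ids.COUNT
            UPDATE INVOICE
            SET INVOICE_DATE = SYSDATE,
                PAYMENT_DUEDATE = ADD_MONTHS(SYSDATE, 1)
            WHERE INVOICE_ID = g_ids(i);
    END AFTER STATEMENT;
END;
/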

Find a latest gap between Unix timestamps

I currently have two functions that should return the time a device started logging again, i.e. the time whose preceding row is more than 60 seconds older. These functions may work fine, but I can hardly verify that because they take forever to run. Are there any shortcuts to make this faster?
CREATE OR REPLACE FUNCTION findNextTime(startt integer)
RETURNS integer AS
$nextTime$
DECLARE
    nextTime integer;
BEGIN
    SELECT time INTO nextTime FROM m01 WHERE time < startt ORDER BY time DESC LIMIT 1;
    RETURN nextTime;
END;
$nextTime$ LANGUAGE plpgsql;

CREATE OR REPLACE FUNCTION findStart()
RETURNS integer AS
$lastTime$
DECLARE
    currentTime integer;
    lastTime integer;
BEGIN
    SELECT time INTO currentTime FROM m01 ORDER BY time DESC LIMIT 1;
    LOOP
        RAISE NOTICE 'Current Time: %', currentTime;
        SELECT findNextTime(currentTime) INTO lastTime;
        EXIT WHEN ((currentTime - lastTime) > 60);
        currentTime := lastTime;
    END LOOP;
    RETURN lastTime;
END;
$lastTime$ LANGUAGE plpgsql;
To clarify, I want to essentially find the last time there was a break of more than 60 seconds between any two rows.
CREATE TABLE IF NOT EXISTS m01 (
    time integer,
    value decimal,
    id smallint,
    driveId smallint
);
Sample Data:
In this case it would return 1520376063, because the next entry (1520375766) is more than 60 seconds apart from it.
| time | value | id | driveid |
|------------|--------------------|------|---------|
| 1520376178 | 516.2 | 5116 | 2 |
| 1520376173 | 507.8 | 5116 | 2 |
| 1520376168 | 499.5 | 5116 | 2 |
| 1520376163 | 491.1 | 5116 | 2 |
| 1520376158 | 482.90000000000003 | 5116 | 2 |
| 1520376153 | 474.5 | 5116 | 2 |
| 1520376148 | 466.20000000000005 | 5116 | 2 |
| 1520376143 | 457.8 | 5116 | 2 |
| 1520376138 | 449.5 | 5116 | 2 |
| 1520376133 | 441.20000000000005 | 5116 | 2 |
| 1520376128 | 432.90000000000003 | 5116 | 2 |
| 1520376123 | 424.6 | 5116 | 2 |
| 1520376118 | 416.20000000000005 | 5116 | 2 |
| 1520376113 | 407.8 | 5116 | 2 |
| 1520376108 | 399.5 | 5116 | 2 |
| 1520376103 | 391.20000000000005 | 5116 | 2 |
| 1520376098 | 382.90000000000003 | 5116 | 2 |
| 1520376093 | 374.5 | 5116 | 2 |
| 1520376088 | 366.20000000000005 | 5116 | 2 |
| 1520376083 | 357.8 | 5116 | 2 |
| 1520376078 | 349.5 | 5116 | 2 |
| 1520376073 | 341.20000000000005 | 5116 | 2 |
| 1520376068 | 332.90000000000003 | 5116 | 2 |
| 1520376063 | 324.5 | 5116 | 2 |
| 1520375766 | 102.5 | 5116 | 2 |
This simple query should replace your two functions. Note the window function lead() in the subquery:
SELECT *
FROM (
    SELECT time, lead(time) OVER (ORDER BY time DESC) AS last_time
    FROM m01
    WHERE time < _startt
) sub
WHERE time > last_time + 60
ORDER BY time DESC
LIMIT 1;
Either way, the crucial part for performance is the right index. Ideally on (time DESC).
Assuming time is defined NOT NULL - which it probably should be, but the table definition in the question does not say so. Else you probably want ORDER BY time DESC NULLS LAST - and a matching index. See:
PostgreSQL sort by datetime asc, null first?
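For example (a sketch; the index name is arbitrary):
CREATE INDEX m01_time_desc_idx ON m01 (time DESC NULLS LAST);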
I expect this plpgsql function to perform faster, though, if gaps typically show up early:
CREATE OR REPLACE FUNCTION find_gap_before_time(_startt int)
RETURNS int AS
$func$
DECLARE
    _current_time int;
    _last_time int;
BEGIN
    FOR _last_time IN -- single loop is enough!
        SELECT time
        FROM m01
        WHERE time < _startt
        ORDER BY time DESC -- NULLS LAST?
    LOOP
        IF _current_time > _last_time + 60 THEN -- never true for 1st row
            RETURN _current_time;
        END IF;
        _current_time := _last_time;
    END LOOP;

    RETURN NULL; -- no gap found
END
$func$ LANGUAGE plpgsql;
Call:
SELECT find_gap_before_time(1520376200);
Result as requested.
Aside: You'd typically save a couple of bytes per row in storage by placing the column value last or first, thereby minimizing alignment padding. Like:
CREATE TABLE m01 (
    time integer,
    id smallint,
    driveId smallint,
    value decimal
);
Detailed explanation:
Calculating and saving space in PostgreSQL

Pick a record based on a given value in postgres

I have a table in postgres like below,
alg_campaignid | alg_score | cp | sum
----------------+-----------+---------+----------
9829 | 30.44056 | 12.4000 | 12.4000
9880 | 29.59280 | 12.0600 | 24.4600
9882 | 29.59280 | 12.0600 | 36.5200
9827 | 29.27504 | 11.9300 | 48.4500
9821 | 29.14840 | 11.8800 | 60.3300
9881 | 29.14840 | 11.8800 | 72.2100
9883 | 29.14840 | 11.8800 | 84.0900
10026 | 28.79280 | 11.7300 | 95.8200
10680 | 10.31504 | 4.1800 | 100.0000
From which I have to select a record based on a randomly generated number from 0 to 100: the first record should be returned if the random number is between 0 and 12.4000, the second if it is between 12.4000 and 24.4600, and likewise the last if it is between 95.8200 and 100.0000.
For example:
if the random number picked is 8, then the first record should be returned;
or
if the random number picked is 48, then the fourth record should be returned.
Is it possible to do this in Postgres? If so, kindly recommend a solution.
Yes, you can do this in Postgres. If you want to generate the number in the database (my_table stands in for your table name below, since table is a reserved word; note that the pick is the first row whose running sum reaches the random value):
with r as (
    select random() * 100 as r
)
select t.*
from my_table t cross join r
where t.sum >= r.r
order by t.sum
limit 1;
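If the running sum were not precomputed, it could be derived on the fly with a window function. A sketch (not part of the original answer), assuming a base table campaigns holding only alg_campaignid, alg_score and cp; computing random() once in a CTE avoids re-rolling it per row:
with r as (
    select random() * 100 as pick
),
ranked as (
    select alg_campaignid, alg_score, cp,
           sum(cp) over (order by alg_score desc, alg_campaignid) as running_sum
    from campaigns
)
select ranked.*
from ranked cross join r
where running_sum >= r.pick
order by running_sum
limit 1;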

Copy and Cascade insert using PL/SQL

Given data structure:
I have the following table My_List, where Sup_ID is the primary key.
My_List
+--------+----------+-----------+
| Sup_ID | Sup_Name | Sup_Code |
+--------+----------+-----------+
| 1 | AA | 23 |
| 2 | BB | 87 |
| 3 | CC | 90 |
+--------+----------+-----------+
And the following table My_List_details, where Buy_ID is the primary key and Sup_ID is a foreign key referencing My_List.Sup_ID
My_List_details
+--------+--------+------------+------------+------------+
| Buy_ID | Sup_ID | Sup_Detail | Max_Amount | Min_Amount |
+--------+--------+------------+------------+------------+
| 23 | 1 | AAA | 1 | 10 |
| 33 | 2 | BBB | 11 | 20 |
| 43 | 3 | CCC | 21 | 30 |
+--------+--------+------------+------------+------------+
Finally, I have the table My_Sequence as follow:
My_Sequence
+-----+------+
| Seq | Name |
+-----+------+
| 4 | x |
| 5 | y |
| 6 | z |
+-----+------+
---------------------------------------------------
Objectives
Write a PL/SQL script to:
1. Using a cursor, copy the My_List records and re-insert them with new Sup_ID values taken from My_Sequence.Seq.
2. Copy the My_List_details records and re-insert them with the new Sup_ID foreign key.
------------------------------------------------------------------------------
Expected Outcome
My_List
+--------+----------+----------+
| Sup_ID | Sub_Name | Sub_Code |
+--------+----------+----------+
| 1 | AA | 23 |
| 2 | BB | 87 |
| 3 | CC | 90 |
| 4 | AA | 23 |
| 5 | BB | 87 |
| 6 | CC | 90 |
+--------+----------+----------+
My_List_details
+--------+--------+------------+------------+------------+
| Buy_ID | Sup_ID | Sub_Detail | Max_Amount | Min_Amount |
+--------+--------+------------+------------+------------+
| 23 | 1 | AAA | 1 | 10 |
| 33 | 2 | BBB | 11 | 20 |
| 43 | 3 | CCC | 21 | 30 |
| 53 | 4 | AAA | 1 | 10 |
| 63 | 5 | BBB | 11 | 20 |
| 73 | 6 | CCC | 21 | 30 |
+--------+--------+------------+------------+------------+
What I have started with is the following:
DECLARE
    NEW_Sup_ID Sup_ID%type := Seq;
    c_Sup_Name Sup_Name%type;
    c_Sup_Code Sup_Code%type;
    c_Buy_ID Buy_ID%type;
    c_Sup_Detail Sup_Detail%type;
    c_Max_Amount Max_Amount%type
    c_My_Min_Amount Min_Amount%type
    CURSOR c_My_List IS
        SELECT * FROM My_List;
    CURSOR c_My_List_details IS
        SELECT * FROM My_List_details
BEGIN
    FETCH c_My_List INTO NEW_Sup_ID, c_Sup_Name, c_Sup_Code;
    INSERT INTO My_List;
    FETCH c_My_List_details INTO c_Buy_ID, NEW_Sup_ID, c_Sup_Detail, c_Max_Amount, c_Min_Amount
    INSERT INTO My_List_details
END;
/
Aside from the syntax errors, I do not see how my script copies row by row and inserts into both tables accordingly. Further, the number of My_Sequence records is bigger than the number of My_List records, so if there are 50 My_List records, I need the script to use only the first 50 Seq values from My_Sequence.
---------------------------------------------------------------------------------
Question
How can I achieve this result? I have searched and found Tom Kyte's package for cascade updates, but I am not sure I need it: I am a beginner in PL/SQL, and such a comprehensive package is a bit complicated for me to utilize. Further, it is for cascade updates, while my case is about re-inserting. I'd appreciate any help.
The following SQL statements will perform the task on the schema defined at this SqlFiddle. Note that I have changed a couple of field and table names, because they clash with Oracle reserved words. SqlFiddle seems to have some problems with my code, but it has been tested on another (amphibious) client which shall remain nameless.
The crucial point (as I said in my comments) is deriving a rule to map old sequence numbers to new ones. The view SEQUENCE_MAP performs this task in the queries below.
You may be disappointed by my reply, because it depends upon there being exactly as many sequence records as LIST/LIST_DETAILS records, and hence it can only be run once. Your final PL/SQL can perform the necessary checks, I hope.
Hopefully it is a matter of refining the SEQUENCE_MAP logic to get you where you want to be.
Avoid using cursors: when manipulating relational data, think in terms of sets of data rather than rows. If you use set-thinking, Oracle can do its magic in optimising, parallelising and so on. Oracle is brilliant at scaling up: if a table is split over multiple disks, for example, it may process your request with data from the multiple disks simultaneously. If you force it into row-by-row, procedural logic, you may find that the applications you write do not scale up well.
CREATE OR REPLACE VIEW SEQUENCE_MAP AS (
    SELECT OLD_SEQ, NEW_SEQ
    FROM
        (SELECT ROWNUM AS RN, SUP_ID AS OLD_SEQ
         FROM (SELECT SUP_ID FROM LIST ORDER BY SUP_ID)) O
    JOIN
        (SELECT ROWNUM AS RN, SUP_ID AS NEW_SEQ
         FROM (SELECT SEQ AS SUP_ID FROM SEQUENCE_TABLE ORDER BY SEQ)) N
    ON N.RN = O.RN
);

INSERT INTO LIST
SELECT NEW_SEQ, SUB_NAME, SUB_CODE
FROM SEQUENCE_MAP
JOIN LIST L ON L.SUP_ID = SEQUENCE_MAP.OLD_SEQ;

INSERT INTO LIST_DETAILS
SELECT BUY_ID, NEW_SEQ, SUB_DETAIL, MAX_FIELD, MIN_FIELD
FROM SEQUENCE_MAP
JOIN LIST_DETAILS L ON L.SUP_ID = SEQUENCE_MAP.OLD_SEQ;
I would do two nested loops and search for the next sequence value to use.
I imagine the new Buy_ID is assigned via a trigger using a sequence, or something equivalent; otherwise you'll have to generate it in your code (a hypothetical sketch of such a trigger follows the block below).
I have no Oracle database available to test it, so double-check the syntax.
DECLARE
    NEW_Sup_ID My_List.Sup_ID%TYPE;
BEGIN
    FOR r_list IN (SELECT * FROM My_List) LOOP
        -- take the smallest unused sequence value for the copied master row
        SELECT MIN(seq) INTO NEW_Sup_ID FROM My_Sequence;

        INSERT INTO My_List (Sup_ID, Sup_Name, Sup_Code)
        VALUES (NEW_Sup_ID, r_list.Sup_Name, r_list.Sup_Code);

        -- copy the detail rows, re-pointing them at the new Sup_ID
        FOR r_det IN (SELECT * FROM My_List_details WHERE Sup_ID = r_list.Sup_ID) LOOP
            INSERT INTO My_List_details (Sup_ID, Sup_Detail, Max_Amount, Min_Amount)
            VALUES (NEW_Sup_ID, r_det.Sup_Detail, r_det.Max_Amount, r_det.Min_Amount);
        END LOOP;

        -- consume the sequence value so it is not picked again
        DELETE FROM My_Sequence WHERE seq = NEW_Sup_ID;
    END LOOP;
    COMMIT;
END;
/
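If no such Buy_ID trigger exists yet, a hypothetical one (the sequence and trigger names here are made up) could look like:
CREATE SEQUENCE buy_id_seq START WITH 100;

CREATE OR REPLACE TRIGGER trg_my_list_details_bi
BEFORE INSERT ON My_List_details
FOR EACH ROW
WHEN (NEW.Buy_ID IS NULL)
BEGIN
    -- assign the next surrogate key only when the caller did not supply one
    :NEW.Buy_ID := buy_id_seq.NEXTVAL;
END;
/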