How to use multiple transactions in Snowflake Task? - sql

I have two ETL jobs running on a stream I've created on a table. I need to run both against the same stream data, and I read that in order to do so the DML statements (in my case merge statements) need to be wrapped in a transaction and committed at the end. I can't seem to do that in a task, though. I think I'm misplacing a semicolon somewhere. This is what I've tried:
create or replace task my_task as
begin;
merge into my_table1 t using my_stream s on t.id=s.id when matched insert values (id, col1);
merge into my_table2 t using my_stream s on t.id=s.id when matched insert values (id, col2);
commit;
This is just an example, the merge statements do more complex stuff.
The script only runs up to the begin if I use a semicolon, or I get an EOF error if I don't use one, even though I have multiple semicolons later in the script (so it tries to read past the commit).

The task can call a stored procedure that contains the different statements within a transaction:
create procedure ...
as
$$
...
statement1;
BEGIN TRANSACTION;
statement2;
COMMIT;
statement3;
...
$$;
https://docs.snowflake.com/en/sql-reference/transactions.html
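Applied to the question, a minimal sketch of that approach might look like the following (Snowflake Scripting; the procedure name, warehouse, schedule, and column mappings are assumptions, and the merge conditions are simplified). Because both merges read the stream inside one explicit transaction, they see the same stream data and the offset advances only once, at commit:
create or replace procedure process_my_stream()
returns varchar
language sql
as
$$
begin
    begin transaction;
    -- both statements consume my_stream within the same transaction
    merge into my_table1 t using my_stream s on t.id = s.id
        when not matched then insert (id, col1) values (s.id, s.col1);
    merge into my_table2 t using my_stream s on t.id = s.id
        when not matched then insert (id, col2) values (s.id, s.col2);
    commit;
    return 'done';
end;
$$;

create or replace task my_task
    warehouse = my_wh
    schedule = '5 minute'
as
    call process_my_stream();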

Related

PostgreSQL FOUND for CREATE TABLE statements

I am creating a function that will create a new table and insert information about that table into other tables.
To create that table I am using the
CREATE TABLE IF NOT EXISTS
statement. Sadly, it does not update the FOUND special variable in PostgreSQL, nor can I find any other variable that would be updated.
Is there any way in PL/PGSQL to know whether that statement created a table or not?
The goal is to avoid duplicating the information in the other tables.
You may use CREATE TABLE AS in combination with psql's ON_ERROR_ROLLBACK setting:
BEGIN;
-- Do inital stuff
\set ON_ERROR_ROLLBACK on
CREATE TABLE my_table AS
SELECT id, name FROM (VALUES (1, 'Bob'), (2, 'Mary')) v(id, name);
\set ON_ERROR_ROLLBACK off
-- Do remaining stuff
END;
To put it bluntly: with \set ON_ERROR_ROLLBACK on, psql will create a savepoint before each statement and automatically roll back to that savepoint or release it, depending on whether the statement succeeds.
The code above will execute initial and remaining stuff even if the table creation fails.
No, there is no information about whether this command created the table or not. The FOUND variable is updated after query execution, not after a DDL command. What is guaranteed is that after this command the table will exist, or the command will fail with an exception.
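One possible workaround, as a sketch (not taken from the answers above, and the table name is made up): drop the IF NOT EXISTS and trap the duplicate_table exception in PL/pgSQL, so the block itself knows whether the table was created:
DO $$
BEGIN
    CREATE TABLE my_table (id integer PRIMARY KEY);
    -- we only get here if the table was actually created,
    -- so the bookkeeping inserts into the other tables would go here
    RAISE NOTICE 'table created';
EXCEPTION
    WHEN duplicate_table THEN
        RAISE NOTICE 'table already existed, skipping bookkeeping';
END;
$$;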

Should i commit at the end of the procedure which is called by an oracle scheduler job

I am running an Oracle JOB which will run a PROCEDURE to CREATE, TRUNCATE, INSERT into, and DROP some relevant tables.
Is this the best way to do a functionality like this ?
Should I Commit at the end of the procedure or not ?
CREATE OR REPLACE Procedure PR_NAME
IS
BEGIN
CREATE TABLE TABLE_1_BAC AS SELECT * FROM TABLE1_VIA_DBLINK;
TRUNCATE TABLE TABLE_1;
INSERT INTO TABLE_1 SELECT * FROM TABLE_1_BAC;
DROP TABLE TABLE_1_BAC;
--COMMIT;
EXCEPTION
WHEN OTHERS THEN
raise_application_error(-20001,'An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
END;
To create TABLE_1 only once for the present data:
CREATE TABLE TABLE_1 AS SELECT * FROM TABLE1_VIA_DBLINK;
then creating an insert trigger on TABLE1_VIA_DBLINK that populates TABLE_1 with new data, and getting rid of this job and procedure altogether, seems more feasible.
If you stay with this job, you may end up waiting for huge amounts of data to be inserted.
By the way, if you insist on using this job, you don't need to issue a commit, since there's already an implicit commit inside the job mechanism.
What kind of job do you use? JOB or SCHEDULER JOB?
I don't see any reason to DROP/CREATE the table. I don't see any reason why you use the intermediate table at all.
Simply use:
CREATE OR REPLACE Procedure PR_NAME
IS
BEGIN
EXECUTE IMMEDIATE 'TRUNCATE TABLE TABLE_1';
INSERT INTO TABLE_1 SELECT * FROM TABLE1_VIA_DBLINK;
COMMIT;
END;
You don't need any exception handler. In the case of a JOB you will not see the exception anyway. In the case of a SCHEDULER JOB you can see the exception in the views
*_SCHEDULER_JOB_LOG
*_SCHEDULER_JOB_RUN_DETAILS
If this is the only kind of operation you do, you should consider a MATERIALIZED VIEW, which basically does the same thing: TRUNCATE and INSERT INTO ... SELECT * FROM ...
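A rough sketch of that alternative, using the table names from the question (the refresh options are just one possible choice, and the plain TABLE_1 from the question would have to be dropped first):
CREATE MATERIALIZED VIEW TABLE_1
    BUILD IMMEDIATE
    REFRESH COMPLETE ON DEMAND
AS SELECT * FROM TABLE1_VIA_DBLINK;

-- the scheduler job then only needs to refresh it
BEGIN
    DBMS_MVIEW.REFRESH('TABLE_1', method => 'C');  -- 'C' = complete refresh
END;
/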
No. You don't need to commit.
The CREATE, TRUNCATE, and DROP commands are DDL (Data Definition Language) in SQL, so Oracle Database issues an implicit commit together with each of them.
DML (Data Manipulation Language) statements like UPDATE, INSERT, and DELETE are what require a commit.
In a scenario where you update, delete, and insert records and then run a CREATE TABLE command, the records inserted, updated, and deleted will be committed (saved) to the database by that implicit commit.
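A small illustration of that implicit commit (table names are made up):
INSERT INTO staging_copy SELECT * FROM source_data;
TRUNCATE TABLE scratch_area;  -- DDL: implicit commit, the INSERT above is now permanent
ROLLBACK;                     -- too late, there is nothing left to roll back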

Why should we use rollback in sql explicitly?

I'm using PostgreSQL 9.3
I have a misunderstanding about transactions and how they work. Suppose we wrap some SQL statements in a transaction like the following:
BEGIN;
insert into tbl (name, val) VALUES('John', 'Doe');
insert into tbl (name, val) VALUES('John', 'Doee');
COMMIT;
If something goes wrong, the transaction will automatically be rolled back. Taking that into account, I can't see when we should use ROLLBACK explicitly. Could you give an example of when it's necessary?
In PostgreSQL the transaction is not automatically rolled back on error.
It is set to the aborted state, where further commands will fail with an error until you roll the transaction back.
Observe:
regress=> BEGIN;
BEGIN
regress=> LOCK TABLE nosuchtable;
ERROR: relation "nosuchtable" does not exist
regress=> SELECT 1;
ERROR: current transaction is aborted, commands ignored until end of transaction block
regress=> ROLLBACK;
ROLLBACK
This is important, because it prevents you from accidentally executing half a transaction. Imagine if PostgreSQL automatically rolled back, allowing new implicit transactions to occur, and you tried to run the following sequence of statements:
BEGIN;
INSERT INTO archive_table SELECT * FROM current_tabble;
DELETE FROM current_table;
COMMIT;
PostgreSQL will abort the transaction when it sees the typo current_tabble. So the DELETE will never happen - all statements get ignored after the error, and the COMMIT is treated as a ROLLBACK for an aborted transaction:
regress=> BEGIN;
BEGIN
regress=> SELECT typo;
ERROR: column "typo" does not exist
regress=> COMMIT;
ROLLBACK
If it instead automatically rolled the transaction back, it'd be like you ran:
BEGIN;
INSERT INTO archive_table SELECT * FROM current_tabble;
ROLLBACK; -- automatic
BEGIN; -- automatic
DELETE FROM current_table;
COMMIT; -- automatic
... which, needless to say, would probably make you quite upset.
Other uses for explicit ROLLBACK are manual modification and test cases:
Do some changes to the data (UPDATE, DELETE ...).
Run SELECT statements to check results of data modification.
Do ROLLBACK if results are not as expected.
In PostgreSQL you can do this even with DDL statements (CREATE TABLE, ...).
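A minimal sketch of that workflow (the table and the check are made up for illustration):
BEGIN;
UPDATE accounts SET balance = balance * 1.05 WHERE plan = 'gold';  -- trial change
SELECT count(*) FROM accounts WHERE balance < 0;                   -- inspect the result
ROLLBACK;  -- results were not as expected; otherwise we would have issued COMMIT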

PL/SQL Oracle Stored Procedure loop structure

Just wondering if the way I put COMMIT in the code block is appropriate or not. Should I put it after the loop finishes, after each INSERT statement, or after the IF/ELSE statement?
FOR VAL1 IN (SELECT A.* FROM TABLE_A A) LOOP
IF VAL1.QTY >= 0 THEN
INSERT INTO TEMP_TABLE VALUES('MORE OR EQUAL THAN 0');
COMMIT; /*<-- Should I put this here?*/
INSERT INTO AUDIT_TABLE VALUES('DATA INSERTED >= 0');
COMMIT; /*<-- Should I put this here too?*/
ELSE
INSERT INTO TEMP_TABLE VALUES ('0');
COMMIT; /*<-- Should I put this here too?*/
INSERT INTO AUDIT_TABLE VALUES('DATA INSERTED IS 0');
COMMIT; /*<-- Should I put this here too?*/
END IF;
/*Or put commit here?*/
END LOOP;
/*Or here??*/
Generally, committing in a loop is not a good idea, especially after every DML statement in that loop. Doing so forces Oracle (LGWR) to write data to the redo log files, and you may find yourself in a situation where other sessions hang because of the log file sync wait event, or facing ORA-01555 because undo segments will be reused more often.
Divide your DML into logical units of work (transactions) and commit when that unit of work is done, not before, not too late, and not in the middle of a transaction. This will allow you to keep your database in a consistent state. If, for example, two insert statements form one unit of work (one transaction), it makes sense to commit or roll them back together, not separately.
So, generally, you should commit as rarely as possible. If you have to commit in a loop, introduce some threshold. For instance, issue a commit after, let's say, 150 rows:
declare
    l_commit_rows number := 0;
begin
    for i in (select * from some_table)
    loop
        l_commit_rows := l_commit_rows + 1;
        insert into some_table(..) values(...);
        if mod(l_commit_rows, 150) = 0
        then
            commit;
        end if;
    end loop;
    -- commit the rest
    commit;
end;
Committing there is rarely appropriate; say your insert into TEMP_TABLE succeeds but your insert into AUDIT_TABLE fails. You then don't know where you are at all. Additionally, the extra commits will increase the amount of time it takes to perform the operation.
It would be more normal to do everything within a single transaction; that is, remove the LOOP and perform your inserts in a single statement. This can be done by using a multi-table insert, and it would look something like this:
insert all
    when qty >= 0 then
        into temp_table values ('MORE OR EQUAL THAN 0')
        into audit_table values ('DATA INSERTED >= 0')
    else
        into temp_table values ('0')
        into audit_table values ('DATA INSERTED IS 0')
select qty from table_a;
A simple rule is to not commit in the middle of an action; you need to be able to tell exactly where you were if you have to restart an operation. This normally means, go back to the beginning but doesn't have to. For instance, if you were to place your COMMIT inside your loop but outside the IF statement then you know that that has completed. You'd have to write back somewhere to tell you that this operation has been completed though or use your SQL statement to determine whether you need to re-evaluate that row.
If you put a commit after each insert statement, the database will commit each row as it is inserted. The same happens if you put the commit after the IF statement ends (both options commit after each inserted row). If the commit is placed after the loop, the commit happens once, after all rows are inserted.
Committing after the loop should be faster, as it commits the data in bulk, but if your loop encounters an error (say, after 50 rows are processed), those 50 rows won't be inserted either.
So, depending on your requirements, you can either commit after the IF or after the loop.

Nested transactions in postgresql 8.2?

I'm working on scripts that apply database schema updates. I've set up all my SQL update scripts using start transaction/commit. I pass these scripts to psql on the command line.
I now need to apply multiple scripts at the same time, and in one transaction. So far the only solution I've come up with is to remove the start transaction/commit from the original set of scripts, then jam them together inside a new start transaction/commit block. I'm writing perl scripts to do this on the fly.
Effectively I want nested transactions, which I can't figure out how to do in postgresql.
Is there any way to do or simulate nested transactions for this purpose? I have things setup to automatically bail out on any error, so I don't need to continue in the top level transaction if any of the lower ones fail.
Well, you do have the possibility of nested transactions inside PostgreSQL using savepoints.
Take this code example:
CREATE TABLE t1 (a integer PRIMARY KEY);
CREATE FUNCTION test_exception() RETURNS boolean LANGUAGE plpgsql AS
$$BEGIN
INSERT INTO t1 (a) VALUES (1);
INSERT INTO t1 (a) VALUES (2);
INSERT INTO t1 (a) VALUES (1);
INSERT INTO t1 (a) VALUES (3);
RETURN TRUE;
EXCEPTION
WHEN integrity_constraint_violation THEN
RAISE NOTICE 'Rollback to savepoint';
RETURN FALSE;
END;$$;
BEGIN;
SELECT test_exception();
NOTICE: Rollback to savepoint
test_exception
----------------
f
(1 row)
COMMIT;
SELECT count(*) FROM t1;
count
-------
0
(1 row)
Maybe this will help you out a little bit.
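For the concrete problem (several scripts inside one outer transaction), plain savepoints can also be used directly; a rough sketch, with the script contents left out:
BEGIN;
SAVEPOINT script_1;
-- statements from the first update script go here
-- on failure: ROLLBACK TO SAVEPOINT script_1;
RELEASE SAVEPOINT script_1;
SAVEPOINT script_2;
-- statements from the second update script go here
-- on failure: ROLLBACK TO SAVEPOINT script_2;
RELEASE SAVEPOINT script_2;
COMMIT;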
I've ended up 'solving' my problem out of band - I use a perl script to re-work the input scripts to eliminate their start transaction/commit calls, then push them all into one file, which gets its own start transaction/commit.