There are two tables: gc_res and anchor. Anchor table has a unique key - anchor_id.
When Data arrives in gc_res the res_eval trigger is fired and the data is partially evaluated by gc_res_eval().
Some of the entries remain, others are pushed with an insert into anchor table. The anchor table has a trigger before insert, which also pre-evaluates the data and either accepts the insert or updates an old entry. In latter case it might additionally write to a third (log) table.
Both triggers look the same:
CREATE TRIGGER res_eval BEFORE INSERT ON gc_res
FOR EACH ROW EXECUTE PROCEDURE gc_res_eval();
CREATE TRIGGER anchor_smart_insert BEFORE INSERT ON anchor
FOR EACH ROW EXECUTE PROCEDURE anchor_smart_insert();
When I directly try to write to anchor table, the anchor_smart_insert trigger fires properly and works fine.
The gc_res_eval part which tries to insert the data works with a CTE:
WITH anchor_insert AS (
INSERT INTO cbm.anchor (anchor_id, pin, geo, rwo, rwo_value, addr) (
SELECT osm_id, pin, geo, rwo, rwo_value, addr FROM gc_eval
WHERE similarity > accept_polygon AND geo_type = 'Polygon'
) RETURNING anchor_id
) SELECT array_agg(anchor_id) FROM anchor_insert INTO anchors
;
When gc_res_eval tries to insert into the anchor table and there are no conflicts on anchor_id - everything works fine. If there is a conflict, the insert fails on the unique constraint. Thus I assume - that the second trigger does not fire.
CREATE OR REPLACE FUNCTION anchor_smart_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
geog geography;
acceptable_intersect int := 10;
acceptable_overlap_index NUMERIC := 0.56;
BEGIN
SELECT INTO geog geo FROM cbm.anchor a
WHERE a.anchor_id=NEW.anchor_id;
RAISE NOTICE 'anchor_smart_insert is running';
IF ST_Area(ST_Intersection(geog, NEW.geo))>acceptable_intersect THEN
UPDATE cbm.anchor a SET
pin=COALESCE(NEW.pin, ST_Centroid(NEW.geo)),
geo=NEW.geo,
rwo=NEW.rwo,
rwo_value=NEW.rwo_value,
addr=NEW.addr
WHERE a.anchor_id=NEW.anchor_id AND (
a.geo IS DISTINCT FROM NEW.geo
OR a.rwo IS DISTINCT FROM NEW.rwo
OR a.rwo_value IS DISTINCT FROM NEW.rwo_value
OR a.addr IS DISTINCT FROM NEW.addr);
INSERT INTO cbm.anchor_history (anchor_id, geo) VALUES (NEW.anchor_id, geog);
IF ST_Area(ST_Intersection(geog, NEW.geo))^2/(ST_Area(geog)*ST_Area(NEW.geo))<acceptable_overlap_index THEN
INSERT INTO cbm.anchor_review(anchor_id, geo, rwo, rwo_value, addr) VALUES
(NEW.anchor_id, NEW.geo, NEW.rwo, NEW.rwo_value, NEW.addr);
END IF;
RETURN NULL;
END IF;
RETURN NEW;
END;
$function$
;
The DBMS is a pg11 AWS RDS.
Blaming myself:
The anchor_smart_insert() function was working fine as long as the polygons which have been compared were greater than 10sqm. There was the assumption, that there should be only two options: greater than 10sqm, or null.
With such logical reasoning the part, which captures all exemptions was deleted since it seemed unnecessary.
This is also why the tests via direct insert worked well: the polygons were >10sqm and resulted in an update.
Thanks #Adrian Klaver - your suggestions helped!
The restored/improved function properly handles polygons <10sqm in a way, that it pre-compares it to initial and final sizes - if they are smaller, it takes a safer value.
Additionally it takes explicitly care of possible inserts (existing anchor_id ISNULL).
CREATE OR REPLACE FUNCTION cbm.anchor_smart_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
geog geography;
acceptable_intersect NUMERIC:= 10;
acceptable_overlap_index NUMERIC := 0.56;
BEGIN
SELECT INTO geog geo FROM cbm.anchor a
WHERE a.anchor_id=NEW.anchor_id;
IF geog ISNULL THEN RETURN NEW; END IF;
IF ST_Area(ST_Intersection(geog, NEW.geo))>LEAST(acceptable_intersect, GREATEST(ST_Area(geog), ST_Area(NEW.geo))) THEN
UPDATE cbm.anchor a SET
pin=COALESCE(NEW.pin, ST_Centroid(NEW.geo)),
geo=NEW.geo,
rwo=NEW.rwo,
rwo_value=NEW.rwo_value,
addr=NEW.addr
WHERE a.anchor_id=NEW.anchor_id AND (
a.geo IS DISTINCT FROM NEW.geo
OR a.rwo IS DISTINCT FROM NEW.rwo
OR a.rwo_value IS DISTINCT FROM NEW.rwo_value
OR a.addr IS DISTINCT FROM NEW.addr);
IF ST_Area(ST_Intersection(geog, NEW.geo))^2/(ST_Area(geog)*ST_Area(NEW.geo))<acceptable_overlap_index THEN
INSERT INTO cbm.anchor_history (anchor_id, geo) VALUES (NEW.anchor_id, geog);
END IF;
END IF;
-- bad intersection:
INSERT INTO cbm.anchor_review(anchor_id, geo, rwo, rwo_value, addr) VALUES
(NEW.anchor_id, NEW.geo, NEW.rwo, NEW.rwo_value, NEW.addr);
RETURN NULL;
END;
$function$
;
Related
I have a table that looks like this:
CREATE TABLE label (
hid UUID PRIMARY KEY DEFAULT UUID_GENERATE_V4(),
name TEXT NOT NULL UNIQUE
);
I want to create a function that takes a list of names and inserts multiple rows into the table, ignoring duplicate names, and returns an array of the IDs generated for the rows it inserted.
This works:
CREATE OR REPLACE FUNCTION insert_label(nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
WITH new_names AS (
INSERT INTO label(name)
SELECT tn.name
FROM tmp_names tn
WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name)
RETURNING hid
)
SELECT ARRAY_AGG(hid) INTO ids
FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
I have many tables with the exact same columns as the label table, so I would like to have a function that can insert into any of them. I'd like to create a dynamic query to do that. I tried that, but this does not work:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS UUID[]
AS $$
DECLARE
ids UUID[];
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('WITH new_names AS ( INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid)', h_tbl);
EXECUTE query_str;
SELECT ARRAY_AGG(hid) INTO ids FROM new_names;
DROP TABLE tmp_names;
RETURN ids;
END;
$$ LANGUAGE PLPGSQL;
This is the output I get when I run that function:
psql=# select insert_label('label', array['how', 'now', 'brown', 'cow']);
ERROR: syntax error at end of input
LINE 1: ...SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
^
QUERY: WITH new_names AS ( INSERT INTO label(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM label h WHERE h.name = tn.name) RETURNING hid)
CONTEXT: PL/pgSQL function insert_label(regclass,text[]) line 19 at EXECUTE
The query generated by the dynamic SQL looks like it should be exactly the same as the query from static SQL.
I got the function to work by changing the return value from an array of UUIDs to a table of UUIDs and not using CTE:
CREATE OR REPLACE FUNCTION insert_label(h_tbl REGCLASS, nms TEXT[])
RETURNS TABLE (hid UUID)
AS $$
DECLARE
query_str TEXT;
BEGIN
CREATE TEMP TABLE tmp_names(name TEXT);
INSERT INTO tmp_names SELECT UNNEST(nms);
query_str := FORMAT('INSERT INTO %1$I(name) SELECT tn.name FROM tmp_names tn WHERE NOT EXISTS(SELECT 1 FROM %1$I h WHERE h.name = tn.name) RETURNING hid', h_tbl);
RETURN QUERY EXECUTE query_str;
DROP TABLE tmp_names;
RETURN;
END;
$$ LANGUAGE PLPGSQL;
I don't know if one way is better than the other, returning an array of UUIDs or a table of UUIDs, but at least I got it to work one of those ways. Plus, possibly not using a CTE is more efficient, so it may be better to stick with the version that returns a table of UUIDs.
What I would like to know is why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
If anyone can let me know what I did wrong, I would appreciate it.
... why the dynamic query did not work when using a CTE. The query it produced looked like it should have worked.
No, it was only the CTE without (required) outer query. (You had SELECT ARRAY_AGG(hid) INTO ids FROM new_names in the static version.)
There are more problems, but just use this query instead:
INSERT INTO label(name)
SELECT unnest(nms)
ON CONFLICT DO NOTHING
RETURNING hid;
label.name is defined UNIQUE NOT NULL, so this simple UPSERT can replace your function insert_label() completely.
It's much simpler and faster. It also defends against possible duplicates from within your input array that you didn't cover, yet. And it's safe under concurrent write load - as opposed to your original, which might run into race conditions. Related:
How to use RETURNING with ON CONFLICT in PostgreSQL?
I would just use the simple query and replace the table name.
But if you still want a dynamic function:
CREATE OR REPLACE FUNCTION insert_label(_tbl regclass, _nms text[])
RETURNS TABLE (hid uuid)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE format(
$$
INSERT INTO %s(name)
SELECT unnest($1)
ON CONFLICT DO NOTHING
RETURNING hid
$$, _tbl)
USING _nms;
END
$func$;
If you don't need an array as result, stick with the set (RETURNS TABLE ...). Simpler.
Pass values (_nms) to EXECUTE in a USING clause.
The tablename (_tbl) is type regclass, so the format specifier %I for format() would be wrong. Use %s instead. See:
Table name as a PostgreSQL function parameter
I have subscription data that is being appended to a table in real-time (via kafka). i have set up a trigger such that once the data is added it is checked for consistency. If checks pass some of the data should be added to other tables (that have master information on the customer profile etc.). The checks function i wrote works fine but i keep getting errors on the function used in the trigger. The function for the trigger is:
CREATE OR REPLACE FUNCTION update_tables()
RETURNS TRIGGER
LANGUAGE plpgsql
AS
$$
BEGIN
CASE (SELECT check_incoming_data()) WHEN 0
THEN INSERT INTO
sub_master(sub_id, sub_date, customer_id, product_id)
VALUES(
(SELECT sub_id::int FROM sub_realtime WHERE CTID = (SELECT MAX(CTID) FROM sub_realtime)),
(SELECT sub_date::date FROM sub_realtime WHERE CTID = (SELECT MAX(CTID) FROM sub_realtime)),
(SELECT customer_id::int FROM sub_realtime WHERE CTID = (SELECT MAX(CTID) FROM sub_realtime)),
(SELECT product_id::int FROM sub_realtime WHERE CTID = (SELECT MAX(CTID) FROM sub_realtime))
);
RETURN sub_master;
END CASE;
RETURN sub_master;
END;
$$
The trigger is then:
CREATE TRIGGER incoming_data
AFTER INSERT
ON claims_realtime_3
FOR EACH ROW
EXECUTE PROCEDURE update_tables();
What I am saying is 'if checks pass then select data from the last added row and add them to the master table'. What is the best way to structure this query?
Thanks a lot!
The trigger functions are executed for each row and you must use a record type variable called "NEW" which is automatically created by the database in the trigger functions. "NEW" gets only inserted records. For example, I want to insert data to users_log table when inserting records to users table.
CREATE OR REPLACE FUNCTION users_insert()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
begin
insert into users_log
(
username,
first_name,
last_name
)
select
new.username,
new.first_name,
new.last_name;
return new;
END;
$function$;
create trigger store_data_to_history_insert
before insert
on users for each row execute function users_insert();
insertion code
INSERT INTO employee (pid, pname, desig, dept, lts_i, lts_O, p_status) VALUES %s \
ON CONFLICT (pid) DO UPDATE SET \
(pname, desig, dept, lts_i, lts_O, p_status) = \
(EXCLUDED.pname, EXCLUDED.desig, EXCLUDED.dept, EXCLUDED.lts_i, EXCLUDED.lts_O, EXCLUDED.p_status) \
RETURNING *
If i insert such like above then it's working good. Instead of CONFLICT i have used a function the following
CREATE FUNCTION employee_db(
pid1 integer,
pname1 text,
desig1 text,
dept1 text,
lts_i1 time,
lts_o1 time,
p_status1 text
) RETURNS VOID AS
$$
BEGIN
LOOP
-- first try to update the key
-- note that "a" must be unique
UPDATE employee SET (lts_i, lts_o, p_status) = (lts_i1, lts_o1, p_status1) WHERE pid = pid1;
IF found THEN
RETURN;
END IF;
-- not there, so try to insert the key
-- if someone else inserts the same key concurrently,
-- we could get a unique-key failure
BEGIN
INSERT INTO employee(pid, pname, desig, dept, lts_i, lts_o, p_status) VALUES (pid1, pname1, desig1, dept1, lts_i1, lts_o1, p_status1);
RETURN;
EXCEPTION WHEN unique_violation THEN
-- do nothing, and loop to try the UPDATE again
END;
END LOOP;
END;
$$
LANGUAGE plpgsql;
that takes some argument
SELECT merge_db(12, 'Newton', 'director', 'd1', '10:00:26', '00:00:00', 'P-Status')"
but when i update lts_i, lts_O and p_status within same id(12)
SELECT merge_db(12, 'Newton', 'director', 'd1', '12:10:22', '02:30:02', 'active')"
then it also showing duplicate key error.
I don't want to use here CONFLICT, because of i have a UPDATE RULE on the same Table and already postgresql says that "The event is one of SELECT, INSERT, UPDATE, or DELETE. Note that an INSERT containing an ON CONFLICT clause cannot be used on tables that have either INSERT or UPDATE rules. Consider using an updatable view instead."
Update Rule
CREATE RULE log_employee AS ON UPDATE TO employee
WHERE NEW.lts_i <> OLD.lts_i or NEW.lts_O <> OLD.lts_O
DO UPDATE employee set today = current_date where id = new.id;
if lts_i, lts_o or p_status is update then will be insert current_date into "today" field in the same employee table.
But definitely i need RULE, In this situation what should i do?
Any help would be appreciated.
Thanks.
You should use a trigger for that.
The trigger function:
create function emp_trigger_func()
returns trigger
as
$$
begin
new.today := current_date;
return new;
end;
$$
language plpgsql;
The condition on when that columns should be update is better done in the trigger definition to avoid unnecessary firing of the trigger
create trigger update_today
before update on employee
for each row
when (NEW.lts_i <> OLD.lts_i or NEW.lts_O <> OLD.lts_O)
execute procedure emp_trigger_func();
Note that <> doesn't properly deal with NULL values. If lts_i or lts_o can contain null values, then firing condition is better written as:
when ( NEW.lts_i is distinct from OLD.lts_i
or NEW.lts_O is distinct from OLD.lts_O)
This will also catch a change from or to a null value.
I have a table:
CREATE TABLE annotations
(
gid serial NOT NULL,
annotation character varying(250),
the_geom geometry,
"rotationAngle" character varying(3) DEFAULT 0,
CONSTRAINT annotations_pkey PRIMARY KEY (gid),
CONSTRAINT enforce_dims_the_geom CHECK (st_ndims(the_geom) = 2),
CONSTRAINT enforce_srid_the_geom CHECK (st_srid(the_geom) = 4326)
)
And trigger:
CREATE TRIGGER set_angle
AFTER INSERT OR UPDATE
ON annotations
FOR EACH ROW
EXECUTE PROCEDURE setangle();
And function:
CREATE OR REPLACE FUNCTION setAngle() RETURNS TRIGGER AS $$
BEGIN
IF TG_OP = 'INSERT' THEN
UPDATE annotations SET "rotationAngle" = degrees( ST_Azimuth( ST_StartPoint(NEW.the_geom), ST_EndPoint(NEW.the_geom) ) )-90 WHERE gid = NEW.gid;
RETURN NEW;
ELSIF TG_OP = 'UPDATE' THEN
UPDATE annotations SET "rotationAngle" = degrees( ST_Azimuth( ST_StartPoint(NEW.the_geom), ST_EndPoint(NEW.the_geom) ) )-90 WHERE gid = NEW.gid;
RETURN NEW;
END IF;
END;
$$ LANGUAGE plpgsql;
And when new row inserted in table or row edited i want to field rotationAngle setted with function result.
But when i inserting a new row in table function not work. I mean thath rotationAngle value not changed.
What can be wrong?
You are triggering an endless loop. Simplify the trigger function:
CREATE OR REPLACE FUNCTION set_angle()
RETURNS trigger
LANGUAGE plpgsql AS
$func$
BEGIN
NEW."rotationAngle" := degrees(
ST_Azimuth(
ST_StartPoint(NEW.the_geom)
, ST_EndPoint(NEW.the_geom)
)
) - 90;
RETURN NEW;
END
$func$;
Assign to NEW directly. No WHERE in this case.
You must double-quote illegal column names. Better not to use such names to begin with.
Recent related answer.
Code for insert & upgrade is the same. I folded into one code path.
Use a BEFORE trigger. This way you can edit columns of the triggering row directly before they are saved:
CREATE TRIGGER set_angle
BEFORE INSERT OR UPDATE ON annotations
FOR EACH ROW EXECUTE PROCEDURE set_angle();
However
If you are just trying to persist a functionally dependent value in the table (and there are no other considerations): Don't. Use a view or a generated column instead:
Store common query as column?
Then you don't need any of this.
There are multiple things wrong here.
1) When you insert a row 'A' the function setAngle() is called. But in the function you are calling another update within the function which will trigger the function again, and again, and so on...To fix this don't issue a update! Just update the NEW records value independently and return it.
I want to create a trigger for my view in PostgreSQL. The idea is that all new data must fulfill a condition to be inserted. But something is wrong here and I can't find the answer in manuals.
CREATE OR REPLACE VIEW My_View AS
SELECT name, adress, count FROM club, band, country;
CREATE OR REPLACE FUNCTION insert() RETURNS TRIGGER AS $$
BEGIN
IF(NEW.count > 10) THEN
INSERT INTO My_View VALUES (NEW.name, NEW.adress, NEW.count);
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER insert INSTEAD OF INSERT ON My_View
FOR EACH ROW EXECUTE PROCEDURE insert();
There is a semicolon ( ; ) missing at the end of the INSERT statement in the function.
insert is a reserved word in the SQL standard and should not be used as trigger or function name. Even if it's allowed in PostgreSQL, it's a very bad idea.
There are no join conditions for the three tables club, band, country in your view definition. This leads to a CROSS JOIN, which can be extremely expensive. If there are 1000 rows in each table, you get 1,000,000,000 combinations. You most definitely do not want that.
Also, you should table-qualify the columns in your view definition to avoid ambiguities.
CREATE OR REPLACE VIEW my_view AS
SELECT ??.name, ??.address, ??.mycount
FROM club cl
JOIN band ba ON ?? = ??
JOIN country co ON ?? = ??;
You need to fill in where I left question marks.
And always add a column definition list to your INSERT statements.
And finally, you do not want to INSERT into the same view again, which would create an endless loop and may be the primary cause of your error.
CREATE OR REPLACE FUNCTION f_insert()
RETURNS TRIGGER AS
$func$
BEGIN
IF NEW.mycount > 10 THEN
INSERT INTO my_view ???? (col1?, col2?, col3?)
VALUES (NEW.name, NEW.address, NEW.mycount);
END IF;
END
$func$ LANGUAGE plpgsql;
BTW, you shouldn't use count as identifier either. I use mycountinstead.