I have created a PL/pgSQL function that accepts two column names, a "relation", and two table names. It finds distinct rows in one table and inserts them into a temporary table, deletes any row with a null value, and should set all values of one column to relation. I have the first part of the process working with this function:
create or replace function alt_edger(s text, v text, relation text, tbl text, tbl_src text)
returns void
language plpgsql as
$func$
begin
raise notice 's: %, v: %, tbl: %, tbl_src: %', s,v,tbl,tbl_src;
execute ('insert into '||tbl||' ("source", "target") select distinct "'||s||'","'||v||'" from '||tbl_src||'');
execute ('DELETE FROM '||tbl||' WHERE "source" IS null or "target" is null');
end
$func$;
It is executed as follows:
-- create a temporary table and execute the function twice
drop table if exists temp_stack;
create temporary table temp_stack("label" text, "source" text, "target" text, "attr" text, "graph" text);
select alt_edger('x_x', 'y_y', ':associated_with', 'temp_stack','pg_check_table' );
select alt_edger('Document Number', 'x_x', ':documents', 'temp_stack','pg_check_table' );
select * from temp_stack;
Note that I haven't used relation yet. The INSERT should also write relation into the label column, but I can't figure out how to make that happen. I want to get something like:
label            | source | target | attr | graph
:associated_with | 638000 | ARAS   |      |
:associated_with | 202000 | JASE   |      |
:associated_with | 638010 | JASE   |      |
:associated_with | 638000 | JASE   |      |
:associated_with | 202100 | JASE   |      |
:documents       | A      | 638010 |      |
:documents       | A      | 202000 |      |
:documents       | A      | 202100 |      |
:documents       | B      | 638000 |      |
:documents       | A      | 638000 |      |
:documents       | B      | 124004 |      |
:documents       | B      | 202100 |      |
My challenges are:
How to integrate relation in the INSERT? When I try to use VALUES and comma separation I get an "error near select".
How to allow strings starting with ":" in relation? I'm anticipating a problem here, because a leading colon has caused me trouble in the past.
How can I do this? Or is there a better approach?
Toy data model:
drop table if exists pg_check_table;
create temporary table pg_check_table("Document Number" text, x_x int, y_y text);
insert into pg_check_table values ('A',202000,'JASE'),
('A',202100,'JASE'),
('A',638010,'JASE'),
('A',Null,'JASE'),
('A',Null,'JASE'),
('A',202100,'JASE'),
('A',638000,'JASE'),
('A',202100,'JASE'),
('B',638000,'JASE'),
('B',202100,null),
('B',638000,'JASE'),
('B',null,'ARAS'),
('B',638000,'ARAS'),
('B',null,'ARAS'),
('B',638000,null),
('B',124004,null);
alter table pg_check_table add row_num serial;
select * from pg_check_table;
-- DROP FUNCTION alt_edger(_s text, _v text, _relation text, _tbl text, _tbl_src text)
CREATE OR REPLACE FUNCTION alt_edger(_s text, _v text, _relation text, _tbl text, _tbl_src text, OUT row_count int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sql text := format(
      'INSERT INTO pg_temp.%3$I (label, source, target)
       SELECT DISTINCT $1, %1$I, %2$I FROM pg_temp.%4$I
       WHERE (%1$I, %2$I) IS NOT NULL'
    , _s, _v, _tbl, _tbl_src);
BEGIN
   -- RAISE NOTICE '%', _sql;  -- debug
   EXECUTE _sql USING _relation;
   GET DIAGNOSTICS row_count = ROW_COUNT;  -- return number of inserted rows
END
$func$;
Most importantly, use format() to concatenate your dynamic SQL commands safely. And use the format specifier %I for identifiers. This way, SQL injection is not possible and identifiers are double-quoted properly - preserving non-standard names like Document Number. That's where your original failed.
We could concatenate _relation as string to be inserted into label, too. But the preferable way to pass values to EXECUTE is with the USING clause. $1 inside the SQL string passed to EXECUTE is a placeholder for the first USING argument. Not to be confused with $1 referencing function parameters in the context of the function body outside EXECUTE! (You can pass any string, leading colon (:) does not matter, the string is not interpreted when done right.)
See:
Format specifier for integer variables in format() for EXECUTE?
Table name as a PostgreSQL function parameter
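The same division of labor (identifiers quoted into the SQL text, values passed as bound parameters) can be tried without a Postgres server. What follows is only a rough analogue in Python with the built-in sqlite3 module, not the PL/pgSQL function itself. quote_ident() is a hypothetical helper standing in for %I, and the ? placeholder stands in for $1 with USING. The table name check_table and the shortened subset of the toy rows are assumptions made for this sketch. It tests each column with IS NOT NULL separately instead of using the row-value form.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Hypothetical stand-in for format()'s %I: double-quote an identifier."""
    return '"' + name.replace('"', '""') + '"'


def alt_edger(conn, s, v, relation, tbl, tbl_src):
    """Insert distinct non-null (s, v) pairs from tbl_src into tbl, labeled with relation."""
    sql = (
        f"INSERT INTO {quote_ident(tbl)} (label, source, target) "
        f"SELECT DISTINCT ?, {quote_ident(s)}, {quote_ident(v)} "
        f"FROM {quote_ident(tbl_src)} "
        f"WHERE {quote_ident(s)} IS NOT NULL AND {quote_ident(v)} IS NOT NULL"
    )
    # The relation travels as a bound value, so a leading colon is never parsed as SQL.
    return conn.execute(sql, (relation,)).rowcount


conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE check_table ("Document Number" text, x_x int, y_y text)')
conn.executemany(
    "INSERT INTO check_table VALUES (?, ?, ?)",
    [("A", 202000, "JASE"), ("A", 202100, "JASE"), ("A", 638010, "JASE"),
     ("A", None, "JASE"), ("A", 202100, "JASE"), ("A", 638000, "JASE"),
     ("B", 638000, "JASE"), ("B", 202100, None), ("B", None, "ARAS"),
     ("B", 638000, "ARAS"), ("B", 124004, None)],
)
conn.execute(
    "CREATE TABLE temp_stack (label text, source text, target text, attr text, graph text)"
)

n_assoc = alt_edger(conn, "x_x", "y_y", ":associated_with", "temp_stack", "check_table")
n_docs = alt_edger(conn, "Document Number", "x_x", ":documents", "temp_stack", "check_table")
print(n_assoc, n_docs)
```

The column name with a space and the label with a leading colon both pass through untouched, because neither is ever spliced into the statement as raw text.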
I replaced the DELETE in your original with a WHERE clause on the SELECT of the INSERT. It is cheaper not to insert rows in the first place than to delete them again later.
(%1$I, %2$I) IS NOT NULL only qualifies when both values are NOT NULL.
About that:
Check if a Postgres composite field is null/empty
Don't use the prefix "pg_" for your table names. That's what Postgres uses for system tables. Don't mess with those.
I schema-qualify known temporary tables with pg_temp. That's typically optional as the temporary schema comes first in the search_path by default. But that can be changed (maliciously), and then the table name would resolve to any existing regular table of the same name in the search_path. So better safe than sorry. See:
How does the search_path influence identifier resolution and the "current schema"
I made the function return the number of inserted rows. That's totally optional!
Since I do that with an OUT parameter, I am allowed to skip the RETURNS clause. See:
Can I make a plpgsql function return an integer without using a variable?
Let's say I want to create a new table from an existing table in SQL (postgres). I want the new table to have the same name as the old table but I want it to be in a different schema.
Is there a way to do this without having to repeat the name of the two tables (which share one name)?
Let's say the name of the original table is public.student
CREATE TABLE student(
student_id INT PRIMARY KEY,
last_name VARCHAR(30),
major VARCHAR(30))
Now I want to have the exact table but I want it to be in test.student
I know I would "clone" that table via
CREATE TABLE test.student AS
SELECT *
FROM public.student;
but I would like to write this without having to repeat writing "student".
Is there a way to write a function for this?
I'm quite new to SQL, so I'm thankful for any help! I looked into functions and I wasn't able to make it work.
You could create a procedure (or a function) with dynamic SQL:
CREATE OR REPLACE PROCEDURE foo(_schema text, _table text)
  LANGUAGE plpgsql AS
$func$
BEGIN
   EXECUTE format('CREATE TABLE %1$I.%2$I AS TABLE public.%2$I'
                , _schema, _table);
END
$func$;
Call:
CALL foo('test', 'student');
Note that identifiers are case sensitive here!
Be wary of possible SQL injection. format() with the format specifier %I (for identifier) is safe. (The 1$ and 2$ inside the specifiers are positional references to the arguments of format().)
See:
Define table and column names as arguments in a plpgsql function?
Table name as a PostgreSQL function parameter
relevant documentation
I am trying to create a trigger that catches inserts into the Viewings table where the foreign key (viewings.location) does not correspond to an existing primary key in the Places table (places.location). The logic, from what I can tell, works as expected. However my issue comes from trying to concatenate the attempted value into the error-message in the raise function. Is this not allowed?
create trigger catchForeignKeyError BEFORE INSERT ON VIEWINGS
BEGIN
SELECT CASE
WHEN NEW.location NOT IN (SELECT PLACES.location FROM PLACES) THEN
RAISE(ABORT, 'Error: Insert into the VIEWINGS table references location '''||NEW.location||''' that is not found in the PLACES table.')
END;
END;
In the SQLite grammar, the second parameter of the RAISE() expression is not a string but a name:
RAISE(ABORT, some_error)
Identifiers can be quoted with double quotes, and for historical reasons, SQLite accepts a string (with single quotes) where an identifier is expected, but then it must be a single string, not a string expression composed of other values:
RAISE(ABORT, "some error")
There is no mechanism to get a dynamic value into the error message, except by creating a user-defined function for this.
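As a minimal sketch of the user-defined-function route, here is how it could look from Python's built-in sqlite3 module. The function name fail, the table layouts, and the sample locations are assumptions made for illustration. One caveat: when a Python callback raises, the sqlite3 module is expected to report only a generic error text, so this sketch also keeps the detailed message in a Python list.

```python
import sqlite3

messages = []  # detailed messages collected on the Python side


def fail(message):
    """User-defined SQL function: record the message, then abort the statement."""
    messages.append(message)
    raise ValueError(message)


conn = sqlite3.connect(":memory:")
conn.create_function("fail", 1, fail)  # "fail" is a name chosen for this sketch
conn.executescript("""
    CREATE TABLE places (location TEXT PRIMARY KEY);
    CREATE TABLE viewings (location TEXT);
    INSERT INTO places VALUES ('Berlin');

    CREATE TRIGGER catch_foreign_key_error BEFORE INSERT ON viewings
    WHEN NEW.location NOT IN (SELECT location FROM places)
    BEGIN
        SELECT fail('Insert into VIEWINGS references location ''' || NEW.location
                    || ''' that is not found in PLACES.');
    END;
""")

conn.execute("INSERT INTO viewings VALUES ('Berlin')")  # passes the check
try:
    conn.execute("INSERT INTO viewings VALUES ('Atlantis')")  # aborted by fail()
    rejected = False
except sqlite3.Error:
    rejected = True

print(rejected, messages)
```

Because the concatenation happens in an ordinary function argument rather than inside RAISE(), the attempted value can be part of the message.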
Parameter table is initially created and one row is added in Postgres.
This table should have always one row, otherwise SQL queries using this table will produce incorrect results. DELETE or INSERT to this table are disallowed, only UPDATE is allowed.
How to add single row constraint to this table?
Maybe DELETE and INSERT triggers can raise an exception, or is there a simpler way?
The following will create a table into which you can insert only a single row. Any update of the id column will result in an error, as will any insert with a value other than 42. The actual id value doesn't matter (unless it has some special meaning that you need).
create table singleton
(
id integer not null primary key default 42,
parameter_1 text,
parameter_2 text,
constraint only_one_row check (id = 42)
);
insert into singleton values (default);
To prevent deletes you can use a rule:
create or replace rule ignore_delete
AS on delete to singleton
do instead nothing;
You could also use a rule to make insert do nothing as well if you want to make an insert "fail" silently. Without the rule, an insert would generate an error. If you want a delete to generate an error as well, you would need to create a trigger that simply raises an exception.
Edit
If you want an error to be thrown for inserts or deletes, you need a trigger for that:
create table singleton
(
id integer not null primary key,
parameter_1 text,
parameter_2 text
);
insert into singleton (id) values (42);
create or replace function raise_error()
returns trigger
as
$body$
begin
RAISE EXCEPTION 'No changes allowed';
end;
$body$
language plpgsql;
create trigger singleton_trg
before insert or delete on singleton
for each statement execute procedure raise_error();
Note that you have to insert the single row before you create the trigger, otherwise you can't insert that row.
This will only partially work for a superuser or the owner of the table. Both have the privilege to drop or disable the trigger. But that is the nature of a superuser - he can do anything.
To make any table a singleton just add this column:
just_me bool NOT NULL DEFAULT TRUE UNIQUE CHECK (just_me)
This allows at most one row. In addition, add the trigger that @a_horse provided.
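The effect of that column can be checked quickly without a Postgres server. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is an analogue rather than the Postgres DDL: the flag is declared as an INTEGER fixed to 1 instead of a bool, and the table and parameter names are assumptions. It covers only the column constraint; deletes are not blocked here, which is why the trigger is still needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE singleton (
        parameter_1 TEXT,
        just_me INTEGER NOT NULL DEFAULT 1 UNIQUE CHECK (just_me = 1)
    );
    INSERT INTO singleton (parameter_1) VALUES ('initial');
""")


def try_sql(sql):
    """Run one statement; return True on success, False if a constraint rejected it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second row with the default flag collides with the UNIQUE constraint.
second_insert = try_sql("INSERT INTO singleton (parameter_1) VALUES ('again')")
# A row with a different flag value is rejected by the CHECK constraint.
other_flag = try_sql("INSERT INTO singleton (parameter_1, just_me) VALUES ('x', 0)")
# Updating the existing row is still allowed.
update_ok = try_sql("UPDATE singleton SET parameter_1 = 'changed'")

print(second_insert, other_flag, update_ok)
```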
But I would rather use a function instead of the table for this purpose. Simpler and cheaper.
CREATE OR REPLACE FUNCTION one_row()
RETURNS TABLE (company_id int, company text) LANGUAGE sql IMMUTABLE AS
$$SELECT 123, 'The Company'$$;
ALTER FUNCTION one_row() OWNER TO postgres;
Set the owner to the user that should be allowed to change it.
Give a user permission to ALTER a function
Nobody else can change it, except superusers of course. Superusers can do anything.
You can use this function just like you would use the table:
SELECT * FROM one_row();
If you need a "table", create a view (which is actually a special table internally):
CREATE VIEW one_row AS SELECT * FROM one_row();
I guess you will not use the PostgreSQL superuser in your application, so you could simply limit your application user's privileges on this table to UPDATE.
An INSERT or DELETE will then cause an Insufficient privilege exception.
I'm having trouble disambiguating names in this Postgres function, which inserts a row selected from the target table itself and returns the newly created unique id:
CREATE OR REPLACE FUNCTION "copyEntry"(OUT "entryId" INTEGER, IN "copyEntryId" INTEGER) RETURNING VOID AS $$
BEGIN
INSERT INTO "entries" ("data") SELECT "data" FROM "entries" WHERE "entryId" = "copyEntryId" RETURNING "entryId" INTO "entryId";
--Other plpgsql code below
END;
$$ LANGUAGE 'plpgsql';
"entryId" INTO "entryId" is ambiguous and I can't seem to find a way to alias the insert table to remove ambiguity. I would like to keep the output parameter to "entryId"
The ambiguity is between the variable and the column. There are two ways of disambiguating variable names: prefixing them based on table and code block names, or renaming them to be unique.
In this case, the parameters are declared in the outermost code block of the function, which is named after the function. So the "entryId" parameter can be referenced as "copyEntry"."entryId". Meanwhile, the column is from the table entries, so can be referenced as entries."entryId".
It may however be more readable to name your variables so that they aren't ambiguous in the first place, perhaps using a prefixing convention, so that your parameter would be "out_entryId".
Name the parameters to the function so you can distinguish them from columns in the table. This is a good programming practice.
Something like this:
CREATE OR REPLACE FUNCTION "copyEntry"(OUT "out_entryId" INTEGER, IN "in_copyEntryId" INTEGER) AS $$
BEGIN
  INSERT INTO "entries" ("data")
  SELECT e."data"
  FROM "entries" e
  WHERE e."entryId" = "in_copyEntryId"
  RETURNING "entryId" INTO "out_entryId";
  --Other plpgsql code below
END;
$$ LANGUAGE plpgsql;
The RETURNING clause should return the value from the row being inserted -- which is presumably some sort of default or auto-incremented value. As the documentation describes:
The optional RETURNING clause causes INSERT to compute and return
value(s) based on each row actually inserted. This is primarily useful
for obtaining values that were supplied by defaults, such as a serial
sequence number. However, any expression using the table's columns is
allowed. The syntax of the RETURNING list is identical to that of the
output list of SELECT.
The problem with your code is most likely that the OUT parameter has the same name as the column ("entryId").