How to combine particular rows in a pl/pgsql function that returns set of a view row type? - sql

I have a view, and I have a function that returns records from this view.
Here is the view definition:
CREATE VIEW ctags(id, name, descr, freq) AS
SELECT tags.conc_id, expressions.name, concepts.descr, tags.freq
FROM tags, concepts, expressions
WHERE concepts.id = tags.conc_id
AND expressions.id = concepts.expr_id;
The view's id column comes from tags.conc_id, which refers to the table concepts, which in turn refers to the table expressions.
Here are the table definitions:
CREATE TABLE expressions(
id serial PRIMARY KEY,
name text,
is_dropped bool DEFAULT FALSE,
rank float(53) DEFAULT 0,
state text DEFAULT 'never edited',
UNIQUE(name)
);
CREATE TABLE concepts(
id serial PRIMARY KEY,
expr_id int NOT NULL,
descr text NOT NULL,
source_id int,
equiv_p_id int,
equiv_r_id int,
equiv_len int,
weight int,
is_dropped bool DEFAULT FALSE,
FOREIGN KEY(expr_id) REFERENCES expressions,
FOREIGN KEY(source_id),
FOREIGN KEY(equiv_p_id) REFERENCES concepts,
FOREIGN KEY(equiv_r_id) REFERENCES concepts,
UNIQUE(id,equiv_p_id),
UNIQUE(id,equiv_r_id)
);
CREATE TABLE tags(
conc_id int NOT NULL,
freq int NOT NULL default 0,
UNIQUE(conc_id, freq)
);
The table expressions is also referenced from my view (ctags).
I want my function to combine rows of the view that have equal values in the column name and that refer to rows of the table concepts with equal values of the column equiv_r_id, so that such rows are combined only once. The combined row should have one of the ids (it doesn't matter which), a descr concatenated from the descr values of the combined rows, and a freq equal to the sum of their freq values. I have no idea how to do this; any help would be appreciated.

Basically, what you describe looks like this:
CREATE FUNCTION f_test()
RETURNS TABLE(min_id int, name text, all_descr text, sum_freq bigint) AS
$x$
SELECT min(t.conc_id) -- AS min_id
,e.name
,string_agg(c.descr, ', ') -- AS all_descr
,sum(t.freq) -- AS sum_freq (sum() over int returns bigint)
FROM tags t
JOIN concepts c ON c.id = t.conc_id -- tags has no id column, so USING (id) cannot work
JOIN expressions e ON e.id = c.expr_id
GROUP BY e.name; -- required because the other columns are aggregates
-- WHERE e.name IS DISTINCT FROM
$x$
LANGUAGE sql;
Major points:
I ignored the view ctags altogether as it is not needed.
So far you could also write this as a view; the function wrapper is not necessary.
You need PostgreSQL 9.0+ for string_agg(). On older versions, substitute:
array_to_string(array_agg(c.descr), ', ')
The only unclear part is this:
and that refer to rows of the table concepts with equal values of the column equiv_r_id so that these rows are combined only once
What column exactly refers to what column in table concepts?
concepts.equiv_r_id equals what exactly?
If you can clarify that part, I might be able to incorporate it into the solution.
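If the intended rule is that rows are merged when they share both expressions.name and concepts.equiv_r_id, one possible reading is to group on both columns. The following is a minimal sketch under that assumption, using a trimmed copy of the question's tables (columns not involved are omitted); it is not a confirmed interpretation of the requirement:

```sql
-- Trimmed copies of the question's tables; only the columns used here.
CREATE TABLE expressions (
    id   serial PRIMARY KEY,
    name text UNIQUE
);
CREATE TABLE concepts (
    id         serial PRIMARY KEY,
    expr_id    int NOT NULL REFERENCES expressions,
    descr      text NOT NULL,
    equiv_r_id int REFERENCES concepts
);
CREATE TABLE tags (
    conc_id int NOT NULL,
    freq    int NOT NULL DEFAULT 0
);

-- Assumption: rows belong together when name and equiv_r_id both match.
SELECT min(t.conc_id)            AS id,     -- any one of the ids
       e.name,
       string_agg(c.descr, ', ') AS descr,  -- concatenated descriptions
       sum(t.freq)               AS freq    -- summed frequencies
FROM   tags t
JOIN   concepts c    ON c.id = t.conc_id
JOIN   expressions e ON e.id = c.expr_id
GROUP  BY e.name, c.equiv_r_id;
```

Note that GROUP BY puts NULLs into the same group, so concepts with the same name and no equiv_r_id would also be merged into one row under this sketch.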

Related

How to efficiently insert ENUM value into table?

Consider the following schema:
CREATE TABLE IF NOT EXISTS snippet_types (
id INTEGER NOT NULL PRIMARY KEY,
name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS snippets (
id INTEGER NOT NULL PRIMARY KEY,
title TEXT,
content TEXT,
type INTEGER NOT NULL,
FOREIGN KEY(type) REFERENCES snippet_types(id)
);
This schema assumes a one-to-many relationship between the tables and allows efficiently maintaining a set of ENUMs in the snippet_types table. The efficiency comes from not storing the whole string describing the snippet type in the snippets table, but this decision also leads to some inconvenience: on insert we first need to retrieve the type id from snippet_types, which means one more select and a check before inserting:
SELECT id FROM snippet_types WHERE name = 'foo';
-- ...check that > 0 rows returned...
INSERT INTO snippets (title, content, type) VALUES ('bar', 'buz', id);
We could also combine the insert and the select into one statement like this:
INSERT INTO snippets (title, content, type)
SELECT 'bar', 'buz', id FROM snippet_types WHERE name = 'foo';
However, if the "foo" type is missing from snippet_types, then 0 rows are inserted and no error is returned, and I don't see a way to get the number of rows SQLite actually inserted.
How can I insert ENUM-containing tuple in one query?
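One possible way to detect the "nothing was inserted" case is SQLite's built-in changes() function, which reports how many rows the most recently completed INSERT, UPDATE, or DELETE on the connection affected. A minimal sketch, reusing the question's schema with made-up sample values ('foo', 'missing'):

```sql
CREATE TABLE IF NOT EXISTS snippet_types (
    id   INTEGER NOT NULL PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS snippets (
    id      INTEGER NOT NULL PRIMARY KEY,
    title   TEXT,
    content TEXT,
    type    INTEGER NOT NULL,
    FOREIGN KEY(type) REFERENCES snippet_types(id)
);

INSERT INTO snippet_types (id, name) VALUES (1, 'foo');

-- The type exists, so the SELECT yields one row to insert.
INSERT INTO snippets (title, content, type)
SELECT 'bar', 'buz', id FROM snippet_types WHERE name = 'foo';
SELECT changes();  -- number of rows the previous INSERT wrote

-- The type does not exist, so the SELECT yields no rows and nothing is inserted.
INSERT INTO snippets (title, content, type)
SELECT 'bar', 'buz', id FROM snippet_types WHERE name = 'missing';
SELECT changes();  -- the caller can treat a count of zero as "unknown type"
```

Most client libraries expose the same counter directly (the C API calls it sqlite3_changes()), so the extra SELECT is often unnecessary in application code.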

How to create a projection from multi table

I have 2 tables as following:
CREATE TABLE public.test_employee
(
index int NOT NULL,
name varchar(100),
date_of_birth date,
address varchar(100),
id_dep int NOT NULL,
CONSTRAINT C_PRIMARY PRIMARY KEY (index) DISABLED
);
CREATE TABLE store.test_department
(
index int NOT NULL,
name varchar(100),
describe varchar(100),
CONSTRAINT C_PRIMARY PRIMARY KEY (index) DISABLED
);
I need to create a projection with many columns from the above two tables. My current code looks like this:
CREATE PROJECTION public.employee_department_super
(
idEmp,
idDep,
empName,
date_of_birth,
address,
depName,
describe
)
AS
SELECT e.index,
e.id_dep,
e.name,
e.date_of_birth,
e.address,
d.name,
d.describe
FROM
public.test_employee e
inner join store.test_department d
on e.id_dep=d.index
ORDER BY e.name
UNSEGMENTED ALL NODES;
But I received an error:
[Code: 9366, SQL State: 0A000] [Vertica][VJDBC](9366) ROLLBACK: Projections must select data from only one table
How can I solve this problem?
The answer is: you can't.
Join projections are a thing of the long-gone past.
Vertica has begun to satisfy the need of reducing joins by the concept of the flattened table.
You add the two columns as flattened columns to your test_employee table, and they are automatically set whenever you insert new rows into the table.
ALTER TABLE public.test_employee
ADD depname VARCHAR(100)
DEFAULT(
SELECT name FROM store.test_department d WHERE d.index=id_dep
);
ALTER TABLE public.test_employee
ADD describe VARCHAR(100)
DEFAULT(
SELECT describe FROM store.test_department d WHERE d.index=id_dep
);
And the two flattened columns do not count against your license size.

How to make sure only one column is not null in postgresql table

I'm trying to setup a table and add some constraints to it. I was planning on using partial indexes to add constraints to create some composite keys, but ran into the problem of handling NULL values. We have a situation where we want to make sure that in a table only one of two columns is populated for a given row, and that the populated value is unique. I'm trying to figure out how to do this, but I'm having a tough time. Perhaps something like this:
CREATE INDEX foo_idx_a ON foo (colA) WHERE colB is NULL
CREATE INDEX foo_idx_b ON foo (colB) WHERE colA is NULL
Would this work? Additionally, is there a good way to expand this to a larger number of columns?
Another way to write this constraint is to use the num_nonnulls() function:
create table table_name
(
a integer,
b integer,
check ( num_nonnulls(a,b) = 1)
);
This is especially useful if you have more columns:
create table table_name
(
a integer,
b integer,
c integer,
d integer,
check ( num_nonnulls(a,b,c,d) = 1)
);
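The question also asks that the populated value be unique. A minimal sketch of how the check could be combined with unique indexes, assuming a hypothetical table foo with the two columns from the question and PostgreSQL 9.6 or later (where num_nonnulls() is available):

```sql
CREATE TABLE foo (
    cola integer,
    colb integer,
    -- exactly one of the two columns must be populated
    CHECK (num_nonnulls(cola, colb) = 1)
);

-- A unique index treats NULLs as distinct by default,
-- so any number of rows may leave a column empty.
CREATE UNIQUE INDEX foo_cola_uniq ON foo (cola);
CREATE UNIQUE INDEX foo_colb_uniq ON foo (colb);

INSERT INTO foo (cola) VALUES (1);  -- accepted
INSERT INTO foo (colb) VALUES (1);  -- accepted: uniqueness is per column

-- Each of these would be rejected if uncommented:
-- INSERT INTO foo (cola, colb) VALUES (2, 2);  -- violates the CHECK (two values)
-- INSERT INTO foo DEFAULT VALUES;              -- violates the CHECK (no value)
-- INSERT INTO foo (cola) VALUES (1);           -- violates foo_cola_uniq
```

If the value must be unique across both columns rather than per column, this sketch is not enough; that would need a different design, for example a single unique index on an expression such as coalesce(cola, colb).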
You can use the following check:
create table table_name
(
a integer,
b integer,
check ((a is null) != (b is null))
);
If there are more columns, you can use the trick with casting boolean to integer:
create table table_name
(
a integer,
b integer,
...
n integer,
check ((a is not null)::integer + (b is not null)::integer + ... + (n is not null)::integer = 1)
);
In this example only one column can be not null (it simply counts not null columns), but you can make it any number.
One can do this with an insert/update trigger or checks, but having to do so indicates it could be done better. Constraints exist to give you certainty about your data so you don't have to be constantly checking if the data is valid. If one or the other is not null, you have to do the checks in your queries.
This is better solved with table inheritance and views.
Let's say you have (American) clients. Some are businesses and some are individuals. Everyone needs a Taxpayer Identification Number, which can be one of several things, such as a Social Security Number or an Employer Identification Number.
create table generic_clients (
id bigserial primary key,
name text not null
);
create table individual_clients (
ssn numeric(9) not null
) inherits(generic_clients);
create table business_clients (
ein numeric(9) not null
) inherits(generic_clients);
SSN and EIN are both Taxpayer Identification Numbers and you can make a view which will treat both the same.
create view clients as
select id, name, ssn as tin from individual_clients
union
select id, name, ein as tin from business_clients;
Now you can query clients.tin or if you specifically want businesses you query business_clients.ein and for individuals individual_clients.ssn. And you can see how the inherited tables can be expanded to accommodate more divergent information between types of clients.

How to insert data from one table into another as PostgreSQL array?

I have the following tables:
CREATE TABLE "User" (
id integer DEFAULT nextval('"User_id_seq"'::regclass) PRIMARY KEY,
name text NOT NULL DEFAULT ''::text,
coinflips boolean[]
);
CREATE TABLE "User_coinflips_COPY" (
"nodeId" integer,
position integer,
value boolean,
id integer DEFAULT nextval('"User_coinflips_COPY_id_seq"'::regclass) PRIMARY KEY
);
I'm now looking for the SQL statement that grabs the value entry from each row in User_coinflips and inserts it as an array into the coinflips column on User.
Any help would be appreciated!
Update
Not sure if it's important, but I just realized a minor mistake in my table definitions above: I replaced User_coinflips with User_coinflips_COPY, since that accurately describes my schema. Just for context, before it looked like this:
CREATE TABLE "User_coinflips" (
"nodeId" integer REFERENCES "User"(id) ON DELETE CASCADE,
position integer,
value boolean NOT NULL,
CONSTRAINT "User_coinflips_pkey" PRIMARY KEY ("nodeId", position)
);
You are looking for an UPDATE rather than an INSERT.
Use a derived table with the aggregated values to join against in the UPDATE statement:
update "User"
set coinflips = t.flips
from (
select "nodeId", array_agg(value order by position) as flips
from "User_coinflips"
group by "nodeId"
) t
where t."nodeId" = "User".id;

Postgresql SET DEFAULT value from another table SQL

I'm writing an SQL script that creates tables. One new table has a column with a FOREIGN KEY, and I need that column to default to the value from the referenced table. For example, consider these two tables:
PERSON(Name,Surename,ID,Age);
EMPLOYER(Name,Surname,Sector,Age);
In EMPLOYER I need Age to default to the Age from PERSON, but only if PERSON has at least one row.
ID is the primary key for PERSON, (Surname, Sector) is the primary key for EMPLOYER, and Age in EMPLOYER is a FOREIGN KEY referencing PERSON.
Example sql :
CREATE TABLE PERSON(
name VARCHAR(30) ,
surename VARCHAR(20),
ID VARCHAR(50) PRIMARY KEY,
Age INT NOT NULL
);
CREATE TABLE EMPLOYER(
name VARCHAR(30) ,
Surename VARCHAR(20),
Sector VARCHAR(20),
Age INT NOT NULL,
PRIMARY KEY (Surename,Sector),
FOREIGN KEY (Age) REFERENCES Person(Age) -- HERE SET DEFAULT Person(Age), how?
);
Setting aside the poor design choices of this exercise, it is possible to set a column's value from a column of another table using a trigger.
Rough working example below:
create table a (
cola int,
colb int) ;
create table b (
colc int,
cold int);
Create or replace function fn()
returns trigger
as $$ begin
if new.cold is null then
new.cold = (select colb from a where cola = new.colc);
end if;
return new;
end;
$$ language plpgsql;
CREATE TRIGGER
fn
BEFORE INSERT ON
b
FOR EACH ROW EXECUTE PROCEDURE
fn();
Use a trigger rather than a default. I have done things like this (useful occasionally for aggregated full text vectors among other things).
You cannot use a default here because you have no access to the current row data. Therefore there is nothing to look up if it is depending on your values currently being saved.
Instead you want to create a BEFORE trigger which sets the value if it is not set, and looks up data. Note that this has a different limitation because DEFAULT looks at the query (was a value specified) while a trigger looks at the value (i.e. what does your current row look like). Consequently a default can be avoided by explicitly passing in a NULL. But a trigger will populate that anyway.
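The difference between the two mechanisms can be sketched as follows. The table names src and dst, the function name dst_fill_v, and the sample values are all made up for illustration; the column v deliberately has both a DEFAULT and a BEFORE INSERT trigger so the two behaviours can be compared side by side:

```sql
CREATE TABLE src (k int PRIMARY KEY, v int);
CREATE TABLE dst (k int, v int DEFAULT 0);

-- Fill in v from src only when the incoming row has no value for it.
CREATE FUNCTION dst_fill_v() RETURNS trigger AS $$
BEGIN
    IF NEW.v IS NULL THEN
        NEW.v := (SELECT s.v FROM src s WHERE s.k = NEW.k);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dst_fill_v
BEFORE INSERT ON dst
FOR EACH ROW EXECUTE PROCEDURE dst_fill_v();

INSERT INTO src (k, v) VALUES (1, 42);

-- v omitted: the DEFAULT is applied first, so the trigger sees 0 and leaves it.
INSERT INTO dst (k) VALUES (1);

-- v explicitly NULL: the DEFAULT is skipped, so the trigger looks the value up in src.
INSERT INTO dst (k, v) VALUES (1, NULL);

SELECT k, v FROM dst;
```

In a real schema you would normally use only one of the two mechanisms for a given column; here they are combined purely to make the ordering visible.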