Is it possible to create two tables with disjoint identifiers? - sql

By "disjoint" I mean mutually exclusive sets of ID values. No overlap between both tables.
For example, the sequence generator for the id column on both tables should work in conjunction to make sure they are always disjoint. I am not sure if this is possible. So, I thought I would just ask here.
Table A

 id | name
----+------
  0 | abc
  1 | cad
  2 | pad
  3 | ial

Table B

 id | name
----+------
 40 | pal
 50 | sal

A different hack-around:
CREATE TABLE odd(
id INTEGER GENERATED ALWAYS AS IDENTITY (START 1 INCREMENT 2)
, val integer
);
CREATE TABLE even(
id INTEGER GENERATED ALWAYS AS IDENTITY (START 2 INCREMENT 2)
, val integer
);
INSERT INTO odd (val)
SELECT GENERATE_SERIES(1,10);
INSERT INTO even (val)
SELECT GENERATE_SERIES(1,20);
SELECT * FROM odd;
SELECT * FROM even;
Result:
CREATE TABLE
CREATE TABLE
INSERT 0 10
INSERT 0 20
id | val
----+-----
1 | 1
3 | 2
5 | 3
7 | 4
9 | 5
11 | 6
13 | 7
15 | 8
17 | 9
19 | 10
(10 rows)
id | val
----+-----
2 | 1
4 | 2
6 | 3
8 | 4
10 | 5
12 | 6
14 | 7
16 | 8
18 | 9
20 | 10
22 | 11
24 | 12
26 | 13
28 | 14
30 | 15
32 | 16
34 | 17
36 | 18
38 | 19
40 | 20
(20 rows)

A very simple way is to share the same SEQUENCE:
CREATE TABLE a (
id serial PRIMARY KEY
, name text
);
CREATE TABLE b (
id int PRIMARY KEY
, name text
);
SELECT pg_get_serial_sequence('a', 'id'); -- 'public.a_id_seq'
ALTER TABLE b ALTER COLUMN id SET DEFAULT nextval('public.a_id_seq'); -- !
db<>fiddle here
This way, table a "owns" the sequence, while table b draws from the same source. You can also create an independent SEQUENCE if you prefer.
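For illustration, a minimal sketch of the independent-sequence variant (the sequence name common_id_seq is my own; the table definitions are repeated for the sketch):
CREATE SEQUENCE common_id_seq;

CREATE TABLE a (
  id int PRIMARY KEY DEFAULT nextval('common_id_seq')
, name text
);
CREATE TABLE b (
  id int PRIMARY KEY DEFAULT nextval('common_id_seq')
, name text
);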
Note: this only guarantees mutually exclusive new IDs (even under concurrent write load) as long as you don't override the default values and don't update the IDs later.
Related:
Creating a PostgreSQL sequence to a field (which is not the ID of the record)
Safely rename tables using serial primary key columns
Auto increment table column
PostgreSQL next value of the sequences?

Welcome to the painful world of inter-table constraints or assertions - this is something that ISO SQL and pretty much every RDBMS out there does not handle ergonomically...
(While ISO SQL does describe both deferred-constraints and database-wide assertions, as far as I know only PostgreSQL implements deferred-constraints, and no production-quality RDBMS supports database-wide assertions).
One approach is to have a third table, which is the only table with a SERIAL (aka IDENTITY, aka AUTO_INCREMENT) column, plus a discriminator column; together they form that table's primary key. The other two tables each have an FK constraint to that PK, so they also need the same discriminator column (pinned down with a CHECK constraint), but you will rarely need to reference that column in queries.
As your post doesn't tell us what the real table-names are, I'll use my own.
Something like this:
CREATE TABLE postIds (
    postId int GENERATED ALWAYS AS IDENTITY,
    postType char(1) NOT NULL, /* Discriminator column: may hold only 'S' or 'G', indicating which table contains the rest of the data */
    CONSTRAINT PK_postIds PRIMARY KEY ( postId, postType ),
    CONSTRAINT CK_type CHECK ( postType IN ( 'S', 'G' ) )
);
CREATE TABLE shitposts (
    postId int NOT NULL,
    postType char(1) NOT NULL DEFAULT 'S',
    foobar varchar(255) NULL,
    etc int NOT NULL,
    CONSTRAINT PK_shitpostIds PRIMARY KEY ( postId, postType ),
    CONSTRAINT CK_type CHECK ( postType = 'S' ),
    CONSTRAINT FK_shitpost_ids FOREIGN KEY ( postId, postType ) REFERENCES postIds ( postId, postType )
);
CREATE TABLE goldposts (
    postId int NOT NULL,
    postType char(1) NOT NULL DEFAULT 'G',
    foobar varchar(255) NULL,
    etc int NOT NULL,
    CONSTRAINT PK_goldpostIds PRIMARY KEY ( postId, postType ),
    CONSTRAINT CK_type CHECK ( postType = 'G' ),
    CONSTRAINT FK_goldpost_ids FOREIGN KEY ( postId, postType ) REFERENCES postIds ( postId, postType )
);
With this design, it is impossible for any row in shitposts to share a postId value with a row in goldposts, and vice-versa.
However, it is still possible for a row to exist in postIds without a corresponding row in either goldposts or shitposts. Fortunately, as you are using PostgreSQL, you could add an FK constraint from postIds to each of goldposts and shitposts and make them deferred constraints.
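To illustrate the insert flow under this design, a sketch in PostgreSQL syntax (the 'some content' and 0 values are made up):
WITH new_id AS (
   INSERT INTO postIds ( postType )
   VALUES ( 'S' )
   RETURNING postId, postType
)
INSERT INTO shitposts ( postId, postType, foobar, etc )
SELECT postId, postType, 'some content', 0  -- reuses the generated postId, so the FK holds
FROM new_id;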


Oracle - fast insert and fast latest records lookup

I have a table with logs which grew in size (~100M records) to the point where even querying the latest entries takes a considerable amount of time.
I am wondering: is there a smart way to make access to the latest records (largest PK values) fast while also keeping inserts (appends) fast? I do not want to delete any data if possible; there is already a mechanism which monthly deletes logs older than N days.
Ideally, what I want is for the query
select * from t_logs order by log_id desc fetch first 50 rows only
to run in a split second (up to a reasonable row count, say 500, if that matters).
The table is defined as follows:
CREATE TABLE t_logs (
log_id NUMBER NOT NULL,
method_name VARCHAR2(128 CHAR) NOT NULL,
msg VARCHAR2(4000 CHAR) NOT NULL,
type VARCHAR2(1 CHAR) NOT NULL,
time_stamp TIMESTAMP(6) NOT NULL,
user_created VARCHAR2(50 CHAR) DEFAULT user NOT NULL
);
CREATE UNIQUE INDEX logs_pk ON t_logs ( log_id ) REVERSE;
ALTER TABLE t_logs ADD (
CONSTRAINT logs_pk PRIMARY KEY ( log_id )
);
I am not really a DBA, so I am not familiar with all the performance tuning methods. I just use the logs a lot and was wondering if I could do something non-invasive to the data to ease my pain. To my knowledge, what I did: tried re-computing statistics / re-analyzing the table (no effect), and looked at the query plan:
-------------------------------------------
| Id | Operation | Name |
-------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW SORT PUSHED RANK| |
| 3 | TABLE ACCESS FULL | T_LOGS |
-------------------------------------------
I would expect the query to leverage the index to perform the lookup; why doesn't it? Maybe this is the reason it takes so long to find the results?
Version: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
Mr Cave, in the accepted answer, seems to be right:
alter table t_logs drop constraint logs_pk;
drop index logs_pk;
create unique index logs_pk on t_logs ( log_id );
alter table t_logs add (
    constraint logs_pk primary key ( log_id )
);
Queries run super fast now, plan looks as expected:
-------------------------------------------------
| Id | Operation | Name |
-------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | VIEW | |
| 2 | WINDOW NOSORT STOPKEY | |
| 3 | TABLE ACCESS BY INDEX ROWID| T_LOGS |
| 4 | INDEX FULL SCAN DESCENDING| LOGS_PK |
-------------------------------------------------
100 million rows isn't that large.
Why are you creating a reverse-key index for your primary key? Sure, that has the potential to reduce contention on inserts but were you really constrained by contention? That would be pretty unusual. Maybe you have an unusual environment. But my guess is that someone was trying to prematurely optimize the design for inserts without considering what that did to queries.
My wager would be that a nice, basic design would be more than sufficient for your needs:
CREATE TABLE t_logs (
log_id NUMBER NOT NULL,
method_name VARCHAR2(128 CHAR) NOT NULL,
msg VARCHAR2(4000 CHAR) NOT NULL,
type VARCHAR2(1 CHAR) NOT NULL,
time_stamp TIMESTAMP(6) NOT NULL,
user_created VARCHAR2(50 CHAR) DEFAULT user NOT NULL
);
CREATE UNIQUE INDEX logs_pk ON t_logs ( log_id );
ALTER TABLE t_logs ADD (
CONSTRAINT logs_pk PRIMARY KEY ( log_id )
);
If you can't recreate the primary key for some reason, create an index on time_stamp and change your queries to use that:
CREATE INDEX log_ts ON t_logs( time_stamp );
SELECT *
FROM t_logs
ORDER BY time_stamp DESC
FETCH FIRST 100 ROWS ONLY;

SQL N:M query merging results by condition flag in intermediate table

[First of all, if this is a duplicate, sorry; I couldn't find an answer for this, as it is a strange workaround for a limitation of an ORM, and I'm clearly a newbie at SQL.]
Domain requirements:
A brigade must be composed of one user (the commissar) and, optionally, one and only one assistant (1:1)
A user can only be part of one brigade (1:1)
CREATE TABLE Users
(
id SERIAL PRIMARY KEY,
username VARCHAR(100) NOT NULL UNIQUE,
password VARCHAR(100) NOT NULL
);
CREATE TABLE Brigades
(
id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL
);
-- N:M relationship with a flag inside which determine if that user is a commissar or not
CREATE TABLE Brigade_User
(
brigade_id INT NOT NULL REFERENCES Brigades(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
user_id INT NOT NULL REFERENCES Users(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
is_commissar BOOLEAN NOT NULL,
PRIMARY KEY(brigade_id, user_id)
);
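(As an aside: the schema above does not enforce the two 1:1 rules by itself. In PostgreSQL they could be covered with unique and partial unique indexes; a sketch, not part of the original schema:)
-- each user may appear in at most one brigade
CREATE UNIQUE INDEX one_brigade_per_user ON Brigade_User (user_id);
-- each brigade may have at most one commissar and at most one assistant
CREATE UNIQUE INDEX one_commissar_per_brigade ON Brigade_User (brigade_id) WHERE is_commissar;
CREATE UNIQUE INDEX one_assistant_per_brigade ON Brigade_User (brigade_id) WHERE NOT is_commissar;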
Ideally, as the relations are 1:1, the Brigade_User intermediate table could be dropped and a Brigades table with two foreign keys created instead (this is not supported by the Diesel Rust ORM, so I think I'm stuck with the first approach):
CREATE TABLE Brigades
(
    id SERIAL PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    -- 1:1
    commissar_id INT NOT NULL REFERENCES Users(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    -- 1:1, optional (hence nullable)
    assistant_id INT REFERENCES Users(id)
        ON DELETE CASCADE
        ON UPDATE CASCADE
);
An example...
> SELECT * FROM brigade_user LEFT JOIN brigades ON brigade_user.brigade_id = brigades.id;
brigade_id | user_id | is_commissar | id | name
------------+---------+--------------+----+------------------
1 | 1 | t | 1 | Patrulla gatuna
1 | 2 | f | 1 | Patrulla gatuna
2 | 3 | t | 2 | Patrulla perruna
2 | 4 | f | 2 | Patrulla perruna
3 | 6 | t | 3 | Patrulla canina
3 | 5 | f | 3 | Patrulla canina
(6 rows)
Is it possible to make a query which returns a table like this?
brigade_id | commissar_id | assistant_id | name
-----------+--------------+--------------+--------------------
1 | 1 | 2 | Patrulla gatuna
2 | 3 | 4 | Patrulla perruna
3 | 6 | 5 | Patrulla canina
See that each pair of rows has been merged into one (remember, a brigade is composed of one commissar and, optionally, one assistant) depending on the flag.
Could this model be improved (keeping in mind the limitation on multiple foreign keys referencing the same table, discussed here)?
Try the following:
WITH cte AS (
    SELECT A.brigade_id, A.user_id, A.is_commissar, B.name
    FROM brigade_user A
    LEFT JOIN brigades B ON A.brigade_id = B.id
)
SELECT C1.brigade_id, C1.user_id AS commissar_id, C2.user_id AS assistant_id, C1.name
FROM cte C1
LEFT JOIN cte C2 ON C1.brigade_id = C2.brigade_id
                AND C1.user_id <> C2.user_id
WHERE C1.is_commissar = true;
See a demo from here.
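Alternatively, a single pass with conditional aggregation should give the same result; a sketch using PostgreSQL's FILTER clause, assuming at most one commissar and one assistant per brigade, as the rules above state:
SELECT b.id AS brigade_id
     , min(bu.user_id) FILTER (WHERE bu.is_commissar)     AS commissar_id
     , min(bu.user_id) FILTER (WHERE NOT bu.is_commissar) AS assistant_id
     , b.name
FROM brigades b
LEFT JOIN brigade_user bu ON bu.brigade_id = b.id
GROUP BY b.id, b.name;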

Error "duplicate key value violates unique constraint" while updating multiple rows

I created a table in PostgreSQL and Oracle as
CREATE TABLE temp(
seqnr smallint NOT NULL,
defn_id int not null,
attr_id int not null,
input CHAR(50) NOT NULL,
CONSTRAINT pk_id PRIMARY KEY (defn_id, attr_id, seqnr)
);
The primary key of temp is the composite (defn_id, attr_id, seqnr) as a whole!
Then I inserted the record in the temp table as
INSERT INTO temp(seqnr,defn_id,attr_id,input)
VALUES (1,100,100,'test1');
INSERT INTO temp(seqnr,defn_id,attr_id,input)
VALUES (2,100,100,'test2');
INSERT INTO temp(seqnr,defn_id,attr_id,input)
VALUES (3,100,100,'test3');
INSERT INTO temp(seqnr,defn_id,attr_id,input)
VALUES (4,100,100,'test4');
INSERT INTO temp(seqnr,defn_id,attr_id,input)
VALUES (5,100,100,'test5');
in both Oracle and Postgres!
The table now contains:
seqnr | defn_id | attr_id | input
------+---------+---------+-------
    1 |     100 |     100 | test1
    2 |     100 |     100 | test2
    3 |     100 |     100 | test3
    4 |     100 |     100 | test4
    5 |     100 |     100 | test5
When I run the command:
UPDATE temp SET seqnr=seqnr+1
WHERE defn_id = 100 AND attr_id = 100 AND seqnr >= 1;
In Oracle, it updates 5 rows and the output is:
seqnr | defn_id | attr_id | input
------+---------+---------+-------
    2 |     100 |     100 | test1
    3 |     100 |     100 | test2
    4 |     100 |     100 | test3
    5 |     100 |     100 | test4
    6 |     100 |     100 | test5
But PostgreSQL raises an error:
ERROR:  duplicate key value violates unique constraint "pk_id"
DETAIL:  Key (defn_id, attr_id, seqnr)=(100, 100, 2) already exists.
Why does this happen, and how can I replicate the Oracle behavior in Postgres?
Or: how can the same result be achieved in Postgres without any errors?
UNIQUE and PRIMARY KEY constraints are checked immediately (for each row) unless they are defined DEFERRABLE, which is the solution you are asking for.
ALTER TABLE temp
DROP CONSTRAINT pk_id
, ADD CONSTRAINT pk_id PRIMARY KEY (defn_id, attr_id, seqnr) DEFERRABLE
;
Then your UPDATE just works.
db<>fiddle here
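If you ever need to postpone the check all the way to commit time (rather than to the end of the statement), you can do that per transaction; a sketch, assuming the DEFERRABLE PK from above:
BEGIN;
SET CONSTRAINTS pk_id DEFERRED;

UPDATE temp SET seqnr = seqnr + 1
WHERE  defn_id = 100 AND attr_id = 100 AND seqnr >= 1;

COMMIT;  -- uniqueness is verified here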
This comes at a cost, though. The manual:
Note that deferrable constraints cannot be used as conflict
arbitrators in an INSERT statement that includes an ON CONFLICT DO UPDATE clause.
And for FOREIGN KEY constraints:
The referenced columns must be the columns of a non-deferrable unique
or primary key constraint in the referenced table.
And:
When a UNIQUE or PRIMARY KEY constraint is not deferrable,
PostgreSQL checks for uniqueness immediately whenever a row is
inserted or modified. The SQL standard says that uniqueness should be
enforced only at the end of the statement; this makes a difference
when, for example, a single command updates multiple key values. To
obtain standard-compliant behavior, declare the constraint as
DEFERRABLE but not deferred (i.e., INITIALLY IMMEDIATE). Be aware
that this can be significantly slower than immediate uniqueness
checking.
See:
Constraint defined DEFERRABLE INITIALLY IMMEDIATE is still DEFERRED?
I would avoid a DEFERRABLE PK if at all possible. Maybe you can work around the demonstrated problem? This usually works:
UPDATE temp t
SET seqnr = t.seqnr + 1
FROM (
SELECT defn_id, attr_id, seqnr
FROM temp
WHERE defn_id = 100 AND attr_id = 100 AND seqnr >= 1
ORDER BY defn_id, attr_id, seqnr DESC
) o
WHERE (t.defn_id, t.attr_id, t.seqnr)
= (o.defn_id, o.attr_id, o.seqnr);
db<>fiddle here
But there are no guarantees as ORDER BY is not specified for UPDATE in Postgres.
Related:
UPDATE with ORDER BY

Finding all entries with no new reference in another table within last two years

I have the following three tables:
CREATE TABLE groups (  -- "group" is a reserved word, hence "groups"
id SERIAL PRIMARY KEY,
name VARCHAR NOT NULL,
insert_date TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE TABLE customer (
id SERIAL PRIMARY KEY,
ext_id VARCHAR NOT NULL,
insert_date TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE TABLE customer_in_group (
id SERIAL PRIMARY KEY,
customer_id INT NOT NULL,
group_id INT NOT NULL,
insert_date TIMESTAMP WITH TIME ZONE NOT NULL,
CONSTRAINT customer_id_fk
FOREIGN KEY(customer_id)
REFERENCES customer(id),
CONSTRAINT group_id_fk
FOREIGN KEY(group_id)
REFERENCES groups(id)
);
I need to find all of the groups which no customer_in_group row has referenced (via its group_id column) within the last two years. I then plan to delete all of the customer_in_group rows that reference them, and finally delete those groups.
So basically, given the following two groups and the following three customer_in_group rows:
Group
| id | name | insert_date |
|----|--------|--------------------------|
| 1 | group1 | 2011-10-05T14:48:00.000Z |
| 2 | group2 | 2011-10-05T14:48:00.000Z |
Customer In Group
| id | group_id | customer_id | insert_date |
|----|----------|-------------|--------------------------|
| 1 | 1 | 1 | 2011-10-05T14:48:00.000Z |
| 2 | 1 | 1 | 2020-10-05T14:48:00.000Z |
| 3 | 2 | 1 | 2011-10-05T14:48:00.000Z |
I would expect to get back just group2, since group1 has a customer_in_group row referencing it that was inserted within the last two years.
I am not sure how I would write the query that finds all of these groups.
As a starter, I would recommend enabling ON DELETE CASCADE on the foreign keys of customer_in_group.
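For example, a sketch reusing the constraint name from your DDL:
ALTER TABLE customer_in_group
    DROP CONSTRAINT group_id_fk
,   ADD CONSTRAINT group_id_fk
        FOREIGN KEY (group_id) REFERENCES groups(id) ON DELETE CASCADE;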
Then, you can just delete the rows you want from groups, and it will drop the dependent rows in the child table. For this, you can use not exists:
delete from groups g
where not exists (
    select 1
    from customer_in_group cig
    where cig.group_id = g.id
      and cig.insert_date >= now() - interval '2 year'
);
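If you want to inspect the groups before deleting anything, the same NOT EXISTS predicate works in a plain SELECT:
select g.*
from groups g
where not exists (
    select 1
    from customer_in_group cig
    where cig.group_id = g.id
      and cig.insert_date >= now() - interval '2 year'
);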

Snowflake: create a default field value that auto increments for each primary key, resets per primary key

I would like to create a table to house the following type of data
+--------+-----+----------+
| pk | ctr | name |
+--------+-----+----------+
| fish | 1 | herring |
| mammal | 1 | dog |
| mammal | 2 | cat |
| mammal | 3 | whale |
| bird | 1 | penguin |
| bird | 2 | ostrich |
+--------+-----+----------+
pk is the primary key: string(100), not null.
ctr is a field I want to auto-increment by 1 for each row within the same pk.
I have tried the following:
create or replace table schema.animals (
    pk string(100) not null primary key,
    ctr integer not null default ( select NVL(max(ctr),0) + 1 from schema.animals ),
    name string(1000) not null);
This produced the following error
SQL compilation error: error line 6 at position 52 aggregate functions
are not allowed as part of the specification of a default value
clause.
So I would have used the AUTOINCREMENT / IDENTITY property, like so:
AUTOINCREMENT | IDENTITY [ ( start_num , step_num ) | START num INCREMENT num ]
but it doesn't seem to support resetting per unique pk.
Looking for any suggestions on how to solve this; thanks for any help in advance.
You cannot do this with an IDENTITY method. The suggested solution is to use an INSTEAD OF trigger that calculates the ctr value for every row of the inserted table. For example (SQL Server syntax):
CREATE TABLE dbo.animals (
pk nvarchar(100) NOT NULL,
ctr integer NOT NULL,
name nvarchar(1000) NOT NULL,
CONSTRAINT PK_animals PRIMARY KEY (pk, ctr)
)
GO
CREATE TRIGGER dbo.animals_before_insert ON dbo.animals INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO animals (pk, ctr, name)
SELECT
i.pk,
(ROW_NUMBER() OVER (PARTITION BY i.pk ORDER BY i.name) + ISNULL(a.max_ctr, 0)) AS ctr,
i.name
FROM inserted i
LEFT JOIN (SELECT pk, MAX(ctr) AS max_ctr FROM dbo.animals GROUP BY pk) a
ON i.pk = a.pk;
END
GO
INSERT INTO dbo.animals (pk, name) VALUES
('fish' , 'herring'),
('mammal' , 'dog'),
('mammal' , 'cat'),
('mammal' , 'whale'),
('bird' , 'penguin'),
('bird' , 'ostrich');
SELECT * FROM dbo.animals;
Result
pk      ctr   name
------- ----- ---------
bird    1     ostrich
bird    2     penguin
fish    1     herring
mammal  1     cat
mammal  2     dog
mammal  3     whale
Another method is to use a scalar user-defined function as the DEFAULT value, but it is slow: the trigger fires once for all rows, whereas the function is called for every row.
I have no idea why you would have a column called pk that is not the primary key. You cannot (easily) do what you want. I would recommend handling it like this:
create or replace table schema.animals (
    animal_id int identity primary key,
    name string(100) not null
);
create view schema.v_animals as
select a.*, row_number() over (partition by name order by animal_id) as ctr
from schema.animals a;
That is, calculate ctr when you need to use it, rather than storing it in the table.
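For instance, a sketch against the view above (here name plays the role of the question's pk, and the sample categories are taken from the question):
insert into schema.animals (name) values
    ('fish'), ('mammal'), ('mammal'), ('mammal'), ('bird'), ('bird');

-- ctr is derived on the fly by the view
select name, ctr from schema.v_animals order by name, ctr;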