What is the best way to enforce key uniqueness in a temporal table (Oracle DBMS)? A temporal table is one in which every historical state is recorded together with its validity time-span.
For example, we have a Key --> Value association like this ...
create table TEMPORAL_VALUES
(KEY1 varchar2(99) not null,
VALUE1 varchar2(99),
START_PERIOD date not null,
END_PERIOD date not null);
There are two constraints to enforce, both arising from the temporal nature of the table:
For each record we must have END_PERIOD > START_PERIOD. This is the period for which the Key->Value map is valid.
For each Key, there can't be any overlapping periods. The period includes the moment of the START_PERIOD, but excludes the exact moment of the END_PERIOD.
Constraint enforcement could be done either on row insert/update, or on commit. I don't really care, as long as it is impossible to commit invalid data.
I've been informed that the best practice to enforce constraints like this is to use materialized views instead of triggers.
Please advise: what is the best way to achieve this?
The Oracle banner is ...
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bi
What I have tried so far
I think this solution is close, but it doesn't really work because the refresh needs to happen on commit, and Oracle doesn't seem capable of creating an on-commit materialized view of this complexity.
create materialized view OVERLAPPING_VALUES
nologging cache build immediate
refresh complete on demand
as select 'Wrong!'
from
(
select KEY1, END_PERIOD,
lead( START_PERIOD, 1) over (partition by KEY1 order by START_PERIOD) as NEXT_START
from TEMPORAL_VALUES
)
where NEXT_START < END_PERIOD;
alter table OVERLAPPING_VALUES add CHECK( 0 = 1 );
What am I doing wrong? How do I get this to work on commit to prevent invalid rows in TEMPORAL_VALUES?
After some struggling, experimentation and guidance from this forum post, here is what I ended up with:
drop table TEMPORAL_VALUE;
create table TEMPORAL_VALUE
(KEY1 varchar2(99) not null,
VALUE1 varchar2(99),
START_PERIOD date not null,
END_PERIOD date
)
/
alter table TEMPORAL_VALUE add
constraint CHECK_PERIOD check ( END_PERIOD is null or END_PERIOD > START_PERIOD)
/
alter table TEMPORAL_VALUE add
constraint PK_TEMPORAL_VALUE primary key (KEY1, START_PERIOD)
/
alter table TEMPORAL_VALUE add
constraint UNIQUE_END_PERIOD unique (KEY1, END_PERIOD)
/
create materialized view log on TEMPORAL_VALUE with rowid;
drop materialized view OVERLAPPING_VALUES;
create materialized view OVERLAPPING_VALUES
build immediate refresh fast on commit as
select a.rowid a_rowid, b.rowid b_rowid
from TEMPORAL_VALUE a, TEMPORAL_VALUE b
where a.KEY1 = b.KEY1
and a.rowid <> b.rowid
and a.START_PERIOD <= b.START_PERIOD
and (a.END_PERIOD is null or (a.END_PERIOD > b.START_PERIOD));
alter table OVERLAPPING_VALUES add CHECK( 0 = 1 );
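To see the constraint in action, a quick session like the one below (with made-up data) should behave as follows: the first commit succeeds, while the second is rejected because the on-commit refresh puts a row into OVERLAPPING_VALUES and its CHECK(0=1) constraint is violated (Oracle reports this as a refresh error wrapping the check-constraint violation).

```sql
-- Non-overlapping history for key 'A': commits cleanly.
insert into TEMPORAL_VALUE values ('A', 'first',  date '2020-01-01', date '2020-02-01');
insert into TEMPORAL_VALUE values ('A', 'second', date '2020-02-01', null);
commit;

-- Overlaps [2020-01-01, 2020-02-01), so this commit is rolled back.
insert into TEMPORAL_VALUE values ('A', 'oops', date '2020-01-15', date '2020-01-20');
commit;
```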
Why does this work?
Why does this work, but my original posted view ...
select KEY1, END_PERIOD,
lead( START_PERIOD, 1) over (partition by KEY1 order by START_PERIOD) as NEXT_START
from TEMPORAL_VALUES
... will not be accepted as an on-commit materialized view? The answer is that there appear to be limits on the complexity of on-commit materialized views: the view must include the rowids or keys of the underlying table, and must not exceed some threshold of complexity.
There is a technique I've seen described for SQL Server (see this article and search for "Kuznetsov's History Table") which adds a third time column, previous_end_period that you can use to establish a foreign key on the table itself to enforce the constraint that the intervals can't overlap. I don't know if this can be adapted to Oracle.
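For reference, a rough sketch of how that technique might look if ported to Oracle. Everything below is illustrative (table and constraint names are invented, and it is untested), and note that Kuznetsov's design enforces a contiguous history (no gaps as well as no overlaps), which is stronger than the original requirement:

```sql
-- Illustrative adaptation of Kuznetsov's history table to Oracle syntax.
create table TEMPORAL_VALUES_K
( KEY1            varchar2(99) not null,
  START_PERIOD    date         not null,
  END_PERIOD      date         not null,
  PREV_END_PERIOD date,  -- null only for the first period of each key
  VALUE1          varchar2(99),
  -- periods must be non-empty
  constraint TVK_CK_ORDER check (END_PERIOD > START_PERIOD),
  -- each period must begin exactly where the previous one ended
  -- (trivially satisfied when PREV_END_PERIOD is null)
  constraint TVK_CK_CHAIN check (PREV_END_PERIOD = START_PERIOD),
  constraint TVK_PK primary key (KEY1, START_PERIOD),
  constraint TVK_UQ unique (KEY1, END_PERIOD),
  -- the previous period must actually exist for this key
  constraint TVK_FK_PREV foreign key (KEY1, PREV_END_PERIOD)
      references TEMPORAL_VALUES_K (KEY1, END_PERIOD)
);
```

Because the self-referencing foreign key forces every non-first period to start exactly at an existing period's end for the same key, overlaps become impossible without any trigger or materialized view.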
Nice solution Sean!
But I would add comments to your objects due to the complexity… something like:
COMMENT ON COLUMN TEMPORAL_VALUE.KEY1 IS 'Each key may have at most one value for any instant in time';
COMMENT ON COLUMN TEMPORAL_VALUE.START_PERIOD IS 'The period described includes the START_PERIOD date/time';
COMMENT ON COLUMN TEMPORAL_VALUE.END_PERIOD IS 'The period described does not include the END_PERIOD date/time. A null end period means until forever';
COMMENT ON TABLE TEMPORAL_VALUE IS 'Integrity is enforced by the MATERIALIZED VIEW OVERLAPPING_VALUES';
COMMENT ON MATERIALIZED VIEW OVERLAPPING_VALUES IS 'Used to enforce the rule - each key may have at most one value for any instant in time. This is an [on commit] MV that holds any temporal values that overlap another (for the same key); the CHECK(0=1) constraint will raise an exception if any rows are found, stopping any commit that would break integrity';
I personally like to prefix all materialized view names with MV_ and views with V_
Interesting that you don't allow START_PERIOD to be null. Most implementations would allow a null start with a non-null end to specify the period covering everything before, and null values for both dates to indicate a constant value for the key.
Related
I have an existing table in a Postgres DB. For the sake of demonstration, this is what it looks like:
create table myTable(
forDate date not null,
key2 int not null,
value int not null,
primary key (forDate, key2)
);
insert into myTable (forDate, key2, value) values
('2000-01-01', 1, 1),
('2000-01-01', 2, 1),
('2000-01-15', 1, 3),
('2000-03-02', 1, 19),
('2000-03-30', 15, 8),
('2011-12-15', 1, 11);
However in contrast to these few values, myTable is actually HUGE and it is growing continuously. I am generating various reports from this table, but currently 98% of my reports work with a single month and the remaining queries work with an even shorter timeframe. Oftentimes my queries cause Postgres to do table scans over this huge table and I am looking for ways to reduce the problem. Table partitioning seems to fit my problem perfectly. I could just partition my table into months. But how do I turn my existing table into a partitioned table? The manual explicitly states:
It is not possible to turn a regular table into a partitioned table or vice versa
So I need to develop my own migration script, which will analyze the current table and migrate it. The needs are as follows:
At design time the time frame which myTable covers is unknown.
Each partition should cover one month from the first day of that month to the last day of that month.
The table will grow indefinitely, so I have no sane "stop value" for how many tables to generate.
The result should be as transparent as possible, meaning that I want to touch as little of my existing code as possible. In the best case this feels like a normal table which I can insert into and select from without anything special.
Database downtime for the migration is acceptable.
Getting along with pure Postgres, without any plugins or other things that need to be installed on the server, is highly preferred.
Database is PostgreSQL 10; upgrading to a newer version will happen sooner or later anyway, so this is an option if it helps.
How can I migrate my table to be partitioned?
In Postgres 10 "Declarative Partitioning" was introduced, which can relieve you of a good deal of work such as generating triggers or rules with huge if/else statements redirecting to the correct table. Postgres can do this automatically now. Let's start with the migration:
Rename the old table and create a new partitioned table
alter table myTable rename to myTable_old;
create table myTable_master(
forDate date not null,
key2 int not null,
value int not null
) partition by range (forDate);
This should hardly require any explanation. The old table is renamed (after data migration we'll delete it), and we get a master table for our partitions, which is basically the same as our original table but without indexes.
Create a function that can generate new partitions as we need them:
create function createPartitionIfNotExists(forDate date) returns void
as $body$
declare monthStart date := date_trunc('month', forDate);
declare monthEndExclusive date := monthStart + interval '1 month';
-- We infer the name of the table from the date that it should contain
-- E.g. a date in June 2005 should be in the table mytable_200506:
declare tableName text := 'mytable_' || to_char(forDate, 'YYYYmm');
begin
-- Check if the table we need for the supplied date exists.
-- If it does not exist...:
if to_regclass(tableName) is null then
-- Generate a new table that acts as a partition for mytable:
execute format('create table %I partition of myTable_master for values from (%L) to (%L)', tableName, monthStart, monthEndExclusive);
-- Unfortunately Postgres forces us to define the index for each partition individually:
execute format('create unique index on %I (forDate, key2)', tableName);
end if;
end;
$body$ language plpgsql;
This will come in handy later.
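For instance, calling the function directly (with a hypothetical date) creates the partition on first use and is a no-op afterwards:

```sql
-- Creates mytable_200506 if absent; does nothing if it already exists.
select createPartitionIfNotExists('2005-06-15');
```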
Create a view that basically just delegates to our master table:
create or replace view myTable as select * from myTable_master;
Create a rule so that whenever we insert into the view, we not only insert into our partitioned table but also create a new partition if needed:
create or replace rule autoCall_createPartitionIfNotExists as on insert
to myTable
do instead (
select createPartitionIfNotExists(NEW.forDate);
insert into myTable_master (forDate, key2, value) values (NEW.forDate, NEW.key2, NEW.value)
);
Of course, if you also need update and delete, you need rules for those as well, which should be straightforward.
Actually migrate the old table:
-- Finally copy the data to our new partitioned table
insert into myTable (forDate, key2, value) select * from myTable_old;
-- And get rid of the old table
drop table myTable_old;
Now the migration of the table is complete without any need to know how many partitions are needed, and the view myTable is absolutely transparent. You can simply insert into and select from that table as before, but you may get the performance benefit of partitioning.
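As a quick smoke test (values are made up), inserting through the view now transparently creates the partition and routes the row:

```sql
-- Goes through the rule: creates mytable_202107 if needed, then inserts.
insert into myTable (forDate, key2, value) values ('2021-07-04', 42, 7);

-- Reads transparently across all partitions via the view.
select * from myTable where forDate = '2021-07-04';
```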
Note that the view is only needed because a partitioned table cannot have row triggers. If you can get along with calling createPartitionIfNotExists manually from your code whenever needed, you do not need the view and all its rules. In that case you need to also add the partitions manually during migration:
do
$$
declare rec record;
begin
-- Loop through all months that exist so far...
for rec in select distinct date_trunc('month', forDate)::date yearmonth from myTable_old loop
-- ... and create a partition for them
perform createPartitionIfNotExists(rec.yearmonth);
end loop;
end
$$;
A suggestion: use a view for your main table access; do the steps mentioned above to create the new partitioned table; once finished, point the view at the new partitioned table, then do the migration, and finally deprecate the old table.
My Tables are:
CREATE TABLE member
(
svn INTEGER,
campid INTEGER,
tentname VARCHAR(4),
CONSTRAINT member_fk_svn FOREIGN KEY (svn) REFERENCES people,
CONSTRAINT member_fk_campid FOREIGN KEY (campid) REFERENCES camp ON
DELETE CASCADE,
CONSTRAINT member_pk PRIMARY KEY (svn, campid),
CONSTRAINT member_fk_tentname FOREIGN KEY (tentname) REFERENCES tent,
CONSTRAINT check_teilnehmer_zelt CHECK (COUNT(zeltname) OVER (PARTITION BY zeltname, lagerid) <= zelt.schlafplaetze)
);
With the last constraint, I want to check that no more members are assigned to a tent than its capacity allows.
Thank you in advance for your help.
This would require an SQL assertion, which is not currently supported by Oracle (or indeed any mainstream DBMS). However, Oracle are considering adding support for these in the future (please upvote that idea!).
Solution using a Materialized View
Currently you may be able to implement this constraint using a materialized view (MV) with a check constraint - something I blogged about many years ago. In your case the materialized view query would be something like:
select t.tent_id
from tents t, members m
where m.tent_id = t.tent_id
group by t.tent_id, t.capacity
having sum(m.num_members) > t.capacity;
The check constraint could be:
check (t.tent_id is null)
The check constraint is violated by any row returned by the materialized view, which ensures that the MV is always empty, i.e. no tents exist that are over capacity.
Notes:
I deliberately did not use ANSI join syntax, since MVs don't tend to like it (the same join may be permitted in the old syntax but not in ANSI syntax). Of course, feel free to try ANSI first.
I haven't confirmed that this particular query is permitted in an MV with REFRESH COMPLETE ON COMMIT. The rules on what can and cannot be used vary from version to version of Oracle.
Watch out for the performance impact of maintaining the MV.
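Putting the pieces together, the whole MV approach might look something like the following. This is only a sketch: the tents/members tables and their columns (tent_id, capacity, num_members) are assumed, and the exact refresh options that are permitted depend on your Oracle version. Note that capacity has to appear in the GROUP BY for the HAVING clause to be legal:

```sql
-- Hypothetical schema; adjust names to your own tables.
create materialized view TENTS_OVER_CAPACITY
  build immediate
  refresh complete on commit
as
select t.tent_id
from   tents t, members m
where  m.tent_id = t.tent_id
group by t.tent_id, t.capacity
having sum(m.num_members) > t.capacity;

-- The MV only ever contains over-capacity tents, so this constraint
-- (never satisfiable for a real row) makes any such commit fail:
alter table TENTS_OVER_CAPACITY
  add constraint TENTS_OVER_CAPACITY_CHK check (tent_id is null);
```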
Alternative solution using a trigger
Another way would be to add a column total_members to the tents table, and use a trigger on members to maintain it, e.g.:
create trigger members_trg
after insert or delete or update of num_members on members
for each row
declare
l_total_members tents.total_members%type;
begin
select total_members
into l_total_members
from tents
where tent_id = nvl(:new.tent_id,:old.tent_id)
for update of total_members;
if inserting then
l_total_members := l_total_members + :new.num_members;
elsif deleting then
l_total_members := l_total_members - :old.num_members;
elsif updating then
l_total_members := l_total_members - :old.num_members + :new.num_members;
end if;
update tents
set total_members = l_total_members
where tent_id = nvl(:new.tent_id,:old.tent_id);
end;
Then just add the check constraint:
alter table tents add constraint tents_chk
check (total_members <= capacity);
By maintaining the total in the tents table, this solution serializes transactions and thus avoids the data corruption you will get with other trigger-based solutions in multi-user environments.
No, it is not. From the documentation:
The search condition must always return the same value if applied to
the same values. Thus, it cannot contain any of the following:
* Dynamic parameters (?)
* Date/Time Functions (CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP)
* Subqueries
* User Functions (such as USER, SESSION_USER, CURRENT_USER)
I want to restrict insertion in my table based on some condition.
My table is like
col1   col2   date_create
A      1      04/05/2016
B      2      04/06/2016
A      3      04/08/2016   -- do not allow insert
A      4      04/10/2016   -- allow insert
So I want to restrict insert based on the number of days the same record was inserted earlier.
As shown in the example above, A can be inserted again only after 4 days have passed since its previous insertion, not before.
Any pointers how I can do this in SQL/Oracle.
You only want to insert when no record exists with the same col1 and a too-recent creation date:
insert into mytable (col1, col2, date_create)
select col1, col2, date_create
from (select 'B' as col1, 4 as col2, trunc(sysdate) as date_create from dual) ins
where not exists
(
select *
from mytable other
where other.col1 = ins.col1
and other.date_create > ins.date_create - 4
);
Thus, an undesired record would simply not be inserted - but no exception would be raised either. If you want an error, I'd suggest a PL/SQL block or a before-insert trigger.
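A minimal sketch of that trigger variant, assuming the same mytable columns as above. Note that a row-level trigger querying its own table is only safe here for single-row INSERT ... VALUES statements; a multi-row INSERT ... SELECT would raise the mutating-table error ORA-04091:

```sql
create or replace trigger mytable_min_gap_trg
before insert on mytable
for each row
declare
  l_recent integer;
begin
  -- Count rows for the same col1 created less than 4 days before this one.
  select count(*)
  into   l_recent
  from   mytable t
  where  t.col1 = :new.col1
  and    t.date_create > :new.date_create - 4;

  if l_recent > 0 then
    raise_application_error(-20001,
      'A row for ' || :new.col1 || ' was created less than 4 days ago');
  end if;
end;
/
```

As with most trigger-based checks, this does not protect against two concurrent, not-yet-committed inserts; the materialized view approach below handles that case.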
If several processes write to your table simultaneously with possibly conflicting data, the database itself should do the job.
This can be solved by defining a constraint to check if there already exists an entry with the same col1 value younger than four days.
As far as I know, it is not possible to define such a constraint directly. Instead, define a materialized view and add a constraint on this view.
create materialized view mytable_mv refresh on commit as
select f2.col1, f2.date_create, f1.date_create as date_create_conflict
from mytable f2, mytable f1
where f2.col1 = f1.col1
and f2.date_create > f1.date_create
and f2.date_create - f1.date_create < 4;
This materialized view will contain an entry, if and only if a conflict exists.
Now define a constraint on this view:
alter table mytable_mv add constraint check_date_create
check (date_create = date_create_conflict) deferrable;
It is checked when the current transaction is committed (because that is when the materialized view is refreshed, as declared above with refresh on commit).
This works fine if you insert into your table mytable in an autonomous transaction, e.g. for a logging table.
In other cases, you can force the refresh on the materialized view by dbms_mview.refresh('mytable_mv') or use another option than refresh on commit.
I have problems using a CHECK constraint. I have two tables
Users Table
userid | register_date
Activity table
id | userid | activity_date
I need to put a constraint that disallows the insertion of an activity_date which is less than a register_date. I could do it with a CHECK constraint if they were in same table. But, how do you do it for two different tables? (also Oracle disallows sub-queries in a check constraint).
Is there any other way to perform this action?
The simplest way is to have a trigger:
create or replace trigger tr_activity
before insert or update of activity_date on activity
for each row
declare
l_register_date users.register_date%type;
begin
select register_date into l_register_date
from users
where userid = :new.userid
;
if :new.activity_date < l_register_date then
raise_application_error(-20000, 'Stop attempting the impossible');
end if;
end tr_activity;
/
But this seems a little strange; I would assume that you only ever insert the current date as the activity date, which means the registration date will always be before the activity date unless it has been updated afterwards. I would simply ensure that the activity date is never inserted or updated by your application code and use a default value in the table:
alter table activity modify activity_date default sysdate not null
I have an SQL table which is basically a statement.
Now let's say the records in my table have a date and an identity column which is autonumbered and defines the order in which the transactions are displayed to the client in the front end.
The issue is that during an insert some of the data went missing, so some transactions between two dates are absent.
I need to insert the missing data into the table between the existing dates, not at the end of the table. If I do a normal insert, the data will appear at the end of the table and not at the date I specify, because the identity column is autonumbered and cannot be updated.
Using SET IDENTITY_INSERT (table) ON, you can force SQL Server to let you insert an arbitrary value into an IDENTITY column - but there is no way to update an IDENTITY column.
What's the big problem with a few gaps anyway? Yes, it might be a bit of a "cosmetic" problem - but how much hassle and effort do you really want to spend on cosmetic problems? The order of the entries is still a given - even with gaps.
So again: what's the big deal? IDENTITY columns are guaranteed to be ever-increasing - that's all they guarantee. And for 99% of cases, that's more than good enough.
Why not just display the records in the user interface sorted by the date, rather than by the primary key?
OK, if you really want to do this (personally, I think changing the sort order in the UI is going to be easier than updating the primary key values in the database, but anyway...). This should work, assuming you're not using the primary key values in any foreign key constraints (if you are, you'll need to make sure those constraints have ON UPDATE CASCADE set):
SET IDENTITY_INSERT tablename ON
UPDATE tablename SET
primary_key = primary_key + 1
WHERE
primary_key >= <the primary key where you want to insert the new date>
INSERT INTO tablename
(primary_key, date, ...)
VALUES
(<the primary key to insert>, <the date to insert>, ...)
SET IDENTITY_INSERT tablename OFF
However, I strongly, strongly suggest you backup your database before attempting this.
Just out of curiosity, is it one ID per date? Your answers imply this a little. If so, you could replace the identity column with a computed column defined as the difference in days from an arbitrary starting point:
DECLARE @EXAMPLE TABLE
(
[Date] DATE,
ID AS DATEDIFF(Day, '1 Jan 2010', [Date])
)
INSERT INTO @EXAMPLE([Date])
VALUES (GETDATE()), (GETDATE()+1), (GETDATE()+2)
SELECT * FROM @EXAMPLE