SQL self calculated field

I have two tables connected by a many-to-many (m2m) relationship.
CREATE TABLE words
(
id INT PRIMARY KEY,
word VARCHAR(100) UNIQUE,
counter INT
);
CREATE TABLE urls
(
id INT PRIMARY KEY,
url VARCHAR(100) UNIQUE
);
CREATE TABLE urls_words
(
url_id INT NOT NULL REFERENCES urls(id),
word_id INT NOT NULL REFERENCES words(id)
);
I also have a counter field in the words table. How can I automate the process of updating the counter field, which should hold how many rows are stored in urls_words for a particular word?

I would investigate why you want to store this value. There may be good reasons, but triggers complicate databases.
If this is a "load-then-query" database, then you can update the count when you load data -- presumably at some frequency such as once a day or once a week. You don't need to worry about triggers.
If this is a transactional database, then triggers would be needed and these add complexity to the processing. They also lock tables when you might not want them locked.
An alternative is to have an index on urls_words(word_id, url_id). This would greatly speed the calculation of the count when you need it. It also does not require triggers or locks on multiple tables during an update.
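For example, a sketch of that index plus the on-demand count (the index name is illustrative):
CREATE INDEX idx_urls_words_word_url ON urls_words (word_id, url_id);

SELECT w.word, COUNT(uw.word_id) AS url_count
FROM words w
LEFT JOIN urls_words uw ON uw.word_id = w.id
GROUP BY w.word;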

Create a trigger on the urls_words table which updates the counter column on the words table every time a change is made (i.e. insert, update, delete).
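The question doesn't name a DBMS, so here is a minimal sketch of that trigger assuming PostgreSQL 11+ (function and trigger names are illustrative, and counter is assumed to start at 0):
create function sync_word_counter() returns trigger as $$
begin
    if tg_op in ('INSERT', 'UPDATE') then
        update words set counter = counter + 1 where id = new.word_id;
    end if;
    if tg_op in ('DELETE', 'UPDATE') then
        update words set counter = counter - 1 where id = old.word_id;
    end if;
    return null;  -- return value is ignored for AFTER row triggers
end;
$$ language plpgsql;

create trigger trg_urls_words_counter
after insert or update or delete on urls_words
for each row execute function sync_word_counter();
On an update that leaves word_id unchanged, the increment and decrement cancel out, so the counter stays correct.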

Related

How do I ensure that a referencing table also has data

My Postgres database has the following schema where the user can store multiple profile images.
CREATE TABLE users(
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
name VARCHAR(50)
);
CREATE TABLE images(
id INT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
url VARCHAR(50)
);
CREATE TABLE user_images(
user_id INT REFERENCES users(id),
image_id INT REFERENCES images(id)
);
How do I ensure that when I insert a user object, I also insert at least one user image?
You cannot do so very easily... and I wouldn't encourage you to enforce this. Why? The problem is a "chicken and egg" problem. You cannot insert a row into users because there is no image. You cannot insert a row into user_images because there is no user_id.
Although you can handle this situation with transactions or deferred constraint checking, that covers only half the issue, because you also have to prevent deletion of the last image.
Here are two alternatives.
First, you can simply add a main_image_id to the users table and insist that it be NOT NULL. Voila! At least one image is required.
Second, you can use a trigger to maintain a count of images in users. Then treat rows with no images as "deleted" so they are never seen.
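A minimal sketch of the first alternative, assuming PostgreSQL and a still-empty users table (sample values are illustrative):
ALTER TABLE users
ADD COLUMN main_image_id INT NOT NULL REFERENCES images(id);

-- Creating a user now forces at least one image to exist first:
WITH img AS (
    INSERT INTO images (url) VALUES ('https://example.com/a.png')
    RETURNING id
)
INSERT INTO users (name, main_image_id)
SELECT 'alice', id FROM img;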
When you insert data into a table, the database can return the id of the row that was inserted. So, if id > 0, the row has been inserted. But first, add an id column (bigserial, auto increment, unique) to all tables.
INSERT INTO user_images VALUES (...) RETURNING id;

RDBMS primary key design for row versioning

I want to design a primary key for my table with row versioning. My table contains two main fields, ID and Timestamp, plus a bunch of other fields. For a unique ID, I want to store previous versions of a record. Hence I am creating the table's primary key as the combination of the ID and Timestamp fields.
Hence to see all the versions of a particular ID, I can give,
Select * from table_name where ID=<ID_value>
To return the most recent version of an ID, I can use
Select * from table_name where ID=<ID_value> ORDER BY timestamp desc
and get the first element.
My question here is: will this query be efficient and run in O(1), instead of scanning the entire table to get all entries matching the same ID, considering the ID field is part of the primary key? Ideally, to get a result in O(1), I should have provided the entire primary key. If it does need to do an entire table scan, then how else can I design my primary key so that this request is done in O(1)?
The canonical reference on this subject is Effective Timestamping in Databases:
https://www.cs.arizona.edu/~rts/pubs/VLDBJ99.pdf
I usually design with a subset of this paper's recommendations, using a table containing a primary key only, with another referencing table that has that key as well as change_user, valid_from and valid_until columns with appropriate defaults. This makes referential integrity easy, as well as future value insertion and history retention. Index as appropriate, and consider check constraints or triggers to prevent overlaps and gaps if you expose these fields to the application for direct modification. These have an obvious performance overhead.
We then make a "current values view" which is exposed to developers, and is also insertable via an "instead of" trigger.
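A minimal sketch of that layout in PostgreSQL, with illustrative names (the instead-of trigger is omitted for brevity):
create table thing (
    thing_id int primary key
);

create table thing_version (
    thing_id    int       not null references thing (thing_id),
    change_user text      not null default current_user,
    valid_from  timestamp not null default now(),
    valid_until timestamp not null default 'infinity',
    payload     text,  -- stands in for the real value columns
    primary key (thing_id, valid_from)
);

-- The "current values view" exposed to developers:
create view thing_current as
select thing_id, payload
from thing_version
where now() >= valid_from
  and now() <  valid_until;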
It's far easier and better to use the History Table pattern for this.
create table foo (
foo_id int primary key,
name text
);
create table foo_history (
foo_id int,
version int,
name text,
operation char(1) check ( operation in ('u','d') ),
modified_at timestamp,
modified_by text,
primary key (foo_id, version)
);
Create a trigger to copy a foo row to foo_history on update or delete.
See https://wiki.postgresql.org/wiki/Audit_trigger_91plus for a full example with Postgres.
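A minimal sketch of such a trigger for the foo tables above, assuming PostgreSQL 11+ (the version numbering here is not concurrency-safe; the wiki example linked above is more robust):
create function foo_archive() returns trigger as $$
begin
    -- Copy the pre-change row, with the next version number for this foo_id.
    insert into foo_history (foo_id, version, name, operation, modified_at, modified_by)
    select old.foo_id,
           coalesce(max(h.version), 0) + 1,
           old.name,
           case tg_op when 'UPDATE' then 'u' else 'd' end,
           now(),
           current_user
    from foo_history h
    where h.foo_id = old.foo_id;
    return null;  -- return value is ignored for AFTER row triggers
end;
$$ language plpgsql;

create trigger trg_foo_history
after update or delete on foo
for each row execute function foo_archive();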

SQL Server Constraint (Limit bit field based on a foreign key)

I need help with constraints in SQL Server. The situation is: for each OrderID = 1 (a foreign key, not a primary key, so there are multiple rows with the same ID) in the table, the bit field can only be 1 for one of those rows, and for each row with OrderID = 2, the bit field can only be 1 for one row, and so on. It should be 0 for all other rows with the same OrderID. Any new record coming in with 1 in the bit field should be rejected if there is already a row with that OrderID which has the bit field set to 1. Any ideas?
CREATE UNIQUE INDEX UnnamedIndex ON UnnamedTable (OrderID) WHERE UnnamedBitField=1
It's called a Filtered Index. If you're on a pre-2008 version of SQL Server, you can implement a poor man's equivalent of a filtered index using an indexed view:
CREATE VIEW UnnamedView
WITH SCHEMABINDING
AS
SELECT OrderID From UnnamedSchema.UnnamedTable WHERE UnnamedBitField=1
GO
CREATE UNIQUE CLUSTERED INDEX UnnamedViewIndex ON UnnamedView (OrderID)
You can't really do it as a constraint, since SQL Server only supports column constraints and row constraints. There's no (non-fudging) way to write a constraint that deals with all values in the table.
You could more fully normalize the schema, which helps you avoid hunting for the already-set bit and lets you use a join instead. Remove the bit field and create a new table, say X, containing OrderID and the primary key of your table, with OrderID as the primary key of X so each order can appear at most once.
This means that when you insert, you insert into your original table, and into X if and only if you would have set the bit to 1 on your table. The insert will fail if there is already a row in X with that OrderID, which is as if there were already an original row with the bit set to 1.
The downside is that this takes up more space than your schema, but it is easier to maintain, as you cannot get to the equivalent of having two rows with the bit set to 1.
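A sketch of that shape in T-SQL (names are illustrative; add a foreign key from RowID back to your original table as appropriate):
CREATE TABLE X
(
OrderID int NOT NULL PRIMARY KEY, -- at most one active row per order
RowID int NOT NULL UNIQUE -- the primary key of the row in your original table
)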
The only way to do that is to subclass the parent table. You didn't mention it, but a common reason for this pattern is to represent the one unique active row from the set of all rows with the same common key value. Let's assume your bit field represents the active orders...
Then I would create a separate table called ActiveOrders, which will only contain the one row per order with the bit field set to 1:
Create Table ActiveOrders(OrderId Integer Primary Key Not Null)
and the other table with all the rows in it, with its own unique primary key OrderId:
Create Table AllOrders
(OrderId Integer Primary Key Not Null, ActiveOrderId Integer Not Null,
[All other data fields]
Constraint FK_AllOrders2ActiveOrder
Foreign Key(ActiveOrderId) references ActiveOrders(OrderId))
You now no longer even need the bit field, as the presence of the row in the ActiveOrders table identifies it as the Active Order... To get only the active Orders (the ones that in your scheme would have bit field set to 1), just join the two tables.
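For example, fetching only the active orders could look like this:
Select a.*
From AllOrders a
Join ActiveOrders act On act.OrderId = a.OrderId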
I agree with the other answers: if you can change the schema then do that, but if not then I think something like this will do.
CREATE FUNCTION dbo.fnMyCheck
(@id INT)
RETURNS INT
AS
BEGIN
DECLARE @i INT
SELECT @i = COUNT(*)
FROM YourTable
WHERE FkCol = @id
AND BitCol = 1
RETURN @i
END
GO
ALTER TABLE YourTable
ADD CONSTRAINT ckMyCheck CHECK (dbo.fnMyCheck(FkCol)<=1)
but there are problems that can come from using a UDF in a check constraint, such as this
Edit to add comment regarding problems with this 'solution':
There are more straightforward issues than what you've linked to.
INSERT INTO YourTable(FkCol,BitCol) VALUES (1,1),(1,0)
followed by
UPDATE YourTable SET BitCol=1
succeeds and leaves two rows with FkCol=1 and BitCol=1

Can I use a trigger to create a column?

As an alternative to anti-patterns like Entity-Attribute-Value or Key-Value Pair tables, is it possible to dynamically add columns to a data table via an INSERT trigger on a parameter table?
Here would be my tables:
CREATE TABLE [Parameters]
(
id int NOT NULL
IDENTITY(1,1)
PRIMARY KEY,
Parameter varchar(200) NOT NULL,
Type varchar(200) NOT NULL
)
GO
CREATE TABLE [Data]
(
id int NOT NULL
IDENTITY(1,1)
PRIMARY KEY,
SerialNumber int NOT NULL
)
GO
And the trigger would then be placed on the parameter table, triggered by new parameters being added:
CREATE TRIGGER [TRG_Data_Insert]
ON [Parameters]
FOR INSERT
AS BEGIN
-- The trigger takes the newly inserted parameter
-- record and ADDs a column to the data table, using
-- the parameter name as the column name, the data type
-- as the column data type and makes the new column
-- nullable.
END
GO
This would allow my data mining application to get a list of parameters to mine and have a place to store that data once it mines it. It would also allow a user to add new parameters to mine dynamically, without having to mess with SQL.
Is this possible? And if so, how would you go about doing it?
I think the idea of dynamically adding columns will be a ticking time bomb, just gradually creeping towards one of the SQL Server limits.
You will also be putting the database design in the hands of your users, leaving you at the mercy of their naming conventions and crazy ideas.
So while it is possible, is it better than an EAV table, which is at least obvious to the next developer to pick up your program?
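That said, the mechanics being asked about would look roughly like this in SQL Server; this sketch trusts the Parameter and Type values as entered, so treat it as an illustration rather than something safe to ship:
CREATE TRIGGER [TRG_Data_Insert]
ON [Parameters]
FOR INSERT
AS BEGIN
    DECLARE @sql nvarchar(max) = N'';
    -- Build one ALTER TABLE statement per newly inserted parameter row.
    -- QUOTENAME guards the column name (and returns NULL for names over
    -- 128 characters); [Type] is concatenated unchecked, which is unsafe
    -- if users can type arbitrary text.
    SELECT @sql = @sql
        + N'ALTER TABLE [Data] ADD ' + QUOTENAME(Parameter)
        + N' ' + [Type] + N' NULL; '
    FROM inserted;
    EXEC sp_executesql @sql;
END
GO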

Is there a smart way to append a number to a PK identity column in a relational database w/o total catastrophe?

It's far from the ideal situation, but I need to fix a database by appending the number "1" to the PK identity column, which has FK relations to four other tables. I'm basically making a four-digit number a five-digit number. I need to maintain the relations. I could store the number in a var, do a SET query and append the 1, and do that for each table...
Is there a better way of doing this?
You say you are using an identity data type for your primary key, so before you update the numbers you will have to SET IDENTITY_INSERT <table> ON and then turn it off again after the update.
As long as you have cascading updates set for your relations the other tables should be updated automatically.
EDIT: As it's not possible to change an identity value, I guess you have to export the data, set the new identity values (+10000) and then import your data again.
Anyone have a better suggestion...
Consider adding another field to the PK instead of extending the length of the PK field. Your new field will have to cascade to the related tables, like a field length increase would, but you get to retain your original PK values.
My suggestion is:
Stop writing to the tables.
Copy the tables to new tables with the new PK.
Rename the old tables to backup names.
Rename the new tables to the original table name.
Count the rows in all the tables and double check your work.
Continue using the tables.
Changing a PK after the fact is not fun.
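A rough T-SQL sketch of those steps for a single hypothetical table dbo.MyTable(id identity primary key, name varchar(50)); constraints, indexes and the identity property must be recreated on the new table afterwards:
-- Copy to a new table, transforming the key on the way.
SELECT id + 10000 AS id, name
INTO dbo.MyTable_new
FROM dbo.MyTable;

-- Swap names: the old table becomes the backup.
EXEC sp_rename 'dbo.MyTable', 'MyTable_backup';
EXEC sp_rename 'dbo.MyTable_new', 'MyTable';

-- Double-check the row counts before dropping the backup.
SELECT COUNT(*) FROM dbo.MyTable;
SELECT COUNT(*) FROM dbo.MyTable_backup;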
If the column in question has an identity property on it, it gets complicated. This is more-or-less how I'd do it:
Back up your database.
Put it in single user mode. You don't need anybody mucking around whilst you do the surgery.
Execute the ALTER TABLE statements necessary to
disable the primary key constraint on the table in question
disable all triggers on the table in question
disable all foreign key constraints referencing the table in question.
Clone your table, giving it a new name and a column-for-column identical definition. Don't bother with any triggers, indices, foreign keys or other constraints. Omit the identity property from the table's definition.
Create a new 'map' table that will map your old id values to the new value:
create table dbo.pk_map
(
old_id int not null primary key clustered ,
new_id int not null unique nonclustered
)
Populate the map table:
insert dbo.pk_map
select old_id = old.id ,
new_id = f( old.id ) -- f(x) is the desired transform
from dbo.tableInQuestion old
Populate your new table, giving the primary key column the new value:
insert dbo.tableInQuestion_NEW
select id = map.new_id ,
...
from dbo.tableInQuestion old
join dbo.pk_map map on map.old_id = old.id
Truncate the original table: TRUNCATE TABLE dbo.tableInQuestion. This should work safely, since you've disabled all the triggers and foreign key constraints. (If SQL Server still refuses the truncate because foreign keys reference the table, even disabled ones, fall back to DELETE FROM dbo.tableInQuestion.)
Execute SET IDENTITY_INSERT dbo.tableInQuestion ON.
Reload the original table (an explicit column list is required while IDENTITY_INSERT is on):
insert dbo.tableInQuestion ( id , ... )
select id , ...
from dbo.tableInQuestion_NEW
Execute SET IDENTITY_INSERT dbo.tableInQuestion OFF
Execute drop table dbo.tableInQuestion_NEW. We're all done with it.
Execute DBCC CHECKIDENT( 'dbo.tableInQuestion' , RESEED ) to get the identity counter back in sync with the data in the table.
Now, use the map table to propagate the changed primary key column down the line. Depending on your E-R model, this can get complicated as foreign keys referencing the updated column may themselves be part of a composite primary key.
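As a sketch, for one hypothetical referencing table dbo.childTable with a foreign key column fk_id, that propagation looks like:
update c
set    c.fk_id = map.new_id
from   dbo.childTable c
join   dbo.pk_map map on map.old_id = c.fk_id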
When you're all done, start re-enabling the constraints and triggers you disabled. Make sure you do this using the WITH CHECK option. Fix any problems thus uncovered.
Finally, drop the map table, and clear the single user flag and bring your system(s) back online.
Piece of cake! (or something.)
Consider this approach:
Reset the identity seed to 10000 + the current seed.
Set identity insert on
Insert into the table from the values in the table and add 10000 to the identity column on the way.
EX:
Set Identity_Insert MyTable On
Insert MyTable (IdColumn, Column1, Column2)
Select IdColumn + 10000, Column1, Column2
From MyTable
Where IdColumn < <original max identity value>
After the insert you know the identity is exactly 10000 more than the original.
Update the foreign keys by adding 10000.
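A sketch of that last step for one hypothetical referencing table:
Update ChildTable
Set MyTableId = MyTableId + 10000 -- repeat for each of the four referencing tables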