Super polymorphic event schema - sql

I want to be able to track every action a user takes on my site.
An action can originate from a visitor or a user (both of which are human).
An action can affect a subject (a visitor or a user).
An action can have an object, which can be any of the other database tables.
Some examples:
User A (actor) assigns User B (subject) to conversation (object)
User A (actor) creates team (object)
User B (actor) moved Visitor (subject) to group C (object)
In my application, I want a feed of all events, and for each event, show exactly which actor, subject (if any) and object it refers to.
I am thinking of something like:
create table actors ( -- contains as many rows as there are people
ID int
)
create table roles ( -- roles for both humans and objects, such as: Visitor, Team, User, Conversation, Group
ID int,
Name nvarchar(max)
)
create table actors_roles ( -- associates people with roles
Actor_ID int, -- FK to actors.ID
Role_ID int -- FK to roles.ID
)
create table objects ( -- contains as many rows as there are objects
ID int
)
create table object_roles ( -- associates objects with roles
Object_ID int, -- FK to objects.ID
Role_ID int -- FK to roles.ID
)
create table tEvent (
ID int,
Type_ID int,
Actor_ID int, -- FK to actors.ID
Subject_ID int, -- FK to actors.ID
Object_ID int -- FK to objects.ID
)
Besides these tables, every record in roles will have a corresponding, separate table maintaining all the data related to the object with a foreign key.
I'd love to get some feedback on this structure: is it scalable, or is there a better way to accomplish this?
Credit to Daniel A. Thompson for pushing me in this direction

Based on your requirements, I'd propose the following schema:
-- roles for both humans and objects, such as:
-- Visitor, Team, User, Conversation, Group
CREATE TABLE tRole (
ID int,
Name nvarchar(max)
)
-- contains as many rows as there are people
CREATE TABLE tActor (
ID int,
Role_ID int -- FK to tRole.ID
)
-- contains as many rows as there are objects
CREATE TABLE tObject (
ID int,
Role_ID int -- FK to tRole.ID
)
CREATE TABLE tEvent (
ID int,
Type_ID int,
Actor_ID int, -- FK to tActor.ID
Subject_ID int, -- FK to tActor.ID (NULL when the event has no subject)
Object_ID int -- FK to tObject.ID
)
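To see how the feed from the question could be read out of this schema, here is a minimal sketch using Python's built-in sqlite3 module. The SQLite column types, the sample rows, the Type_ID values 10 and 20, and the names FEED_SQL and feed are assumptions made for illustration only; the table and column names come from the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tRole   (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE tActor  (ID INTEGER PRIMARY KEY, Role_ID INTEGER NOT NULL REFERENCES tRole(ID));
CREATE TABLE tObject (ID INTEGER PRIMARY KEY, Role_ID INTEGER NOT NULL REFERENCES tRole(ID));
CREATE TABLE tEvent (
    ID         INTEGER PRIMARY KEY,
    Type_ID    INTEGER NOT NULL,
    Actor_ID   INTEGER NOT NULL REFERENCES tActor(ID),
    Subject_ID INTEGER REFERENCES tActor(ID),   -- optional: not every event has a subject
    Object_ID  INTEGER REFERENCES tObject(ID)
);
-- hypothetical sample data
INSERT INTO tRole   VALUES (1, 'User'), (2, 'Visitor'), (3, 'Conversation'), (4, 'Team');
INSERT INTO tActor  VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO tObject VALUES (1, 3), (2, 4);
INSERT INTO tEvent  VALUES
    (1, 10, 1, 2,    1),  -- user 1 assigns user 2 to conversation 1
    (2, 20, 1, NULL, 2);  -- user 1 creates team 2
""")

# LEFT JOINs keep events that have no subject (or no object) in the feed.
FEED_SQL = """
SELECT e.ID,
       ar.Name  AS actor_role,   e.Actor_ID,
       sr.Name  AS subject_role, e.Subject_ID,
       orl.Name AS object_role,  e.Object_ID
FROM tEvent e
JOIN tActor a        ON a.ID   = e.Actor_ID
JOIN tRole ar        ON ar.ID  = a.Role_ID
LEFT JOIN tActor s   ON s.ID   = e.Subject_ID
LEFT JOIN tRole sr   ON sr.ID  = s.Role_ID
LEFT JOIN tObject o  ON o.ID   = e.Object_ID
LEFT JOIN tRole orl  ON orl.ID = o.Role_ID
ORDER BY e.ID
"""
feed = conn.execute(FEED_SQL).fetchall()
for row in feed:
    print(row)
```

The role name tells the application which per-role table to look in for the details of each actor, subject or object.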

Related

Create an aggregation query of a table linked to another 2 tables

I have 3 tables. The first stores information about a user, such as an email address, and grows dynamically (whenever a new user registers). The second stores the user roles; it is a static table populated by the init SQL script. The last one, named user_status, tracks changes to a user's role by adding a new entry with the current timestamp.
I need to gather the current status (the most recently created status entry) of every user, grouped by role, with a count of the users currently holding each role.
/* table user_account stores top user information */
CREATE TABLE IF NOT EXISTS user_account (
id serial PRIMARY KEY,
email text NOT NULL
);
/* table user_role keeps user's role ids */
CREATE TABLE IF NOT EXISTS user_role (
id serial PRIMARY KEY,
role text NOT NULL
);
/* table user_status track a user role on change */
CREATE TABLE IF NOT EXISTS user_status (
id serial PRIMARY KEY,
user_account_id int NOT NULL REFERENCES user_account(id),
user_role_id int NOT NULL REFERENCES user_role(id),
created timestamptz NOT NULL DEFAULT clock_timestamp()
);
INSERT INTO user_account(email) VALUES
( 'user1@myorg.com' ),
( 'user2@myorg.com' ),
( 'user3@myorg.com' );
INSERT INTO user_role (role) VALUES
( 'activation_required' ),
( 'regular' ),
( 'forum_only' ),
( 'moderator' ),
( 'admin' );
INSERT INTO user_status (user_account_id, user_role_id) VALUES
(1, 1), -- now user `user1@myorg.com` has `activation_required` role
(1, 5), -- now user `user1@myorg.com` has `admin` role
(2, 1), -- now user `user2@myorg.com` has `activation_required` role
(2, 2), -- now user `user2@myorg.com` has `regular` role
(3, 1), -- now user `user3@myorg.com` has `activation_required` role
(1, 4), -- now user `user1@myorg.com` has `moderator` role
(3, 2); -- now user `user3@myorg.com` has `regular` role
So after the last insert query I expect to see:
moderator | 1
regular | 2
Because there are only 3 users (1+2), and at the moment there is one moderator and two regular users.
So I've come up with a query that solves the issue:
SELECT user_role.role, COUNT(user_status.*)
FROM user_account, user_status
JOIN user_role ON user_role.id = user_status.user_role_id
WHERE user_status.created = (SELECT MAX(created) FROM user_status WHERE user_account.id = user_status.user_account_id)
GROUP BY user_role.role
ORDER BY user_role.role;
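An alternative formulation that avoids the cross join between user_account and user_status is to pick, for each status row, the newest row of the same user and count only those. The sketch below runs it on SQLite through Python's sqlite3 module; the integer created column (standing in for timestamptz) and the names history, CURRENT_ROLES_SQL and current_roles are assumptions for illustration. On PostgreSQL, DISTINCT ON (user_account_id) with ORDER BY user_account_id, created DESC would be another way to select the latest row per user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_account (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
CREATE TABLE user_role    (id INTEGER PRIMARY KEY, role TEXT NOT NULL);
CREATE TABLE user_status (
    id              INTEGER PRIMARY KEY,
    user_account_id INTEGER NOT NULL REFERENCES user_account(id),
    user_role_id    INTEGER NOT NULL REFERENCES user_role(id),
    created         INTEGER NOT NULL  -- assumption: integer tick instead of timestamptz
);
INSERT INTO user_account (email) VALUES
    ('user1@myorg.com'), ('user2@myorg.com'), ('user3@myorg.com');
INSERT INTO user_role (role) VALUES
    ('activation_required'), ('regular'), ('forum_only'), ('moderator'), ('admin');
""")

# Same status history as in the question; the list position doubles as the timestamp.
history = [(1, 1), (1, 5), (2, 1), (2, 2), (3, 1), (1, 4), (3, 2)]
conn.executemany(
    "INSERT INTO user_status (user_account_id, user_role_id, created) VALUES (?, ?, ?)",
    [(user_id, role_id, tick) for tick, (user_id, role_id) in enumerate(history, start=1)],
)

# Keep only the newest status row per user, then count users per role.
CURRENT_ROLES_SQL = """
SELECT r.role, COUNT(*) AS users
FROM user_status s
JOIN user_role r ON r.id = s.user_role_id
WHERE s.id = (
    SELECT s2.id
    FROM user_status s2
    WHERE s2.user_account_id = s.user_account_id
    ORDER BY s2.created DESC, s2.id DESC  -- id breaks ties between equal timestamps
    LIMIT 1
)
GROUP BY r.role
ORDER BY r.role
"""
current_roles = conn.execute(CURRENT_ROLES_SQL).fetchall()
print(current_roles)
```

Selecting the row by id rather than comparing created values means two status rows with identical timestamps cannot both count as "latest".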

SQL - foreign key one table with 3 owner tables

I have the following table Widget, in which each row is going to have an owner.
The owner can be an id from the User, Company or Department tables. I'm guessing the way to set this up is a link table like so?
id | user | company | department
---------|----------|----------|----------
1 | 4 | NULL | NULL
2 | 6 | 3 | 6
3 | 10 | 3 | 8
and then have the Widget table use that ID as the owner, provided the app contains the logic that if company is not null then the owner is the company, otherwise the owner is the user.
A department can't exist if there's no company.
It is not a problem to add three foreign key (FK) columns to the WIDGET table, one for each of the three tables (USER, COMPANY, DEPARTMENT). You can then identify the real owner with the queries below:
CREATE TABLE WIDGET (
WIDGET_NAME VARCHAR(20),
OWNER_USER_ID INTEGER REFERENCES USER(ID),
OWNER_COMPANY_ID INTEGER REFERENCES COMPANY(ID),
OWNER_DEPART_ID INTEGER REFERENCES DEPARTMENT(ID)
);
-- retrieve OWNER_USER (you can JOIN with the USER table)
SELECT OWNER_USER_ID, WIDGET_NAME FROM WIDGET WHERE OWNER_COMPANY_ID IS NULL;
-- retrieve OWNER_COMPANY (plus OWNER_DEPART) (you can JOIN with the COMPANY and DEPARTMENT table)
SELECT OWNER_COMPANY_ID, OWNER_DEPART_ID, WIDGET_NAME FROM WIDGET WHERE OWNER_COMPANY_ID IS NOT NULL;
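As a rough illustration of the three-FK-column option, the sketch below builds it on SQLite through Python's sqlite3 module and resolves the owner with the rule from the question (company if set, otherwise user). The table name users (chosen because USER is a reserved word in some databases), the sample rows, the CHECK constraint and the helper insert_widget are assumptions added for this example, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
CREATE TABLE users      (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE company    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE widget (
    widget_name      TEXT,
    owner_user_id    INTEGER REFERENCES users(id),
    owner_company_id INTEGER REFERENCES company(id),
    owner_depart_id  INTEGER REFERENCES department(id),
    -- assumption: encodes "a department can't exist if there's no company"
    CHECK (owner_depart_id IS NULL OR owner_company_id IS NOT NULL)
);
-- hypothetical sample data
INSERT INTO users      VALUES (4, 'alice'), (6, 'bob');
INSERT INTO company    VALUES (3, 'acme');
INSERT INTO department VALUES (6, 'sales');
INSERT INTO widget     VALUES ('w1', 4, NULL, NULL), ('w2', 6, 3, 6);
""")

# Owner rule from the question: the company owns the widget when it is set,
# otherwise the user does.
owners = conn.execute("""
SELECT w.widget_name,
       CASE WHEN w.owner_company_id IS NOT NULL THEN 'company' ELSE 'user' END AS owner_kind,
       COALESCE(c.name, u.name) AS owner_name
FROM widget w
LEFT JOIN users   u ON u.id = w.owner_user_id
LEFT JOIN company c ON c.id = w.owner_company_id
ORDER BY w.widget_name
""").fetchall()
print(owners)


def insert_widget(name, user_id, company_id, depart_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO widget VALUES (?, ?, ?, ?)",
                (name, user_id, company_id, depart_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Putting the department rule in a CHECK constraint keeps it in the database instead of relying only on application logic.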
If you want a single owner column instead of three, that does not make sense in general, but it works under one extra condition. You said that the owner of a widget is the company if the company is not null, and the user otherwise. If the user column in WIDGET is never null, whether or not the company column is null, you can use the primary key (PK) of the USER table as the single FK of the WIDGET table. Why? Under this condition the dependencies User → Company and User → Department hold: once you pick a user A, at most one company and at most one department are related to him or her.
-- Schema of USER, COMPANY, DEPARTMENT table
-- (COMPANY and DEPARTMENT are created first because USER references them;
-- USER is a reserved word in PostgreSQL, so quote or rename it there)
CREATE TABLE COMPANY (
ID INTEGER PRIMARY KEY,
NAME VARCHAR(20)
);
CREATE TABLE DEPARTMENT (
ID INTEGER PRIMARY KEY,
NAME VARCHAR(20)
);
CREATE TABLE USER (
ID INTEGER PRIMARY KEY,
NAME VARCHAR(20),
COMPANY_ID INTEGER REFERENCES COMPANY(ID),
DEPART_ID INTEGER REFERENCES DEPARTMENT(ID)
);
-- Schema of WIDGET table
CREATE TABLE WIDGET (
WIDGET_NAME VARCHAR(20),
OWNER_ID INTEGER REFERENCES USER(ID)
);
-- retrieve OWNER_USER
SELECT U.NAME AS OWNER_USER_NAME, W.WIDGET_NAME
FROM WIDGET W, USER U
WHERE U.ID = W.OWNER_ID AND U.COMPANY_ID IS NULL;
-- retrieve OWNER_COMPANY
SELECT C.NAME AS OWNER_COMPANY_NAME, W.WIDGET_NAME
FROM WIDGET W, USER U, COMPANY C
WHERE U.ID = W.OWNER_ID AND U.COMPANY_ID = C.ID;
-- retrieve OWNER_DEPARTMENT
SELECT D.NAME AS OWNER_DEPART_NAME, W.WIDGET_NAME
FROM WIDGET W, USER U, DEPARTMENT D
WHERE U.ID = W.OWNER_ID AND U.COMPANY_ID IS NOT NULL AND U.DEPART_ID IS NOT NULL AND U.DEPART_ID = D.ID;
But if the user column in the WIDGET table can be null while the company column is not null, build a separate OWNER table to hold the owner information (USER, COMPANY, DEPARTMENT). Each record of OWNER should be unique, so a composite unique index may be needed. (See http://www.postgresql.org/docs/current/static/indexes-unique.html)
-- Schema of OWNER table
CREATE TABLE OWNER (
ID INTEGER PRIMARY KEY,
OWNER_USER_ID INTEGER REFERENCES USER(ID),
OWNER_COMPANY_ID INTEGER REFERENCES COMPANY(ID),
OWNER_DEPARTMENT_ID INTEGER REFERENCES DEPARTMENT(ID)
);
-- unique index on OWNER
CREATE UNIQUE INDEX OWNER_UIDX ON OWNER( OWNER_USER_ID, OWNER_COMPANY_ID, OWNER_DEPARTMENT_ID );
-- Schema of WIDGET table
CREATE TABLE WIDGET (
WIDGET_NAME VARCHAR(20),
OWNER_ID INTEGER REFERENCES OWNER(ID)
);

Find the rating given to a certain game by each member who rated it in SQL

CREATE TABLE members
(
name varchar(40),
ID char(6) PRIMARY KEY
);
CREATE TABLE games
(
name varchar(100),
ID serial PRIMARY KEY
);
CREATE TABLE ratings
(
memberID char(6) REFERENCES members(ID),
rating SMALLINT CHECK(rating >= 1 AND rating <= 8),
gameID integer REFERENCES games(ID),
PRIMARY KEY (memberID, gameID)
);
I am trying to find all the ratings that were given to the game with an ID of 2, along with the name of each member who rated it.
I used:
SELECT rating, name
FROM ratings, members
WHERE gameID = 2;
Whenever I use this command, it gives me the correct rating values but it lists all the members, even those who did not rate the game. Can someone help me figure out how to solve the problem?
Thanks all in advance.
You're looking for JOIN:
SELECT rating, name
FROM ratings r
INNER JOIN members m
ON r.MemberID = m.ID
WHERE gameID = 2;
try:
SELECT rating, name
FROM ratings, members
WHERE gameID = 2 and ratings.memberID = members.id;
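To make the difference visible, here is a small sketch on SQLite (via Python's sqlite3 module) that runs the query from the question next to the joined version. The SQLite column types, the sample members, games and ratings, and the names cross and joined are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (name TEXT, ID TEXT PRIMARY KEY);
CREATE TABLE games   (name TEXT, ID INTEGER PRIMARY KEY);
CREATE TABLE ratings (
    memberID TEXT REFERENCES members(ID),
    rating   INTEGER CHECK (rating >= 1 AND rating <= 8),
    gameID   INTEGER REFERENCES games(ID),
    PRIMARY KEY (memberID, gameID)
);
-- hypothetical sample data: Ann and Bob rated game 2, Cy rated game 1
INSERT INTO members VALUES ('Ann', 'M00001'), ('Bob', 'M00002'), ('Cy', 'M00003');
INSERT INTO games   VALUES ('Chess', 1), ('Go', 2);
INSERT INTO ratings VALUES ('M00001', 7, 2), ('M00002', 5, 2), ('M00003', 3, 1);
""")

# Query from the question: no join condition, so every rating of game 2
# is paired with every member (2 ratings x 3 members).
cross = conn.execute(
    "SELECT rating, name FROM ratings, members WHERE gameID = 2"
).fetchall()

# With the join condition, each rating is paired only with its own member.
joined = conn.execute("""
SELECT r.rating, m.name
FROM ratings r
INNER JOIN members m ON r.memberID = m.ID
WHERE r.gameID = 2
ORDER BY m.name
""").fetchall()

print(len(cross), joined)
```

Both answers above add the same condition (ratings.memberID = members.ID); they differ only in whether it is written in the ON clause or in the WHERE clause.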

How do I design a database constraint so two entities can only have a many to many relationship if two field values within them match?

I have a database with four tables as follows:
Addressbook
--------------------
id
more fields
Contact
---------------------
id
addressbook id
more fields
Group
---------------------
id
addressbook id
more fields
Group to Contact
---------------------
Composite key
Group id
Contact id
My relationships are one to many for addressbook > contact, one to many for addressbook > group and many to many between contact and groups.
So in summary, I have an addressbook. Contacts and groups can be stored within it and they cannot be stored in more than one addressbook. Furthermore as many contacts that are needed can be added to as many groups as are needed.
My question is as follows. I wish to add the constraint that a contact can only be a member of a group if both of them have the same addressbook id.
As I am not a database person this is boggling my brain. Does this mean I have designed my table structure wrong? Or does this mean that I have to add a check somewhere before inserting into the group to contact table? This seems wrong to me because I would want it to be impossible for SQL queries to link contacts to groups if they do not have the same id.
You should be able to accomplish this by adding an addressbook_id column to your Group to Contact bridge table, then using a compound foreign key to both the Contacts and Groups tables.
In PostgreSQL (but easily adaptable to any DB, or at least any DB that supports compound FKs):
CREATE TABLE group_to_contact (
contact_id INT,
group_id INT,
addressbook_id INT,
CONSTRAINT contact_fk FOREIGN KEY (contact_id,addressbook_id)
REFERENCES contacts(id,addressbook_id),
CONSTRAINT groups_fk FOREIGN KEY (group_id,addressbook_id)
REFERENCES groups(id,addressbook_id)
)
By using the same addressbook_id column in both constraints, you are of course enforcing that they are the same in both referenced tables.
OK - the Many to Many is governed by the GroupToContact table.
So the constraints are between Group and GroupToContact and between Contact and GroupToContact (GTC)
Namely
[Group].groupId = GTC.GroupId AND [Group].AddressBookid = GTC.AddressBookId
And
Contact.ContactId = GTC.ContactID AND Contact.AddressBookId = GTC.AddressBookId
So you will need to add AddressBookId to GroupToContact table
One further note - you should not define any relationship between Contact and Group directly - instead you just define the OneToMany relationships each has with the GroupToContact table.
As BonyT suggested:
Addressbook
---------------
*id*
...more fields
PRIMARY KEY (id)
Contact
-----------
*id*
addressbook_id
...more fields
PRIMARY KEY (id)
FOREIGN KEY (addressbook_id)
REFERENCES Addressbook(id)
Group
---------
*id*
addressbook_id
...more fields
PRIMARY KEY (id)
FOREIGN KEY (addressbook_id)
REFERENCES Addressbook(id)
Group to Contact
--------------------
*group_id*
*contact_id*
addressbook_id
PRIMARY KEY (group_id, contact_id)
FOREIGN KEY (addressbook_id, contact_id)
REFERENCES Contact(addressbook_id, id)
FOREIGN KEY (addressbook_id, group_id)
REFERENCES Group(addressbook_id, id)
A CHECK constraint can't include sub-queries, so it cannot express this rule.
You could create a trigger that checks that the group and contact have the same addressbookid
and generates an error if they do not.
Note that a database trigger defined to enforce an integrity rule does not check the data already in the table, so I would recommend that you use a trigger only when the integrity rule cannot be enforced by an integrity constraint.
CREATE TRIGGER tr_Group_to_Contact_InsertOrUpdate ON Group_to_Contact
FOR INSERT, UPDATE AS
-- reject the statement if any new row links a group and a contact
-- that belong to different address books
IF EXISTS (SELECT 1 FROM inserted i
INNER JOIN [Group] g ON g.groupid = i.groupid
INNER JOIN Contact c ON c.contactid = i.contactid
WHERE g.addressbookid <> c.addressbookid)
BEGIN
RAISERROR('Address Book Mismatch', 16, 1)
ROLLBACK TRAN
END
Note:(This is from memory so probably not syntactically correct)
In your E-R (Entity-Relationship) model, the entities Group and Contact are (or should be) "dependent entities", which is to say that the existence of a Group or Contact is predicated upon that of 1 or more other entities, in this case AddressBook, that contributes to the identity of the dependent entity. The primary key of a dependent entity is composite and includes foreign keys to the entity(ies) upon which it is dependent.
The primary key of both Contact and Group include the primary key of the AddressBook to which they belong. Once you do that, everything falls into place:
create table AddressBook
(
id int not null ,
... ,
primary key (id)
)
create table Contact
(
address_book_id int not null ,
id int not null ,
... ,
primary key ( address_book_id , id ) ,
foreign key ( address_book_id ) references AddressBook ( id )
)
create table Group
(
address_book_id int not null ,
id int not null ,
... ,
primary key ( address_book_id , id ) ,
foreign key ( address_book_id ) references AddressBook ( id )
)
create table GroupContact
(
address_book_id int not null ,
contact_id int not null ,
group_id int not null ,
primary key ( address_book_id , contact_id , group_id ) ,
foreign key ( address_book_id , contact_id ) references Contact ( address_book_id , id ) ,
foreign key ( address_book_id , group_id ) references Group ( address_book_id , id )
)
Cheers.

SQL normalization

Right now, I have a table:
Id - CollegeName - CourseName
This table is not normalized, so I have many courses for every college.
I need to normalize this into two tables:
Colleges: CollegeID - CollegeName
Courses: CourseID - CollegeID - CourseName
Is there an easy way to do this?
Thank you
CREATE TABLE dbo.College
(
CollegeId int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
CollegeName nvarchar(100) NOT NULL
)
CREATE TABLE dbo.Course
(
CourseId int IDENTITY(1, 1) NOT NULL PRIMARY KEY,
CollegeId int NOT NULL,
CourseName nvarchar(100) NOT NULL
)
ALTER TABLE dbo.Course
ADD CONSTRAINT FK_Course_College FOREIGN KEY (CollegeId)
REFERENCES dbo.College (CollegeId)
--- add colleges
INSERT INTO dbo.College (CollegeName)
SELECT DISTINCT CollegeName FROM SourceTable
--- add courses
INSERT INTO dbo.Course (CollegeId, CourseName)
SELECT
College.CollegeId,
SourceTable.CourseName
FROM
SourceTable
INNER JOIN
dbo.College ON SourceTable.CollegeName = College.CollegeName
If you create the 2 new tables with Colleges.CollegeID and Courses.CourseID as auto-numbered fields, you can go with:
INSERT INTO Colleges (CollegeName)
SELECT DISTINCT CollegeName
FROM OldTable ;
INSERT INTO Courses (CollegeID, CourseName)
SELECT Colleges.CollegeID, OldTable.CourseName
FROM OldTable
JOIN Colleges
ON OldTable.CollegeName = Colleges.CollegeName ;
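To check that this kind of migration loses nothing, the two INSERT ... SELECT statements can be run on a throwaway SQLite database from Python and the result compared with the source table. The sample rows, the SQLite column types, the UNIQUE constraint on CollegeName and the names rebuilt and original are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- the original, denormalized table with hypothetical sample rows
CREATE TABLE OldTable (Id INTEGER PRIMARY KEY, CollegeName TEXT, CourseName TEXT);
INSERT INTO OldTable (CollegeName, CourseName) VALUES
    ('Durham',  'Computing Science'),
    ('Durham',  'Physics'),
    ('Harvard', 'Computer Science');

-- the two target tables with auto-numbered keys
CREATE TABLE Colleges (
    CollegeID   INTEGER PRIMARY KEY AUTOINCREMENT,
    CollegeName TEXT NOT NULL UNIQUE
);
CREATE TABLE Courses (
    CourseID   INTEGER PRIMARY KEY AUTOINCREMENT,
    CollegeID  INTEGER NOT NULL REFERENCES Colleges(CollegeID),
    CourseName TEXT NOT NULL
);

-- step 1: one row per distinct college
INSERT INTO Colleges (CollegeName)
SELECT DISTINCT CollegeName FROM OldTable;

-- step 2: one row per course, looking up the generated CollegeID by name
INSERT INTO Courses (CollegeID, CourseName)
SELECT c.CollegeID, o.CourseName
FROM OldTable o
JOIN Colleges c ON o.CollegeName = c.CollegeName;
""")

# Joining the new tables back together should reproduce the old rows.
rebuilt = conn.execute("""
SELECT c.CollegeName, k.CourseName
FROM Courses k
JOIN Colleges c ON c.CollegeID = k.CollegeID
ORDER BY 1, 2
""").fetchall()
original = conn.execute(
    "SELECT CollegeName, CourseName FROM OldTable ORDER BY 1, 2"
).fetchall()
college_count = conn.execute("SELECT COUNT(*) FROM Colleges").fetchone()[0]
print(rebuilt == original, college_count)
```

The join on CollegeName in step 2 assumes college names are spelled consistently in the old table; differently spelled variants would end up as separate colleges.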
I agree with @Andomar's first comment: remove the seemingly redundant Id column and your CollegeName, CourseName table is already in 5NF.
What I suspect you need is a further table to give courses attributes so that you can model the fact that, say, Durham University's B.Sc. in Computing Science is comparable with Harvard's A.B. in Computer Science (via attributes 'computing major', 'undergraduate', 'country=US', 'country=UK', etc).
Sure.
Create a College table with a college_id (primary key) column, and a college_name column which is used as a unique index column.
Just refer to the college_id column, not college_name, in the Course table.