Full text search on many to many relationship - sql

I have the following tables
The keywords table
CREATE TABLE trigger_keyword
(
id bigint NOT NULL,
keyword text NOT NULL,
CONSTRAINT trigger_keyword_id PRIMARY KEY (id)
)
This is the bridge table
CREATE TABLE trigger_keyword_trigger_message
(
trigger_keyword_id bigint NOT NULL,
trigger_message_id bigint NOT NULL,
CONSTRAINT trigger_keyword_trigger_message_trigger_keyword_id_fkey FOREIGN KEY (trigger_keyword_id)
REFERENCES public.trigger_keyword (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE NO ACTION,
CONSTRAINT trigger_keyword_trigger_message_trigger_message_id_fkey FOREIGN KEY (trigger_message_id)
REFERENCES public.trigger_message (id) MATCH SIMPLE
ON UPDATE CASCADE
ON DELETE NO ACTION
)
The message table
CREATE TABLE trigger_message
(
id bigint NOT NULL,
message text NOT NULL,
CONSTRAINT trigger_message_id PRIMARY KEY (id)
)
I have a list of strings outside of the PostgreSQL database, which I will run in a loop.
Let's assume we have the following keywords in the trigger_keyword table
The trigger_keyword table
id keyword
----------------------------------------
1 hi
2 hello
3 the weather
4 the climate
The trigger_message table
id message
-----------------------------------------
1 Hi how is your day?
2 Hello, have a wonderful day
3 Looks sunny today
4 Excellent, no rain today
5 looks like we'll have showers today
Let's say one of our strings is Hi Robot!, then the SQL query should return Hi how is your day? or Hello, have a wonderful day; it should pick one of them randomly. It should do the same if the string contained hello robot instead of hi robot since both hi and hello are in the keywords table.
And if the string contains tell me the weather then the SQL query should return Looks sunny today or Excellent, no rain today or looks like we'll have showers today randomly.
I assume I would have to use full text search for this?
It's my first time using a bridge table, do I manually insert the relations in the bridge table?

You should define a primary key constraint on the “bridge” table that contains both columns.
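As a minimal sketch of that constraint on the existing bridge table (the constraint name is only an illustrative choice, and the statement assumes the table holds no duplicate pairs yet):

```sql
-- Composite primary key: each (keyword, message) pair can be linked only once.
ALTER TABLE trigger_keyword_trigger_message
    ADD CONSTRAINT trigger_keyword_trigger_message_pkey
    PRIMARY KEY (trigger_keyword_id, trigger_message_id);
```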
Full text search as indicated in this answer is one way to do this.
To randomly pick one result row, you can append the following to the query:
ORDER BY random() LIMIT 1
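Combined with a full text match, the whole query could look roughly like the sketch below. It assumes PostgreSQL 9.6 or later (for phraseto_tsquery), uses the 'simple' text search configuration so that no stemming or stop-word removal is applied, and inlines the input string as a literal; in an application it would normally be passed as a bind parameter.

```sql
-- Find every message linked to a keyword that occurs in the input string,
-- then pick one of those messages at random.
SELECT m.message
FROM trigger_keyword AS k
JOIN trigger_keyword_trigger_message AS km
    ON km.trigger_keyword_id = k.id
JOIN trigger_message AS m
    ON m.id = km.trigger_message_id
-- Multi-word keywords such as 'the weather' are matched as a phrase.
WHERE to_tsvector('simple', 'Hi Robot!') @@ phraseto_tsquery('simple', k.keyword)
ORDER BY random()
LIMIT 1;
```

With a language-specific configuration such as 'english', words like "the" are treated as stop words and dropped, so which configuration fits depends on the keywords you store.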
To insert into the tables, you could use a DEFAULT clause with a sequence in the definition of the id columns and use INSERT ... RETURNING to get the values for the bridge table.
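One way this could be written is sketched below. The sequence names are hypothetical, and the sketch assumes the tables are still empty (otherwise the sequences would have to start above the highest existing id).

```sql
-- Give both id columns a sequence-backed default.
CREATE SEQUENCE trigger_keyword_id_seq OWNED BY trigger_keyword.id;
ALTER TABLE trigger_keyword
    ALTER COLUMN id SET DEFAULT nextval('trigger_keyword_id_seq');

CREATE SEQUENCE trigger_message_id_seq OWNED BY trigger_message.id;
ALTER TABLE trigger_message
    ALTER COLUMN id SET DEFAULT nextval('trigger_message_id_seq');

-- Insert a keyword and a message, and link them using the generated ids.
WITH new_keyword AS (
    INSERT INTO trigger_keyword (keyword)
    VALUES ('hi')
    RETURNING id
), new_message AS (
    INSERT INTO trigger_message (message)
    VALUES ('Hi how is your day?')
    RETURNING id
)
INSERT INTO trigger_keyword_trigger_message (trigger_keyword_id, trigger_message_id)
SELECT new_keyword.id, new_message.id
FROM new_keyword, new_message;
```

So yes, the rows in the bridge table are inserted explicitly; the database does not create them for you.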

Related

How to reference a SQL table if primary key has more than one column?

I am learning postgresql and I have created 2 tables: goals and results. Each table has a primary key which is formed by 3 columns:
id
valid_from_date
valid_until_date
I did this so that each row must be unique not only by its id but also depending on when the goal or result is still valid. Here is my sql code:
CREATE TABLE goals (
goal_id INT,
goal_title VARCHAR(80),
goal_description VARCHAR(300),
goal_valid_from_date TIMESTAMP,
goal_valid_until_date TIMESTAMP,
goal_deleted_flag BOOLEAN,
PRIMARY KEY (goal_id, goal_valid_from_date, goal_valid_until_date)
);
CREATE TABLE results (
result_id INT,
goal_id INT,
result_description VARCHAR(300),
result_target FLOAT,
result_timestamp DATE,
result_valid_from_date TIMESTAMP,
result_valid_until_date TIMESTAMP,
result_deleted_flag BOOLEAN,
PRIMARY KEY (result_id, result_valid_from_date, result_valid_until_date)
);
My goal now is to create a foreign key so that I am able to connect the 2 tables based on the goal_id column only and not the valid_from/until_date columns otherwise they would never match.
I tried to achieve this by using the following lines of code:
ALTER TABLE results
ADD FOREIGN KEY(goal_id)
REFERENCES goals(goal_id) on delete set null;
However I get an error:
SQL Error [42830]: ERROR: there is no unique constraint matching given
keys for referenced table "goal"
Would you be able to propose a smart and elegant way to achieve my goal?
It is very clear in the documentation:
FOREIGN KEY ( column_name [, ... ] ) REFERENCES reftable [ ( refcolumn [, ... ] ) ]
You need to supply the full list of columns (column_name).
... so that I am able to connect the 2 tables based on the goal_id column only ...
As the error message already suggests, you need to add a matching UNIQUE CONSTRAINT:
ALTER TABLE goals
ADD CONSTRAINT u_goals_goal_id
UNIQUE (goal_id)
However, this constraint is more restrictive than your actual primary key, which might not be what you want if your current design is on purpose and not by accident (see the comments of @MikeOrganek and @ErwinBrandstetter).
If it is what you want then you should consider making this your primary key instead.
For your exclusion problem (no two goals may have overlapping time periods), take a look at the example in the manual; it literally describes your case.
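Applied to the goals table, such an exclusion constraint might look like the following sketch. The constraint name is hypothetical, the btree_gist extension is assumed to be installable, and existing rows must already satisfy the rule (and have from <= until, since tsrange raises an error otherwise).

```sql
-- btree_gist lets a GiST index handle plain equality on goal_id.
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Reject two rows with the same goal_id whose validity periods overlap.
ALTER TABLE goals
    ADD CONSTRAINT goals_no_overlapping_validity
    EXCLUDE USING gist (
        goal_id WITH =,
        tsrange(goal_valid_from_date, goal_valid_until_date) WITH &&
    );
```

Note that this does not make goal_id unique on its own, so it does not by itself allow a foreign key that references only goal_id.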

Foreign key without correspondence in the referenced table

I'm trying to come up with a solution for an exercise in which I have to create 3 tables: one for employees, one for projects, and one for projects and employees, in which I have to insert the employee's ID and associate it with a project ID. In the employee_project table, since it has only 2 columns that reference other tables, I thought I could set them both as foreign keys, like this:
CREATE TABLE employee_project
(
id_employee numeric FOREIGN KEY REFERENCES employee(id_employee),
id_project numeric FOREIGN KEY REFERENCES project(id_project)
)
Then I stumbled upon a problem: when inserting values into the 3 tables, I noticed that one of the employee IDs was 4, but in the employee table there was no row with ID 4. Of course, this row wasn't created, but I want to understand: is there a way I could create a row whose ID has no matching record in the referenced table? Could it be a mistake in the question, or is there something I'm missing? Thanks in advance for your time!
If there are no rows in the employee table with id_employee value 4, then there should not be rows in the employee_project table with id_employee value 4. SQL Server will give you an error like the one below:
The INSERT statement conflicted with the FOREIGN KEY constraint "FK__employee__id__75F77EB0". The conflict occurred in database "test", table "dbo.employee", column 'id_employee'.
But if you want to create employee_project with a composite primary key on both columns, you can try this:
CREATE TABLE employee_project
(
id_employee int not null,
id_project int not null,
primary key(id_employee, id_project )
)
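The composite primary key and the two foreign keys can also be combined. The sketch below uses simplified, hypothetical definitions of employee and project (one column each) just to show the behaviour:

```sql
-- Simplified parent tables, for illustration only.
CREATE TABLE employee (id_employee int PRIMARY KEY);
CREATE TABLE project (id_project int PRIMARY KEY);

-- Link table: composite primary key plus one foreign key per column.
CREATE TABLE employee_project
(
    id_employee int NOT NULL REFERENCES employee (id_employee),
    id_project int NOT NULL REFERENCES project (id_project),
    PRIMARY KEY (id_employee, id_project)
);

INSERT INTO employee (id_employee) VALUES (1);
INSERT INTO project (id_project) VALUES (10);

-- Accepted: both parent rows exist.
INSERT INTO employee_project (id_employee, id_project) VALUES (1, 10);

-- Would be rejected with a foreign key violation, because employee 4 does not exist:
-- INSERT INTO employee_project (id_employee, id_project) VALUES (4, 10);
```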

How to reference foreign key from more than one column (Inconsistent values)

I have three tables:
The first one is emps:
create table emps (id number primary key , name nvarchar2(20));
The second one is cars:
create table cars (id number primary key , car_name varchar2(20));
The third one is accounts:
create table accounts (acc_id number primary key, woner_table nvarchar2(20) ,
woner_id number references emps(id) references cars(id));
Now I Have these values for selected tables:
Emps:
ID Name
-------------------
1 Ali
2 Ahmed
Cars:
ID Name
------------------------
107 Camery 2016
108 Ford 2012
I want to insert values into the accounts table so that its data looks like this:
Accounts:
Acc_no Woner_Table Woner_ID
------------------------------------------
11013 EMPS 1
12010 CARS 107
I tried to perform this SQL statement:
Insert into accounts (acc_id , woner_table , woner_id) values (11013,'EMPS',1);
BUT I get this error:
ERROR at line 1:
ORA-02291: integrity constraint (HR.SYS_C0016548) violated - parent key not found.
This error occurs because the value of the woner_id column doesn't exist in the cars table.
My work requires linking the tables in this way.
How can I solve this problem?
That is: how can I reference the tables as above and insert values without getting this error?
One-of relationships are tricky in SQL. With your data structure here is one possibility:
create table accounts (
acc_id number primary key,
emp_id number references emps(id),
car_id number references cars(id),
id as (coalesce(emp_id, car_id)),
woner_table as (case when emp_id is not null then 'Emps'
when car_id is not null then 'Cars'
end),
constraint chk_accounts_car_emp check (emp_id is null or car_id is null)
);
You can fetch the id in a select. However, for the insert, you need to be explicit:
Insert into accounts (acc_id , emp_id)
values (11013, 1);
Note: Earlier versions of Oracle do not support virtual columns, but you can do almost the same thing using a view.
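A rough sketch of the view-based variant, assuming accounts is created with only the acc_id, emp_id and car_id columns plus the check constraint (the view name accounts_v is hypothetical):

```sql
-- Derive the owner id and owner table name at query time
-- instead of storing them as virtual columns.
CREATE VIEW accounts_v AS
SELECT acc_id,
       emp_id,
       car_id,
       COALESCE(emp_id, car_id) AS woner_id,
       CASE WHEN emp_id IS NOT NULL THEN 'EMPS'
            WHEN car_id IS NOT NULL THEN 'CARS'
       END AS woner_table
FROM accounts;
```

Inserts still go to the accounts table itself, naming emp_id or car_id explicitly.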
Your approach should be changed such that your Account table contains two foreign key fields - one for each foreign table. Like this:
create table accounts (acc_id number primary key,
empsId number references emps(id),
carsId number references cars(id));
The easiest, most straightforward method is, as STLDeveloper says, to add additional FK columns, one for each table. This also brings with it the benefit of the database being able to enforce referential integrity.
BUT, if you choose not to do that, the next option is to use one FK column for the FK values and a second column to indicate which table the value refers to. This keeps the number of columns small (2 at most), regardless of the number of tables with FKs. But this significantly increases the programming burden for the application logic and/or PL/SQL and SQL. And, of course, you completely lose database enforcement of RI.

Enforcing existence of many-to-many and one-to-one information at the same time (PostgreSQL)

The relation "astronomers discover stars" is m:n, as an astronomer can discover many stars, and a star can likewise be discovered by many astronomers.
In any case however, there will be a single date of discovery for every star. If there are many astronomers working on it, they should do it simultaneously, otherwise, only the first one gets the credits of the discovery.
So, in PostgreSQL, we have:
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE,
discovery_date date
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now the trick question: I want to enforce that whenever there is a discovery date for a star, there will be at least one entry in the discovered table, and vice versa.
I could create a unique key (star_id, discovery_date) in table star, and then use that combination as a foreign key in table discovered instead of the star_id. That would solve half of the problem, but it would still be possible to have a discovery_date without astronomers for it.
I can't see any simple solution other than using a trigger to check, or only allowing data enter via a stored procedure that will at the same time insert a discovery_date and entries into the table discovered.
Any ideas?
Thank you!
I would just move the discovery_date column to the discovered table
CREATE TABLE astronomer (
astronomer_id serial PRIMARY KEY,
astronomer_name text NOT NULL UNIQUE
);
CREATE TABLE star (
star_id serial PRIMARY KEY,
star_name text NOT NULL UNIQUE
);
CREATE TABLE discovered (
star_id integer NOT NULL REFERENCES star,
astronomer_id integer NOT NULL REFERENCES astronomer,
discovery_date date not null,
CONSTRAINT pk_discovered PRIMARY KEY (star_id, astronomer_id)
);
Now you have the problem of multiple dates for the same star, but as you say in the question, only the first one(s) will get the credit.
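With that layout, the astronomers who get the credit could be found with a query along these lines (a sketch against the discovered table above):

```sql
-- For each star, keep only the rows carrying the earliest discovery date;
-- several astronomers sharing that date are all returned.
SELECT d.star_id, d.astronomer_id, d.discovery_date
FROM discovered AS d
WHERE d.discovery_date = (
    SELECT min(d2.discovery_date)
    FROM discovered AS d2
    WHERE d2.star_id = d.star_id
);
```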

Correct way to create a table that references variables from another table

I have these relationships:
User(uid:integer,uname:varchar), key is uid
Recipe(rid:integer,content:text), key is rid
Rating(rid:integer, uid:integer, rating:integer) , key is (uid,rid).
I built the table in the following way:
CREATE TABLE User(
uid INTEGER PRIMARY KEY ,
uname VARCHAR NOT NULL
);
CREATE TABLE Recipes(
rid INTEGER PRIMARY KEY,
content VARCHAR NOT NULL
);
Now for the Rating table: I want it to be impossible to insert a uid\rid that does not exist in User\Recipe.
My question is: which of the following is the correct way to do it? Or please suggest the correct way if none of them are correct. Moreover, I would really appreciate if someone could explain to me what is the difference between the two.
First:
CREATE TABLE Rating(
rid INTEGER,
uid INTEGER,
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid),
FOREIGN KEY (rid) REFERENCES Recipes,
FOREIGN KEY (uid) REFERENCES User
);
Second:
CREATE TABLE Rating(
rid INTEGER REFERENCES Recipes,
uid INTEGER REFERENCES User,
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid)
);
EDIT:
I think User is problematic as a name for a table so ignore the name.
Technically both versions are the same in Postgres. The docs for CREATE TABLE say so quite clearly:
There are two ways to define constraints: table constraints and column constraints. A column constraint is defined as part of a column definition. A table constraint definition is not tied to a particular column, and it can encompass more than one column. Every column constraint can also be written as a table constraint; a column constraint is only a notational convenience for use when the constraint only affects one column.
So when you have to reference a compound key a table constraint is the only way to go.
But for every other case I prefer the shortest and most concise form where I don't need to give names to stuff I'm not really interested in. So my version would be like this:
CREATE TABLE usr(
uid SERIAL PRIMARY KEY ,
uname TEXT NOT NULL
);
CREATE TABLE recipes(
rid SERIAL PRIMARY KEY,
content TEXT NOT NULL
);
CREATE TABLE rating(
rid INTEGER REFERENCES recipes,
uid INTEGER REFERENCES usr,
rating INTEGER NOT NULL CHECK (rating between 0 and 5),
PRIMARY KEY(rid,uid)
);
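For the compound-key case mentioned above, here is a small sketch of a table constraint. The rating_comment table and its columns are hypothetical and only serve to show a foreign key that spans two columns:

```sql
-- A two-column foreign key cannot be written as a column constraint,
-- so it has to be declared as a table constraint.
CREATE TABLE rating_comment(
    comment_id INTEGER PRIMARY KEY,
    rid INTEGER NOT NULL,
    uid INTEGER NOT NULL,
    body TEXT NOT NULL,
    FOREIGN KEY (rid, uid) REFERENCES rating (rid, uid)
);
```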
This is a SQL Server based solution, but the concept applies to most any RDBMS.
Like so:
CREATE TABLE Rating (
rid int NOT NULL,
uid int NOT NULL,
CONSTRAINT PK_Rating PRIMARY KEY (rid, uid)
);
ALTER TABLE Rating ADD CONSTRAINT FK_Rating_Recipes FOREIGN KEY(rid)
REFERENCES Recipes (rid);
ALTER TABLE Rating ADD CONSTRAINT FK_Rating_User FOREIGN KEY(uid)
REFERENCES User (uid);
This ensures that Rating can only contain rid values that exist in the Recipes table and uid values that exist in the User table. Please note that I didn't include the other fields you had in the Rating table; just add those.
Assume the users table has 3 users, Joe, Bob and Bill, with respective IDs 1, 2, 3, and the recipes table has cookies, chicken pot pie, and pumpkin pie, with respective IDs 1, 2, 3. Then inserting into the Rating table will only allow these values; the minute you enter 4 for a rid or a uid, SQL Server throws an error and does not commit the transaction.
Try it yourself, it's a good learning experience.
In PostgreSQL, a correct way to implement these tables is:
CREATE SEQUENCE uid_seq;
CREATE SEQUENCE rid_seq;
CREATE TABLE User(
uid INTEGER PRIMARY KEY DEFAULT nextval('uid_seq'),
uname VARCHAR NOT NULL
);
CREATE TABLE Recipes(
rid INTEGER PRIMARY KEY DEFAULT nextval('rid_seq'),
content VARCHAR NOT NULL
);
CREATE TABLE Rating(
rid INTEGER NOT NULL REFERENCES Recipes(rid),
uid INTEGER NOT NULL REFERENCES User(uid),
rating INTEGER CHECK (0<=rating and rating<=5) NOT NULL,
PRIMARY KEY(rid,uid)
);
There is no real difference between the two options that you have written.
A simple (i.e. single-column) foreign key may be declared in-line with the column declaration or not. It's merely a question of style. A third way would be to omit foreign key declarations from the CREATE TABLE entirely and add them later using ALTER TABLE statements; done in a transaction (presumably along with all the other tables, constraints, etc.), the table would never exist without its required constraints. Choose whichever you think is easiest for a human coder to read and understand, i.e. is easiest to maintain.
EDIT: I overlooked the REFERENCES clause in the second version when I wrote my original answer. The two versions are identical in terms of referential integrity; they are just two syntaxes for the same thing.
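The ALTER TABLE variant described above could be sketched as follows. It assumes PostgreSQL, where DDL is transactional; the table name app_user (chosen to avoid the reserved word User) and the constraint names are hypothetical.

```sql
BEGIN;

CREATE TABLE app_user(
    uid INTEGER PRIMARY KEY,
    uname VARCHAR NOT NULL
);
CREATE TABLE recipes(
    rid INTEGER PRIMARY KEY,
    content VARCHAR NOT NULL
);
-- Created without foreign keys at first.
CREATE TABLE rating(
    rid INTEGER NOT NULL,
    uid INTEGER NOT NULL,
    rating INTEGER NOT NULL CHECK (rating BETWEEN 0 AND 5),
    PRIMARY KEY (rid, uid)
);

-- Foreign keys added afterwards, still inside the same transaction.
ALTER TABLE rating
    ADD CONSTRAINT rating_rid_fkey FOREIGN KEY (rid) REFERENCES recipes (rid);
ALTER TABLE rating
    ADD CONSTRAINT rating_uid_fkey FOREIGN KEY (uid) REFERENCES app_user (uid);

COMMIT;
```

Because everything commits at once, no other session ever sees the rating table without its foreign keys.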