A SQL question about linking tables

I wonder why these two statements are not equivalent (the first one worked for me).
First:
AND user_thread_map.user_id = $user_id
AND user_thread_map.thread_id = threads.id
Second:
AND user_thread_map.user_id = users.id
AND user_thread_map.thread_id = threads.id
AND users.id = $user_id
Shouldn't they be the same? In the 2nd one I linked all the tables in the first 2 lines, then I tell it to select where users.id = $user_id.
Can someone explain why the 2nd statement doesn't work? Because I thought it would.

Assuming you're getting no rows returned (you don't really say what the problem is, so I'm guessing a bit here), my first thought is that there are no rows in users where id is equal to $user_id.
That's the basic difference between those two SQL segments: the second joins the user_thread_map, threads and users tables (an implicit inner join, since the WHERE conditions tie the tables together), while the first does not involve users at all. So that's where I'd be looking for the problem.
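To make that difference concrete, here is a minimal sketch using Python's sqlite3 module. The sample rows and the two function names are made up for illustration, and the tables deliberately have no foreign keys so that an orphaned mapping row (user 2) can exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE threads (id INTEGER PRIMARY KEY, thread_text TEXT);
    -- No foreign keys on purpose, so an orphaned mapping row can exist.
    CREATE TABLE user_thread_map (user_id INTEGER, thread_id INTEGER);

    INSERT INTO users VALUES (1, 'alice');
    INSERT INTO threads VALUES (10, 'first thread'), (11, 'second thread');
    -- User 2 has a mapping row but no row in users.
    INSERT INTO user_thread_map VALUES (1, 10), (2, 11);
""")


def threads_without_users_join(user_id):
    """First form: filter the mapping table directly, users is not involved."""
    return conn.execute(
        """SELECT threads.id
           FROM threads, user_thread_map
           WHERE user_thread_map.user_id = ?
             AND user_thread_map.thread_id = threads.id""",
        (user_id,),
    ).fetchall()


def threads_with_users_join(user_id):
    """Second form: also join users, so a missing users row drops the result."""
    return conn.execute(
        """SELECT threads.id
           FROM threads, user_thread_map, users
           WHERE user_thread_map.user_id = users.id
             AND user_thread_map.thread_id = threads.id
             AND users.id = ?""",
        (user_id,),
    ).fetchall()


print(threads_without_users_join(2))  # the orphaned mapping is still found
print(threads_with_users_join(2))     # nothing, because users has no id 2
```

For a user that does exist in users (id 1 here), both forms return the same rows; they only diverge for ids that are missing from users.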
It appears that your user_thread_map table is a many-to-many relationship between users and threads. If that is true, are you sure you have a foreign key constraint between the ID fields in that table to both corresponding other tables, something like:
users:
id integer primary key
name varchar(50)
threads:
id integer primary key
thread_text varchar(100)
user_thread_map:
user_id integer references users(id)
thread_id integer references threads(id)
If you have those foreign key constraints, it should be impossible to end up with a user_thread_map(user_id) value that doesn't have a corresponding users(id) value.
If those constraints aren't there, a query can tell you which values need to be fixed. Add the constraints immediately after fixing them (this is important to prevent the problem from recurring). Something like:
select user_thread_map.user_id
from user_thread_map
left join users
on user_thread_map.user_id = users.id
where users.id is null
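As a rough illustration of what that left join returns, the sketch below runs the same query shape against a small in-memory SQLite database. The sample data is invented, and DISTINCT plus ORDER BY are additions of mine so that each offending id is listed once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_thread_map (user_id INTEGER, thread_id INTEGER);

    INSERT INTO users VALUES (1, 'alice'), (3, 'carol');
    -- user_id 2 appears twice in the map but has no users row.
    INSERT INTO user_thread_map VALUES (1, 10), (2, 11), (2, 12), (3, 13);
""")

# Mapping rows whose user_id has no match in users come back with users.id NULL.
orphaned_user_ids = [
    row[0]
    for row in conn.execute(
        """SELECT DISTINCT user_thread_map.user_id
           FROM user_thread_map
           LEFT JOIN users ON user_thread_map.user_id = users.id
           WHERE users.id IS NULL
           ORDER BY user_thread_map.user_id"""
    )
]
print(orphaned_user_ids)
```

Once that list is empty (rows fixed or removed), the foreign key constraints can be added without failing on existing data.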

The first one selects records from user_thread_map with user_id = $user_id, irrespective of whether a record with that id exists in users. The second query only returns something if the related record in users is found.

Related

How to insert into a table if it violates a foreign key constraint

I am working on a script that moves data between two databases.
I am moving a table of Phone Numbers. Each Phone Number is for a user.
The problem is that each Phone Number entry references a User with a User ID. Some of these users do not exist anymore, so when I try to insert, it returns a foreign key constraint violation.
insert or update on table "phone_numbers" violates foreign key constraint "fk3843uenfej83jf32wde"
user_id = 10 is not present in table users
However, I can't go and delete each single user reference as there are thousands of references.
So what would be the best way to approach it?
Should I simply remove the foreign key constraint?
Phone numbers that belong to non existent users are termed “orphaned” data.
Either clean up orphaned data in the source data (orphaned data shouldn’t exist):
delete from phone_numbers p
where not exists (select 1 from users u where u.id = p.user_id)
Or don’t select them when exporting:
select p.*
from phone_numbers p
join users u on u.id = p.user_id
I would not remove the constraint, as doing so can affect other things (applications, reports, and so on).
So the question is: what do you need?
1. Insert all phone numbers, including the ones without users
2. Insert only phone numbers with users associated
In either case, load your data into a temporary staging table, say temp_phones, without any constraint.
In case 1, migrate the data to phone_numbers, setting user_id = null if the user no longer exists. You can do it with an "easy" query.
In case 2, migrate the data to phone_numbers only when the record's user_id is found in your users table. This can also be done with a query.
You can also perform both processes after having migrated the data. In that case you should disable or remove the constraint, update user_id according to the rules above, then recreate the constraint.
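The two cases could be sketched roughly like this, using Python's sqlite3 module rather than the original database. The phone column, the sample rows and the build_db/migrate helpers are hypothetical; the only difference between the cases is LEFT JOIN (keep the row, with a null user_id) versus JOIN (skip the row).

```python
import sqlite3


def build_db():
    """Create a target table with a foreign key and an unconstrained staging table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY);
        CREATE TABLE phone_numbers (
            id      INTEGER PRIMARY KEY,
            phone   TEXT NOT NULL,
            user_id INTEGER REFERENCES users(id)  -- nullable, needed for case 1
        );
        -- Staging table: same columns, no constraints at all.
        CREATE TABLE temp_phones (id INTEGER, phone TEXT, user_id INTEGER);

        INSERT INTO users VALUES (1), (2);
        -- user_id 10 no longer exists in users.
        INSERT INTO temp_phones VALUES
            (100, '555-0100', 1), (101, '555-0101', 10), (102, '555-0102', 2);
    """)
    return conn


def migrate(conn, keep_orphans):
    """Case 1 (keep_orphans=True) nulls missing users; case 2 skips those rows."""
    join = "LEFT JOIN" if keep_orphans else "JOIN"
    with conn:
        conn.execute(f"""
            INSERT INTO phone_numbers (id, phone, user_id)
            SELECT t.id, t.phone, u.id   -- u.id is NULL when the user is missing
            FROM temp_phones t
            {join} users u ON u.id = t.user_id
        """)
    return conn.execute(
        "SELECT id, user_id FROM phone_numbers ORDER BY id"
    ).fetchall()


case_1 = migrate(build_db(), keep_orphans=True)
case_2 = migrate(build_db(), keep_orphans=False)
print(case_1)
print(case_2)
```

Either way the foreign key stays in place on the target table, so no orphaned user_id can slip in during the load.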

How to migrate IDs from JOIN table into foreign key column in PostgreSQL

I have the following tables in my PostgreSQL database:
CREATE TABLE "User" (
id VARCHAR(25) PRIMARY KEY NOT NULL
);
CREATE TABLE "Post" (
id VARCHAR(25) PRIMARY KEY NOT NULL
);
CREATE TABLE "_PostToUser" (
"A" VARCHAR(25) NOT NULL REFERENCES "Post"(id) ON DELETE CASCADE,
"B" VARCHAR(25) NOT NULL REFERENCES "User"(id) ON DELETE CASCADE
);
The relationship between User and Post right now is managed via the _PostToUser JOIN table.
However, I want to get rid of this extra JOIN table and simply have a foreign key reference from Post to User, so I ran this query to create the foreign key:
ALTER TABLE "Post" ADD COLUMN "authorId" VARCHAR(25);
ALTER TABLE "Post"
ADD CONSTRAINT fk_author
FOREIGN KEY ("authorId")
REFERENCES "User"("id");
Now, I'm wondering what SQL query I need to run in order to migrate the data from the JOIN table to the new authorId column? If I understand correctly, I need a query that reads all the rows from the _PostToUser relation table and for each row:
Finds the respective Post record by looking up the value from column A
Inserts the value from column B as the value for authorId into that Post record
Edit: As mentioned by @Nick in the comments, I should have clarified that I indeed want to change the relationship from m-n and restrict it to 1-n: One post can at most have one author. One author/user can write many posts.
Your current design is already correct, and uses a proper junction table to store the relationships between users and their posts. In this design, a given relationship only requires storing two ID values, which is lean. Going in the direction you suggest is denormalizing your data, and will result in data duplication. To see why this is the case, your suggested table will now store metadata from the author table. This metadata will, in principle, be repetitive, since a given author's metadata would be the same for every record in the new posts table.
Instead, I suggest indexing the junction table:
CREATE INDEX idx ON "_PostToUser" ("B", "A");
As an example, the above index should help the following query:
SELECT u.*, p.*
FROM "User" u
INNER JOIN "_PostToUser" pu ON pu."B" = u.id -- index helps here
INNER JOIN "Post" p ON p.id = pu."A"; -- Post.id is already a primary key
The join to the junction table should now be faster, because Postgres can use the index to take a given user id value and find the corresponding "A" value on the other side of the junction.
As long as you are happy to restrict the relationship between Posts and Users to N:1, and you only store a foreign key to User in Post, then I think what you are doing is fine. The query to update the Post table would be:
UPDATE "Post" p
SET "authorId" = pu."B"
FROM "_PostToUser" pu
WHERE pu."A" = p."id"
Demo on dbfiddle
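Since the junction table allows several authors per post, it may be worth checking for such posts before copying anything. The following is only a sketch in Python with sqlite3, not the Postgres statement above: it uses a correlated subquery instead of UPDATE ... FROM so that it also runs on older SQLite versions, and the sample ids are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "User" (id VARCHAR(25) PRIMARY KEY NOT NULL);
    CREATE TABLE "Post" (
        id VARCHAR(25) PRIMARY KEY NOT NULL,
        "authorId" VARCHAR(25) REFERENCES "User"(id)
    );
    CREATE TABLE "_PostToUser" (
        "A" VARCHAR(25) NOT NULL REFERENCES "Post"(id) ON DELETE CASCADE,
        "B" VARCHAR(25) NOT NULL REFERENCES "User"(id) ON DELETE CASCADE
    );

    INSERT INTO "User" VALUES ('u1'), ('u2');
    INSERT INTO "Post" (id) VALUES ('p1'), ('p2'), ('p3');
    -- p3 has two authors, which a single authorId column cannot represent.
    INSERT INTO "_PostToUser" VALUES
        ('p1', 'u1'), ('p2', 'u1'), ('p3', 'u1'), ('p3', 'u2');
""")

# Posts that would lose information when moving from m-n to 1-n.
ambiguous_posts = [
    row[0]
    for row in conn.execute(
        """SELECT "A" FROM "_PostToUser"
           GROUP BY "A" HAVING COUNT(*) > 1
           ORDER BY "A" """
    )
]

# Copy the author only where exactly one junction row exists for the post.
with conn:
    conn.execute("""
        UPDATE "Post"
        SET "authorId" = (SELECT pu."B" FROM "_PostToUser" pu
                          WHERE pu."A" = "Post".id)
        WHERE (SELECT COUNT(*) FROM "_PostToUser" pu
               WHERE pu."A" = "Post".id) = 1
    """)

migrated = conn.execute('SELECT id, "authorId" FROM "Post" ORDER BY id').fetchall()
print(ambiguous_posts)
print(migrated)
```

Posts listed in ambiguous_posts are left with a null authorId here, so they can be resolved by hand before the junction table is dropped.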

Performing SQL Query to Remove Unused Users from a Database

I'm currently working with a database that consists of a users table, a permissions table, a set of documents-related tables, and several miscellaneous tables that have foreign key dependencies on rows in the user table.
I'm trying to remove all user entries from the 'Users' table that meet the following criteria:
Not referenced by an entry in one of the documents tables.
Not referenced by an entry in the permissions table.
Contains a null value in the 'Customer ID' column of the User row.
I'm able to create a query that gets all users, which looks like this:
SELECT id
INTO MyTableVar
FROM Users
WHERE
(NOT EXISTS (SELECT Author_Id FROM ItemInstances_DocumentInstance
WHERE Users.Id = ItemInstances_DocumentInstance.Author_Id)
AND NOT EXISTS (SELECT CompletedBy_Id FROM TaskInstanceUser
WHERE Users.Id = TaskInstanceUser.CompletedBy_Id)
AND Cust_Id IS NULL
AND Id > 4)
SELECT *
FROM MyTableVar
This query gets all of Id's of users that I want to remove, but I get an error when I try to delete these entries
The DELETE statement conflicted with the REFERENCE constraint "FK_MessageUser_User".
I'm stumped as to how I should use the ID's I've queried to remove entries in the MessageUser_User table that correspond to users I want to delete. I feel like this should be easy, but I can't figure out a way to do it with SQL syntax.
PS: I'd also appreciate some feedback on how I wrote what I have so far for my query. I'd love to know what I could do to make it cleaner. I'm new to SQL and need all the help I can get.
I'm guessing that the foreign key on the referencing table was not declared with ON DELETE CASCADE.
If you have the ability to alter constraints on your table, you can do this, which will permit the referencing table to automatically delete records that reference a deleted row from the main table.
ALTER TABLE MessageUser_User DROP
CONSTRAINT FK_MessageUser_User;
ALTER TABLE MessageUser_User ADD
CONSTRAINT FK_MessageUser_User
FOREIGN KEY (<<IdColumnName>>)
REFERENCES Users (Id)
ON DELETE CASCADE;
Otherwise, you can use a separate query to delete the rows from MessageUser_User whose foreign key column contains the IDs you want to delete:
DELETE FROM MessageUser_User WHERE <<IdColumnName>> IN (SELECT Id FROM MyTableVar);
Regarding the style of your delete query - I usually prefer to do left joins then delete the records where there is a null in the right table(s):
SELECT Users.Id
INTO MyTableVar
FROM Users
LEFT JOIN ItemInstances_DocumentInstance ON ItemInstances_DocumentInstance.Author_Id = Users.Id
LEFT JOIN TaskInstanceUser ON TaskInstanceUser.CompletedBy_Id = Users.Id
WHERE
ItemInstances_DocumentInstance.Author_Id IS NULL
AND TaskInstanceUser.CompletedBy_Id IS NULL
AND Users.Cust_Id IS NULL
AND Users.Id > 4
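The order of the deletes matters when there is no cascade: referencing rows first, then the users. Below is a small sketch of that ordering in Python with sqlite3 (not SQL Server). The User_Id column name is hypothetical, since the real foreign key column of MessageUser_User is not given, and ids_to_delete stands in for the contents of MyTableVar.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, Cust_Id INTEGER);
    CREATE TABLE MessageUser_User (
        Message_Id INTEGER,
        User_Id    INTEGER REFERENCES Users(Id)  -- hypothetical column name
    );
    INSERT INTO Users VALUES (5, NULL), (6, 42);
    INSERT INTO MessageUser_User VALUES (1, 5), (2, 6);
""")

ids_to_delete = [5]  # stands in for the ids collected in MyTableVar
placeholders = ",".join("?" for _ in ids_to_delete)

# Deleting the parent row first fails while a child row still references it.
try:
    conn.execute(f"DELETE FROM Users WHERE Id IN ({placeholders})", ids_to_delete)
    parent_first_failed = False
except sqlite3.IntegrityError:
    parent_first_failed = True
conn.rollback()

# Delete the referencing rows first, then the users, in one transaction.
with conn:
    conn.execute(
        f"DELETE FROM MessageUser_User WHERE User_Id IN ({placeholders})",
        ids_to_delete,
    )
    conn.execute(f"DELETE FROM Users WHERE Id IN ({placeholders})", ids_to_delete)

remaining_users = conn.execute("SELECT Id FROM Users ORDER BY Id").fetchall()
remaining_links = conn.execute(
    "SELECT Message_Id, User_Id FROM MessageUser_User ORDER BY Message_Id"
).fetchall()
print(parent_first_failed, remaining_users, remaining_links)
```

Wrapping both deletes in one transaction means a failure in the second leaves the first undone as well.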

Is it possible to create a foreign key constraint using "NOT IN" logic

Is it possible to add a foreign key constraint on a table which will allow values which do NOT exist in another table?
In the example below, both tables contain the field USER_ID. The constraint is that a customer and an employee cannot have the same USER_ID value.
I am very limited in adding new tables or altering one of the tables in any way.
CUSTOMER
--------------------------
USER_ID varchar2(10)
EMPLOYEE
--------------------------
USER_ID varchar2(10)
I thought of a few workarounds, such as a view which contains the data from both tables or adding an insert trigger on the table I can modify.
No, no such thing exists, though it is possible to fake.
If you want to do this relationally (which would be a lot better than views/triggers) the easy option is to add an E to all employee IDs and a C to all customer IDs. However, this won't work if you have other attributes and you want to ensure they're not the same person (i.e. you're not just interested in the ID).
If this is the case you need to create a third table, let's call it PEOPLE:
create table people (
user_id varchar2(10) not null
, user_type varchar2(1) not null
, constraint pk_people primary key (user_id)
, constraint chk_people_user_types check ( user_type in ('C','E') )
);
C would stand for customer and E for employee in the check constraint. You then need to create a unique index/constraint on PEOPLE:
create unique index ui_people_id_type on people ( user_id, user_type );
Personally, I'd stop here and completely drop your CUSTOMER and EMPLOYEE tables; they no longer have any use and your problem has been solved.
If you don't have the ability to add new columns/tables you need to speak to the people who do and convince them to change it. Over-complicating things only leads to errors in logic and confusion (believe me - using a view means you need a lot of triggers to maintain your tables and you'll have to ensure that someone only ever updates the view). It's a lot easier to do things properly, even if they take longer.
However, if you really want to continue, you can alter your CUSTOMER and EMPLOYEE tables to include their USER_TYPE and ensure that it's always the same for every row in the table, i.e.:
alter table customer add user_type varchar2(1) default 'C' not null;
alter table customer add constraint chk_customers_type
check ( user_type is not null and user_type = 'C' );
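One way these pieces could fit together, and this is my reading rather than something the answer spells out, is a composite foreign key from each table to PEOPLE (user_id, user_type). Because user_id is the primary key of PEOPLE, an id can carry only one type, so it can be a customer or an employee but not both. A sketch in Python with sqlite3 (SQLite syntax, not Oracle; the try_insert helper is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE people (
        user_id   TEXT NOT NULL PRIMARY KEY,
        user_type TEXT NOT NULL CHECK (user_type IN ('C', 'E')),
        UNIQUE (user_id, user_type)  -- target for the composite foreign keys
    );
    CREATE TABLE customer (
        user_id   TEXT NOT NULL PRIMARY KEY,
        user_type TEXT NOT NULL DEFAULT 'C' CHECK (user_type = 'C'),
        FOREIGN KEY (user_id, user_type) REFERENCES people (user_id, user_type)
    );
    CREATE TABLE employee (
        user_id   TEXT NOT NULL PRIMARY KEY,
        user_type TEXT NOT NULL DEFAULT 'E' CHECK (user_type = 'E'),
        FOREIGN KEY (user_id, user_type) REFERENCES people (user_id, user_type)
    );
""")


def try_insert(sql, params):
    """Run one insert; return False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


person_ok = try_insert("INSERT INTO people VALUES (?, ?)", ("u1", "C"))
customer_ok = try_insert("INSERT INTO customer (user_id) VALUES (?)", ("u1",))
# ('u1', 'E') does not exist in people, so u1 cannot also be an employee.
employee_ok = try_insert("INSERT INTO employee (user_id) VALUES (?)", ("u1",))
# The primary key on people also stops u1 from being registered a second time as 'E'.
second_type_ok = try_insert("INSERT INTO people VALUES (?, ?)", ("u1", "E"))
print(person_ok, customer_ok, employee_ok, second_type_ok)
```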
Unless you are willing to change the data model as someone else has suggested, the simplest way to proceed with the existing structure while maintaining mutual exclusion is to issue check constraints on the user_ids of both tables such that they validate only to mutually exclusive series.
For example, you could issue checks to ensure that only even numbers are assigned to customers and odd numbers to employees ( or vice-versa).
Or, since both IDs are varchar, stipulate using your check constraint that the ID begins with a known substring, such as 'EMP' or 'CUST'.
But these are only tricks and are hardly relational. Ideally, one would revise the data model. Hope this helps.

primary key constraint error when doing an update

I am having a bit of a challenge doing a mass update in SQL Server. I have a table with a composite primary key (studentid, login, pass). The students in a group share a common login and password. I'm running an update that sets their login and pass to new values (again the same for everyone) where the class group field = x, but I get a duplicate primary key violation error. Any idea why?
Thanks
Given that you have a weird primary key, it's easy to see how it can be violated. For example, say student 1 has two rows in the table:
studentid login pass group
1 bobama reallyborninkenia politician
1 bobama2 raisetaxes politician
And say you update the group politician:
update StudentTable
set login = 'bobama'
, pass = 'justpwndromney'
where [group] = 'politician'
Then you'd get a primary key violation, since there would be two rows with the same (studentid, login, pass) combination.
If that is weird, that's because your primary key is weird. I'd expect the primary key to be just (studentid).
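The same failure can be reproduced on a toy table. This is a sketch in Python with sqlite3 rather than SQL Server, with invented logins; it also shows one way to spot the students who would cause the clash before running the update, namely any studentid with more than one row in the group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE StudentTable (
        studentid INTEGER NOT NULL,
        login     TEXT NOT NULL,
        pass      TEXT NOT NULL,
        "group"   TEXT,
        PRIMARY KEY (studentid, login, pass)
    );
    -- Student 1 has two rows in the same group; student 2 has one.
    INSERT INTO StudentTable VALUES
        (1, 'login_a', 'pass_a', 'g1'),
        (1, 'login_b', 'pass_b', 'g1'),
        (2, 'login_a', 'pass_a', 'g1');
""")

# Students with several rows in the group will collide once login/pass are equal.
clashing_students = [
    row[0]
    for row in conn.execute(
        """SELECT studentid FROM StudentTable
           WHERE "group" = ?
           GROUP BY studentid HAVING COUNT(*) > 1""",
        ("g1",),
    )
]

try:
    with conn:
        conn.execute(
            """UPDATE StudentTable SET login = ?, pass = ? WHERE "group" = ?""",
            ("new_login", "new_pass", "g1"),
        )
    update_succeeded = True
except sqlite3.IntegrityError:
    update_succeeded = False

print(clashing_students, update_succeeded)
```

This check only covers clashes inside the group being updated; a row outside the group that already holds the new login and pass for the same studentid would collide too.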
Well, clearly there is a clash: you are creating a studentid/login/password combination that already exists in the table. This query should show you the existing rows that clash with your proposed changes:
select t.* from [your-table] t
join (select * from [your-table] where [class-group] = x) proposed_change
on proposed_change.[studentid] = t.[studentid]
where t.login = 'proposed-login' and t.pass = 'proposed-password'