Join Three Tables and Use Count on One of Them - sql

I'm trying to improve my querying abilities and I have been having trouble wrapping my head around joins of moderate complexity. To be as clear and concise as possible, I am trying to join 3 tables. The first join selects posts from all users on users.User_ID=posts.FK_User_ID
**User table**
User_ID pk
username int
email
etc...
**Post Table**
post_ID PK
User_ID FK
Post
etc...
**Like Table**
FK_User_ID references user.user_ID
FK_Post_ID references post.Post_ID (*This is what I want to count*)
after this I want to reference a third table. This table contains a Foreign key to user_ID of user table and a foreign key of FK_Post_ID referencing the primary key Post_ID in the post table. This third table is a linking table of users who have liked the post. I want to count all occurrences of a post ID in this table and append it to each post in the initial user and post join so an output result would look like this:
User_id Username Post_ID Post Number_of_Likes
1 bob 4 'foo' 18
my first join between the two tables works and looks like this (simplified with * for example)
select * from users
join post
on post.User_ID=users.User_ID
Now I need a way to reference the third table to count the total # of times that a post id appears in the like table and append it to each row. This is where I am lost, I have been trying a lot of things to no luck. I believe I need to construct an inner join clause for my second join or I need to pull off a nested select statement? Could someone correct me on this if I am wrong and perhaps guide me in the right direct? Appreciate it!

A common way to do this is make a sub-query that has counts and key then join to that. Like this:
select *
from users
join post on post.FK_User_ID=users.User_ID
left join (
select FK_Post_ID, count(*) as count_of_likes_on_a_post
from likestable
group by FK_Post_ID
) likes on post.Post_ID = likes.FK_Post_ID

Related

Does row-level security (RLS) apply on a join statement?

Given the following (simplified) tables:
users
-----
id (pk)
posts
-----
id (pk)
user_id (fk)
likes
-----
user_id (pk)
post_id (pk)
If I run the following query to get which posts a user (?) liked:
SELECT *
FROM posts p
INNER JOIN likes l
ON l.post_id = p.id
WHERE l.user_id = ?
Would the RLS policy for SELECT on the likes table and posts table both be invoked, or would it only apply to the posts table since that is where we are SELECTing FROM? I'm under the assumptiom that it would apply for both tables, but just wanted to double check and make sure. I'm using PostgreSQL if that makes any difference.
Thanks for any help!
If a user doesn't have permission to see rows in a table, that applies to the entire query, not just to the columns being returned.
Basically, the table (or rows) will be invisible to the user.

Access: Updatable join query with 2 primary key fields that are both also foreign keys

In MS Access, I am trying to implement a many-to-many table that will store 2-way relationships, similar to Association between two entries in SQL table. This table stores info such as "Person A and Person B are coworkers" "C and D are friends", etc. The table is like this:
ConstitRelationships
LeftId (number, primary key, foreign key to Constituents.ConstitId)
RightId (number, primary key, foreign key to Constituents.ConstitId)
Description (text)
Note that the primary key is a composite of the two Id fields.
Also the table has constraints:
[LeftId]<>[RightId] AND [LeftId]<[RightId]
The table is working ok in my Access project, except that I cannot figure out how to make an updateable query that I want to use as a datasheet subform so users can easily add/delete records and change the descriptions. I currently have a non-updatable query:
SELECT Constituents.ConstituentId, Constituents.FirstName,
Constituents.MiddleName, Constituents.LastName,
ConstitRelationships.Description, ConstitRelationships.LeftId,
ConstitRelationships.RightId
FROM ConstitRelationships INNER JOIN Constituents ON
(Constituents.ConstituentId =
ConstitRelationships.RightId) OR (Constituents.ConstituentId =
ConstitRelationships.LeftId);
If I ignore the possibility that the constituentId I want is in the leftId column, I can do this, which is updatable. So the OR condition in the inner join above is what's messing it up.
SELECT Constituents.ConstituentId, Constituents.FirstName,
Constituents.MiddleName, Constituents.LastName,
ConstitRelationships.Description, ConstitRelationships.LeftId,
ConstitRelationships.RightId
FROM ConstitRelationships INNER JOIN Constituents ON
(Constituents.ConstituentId =
ConstitRelationships.RightId) ;
I also tried this wacky iif thing to collapse the two LeftId and RightId fields into FriendId, but it was not updateable either.
SELECT Constituents.ConstituentId, Constituents.FirstName,
Constituents.MiddleName,
Constituents.LastName, subQ.Description
FROM Constituents
INNER JOIN (
SELECT Description, Iif([Forms]![Constituents Form]![ConstituentId] <>
ConstitRelationships.LeftId, ConstitRelationships.LeftId,
ConstitRelationships.RightId) AS FriendId
FROM ConstitRelationships
WHERE ([Forms]![Constituents Form]![ConstituentId] =
ConstitRelationships.RightId)
OR ([Forms]![Constituents Form]![ConstituentId] =
ConstitRelationships.LeftId)
) subQ
ON (subQ.FriendId = Constituents.ConstituentId)
;
How can I make an updatable query on ConstitRelationships, including a JOIN with the Constituent.FirstName MiddleName LastName fields?
I am afraid that is not possible. Because you use joins in your query over three tables it is not updatable. There is no way around this.
Here some detailed information about the topic: http://www.fmsinc.com/Microsoftaccess/query/non-updateable/index.html
As mentioned in the linked article one possible solution and in my opinion best solution for you would be the temporary table. It is a load of work compared to the easy "bind-form-to-a-query"-approach but it works best.
The alternative would be to alter your datascheme in that way that you do not need joins. But then denormalized data and duplicates would go rampage which makes the temporary table a favorable choice.

Inserting Into and Maintaining Many-to-Many Tables

SQLite3 user.
I have read thru numerous books on relational DBs and SQL and not one shows how to maintain the linking tables for many-to-many relationships. I just went through a book that went into the details of SELECT and JOINS with examples, but then glosses over the same when many-to-many relationships are covered. The author just showed some pseudo code for a table, no data, and then a pseudo code query--WTF? I am probably missing something, but it has become quite maddening.
Anyways, say I have a table like [People] with 3 columns: pID (primary), name and age. A table [Groups] with 3 columns: gID (primary), groupname and years. Since people can belong to multiple groups and groups can have multiple people, I set up a linking table called [peoplegroups] with two columns: pID, and gID both of which come from their respective tables.
So ,how do I efficiently get data into the linking table when INSERTING on the others and how do I get data out using the linking table?
Example: I want to INSERT "Jane" into [people] and make her a member of group gID 2, "bowlers" and update the linking table {peoplegroups] at the same time.
Later I want to go back and pull out a list of all of the bowlers or all the groups a person is part of.
If you already don't use primary and foreign keys (which you should!) I think you may need to consider using triggers in your design as well? So if you have a specific set of rules (e.g. if you want to create Jane with id = 1 and choose existing group 2, then after insert jane into people automatically create an entry pair personid=1,groupid=2 in the table peoplegroups. You can also create views with specific selects to see the data you want, for example if you want a query where you only show the peoples names and groups names you could create a view 'PeopleView':
SELECT P.PersonName, G.GroupName
FROM People P
INNER JOIN PeopleGroup PG ON P.PersonID = PG.PersonID
INNER JOIN Group G ON G.GroupId = PG.GroupID
then you can query 'PeopleView' saying
SELECT * FROM PeopleView WHERE GroupName = 'bowlers'
When inserting new data into the tables mentioned, the "linking" table that you are referring to needs to contain both primary keys from the other tables as foreign keys. So basically The [People] tables (pID) and the [Groups] table (gID) should both be foreign keys in the [PeopleGroups] table. In order to create a new "link" in [PeopleGroups] the record has to already exist in the [People] table as well as the [Groups] table BEFORE you try and create the link in the [PeopleGroups] table. I hope this helps

Design : multiple visits per patient

Above is my schema. What you can't see in tblPatientVisits is the foreign key from tblPatient, which is patientid.
tblPatient contains a distinct copies of each patient in the dataset as well as their gender. tblPatientVists contains their demographic information, where they lived at time of admission and which hospital they went to. I chose to put that information into a separate table because it changes throughout the data (a person can move from one visit to the next and go to a different hospital).
I don't get any strange numbers with my queries until I add tblPatientVisits. There are just under one millions claims in tblClaims, but when I add tblPatientVisits so I can check out where that person was from, it returns over million. I thinkthis is due to the fact that in tblPatientVisits the same patientID shows up more than once (due to the fact that they had different admission/dischargedates).
For the life of me I can't see where this is incorrect design, nor do I know how to rectify it beyond doing one query with count(tblPatientVisits.PatientID=1 and then union with count(tblPatientVisits.patientid)>1.
Any insight into this type of design, or how I might more elegantly find a way to get the claimType from tblClaims to give me the correct number of rows with I associate a claim ID with a patientID?
EDIT: The biggest problem I'm having is the fact that if I include the admissionDate,dischargeDate or the patientStatein the tblPatient table I can't use the patientID as a primary key.
It should be noted that tblClaims are NOT necessarily related to tblPatientVisits.admissionDate, tblPatientVisits.dischargeDate.
EDIT: sample queries to show that when tblPatientVisits is added, more rows are returned than claims
SELECT tblclaims.id, tblClaims.claimType
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID INNER JOIN
tblPatientVisits ON tblPatient.patientID = tblPatientVisits.patientID
more than one million query rows returned
SELECT tblClaims.id, tblPatient.patientID
FROM tblClaims INNER JOIN
tblPatientClaims ON tblClaims.id = tblPatientClaims.id INNER JOIN
tblPatient ON tblPatientClaims.patientid = tblPatient.patientID
less than one million query rows returned
I think this is crying for a better design. I really think that a visit should be associated with a claim, and that a claim can only be associated with a single patient, so I think the design should be (and eliminating the needless tbl prefix, which is just clutter):
CREATE TABLE dbo.Patients
(
PatientID INT PRIMARY KEY
-- , ... other columns ...
);
CREATE TABLE dbo.Claims
(
ClaimID INT PRIMARY KEY,
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID)
-- , ... other columns ...
);
CREATE TABLE dbo.PatientVisits
(
PatientID INT NOT NULL FOREIGN KEY
REFERENCES dbo.Patients(PatientID),
ClaimID INT NULL FOREIGN KEY
REFERENCES dbo.Claims(ClaimID),
VisitDate DATE
, -- ... other columns ...
, PRIMARY KEY (PatientID, ClaimID, VisitDate) -- not convinced on this one
);
There is some redundant information here, but it's not clear from your model whether a patient can have a visit that is not associated with a specific claim, or even whether you know that a visit belongs to a specific claim (this seems like crucial information given the type of query you're after).
In any case, given your current model, one query you might try is:
SELECT c.id, c.claimType
FROM dbo.tblClaims AS c
INNER JOIN dbo.tblPatientClaims AS pc
ON c.id = pc.id
INNER JOIN dbo.tblPatient AS p
ON pc.patientid = p.patientID
-- where exists tells SQL server you don't care how many
-- visits took place, as long as there was at least one:
WHERE EXISTS (SELECT 1 FROM dbo.tblPatientVisits AS pv
WHERE pv.patientID = p.patientID);
This will still return one row for every patient / claim combination, but it should only return one row per patient / visit combination. Again, it really feels like the design isn't right here. You should also get in the habit of using table aliases - they make your query much easier to read, especially if you insist on the messy tbl prefix. You should also always use the dbo (or whatever schema you use) prefix when creating and referencing objects.
I'm not sure I understand the concept of a claim but I suspect you want to remove the link table between claims and patient and instead make the association between patient visit and a claim.
Would that work out better for you?

Unexpected results after joining another table

I use three tables to get to the final result. They are called project_board_members, users and project_team.
This is the query:
SELECT `project_board_members`.`member_id`,
`users`.`name`,
`users`.`surname`,
`users`.`country`,
`project_team`.`tasks_completed`
FROM `project_board_members`
JOIN `users`
ON (`users`.`id` = `project_board_members`.`member_id`)
JOIN `project_team`
ON (`project_team`.`user_id` = `project_board_members`.`member_id`)
WHERE `project_board_members`.`project_id` = '5'
You can ignore last line because it just points to the project I'm using.
Table project_board_members holds three entries and have structure like:
id,
member_id,
project_id,
created_at;
I need to get member_id from that table. Then I join to users table to get name, surname and country. No problems. All works! :)
After that, I needed to get tasks_completed for each user. That is stored in project_team table. The big unexpected thing is that I got four entries returned and the big what-the-f*ck is that in the project_board_members table are only three entries.
Why is that so? Thanks in advice!
A SQL join creates a result set that contains one row for each combination of the left and right tables that matches the join conditions. Without seeing the data or a little more information it's hard to say what exactly is wrong from what you expect, but I'm guessing it's one of the following:
1) You have two entries in project_team with the same user_id.
2) Your entries in project_team store both user_id and project_id and you need to be joining on both of them rather than just user_id.
The table project_board_members represent what is called in the Entity-Relationship modelling world an "associative entity". It exists to implement a many-to-many relationship (in this case, between the project and user entities. As such it is a dependent entity, which is to say that the existence of an instance of it is predicated on the existence of an instance of each of the entities to which it refers (a user and a project).
As a result, the columnns comprising the foreign keys relating to those entities (member_id and project_id) must be form part or all of the primary key.
Normally, instances of an associative entity are unique WRT the entities to which it relates. In your case the relationship definitions would be:
Each user is seated on the board of 0-to-many projects;
Each project's board is comprise of 0-to-many users
which is to say that a particular user may not be on the board of a particular project more than once. The only reason for adding other columns (such as your id column) to the primary key would be if the user:project relationship is non-unique.
To enforce this rule -- a user may sit on the board a particular project just once -- the table schema should look like this:
create table project_board_member
(
member_id int not null foreign key references user ( user_id ) ,
project_Id int not null foreign key references project ( project_id ) ,
created_at ...
...
primary key ( member_id , project_id ) ,
)
}
The id column is superfluous.
For debugging purposes do
SELECT GROUP_CONCAT(pbm.member_id) AS member_ids,
GROUP_CONCAT(u.name) as names,
GROUP_CONCAT(u.surname) as surnames,
GROUP_CONCAT(u.country) as countries,
GROUP_CONCAT(pt.tasks_completed) as tasks
FROM project_board_members pbm
JOIN users u
ON (u.id = pbm.member_id)
JOIN project_team pt
ON (pt.user_id = pbm.member_id)
WHERE pbm.project_id = '5'
GROUP BY pbm.member_id
All the fields that list multiple entries in the result are messing up the rowcount in your resultset.
To Fix that you can do:
SELECT pbm.member_id
u.name,
u.surname,
u.country,
pt.tasks_completed
FROM (SELECT
p.project_id, p.member_id
FROM project_board_members p
WHERE p.project_id = '5'
LIMIT 1
) AS pbm
JOIN users u
ON (u.id = pbm.member_id)
JOIN project_team pt
ON (pt.user_id = pbm.member_id)