Join Lookup from 1 table to multiple columns - sql

How do I link one table to multiple columns in another table without using multiple JOINs?
Below is my scenario:
I have table User with ID and Name
User
+---------+------------+
| Id | Name |
+---------+------------+
| 1 | John |
| 2 | Mike |
| 3 | Charles |
+---------+------------+
And table Product has multiple columns, but focus on just two of them, CreateBy and ModifiedBy:
+------------+-----------+-------------+
| product_id | CreateBy | ModifiedBy |
+------------+-----------+-------------+
| 1 | 1 | 3 |
| 2 | 1 | 3 |
| 3 | 2 | 3 |
| 4 | 2 | 1 |
| 5 | 2 | 3 |
+------------+-----------+-------------+
With a normal JOIN, I need two joins:
SELECT p.product_id,
u1.Name AS CreateByName,
u2.Name AS ModifiedByName
FROM Product p
JOIN [User] u1 ON p.CreateBy = u1.Id
JOIN [User] u2 ON p.ModifiedBy = u2.Id
to produce this result:
+------------+---------------+-----------------+
| product_id | CreateByName | ModifiedByName |
+------------+---------------+-----------------+
| 1 | John | Charles |
| 2 | John | Charles |
| 3 | Mike | Charles |
| 4 | Mike | John |
| 5 | Mike | Charles |
+------------+---------------+-----------------+
How do I avoid joining twice?
I'm using MS-SQL, but I'm open to answers in any SQL dialect, for my own learning.

Your current design/approach is acceptable, I think, and the need for two joins is a function of there being two user ID columns. Each of the two columns requires a separate join.
For fun, here is a table design which you may consider if you really want to have to perform only one join:
+------------+-----------+-------------+
| product_id | user_id | type |
+------------+-----------+-------------+
| 1 | 1 | created |
| 2 | 1 | created |
| 3 | 2 | created |
| 4 | 2 | created |
| 5 | 2 | created |
| 1 | 3 | modified |
| 2 | 3 | modified |
| 3 | 3 | modified |
| 4 | 1 | modified |
| 5 | 3 | modified |
+------------+-----------+-------------+
Now you can get away with just a single join followed by an aggregation:
SELECT
p.product_id,
MAX(CASE WHEN p.type = 'created' THEN u.Name END) AS CreateByName,
MAX(CASE WHEN p.type = 'modified' THEN u.Name END) AS ModifiedByName
FROM Product p
INNER JOIN user u
ON p.user_id = u.Id
GROUP BY
p.product_id;
Note that I don't recommend this approach at all. It is much cleaner to use your current approach and use two joins. Joins can fairly easily be optimized using one or more indices. The above aggregation approach would probably not perform as well as what you already have.
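As a rough illustration of the single-join design above, the sketch below loads the sample rows into an in-memory SQLite database (standing in for MS-SQL) and runs the pivot query. The table names users and product_user are assumptions made for this example; the answer does not name the redesigned table.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption: not MS-SQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO users VALUES (1, 'John'), (2, 'Mike'), (3, 'Charles');

-- Hypothetical name for the redesigned table: one row per (product, user, role).
CREATE TABLE product_user (product_id INTEGER, user_id INTEGER, type TEXT);
INSERT INTO product_user VALUES
  (1, 1, 'created'),  (2, 1, 'created'),  (3, 2, 'created'),
  (4, 2, 'created'),  (5, 2, 'created'),
  (1, 3, 'modified'), (2, 3, 'modified'), (3, 3, 'modified'),
  (4, 1, 'modified'), (5, 3, 'modified');
""")

# One join, then conditional aggregation to pivot the two roles into columns.
pivot_rows = conn.execute("""
SELECT p.product_id,
       MAX(CASE WHEN p.type = 'created'  THEN u.name END) AS CreateByName,
       MAX(CASE WHEN p.type = 'modified' THEN u.name END) AS ModifiedByName
FROM product_user p
INNER JOIN users u ON p.user_id = u.id
GROUP BY p.product_id
ORDER BY p.product_id
""").fetchall()

for row in pivot_rows:
    print(row)
```

The output matches the two-join result from the question, which is the point: the single join buys nothing except an extra GROUP BY.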

If you use natural keys instead of surrogates, you won't need to join at all.
I don't know how you tell your products apart in the real world, but for the example I will assume you have a UPC
CREATE TABLE User
(Name VARCHAR(20) PRIMARY KEY);
CREATE TABLE Product
(UPC CHAR(12) PRIMARY KEY,
CreatedBy VARCHAR(20) REFERENCES User(Name),
ModifiedBy VARCHAR(20) REFERENCES User(Name)
);
Now your query is a simple select, and you also enforce uniqueness of your user names as a bonus, and don't need additional indexes.
Try it...
HTH

A join is the best approach, but if you are looking for an alternative, you can use inline (correlated) subqueries.
SELECT P.PRODUCT_ID,
(SELECT [NAME] FROM #USER WHERE ID = CREATED_BY) AS CREATED_BY,
(SELECT [NAME] FROM #USER WHERE ID = MODIFIED_BY) AS MODIFIED_BY
FROM #PRODUCT P
DEMO
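To compare the inline-subquery form with the original two joins, here is a minimal sketch using an in-memory SQLite database in place of MS-SQL. The table names users and product are assumptions for the example; the column names follow the question.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption: not MS-SQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO users VALUES (1, 'John'), (2, 'Mike'), (3, 'Charles');

CREATE TABLE product (product_id INTEGER PRIMARY KEY, CreateBy INTEGER, ModifiedBy INTEGER);
INSERT INTO product VALUES (1, 1, 3), (2, 1, 3), (3, 2, 3), (4, 2, 1), (5, 2, 3);
""")

# The question's approach: join the lookup table once per user column.
join_rows = conn.execute("""
SELECT p.product_id, u1.name, u2.name
FROM product p
JOIN users u1 ON p.CreateBy = u1.id
JOIN users u2 ON p.ModifiedBy = u2.id
ORDER BY p.product_id
""").fetchall()

# The alternative: one scalar subquery per user column, no explicit JOIN.
subquery_rows = conn.execute("""
SELECT p.product_id,
       (SELECT name FROM users WHERE id = p.CreateBy)   AS CreateByName,
       (SELECT name FROM users WHERE id = p.ModifiedBy) AS ModifiedByName
FROM product p
ORDER BY p.product_id
""").fetchall()

print(join_rows == subquery_rows)
for row in subquery_rows:
    print(row)
```

The two forms only differ when a user id has no match: the inner joins drop that product row, while the scalar subqueries keep it with a NULL name.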

Related

SQL: How to find rows in one table that have no references to rows in another tables?

I have three tables: users, rooms, room_users.
A user can have many rooms and a room can have many users, so this is a many-to-many relationship.
users table:
+----+-----------+-----+
| id | name | age |
+----+-----------+-----+
| 1 | Christian | 19 |
| 2 | Ben | 36 |
| 3 | Robert | 52 |
| 4 | Monica | 25 |
| 5 | Alice | 26 |
| 6 | William | 18 |
+----+-----------+-----+
rooms table:
+----+----------+
| id | name |
+----+----------+
| 1 | College |
| 2 | Work |
| 3 | Football |
+----+----------+
And room_users table that represents relationship between users and rooms:
+---------+---------+
| user_id | room_id |
+---------+---------+
| 1 | 1 |
| 1 | 3 |
| 2 | 2 |
| 4 | 1 |
| 5 | 2 |
| 6 | 1 |
| 6 | 3 |
+---------+---------+
So, having these tables we can say that:
Christian(1) belongs to College(1) and Football(3) rooms.
Ben(2) belongs to Work(2) room.
Robert(3) does not belong to any room.
Monica(4) belongs to College(1) room.
Alice(5) belongs to Work(2) room.
William(6) belongs to College(1) and Football(3) rooms.
And now if I want to find the users (ids) that do belong to the Football room, I can use this query:
SELECT user_id FROM room_users WHERE room_id = 3
Output for this query:
+---------+
| user_id |
+---------+
| 1 |
| 6 |
+---------+
This is correct: only Christian(1) and William(6) belong to the Football room.
But how do I find the users that do NOT belong to the Football room?
In this case, the query must return ids 2, 3, 4 and 5. That is, all ids except those returned by the first query.
Is it possible to do this using a LEFT JOIN?
As far as I know, that is more efficient than using sub-queries.
Thanks in advance!
EDIT:
I've found a query that solves the problem, but it is VERY SLOW on a large database:
SELECT users.id FROM users WHERE 0=(SELECT COUNT(*) FROM room_users WHERE user_id=users.id AND room_id=3);
Without correlated behavior, try something like this:
SELECT u.*
FROM users AS u
LEFT JOIN (
SELECT DISTINCT user_id FROM room_users WHERE room_id = 3
) AS v
ON v.user_id = u.id
WHERE v.user_id IS NULL
;
For performance issues, start by reviewing the explain/execution plan and use of indexes.
You could find the users that belong to the Football room and then exclude them using NOT IN (a JOIN also works):
SELECT u.*
FROM
users u
WHERE u.id NOT IN
(SELECT user_id FROM room_users WHERE room_id=3)
You are correct that this is possible to do with a left join.
SELECT
u.id
FROM
users u
LEFT JOIN room_users ur
ON u.id = ur.user_id
AND ur.room_id = 3
WHERE
ur.room_id is null;
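The anti-join answers above can be checked against the sample data with a short sketch like the following. It uses an in-memory SQLite database as a stand-in for whatever engine the question targets (an assumption), and the rooms table is omitted because none of the queries read it.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
INSERT INTO users VALUES
  (1, 'Christian', 19), (2, 'Ben', 36), (3, 'Robert', 52),
  (4, 'Monica', 25), (5, 'Alice', 26), (6, 'William', 18);

CREATE TABLE room_users (user_id INTEGER, room_id INTEGER);
INSERT INTO room_users VALUES
  (1, 1), (1, 3), (2, 2), (4, 1), (5, 2), (6, 1), (6, 3);
""")

# LEFT JOIN anti-join: the room filter lives in the ON clause, so users
# without a Football row survive the join with NULLs and pass the WHERE test.
left_join_ids = [r[0] for r in conn.execute("""
SELECT u.id
FROM users u
LEFT JOIN room_users ur
  ON u.id = ur.user_id AND ur.room_id = 3
WHERE ur.room_id IS NULL
ORDER BY u.id
""")]

# NOT IN form: exclude every id returned by the Football-room subquery.
not_in_ids = [r[0] for r in conn.execute("""
SELECT u.id
FROM users u
WHERE u.id NOT IN (SELECT user_id FROM room_users WHERE room_id = 3)
ORDER BY u.id
""")]

print(left_join_ids)
print(not_in_ids)
```

Both lists contain ids 2, 3, 4 and 5. Note that moving `ur.room_id = 3` from the ON clause into the WHERE clause would turn the LEFT JOIN back into an inner join and return nothing useful.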

Selecting the two most common attribute pairings from a Entity-Attribute Table?

I have a simple Entity-Attribute table in my database that records whether an Entity has some Attribute by the existence of a row (Entity, Attribute).
Of all the Entities with two and only two Attributes, I want to find the most common Attribute pairs.
For example, if my table looked like:
+--------+-----------+
| Entity | Attribute |
+--------+-----------+
| Bob | A |
| Sally | B |
| Terry | C |
| Bob | B |
| Sally | A |
| Terry | D |
| Larry | C |
+--------+-----------+
I would want it to return
+-------------+-------------+-------+
| Attribute-1 | Attribute-2 | Count |
+-------------+-------------+-------+
| A | B | 2 |
| C | D | 1 |
+-------------+-------------+-------+
I currently have a short query that looks like:
WITH TwoAtts AS (
SELECT entity
FROM table
GROUP BY entity
HAVING COUNT(att) = 2
)
SELECT t1.att, t2.att, COUNT(t1.entity)
FROM table t1
JOIN table t2
ON t1.entity = t2.entity
WHERE t1.entity IN (SELECT entity FROM TwoAtts)
AND t1.att != t2.att
GROUP BY t1.att, t2.att
ORDER BY COUNT(t1.entity) DESC
but is only capable of producing "duplicate" results like
+-------------+-------------+-------+
| Attribute-1 | Attribute-2 | Count |
+-------------+-------------+-------+
| A | B | 2 |
| B | A | 2 |
| D | C | 1 |
| C | D | 1 |
+-------------+-------------+-------+
In a sense, I would like to run an unordered DISTINCT / set operator over the two attribute columns, but I am not sure how to achieve this in SQL.
Hmmm, I think you want two levels of aggregation, with some filtering:
select attribute_1, attribute_2, count(*)
from (select min(ea.attribute) as attribute_1, max(ea.attribute) as attribute_2
from entity_attribute ea
group by entity
having count(*) = 2
) aa
group by attribute_1, attribute_2;
Here is a db<>fiddle
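The two-level aggregation above can be tried on the question's sample rows with a sketch along these lines. SQLite in memory is used as a stand-in engine, and the ORDER BY is an addition for a stable output order, not part of the answer.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity_attribute (entity TEXT, attribute TEXT);
INSERT INTO entity_attribute VALUES
  ('Bob', 'A'), ('Sally', 'B'), ('Terry', 'C'), ('Bob', 'B'),
  ('Sally', 'A'), ('Terry', 'D'), ('Larry', 'C');
""")

# Inner level: keep entities with exactly two attributes and normalise each
# pair as (smaller, larger), so (A, B) and (B, A) collapse into one pair.
# Outer level: count how many entities share each normalised pair.
pair_counts = conn.execute("""
SELECT attribute_1, attribute_2, COUNT(*) AS pair_count
FROM (SELECT MIN(ea.attribute) AS attribute_1,
             MAX(ea.attribute) AS attribute_2
      FROM entity_attribute ea
      GROUP BY ea.entity
      HAVING COUNT(*) = 2) aa
GROUP BY attribute_1, attribute_2
ORDER BY pair_count DESC, attribute_1
""").fetchall()

for row in pair_counts:
    print(row)
```

Larry has only one attribute, so the HAVING clause drops him before any pairing happens. The MIN/MAX trick relies on each entity having exactly two distinct attributes.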

Join three tables by one foreign key

I have three tables:
Task (ID, TaskDescription)
Schedule (TaskID, ID, DueAt)
Audit (TaskID, TestID)
The Schedule table holds the list of scheduled tasks, and the Audit table holds tasks that are already done. So first there is a row in Schedule; when the task is done, it is removed from the Schedule table and added to the Audit table.
Tasks table
+----+-----------------+
| ID | TaskDescription |
+----+-----------------+
| 1 | Clean room |
| 2 | Remove trash |
+----+-----------------+
Schedule table
+--------+--------+------------+
| ID | TaskID | DueAt |
+--------+--------+------------+
| 927847 | 1 | 2020-08-01 |
| 777777 | 2 | 2020-08-07 |
+--------+--------+------------+
Audit table
+--------+--------+
| TaskID | TestID |
+--------+--------+
| 1 | 3 |
| 1 | 2 |
| 1 | 1 |
| 2 | 4 |
+--------+--------+
I need to get all planned and already completed tasks for one task ID. For example, this is the result I expect:
+---------+-----------------+-------------+----------------+--------+
| Task.ID | TaskDescription | Schedule.ID | Schedule.DueAt | TestID |
+---------+-----------------+-------------+----------------+--------+
| 1 | Clean room | 927847 | 2020-08-01 | NULL |
| 1 | Clean room | NULL | NULL | 3 |
| 1 | Clean room | NULL | NULL | 2 |
| 1 | Clean room | NULL | NULL | 1 |
+---------+-----------------+-------------+----------------+--------+
That means 3 tasks are already done and one is scheduled for 2020-08-01.
What I tried:
SELECT
TaskID = t.ID,
t.TaskDescription,
ScheduleID = s.ID,
ScheduleDueAt = s.DueAt,
a.TestID
FROM Task t
LEFT OUTER JOIN Schedule s
ON (s.TaskID = t.ID)
LEFT OUTER JOIN Audit a
ON (a.TaskID = t.ID)
WHERE t.ID = '1'
But of course, I get the wrong result:
+---------+-----------------+-------------+----------------+--------+
| Task.ID | TaskDescription | Schedule.ID | Schedule.DueAt | TestID |
+---------+-----------------+-------------+----------------+--------+
| 1 | Clean room | 927847 | 2020-08-01 | 3 |
| 1 | Clean room | 927847 | 2020-08-01 | 2 |
| 1 | Clean room | 927847 | 2020-08-01 | 1 |
+---------+-----------------+-------------+----------------+--------+
I'm going to use UNION for this, but first I wanted to ask whether there is a better way to do it.
You need to UNION ALL the schedule and audit tables, selecting NULLs for the columns each one lacks. Then you can join that result with the task table:
SELECT t.id, t.taskdescription, s.id, s.dueat, s.testid
FROM task t
JOIN (SELECT taskid, id, dueat, NULL AS testid
FROM schedule
UNION ALL
SELECT taskid, NULL, NULL, testid
FROM audit) s ON t.id = s.taskid
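A minimal sketch of the UNION ALL approach on the question's sample data, using an in-memory SQLite database as a stand-in engine. The `WHERE t.id = 1` filter and the ORDER BY are additions for this example, to match the expected output in the question and to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (id INTEGER PRIMARY KEY, taskdescription TEXT);
INSERT INTO task VALUES (1, 'Clean room'), (2, 'Remove trash');

CREATE TABLE schedule (id INTEGER, taskid INTEGER, dueat TEXT);
INSERT INTO schedule VALUES (927847, 1, '2020-08-01'), (777777, 2, '2020-08-07');

CREATE TABLE audit (taskid INTEGER, testid INTEGER);
INSERT INTO audit VALUES (1, 3), (1, 2), (1, 1), (2, 4);
""")

# Stack scheduled and audited rows into one shape, padding the columns each
# source lacks with NULL, then join the stacked rows to the task table.
union_rows = conn.execute("""
SELECT t.id, t.taskdescription, s.id, s.dueat, s.testid
FROM task t
JOIN (SELECT taskid, id, dueat, NULL AS testid
      FROM schedule
      UNION ALL
      SELECT taskid, NULL, NULL, testid
      FROM audit) s ON t.id = s.taskid
WHERE t.id = 1
ORDER BY s.testid IS NOT NULL, s.testid
""").fetchall()

for row in union_rows:
    print(row)
```

This yields one scheduled row followed by three audit rows for task 1, rather than the three merged rows that the two LEFT JOINs produce.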
I agree that using UNION ALL, as @Mureinik suggested, is probably your best option here, but just for fun, here is another alternative.
If you add another entry to your Audit table for each TaskID with a TestID of 0 (as a sort of default whenever a new task is created), you can join onto the Audit table without needing a UNION.
So your Audit table would look like this:
+--------+--------+
| TaskID | TestID |
+--------+--------+
| 1 | 0 |
| 2 | 0 |
| 1 | 3 |
| 1 | 2 |
| 1 | 1 |
| 2 | 4 |
+--------+--------+
Then you can modify your query to join the schedule table as normal, but only where the audit table value is 0.
And finally, to keep it tidy, use NULLIF to hide the 0 for that TestID if you wish:
Select
TaskID = t.ID,
t.TaskDescription,
ScheduleID = s.ID,
ScheduleDueAt = s.DueAt,
TestID= nullIF(a.TestID,0)
from
Task t
inner join
Audit a on
a.TaskID = t.ID
left join
Schedule s on
s.TaskID = t.ID
and a.TestID = 0
where
t.ID = 1
UPDATE: You will also need an additional condition in the WHERE clause for when there is no scheduled task, to prevent an empty row from being returned:
where
t.ID = 1
and not (s.TaskID is null and a.TestID = 0)

Query returned with an extra column in sql -ms access

I came across an interesting suggestion from another developer. I have two tables that I join in a query, and I want the result of the query to have an extra column that comes from the joined table.
Example:
#table A: contains rating of players, changes randomly at any date depending
#on drop of form from the players
PID| Rating | DateChange |
1 | 2 | 10-May-2014 |
1 | 4 | 20-May-2015 |
1 | 20 | 1-June-2015 |
2 | 4 | 1-April-2014|
3 | 4 | 5-April-2014|
2 | 3 | 3-May-2015 |
#Table B: contains match sheets. Every player has a different match sheet
#and plays different dates.
MsID | PID | MatchDate | Win |
1 | 2 | 10-May-2014 | No |
2 | 1 | 15-May-2015 | Yes |
3 | 3 | 10-Apr-2014 | No |
4 | 1 | 21-Apr-2015 | Yes |
5 | 1 | 3-June-2015 | Yes |
6 | 2 | 5-May-2015 | No |
#I am trying to achieve this by running an MS Access query: I want to get
#every player's rating at the time the match was played, not his current
#rating.
MsID | PID | MatchDate | Rating |
1 | 2 | 10-May-2014 | 4 |
2 | 1 | 15-May-2015 | 2 |
3 | 3 | 10-Apr-2014 | 4 |
4 | 1 | 21-Apr-2015 | 4 |
5 | 1 | 3-June-2015 | 20 |
6 | 2 | 5-May-2015 | 3 |
This is what I have tried below:
Select MsID, PID, MatchDate, A-table.rating as Rating from B-table
left Join A-table
on B-table.PID = A-table.PID
where B-table.MatchDate > A-table.Datechange;
Any help is appreciated. The solution can be in VBA as long as it returns something like a view/table I can manipulate using other queries or reports.
Think of this in terms of sets of data: you need a set that lists the MAX DateChange for each player and match date.
Soo...
SELECT MAX(A.DateChange) MDC, A.PID, B.Matchdate
FROM [B-table] B
INNER Join [A-table] A
on B.PID = A.PID
and A.DateChange <= B.MatchDate
GROUP BY A.PID, B.Matchdate
Now we take this and join it back to what you've done, limiting the rows from tables A and B to ONLY those matching that player, DateChange and MatchDate (my inline table C):
SELECT B.MsID, B.PID, B.MatchDate, A.rating as Rating
FROM [B-table] B
INNER JOIN [A-table] A
on B.PID = A.PID
INNER JOIN (
SELECT MAX(Y.DateChange) MDC, Y.PID, Z.Matchdate
FROM [B-table] Z
INNER Join [A-table] Y
on Z.PID = Y.PID
and Y.DateChange <= Z.MatchDate
GROUP BY Y.PID, Z.Matchdate) C
on C.mdc = A.DateChange
and A.PID = C.PId
and B.MatchDate = C.Matchdate
I didn't create a sample for this using your data so it's untested but I believe the logic is sound...
Now Tested! SQL Fiddle using SQL server though...
My results don't match yours exactly. I think your expected results are wrong for MsID 4, though, given the rules defined.
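The join-back logic can be exercised on the sample data with a sketch like this one. It runs on an in-memory SQLite database rather than MS Access, the table and column names (ratings, matches, date_change, match_date, ms_id) are assumptions chosen for the example, and dates are stored as ISO-8601 text so that string comparison orders them correctly.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption: not Access).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Stand-in for table A: rating history per player.
CREATE TABLE ratings (pid INTEGER, rating INTEGER, date_change TEXT);
INSERT INTO ratings VALUES
  (1, 2, '2014-05-10'), (1, 4, '2015-05-20'), (1, 20, '2015-06-01'),
  (2, 4, '2014-04-01'), (3, 4, '2014-04-05'), (2, 3, '2015-05-03');

-- Stand-in for table B: match sheets.
CREATE TABLE matches (ms_id INTEGER, pid INTEGER, match_date TEXT, win TEXT);
INSERT INTO matches VALUES
  (1, 2, '2014-05-10', 'No'),  (2, 1, '2015-05-15', 'Yes'),
  (3, 3, '2014-04-10', 'No'),  (4, 1, '2015-04-21', 'Yes'),
  (5, 1, '2015-06-03', 'Yes'), (6, 2, '2015-05-05', 'No');
""")

# Derived table c finds, per player and match date, the latest rating change
# on or before the match; joining back on that date picks the rating itself.
rated_matches = conn.execute("""
SELECT m.ms_id, m.pid, m.match_date, r.rating
FROM matches m
INNER JOIN ratings r ON m.pid = r.pid
INNER JOIN (SELECT MAX(y.date_change) AS mdc, y.pid, z.match_date
            FROM matches z
            INNER JOIN ratings y
              ON z.pid = y.pid AND y.date_change <= z.match_date
            GROUP BY y.pid, z.match_date) c
  ON c.mdc = r.date_change
 AND c.pid = r.pid
 AND c.match_date = m.match_date
ORDER BY m.ms_id
""").fetchall()

for row in rated_matches:
    print(row)
```

On this data, MsID 4 gets rating 2, because the only rating change for player 1 on or before 21-Apr-2015 is the one from 10-May-2014. A match played before a player's first rating change would be dropped by the inner joins.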

PostgreSQL select all from one table and join count from table relation

I have two tables, post_categories and posts. I'm trying to select * from post_categories;, but also return a temporary column with the count for each time a post category is used on a post.
Posts
| id | name | post_category_id |
| 1 | test | 1 |
| 2 | nest | 1 |
| 3 | vest | 2 |
| 4 | zest | 3 |
Post Categories
| id | name |
| 1 | cat_1 |
| 2 | cat_2 |
| 3 | cat_3 |
Basically, I'm trying to do this without subqueries and with joins instead. Something like this, but in real psql.
select * from post_categories some-type-of-join posts, count(*)
Resulting in this, ideally.
| id | name | count |
| 1 | cat_1 | 2 |
| 2 | cat_2 | 1 |
| 3 | cat_3 | 1 |
Your help is greatly appreciated :D
You can use a derived table that contains the counts per post_category_id and left join it to the post_categories table:
select p.*, coalesce(t1.p_count,0)
from post_categories p
left join (
select post_category_id, count(*) p_count
from posts
group by post_category_id
) t1 on t1.post_category_id = p.id
select post_categories.id, post_categories.name , count(posts.id)
from post_categories
inner join posts
on post_category_id = post_categories.id
group by post_categories.id, post_categories.name
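The two queries above agree on the sample data but differ for a category with no posts. The sketch below shows this on an in-memory SQLite database (a stand-in for PostgreSQL); the extra category cat_4 is hypothetical data added only to expose the difference.

```python
import sqlite3

# In-memory SQLite database used only for illustration (assumption: not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE post_categories (id INTEGER PRIMARY KEY, name TEXT);
-- cat_4 is hypothetical: a category that no post uses.
INSERT INTO post_categories VALUES (1, 'cat_1'), (2, 'cat_2'), (3, 'cat_3'), (4, 'cat_4');

CREATE TABLE posts (id INTEGER PRIMARY KEY, name TEXT, post_category_id INTEGER);
INSERT INTO posts VALUES (1, 'test', 1), (2, 'nest', 1), (3, 'vest', 2), (4, 'zest', 3);
""")

# Derived-table version: LEFT JOIN keeps every category, COALESCE turns the
# missing count for an unused category into 0.
left_counts = conn.execute("""
SELECT p.id, p.name, COALESCE(t1.p_count, 0)
FROM post_categories p
LEFT JOIN (SELECT post_category_id, COUNT(*) AS p_count
           FROM posts
           GROUP BY post_category_id) t1
  ON t1.post_category_id = p.id
ORDER BY p.id
""").fetchall()

# INNER JOIN version: categories without posts never reach the GROUP BY.
inner_counts = conn.execute("""
SELECT post_categories.id, post_categories.name, COUNT(posts.id)
FROM post_categories
INNER JOIN posts ON posts.post_category_id = post_categories.id
GROUP BY post_categories.id, post_categories.name
ORDER BY post_categories.id
""").fetchall()

print(left_counts)
print(inner_counts)
```

If every category must appear in the output, including unused ones, the LEFT JOIN form is the one to use.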