SQL conditional field, first match JOIN - sql

Let's imagine I have two tables:
user
--userid
--fname
--lname
widget
--id
--userid
--value
user.userid = widget.userid
I want to see the full list of users with the widget.value if they have one, and (!) only the first match if there is more than one widget. No widget means a null field.
id fname lname value
1 John Doe X8
I cannot use a simple (inner) join, because a user with no widget.value would not be displayed at all.
CROSS APPLY doesn't work either.
I need:
1 widget = value
2 widgets = first one
0 widgets = null field

Using TOP 1 WITH TIES:
select top 1 with ties
    u.*, w.id, w.value
from dbo.[user] u -- USER is a reserved word in T-SQL, hence the brackets
left join dbo.widget w
    on u.userid = w.userid
order by row_number() over (partition by u.userid order by w.id);
Using a common table expression with row_number():
;with cte as (
    select u.*, w.id, w.value
        , rn = row_number() over (partition by u.userid order by w.id)
    from dbo.[user] u -- USER is a reserved word in T-SQL, hence the brackets
    left join dbo.widget w
        on u.userid = w.userid
)
select *
from cte
where rn = 1;
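The row_number() idea above can be tried without a SQL Server instance. The following is only a sketch: it assumes Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions) as a stand-in, and the sample rows are made up for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user   (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE widget (id INTEGER PRIMARY KEY, userid INTEGER, value TEXT);
    INSERT INTO user   VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Roe'), (3, 'Sam', 'Poe');
    INSERT INTO widget VALUES (10, 1, 'X8'), (11, 2, 'A1'), (12, 2, 'B2');
""")

# Number each user's widgets by id, then keep only the first one.
# The LEFT JOIN keeps users without widgets; their single row also gets rn = 1.
rows = conn.execute("""
    WITH ranked AS (
        SELECT u.userid, u.fname, u.lname, w.value,
               row_number() OVER (PARTITION BY u.userid ORDER BY w.id) AS rn
        FROM user u
        LEFT JOIN widget w ON w.userid = u.userid
    )
    SELECT userid, fname, lname, value
    FROM ranked
    WHERE rn = 1
    ORDER BY userid
""").fetchall()

for row in rows:
    print(row)
```

John has one widget and keeps it, Jane has two and gets the one with the lowest id, and Sam has none and gets a NULL (None in Python).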

OUTER APPLY should do what you want:
select u.*, w.value
from [user] u outer apply -- USER is a reserved word in T-SQL, hence the brackets
    (select top 1 w.*
     from widget w
     where w.userid = u.userid
     order by w.id -- or however you define the first one
    ) w;

Try this:
SELECT
u.userid, u.fname, u.lname, w.value
FROM user as u
LEFT JOIN
(
SELECT w1.*
FROM widget as w1
INNER JOIN
(
SELECT userid, MAX(id) AS LatestId
FROM widget
GROUP BY userid
) AS w2 ON w1.userid = w2.userid and w1.id = w2.latestid
) AS w ON u.userid = w.userid;
The inner join against the subquery with MAX and GROUP BY gives you the latest row for each userid, if any, so users with more than one row get the latest one.
There is no date column, so I assumed the max id is the latest one, which might not always be the case.
The LEFT JOIN keeps users that have no matching rows in the widget table, so a user with no widget gets a NULL value.
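Since the question asks for the first widget rather than the latest, the same join shape can be used with MIN(id) instead of MAX(id). A small sketch to compare the two, assuming SQLite via Python's sqlite3 as a stand-in and made-up sample data; the helper name one_widget_per_user is hypothetical:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user   (userid INTEGER PRIMARY KEY, fname TEXT, lname TEXT);
    CREATE TABLE widget (id INTEGER PRIMARY KEY, userid INTEGER, value TEXT);
    INSERT INTO user   VALUES (1, 'John', 'Doe'), (2, 'Jane', 'Roe'), (3, 'Sam', 'Poe');
    INSERT INTO widget VALUES (10, 1, 'X8'), (11, 2, 'A1'), (12, 2, 'B2');
""")

def one_widget_per_user(aggregate):
    """Return (userid, value) pairs; aggregate is 'MIN' (first) or 'MAX' (latest)."""
    if aggregate not in ("MIN", "MAX"):
        raise ValueError(aggregate)
    return conn.execute(f"""
        SELECT u.userid, w.value
        FROM user u
        LEFT JOIN (
            SELECT w1.userid, w1.value
            FROM widget w1
            JOIN (
                SELECT userid, {aggregate}(id) AS picked_id
                FROM widget
                GROUP BY userid
            ) w2 ON w1.userid = w2.userid AND w1.id = w2.picked_id
        ) w ON w.userid = u.userid
        ORDER BY u.userid
    """).fetchall()

first_rows = one_widget_per_user("MIN")
latest_rows = one_widget_per_user("MAX")
print(first_rows)
print(latest_rows)
```

Only the user with two widgets differs between the two runs; the user without widgets gets NULL either way.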

SELECT u.userid, w.value
FROM [user] u -- USER is a reserved word in T-SQL, hence the brackets
OUTER APPLY (
SELECT TOP 1 w.value
FROM widget w
WHERE w.userid = u.userid
ORDER BY w.id --order by whatever makes a widget the first one
) w

Related

Is there any alternative way to write this t-sql query?

I have these 3 tables:
For each car I need to visualize the data about the last (most recent) reservation:
the car model (Model);
the user who reserved the car (Username);
when it was reserved (ReservedOn);
when it was returned (ReservedUntil).
If there is no reservation for a given car, I have to show only the car model. Other fields must be empty.
I wrote the following query:
SELECT
Reservations.CarId,
res.MostRecent,
Reservations.UserId,
Reservations.ReservedOn,
Reservations.ReservedUntil
FROM
Reservations
JOIN (
Select
Reservations.CarId,
MAX(Reservations.ReservedOn) AS 'MostRecent'
FROM
Reservations
GROUP BY
Reservations.CarId
) AS res ON res.carId = Reservations.CarId
AND res.MostRecent = Reservations.ReservedOn
This first part works, but I'm stuck on how to get the full result I need. How could I complete the query?
It looks like a classic top-n-per-group problem.
One way to do it is to use OUTER APPLY. It is a correlated subquery (lateral join), which returns the latest Reservation for each row in the Cars table. If such reservation doesn't exist for a certain car, there will be nulls.
If you create an index on the Reservations table on (CarID, ReservedOn DESC), this query should be more efficient than the self-join.
SELECT
Cars.CarID
,Cars.Model
,A.ReservedOn
,A.ReservedUntil
,A.UserName
FROM
Cars
OUTER APPLY
(
SELECT TOP(1)
Reservations.ReservedOn
,Reservations.ReservedUntil
,Users.UserName
FROM
Reservations
INNER JOIN Users ON Users.UserId = Reservations.UserId
WHERE
Reservations.CarID = Cars.CarID
ORDER BY
Reservations.ReservedOn DESC
) AS A
;
For other approaches to this common problem, see "Get top 1 row of each group" and "Retrieving n rows per group".
With not exists:
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
You can create a CTE with the above code and join it to the other tables:
with cte as (
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
)
select c.carid, c.model, u.username, cte.reservedon, cte.reserveduntil
from cars c
left join cte on c.carid = cte.carid
left join users u on u.userid = cte.userid
If you don't want to use a CTE:
select c.carid, c.model, u.username, t.reservedon, t.reserveduntil
from cars c
left join (
select r.* from reservations r
where not exists (
select 1 from reservations
where carid = r.carid and reservedon > r.reservedon
)
) t on c.carid = t.carid
left join users u on u.userid = t.userid
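One property of the NOT EXISTS variant is worth checking before relying on it: if two reservations for the same car share the same latest ReservedOn, both survive the filter and the car appears twice. The sketch below illustrates this, assuming SQLite through Python's sqlite3 as a stand-in; the table layouts and sample rows are guesses, since the original table definitions are not shown, and latest_per_car is a hypothetical helper.

```python
import sqlite3

# Assumption: guessed table layouts, in-memory SQLite as a stand-in.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars  (carid INTEGER PRIMARY KEY, model TEXT);
    CREATE TABLE users (userid INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE reservations (
        id INTEGER PRIMARY KEY, carid INTEGER, userid INTEGER,
        reservedon TEXT, reserveduntil TEXT
    );
    INSERT INTO cars  VALUES (1, 'Fiat'), (2, 'Ford'), (3, 'Audi');
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO reservations VALUES
        (1, 1, 1, '2024-01-01', '2024-01-03'),
        (2, 1, 2, '2024-02-01', '2024-02-05'),
        (3, 2, 1, '2024-03-01', '2024-03-02');
""")

def latest_per_car():
    # A reservation is "latest" when no later reservation exists for its car.
    return conn.execute("""
        WITH latest AS (
            SELECT r.* FROM reservations r
            WHERE NOT EXISTS (
                SELECT 1 FROM reservations r2
                WHERE r2.carid = r.carid AND r2.reservedon > r.reservedon
            )
        )
        SELECT c.carid, c.model, u.username, l.reservedon, l.reserveduntil
        FROM cars c
        LEFT JOIN latest l ON l.carid = c.carid
        LEFT JOIN users u ON u.userid = l.userid
        ORDER BY c.carid, l.id
    """).fetchall()

rows = latest_per_car()
print(rows)

# Add a second reservation for car 2 with the same ReservedOn: a tie.
conn.execute("INSERT INTO reservations VALUES (4, 2, 2, '2024-03-01', '2024-03-04')")
rows_with_tie = latest_per_car()
print(len(rows_with_tie))  # car 2 is now listed twice
```

If ties are possible in real data, a tie-breaker (for example the reservation id) or the TOP(1)/row_number() approaches avoid the duplicate row.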

How to Join only first row, disregard further matches

I have 2 tables
Table Users:
UserID | Name
Table Cars:
CarID | Car Name | FK_UserID
A user can have more than 1 car.
I want to join each user with 1 car only, not more.
Having looked at other threads here,
I've tried the following:
Select users.UserID, users.name, carid
from Users
join cars
on users.UserID =
(
select top 1 UserID
from users
where UserID = CarID
)
But it still returns more than 1 match for each user.
What am I doing wrong?
You can try something like the following, using the ROW_NUMBER() function:
select userid, username, carname
from
(
    Select users.UserID as userid,
        users.name as username,
        cars.carname as carname,
        -- order by CarID so that "first" is deterministic
        ROW_NUMBER() OVER(PARTITION BY users.UserID ORDER BY cars.CarID) AS r
    from Users
    join cars
        on users.UserID = cars.FK_UserID
) XXX
where r = 1;
with x as
(select row_number() over(partition by FK_UserID order by carid) as rn,
* from cars)
select u.userid, x.carid, x.carname
from users u join x on x.FK_UserID = u.userid
where x.rn = 1;
This is one way to do it, using the row_number function.
Another way to do it:
select u.UserID,
u.name,
(select TOP 1 carid
from cars c
where u.UserID = c.FK_UserID
order by carid) carid -- Could be ordered by anything
from Users u
-- the WHERE clause is only needed if you want just the users that have cars
where exists (select * from cars c where u.UserID = c.FK_UserID)
The best option would be a subquery with a GROUP BY that returns a single row (one car) per user, and then to join that to the outer user table.
Here is an example:
select *
from user_table u
join (
    select FK_UserID
    , max(carname) as carname -- the derived column needs an alias
    from cars
    group by FK_UserID
) x on x.FK_UserID = u.userId
or you could use the row_number() examples above if you want a specific order (either this example or theirs will do the trick)
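A caveat with the GROUP BY approach: each aggregate is computed independently, so combining, say, MIN(CarID) with MAX(CarName) can produce a pair that does not exist as a row. Picking one key per user and joining back to the cars table keeps the columns consistent. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in and invented sample data (the column name CarName is written without the space used in the question):

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; data is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Cars  (CarID INTEGER PRIMARY KEY, CarName TEXT, FK_UserID INTEGER);
    INSERT INTO Users VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO Cars  VALUES (1, 'Audi', 1), (2, 'Volvo', 1), (3, 'Kia', 2);
""")

# Independent aggregates: for user 1 this pairs CarID 1 with the name of car 2.
mixed = conn.execute("""
    SELECT FK_UserID, MIN(CarID), MAX(CarName)
    FROM Cars
    GROUP BY FK_UserID
    ORDER BY FK_UserID
""").fetchall()

# Pick one key per user, then join back to fetch the rest of that same row.
joined = conn.execute("""
    SELECT u.UserID, u.Name, c.CarID, c.CarName
    FROM Users u
    JOIN (
        SELECT FK_UserID, MIN(CarID) AS FirstCarID
        FROM Cars
        GROUP BY FK_UserID
    ) f ON f.FK_UserID = u.UserID
    JOIN Cars c ON c.CarID = f.FirstCarID
    ORDER BY u.UserID
""").fetchall()

print(mixed)
print(joined)
```

The inner joins drop the user without cars; switching both to LEFT JOIN would keep that user with NULL car columns.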

Select all threads and order by the latest one

Now that I got the "Select all forums and get latest post too.. how?" question answered, I am trying to write a query to select all threads in one particular forum and order them by the date of the latest post (column "updated_at").
This is my structure again:
forums                    forum_threads             forum_posts
------------------------  ------------------------  -----------
id                        id                        id
parent_forum (NULLABLE)   forum_id                  content
name                      user_id                   thread_id
description               title                     user_id
icon                      views                     updated_at
                          created_at                created_at
                          updated_at
                          last_post_id (NULLABLE)
I tried writing this query, and it works.. but not as expected: It doesn't order the threads by their last post date:
SELECT DISTINCT ON(t.id) t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY t.id, p.updated_at DESC;
How can I solve this one?
Assuming you want a single row per thread and not all rows for all posts.
DISTINCT ON is still the most convenient tool. But the leading ORDER BY items have to match the expressions of the DISTINCT ON clause. If you want to order the result some other way, you need to wrap it into a subquery and add another ORDER BY to the outer query:
SELECT *
FROM (
SELECT DISTINCT ON (t.id)
t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY t.id, p.updated_at DESC
) sub
ORDER BY updated_at DESC;
If you are looking for a query without subquery for some unknown reason, this should work, too:
SELECT DISTINCT
t.id
, first_value(u.username) OVER w AS username
, first_value(p.updated_at) OVER w AS updated_at
, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
WINDOW w AS (PARTITION BY t.id ORDER BY p.updated_at DESC)
ORDER BY updated_at DESC;
There is quite a bit going on here:
The tables are joined and rows are selected according to JOIN and WHERE clauses.
The two instances of the window function first_value() are run (on the same window definition) to retrieve username and updated_at from the latest post per thread. This results in as many identical rows as there are posts in the thread.
The DISTINCT step is executed after the window functions and reduces each set to a single instance.
ORDER BY is applied last and updated_at references the OUT column (SELECT list), not one of the two IN columns (FROM list) of the same name.
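The order of evaluation described above (join, window functions, DISTINCT, then ORDER BY) can be observed on a tiny data set. This is a sketch only: it assumes SQLite 3.25 or newer via Python's sqlite3 instead of PostgreSQL, trims the tables to the columns the query needs, and renames the output column to last_post_at (a hypothetical alias) so that the ORDER BY cannot be confused with an input column.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; tables are trimmed down.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users         (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE forum_threads (id INTEGER PRIMARY KEY, forum_id INTEGER, title TEXT);
    CREATE TABLE forum_posts   (id INTEGER PRIMARY KEY, thread_id INTEGER,
                                user_id INTEGER, updated_at TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO forum_threads VALUES (1, 3, 'Hello'), (2, 3, 'World'), (3, 3, 'Empty');
    INSERT INTO forum_posts VALUES
        (1, 1, 1, '2024-01-01'),
        (2, 1, 2, '2024-01-05'),
        (3, 2, 1, '2024-01-03');
""")

QUERY = """
    SELECT {distinct}
           t.id,
           first_value(u.username)   OVER w AS username,
           first_value(p.updated_at) OVER w AS last_post_at,
           t.title
    FROM forum_threads t
    LEFT JOIN forum_posts p ON p.thread_id = t.id
    LEFT JOIN users u ON u.id = p.user_id
    WHERE t.forum_id = 3
    WINDOW w AS (PARTITION BY t.id ORDER BY p.updated_at DESC)
    ORDER BY last_post_at DESC
"""

# Without DISTINCT: one row per post (plus one for the empty thread),
# each already carrying the values of the latest post in its thread.
all_rows = conn.execute(QUERY.format(distinct="")).fetchall()

# With DISTINCT: the identical rows collapse to one row per thread.
rows = conn.execute(QUERY.format(distinct="DISTINCT")).fetchall()

print(len(all_rows))
for row in rows:
    print(row)
```

One difference to keep in mind: SQLite sorts NULLs last under DESC, whereas PostgreSQL sorts them first, so in PostgreSQL threads without posts would lead the result unless NULLS LAST is added to the ORDER BY.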
Yet another variant, a subquery with the window function row_number():
SELECT id, username, updated_at, title
FROM (
SELECT t.id
, u.username
, p.updated_at
, t.title
, row_number() OVER (PARTITION BY t.id
ORDER BY p.updated_at DESC) AS rn
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
) sub
WHERE rn = 1
ORDER BY updated_at DESC;
Similar case: "Return records distinct on one column but order by another column"
You'll have to test which is faster. Depends on a couple of circumstances.
Forget the distinct on:
SELECT t.id, u.username, p.updated_at, t.title
FROM forum_threads t
LEFT JOIN forum_posts p ON p.thread_id = t.id
LEFT JOIN users u ON u.id = p.user_id
WHERE t.forum_id = 3
ORDER BY p.updated_at DESC;

Writing a Mathematical Formula in SQL?

I have these tables: users, comments, ratings, and items
I would like to know if it is possible to write SQL query that basically does this:
user_id is in each table. I'd like a SQL query to count each occurrence in each table (except users of course). BUT, I want some tables to carry more weight than the others. Then I want to tally up a "score".
Here is an example:
user_id 5 occurs...
2 times in items;
5 times in comments;
11 times in ratings.
I want a formula/point system that totals something like this:
items 2 x 5 = 10;
comments 5 x 1 = 5;
ratings 11 x .5 = 5.5
TOTAL 20.5
This is what I have so far.....
SELECT u.users
COUNT(*) r.user_id
COUNT(*) c.user_id
COUNT(*) i.user_id
FROM users as u
JOIN COMMENTS as c
ON u.user_id = c_user_id
JOIN RATINGS as r
ON r.user_id = u.user_id
JOIN ITEMS as i
i.user_id = u.user_id
WHERE
????
GROUP BY u.user_id
ORDER by total DESC
I am not sure how to do the mathematical formula portion (if possible). Or how to tally up a total.
Final Code based on John Woo's Answer!
$sql = mysql_query("
SELECT u.username,
(a.totalCount * 5) +
(b.totalCount) +
(c.totalCount * .2) totalScore
FROM users u
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM items
GROUP BY user_id
) a ON a.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM comments
GROUP BY user_id
) b ON b.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM ratings
GROUP BY user_id
) c ON c.user_id = u.user_id
ORDER BY totalScore DESC LIMIT 10;");
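Maybe this can help you:
SELECT u.user_ID,
(COALESCE(a.totalCount, 0) * 5) +
(COALESCE(b.totalCount, 0)) +
(COALESCE(c.totalCount, 0) * .2) AS totalScore -- COALESCE: a user missing from one table would otherwise get a NULL score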
FROM users u LEFT JOIN
(
SELECT user_ID, COUNT(user_ID) totalCount
FROM items
GROUP BY user_ID
) a ON a.user_ID = u.user_ID
LEFT JOIN
(
SELECT user_ID, COUNT(user_ID) totalCount
FROM comments
GROUP BY user_ID
) b ON b.user_ID = u.user_ID
LEFT JOIN
(
SELECT user_ID, COUNT(user_ID) totalCount
FROM ratings
GROUP BY user_ID
) c ON c.user_ID = u.user_ID
ORDER BY totalScore DESC
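Based on your query above, this may look like it also works, but joining all three tables in one pass multiplies the rows and every COUNT(*) returns the same number, so prefer the derived-table version above:
SELECT u.user_id,
(COUNT(*) * .5) +
COUNT(*) +
(COUNT(*) * 2) totalScore
FROM users as u
LEFT JOIN COMMENTS as c
ON u.user_id = c.user_id
LEFT JOIN RATINGS as r
ON r.user_id = u.user_id
LEFT JOIN ITEMS as i
ON i.user_id = u.user_id
GROUP BY u.user_id
ORDER by totalScore DESC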
The only difference is the use of LEFT JOIN. An INNER JOIN is not suitable here because a user_id is not guaranteed to exist in every table.
Hope this makes sense
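Two details decide whether the score comes out right: counting each table separately (joining all three in one pass multiplies the rows), and treating a missing count as 0 (NULL plus a number is NULL). The sketch below checks both with the figures from the question (2 items, 5 comments, 11 ratings for user 5) and the question's weights 5, 1 and 0.5. It assumes SQLite via Python's sqlite3 as a stand-in for MySQL; user 6 is an invented user with a single comment.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; only user_id columns are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (user_id INTEGER PRIMARY KEY);
    CREATE TABLE items    (user_id INTEGER);
    CREATE TABLE comments (user_id INTEGER);
    CREATE TABLE ratings  (user_id INTEGER);
    INSERT INTO users VALUES (5), (6);
""")
conn.executemany("INSERT INTO items VALUES (?)", [(5,)] * 2)
conn.executemany("INSERT INTO comments VALUES (?)", [(5,)] * 5 + [(6,)])
conn.executemany("INSERT INTO ratings VALUES (?)", [(5,)] * 11)

# Joining everything in one pass yields 5 * 11 * 2 rows for user 5,
# so COUNT(*) no longer reflects any single table.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM users u
    LEFT JOIN comments c ON c.user_id = u.user_id
    LEFT JOIN ratings  r ON r.user_id = u.user_id
    LEFT JOIN items    i ON i.user_id = u.user_id
    WHERE u.user_id = 5
""").fetchone()[0]

# Counting per table first keeps the counts intact; COALESCE turns the
# NULL of a missing table into 0 so the sum stays defined.
scores = conn.execute("""
    SELECT u.user_id,
           a.n * 5 + b.n + c.n * 0.5 AS raw_score,
           COALESCE(a.n, 0) * 5 + COALESCE(b.n, 0) + COALESCE(c.n, 0) * 0.5 AS score
    FROM users u
    LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM items    GROUP BY user_id) a
           ON a.user_id = u.user_id
    LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM comments GROUP BY user_id) b
           ON b.user_id = u.user_id
    LEFT JOIN (SELECT user_id, COUNT(*) AS n FROM ratings  GROUP BY user_id) c
           ON c.user_id = u.user_id
    ORDER BY u.user_id
""").fetchall()

print(joined_count)
print(scores)
```

User 6 shows the NULL problem: the raw score is NULL because two of the three counts are missing, while the COALESCE version gives 1.0.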
Here's an alternative approach:
SELECT
u.user_id,
SUM(s.weight) AS totalScore
FROM users u
LEFT JOIN (
SELECT user_id, 5.0 AS weight
FROM items
UNION ALL
SELECT user_id, 1.0
FROM comments
UNION ALL
SELECT user_id, 0.5
FROM ratings
) s
ON u.user_id = s.user_id
GROUP BY
u.user_id
I.e. for every occurrence of every user in every table, a row with a specific weight is produced. The UNIONed set of weights is then joined to the users table for subsequent grouping and aggregating.
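To see the UNION ALL idea at work, here is a small sketch using the numbers from the question. It assumes SQLite via Python's sqlite3 as a stand-in; users 6 and 7 are invented, and the COALESCE around SUM is an addition so that a user with no activity scores 0 instead of NULL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (user_id INTEGER PRIMARY KEY);
    CREATE TABLE items    (user_id INTEGER);
    CREATE TABLE comments (user_id INTEGER);
    CREATE TABLE ratings  (user_id INTEGER);
    INSERT INTO users VALUES (5), (6), (7);
""")
conn.executemany("INSERT INTO items VALUES (?)", [(5,)] * 2)
conn.executemany("INSERT INTO comments VALUES (?)", [(5,)] * 5 + [(6,)])
conn.executemany("INSERT INTO ratings VALUES (?)", [(5,)] * 11)

# Every occurrence becomes one row carrying its table's weight;
# summing the weights per user gives the score.
rows = conn.execute("""
    SELECT u.user_id,
           COALESCE(SUM(s.weight), 0) AS totalScore
    FROM users u
    LEFT JOIN (
        SELECT user_id, 5.0 AS weight FROM items
        UNION ALL
        SELECT user_id, 1.0 FROM comments
        UNION ALL
        SELECT user_id, 0.5 FROM ratings
    ) s ON s.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY totalScore DESC, u.user_id
""").fetchall()

for row in rows:
    print(row)
```

User 5 ends up with 2 x 5 + 5 x 1 + 11 x 0.5 = 20.5, user 6 with 1.0 for a single comment, and user 7 with 0.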

possible to join a table on only one row

I have a temporary table I'm creating in a sproc that houses my user information. I need to join this table to another table that has SEVERAL rows for that particular user but I only want to return one result from the "many" table.
Something like this:
SELECT u.firstname, u.lastname
FROM #users AS u
INNER JOIN OtherTable AS ot on u.userid = (top 1 ot.userid)
Obviously that won't work, but that's the gist of what I'm trying to do, for two reasons: first, I only want one row returned (ordered by a date field, descending), and second, for optimization purposes. As it stands, the query has to scan several thousand rows.
SELECT
u.firstname, u.lastname, t.*
FROM
#users AS u
CROSS APPLY
(SELECT TOP 1 *
FROM OtherTable AS ot
WHERE u.userid = ot.userid
ORDER BY something) t
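Use the ROW_NUMBER() function to number each user's rows by datetime, newest first, and then keep only row number 1. A window function cannot appear directly in a WHERE clause, so compute it in the CTE and filter outside:
;WITH otNewest
AS
(
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY userid ORDER BY [datetime] DESC) AS rn -- [datetime] = your date column
    FROM othertable
)
SELECT u.firstname, u.lastname, o.*
FROM #users U
INNER JOIN otNewest O
    ON U.userid = O.userid
WHERE O.rn = 1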
So, if you're joining but not returning any columns from OtherTable, then you're only interested in checking for existence?
SELECT u.firstname, u.lastname
FROM #users AS u
WHERE EXISTS(SELECT 1
FROM OtherTable ot
WHERE u.userid = ot.userid)
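The difference between the plain join and the EXISTS form is easy to see on sample data: the join returns one row per matching row in the "many" table, while EXISTS returns each user at most once. A minimal sketch, assuming SQLite via Python's sqlite3 as a stand-in, a regular table named users in place of the #users temp table, and a made-up created column:

```python
import sqlite3

# Assumption: in-memory SQLite stands in for SQL Server; "users" replaces #users.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users      (userid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE OtherTable (userid INTEGER, created TEXT);
    INSERT INTO users VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO OtherTable VALUES
        (1, '2024-01-01'), (1, '2024-01-02'), (1, '2024-01-03');
""")

# Plain join: Ann is repeated once per matching OtherTable row.
joined = conn.execute("""
    SELECT u.firstname, u.lastname
    FROM users u
    INNER JOIN OtherTable ot ON ot.userid = u.userid
""").fetchall()

# EXISTS: Ann appears once; Bob, who has no rows in OtherTable, is left out.
exists_rows = conn.execute("""
    SELECT u.firstname, u.lastname
    FROM users u
    WHERE EXISTS (SELECT 1 FROM OtherTable ot WHERE ot.userid = u.userid)
""").fetchall()

print(len(joined))
print(exists_rows)
```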