Order by join column but use distinct on another - sql

I'm building a system in which there are the following tables:
Song
Broadcast
Station
Follow
User
A user follows stations, which have songs on them through broadcasts.
I'm building a "feed" of songs for a user based on the stations they follow.
Here's the query:
SELECT DISTINCT ON ("broadcasts"."created_at", "songs"."id") songs.*
FROM "songs"
INNER JOIN "broadcasts" ON "songs"."shared_id" = "broadcasts"."song_id"
INNER JOIN "stations" ON "broadcasts"."station_id" = "stations"."id"
INNER JOIN "follows" ON "stations"."id" = "follows"."station_id"
WHERE "follows"."user_id" = 2
ORDER BY broadcasts.created_at desc
LIMIT 18
Note: shared_id is the same as id.
As you can see I'm getting duplicate results, which I don't want. I found out from a previous question that this was due to selecting distinct on broadcasts.created_at.
My question is: How do I modify this query so it will return only unique songs based on their id but still order by broadcasts.created_at?

Try this solution:
SELECT a.maxcreated, b.*
FROM
(
SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
FROM follows aa
INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
WHERE aa.user_id = 2
GROUP BY bb.song_id
) a
INNER JOIN songs b ON a.song_id = b.id
ORDER BY a.maxcreated DESC
LIMIT 18
The FROM subselect retrieves the distinct song_ids broadcast by any station the user follows, along with the latest broadcast date for each song. We wrap this in a subquery because the GROUP BY has to cover every non-aggregated column we select, and we only want the unique song_id and its latest date regardless of the station.
We then join that result in the outer query to the songs table to get the song information associated with each unique song_id.
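As a rough check of the approach described above, here is a minimal sketch that runs the same subquery against an in-memory SQLite database using Python's standard sqlite3 module. The table contents, the title column, and the date strings are made up for illustration; the original system is presumably PostgreSQL, so treat this as a demonstration of the idea rather than of the real schema.

```python
import sqlite3

# Hypothetical sample data: user 2 follows stations 10 and 11, user 3 follows station 12.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE songs (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE broadcasts (song_id INTEGER, station_id INTEGER, created_at TEXT);
CREATE TABLE follows (user_id INTEGER, station_id INTEGER);
INSERT INTO songs VALUES (1, 'A'), (2, 'B'), (3, 'C');
INSERT INTO follows VALUES (2, 10), (2, 11), (3, 12);
INSERT INTO broadcasts VALUES
  (1, 10, '2013-01-01'), (1, 11, '2013-01-05'),
  (2, 10, '2013-01-03'), (3, 12, '2013-01-09');
""")

# One row per song, ordered by that song's most recent broadcast
# on any station followed by user 2.
feed = conn.execute("""
SELECT b.id, b.title, a.maxcreated
FROM (
  SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
  FROM follows aa
  INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
  WHERE aa.user_id = 2
  GROUP BY bb.song_id
) a
INNER JOIN songs b ON a.song_id = b.id
ORDER BY a.maxcreated DESC
LIMIT 18
""").fetchall()
print(feed)
```

Song 1 is broadcast on two followed stations but appears only once, with its latest broadcast date; song 3 is left out because user 2 does not follow station 12.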

You can use a Common Table Expression (CTE) if you want a cleaner query (nested queries make things harder to read).
It would look like this:
WITH a as (
SELECT bb.song_id, MAX(bb.created_at) AS maxcreated
FROM follows aa
INNER JOIN broadcasts bb ON aa.station_id = bb.station_id
INNER JOIN songs cc ON bb.song_id = cc.shared_id
WHERE aa.user_id = 2
GROUP BY bb.song_id
)
SELECT
a.maxcreated,
b.*
FROM a INNER JOIN
songs b ON a.song_id = b.id
ORDER BY
a.maxcreated DESC
LIMIT 18
Using a CTE offers the advantages of improved readability and ease in maintenance of complex queries. The query can be divided into separate, simple, logical building blocks. These simple blocks can then be used to build more complex, interim CTEs until the final result set is generated.

Try adding GROUP BY songs.id.

I had a very similar query I was doing between listens, tracks and albums and it took me a long while to figure it out (hours).
If you use GROUP BY songs.id, you can get it to work by ordering by MAX(broadcasts.created_at) DESC.
Here's what the full SQL looks like:
SELECT songs.* FROM "songs"
INNER JOIN "broadcasts" ON "songs"."shared_id" = "broadcasts"."song_id"
INNER JOIN "stations" ON "broadcasts"."station_id" = "stations"."id"
INNER JOIN "follows" ON "stations"."id" = "follows"."station_id"
WHERE "follows"."user_id" = 2
GROUP BY songs.id
ORDER BY MAX(broadcasts.created_at) desc
LIMIT 18;

Related

SQL How can I create an inner join from these 2 tables

I am just learning SQL and I am creating a social network as a learning experience. I have 2 tables called Streams and Votes: Streams holds user-created content and Votes stores the content that people have liked. What I am trying to figure out is how to return data from both tables to check whether a user has liked a particular post being shown. For instance, this is how both my tables look. As you can see, they both have a stream_id field in common, and both contain the value 278. How can I do an inner join that checks whether there are any common stream_id values in both tables? This is the SQL that gets me the Stream data:
Query 1
select post,profile_id,
votes,id as stream_id FROM streams WHERE latitudes>=28.1036 AND
28.9732>=latitudes AND longitudes>=-81.8696 AND -80.8798>=longitudes
order by id desc limit 10
The user ID is 11, and Streams.profile_id and Votes.my_id are the same field. I have tried this SQL query, but it returns only 1 result in total. Again, I would like to return all results from the Streams table, as in Query 1, and also add another column from the Votes table where Votes.stream_id = Streams.id, because that shows that the particular user has liked that post. Any help would be great.
Query 2
select s.post,s.profile_id,
s.votes,s.id as stream_id, v.my_id as ID FROM streams s inner join Votes v on (s.id = v.stream_id) WHERE latitudes>=28.1036 AND
28.9732>=latitudes AND longitudes>=-81.8696 AND -80.8798>=longitudes
order by id desc limit 10
Streams
Votes
You need to use a LEFT OUTER JOIN instead of an INNER JOIN.
SELECT s.post, s.profile_id, s.votes, s.id AS stream_id, v.my_id AS ID
FROM streams s
LEFT OUTER JOIN Votes v ON s.id = v.stream_id
WHERE s.latitudes >= 28.1036 AND 28.9732 >= s.latitudes
AND s.longitudes >= -81.8696 AND -80.8798 >= s.longitudes
ORDER BY s.id DESC
LIMIT 10
For more information about joins, look here.
I think that you are looking for a LEFT JOIN
select s.post,s.profile_id,
s.votes,s.id as stream_id, v.my_id as ID FROM streams s LEFT JOIN Votes v on (s.id = v.stream_id) WHERE latitudes>=28.1036 AND
28.9732>=latitudes AND longitudes>=-81.8696 AND -80.8798>=longitudes
order by id desc limit 10
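To see why the join type matters here, the following sketch compares the two joins on a tiny in-memory SQLite database via Python's sqlite3 module. The rows are invented, the geographic filter is left out for brevity, and the extra condition v.my_id = 11 in the ON clause is an assumption based on the question's statement that the user ID is 11; it is not part of either answer above.

```python
import sqlite3

# Hypothetical sample data: three posts, only post 278 has been liked by user 11.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE streams (id INTEGER PRIMARY KEY, post TEXT, profile_id INTEGER);
CREATE TABLE votes (stream_id INTEGER, my_id INTEGER);
INSERT INTO streams VALUES (277, 'first', 5), (278, 'second', 7), (279, 'third', 9);
INSERT INTO votes VALUES (278, 11);
""")

# Same query text for both runs; only the join keyword changes.
query = """
SELECT s.id AS stream_id, s.post, v.my_id AS liked_by
FROM streams s
{join} votes v ON s.id = v.stream_id AND v.my_id = 11
ORDER BY s.id DESC
"""
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()
print(inner_rows)  # only the liked post survives
print(left_rows)   # every post survives; liked_by is None where there is no vote
```

With the LEFT JOIN, a NULL (None in Python) in the liked_by column means the user has not liked that post, which is the extra column the question asks for.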

SQL select with join are returning double results

I am trying to select some data from different tables using join.
First, here is my SQL (MS) query:
SELECT Polls.pollID,
Members.membername,
Polls.polltitle, (SELECT COUNT(*) FROM PollChoices WHERE pollID=Polls.pollID) AS 'choices',
(SELECT COUNT(*) FROM PollVotes WHERE PollVotes.pollChoiceID = PollChoices.pollChoicesID) AS 'votes'
FROM Polls
INNER JOIN Members
ON Polls.memberID = Members.memberID
INNER JOIN PollChoices
ON PollChoices.pollID = Polls.pollID;
And the tables involved in this query are shown here:
The query returns this result:
pollID | membername | polltitle | choices | votes
---------+------------+-----------+---------+-------
10000036 | TestName | Test Title| 2 | 0
10000036 | TestName | Test Title| 2 | 1
Any help will be greatly appreciated.
Your INNER JOIN with PollChoices brings in more than 1 row for a given poll, because poll 10000036 has 2 choices, as the choices column indicates.
You can change the query to use GROUP BY and get the counts.
If some choices or members may have no matching rows in PollVotes, use a LEFT JOIN for that table so those rows are not dropped.
SELECT Polls.pollID,
Members.membername,
Polls.polltitle,
-- DISTINCT keeps a choice from being counted once per vote it received
COUNT(DISTINCT PollChoices.pollChoicesID) AS 'choices',
COUNT(PollVotes.pollvoteId) AS 'votes'
FROM Polls
INNER JOIN Members
ON Polls.memberID = Members.memberID
INNER JOIN PollChoices
ON PollChoices.pollID = Polls.pollID
-- LEFT JOIN so that choices without any votes still count toward 'choices'
LEFT JOIN PollVotes
ON PollVotes.pollChoiceID = PollChoices.pollChoicesID
GROUP BY Polls.pollID,
Members.membername,
Polls.polltitle
You are getting 1 row for each PollChoices record, because the join to PollChoices matches multiple choices for each Polls/Members row. You may be expecting the SELECT COUNT(*) sub-queries to act as a GROUP BY clause, but they don't.
If that doesn't make sense, add a bare minimum of sample data and the expected result and we can help more.
This query result is telling you the number of votes per choice in each poll.
In your example, this voter named TestName answered the poll (with ID 10000036) and gave one choice 1 vote, and the second choice 0 votes. This is why you are getting two rows in your result.
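The row multiplication described above can be reproduced with a small sketch in Python's sqlite3 module. The schema is trimmed to the columns that matter, the Members table is omitted, and the pollVoteID column name is an assumption, since the question does not show the PollVotes columns.

```python
import sqlite3

# Hypothetical sample data: one poll with two choices; only choice 2 has a vote.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Polls (pollID INTEGER PRIMARY KEY, polltitle TEXT);
CREATE TABLE PollChoices (pollChoicesID INTEGER PRIMARY KEY, pollID INTEGER);
CREATE TABLE PollVotes (pollVoteID INTEGER PRIMARY KEY, pollChoiceID INTEGER);
INSERT INTO Polls VALUES (10000036, 'Test Title');
INSERT INTO PollChoices VALUES (1, 10000036), (2, 10000036);
INSERT INTO PollVotes VALUES (1, 2);
""")

# Joining PollChoices yields one row per choice, each with its own vote count.
per_choice = conn.execute("""
SELECT p.pollID, c.pollChoicesID,
       (SELECT COUNT(*) FROM PollVotes v
        WHERE v.pollChoiceID = c.pollChoicesID) AS votes
FROM Polls p
INNER JOIN PollChoices c ON c.pollID = p.pollID
ORDER BY c.pollChoicesID
""").fetchall()

# Grouping by poll collapses those rows into a single row with totals.
per_poll = conn.execute("""
SELECT p.pollID,
       COUNT(DISTINCT c.pollChoicesID) AS choices,
       COUNT(v.pollVoteID) AS votes
FROM Polls p
INNER JOIN PollChoices c ON c.pollID = p.pollID
LEFT JOIN PollVotes v ON v.pollChoiceID = c.pollChoicesID
GROUP BY p.pollID
""").fetchall()
print(per_choice)
print(per_poll)
```

The first result has two rows for the same poll (votes 0 and 1), matching the output in the question; the second has one row per poll.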
I'm not sure if you are expecting just one row, because you didn't specify exactly what data you are trying to select. However, if you are trying to see the number of votes for each choice that received at least 1 vote, then you will have to modify your query like this:
select * from
(SELECT Polls.pollID,
Members.membername,
Polls.polltitle, (SELECT COUNT(*) FROM PollChoices WHERE pollID=Polls.pollID) AS 'choices',
(SELECT COUNT(*) FROM PollVotes WHERE PollVotes.pollChoiceID = PollChoices.pollChoicesID) AS 'votes'
FROM Polls
INNER JOIN Members
ON Polls.memberID = Members.memberID
INNER JOIN PollChoices
ON PollChoices.pollID = Polls.pollID) as mysubquery where votes <> 0;

Select last record out of grouped records

I have this code and I would like help changing it to a grouped query that picks rows from the bottom of each group.
SELECT *
FROM dbo.users_pics INNER JOIN profile
ON users_pics.email = profile.email
Left Join photo_comment
On users_pics.u_pic_id = photo_comment.pic_id
WHERE users_pics.wardrobe = MMColParam
ORDER BY u_pic_id asc
What I mean is that I have groups of records, and I want to select only one record per group, the last one. For example, if I have 10 records with the name "John", I want to select the last "John" out of the 10, and likewise for the rest.
I'm going to presume that your users table contains a single user, and each user has a single profile, and your photo_comment table can contain multiple comments.
Depending on your RDBMS, you can do this a number of ways. Row_Number can often be a quick way of doing this if you're using a database which supports window functions such as SQL Server or Oracle.
A generic solution to this is to join the table back to itself using the MAX aggregate. This is dependent on having a field to determine which record is the max. Generally speaking, that would be an identity/auto number field or a time stamp field.
Here is the basic concept using photo_comment_id as your determining column:
SELECT *
FROM dbo.users_pics INNER JOIN profile
ON users_pics.email = profile.email
LEFT Join (
SELECT pic_id, MAX(photo_comment_id) max_photo_comment_id
FROM photo_comment
GROUP BY pic_id
) max_photo_comment On users_pics.u_pic_id = max_photo_comment.pic_id
LEFT Join photo_comment On
max_photo_comment.pic_id = photo_comment.pic_id AND
max_photo_comment.max_photo_comment_id = photo_comment.photo_comment_id
WHERE users_pics.wardrobe = MMColParam
ORDER BY u_pic_id asc
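The join-back-on-MAX idea can be tried out with a small sketch using Python's sqlite3 module. The profile join and the wardrobe filter are left out, and the body column and sample rows are invented for illustration; the derived table reads from photo_comment and keeps the highest photo_comment_id per picture.

```python
import sqlite3

# Hypothetical sample data: picture 1 has three comments, picture 2 has one, picture 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users_pics (u_pic_id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE photo_comment (photo_comment_id INTEGER PRIMARY KEY, pic_id INTEGER, body TEXT);
INSERT INTO users_pics VALUES
  (1, 'john@example.com'), (2, 'ann@example.com'), (3, 'bob@example.com');
INSERT INTO photo_comment VALUES
  (1, 1, 'first'), (2, 1, 'second'), (3, 2, 'only'), (4, 1, 'latest');
""")

# Step 1 (derived table): the highest comment id per picture.
# Step 2: join back to photo_comment to fetch that comment's full row.
rows = conn.execute("""
SELECT up.u_pic_id, pc.photo_comment_id, pc.body
FROM users_pics up
LEFT JOIN (
  SELECT pic_id, MAX(photo_comment_id) AS max_photo_comment_id
  FROM photo_comment
  GROUP BY pic_id
) latest ON up.u_pic_id = latest.pic_id
LEFT JOIN photo_comment pc
  ON latest.pic_id = pc.pic_id
 AND latest.max_photo_comment_id = pc.photo_comment_id
ORDER BY up.u_pic_id ASC
""").fetchall()
print(rows)
```

Each picture appears exactly once, paired with its last comment, and the picture without comments is kept with NULLs because both joins are LEFT joins.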
If your database supports ROW_NUMBER, then you can do this as well (still using the photo_comment_id field):
SELECT *
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY users_pics.u_pic_id
ORDER BY photo_comment.photo_comment_id DESC) rn
FROM dbo.users_pics INNER JOIN profile
ON users_pics.email = profile.email
LEFT JOIN photo_comment
ON users_pics.u_pic_id = photo_comment.pic_id
WHERE users_pics.wardrobe = MMColParam
) t
WHERE rn = 1
ORDER BY u_pic_id asc

How to list unused items from database

MDW_CUSTOMER_ACCOUNTS has the following fields: ACCOUNT_ID, MEAL_ID.
MDW_MEALS_MENU has the following fields: MEAL_ID, MEAL_NAME.
I am trying to generate a report on the number of times a particular meal has been subscribed to by a customer using the query,
SELECT count(a.account_id), b.meal_id, b.meal_name
FROM mdw_meals_menu b LEFT JOIN mdw_customer_accounts a
on b.meal_id=a.meal_id
WHERE
a.start_date BETWEEN to_date('01-APR-2013','DD-MON-YYYY')
AND to_date('30-JUN-2013','DD-MON-YYYY')
GROUP BY b.meal_id, b.meal_name
ORDER BY count(a.account_id) desc, b.meal_id;
This lists only the MEAL_IDs that have been subscribed to at least once; it does not display the IDs that have never been subscribed to.
How do I get these MEAL_IDs to print with a count of 0?
I have modified the code, but I still get the same result.
Your where clause is effectively turning your outer join back into an inner join - conditions on an outer-joined table should generally be in the join clause, like so:
SELECT count(a.account_id), b.meal_id, b.meal_name
FROM mdw_meals_menu b
LEFT JOIN mdw_customer_accounts a
on b.meal_id=a.meal_id and
a.start_date BETWEEN to_date('01-APR-2013','DD-MON-YYYY')
AND to_date('30-JUN-2013','DD-MON-YYYY')
GROUP BY b.meal_id, b.meal_name
ORDER BY count(a.account_id) desc, b.meal_id;
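The effect of moving the date condition from WHERE into the join clause can be checked with a short sketch using Python's sqlite3 module. SQLite has no to_date, so the dates are stored as ISO-8601 text here, which is an assumption made for the demo; the sample meals and accounts are also invented.

```python
import sqlite3

# Hypothetical sample data: meal 1 has two in-range subscriptions,
# meal 2 has one out-of-range subscription, meal 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mdw_meals_menu (meal_id INTEGER PRIMARY KEY, meal_name TEXT);
CREATE TABLE mdw_customer_accounts (account_id INTEGER, meal_id INTEGER, start_date TEXT);
INSERT INTO mdw_meals_menu VALUES (1, 'Vegan'), (2, 'Keto'), (3, 'Classic');
INSERT INTO mdw_customer_accounts VALUES
  (100, 1, '2013-04-15'), (101, 1, '2013-05-01'), (102, 2, '2013-01-10');
""")

date_filter = "a.start_date BETWEEN '2013-04-01' AND '2013-06-30'"
# The same query, with the date filter attached either via WHERE or via AND (inside ON).
template = """
SELECT b.meal_id, COUNT(a.account_id)
FROM mdw_meals_menu b
LEFT JOIN mdw_customer_accounts a ON b.meal_id = a.meal_id {placement} {cond}
GROUP BY b.meal_id, b.meal_name
ORDER BY COUNT(a.account_id) DESC, b.meal_id
"""
filter_in_where = conn.execute(template.format(placement="WHERE", cond=date_filter)).fetchall()
filter_in_on = conn.execute(template.format(placement="AND", cond=date_filter)).fetchall()
print(filter_in_where)  # meals without in-range subscriptions disappear
print(filter_in_on)     # every meal is listed, with 0 where nothing matched
```

In the WHERE version the NULL start_date of unmatched meals fails the BETWEEN test, so those rows are discarded after the join; in the ON version the condition only decides which accounts match, and unmatched meals are still kept.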
You should use a left outer join.

i want to modify this SQL statement to return only distinct rows of a column

select
picks.`fbid`,
picks.`time`,
categories.`name` as cname,
options.`name` as oname,
users.`name`
from
picks
left join categories
on (categories.`id` = picks.`cid`)
left join options
on (options.`id` = picks.oid)
left join users
on (users.fbid = picks.`fbid`)
order by
time desc
That query returns a result like this:
My question is: I would like to modify the query to select only DISTINCT fbids (perhaps only the first row for each, sorted by time).
Can someone help with this?
select
p2.fbid,
p2.time,
c.`name` as cname,
o.`name` as oname,
u.`name`
from
( select p1.fbid,
min( p1.time ) FirstTimePerID
from picks p1
group by p1.fbid ) as FirstPerID
JOIN Picks p2
on FirstPerID.fbid = p2.fbid
AND FirstPerID.FirstTimePerID = p2.time
LEFT JOIN Categories c
on p2.cid = c.id
LEFT JOIN Options o
on p2.oid = o.id
LEFT JOIN Users u
on p2.fbid = u.fbid
order by
time desc
I don't know why you originally had LEFT JOINs, as it appears that all picks must be associated with a valid category, option and user. If so, I would remove the LEFT and change them to INNER joins instead.
The inner query grabs, for each fbid, the FIRST entry time, which yields a single row per fbid. From that, it re-joins to the picks table on the same ID and time, then continues with the category, option, and user joins for that single entry.
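Here is a small sketch of that first-row-per-fbid pattern using Python's sqlite3 module. Only the categories join is kept, and the sample rows and timestamps are invented for illustration.

```python
import sqlite3

# Hypothetical sample data: two users (fbid 500 and 600) with two picks each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE picks (fbid INTEGER, time TEXT, cid INTEGER);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO categories VALUES (1, 'Drama'), (2, 'Comedy');
INSERT INTO picks VALUES
  (500, '2012-03-01 10:00', 1), (500, '2012-03-01 12:00', 2),
  (600, '2012-03-02 09:00', 2), (600, '2012-03-02 09:30', 1);
""")

# The derived table finds the earliest time per fbid; joining back to picks
# on (fbid, time) recovers the rest of that row's columns.
first_picks = conn.execute("""
SELECT p2.fbid, p2.time, c.name AS cname
FROM (
  SELECT p1.fbid, MIN(p1.time) AS FirstTimePerID
  FROM picks p1
  GROUP BY p1.fbid
) AS FirstPerID
JOIN picks p2
  ON FirstPerID.fbid = p2.fbid
 AND FirstPerID.FirstTimePerID = p2.time
LEFT JOIN categories c ON p2.cid = c.id
ORDER BY p2.time DESC
""").fetchall()
print(first_picks)
```

Note that if two picks by the same fbid share the exact same earliest time, both rows come back, so this gives one row per fbid only when the (fbid, time) pair is unique.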
There are 2 options: you could write a GROUP BY clause,
or you could write a nested query joined back to the table itself to get the pertinent info.
Nested aliased table:
SELECT
n.fBids
FROM
MyTable t
INNER JOIN
(SELECT DISTINCT fBids
FROM MyTable) n
ON n.fBids = t.fBids
Or group by option
SELECT fBId from MyTable
GROUP BY fBID
select picks.`fbid`, picks.`time`, categories.`name` as cname,
options.`name` as oname, users.`name` from picks left join categories
on (categories.`id` = picks.`cid`) left join options on (options.`id` = picks.oid)
left join users on (users.fbid = picks.`fbid`)
GROUP BY picks.`fbid` order by time desc
select
picks.fbid,
MIN(picks.time) as first_time,
MAX(picks.time) as last_time
from
picks
group by
picks.fbid
order by
MIN(picks.time) desc
However, if you want only distinct fbid's you cannot display cname and other columns at the same time.