mysql left join ascending - sql

I have a query:
SELECT reply.id,
reply.message,
reply.userid,
reply.date,
medal.id,
medal.url,
medal.name,
user.id,
user.name AS username
FROM posts AS reply
LEFT JOIN users AS user ON reply.userid = user.id
LEFT JOIN medals AS medal ON medal.userid = user.id
GROUP BY reply.id
ORDER BY reply.id ASC
Everything works, except that I get the medals in ascending rather than descending order,
which means it grabs the first medal the user got. I need to get the last one.

The fact that you are seeing the first record per group is accidental. Selecting a non-aggregate, non-group-unique column in a GROUP BY query causes an undefined record to be selected.
For a more detailed explanation read: http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html.
One correct way to do what you're doing is by using a subquery, in which you select the maximum medal date per desired group.
This approach is outlined here: http://dev.mysql.com/doc/refman/5.6/en/example-maximum-column-group-row.html

You could perhaps use a subquery or a temporary table for medals instead of joining directly on medals. The temporary table would be the medals table but sorted DESC. The subquery would be LEFT JOIN (SELECT id, userid, url, name FROM medals ORDER BY id DESC), but I feel like that would be very slow and not the best option.

The simplest solution is to do this:
ORDER BY reply.id ASC, medal.id DESC
and remove your GROUP BY clause.
You can filter the posts / users who have multiple medals afterwards (skip rows that contain the medals you don't want).
The next option is to select MAX(medal.id) in a subquery, and then join on that. That gets messy, but is doable. Here is the general idea (user.id is aliased to user_id so that the derived table has no duplicate column names, and the ORDER BY moves to the outer query):
SELECT t.*,
medal.url,
medal.name
FROM
(
SELECT reply.id,
reply.message,
reply.userid,
reply.date,
MAX(medal.id) AS medal_id,
user.id AS user_id,
user.name AS username
FROM posts AS reply
LEFT JOIN users AS user ON reply.userid = user.id
LEFT JOIN medals AS medal ON medal.userid = user.id
GROUP BY reply.id
)
AS t
LEFT JOIN medals AS medal ON medal.id = t.medal_id
ORDER BY t.id ASC
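For anyone who wants to try the MAX-in-a-subquery idea end to end, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained (the question is about MySQL), the sample rows are invented, and it assumes that a higher medal id means a more recently awarded medal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: ann has two medals, bob has none.
conn.executescript("""
CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE posts  (id INTEGER PRIMARY KEY, userid INTEGER, message TEXT);
CREATE TABLE medals (id INTEGER PRIMARY KEY, userid INTEGER, name TEXT);
INSERT INTO users  VALUES (1, 'ann'), (2, 'bob');
INSERT INTO posts  VALUES (10, 1, 'first'), (11, 2, 'second'), (12, 1, 'third');
INSERT INTO medals VALUES (100, 1, 'bronze'), (101, 1, 'gold');
""")

latest_medal_sql = """
SELECT p.id, u.name AS username, m.name AS medal_name
FROM posts AS p
LEFT JOIN users AS u ON u.id = p.userid
-- One row per user holding the highest (assumed latest) medal id.
LEFT JOIN (
    SELECT userid, MAX(id) AS medal_id
    FROM medals
    GROUP BY userid
) AS latest ON latest.userid = u.id
-- Join back to medals to fetch the other columns of that one medal.
LEFT JOIN medals AS m ON m.id = latest.medal_id
ORDER BY p.id ASC
"""
rows = conn.execute(latest_medal_sql).fetchall()
for row in rows:
    print(row)
```

Because the derived table already has one row per user, the outer query needs no GROUP BY at all, so no column is left to be picked arbitrarily.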

Related

How to write the correct SQL query to find the most duplicate user?

That's my database design.
I need to find the person with the most albums using SQLite.
I tried this:
SELECT USERS.NAME, COUNT(USERS.NAME) AS 'value_occurrence' FROM USERS
INNER JOIN ALBUMS
ON USERS.ID = ALBUMS.USER_ID
GROUP BY USERS.NAME
ORDER BY 'value_occurrence'
DESC LIMIT 1;
but it didn't work and gave me the wrong result. Please help me find the right way to do this.
The logic is correct, but you may be getting tripped up by incorrect use of single quotes for aliases. Try this version:
SELECT u.NAME, COUNT(u.NAME) AS value_occurrence
FROM USERS u
INNER JOIN ALBUMS a ON a.USER_ID = u.ID
GROUP BY u.NAME
ORDER BY value_occurrence DESC
LIMIT 1;
The problem with ORDER BY 'value_occurrence' is that you are telling SQLite to order by a constant value. That is, every record in the result set will have the same value for ordering, which basically means that SQLite is free to choose any record as being the "first."
Note: As the answer by @Cazzym mentioned, you should be aggregating by the user ID, in case two or more users might have the same name.
The code basically looks fine, but it will return unexpected results where two users have the same name. That's why we have ID columns!
Try
SELECT USERS.NAME, COUNT(USERS.ID) AS value_occurrence FROM USERS
INNER JOIN ALBUMS
ON USERS.ID = ALBUMS.USER_ID
GROUP BY USERS.ID, USERS.NAME
ORDER BY value_occurrence
DESC LIMIT 1;
We can use group by Users.Id, Users.Name because each ID is going to only have one name associated with it, so it's still going to only create a single group per ID.
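A small sketch of the duplicate-name problem, using Python's sqlite3 module with an in-memory database. The rows are made up for illustration: two different users are both called sam, and kim is the single user with the most albums.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: users 1 and 2 share a name and own 2 albums each,
# user 3 (kim) owns 3 albums.
conn.executescript("""
CREATE TABLE USERS  (ID INTEGER PRIMARY KEY, NAME TEXT);
CREATE TABLE ALBUMS (ID INTEGER PRIMARY KEY, USER_ID INTEGER);
INSERT INTO USERS VALUES (1, 'sam'), (2, 'sam'), (3, 'kim');
INSERT INTO ALBUMS (USER_ID) VALUES (1), (1), (2), (2), (3), (3), (3);
""")

# Grouping by name merges the two sams into one group of 4 albums.
by_name = conn.execute("""
SELECT u.NAME, COUNT(*) AS value_occurrence
FROM USERS u
INNER JOIN ALBUMS a ON a.USER_ID = u.ID
GROUP BY u.NAME
ORDER BY value_occurrence DESC
LIMIT 1
""").fetchone()

# Grouping by ID keeps the two sams apart, so kim comes out on top.
by_id = conn.execute("""
SELECT u.NAME, COUNT(*) AS value_occurrence
FROM USERS u
INNER JOIN ALBUMS a ON a.USER_ID = u.ID
GROUP BY u.ID, u.NAME
ORDER BY value_occurrence DESC
LIMIT 1
""").fetchone()

print(by_name, by_id)
```

Note that the alias is written without quotes in ORDER BY in both queries, so the sort really uses the count.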

SQL Query with row_number() not returning expected output

my goal is to write a query that should return the cities which produced the highest avg. sales for each item-category.
This is the expected output:
item_category|city
books |los_angeles
toys |austin
electronics |san_fransisco
My 3 table schemas look like this:
users
user_id|city
sales
user_id|item_id|sales_amt
items
item_id|item_category
These are further notes to consider:
1. sales_amt is the only column that may have Null values. If no users have placed a sale for a particular item-category (no rows in sales with a non-Null sales_amt), then the city name should be Null.
2. Only 1 row per distinct item-category. If more than 1 city qualifies, then pick the first one alphabetically.
The attempt I took looks like this but it does not produce the right output:
select a.item_category,a.city from (
select
i.item_category,
u.city,
row_number() over (partition by i.item_category,u.city order by avg(s.sales_amt) desc)rk
from sales s
join users u on s.user_id=u.user_id
join items i on i.item_id=s.item_id
group by i.item_category,u.city)a
where a.rk=1
My output does not return the Null cases for sales_amt. Also, I get non-unique rows. Therefore, I am worried that I am not properly incorporating the 2 notes.
I hope someone can help.
my goal is to write a query that should return the cities which produced the highest avg. sales for each item-category.
This can be calculated using aggregation and window functions:
select ic.*
from (select i.item_category, u.city,
row_number() over(partition by i.item_category order by avg(s.sales_amt) desc, u.city) as seqnum
from users u join
sales s
on s.user_id = u.user_id join
items i
on i.item_id = s.item_id
group by i.item_category, u.city
) ic
where seqnum = 1;
Your question explicitly says "average" which is why this uses avg(). However, I suspect that you really want the sum in each city, which would be sum().
Notes:
You want one row so row_number() instead of rank().
You need sales to calculate the average, so join, instead of left join.
You want one row per item_category, so that is used for partitioning.
And my take on it is a mix of GMB's and Gordon's advice. GMB points out that left joins are needed, but I think his starting table, partition and choice of rank() are wrong (his query cannot generate null city names as requested, and could generate duplicates tied on the same avg). Gordon picked up on things like ordering by city on a tied avg, which GMB did not, but missed the "if no sales of any items in category X, put null for the city" requirement. Both left cancelled orders floating around the system, which introduces errors:
select *
from (
select
i.item_category,
u.city,
row_number() over(partition by i.item_category order by avg(s.sales_amt) desc, u.city asc) rn
from items i
left join (select * from sales where sales_amt is not null) s on i.item_id = s.item_id
left join users u on s.user_id = u.user_id
group by i.item_category, u.city
) t
where rn = 1
We start from items so that categories having no sales get nulls for their sale amount and city.
We also need to consider that any sales that didn't fulfil will have null in their amount. We exclude these with a subquery, otherwise they will link through to users and give a false positive (even though the avg will calculate as null for a category that only has cancelled orders, the city will still show when it should not). I could also have done this with an "and sales_amt is not null" predicate in the join, but I think this way is clearer. This should not be done with a predicate in the where clause, because that will eliminate the sale-less categories we are trying to preserve.
Row number is used on avg but with city name to break any ties. It's a simpler function than rank and cannot generate duplicate values.
Finally we pull the rn 1s to get the top averaging cities
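The steps above can be tried out with a small sketch, run through Python's sqlite3 module (window functions need SQLite 3.25 or later). The sample rows are invented, and the averaging is split into its own CTE before ranking, which is an equivalent restructuring chosen here for portability rather than the exact query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: books has a tie between austin and boston (avg 10),
# toys only has a cancelled (NULL) sale, electronics has no sales at all.
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_category TEXT);
CREATE TABLE sales (user_id INTEGER, item_id INTEGER, sales_amt REAL);
INSERT INTO users VALUES (1, 'austin'), (2, 'boston'), (3, 'chicago');
INSERT INTO items VALUES (1, 'books'), (2, 'toys'), (3, 'electronics');
INSERT INTO sales VALUES
    (1, 1, 10.0),
    (2, 1, 12.0), (2, 1, 8.0),
    (3, 1, 4.0), (3, 1, NULL),
    (3, 2, NULL);
""")

top_city_sql = """
WITH per_city AS (
    -- Start from items so categories without real sales survive with NULLs.
    SELECT i.item_category, u.city, AVG(s.sales_amt) AS avg_amt
    FROM items i
    LEFT JOIN (SELECT * FROM sales WHERE sales_amt IS NOT NULL) s
           ON s.item_id = i.item_id
    LEFT JOIN users u ON u.user_id = s.user_id
    GROUP BY i.item_category, u.city
)
SELECT item_category, city
FROM (
    SELECT item_category, city,
           ROW_NUMBER() OVER (PARTITION BY item_category
                              ORDER BY avg_amt DESC, city ASC) AS rn
    FROM per_city
) ranked
WHERE rn = 1
ORDER BY item_category
"""
rows = conn.execute(top_city_sql).fetchall()
for row in rows:
    print(row)
```

The tie in books goes to austin because city is the second sort key, and the two categories without a non-NULL sale come back with a NULL city.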
I think you want left joins starting from users in the inner query to preserve cities without sales.
As for the ranking: if you want one record per city, then do not put columns other than city in the partition (your current partition gives you one record per city and per category, which is not what you want).
Consider:
select *
from (
select
i.item_category,
u.city,
rank() over(partition by u.city order by avg(s.sales_amt) desc) rk
from users u
left join sales s on s.user_id = u.user_id
left join items i on i.item_id = s.item_id
group by i.item_category, u.city
) t
where rk = 1

SQL: Count in join not working

I have an SQL query where I simply join two tables. One table contains comments and the other is the user table. I join the tables so that I can get information about the user who wrote the comment (username) together with the comment itself (comment text etc.).
Now I want to count the number of comments to write the correct number of comments on the top of the page. I do this by adding a COUNT, and an alias to save the value.
When I echo numCount, I get the correct value of comments, but I get no comments in my comment loop. As soon as I remove the count, I get all comments again. What am I doing wrong?
SELECT
ncID, ncText, ncDate,
uID, uName, uImageThumb,
COUNT(a.ncID) AS numComments
FROM tblNewsComments a LEFT JOIN tblUsers b
ON a.ncUserID = b.uID
WHERE a.ncNewsID = $newID
ORDER BY ncDate DESC
I am going to assume this is MySQL (or maybe SQLite), since most other RDBMS would fail on this query. The issue is that you are missing a GROUP BY clause, which is required when using an aggregate function like COUNT() unless it is to operate over the entire rowset. MySQL's unusual behavior is to allow the absence of a GROUP BY, or to allow columns in SELECT which are not also in the GROUP BY, producing unusual results.
The appropriate way to do this would be to join in a subquery which returns the COUNT() per ncID.
SELECT
ncID,
ncText,
ncDate,
uID,
uName,
uImageThumb,
/* The count returned by the subquery */
ccount.numComments
FROM
tblNewsComments a
LEFT JOIN tblUsers b ON a.ncUserID = b.uID
/* Derived table returns only ncID and count of comments */
LEFT JOIN (
SELECT ncID, COUNT(*) AS numComments
FROM tblNewsComments
GROUP BY ncID
) ccount ON a.ncID = ccount.ncID
WHERE a.ncNewsID = $newID
ORDER BY ncDate DESC
Edit Whoops - looks like you wanted the count per ncID, not the count per ncUserID as I originally had it.
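To see the collapsing effect in isolation, here is a small sketch using Python's sqlite3 module (SQLite, like MySQL in its permissive mode, accepts the original query). The table contents are invented, and the second query assumes that the number wanted at the top of the page is the total number of comments for the news item.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: news item 1 has three comments, news item 2 has one.
conn.executescript("""
CREATE TABLE tblUsers (uID INTEGER PRIMARY KEY, uName TEXT);
CREATE TABLE tblNewsComments (
    ncID INTEGER PRIMARY KEY, ncNewsID INTEGER, ncUserID INTEGER, ncText TEXT);
INSERT INTO tblUsers VALUES (1, 'ann'), (2, 'bob');
INSERT INTO tblNewsComments VALUES
    (1, 1, 1, 'a'), (2, 1, 2, 'b'), (3, 1, 1, 'c'), (4, 2, 2, 'd');
""")
news_id = 1

# COUNT() without GROUP BY aggregates the whole rowset into a single row.
collapsed = conn.execute("""
SELECT a.ncID, COUNT(a.ncID) AS numComments
FROM tblNewsComments a LEFT JOIN tblUsers b ON a.ncUserID = b.uID
WHERE a.ncNewsID = ?
""", (news_id,)).fetchall()

# A correlated scalar subquery keeps one row per comment and repeats the total.
per_row = conn.execute("""
SELECT a.ncID, b.uName,
       (SELECT COUNT(*) FROM tblNewsComments c
        WHERE c.ncNewsID = a.ncNewsID) AS numComments
FROM tblNewsComments a LEFT JOIN tblUsers b ON a.ncUserID = b.uID
WHERE a.ncNewsID = ?
ORDER BY a.ncID
""", (news_id,)).fetchall()

print(len(collapsed), collapsed[0][1])
print(per_row)
```

Running a separate SELECT COUNT(*) for the news item is an equally reasonable choice if repeating the total on every row feels wasteful.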
I don't know what SQL engine you are using, but what you have here is not valid SQL and should be flagged as such.
COUNT is an aggregate function and you can only apply those to groups or a whole table, so in your case you would probably do
SELECT
ncID, ncText, ncDate,
uID, uName, uImageThumb,
COUNT(a.ncID) AS numComments
FROM tblNewsComments a LEFT JOIN tblUsers b
ON a.ncUserID = b.uID
WHERE a.ncNewsID = $newID
GROUP BY ncID, ncText, ncDate,
uID, uName, uImageThumb
ORDER BY ncDate DESC
You're using an AGGREGATE function (Count) but you're needing a GROUP BY to make any sense from that count.
I suggest adding "GROUP BY [all other field names except the COUNT]" to your query
Try this:
SELECT
ncID, ncText, ncDate,
uID, uName, uImageThumb,
(SELECT COUNT(ncID)
FROM
tblNewsComments a
INNER JOIN
tblUsers b
ON a.ncUserID = b.uID
WHERE a.ncNewsID = $newID)
AS numComments
FROM tblNewsComments a LEFT JOIN tblUsers b
ON a.ncUserID = b.uID
WHERE a.ncNewsID = $newID
ORDER BY ncDate DESC

SQL to gather data from one table while counting records in another

I have a users table and a songs table, I want to select all the users in the users table while counting how many songs they have in the songs table. I have this SQL but it doesn't work, can someone spot what i'm doing wrong?
SELECT jos_mfs_users.*, COUNT(jos_mfs_songs.id) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist=jos_mfs_users.id
Help is much appreciated. Thanks!
The inner join won't work, because it joins every matching row in the songs table with the users table.
SELECT jos_mfs_users.*,
(SELECT COUNT(jos_mfs_songs.id)
FROM jos_mfs_songs
WHERE jos_mfs_songs.artist=jos_mfs_users.id) as song_count
FROM jos_mfs_users
WHERE (SELECT COUNT(jos_mfs_songs.id)
FROM jos_mfs_songs
WHERE jos_mfs_songs.artist=jos_mfs_users.id) > 10
There's a GROUP BY clause missing, e.g.
SELECT jos_mfs_users.id, COUNT(jos_mfs_songs.id) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist=jos_mfs_users.id
GROUP BY jos_mfs_users.id
If you want to add more columns from jos_mfs_users in the select list you should add them in the GROUP BY clause as well.
Changes:
Don't do SELECT *...specify your fields. I included ID and NAME, you can add more as needed but put them in the GROUP BY as well
Changed to a LEFT JOIN - INNER JOIN won't list any users that have no songs
Added the GROUP BY so it gives a valid count and is valid syntax
SELECT u.id, u.name, COUNT(s.id) AS song_count
FROM jos_mfs_users AS u
LEFT JOIN jos_mfs_songs AS s
ON s.artist = u.id
GROUP BY u.id, u.name
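A quick sketch of the difference between the two join types, using Python's sqlite3 module and an in-memory database. The column list is cut down and the rows are made up: cy is a user without any songs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: ann has two songs, bob has one, cy has none.
conn.executescript("""
CREATE TABLE jos_mfs_users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE jos_mfs_songs (id INTEGER PRIMARY KEY, artist INTEGER);
INSERT INTO jos_mfs_users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO jos_mfs_songs (artist) VALUES (1), (1), (2);
""")

count_sql = """
SELECT u.id, u.name, COUNT(s.id) AS song_count
FROM jos_mfs_users AS u
{join} JOIN jos_mfs_songs AS s ON s.artist = u.id
GROUP BY u.id, u.name
ORDER BY u.id
"""

# LEFT JOIN keeps users without songs; COUNT(s.id) skips the NULL and gives 0.
left_rows = conn.execute(count_sql.format(join="LEFT")).fetchall()
# INNER JOIN drops users that have no matching song row.
inner_rows = conn.execute(count_sql.format(join="INNER")).fetchall()

print(left_rows)
print(inner_rows)
```

Counting s.id rather than * matters with the LEFT JOIN: COUNT(*) would report 1 for a user with no songs, because the unmatched user still produces one row.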
Try
SELECT
*,
(SELECT COUNT(*) FROM jos_mfs_songs as songs WHERE songs.artist=users.id) as song_count
FROM
jos_mfs_users as users
This seems like a many to many relationship. By that I mean it looks like there can be several records in the users table for each user, one of each song they have.
I would have three tables.
Users, which has one record for each user
Songs, which has one record for each song
USER_SONGS, which has one record for each user/song combination
Now, you can do a count of the songs each user has by doing a query on the intermediate table. You can also find out how many users have a particular song.
This will tell you how many songs each user has
select id, count(*) from USER_SONGS
GROUP BY id;
This will tell you how many users each song has
select artist, count(*) from USER_SONGS
GROUP BY artist;
I'm sure you will need to tweak this for your needs, but it may give you the type of results you are looking for.
You can also join either of these queries to the other two tables to find the user name, and/or artist name.
HTH
Harv Sather
ps I am not sure if you are looking for song counts or artist counts.
You need a GROUP BY clause to use aggregate functions (like COUNT(), for example) alongside non-aggregated columns.
So, assuming that jos_mfs_users.id is a primary key, something like this will work:
SELECT jos_mfs_users.*, COUNT( jos_mfs_users.id ) as song_count
FROM jos_mfs_users
INNER JOIN jos_mfs_songs
ON jos_mfs_songs.artist = jos_mfs_users.id
GROUP BY jos_mfs_users.id
Notice that
since you are grouping by user id, you will get one result per distinct user id in the results
the thing you need to COUNT() is the number of rows that are being grouped (in this case the number of results per user)

Problem With DISTINCT!

Here is my query:
SELECT
DISTINCT `c`.`user_id`,
`c`.`created_at`,
`c`.`body`,
(SELECT COUNT(*) FROM profiles_comments c2 WHERE c2.user_id = c.user_id AND c2.profile_id = 1) AS `comments_count`,
`u`.`username`,
`u`.`avatar_path`
FROM `profiles_comments` AS `c` INNER JOIN `users` AS `u` ON u.id = c.user_id
WHERE (c.profile_id = 1) ORDER BY `u`.`id` DESC;
It works. The problem though is with the DISTINCT word. As I understand it, it should select only one row per c.user_id.
But what I get is even 4-5 rows with the same c.user_id column. Where is the problem?
Actually, DISTINCT does not limit itself to one column. When you say:
SELECT DISTINCT a, b
what you're saying is, "give me the distinct combinations of a and b", just like a multi-column UNIQUE index.
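A tiny sketch of that behaviour, run through Python's sqlite3 module. The table t and its rows are hypothetical and exist only to show that DISTINCT compares whole rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: a = 1 appears with two different b values.
conn.executescript("""
CREATE TABLE t (a INTEGER, b TEXT);
INSERT INTO t VALUES (1, 'x'), (1, 'x'), (1, 'y'), (2, 'x');
""")

# DISTINCT removes only the exact duplicate (1, 'x'); a = 1 still appears twice.
distinct_rows = conn.execute(
    "SELECT DISTINCT a, b FROM t ORDER BY a, b").fetchall()

# One row per value of a requires grouping and aggregating the other columns.
grouped_rows = conn.execute(
    "SELECT a, COUNT(*) FROM t GROUP BY a ORDER BY a").fetchall()

print(distinct_rows)
print(grouped_rows)
```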
DISTINCT ensures that each combination of ALL the values in your select clause is unique, not just user_id. If you want to limit the results to individual user_ids, you should group by user_id.
Perhaps what you want is:
SELECT
`c`.`user_id`,
`u`.`username`,
`u`.`avatar_path`,
(SELECT COUNT(*) FROM profiles_comments c2 WHERE c2.user_id = c.user_id AND c2.profile_id = 1) AS `comments_count`
FROM `profiles_comments` AS `c` INNER JOIN `users` AS `u` ON u.id = c.user_id
WHERE (c.profile_id = 1)
GROUP BY `c`.`user_id`,
`u`.`username`,
`u`.`avatar_path`
ORDER BY `u`.`id` DESC;
DISTINCT works at a row level, not just a column level
If you want the DISTINCT of only one column, then you will have to group by that column and aggregate the rest of the columns returned (MIN, MAX, SUM, AVG, etc.):
SELECT Name, MIN(ID)
FROM MyTable
GROUP BY Name
DISTINCT returns only unique rows; it will not return only 1 row per user id in your example.
http://dev.mysql.com/doc/refman/5.0/en/distinct-optimization.html
You misunderstand. The DISTINCT modifier applies to the entire row — it states that no two identical ROWS will be returned in the result set.
Looking at your SQL, what value of the several available do you expect to see returned in the created_at column (for instance)? It would be impossible to predict the results of the query as written.
Also, you're using profiles_comments twice in your SELECT. It appears that you're trying to obtain a count of how many times each user has commented. If so, what you want to do is use an AGGREGATE query, grouped on user_id and including only those columns that uniquely identify a user along with a COUNT of the comments:
SELECT user_id, COUNT(*) FROM profiles_comments WHERE profile_id = 1 GROUP BY user_id
You can add the join to users to get the user name if you want but, logically, your result set cannot include other columns from profiles_comments and still produce only a single row per user_id unless those columns are also aggregated in some way:
SELECT user_id, MIN(created_at) AS Earliest, MAX(created_at) AS Latest, COUNT(*) FROM profiles_comments WHERE profile_id = 1 GROUP BY user_id