sql using count & group by without using distinct keyword? - sql

I want to optimize this query
**SELECT * FROM Users WHERE Active = 1 AND UserId IN (SELECT UserId FROM Users_Roles WHERE RoleId IN (SELECT RoleId FROM Roles WHERE PermissionLevel >= 100)) ORDER BY LastName**
Execution time dropped when I replaced the query above with joins, as below:
**SELECT u.* FROM Users u INNER JOIN Users_Roles ur ON (u.UserId = ur.UserId) INNER JOIN Roles r ON (r.RoleId = ur.RoleId) WHERE u.Active = 1 AND r.PermissionLevel > 100 GROUP BY u.UserId ORDER BY u.LastName**
But the join returns duplicate records, since my roles table has more than one entry for every user.
I can't use DISTINCT, because there is a function that derives the pagination count by replacing SELECT * FROM with SELECT COUNT(*) FROM, and then executes both the count query and the result query.
As we already know, using COUNT together with GROUP BY this way gives the wrong output: one count per group instead of a single total.
Now I want to optimize the query and still find the number of rows, i.e. the count, for it. Please suggest a better way to get the result.

It is difficult to optimise other people's queries without fully knowing the schema: what is indexed and what isn't, how much data there is, what your DBMS is, etc. Even with that, we can't see execution plans, IO statistics and so on. With this in mind, the query below may not be better than what you already have, but it is how I would write it in your situation.
SELECT u.*
FROM Users u
INNER JOIN
    ( SELECT ur.UserID
      FROM Users_Roles ur
      INNER JOIN Roles r
          ON r.RoleID = ur.RoleID
      WHERE r.PermissionLevel >= 100
      GROUP BY ur.UserID
    ) ur
    ON u.UserId = ur.UserId
WHERE u.Active = 1
ORDER BY u.LastName
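Since the derived table yields at most one row per user, the same statement can be wrapped in a COUNT(*) for pagination without needing DISTINCT. Here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the question, but the sample rows and the `base_query` variable are made up for illustration, and it uses the >= 100 threshold from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (UserId INTEGER PRIMARY KEY, LastName TEXT, Active INTEGER);
    CREATE TABLE Roles (RoleId INTEGER PRIMARY KEY, PermissionLevel INTEGER);
    CREATE TABLE Users_Roles (UserId INTEGER, RoleId INTEGER);
    -- Invented sample data: user 1 holds two qualifying roles,
    -- user 2 only a low-level role, user 3 is inactive.
    INSERT INTO Users VALUES (1, 'Adams', 1), (2, 'Baker', 1), (3, 'Clark', 0);
    INSERT INTO Roles VALUES (10, 100), (11, 200), (12, 50);
    INSERT INTO Users_Roles VALUES (1, 10), (1, 11), (2, 12), (3, 11);
""")

# One row per user: the GROUP BY happens inside the derived table.
base_query = """
    SELECT u.*
    FROM Users u
    INNER JOIN (
        SELECT ur.UserId
        FROM Users_Roles ur
        INNER JOIN Roles r ON r.RoleId = ur.RoleId
        WHERE r.PermissionLevel >= 100
        GROUP BY ur.UserId
    ) ur ON u.UserId = ur.UserId
    WHERE u.Active = 1
"""

# Result query: add the ordering only where it matters.
rows = conn.execute(base_query + " ORDER BY u.LastName").fetchall()

# Count query: wrap the unordered statement instead of rewriting its SELECT list.
total = conn.execute(f"SELECT COUNT(*) FROM ({base_query}) AS t").fetchone()[0]

print(rows, total)
```

Wrapping the whole statement keeps the count correct even if the inner query later gains its own GROUP BY, which is where replacing the SELECT list with COUNT(*) tends to go wrong.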

Related

How to paginate a complex SQL join query with duplicate data?

I have several tables in the database: users, profiles and user roles.
Profiles and users have a one-to-one relationship.
Roles and users have a many-to-many relationship.
To select all users, I send the following query:
SELECT A.role_id, A.role_name, A.user_id,B.user_username, B.user_password, B.profile_color_text, B.profile_color_menu, B.profile_color_bg FROM
(SELECT Roles.role_id, Roles.role_name, UserRoles.user_id
FROM Roles INNER JOIN UserRoles ON Roles.role_id = UserRoles.role_id) AS A
LEFT JOIN
(SELECT Users.user_username, Users.user_password, Profiles.profile_color_text, Profiles.profile_color_menu, Profiles.profile_color_bg, Profiles.profile_id
FROM Users INNER JOIN Profiles ON Users.user_id = Profiles.profile_id) AS B
ON A.user_id = B.profile_id;
The question is: how do I paginate this?
I would get the 10 users first, then perform the joins. Two reasons for this:
1. Since you don't want specifically 10 results but just the results of 10 users, which could contain any number of rows, you can't get all the data then limit it, otherwise you could be getting 10 rows containing data for 5 users.
2. Even if point 1 were irrelevant because there was always a 1-1 relationship, and especially if the number of results is small like 10, it's faster to get those results first and then join on that smaller "table", rather than doing all your joins on all the data and then limiting it.
SELECT
    u.user_id,
    u.user_username,
    u.user_password,
    r.role_id,
    r.role_name,
    p.profile_id,
    p.profile_color_text,
    p.profile_color_menu,
    p.profile_color_bg
FROM (
    SELECT user_id, user_username, user_password
    FROM users
    ORDER BY ???
    LIMIT 10
    OFFSET 10
) AS u
LEFT JOIN profiles AS p
    ON u.user_id = p.profile_id
LEFT JOIN userroles AS ur
    ON u.user_id = ur.user_id
LEFT JOIN roles AS r
    ON ur.role_id = r.role_id
I assume you'll want some order, so I've put an ORDER BY in there - to be completed.
OFFSET added to get the second page of results; first page wouldn't require it, or would be OFFSET 0. Then a LIMIT of course to limit the page size.
I've also restructured the joins in a way that made more sense to me.
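To see the "page the users first, then join" idea in action, here is a small sketch using Python's built-in sqlite3 module. The sample rows, the `fetch_page` helper and the choice of ordering by user_id are all assumptions for illustration (the ordering in the answer is left open), and the profiles table is omitted to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_username TEXT);
    CREATE TABLE roles (role_id INTEGER PRIMARY KEY, role_name TEXT);
    CREATE TABLE userroles (user_id INTEGER, role_id INTEGER);
    -- Invented sample data: users 1 and 3 have two roles, user 4 has none.
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
    INSERT INTO roles VALUES (1, 'admin'), (2, 'editor');
    INSERT INTO userroles VALUES (1, 1), (1, 2), (2, 2), (3, 1), (3, 2);
""")

def fetch_page(page, page_size):
    """Return the joined rows for one page of users (page is zero-based)."""
    sql = """
        SELECT u.user_id, u.user_username, r.role_name
        FROM (
            -- Limit the users first, so a page always holds whole users.
            SELECT user_id, user_username
            FROM users
            ORDER BY user_id          -- assumed ordering
            LIMIT ? OFFSET ?
        ) AS u
        LEFT JOIN userroles AS ur ON u.user_id = ur.user_id
        LEFT JOIN roles AS r ON ur.role_id = r.role_id
        ORDER BY u.user_id, r.role_id
    """
    return conn.execute(sql, (page_size, page * page_size)).fetchall()

first_page = fetch_page(0, 2)   # users 1 and 2, three rows in total
second_page = fetch_page(1, 2)  # users 3 and 4, including the role-less user
print(first_page)
print(second_page)
```

A page of two users can contain more than two rows, which is exactly why the limit has to be applied before the joins rather than after them.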

PostgreSQL: Get the count of rows in a join query

I am trying to get some data by joining a few tables. I have an audit table where I store the audits for actions performed by users. I am trying to get the list of users along with the number of audits each one has, ordered by that number. I have the following query:
SELECT s.user_created,
u.first_name,
u.last_name,
u.email,
a.message as audits
FROM cbrain_school s
inner join ugrp_user u on s.user_created = u.user_id
inner join audit a on u.user_id = a.user_id
order by u.created_time desc;
This query gives me one row per entry in the audit table. I just want one row per user, with the count of that user's entries in the audit table, ordered by the number of audits.
Is there any way to do that? I got an error when I tried to include count() in the above query.
First of all, you are joining with the table cbrain_school. Why? You are selecting no data from this table (except for s.user_created, which is simply u.user_id). I suppose you want to limit the users shown to those in cbrain_school.user_created? Then use EXISTS or IN to look this up.
select u.user_id, u.first_name, u.last_name, u.email, a.message as audits
from ugrp_user u
inner join audit a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by u.created_time desc;
This makes it much clearer that cbrain_school.user_created is merely a filter criterion. (The query result is the same, as long as each user appears at most once in cbrain_school.user_created; otherwise the join would repeat rows.) It's a good habit to avoid joins when you are not really interested in the joined rows.
Now you don't want to show each message anymore, but merely count them per user. So rather than joining messages, you should join the message count:
select u.user_id, u.first_name, u.last_name, u.email, a.cnt
from ugrp_user u
inner join
(
select user_id, count(*) as cnt
from audit
group by user_id
) a on u.user_id = a.user_id
where u.user_id in (select user_created from cbrain_school)
order by u.created_time desc;
(You could also join all messages and only then aggregate, but I don't recommend this. It would work for this and many other queries, but it is error-prone when working with multiple tables, where you might suddenly count or sum values several times over. It's a good habit to aggregate before joining.)
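The multiplied counts are easy to reproduce. The sketch below uses Python's built-in sqlite3 module; the `login` table is a hypothetical second child table added only to trigger the effect, and all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ugrp_user (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE audit (user_id INTEGER, message TEXT);
    CREATE TABLE login (user_id INTEGER, logged_at TEXT);  -- hypothetical table
    -- Invented sample data: one user with 3 audits and 2 logins.
    INSERT INTO ugrp_user VALUES (1, 'Ada');
    INSERT INTO audit VALUES (1, 'a'), (1, 'b'), (1, 'c');
    INSERT INTO login VALUES (1, 'mon'), (1, 'tue');
""")

# Join first, aggregate afterwards: 3 audits x 2 logins = 6 joined rows,
# so both counts come out as 6.
join_then_aggregate = conn.execute("""
    SELECT u.user_id, COUNT(a.user_id) AS audits, COUNT(l.user_id) AS logins
    FROM ugrp_user u
    JOIN audit a ON a.user_id = u.user_id
    JOIN login l ON l.user_id = u.user_id
    GROUP BY u.user_id
""").fetchall()

# Aggregate each child table first, then join the one-row-per-user results.
aggregate_then_join = conn.execute("""
    SELECT u.user_id, a.cnt AS audits, l.cnt AS logins
    FROM ugrp_user u
    JOIN (SELECT user_id, COUNT(*) AS cnt FROM audit GROUP BY user_id) a
        ON a.user_id = u.user_id
    JOIN (SELECT user_id, COUNT(*) AS cnt FROM login GROUP BY user_id) l
        ON l.user_id = u.user_id
""").fetchall()

print(join_then_aggregate)
print(aggregate_then_join)
```

With a single child table both forms agree, which is why the problem often goes unnoticed until a second one-to-many join is added.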

Postgres SUM returns 0 rows when columns don't exist. COALESCE and ISNULL don't work

Using Postgres 9.5.
I'm trying to get the SUM of a user's submitted links votes. Think Reddit karma, SO etc.
It works perfectly for users with submitted links, but when the user does not have any links I get 0 rows in my response.
Is this a problem with the SUM(links.votes) or with the inner join? I'm quite new to SQL so I might be going at this from the wrong angle.
My query is as follows:
SELECT
users.firstname,
users.lastname,
users.email,
users.created_on,
SUM(links.votes) AS sum
FROM users
INNER JOIN links
ON users.id = links.created_by
WHERE users.id = 50
AND links.created_by = 50
GROUP BY users.firstname, users.lastname, users.email, users.created_on
I've also tried doing COALESCE(SUM(links.votes), 0) AS sum thinking that it would return 0 in the 'sum' column, but that doesn't work either.
Any ideas here?
Thanks!
You need a left join:
SELECT u.firstname, u.lastname, u.email, u.created_on,
SUM(l.votes) AS sum
FROM users u LEFT JOIN
links l
ON u.id = l.created_by AND
l.created_by = 50
WHERE u.id = 50
GROUP BY u.firstname, u.lastname, u.email, u.created_on;
Note that the condition on the links table goes in the ON clause rather than the WHERE clause.
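You need a LEFT OUTER JOIN, and you must handle NULL, which is what SUM returns when there are no votes to add up.
You can use CASE, or wrap the sum as COALESCE(SUM(l.votes), 0), to return 0 instead of NULL. (COALESCE did not help with the inner join because the row itself was missing, not just the sum.)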
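A quick way to convince yourself of the difference is to run both join types against a user without links. This is a sketch using Python's built-in sqlite3 module rather than Postgres; the `karma` helper and the sample rows are invented for illustration, and the column list is trimmed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, firstname TEXT);
    CREATE TABLE links (id INTEGER PRIMARY KEY, created_by INTEGER, votes INTEGER);
    -- Invented sample data: user 50 has no links, user 51 has two.
    INSERT INTO users VALUES (50, 'Ann'), (51, 'Bob');
    INSERT INTO links (created_by, votes) VALUES (51, 3), (51, 4);
""")

def karma(user_id, join_type):
    """Sum a user's link votes using the given join type ('INNER' or 'LEFT')."""
    sql = f"""
        SELECT u.firstname, COALESCE(SUM(l.votes), 0) AS total
        FROM users u
        {join_type} JOIN links l ON u.id = l.created_by
        WHERE u.id = ?
        GROUP BY u.firstname
    """
    return conn.execute(sql, (user_id,)).fetchall()

inner_result = karma(50, "INNER")  # the join drops the user, so no row at all
left_result = karma(50, "LEFT")    # the user survives; COALESCE turns NULL into 0
with_links = karma(51, "LEFT")     # users with links are unaffected

print(inner_result, left_result, with_links)
```

COALESCE can only replace a NULL inside a row that exists, so it has nothing to work on until the outer join keeps the user's row.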

SQL query at least one of something

I have a bunch of Users, each of whom has many Posts.
Schema:
Users: id
Posts: user_id, rating
How do I find all Users who have at least one post with a rating above, say, 10?
I'm not sure if I should use a subQuery for this, or if there's an easier way.
Thanks!
To find all users with at least one post with a rating above 10, use:
SELECT u.*
FROM USERS u
WHERE EXISTS(SELECT NULL
FROM POSTS p
WHERE p.user_id = u.id
AND p.rating > 10)
EXISTS doesn't care about the SELECT list within it: you could replace NULL with 1/0, which should result in a math error for dividing by zero... But it won't, because EXISTS is only concerned with the filtering in the WHERE clause.
The correlation (the WHERE p.user_id = u.id) is why this is called a correlated subquery: it only returns rows from the USERS table where the id values match, in addition to the rating comparison.
EXISTS can also be faster, depending on the situation, because it returns true as soon as one matching row is found, so duplicates don't matter.
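To illustrate why duplicates don't matter with EXISTS, here is a small comparison against a plain join, sketched with Python's built-in sqlite3 module. The sample rows are invented, and SELECT 1 is used inside the subquery purely as a matter of taste.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    -- Invented sample data: user 1 has two posts rated above 10,
    -- user 2 has none, user 3 has exactly one.
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO posts VALUES (1, 15), (1, 20), (2, 5), (3, 11);
""")

# EXISTS is a yes/no test per user, so each user appears at most once.
exists_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    WHERE EXISTS (SELECT 1
                  FROM posts p
                  WHERE p.user_id = u.id
                    AND p.rating > 10)
    ORDER BY u.id
""")]

# A plain join returns one row per matching post, so user 1 appears twice.
join_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    JOIN posts p ON p.user_id = u.id
    WHERE p.rating > 10
    ORDER BY u.id
""")]

print(exists_ids, join_ids)
```

The join version needs DISTINCT or GROUP BY to get back to one row per user, while the EXISTS version never produces the extra rows in the first place.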
You can join the tables to find the relevant users, and use DISTINCT so each user is in the result set at most once even if they have multiple posts with rating > 10:
select distinct u.id,u.username
from users u inner join posts p on u.id = p.user_id
where p.rating > 10
Use an inner join:
SELECT * from users INNER JOIN posts p on users.id = p.user_id where p.rating > 10;
select distinct id
from users, posts
where id = user_id and rating > 10
SELECT max(p.rating), u.id
from users u
INNER JOIN posts p on u.id = p.user_id
where p.rating > 10
group by u.id;
Additionally, this will tell you what their highest rating is.
The correct answer for your question as stated is OMG Ponies's answer, WHERE EXISTS is more descriptive and almost always faster. But "SELECT NULL" looks really ugly and counterintuitive to me. I've seen SELECT * or SELECT 1 as a best practice for this.
Another way, in case we're collecting answers:
SELECT u.id
FROM users u
JOIN posts p on u.id = p.user_id
WHERE p.rating > 10
GROUP BY u.id
HAVING COUNT(*) >= 1
This could be useful if it's not always 1 you're testing on.
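For the case where the threshold is not always 1, the count can be passed in as a parameter. A minimal sketch using Python's built-in sqlite3 module; the `users_with_at_least` helper and the sample rows are invented, and note that "at least n" translates to `>= n`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE posts (user_id INTEGER, rating INTEGER);
    -- Invented sample data: user 1 has two qualifying posts, user 3 has one.
    INSERT INTO users VALUES (1), (2), (3);
    INSERT INTO posts VALUES (1, 15), (1, 20), (2, 5), (3, 11);
""")

def users_with_at_least(n, min_rating=10):
    """Return ids of users having at least n posts rated above min_rating."""
    sql = """
        SELECT u.id
        FROM users u
        JOIN posts p ON u.id = p.user_id
        WHERE p.rating > ?
        GROUP BY u.id
        HAVING COUNT(*) >= ?
        ORDER BY u.id
    """
    return [row[0] for row in conn.execute(sql, (min_rating, n))]

at_least_one = users_with_at_least(1)
at_least_two = users_with_at_least(2)
print(at_least_one, at_least_two)
```

For n = 1 the HAVING clause is redundant, since the inner join already drops users with no qualifying post; it only starts to matter from n = 2 upwards.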

SQL Query without nested queries

Let's say we have these tables;
table user:
- id
- username
- email
table user2group:
- userid
- groupid
table group:
- id
- groupname
How do I make one query that returns all users, and the groups they belong to (as an array in the resultset or something..)
select u.id, u.username, u.email, g.id as groupid, g.groupname
from user u
join user2group ug on u.id = ug.userid
join group g on g.id = ug.groupid
order by u.id
As you are looping through the result set, each time you see a new userid make a new user object (or whatever) and add the groups to it.
Eric's answer is great, but I would use a LEFT JOIN instead of an INNER to get users that do not belong to any group as well.
SELECT
    u.id,
    u.username,
    u.email,
    g.id AS groupid,
    g.groupname
FROM
    user u
    LEFT JOIN user2group ug ON u.id = ug.userid
    LEFT JOIN group g ON g.id = ug.groupid
ORDER BY
    u.id
Both of the above are more or less correct (it depends on whether every user has a group or not). But they will also both give a result set with several rows for each user who is in more than one group.
There are ways of concatenating every group member into one comma separated string, I'd suggest you read about it here:
http://www.simple-talk.com/sql/t-sql-programming/concatenating-row-values-in-transact-sql/
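The linked article covers Transact-SQL. As a rough illustration of the same idea in a different engine, the sketch below uses SQLite's GROUP_CONCAT aggregate through Python's built-in sqlite3 module; this is a swapped-in technique, not the article's method, and other DBMSs spell it differently. The sample rows and the `group_names` alias are invented, and the table names are quoted because `group` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE "user" (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
    CREATE TABLE "group" (id INTEGER PRIMARY KEY, groupname TEXT);
    CREATE TABLE user2group (userid INTEGER, groupid INTEGER);
    -- Invented sample data: ann is in two groups, bob in none.
    INSERT INTO "user" VALUES (1, 'ann', 'ann@example.com'), (2, 'bob', 'bob@example.com');
    INSERT INTO "group" VALUES (1, 'admins'), (2, 'staff');
    INSERT INTO user2group VALUES (1, 1), (1, 2);
""")

# One row per user; group names are collapsed into a single string.
# The order of names inside the string is not guaranteed by SQLite.
rows = conn.execute("""
    SELECT u.id, u.username, GROUP_CONCAT(g.groupname, ', ') AS group_names
    FROM "user" u
    LEFT JOIN user2group ug ON ug.userid = u.id
    LEFT JOIN "group" g ON g.id = ug.groupid
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

for user_id, username, group_names in rows:
    # Users without any group get NULL (None) rather than an empty string.
    print(user_id, username, group_names)
```

The LEFT JOINs keep users without any group in the result, at the cost of having to handle a NULL in the concatenated column.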
Another method I personally like is to use bit values instead of the relational table user2group.
The user table then gets an int (or bigint) field group, and each group ID is assigned one bit value (i.e. 1, 2, 4, 8, 16 and so on). The value of the user table's group field is then the sum of the IDs of the groups the user belongs to. To check whether a user is in a given group, use a bitwise AND:
where (group & groupID) = groupID
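Here is a minimal sketch of the bit-value approach using Python's built-in sqlite3 module, which supports the `&` operator. The group constants, the `group_mask` column name (chosen because `group` is a reserved word) and the `members_of` helper are all hypothetical.

```python
import sqlite3

# Hypothetical group IDs, each a distinct power of two.
ADMINS, EDITORS, VIEWERS = 1, 2, 4

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE "user" (id INTEGER PRIMARY KEY, username TEXT, group_mask INTEGER)'
)
# Invented sample data: a user's mask is the bitwise OR (equivalently the sum)
# of the IDs of the groups they belong to.
conn.executemany('INSERT INTO "user" VALUES (?, ?, ?)', [
    (1, 'ann', ADMINS | VIEWERS),            # 5
    (2, 'bob', EDITORS),                     # 2
    (3, 'cat', ADMINS | EDITORS | VIEWERS),  # 7
])

def members_of(group_bit):
    """Return usernames whose mask has the given group bit set."""
    sql = 'SELECT username FROM "user" WHERE (group_mask & ?) = ? ORDER BY id'
    return [row[0] for row in conn.execute(sql, (group_bit, group_bit))]

admins = members_of(ADMINS)
editors = members_of(EDITORS)
print(admins, editors)
```

The trade-offs are worth keeping in mind: the number of groups is capped by the integer width, and a predicate like this generally cannot use an ordinary index, so it suits small, fixed sets of groups best.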