Only way to write this SQL JOIN question? - sql

I wrote this SQL query and it seems to work great, but I'm not sure whether it is the correct way to write it or whether there is a better way:
SELECT
art.artid, users.userid
FROM
art LEFT JOIN users
ON
art.userid = users.userid
WHERE
(SELECT COUNT(1) FROM art WHERE art.userid = users.userid) > 5 AND
users.active = '1' AND
art.active = '1' AND
art.status = '0' AND
art.pricesek > 0 GROUP BY users.userid ORDER BY RAND()
It gets the users from the users table that are active and have 5 or more artworks in the art table. It also checks that the artwork is active, that the status of the artwork is set to 0 ("for sale"), and that the price is more than 0. Then it groups the results by userid and returns them in random order.
Is this the correct way to write this, or is there another way?
All input is hardcoded, so no user input will be sent to the database and I'm not worried about injections (should I be worried even if it's hardcoded?).

I made a small change to your code. Instead of using (SELECT COUNT(1) FROM art WHERE art.userid = users.userid) > 5 in the WHERE clause, I put the condition in a HAVING clause.
SELECT art.artid, users.userid
FROM art LEFT JOIN users ON art.userid = users.userid
WHERE users.active = '1' AND art.active = '1' AND
art.status = '0' AND art.pricesek > 0
GROUP BY users.userid, art.artid
HAVING COUNT(users.userid) > 5
ORDER BY RAND()

Your query has problems at many levels. The most obvious is that the GROUP BY clause is inconsistent with the SELECT. That should be generating an error.
You describe the goal as: "It gets the users from users table that are active and has 5 or more artworks in the art table."
For that, I would instead suggest aggregating the art table before joining:
SELECT u.userid
FROM users u JOIN
(SELECT a.userid, COUNT(*) as cnt
FROM art a
WHERE a.active = 1 AND
a.status = 0 AND
a.pricesek > 0
GROUP BY a.userid
) a
ON a.userid = u.userid
WHERE a.cnt > 5 AND u.active = 1
ORDER BY RAND();
Notes:
LEFT JOIN is not appropriate. In order to count the number of artworks, the JOIN must find at least 1 (really 6) matching rows.
It makes no sense to return a.artid. If you need an example, you could use min(a.artid) in the subquery. If you want all of them, then you would need to specify how to return them, but a JSON, array, or string aggregation function would be used in the subquery.
The values "1" and "0" look like numbers, so I removed the single quotes on the assumption that the columns are numeric. Compare numbers to numbers and strings to strings. Try to avoid mixing the two.
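A minimal sketch of the aggregate-before-join idea, run against an in-memory SQLite database rather than MySQL. The sample rows are made up for illustration, and RANDOM() stands in for MySQL's RAND() because SQLite spells it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (userid INTEGER PRIMARY KEY, active INTEGER);
    CREATE TABLE art (artid INTEGER PRIMARY KEY, userid INTEGER,
                      active INTEGER, status INTEGER, pricesek INTEGER);
    INSERT INTO users VALUES (1, 1), (2, 1), (3, 0);
""")

# Hypothetical data: six artworks per user.
# User 2 has one artwork priced 0, so only five qualify; user 3 is inactive.
art_rows = []
for userid in (1, 2, 3):
    for n in range(6):
        price = 0 if (userid == 2 and n == 0) else 100
        art_rows.append((userid, 1, 0, price))
conn.executemany(
    "INSERT INTO art (userid, active, status, pricesek) VALUES (?, ?, ?, ?)",
    art_rows,
)

query = """
    SELECT u.userid
    FROM users u
    JOIN (SELECT a.userid, COUNT(*) AS cnt
          FROM art a
          WHERE a.active = 1 AND a.status = 0 AND a.pricesek > 0
          GROUP BY a.userid) counts
      ON counts.userid = u.userid
    WHERE counts.cnt > 5 AND u.active = 1
    ORDER BY RANDOM()  -- SQLite spelling; MySQL uses RAND()
"""
qualifying_users = [row[0] for row in conn.execute(query)]
print(qualifying_users)  # [1]
```

Only user 1 comes back: user 2 has just five qualifying artworks (the filter is > 5), and user 3 is not active.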

Related

SQL Server left join same table twice I'm getting duplicate rows

I have this query:
SELECT
u.UserId,
up.PhoneNumber AS OfficePhoneNumber,
up2.PhoneNumber
FROM
[OnlineTools].[App].[User] AS u
LEFT JOIN
[UserPhone] AS up ON up.UserId = u.UserId
AND up.PhoneType = 'Work'
LEFT JOIN
[UserPhone] AS up2 ON up2.UserId = u.UserId
AND up2.PhoneType = 'Mobile'
The expected result is three records, and that is what I get when I left join the UserPhone table only once.
When I join the same table again to get the mobile phones, I get 18 records instead of three.
What can I improve here in order to get the correct records?
You are getting a cross join (Cartesian product) per user. If a user has more than one row of either phone type, each "Work" row is paired with every "Mobile" row for that user, so the row counts multiply instead of adding up.
I would suggest joining ONCE to the given phone table and filter on just the two types, but add a column to SHOW what type it was... Something like
SELECT
u.UserId,
up.PhoneNumber,
up.PhoneType
FROM
[OnlineTools].[App].[User] u
JOIN UserPhone up
ON u.UserId = up.UserId
AND up.PhoneType in ( 'Work', 'Mobile' )
This way each phone number is retrieved once, and the PhoneType column shows whether it is a work or a mobile number. Now, if you want the work phones to be listed first, just add
order by
up.PhoneType DESC
This lists all Work phones first, then any Mobile ones; if there are no work phones, only the mobile ones show (or vice versa).
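To see the row multiplication concretely, here is a small sketch using an in-memory SQLite database. The table name app_user and the phone numbers are invented for the example; one user with three work numbers and two mobile numbers is enough to show the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (UserId INTEGER PRIMARY KEY);
    CREATE TABLE UserPhone (UserId INTEGER, PhoneType TEXT, PhoneNumber TEXT);
    INSERT INTO app_user VALUES (1);
    INSERT INTO UserPhone VALUES
        (1, 'Work', '100'), (1, 'Work', '101'), (1, 'Work', '102'),
        (1, 'Mobile', '200'), (1, 'Mobile', '201');
""")

# Joining the phone table twice pairs every Work row with every Mobile row.
two_joins = conn.execute("""
    SELECT u.UserId, up.PhoneNumber, up2.PhoneNumber
    FROM app_user u
    LEFT JOIN UserPhone up  ON up.UserId  = u.UserId AND up.PhoneType  = 'Work'
    LEFT JOIN UserPhone up2 ON up2.UserId = u.UserId AND up2.PhoneType = 'Mobile'
""").fetchall()

# Joining once returns one row per phone number, with the type as a column.
single_join = conn.execute("""
    SELECT u.UserId, up.PhoneNumber, up.PhoneType
    FROM app_user u
    JOIN UserPhone up
      ON up.UserId = u.UserId AND up.PhoneType IN ('Work', 'Mobile')
    ORDER BY up.PhoneType DESC
""").fetchall()

print(len(two_joins))    # 6 rows: 3 work x 2 mobile
print(len(single_join))  # 5 rows: 3 work + 2 mobile
```

With the double join the counts multiply (3 x 2), while the single join adds them up (3 + 2) and sorts the Work rows first.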
Assuming that for each user there is 1 work phone and 1 mobile phone, you could use conditional aggregation instead of the 2 joins:
SELECT u.UserId,
MAX(CASE WHEN up.PhoneType = 'Work' THEN up.PhoneNumber END) AS OfficePhoneNumber,
MAX(CASE WHEN up.PhoneType = 'Mobile' THEN up.PhoneNumber END) AS MobilePhoneNumber
FROM [OnlineTools].[App].[User] AS u LEFT JOIN [UserPhone] AS up
ON up.UserId = u.UserId
GROUP BY u.UserId;
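A quick sketch of the conditional aggregation, again on an in-memory SQLite database with invented table contents (app_user stands in for the [User] table). It assumes at most one phone per type and user, as stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (UserId INTEGER PRIMARY KEY);
    CREATE TABLE UserPhone (UserId INTEGER, PhoneType TEXT, PhoneNumber TEXT);
    INSERT INTO app_user VALUES (1), (2), (3);
    INSERT INTO UserPhone VALUES
        (1, 'Work', '555-0100'), (1, 'Mobile', '555-0199'),
        (2, 'Work', '555-0101');
""")

# One row per user; each CASE picks out the number of one phone type,
# and MAX collapses the per-phone rows into a single value (NULL if none).
phones = conn.execute("""
    SELECT u.UserId,
           MAX(CASE WHEN up.PhoneType = 'Work'   THEN up.PhoneNumber END) AS OfficePhoneNumber,
           MAX(CASE WHEN up.PhoneType = 'Mobile' THEN up.PhoneNumber END) AS MobilePhoneNumber
    FROM app_user u
    LEFT JOIN UserPhone up ON up.UserId = u.UserId
    GROUP BY u.UserId
    ORDER BY u.UserId
""").fetchall()

for row in phones:
    print(row)
```

User 3 has no phones at all and still appears, with NULL (None in Python) in both phone columns, because of the LEFT JOIN.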

SELECT DISTINCT + ORDER BY additional expression

I have no experience with PostgreSQL and I am migrating a Rails5+MySQL application to Rails5+PostgreSQL and I am having a problem with a query.
I've already looked at some questions/answers and still haven't been able to solve my problem. My problem seems to be ridiculous, but I needed to ask for help here!
Query:
SELECT DISTINCT users.* FROM users
INNER JOIN areas_users ON areas_users.user_id = users.id
INNER JOIN areas ON areas.deleted_at IS NULL AND areas.id = areas_users.area_id
WHERE users.deleted_at IS NULL AND users.company_id = 2 AND areas.id IN (2, 4, 5)
ORDER BY CASE WHEN users.id=3 THEN 0 WHEN users.id=5 THEN 1 END, users.id, 1 ASC
Running the query in DBeaver, returns the error:
SQL Error [42P10]: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
What do I need to do to be able to use this SELECT DISTINCT with this ORDER BY CASE?
It's like the error message says:
for SELECT DISTINCT, ORDER BY expressions must appear in select list
This is an expression:
CASE WHEN users.id=3 THEN 0 WHEN users.id=5 THEN 1 END
You cannot order by it, while doing SELECT DISTINCT users.* FROM ... because that only allows ORDER BY expressions that appear in the SELECT list.
Typically, the best solution for DISTINCT is not to use it in the first place. If you don't duplicate rows, you don't have to de-duplicate them later. See:
How to speed up select distinct?
In your case, use an EXISTS semi-join (expression / subquery) instead of the joins. This avoids the duplication. Assuming the rows in table users are distinct, DISTINCT is out of a job.
SELECT u.*
FROM users u
WHERE u.deleted_at IS NULL
AND u.company_id = 2
AND EXISTS (
SELECT FROM areas_users au JOIN areas a ON a.id = au.area_id
WHERE au.user_id = u.id
AND a.id IN (2, 4, 5)
AND a.deleted_at IS NULL
)
ORDER BY CASE u.id WHEN 3 THEN 0
WHEN 5 THEN 1 END, u.id, 1; -- ①
Does what you request, and typically much faster, too.
Using simple ("switched") CASE syntax.
① There is still an ugly bit. Using a positional reference in ORDER BY can be convenient short syntax. But while you have SELECT *, it's a really bad idea. If the order of columns in the underlying table changes, your query is silently changed. Spell out the column in this use case!
(Typically, you don't need SELECT * in the first place, but just a selection of columns.)
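The difference between the plain joins and the EXISTS semi-join can be sketched on an in-memory SQLite database. The rows are invented, and two details differ from the Postgres query: SQLite needs SELECT 1 inside EXISTS (an empty select list is a Postgres feature), and the CASE gets an explicit ELSE 2 because SQLite sorts NULL first in ascending order while Postgres sorts it last.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, company_id INTEGER, deleted_at TEXT);
    CREATE TABLE areas (id INTEGER PRIMARY KEY, deleted_at TEXT);
    CREATE TABLE areas_users (user_id INTEGER, area_id INTEGER);
    INSERT INTO users VALUES (1, 2, NULL), (3, 2, NULL), (5, 2, NULL),
                             (7, 2, NULL), (8, 2, NULL);
    INSERT INTO areas VALUES (2, NULL), (4, NULL), (5, NULL), (6, NULL);
    INSERT INTO areas_users VALUES (1, 2), (3, 2), (3, 4), (5, 5),
                                   (7, 2), (7, 5), (8, 6);
""")

# Plain joins: a user appears once per matching area.
join_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    JOIN areas_users au ON au.user_id = u.id
    JOIN areas a ON a.id = au.area_id AND a.deleted_at IS NULL
    WHERE u.deleted_at IS NULL AND u.company_id = 2 AND a.id IN (2, 4, 5)
""")]

# EXISTS semi-join: each user appears at most once, so no DISTINCT is needed
# and ORDER BY is free to use any expression.
exists_ids = [row[0] for row in conn.execute("""
    SELECT u.id
    FROM users u
    WHERE u.deleted_at IS NULL
      AND u.company_id = 2
      AND EXISTS (
          SELECT 1
          FROM areas_users au
          JOIN areas a ON a.id = au.area_id
          WHERE au.user_id = u.id
            AND a.id IN (2, 4, 5)
            AND a.deleted_at IS NULL)
    ORDER BY CASE u.id WHEN 3 THEN 0 WHEN 5 THEN 1 ELSE 2 END, u.id
""")]

print(sorted(join_ids))  # [1, 3, 3, 5, 7, 7]
print(exists_ids)        # [3, 5, 1, 7]
```

Users 3 and 7 belong to two of the listed areas each, so the join returns them twice; the EXISTS version returns every qualifying user once, with 3 and 5 sorted to the front.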
If your ID column is guaranteed to have positive numbers, this would be a bit faster:
...
ORDER BY CASE u.id WHEN 3 THEN -2
WHEN 5 THEN -1
ELSE u.id END, <name_of_first_column>
I MUST use DISTINCT
(Really?) If you insist:
SELECT DISTINCT CASE u.id WHEN 3 THEN -2 WHEN 5 THEN -1 ELSE u.id END AS order_column, u.*
FROM users u
JOIN areas_users au ON au.user_id = u.id
JOIN areas a ON a.id = au.area_id
WHERE u.deleted_at IS NULL
AND u.company_id = 2
AND a.id IN (2, 4, 5)
AND a.deleted_at IS NULL
ORDER BY 1, <name_of_previously_first_column>; -- now, "ORDER BY 1" is ok
You get the additional column order_column in the result. You can wrap it in a subquery with a different SELECT ...
Just a proof of concept. Don't use this.
Or DISTINCT ON?
SELECT DISTINCT ON (CASE u.id WHEN 3 THEN -2 WHEN 5 THEN -1 ELSE u.id END, <name_of_first_column>)
u.*
FROM users u
JOIN areas_users au ON au.user_id = u.id
JOIN areas a ON a.id = au.area_id
WHERE u.deleted_at IS NULL
AND u.company_id = 2
AND a.id IN (2, 4, 5)
AND a.deleted_at IS NULL
ORDER BY CASE u.id WHEN 3 THEN -2 WHEN 5 THEN -1 ELSE u.id END, <name_of_first_column>;
This works without returning an additional column. Still just proof of concept. Don't use it, the EXISTS query is much cheaper.
See:
Select first row in each GROUP BY group?

SQL single-row subquery returns more than one row?

I'm trying to get the ID and user name from one query, but at the same time I'm checking in my WHERE clause whether the ID exists in another table. I got this error:
ORA-01427: single-row subquery returns more than one row
Here is how my query looks:
SELECT s.ID, s.LASTFIRST
From USERS s
Left Outer Join CALENDAR c
On s.ID = c.USERID
Where c.SUPERVISOR = '103'
And TO_CHAR(c.DATEENROLLED,'fmmm/fmdd/yyyy') >= '4/22/2016'
And TO_CHAR(c.DATELEFT,'fmmm/fmdd/yyyy') <= '4/22/2016'
And s.ID != (SELECT USER_ID
From RESERVATIONS
Where EVENT_ID = '56')
The query inside my WHERE clause returns two IDs, 158 and 159, so these two should not be returned by the outer query that selects s.ID and s.LASTFIRST. What could cause this error?
Use not in instead of !=.
!= and = compare against a single value; not in and in compare against a set of values.
And s.ID not in (SELECT USER_ID
From RESERVATIONS
Where EVENT_ID = '56')
Edit: not in vs not exists
Not exists is a perfectly viable option as well. In fact, not exists is better than not in if there is a possibility of null values in the subquery result set: in Oracle, a single null in that set causes not in to return no rows. As a general rule, I use not in for ID columns declared not null, and not exists for everything else. It may be better practice to always use not exists; personal preference, I suppose.
Not exists would be written like so:
SELECT s.ID, s.LASTFIRST
From USERS s
Left Outer Join CALENDAR c
On s.ID = c.USERID
Where c.SUPERVISOR = '103'
And TO_CHAR(c.DATEENROLLED,'fmmm/fmdd/yyyy') >= '4/22/2016'
And TO_CHAR(c.DATELEFT,'fmmm/fmdd/yyyy') <= '4/22/2016'
And not exists (SELECT USER_ID
From RESERVATIONS r
Where r.USER_ID = S.ID
And EVENT_ID = '56')
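The null pitfall with not in is easy to reproduce. This is a sketch on an in-memory SQLite database rather than Oracle, with invented rows; the three-valued logic involved is standard SQL, so the outcome is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE reservations (user_id INTEGER, event_id TEXT);
    INSERT INTO users VALUES (157), (158), (159), (160);
    INSERT INTO reservations VALUES (158, '56'), (159, '56'), (160, '57');
""")

NOT_IN = """
    SELECT s.id FROM users s
    WHERE s.id NOT IN (SELECT user_id FROM reservations WHERE event_id = '56')
    ORDER BY s.id
"""
NOT_EXISTS = """
    SELECT s.id FROM users s
    WHERE NOT EXISTS (SELECT 1 FROM reservations r
                      WHERE r.user_id = s.id AND r.event_id = '56')
    ORDER BY s.id
"""

def ids(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

before_null = ids(NOT_IN)

# A reservation without a user: the subquery now returns a NULL.
conn.execute("INSERT INTO reservations VALUES (NULL, '56')")
not_in_after_null = ids(NOT_IN)
not_exists_after_null = ids(NOT_EXISTS)

print(before_null)            # [157, 160]
print(not_in_after_null)      # []
print(not_exists_after_null)  # [157, 160]
```

Once the subquery contains a NULL, "id NOT IN (...)" can no longer evaluate to true for any row, while NOT EXISTS is unaffected.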
Performance
In Oracle there is no performance difference between using not in, not exists or a left join.
Source : https://explainextended.com/2009/09/17/not-in-vs-not-exists-vs-left-join-is-null-oracle/
Oracle's optimizer is able to see that NOT EXISTS, NOT IN and LEFT JOIN / IS NULL are semantically equivalent as long as the list values are declared as NOT NULL.
It uses same execution plan for all three methods, and they yield same results in same time.
This is a formatted comment that is not related to your question.
This is slow:
And TO_CHAR(c.DATEENROLLED,'fmmm/fmdd/yyyy') >= '4/22/2016'
because you are filtering on a function result.
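This compares real dates instead of strings (text in month/day/year form sorts lexically, not chronologically) and can be much faster, since the column itself is no longer wrapped in a function: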
And c.DATEENROLLED >= to_date('4/22/2016','fmmm/fmdd/yyyy')
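Besides speed, comparing formatted text can give a different answer than comparing dates. A tiny Python illustration; the as_text helper is hypothetical and only assumed to resemble the month-first text that the TO_CHAR call produces.

```python
from datetime import date

def as_text(d):
    """Month-first text without zero padding, e.g. 5/1/2015 (assumed format)."""
    return f"{d.month}/{d.day}/{d.year}"

enrolled = date(2015, 5, 1)   # almost a year BEFORE the cutoff
cutoff = date(2016, 4, 22)

# String comparison looks at characters left to right: "5" > "4".
text_result = as_text(enrolled) >= as_text(cutoff)

# Date comparison looks at the actual point in time.
date_result = enrolled >= cutoff

print(as_text(enrolled), ">=", as_text(cutoff), "->", text_result)  # True
print(enrolled, ">=", cutoff, "->", date_result)                    # False
```

The text comparison says the 2015 date is "on or after" the 2016 cutoff, which is why the filter should convert the literal to a date and leave the column alone.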
Edit starts here
Aaron D's answer says to use not in. Here are two faster ways to do the same thing:
left join reservations r on s.id = user_id
and r.event_id = '56'
etc
where r.user_id is null
or
where s.id in
(
select id
from users
minus
select user_id
from reservations
where event_id = 56
)
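Both alternatives can be tried out with a small sketch on an in-memory SQLite database. The rows are invented, and SQLite's EXCEPT is used where Oracle writes MINUS. Note that the set difference has to start from the full list of users if users without any reservation should be kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY);
    CREATE TABLE reservations (user_id INTEGER, event_id INTEGER);
    INSERT INTO users VALUES (157), (158), (159), (160);
    INSERT INTO reservations VALUES (158, 56), (159, 56), (158, 57), (160, 57);
""")

# Anti-join: keep users for whom the outer join found no event-56 reservation.
left_join_ids = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM users s
    LEFT JOIN reservations r ON r.user_id = s.id AND r.event_id = 56
    WHERE r.user_id IS NULL
    ORDER BY s.id
""")]

# Set difference starting from all users (EXCEPT is SQLite's MINUS).
except_from_users = sorted(row[0] for row in conn.execute("""
    SELECT id FROM users
    EXCEPT
    SELECT user_id FROM reservations WHERE event_id = 56
"""))

# Starting from reservations instead drops users with no reservation at all.
except_from_reservations = sorted(row[0] for row in conn.execute("""
    SELECT user_id FROM reservations
    EXCEPT
    SELECT user_id FROM reservations WHERE event_id = 56
"""))

print(left_join_ids)             # [157, 160]
print(except_from_users)         # [157, 160]
print(except_from_reservations)  # [160]
```

User 157 never made a reservation, so it only survives when the left-hand side of the set difference is the users table.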

PostgreSQL query returning multiple rows instead of one

I have two tables: user and projects, with a one-to-many relationship between two.
projects table has field status with project statuses of the user.
status can be one of:
launched, confirm, staffed, overdue, complete, failed, ended
I want to categorize users in two categories:
users having projects in launched phase
users having projects other than launched status.
I am using the following query:
SELECT DISTINCT(u.*), CASE
WHEN p.status = 'LAUNCHED' THEN 1
ELSE 2
END as user_category
FROM users u
LEFT JOIN projects p ON p.user_id = u.id
WHERE (LOWER(u.username) like '%%%'
OR LOWER(u.personal_intro) like '%%%'
OR LOWER(u.location) like '%%%'
OR u.account_status != 'DELETED'
AND system_role=10 AND u.account_status ='ACTIVE')
ORDER BY set_order, u.page_hits DESC
LIMIT 10
OFFSET 0
I am facing duplicate records for following scenario:
If user has projects with status launched as well as overdue, complete or failed, then that user is recorded two times as both the conditions in CASE are satisfying for that user.
Please suggest a query where a user that has any project in launched status gets his user_category set to 1. The same user should not be repeated for user_category 2.
The query is probably not doing what you think it does, for a number of reasons:
There is DISTINCT and there is DISTINCT ON (col1, col2).
DISTINCT (u.*) is no different from DISTINCT u.*. The parentheses are just noise.
AND binds before OR according to operator precedence. I suspect you want to use parentheses around the conditions OR'ed together? Or do you need it the way it is? But you don't need parentheses around the whole WHERE clause in any case.
Your expression LOWER(u.username) LIKE '%%%' doesn't make any sense. Every non-null string qualifies. Can be replaced with u.username IS NOT NULL. I suspect you want something different?
Postgres is case sensitive in string handling. You write of status being 'launched' etc. but use 'LAUNCHED' in your query. Which is it?
A couple of table qualifications are missing from the question making it ambiguous for the reader. I filled in as I saw fit.
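The precedence point is worth checking on a few rows. A sketch on an in-memory SQLite database with invented users and a simplified WHERE clause (one LIKE condition instead of three); AND binds tighter than OR in SQLite just as in Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, system_role INTEGER, account_status TEXT);
    INSERT INTO users VALUES
        ('alex', 1,  'DELETED'),
        ('bob',  10, 'ACTIVE'),
        ('carl', 10, 'DELETED'),
        ('max',  1,  'ACTIVE');
""")

# Read by the database as: LIKE ... OR (system_role = 10 AND account_status = 'ACTIVE')
without_parens = [row[0] for row in conn.execute("""
    SELECT username FROM users
    WHERE username LIKE '%x%'
       OR system_role = 10
      AND account_status = 'ACTIVE'
    ORDER BY username
""")]

# Explicit grouping: (LIKE ... OR system_role = 10) AND account_status = 'ACTIVE'
with_parens = [row[0] for row in conn.execute("""
    SELECT username FROM users
    WHERE (username LIKE '%x%' OR system_role = 10)
      AND account_status = 'ACTIVE'
    ORDER BY username
""")]

print(without_parens)  # ['alex', 'bob', 'max']
print(with_parens)     # ['bob', 'max']
```

Without parentheses the deleted user alex slips through, because the account_status test only applies to the last OR branch.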
Everything put together, it might work like this:
SELECT DISTINCT ON (u.set_order, u.page_hits, u.id)
u.*
, CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END AS user_category
FROM users u
LEFT JOIN projects p ON p.user_id = u.id
WHERE LOWER(u.username) LIKE '%%%' -- ???
OR LOWER(u.personal_intro) LIKE '%%%'
OR LOWER(u.location) LIKE '%%%'
OR u.account_status != 'DELETED' -- with original logic
AND u.system_role = 10
AND u.account_status = 'ACTIVE'
ORDER BY u.set_order, u.page_hits DESC, u.id, user_category
LIMIT 10
Detailed explanation in this related question:
Select first row in each GROUP BY group?
Two EXISTS semi-joins instead of the DISTINCT ON and CASE might be faster:
SELECT u.*
, CASE WHEN EXISTS (
SELECT FROM projects p
WHERE p.user_id = u.id AND p.status = 'LAUNCHED')
THEN 1 ELSE 2 END AS user_category
FROM users u
WHERE
( LOWER(u.username) LIKE '%%%' -- ???
OR LOWER(u.personal_intro) LIKE '%%%'
OR LOWER(u.location) LIKE '%%%'
OR u.account_status != 'DELETED' -- with alternative logic?
)
AND u.system_role = 10 -- assuming it comes from users ???
AND u.account_status = 'ACTIVE'
AND EXISTS (SELECT 1 FROM projects p WHERE p.user_id = u.id)
ORDER BY u.set_order, u.page_hits DESC
LIMIT 10;
You can use MIN() on your CASE result, and it seems dropping the DISTINCT would be a wise choice:
SELECT u.*, MIN(CASE
WHEN p.status = 'LAUNCHED' THEN 1
ELSE 2
END) as user_category
...
GROUP BY <list all columns in the users table>
...
Since "launched" gives a 1, using MIN() will not only force a single result but will also give preference to "launched" over the other states.
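Here is a sketch of the MIN() idea on an in-memory SQLite database with invented users and projects. Only two user columns are used, so the GROUP BY list stays short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE projects (user_id INTEGER, status TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'ben'), (3, 'cy');
    INSERT INTO projects VALUES (1, 'LAUNCHED'), (1, 'OVERDUE'), (2, 'COMPLETE');
""")

# DISTINCT over the CASE result still yields two rows for user 1.
with_distinct = conn.execute("""
    SELECT DISTINCT u.id,
           CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END AS user_category
    FROM users u
    LEFT JOIN projects p ON p.user_id = u.id
    ORDER BY u.id, user_category
""").fetchall()

# MIN() over the CASE result gives one row per user and prefers category 1.
with_min = conn.execute("""
    SELECT u.id, u.username,
           MIN(CASE WHEN p.status = 'LAUNCHED' THEN 1 ELSE 2 END) AS user_category
    FROM users u
    LEFT JOIN projects p ON p.user_id = u.id
    GROUP BY u.id, u.username
    ORDER BY u.id
""").fetchall()

print(with_distinct)  # [(1, 1), (1, 2), (2, 2), (3, 2)]
print(with_min)       # [(1, 'ann', 1), (2, 'ben', 2), (3, 'cy', 2)]
```

User 1 has both a launched and an overdue project; MIN() keeps the 1. User 3 has no projects and lands in category 2 through the ELSE branch.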

Help with Complicated SELECT query

I have this SELECT query:
SELECT Auctions.ID, Users.Balance, Users.FreeBids,
COUNT(CASE WHEN Bids.Burned=0 AND Auctions.Closed=0 THEN 1 END) AS 'ActiveBids',
COUNT(CASE WHEN Bids.Burned=1 AND Auctions.Closed=0 THEN 1 END) AS 'BurnedBids'
FROM (Users INNER JOIN Bids ON Users.ID=Bids.BidderID)
INNER JOIN Auctions
ON Bids.AuctionID=Auctions.ID
WHERE Users.ID=#UserID
GROUP BY Users.Balance, Users.FreeBids, Auctions.ID
My problem is that it returns no rows if the UserID can't be found in the Bids table.
I know it has something to do with my
(Users INNER JOIN Bids ON Users.ID=Bids.BidderID)
But I don't know how to make it return a row even if the user is not in the Bids table.
You're doing an INNER JOIN, which only returns rows if there are results on both sides of the join. To get what you want, change the join in your FROM clause like this:
Users LEFT JOIN Bids ON Users.ID=Bids.BidderID
You may also have to change your SELECT statement to handle Bids.Burned being NULL.
If you want to return rows even if there's no matching Auction, then you'll have to make some deeper changes to your query.
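A sketch of the difference on an in-memory SQLite database, with invented rows and both joins switched together. The bid_summary helper is hypothetical and exists only to run the same query with either join type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Balance INTEGER, FreeBids INTEGER);
    CREATE TABLE Auctions (ID INTEGER PRIMARY KEY, Closed INTEGER);
    CREATE TABLE Bids (BidderID INTEGER, AuctionID INTEGER, Burned INTEGER);
    INSERT INTO Users VALUES (1, 50, 3), (2, 20, 0);   -- user 2 never bid
    INSERT INTO Auctions VALUES (10, 0), (11, 1);
    INSERT INTO Bids VALUES (1, 10, 0), (1, 10, 1), (1, 10, 0), (1, 11, 0);
""")

def bid_summary(join_kind, user_id):
    """Run the question's query with INNER or LEFT joins (hypothetical helper)."""
    sql = f"""
        SELECT a.ID, u.Balance, u.FreeBids,
               COUNT(CASE WHEN b.Burned = 0 AND a.Closed = 0 THEN 1 END) AS ActiveBids,
               COUNT(CASE WHEN b.Burned = 1 AND a.Closed = 0 THEN 1 END) AS BurnedBids
        FROM Users u
        {join_kind} JOIN Bids b ON u.ID = b.BidderID
        {join_kind} JOIN Auctions a ON b.AuctionID = a.ID
        WHERE u.ID = ?
        GROUP BY u.Balance, u.FreeBids, a.ID
    """
    return conn.execute(sql, (user_id,)).fetchall()

inner_rows = bid_summary("INNER", 2)
left_rows = bid_summary("LEFT", 2)

print(inner_rows)  # []
print(left_rows)   # [(None, 20, 0, 0, 0)]
```

With LEFT JOIN the user without bids still produces a row. COUNT skips the NULLs coming from the missing Bids and Auctions rows, so both counts are 0 and the auction ID is NULL.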
My problem is that it returns no rows if the UserID can't be found in the Bids table.
Then INNER JOIN Bids/Auctions should probably be left outer joins. The way you've written it, you're filtering users so that only those in bids and auctions appear.
Left join is the simple answer, but if you're worried about performance I'd consider re-writing it a little bit. For one thing, the order of the columns in the group by matters to performance (although it often doesn't change the results). Generally, you want to group by a column that's indexed first.
Also, it's possible to re-write this query to only have one group by, which will probably speed things up.
Try this out:
with UserBids as (
select
a.ID
, b.BidderID
, ActiveBids = count(case when b.Burned = 0 then 1 end)
, BurnedBids = count(case when b.Burned = 1 then 1 end)
from Bids b
join Auctions a
on a.ID = b.AuctionID
where a.Closed = 0
group by b.BidderID, a.ID
)
select
b.ID
, u.Balance
, u.FreeBids
, b.ActiveBids
, b.BurnedBids
from Users u
left join UserBids b
on b.BidderID = u.ID
where u.ID = #UserID;
If you're not familiar with the with UserBids as..., it's called a CTE (common table expression), and is basically a way to make a one-time use view, and a nice way to structure your queries.
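For anyone who wants to try the CTE approach, here is a sketch on an in-memory SQLite database with invented rows. It uses AS aliases because the "name = expression" alias form is SQL Server syntax, and it counts burned bids with Burned = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (ID INTEGER PRIMARY KEY, Balance INTEGER, FreeBids INTEGER);
    CREATE TABLE Auctions (ID INTEGER PRIMARY KEY, Closed INTEGER);
    CREATE TABLE Bids (BidderID INTEGER, AuctionID INTEGER, Burned INTEGER);
    INSERT INTO Users VALUES (1, 50, 3), (2, 20, 0);   -- user 2 never bid
    INSERT INTO Auctions VALUES (10, 0), (11, 1);      -- auction 11 is closed
    INSERT INTO Bids VALUES (1, 10, 0), (1, 10, 1), (1, 10, 0), (1, 11, 0);
""")

CTE_QUERY = """
    WITH UserBids AS (
        -- One row per bidder and open auction, with both counts.
        SELECT a.ID, b.BidderID,
               COUNT(CASE WHEN b.Burned = 0 THEN 1 END) AS ActiveBids,
               COUNT(CASE WHEN b.Burned = 1 THEN 1 END) AS BurnedBids
        FROM Bids b
        JOIN Auctions a ON a.ID = b.AuctionID
        WHERE a.Closed = 0
        GROUP BY b.BidderID, a.ID
    )
    SELECT ub.ID, u.Balance, u.FreeBids, ub.ActiveBids, ub.BurnedBids
    FROM Users u
    LEFT JOIN UserBids ub ON ub.BidderID = u.ID
    WHERE u.ID = ?
"""

bidder_rows = conn.execute(CTE_QUERY, (1,)).fetchall()
non_bidder_rows = conn.execute(CTE_QUERY, (2,)).fetchall()

print(bidder_rows)      # [(10, 50, 3, 2, 1)]
print(non_bidder_rows)  # [(None, 20, 0, None, None)]
```

For a user without bids the counts come back as NULL rather than 0, because the LEFT JOIN finds no CTE row; wrap them in COALESCE(..., 0) if zeros are preferred.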