Subquery and normal query come out with different results - sql

I'm an Oracle beginner. I'm working on an exercise that I wrote once with subqueries (without JOIN) and once as a normal query (with JOIN), but the two queries return different results and I can't figure out why.
The exercise asks for the details of dog owners who have booked at least twice on this platform.
SELECT PET_OWNER.Owner_id,Oname,OAdd,COUNT(*) AS BOOKING
FROM PET_OWNER
WHERE Owner_id IN(
SELECT Owner_id
FROM PET
WHERE PType = 'DOG' AND Pet_id IN(SELECT Pet_id FROM BOOKING))
GROUP BY PET_OWNER.Owner_id,Oname,OAdd
HAVING COUNT(*) >=2
ORDER BY PET_OWNER.Owner_id;
This subquery version returns no rows.
SELECT PET_OWNER.Owner_id,Oname,OAdd,COUNT(*) AS BOOKING
FROM PET_OWNER,PET,BOOKING
WHERE PET_OWNER.Owner_id = PET.Owner_id AND
PET.Pet_id = BOOKING.Pet_id AND
PType = 'DOG'
GROUP BY PET_OWNER.Owner_id,Oname,OAdd
HAVING COUNT(*) >=2
ORDER BY PET_OWNER.Owner_id;
This query returns 10 records, which is the correct answer for this question.
I expected the two queries to return the same result, but they don't.
Does anyone know what is wrong?
Can anyone show me how to rewrite this as a subquery?

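Because a duplicated join key duplicates rows in the result.
In your case, Owner_id is presumably non-unique in the PET table (and Pet_id non-unique in BOOKING), so the join version produces one row per booking and COUNT(*) counts bookings. The subquery version reads only from PET_OWNER, so every group holds exactly one row and HAVING COUNT(*) >= 2 filters everything out.
It is still possible to get the correct answer by using a join. Since Owner_id in the derived table t is unique, the join to Pet_Owner cannot duplicate rows, and the execution plan should be much the same as for a subquery version.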
select p.* from Pet_Owner p
join (
select PET.Owner_id
from PET
inner join Booking on Booking.Pet_id = PET.Pet_id
where pType = 'DOG'
group by PET.Owner_id
having count(1) >= 2) t
on t.Owner_id = p.Owner_id
order by p.Owner_id
By the way, your SQL is old-school: the comma-separated FROM list is ANSI-89 style, while the explicit JOIN syntax has been available since ANSI-92. I know many school teachers still love the old style. I hope you can read both, but write new code only in the ANSI-92 way.
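To see why the two versions disagree, here is a minimal sketch that runs both of the question's queries and the derived-table query above against an in-memory SQLite database. The sample rows, the Booking_id column and the column types are assumptions made up for the illustration (the real schema is not shown in the question), and SQLite stands in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PET_OWNER (Owner_id INTEGER PRIMARY KEY, Oname TEXT, OAdd TEXT);
CREATE TABLE PET (Pet_id INTEGER PRIMARY KEY, Owner_id INTEGER, PType TEXT);
CREATE TABLE BOOKING (Booking_id INTEGER PRIMARY KEY, Pet_id INTEGER);
-- Hypothetical data: owner 1 has a dog booked twice, owner 2 has a dog
-- booked once, owner 3 has a cat booked twice.
INSERT INTO PET_OWNER VALUES (1, 'Ann', 'A St'), (2, 'Bob', 'B St'), (3, 'Cy', 'C St');
INSERT INTO PET VALUES (10, 1, 'DOG'), (20, 2, 'DOG'), (30, 3, 'CAT');
INSERT INTO BOOKING VALUES (100, 10), (101, 10), (102, 20), (103, 30), (104, 30);
""")

# The question's IN version: only PET_OWNER is in FROM, so each group has
# exactly one row and COUNT(*) can never reach 2.
in_rows = conn.execute("""
SELECT PET_OWNER.Owner_id, Oname, OAdd, COUNT(*) AS BOOKING
FROM PET_OWNER
WHERE Owner_id IN (
    SELECT Owner_id FROM PET
    WHERE PType = 'DOG' AND Pet_id IN (SELECT Pet_id FROM BOOKING))
GROUP BY PET_OWNER.Owner_id, Oname, OAdd
HAVING COUNT(*) >= 2
ORDER BY PET_OWNER.Owner_id
""").fetchall()

# The question's join version: one row per booking, so COUNT(*) counts bookings.
join_rows = conn.execute("""
SELECT PET_OWNER.Owner_id, Oname, OAdd, COUNT(*) AS BOOKING
FROM PET_OWNER, PET, BOOKING
WHERE PET_OWNER.Owner_id = PET.Owner_id
  AND PET.Pet_id = BOOKING.Pet_id
  AND PType = 'DOG'
GROUP BY PET_OWNER.Owner_id, Oname, OAdd
HAVING COUNT(*) >= 2
ORDER BY PET_OWNER.Owner_id
""").fetchall()

# The derived-table version: count per owner first, then join back.
derived_rows = conn.execute("""
SELECT p.* FROM Pet_Owner p
JOIN (
    SELECT PET.Owner_id
    FROM PET
    INNER JOIN Booking ON Booking.Pet_id = PET.Pet_id
    WHERE PType = 'DOG'
    GROUP BY PET.Owner_id
    HAVING COUNT(1) >= 2) t
ON t.Owner_id = p.Owner_id
ORDER BY p.Owner_id
""").fetchall()

print(in_rows, join_rows, derived_rows)
```

With this sample data the IN version comes back empty, while the other two return only the owner whose dog was booked twice.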

What happens is that grouping by PET_OWNER.Owner_id, Oname, OAdd gives you one row per distinct owner, so COUNT(*) is always 1. What we need is to count per owner_id first.
Here's your query: first get the owner_ids with COUNT(1) >= 2 in a subquery.
select * from Pet_Owner where Owner_id in (
select t1.Owner_id from PET_OWNER t1
inner join PET t2 on t1.Owner_id = t2.Owner_id
inner join Booking t3 on t3.Pet_id = t2.Pet_id
where pType = 'DOG'
group by t1.Owner_id
having count(1) >= 2)
order by Owner_id
Without joins, nested subqueries are our only option. The count has to be taken over the Booking rows, so it goes into a correlated subquery:
select * from Pet_Owner o
where (select count(1) from Booking
where Pet_id in
(select Pet_id from Pet
where Pet.Owner_id = o.Owner_id and PType='DOG')) >= 2
order by Owner_id
If you are trying to count the number of booked dogs per owner instead:
select * from Pet_Owner where Owner_id in (
select Owner_id from Pet where Pet_id in
(select Pet_id from Booking) and PType='DOG'
group by owner_id
having count(1) >= 2)
order by Owner_id

Related

Select columns based on count of many-to-many association

I have a Postgres database with 3 tables that looks a little something like this:
table categories
id
type
table games
id
table game_category
id
game_id
category_id
I want to select all games which have more than x categories of a given type.
I have gotten this far:
SELECT * FROM games WHERE id IN (
SELECT game_id FROM game_category GROUP BY game_id HAVING COUNT(*) >= 5
)
This works to select all games with more than 5 categories, but doesn't narrow down the categories by their type. How could I expand on this to add the additional check for the type?
You have to join the categories table inside the subquery. Then you can add a WHERE clause for the type. Replace '?' with your actual type, of course.
SELECT * FROM games WHERE id IN (
SELECT game_id FROM game_category
INNER JOIN categories ON (categories.id=game_category.category_id)
WHERE categories.type='?'
GROUP BY game_id HAVING COUNT(*) >= 5
)
Considering query response time, you can avoid the IN clause if you only need the game ids. Mitchel's answer would work if written as follows:
SELECT game_id
FROM game_category gc
inner join categories c on c.id = gc.category_id
WHERE type = 'X'
GROUP BY game_id
HAVING COUNT(game_id) >= 5
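Notice that I used COUNT(game_id) instead of COUNT(*). Since game_id is never NULL inside a group here, both return the same number, so treat this as a matter of style rather than a reliable optimization.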
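The filtered count can be tried out with a small sketch like the one below. It uses an in-memory SQLite database in place of Postgres, and the sample rows, the function name games_with_min_categories and its parameters are assumptions for the illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE games (id INTEGER PRIMARY KEY);
CREATE TABLE game_category (id INTEGER PRIMARY KEY, game_id INTEGER, category_id INTEGER);
-- Hypothetical data: game 1 has two 'genre' categories,
-- game 2 has one 'genre' and one 'platform' category.
INSERT INTO categories VALUES (1, 'genre'), (2, 'genre'), (3, 'platform');
INSERT INTO games VALUES (1), (2);
INSERT INTO game_category (game_id, category_id) VALUES (1, 1), (1, 2), (2, 1), (2, 3);
""")


def games_with_min_categories(conn, category_type, minimum):
    """Return ids of games having at least `minimum` categories of `category_type`."""
    sql = """
        SELECT id FROM games WHERE id IN (
            SELECT gc.game_id
            FROM game_category gc
            INNER JOIN categories c ON c.id = gc.category_id
            WHERE c.type = ?           -- filter rows before grouping
            GROUP BY gc.game_id
            HAVING COUNT(*) >= ?       -- count only the filtered rows
        )
        ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (category_type, minimum))]


print(games_with_min_categories(conn, "genre", 2))
```

Passing the type and the threshold as bound parameters avoids pasting values into the SQL string.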

delete duplicates and retain MAX(id) mysql

I have a query that lists the duplicated rows in my table:
SELECT MAX(id) id
FROM el_student_class_relation
GROUP BY student_id, class_id
HAVING COUNT(*) > 1
Now I'm trying to keep the row with MAX(id) and delete the rest of the duplicates.
I tried this code:
DELETE us
FROM el_student_class_relation us
INNER JOIN(SELECT MAX(id) id
FROM el_student_class_relation
GROUP BY student_id, class_id HAVING COUNT(*) > 1) t ON t.id = us.id
But it deletes the MAX(id) row and keeps the other duplicates, which is the opposite of what I want.
Try this
DELETE FROM el_student_class_relation
WHERE id not in
(
SELECT * from
(SELECT MAX(id) id
FROM el_student_class_relation
GROUP BY student_id, class_id) temp_tbl
)
Please note:
Do not use HAVING COUNT(*) > 1 in the inner query.
With it, groups that contain only a single row are missing from the list of ids to keep, so rows that have no duplicate at all would be deleted too.
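The effect of that HAVING clause can be checked with a small sketch. It uses an in-memory SQLite database instead of MySQL, so the extra derived-table wrapper (a MySQL workaround) is not needed here. The sample rows and the helper names make_db and remaining_ids are assumptions for the illustration.

```python
import sqlite3


def make_db():
    """Build a fresh in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE el_student_class_relation (
        id INTEGER PRIMARY KEY, student_id INTEGER, class_id INTEGER);
    -- ids 1-3 are duplicates of (student 1, class 1); id 4 has no duplicate.
    INSERT INTO el_student_class_relation VALUES
        (1, 1, 1), (2, 1, 1), (3, 1, 1), (4, 2, 1);
    """)
    return conn


# Delete every row whose id is not the highest id of its (student, class) group.
KEEP_MAX = """
DELETE FROM el_student_class_relation
WHERE id NOT IN (
    SELECT MAX(id)
    FROM el_student_class_relation
    GROUP BY student_id, class_id
    {having}
)
"""


def remaining_ids(having=""):
    """Run the delete on a fresh table and return the ids that survive."""
    conn = make_db()
    conn.execute(KEEP_MAX.format(having=having))
    return [r[0] for r in conn.execute(
        "SELECT id FROM el_student_class_relation ORDER BY id")]


without_having = remaining_ids()
with_having = remaining_ids("HAVING COUNT(*) > 1")
print(without_having, with_having)
```

Without the HAVING clause the unique row with id 4 survives. With it, id 4 is missing from the keep-list and gets deleted along with the real duplicates.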
You might try the following query that deletes all elements for which another one with a higher ID (and same class and student) exists:
DELETE
FROM el_student_class_relation el1
WHERE EXISTS (SELECT el2.id
FROM el_student_class_relation el2
WHERE el1.student_id = el2.student_id
AND el1.class_id = el2.class_id
AND el2.id > el1.id);
The direct fix for your query is to use an "anti-join", where NOT joining is the important feature. This can be done with LEFT JOIN.
DELETE
us
FROM
el_student_class_relation us
LEFT JOIN
(
SELECT student_id, class_id, MAX(id) id
FROM el_student_class_relation
GROUP BY student_id, class_id
-- HAVING COUNT(*) > 1 [Don't do this, you need to return ALL the rows you want to keep]
)
gr
ON gr.id = us.id
WHERE
gr.id IS NULL -- WHERE there wasn't a match in the "good rows" table
EDIT: MariaDB and MySQL aren't the same thing. MariaDB DOES allow self joins on the table being deleted from.
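In MySQL (older versions), a subquery inside a DELETE works a little differently: you cannot select directly from the table you are deleting from, so you have to wrap the subquery in one more layer than would otherwise be required. A table alias on a single-table DELETE is also rejected by older versions, so it is left out here.
DELETE FROM el_student_class_relation
WHERE id not in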
(
select * from (
SELECT MAX(id) id
FROM el_student_class_relation
GROUP BY student_id, class_id
) t1
)

How to implement/improve/ make faster this query with joins

Please bear with me, I am not skilled in SQL.
I have three tables:
1) Notifications - stores all my data
2) GroupTable - has the names of groups and the related id
3) GroupUser - maps Uname and Udob to a group from GroupTable.
Before I fetch records from Notifications, I want to look up the GroupID in GroupTable, then look in GroupUser for all the records with this GroupID (names and DOBs, as these are unique). Once I have this data, I want to fetch the records from the Notifications table for those names and DOBs, in ascending order of date.
So far I have the following query. It works fine, but I am not satisfied with it and I think it can be improved:
SELECT
*
FROM
(SELECT
*
FROM Notifications
WHERE
DateToNotify < '2016-03-24' AND
NotificationDateFor IN
(SELECT gu.Name
FROM GroupUser AS gu
INNER JOIN GroupTable AS gt ON
gu.GroupID = gt._id AND
gt.GroupName = "Groupn"
) AND
DOB IN
(SELECT gu.DOB
FROM GroupUser AS gu
INNER JOIN GroupTable AS gt ON
gu.GroupID = gt._id AND
gt.GroupName = "Groupn"
)
) as T
ORDER BY
SUBSTR(DATE('NOW'), 0) > SUBSTR(DateToNotify, 0)
, SUBSTR(DateToNotify, 0)
I don't think you would get this any faster with joins instead of the IN clauses. Rewriting may not even change the execution plan, because the DBMS tries to access the data in the optimal way anyhow.
It seems a bit strange that you don't look for group users matching both name and DOB, but only ensure that there are group users matching the name and (possibly other) group users matching the DOB. But as you say that the query works fine as is, okay.
EDIT: Okay, according to your comment you actually want GroupUser matches on both name and DOB. So what you are looking for would be
AND (NotificationDateFor, DOB) IN (SELECT gu.Name, gu.DOB FROM ...)
But SQLite versions before 3.15.0 don't support this beautiful row-value syntax (Oracle is the only DBMS I know of that does).
So you either join or use EXISTS.
With JOIN:
select distinct n.*
from notifications n
join
(
select name, dob
from groupuser
where groupid = (select _id from grouptable where groupname = 'Groupn')
) as gu on n.notificationdatefor = gu.name and n.dob = gu.dob
where n.datetonotify < '2016-03-24'
order by date('now') > n.datetonotify, n.datetonotify;
With EXISTS:
select *
from notifications n
where datetonotify < '2016-03-24'
and exists
(
select *
from groupuser gu
where gu.groupid = (select _id from grouptable where groupname = 'Groupn')
and gu.name = n.notificationdatefor
and gu.dob = n.dob
)
order by date('now') > n.datetonotify, n.datetonotify;
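The difference between two independent IN clauses and a match on the (name, DOB) pair can be seen in a small sketch. The sample rows and the Notifications column types are assumptions for the illustration, and the ORDER BY is simplified to the primary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GroupTable (_id INTEGER PRIMARY KEY, GroupName TEXT);
CREATE TABLE GroupUser (GroupID INTEGER, Name TEXT, DOB TEXT);
CREATE TABLE Notifications (
    _id INTEGER PRIMARY KEY, NotificationDateFor TEXT, DOB TEXT, DateToNotify TEXT);
-- Hypothetical data. Notification 2 pairs Ann's name with Bob's DOB.
INSERT INTO GroupTable VALUES (1, 'Groupn'), (2, 'Other');
INSERT INTO GroupUser VALUES
    (1, 'Ann', '2000-01-01'), (1, 'Bob', '1990-05-05'), (2, 'Cy', '1980-08-08');
INSERT INTO Notifications VALUES
    (1, 'Ann', '2000-01-01', '2016-03-01'),
    (2, 'Ann', '1990-05-05', '2016-03-02'),
    (3, 'Bob', '1990-05-05', '2016-03-03'),
    (4, 'Cy',  '1980-08-08', '2016-03-04');
""")

# Two independent IN clauses: name and DOB may come from different group users.
two_in_ids = [r[0] for r in conn.execute("""
SELECT n._id FROM Notifications n
WHERE n.DateToNotify < '2016-03-24'
  AND n.NotificationDateFor IN (
      SELECT gu.Name FROM GroupUser gu
      JOIN GroupTable gt ON gu.GroupID = gt._id AND gt.GroupName = 'Groupn')
  AND n.DOB IN (
      SELECT gu.DOB FROM GroupUser gu
      JOIN GroupTable gt ON gu.GroupID = gt._id AND gt.GroupName = 'Groupn')
ORDER BY n._id
""")]

# EXISTS: name and DOB must match on the same GroupUser row.
exists_ids = [r[0] for r in conn.execute("""
SELECT n._id FROM Notifications n
WHERE n.DateToNotify < '2016-03-24'
  AND EXISTS (
      SELECT 1 FROM GroupUser gu
      WHERE gu.GroupID = (SELECT _id FROM GroupTable WHERE GroupName = 'Groupn')
        AND gu.Name = n.NotificationDateFor
        AND gu.DOB = n.DOB)
ORDER BY n._id
""")]

print(two_in_ids, exists_ids)
```

Only the EXISTS version rejects the mismatched notification 2, which is the behaviour asked for in the comment.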

SQL SUBQUERY ON SELF

I have a table with the following data:
licence_number
date_of_birth
organisation
I want to do a query where:
Get the licence_numbers and dobs in organisation1 where the same
licence numbers and dobs are in organisation2.
I know it can't be that hard, but I'm struggling.
You can group by licence_number and date_of_birth where organisation is set to either of the two interesting organisations, and count how many distinct organisations there are in a group.
If there are two out of two possible in a single group, you have a hit.
SELECT licence_number, date_of_birth
FROM mytable
WHERE organisation IN ('organisation1', 'organisation2')
GROUP BY licence_number, date_of_birth
HAVING COUNT(DISTINCT organisation) = 2;
...or you can use INTERSECT;
SELECT licence_number, date_of_birth
FROM mytable WHERE organisation = 'organisation1'
INTERSECT
SELECT licence_number, date_of_birth
FROM mytable WHERE organisation = 'organisation2'
An SQLfiddle to test both.
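In case the fiddle is unavailable, both variants can be compared with a short sketch against an in-memory SQLite database. The table name mytable follows the answer, and the sample rows are assumptions for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (licence_number TEXT, date_of_birth TEXT, organisation TEXT);
-- Hypothetical data: only L1 appears in both organisations
-- (and twice in organisation1, to show that duplicates do not matter).
INSERT INTO mytable VALUES
    ('L1', '1990-01-01', 'organisation1'),
    ('L1', '1990-01-01', 'organisation1'),
    ('L1', '1990-01-01', 'organisation2'),
    ('L2', '1985-02-02', 'organisation1'),
    ('L3', '1970-03-03', 'organisation2');
""")

# Variant 1: group and count the distinct organisations per (licence, dob).
grouped = conn.execute("""
SELECT licence_number, date_of_birth
FROM mytable
WHERE organisation IN ('organisation1', 'organisation2')
GROUP BY licence_number, date_of_birth
HAVING COUNT(DISTINCT organisation) = 2
""").fetchall()

# Variant 2: intersect the two per-organisation result sets.
intersected = conn.execute("""
SELECT licence_number, date_of_birth
FROM mytable WHERE organisation = 'organisation1'
INTERSECT
SELECT licence_number, date_of_birth
FROM mytable WHERE organisation = 'organisation2'
""").fetchall()

print(grouped, intersected)
```

COUNT(DISTINCT organisation) is what keeps the duplicated organisation1 row from producing a false hit in the first variant.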
select licence_number, date_of_birth
from t t0
where
organisation = 'organisation1'
and
exists (
select 1
from t
where
organisation = 'organisation2'
and
licence_number = t0.licence_number
and
date_of_birth = t0.date_of_birth
)
You can just self join the table where the licence number and the dates are the same but the organisation isn't:
SELECT DISTINCT p1.licence_number, p1.date_of_birth
FROM people p1
INNER JOIN people p2
ON p1.licence_number = p2.licence_number AND
p1.date_of_birth = p2.date_of_birth AND
p1.organisation <> p2.organisation
SQL Fiddle here
Is it a JOIN or 1 table?
Select
[licence_number],
[date_of_birth],
[organisation]
From YourTable
Where organisation1 = organisation2
--OR
Select
[licence_number],
[date_of_birth],
[organisation]
From YourTable
Where organisation1 IN ('organisation2','organisation3','organisation3')
Order By [licence_number]

What's wrong with this MySQL query? SELECT * AS `x`, how to use x again later?

The following MySQL query:
select `userID` as uID,
(select `siteID` from `users` where `userID` = uID) as `sID`,
from `actions`
where `sID` in (select `siteID` from `sites` where `foo` = "bar")
order by `timestamp` desc limit 100
…returns an error:
Unknown column 'sID' in 'IN/ALL/ANY subquery'
I don't understand what I'm doing wrong here. The sID thing is not supposed to be a column, but the 'alias' (what is this called?) I created by executing (select siteID from users where userID = uID) as sID. And it’s not even inside the IN subquery.
Any ideas?
Edit: @Roland: Thanks for your comment. I have three tables, actions, users and sites. The table actions contains a userID field, which corresponds to an entry in the users table. Every user in this table (users) has a siteID.
I'm trying to select the latest actions from the actions table, and link them to the users and sites table to find out who performed those actions, and on which site. Hope that makes sense :)
You either need to enclose it in a subquery (which must also return timestamp so that the outer query can sort by it):
SELECT *
FROM (
SELECT userID as uID, (select siteID from users where userID = actions.userID) as sID, timestamp
FROM actions
) q
WHERE sID IN (select siteID from sites where foo = "bar")
ORDER BY
timestamp DESC
LIMIT 100
or, better, rewrite it as a JOIN:
SELECT a.userId, u.siteID
FROM actions a
JOIN users u
ON u.userID = a.userID
WHERE siteID IN
(
SELECT siteID
FROM sites
WHERE foo = 'bar'
)
ORDER BY
timestamp DESC
LIMIT 100
Create the following indexes:
actions (timestamp)
users (userId)
sites (foo, siteID)
The column alias is not established until the query processor finishes the SELECT clause and builds the first intermediate result set, so it cannot be referenced in the WHERE clause. It can only be referenced by clauses that operate on that intermediate result set (such as GROUP BY). If you want to use it this way, put the alias inside a subquery; it will then be in the result set generated by the subquery, and therefore accessible to the outer query. To illustrate:
(This is not the simplest way to write this query, but it illustrates how to establish and use a column alias from a subquery.)
select a.userID as uID, z.sID
from actions a
join (select userID, siteID as sID from users) z
on z.userID = a.userID
where z.sID in (select siteID from sites where foo = "bar")
order by timestamp desc limit 100
Try the following:
SELECT
a.userID as uID
,u.siteID as sID
FROM
actions as a
INNER JOIN
users as u ON u.userID=a.userID
WHERE
u.siteID IN (SELECT siteID FROM sites WHERE foo = 'bar')
ORDER BY
a.timestamp DESC
LIMIT 100
I think the reason for the error is that the alias isn't available to the WHERE instruction, which is why we have HAVING.
select `userID` as uID,
(select `siteID` from `users` where `userID` = uID) as `sID`
from `actions`
HAVING `sID` in (select `siteID` from `sites` where `foo` = "bar")
order by `timestamp` desc limit 100
Though I also agree with the other answers that your query could be better structured.
Try the following
SELECT
a.userID as uID
,u.siteID as sID
FROM
actions as a
INNER JOIN
users as u ON u.userID = a.userID
INNER JOIN
sites as s ON u.siteID = s.siteID
WHERE
s.foo = 'bar'
ORDER BY
a.timestamp DESC
LIMIT 100
If you wish to use a field from the SELECT list later, you can try a subselect:
SELECT One,
Two,
One + Two as Three
FROM (
SELECT 1 AS One,
2 as Two
) sub
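Applied to the question's tables, the same pattern might look like the sketch below. It runs on an in-memory SQLite database with made-up rows, and the actionID column and the ts alias are assumptions. The sketch only shows the pattern of defining the alias in a derived table and filtering on it outside. It does not try to reproduce the MySQL error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sites (siteID INTEGER PRIMARY KEY, foo TEXT);
CREATE TABLE users (userID INTEGER PRIMARY KEY, siteID INTEGER);
CREATE TABLE actions (actionID INTEGER PRIMARY KEY, userID INTEGER, timestamp INTEGER);
-- Hypothetical data: user 1 belongs to the 'bar' site, user 2 does not.
INSERT INTO sites VALUES (1, 'bar'), (2, 'baz');
INSERT INTO users VALUES (1, 1), (2, 2);
INSERT INTO actions VALUES (1, 1, 100), (2, 2, 200), (3, 1, 300);
""")

rows = conn.execute("""
SELECT uID, sID, ts
FROM (
    -- The aliases uID, sID and ts are defined here ...
    SELECT a.userID AS uID, u.siteID AS sID, a.timestamp AS ts
    FROM actions a
    JOIN users u ON u.userID = a.userID
) sub
-- ... and are ordinary columns of the derived table out here.
WHERE sID IN (SELECT siteID FROM sites WHERE foo = 'bar')
ORDER BY ts DESC
LIMIT 100
""").fetchall()

print(rows)
```

Because sub is a finished result set by the time the outer WHERE runs, sID is a plain column there and no HAVING trick is needed.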
I don't know whether this was in the SQL standard 11 years ago, but I found the easiest way is to use HAVING (which has to come before ORDER BY and LIMIT):
select `userID` as uID,
(select `siteID` from `users` where `userID` = uID) as `sID`
from `actions`
HAVING `sID` in (select `siteID` from `sites` where `foo` = "bar")
order by `timestamp` desc limit 100