SQL: Join with complete participation - sql

Suppose that I have a table Person(Name, Hobby) and there are 3 hobbies in total. The table's values are like
Amy | Stamp Collection
Kevin | Mountain Biking
Kevin | Stamp Collection
Ron | Mountain Biking
Here, Kevin has both the hobbies Mountain Biking and Stamp Collection. I need to write a query to retrieve Kevin.
How can I get the person who has all the hobbies?
Thanks

SELECT Name
FROM Person
GROUP BY Name
HAVING COUNT(*) = (SELECT COUNT(DISTINCT Hobby) FROM Person)
Runnable example

note : not tested, and its correct in Oracle sql
you can try this :
SELECT *
FROM
(
SELECT p.name,
count(distinct p.hobby) cnt
FROM Person p
GROUP BY p.name
) p2
WHERE p2.cnt = (SELECT count(distinct Hobby)
FROM Person)

If you need to get the person's name and count use this:
SELECT count(*) as count, Name FROM Person group by Name limit 1
This will get you the person's name and the amount of hobbies they have. To get all hobbies add the field to the query and remove the grouping:
SELECT Name, Hobby FROM Person where Name = {name} limit 1
(without the curly braces)

Related

How to return all names that appear multiple times in table [duplicate]

This question already has answers here:
What's the SQL query to list all rows that have 2 column sub-rows as duplicates?
(10 answers)
Closed last year.
Suppose I have the following schema:
student(name, siblings)
The related table has names and siblings. Note the number of rows of the same name will appear the same number of times as the number of siblings an individual has. For instance, a table could be as follows:
Jack, Lucy
Jack, Tim
Meaning that Jack has Lucy and Tim as his siblings.
I want to identify an SQL query that reports the names of all students who have 2 or more siblings. My attempt is the following:
select name
from student
where count(name) >= 1;
I'm not sure I'm using count correctly in this SQL query. Can someone please help with identifying the correct SQL query for this?
You're almost there:
select name
from student
group by name
having count(*) > 1;
HAVING is a where clause that runs after grouping is done. In it you can use things that a grouping would make available (like counts and aggregations). By grouping on the name and counting (filtering for >1, if you want two or more, not >=1 because that would include 1) you get the names you want..
This will just deliver "Jack" as a single result (in the example data from the question). If you then want all the detail, like who Jack's siblings are, you can join your grouped, filtered list of names back to the table:
select *
from
student
INNER JOIN
(
select name
from student
group by name
having count(*) > 1
) morethanone ON morethanone.name = student.name
You can't avoid doing this "joining back" because the grouping has thrown the detail away in order to create the group. The only way to get the detail back is to take the name list the group gave you and use it to filter the original detail data again
Full disclosure; it's a bit of a lie to say "can't avoid doing this": SQL Server supports something called a window function, which will effectively perform a grouping in the background and join it back to the detail. Such a query would look like:
select student.*, count(*) over(partition by name) n
from student
And for a table like this:
jack, lucy
jack, tim
jane, bill
jane, fred
jane, tom
john, dave
It would produce:
jack, lucy, 2
jack, tim, 2
jane, bill, 3
jane, fred, 3
jane, tom, 3
john, dave, 1
The rows with jack would have 2 on because there are two jack rows. There are 3 janes, there is 1 john. You could then wrap all that in a subquery and filter for n > 1 which would remove john
select *
from
(
select student.*, count(*) over(partition by name) n
from student
) x
where x.n > 1
If SQL Server didn't have window functions, it would look more like:
select *
from
student
INNER JOIN
(
select name, count(*) as n
from student
group by name
) x ON x.name = student.name
The COUNT(*) OVER(PARTITION BY name) is like a mini "group by name and return the count, then auto join back to the main detail using the name as key" i.e. a short form of the latter query
You can do:
select name
from student as s1
where exists (
select s2
from student as s2
where s1.name = s2.name and s1.siblings != s2.siblings
)
I think the best approach is what 'Caius Jard' mentioned. However, additional way if you want to get how many siblings each name has .
SELECT name, COUNT(*) AS Occurrences
FROM student
GROUP BY name
HAVING (COUNT(*) > 1)
I wanted to share another solution I came up with:
select s1.name
from student s1, student s2
where s1.name = s2.name and s1.sibling != s2.sibling;

How to find people in a database who live in the same cities?

I'm new to SQL, and I'm asking for help in an apparently easy question, but it gets cumbersome in my mind.
I have the following table:
ID NAME CITY
---------------------
1 John new york
2 Sam new york
3 Tom boston
4 Bob boston
5 Jan chicago
6 Ted san francisco
7 Kat boston
I want a query that returns all the people who live in a city that another person registered in the database also lives in.
The answer, for the table I showed above, would be:
ID NAME CITY
---------------------
1 John new york
2 Sam new york
3 Tom boston
4 Bob boston
7 Kat boston
This is really a two part question:
What cities have more than one user located in them?
What users live in that subset of cities?
Let's answer it in two parts. Let's also make the simplifying assumption (not stated in your question) that the Users table has only one entry per user per city.
To find cities with more than one user:
SELECT City FROM Users GROUP BY City HAVING COUNT(*) > 1
Now, let's find all the users for those cities:
SELECT ID, User, City FROM Users
WHERE City IN (SELECT City FROM Users GROUP BY CITY HAVING COUNT(*) > 1)
I would use EXISTS :
SELECT t.*
FROM table t
WHERE EXISTS (SELECT 1 FROM table t1 WHERE t1.city = t.city AND t1.name <> t.name);
To avoid a correlated subquery which leads to a nested loop, you could perform a self join:
SELECT id, name, city
FROM persons
JOIN (SELECT city
FROM persons
GROUP BY city HAVING count(*) > 1) AS cities
USING (city);
This might be the most performant solution.
This will give you the rows that have the same city more than 1 time:
SELECT persons.*
FROM persons
WHERE (SELECT COUNT(*) FROM persons AS p GROUP BY CITY HAVING p.CITY = persons.CITY) > 1
This is just a different flavor from the others that have posted.
SELECT ID,
name,
city
FROM (SELECT DISTINCT
ID,
name,
city,
COUNT(1) OVER (PARTITION BY city) AS cityCount
FROM table) t
WHERE cityCount > 1
This can be expressed many ways. Here is one possible way:
select * from persons p
where exists (
select 1 from persons p2
where p2.city = p.city and p2.name <> p.name
)

sql select tuples and group by id

I have the current database schema
EMPLOYEES
ID | NAME | JOB
JOBS
ID | JOBNAME | PRICE
I want to query so that it goes through each employee, and gets all their jobs, but I want each employee ID to be grouped so that it returns the employee ID followed by all the jobs they have. e.g if employee with ID 1 had jobs with ID, JOBNAME (1, Roofing), (1,Brick laying)
I want it to return something like
1 Roofing Bricklaying
I was trying
SELECT ID,JOBNAME FROM JOBS WHERE ID IN (SELECT ID FROM EMPLOYEES) GROUP BY ID;
but get the error
not a GROUP BY expression
Hope this is clear enough, if not please say and I'll try to explain better
EDIT:
WITH ALL_JOBS AS
(
SELECT ID,LISTAGG(JOBNAME || ' ') WITHIN GROUP (ORDER BY ID) JOBNAMES FROM JOBS GROUP BY ID
)
SELECT ID,JOBNAMES FROM ALL_JOBS A,EMPLOYEES B
WHERE A.ID = B.ID
GROUP BY ID,JOBNAMES;
In the with clause, I am grouping by on ID and concatenating the columns corresponding to an ID(also concatenating with ' ' to distinguish the columns).
For example, if we have
ID NAME
1 Roofing
1 Brick laying
2 Michael
2 Schumacher
we will get the result set as
ID NAME
1 Roofing Brick laying
2 Michael Schumacher
Then, I am join this result set with the EMPLOYEES table on ID.
You need to put JobName to group by expression too.
SELECT ID,JOBNAME FROM JOBS WHERE ID IN (SELECT ID FROM EMPLOYEES) GROUP BY ID,JOBNAME;

SQL Server - Possible Pivot Solution?

I have a simple enough issue that has been surprisingly difficult to locate online. Perhaps I am searching on improper keywords so I wanted to stop in and ask you guys because your site has been a blessing with my studies. See below scenario:
Select student, count(*) as Total, (the unknown variable: book1, book2, book3, book4, ect...) from mystudies.
Essentially all I would like to do is list out all books for a unique student id that matches the Total count. Could someone point me in the right direction, a good read or anything, so I can get a step going in the correct direction? I am assuming it would be done via a left join (not sure how to do the x1, x2, x3 part) and then just link the two by the unique student id number (no duplicates) but everyone online points to pivot but pivot appears to put all the rows into columns instead of one single column. SQL server 2005 is the platform of choice.
Thanks!
Sorry
The following query produces my unique id (the student) and the student's count for all duplicate entries in the table:
select student, count(*) as Total
from mystudies
group by student order by total desc
the part I don't know is how to create the left join on the table unique id (boookid)
select mystudies1.student, mystudies1.total, mystudies2.bookid
from ( select student, count(*) as Total
from mystudies
group by student
) mystudies1
left join
( select student, bookid
from mystudies
) mystudies2
on mystudies1.student=mystudies2.student
order by mystudies1.total desc, mystudies1.student asc
Obviously the above row will produce results similar to the following:
Student Total BookID
000001 3 100001
000001 3 100002
000001 3 100003
000002 2 200001
000002 2 200002
000003 1 300001
But what I actually want is something similar to the following:
Student Total BookID
000001 3 100001, 100002, 100003
000002 2 200001, 200002
000003 1 300001
I assumed it had to be done in a left join so that it didn't alter the actual count being performed on the student. thanks!
In SQL-Server use the FOR XML Path Method:
SELECT Student,
Total,
STUFF(( SELECT ', ' + BookID
FROM MyStudies books
WHERE Books.Student = MyStudies.Student
FOR XML PATH(''), TYPE
).value('.', 'VARCHAR(MAX)'), 1, 2, '') AS Books
FROM ( SELECT Student, COUNT(*) AS Total
FROM myStudies
GROUP BY Student
) MyStudies
I have previously given a full explanation of how the XML PATH Method works here. With a further improvement to my answer pointed out here
SQL Server Fiddle
In MySQL AND SQLite you can use the GROUP_CONCAT function:
SELECT Student,
COUNT(*) AS Total,
GROUP_CONCAT(BookID) AS Books
FROM myStudies
GROUP BY Student
MySQL Fiddle
SQLite Fiddle
In Postgresql you can use the ARRAY_AGG Function:
SELECT Student,
COUNT(*) AS Total,
ARRAY_AGG(BookID) AS Books
FROM myStudies
GROUP BY Student
Postgresql Fiddle
In oracle you can use the LISTAGG Function
SELECT Student,
COUNT(*) AS Total,
LISTAGG(BookID, ', ') WITHIN GROUP (ORDER BY BookID) AS Books
FROM myStudies
GROUP BY Student
Oracle SQL Fiddle

write a query to identify discrepancy

I have a table with Student ID's and Student Names. There has been issues with assigning unique Student Id's to students and Hence I want to find the duplicates
Here is the sample Table:
Student ID Student Name
1 Jack
1 John
1 Bill
2 Amanda
2 Molly
3 Ron
4 Matt
5 James
6 Kathy
6 Will
Here I want a third column "Duplicate_Count" to display count of duplicate records.
For e.g. "Duplicate_Count" would display "3" for Student ID = 1 and so on. How can I do this?
Thanks in advance
Select StudentId, Count(*) DupCount
From Table
Group By StudentId
Having Count(*) > 1
Order By Count(*) desc,
Select
aa.StudentId, aa.StudentName, bb.DupCount
from
Table as aa
join
(
Select StudentId, Count(*) as DupCount from Table group by StudentId
) as bb
on aa.StudentId = bb.StudentId
The virtual table gives the count for each StudentId, this is joined back to the original table to add the count to each student record.
If you want to add a column to the table to hold dupcount, this query can be used in an update statement to update that column in the table
This should work:
update mytable
set duplicate_count = (select count(*) from mytable t where t.id = mytable.id)
UPDATE:
As mentioned by #HansUp, adding a new column with the duplicate count probably doesn't make sense, but that really depends on what the OP originally thought of using it for. I'm leaving the answer in case it is of help for someone else.