SQL - Querying for names that occur at least twice

I have a table "people" with an attribute "name", and I would like to query for the names that occur at least twice, with distinct results.
I was thinking of creating two aliases of the table and joining them on name, but I can't figure it out.
Here is what I tried:
SELECT DISTINCT name
FROM people AS S1
INNER JOIN people AS S2 USING name
WHERE S2.lastname <> S2.surname
The surname condition was meant to remove the matches that arise only because each row also joins to itself across the two copies of the table (I'm not even sure this is correct).
Either way, this already fails because the syntax is wrong.
I would appreciate some help! Thanks in advance.

Aggregation is a simple method if you want just the names:
select name
from people
group by name
having count(*) > 1;
If you want the original rows, use window functions:
select p.*
from (select p.*, count(*) over (partition by name) as cnt
from people p
) p
where cnt >= 2;
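
To try both queries from this answer, here is a minimal sketch that runs them against an in-memory SQLite database through Python's sqlite3 module. The lastname column, the sample rows, and the helper names are assumptions made for illustration; window functions need SQLite 3.25 or newer.

```python
import sqlite3

def load_people(rows):
    """Create an in-memory people(name, lastname) table holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, lastname TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    return conn

def duplicated_names(conn):
    """Distinct names that occur in at least two rows (GROUP BY / HAVING)."""
    query = """
        SELECT name
        FROM people
        GROUP BY name
        HAVING COUNT(*) >= 2
        ORDER BY name
    """
    return [row[0] for row in conn.execute(query)]

def rows_with_duplicated_names(conn):
    """Full rows whose name occurs at least twice (window function)."""
    query = """
        SELECT name, lastname
        FROM (SELECT p.*, COUNT(*) OVER (PARTITION BY name) AS cnt
              FROM people p) AS counted
        WHERE cnt >= 2
        ORDER BY name, lastname
    """
    return conn.execute(query).fetchall()

# Hypothetical sample data: Ann and Cy occur twice, Bob once.
sample = [("Ann", "Lee"), ("Ann", "Kim"), ("Bob", "Ray"), ("Cy", "Orr"), ("Cy", "Orr")]
conn = load_people(sample)
print(duplicated_names(conn))
print(rows_with_duplicated_names(conn))
```

GROUP BY already yields one row per name, so no DISTINCT is needed in the first query.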

Simple: use EXISTS() [you only need to select from the people table once, and you don't have to use DISTINCT]:
SELECT *
FROM people s1
WHERE EXISTS (SELECT *
FROM people s2
WHERE s2.name = s1.name
AND S2.lastname <> S1.lastname
);
BTW: assuming lastname <--> surname was a typo?
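
A small sketch of how the EXISTS variant behaves, again run through Python's sqlite3 module against an in-memory table. The lastname column and the sample rows are assumptions; the point is that the extra lastname <> condition only reports rows that share a name with a row having a different lastname.

```python
import sqlite3

def rows_sharing_name(rows):
    """Rows whose name also appears on another row with a different lastname."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (name TEXT, lastname TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", rows)
    query = """
        SELECT s1.name, s1.lastname
        FROM people s1
        WHERE EXISTS (SELECT 1
                      FROM people s2
                      WHERE s2.name = s1.name
                        AND s2.lastname <> s1.lastname)
        ORDER BY s1.name, s1.lastname
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample data: the two Cy rows are identical in both columns.
sample_people = [("Ann", "Lee"), ("Ann", "Kim"), ("Bob", "Ray"), ("Cy", "Orr"), ("Cy", "Orr")]
print(rows_sharing_name(sample_people))
```

With this data the two Ann rows are returned, but the two identical Cy rows are not, because nothing distinguishes them. If such rows should count as duplicates, comparing a unique key column instead of lastname would be needed (assuming the table has one).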

select p.name, count(*) as cnt
from people p
group by p.name
having count(*) >= 2;

Related

SQL: how to count entries in multiple tables by value?

I have two tables, question and field. I need to count the entries whose template_id value occurs in both tables (both tables contain this column).
Please advise how to do this.

select count(q.*)
from question q
left join field f on f.template_id = q.template_id
On Stack Overflow one should show one's own attempt, to show that some effort was made.
An inner join is probably what you meant. Try select q.*, f.* first to inspect the joined rows.

SELECT
COUNT(*) AS TotalRecords
FROM question q
INNER JOIN field f ON f.template_id = q.template_id

If you want the count of distinct template_id in the two tables, use JOIN and COUNT(DISTINCT):
select count(distinct q.template_id)
from question q join
field f
on f.template_id = q.template_id;
If you use count(*) you will get a count of matching rows, not template_ids, so duplicates will affect the result.
If template_id is known to be unique in one of the tables (say question), then exists is probably more efficient:
select count(*)
from question q
where exists (select 1
from field f
where f.template_id = q.template_id
);
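
The difference between the three counts can be checked with a short sketch that runs them through Python's sqlite3 module. The single-column tables and the sample ids are assumptions made for illustration.

```python
import sqlite3

def template_counts(question_ids, field_ids):
    """Return (matching join rows, distinct matching ids, questions with a field)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE question (template_id INTEGER)")
    conn.execute("CREATE TABLE field (template_id INTEGER)")
    conn.executemany("INSERT INTO question VALUES (?)", [(i,) for i in question_ids])
    conn.executemany("INSERT INTO field VALUES (?)", [(i,) for i in field_ids])

    join = "FROM question q JOIN field f ON f.template_id = q.template_id"
    # COUNT(*) counts every matching pair of rows, so duplicates inflate it.
    matching_rows = conn.execute(f"SELECT COUNT(*) {join}").fetchone()[0]
    # COUNT(DISTINCT ...) counts each shared template_id once.
    distinct_ids = conn.execute(
        f"SELECT COUNT(DISTINCT q.template_id) {join}"
    ).fetchone()[0]
    # EXISTS counts question rows that have at least one matching field row.
    questions_with_field = conn.execute("""
        SELECT COUNT(*)
        FROM question q
        WHERE EXISTS (SELECT 1 FROM field f WHERE f.template_id = q.template_id)
    """).fetchone()[0]
    conn.close()
    return matching_rows, distinct_ids, questions_with_field

# Hypothetical data: template 1 appears twice in field, 3 and 4 have no partner.
print(template_counts([1, 2, 3], [1, 1, 2, 4]))
```

Here the join produces three matching rows while only two template ids are shared, which is the duplicate effect described above.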

Get Limited Records For Sub Query Each Result

I am trying to query some data from SQL Server 2012 using a subquery. I want the first 3 records for each Id returned by the subquery, but I can't see how to do it. For now I have written this query:
Select * from Student Where TeacherId in (Select TeacherId from Teacher)
I am not sure whether this is achievable with such a query, or whether I have to write a function or something else.
Any suggestions would be great, and sorry for my poor explanation.

You should join the Teacher table to the Student table, and then use an analytic function to get the first three records for each teacher:
SELECT *
FROM
(
SELECT
s.*, t.TeacherName,
ROW_NUMBER() OVER (PARTITION BY t.TeacherId ORDER BY some_col) rn
FROM Teacher t
INNER JOIN Student s
ON t.TeacherId = s.TeacherId
) t
WHERE rn <= 3;
I assume that there is a column in one of the two tables some_col which you want to use for ordering. It does not make much sense to speak of the first three records without also defining some ordering.
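
A minimal sketch of the top-three-per-teacher idea, run through Python's sqlite3 module rather than SQL Server (ROW_NUMBER works the same way in both; SQLite needs 3.25 or newer). The StudentId column used for ordering, the table layout, and the sample rows are assumptions; the query keeps the rows whose row number within each teacher is at most 3.

```python
import sqlite3

def first_three_students_per_teacher(teachers, students):
    """Return (TeacherId, StudentId) for the first 3 students of each teacher."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Teacher (TeacherId INTEGER, TeacherName TEXT)")
    conn.execute("CREATE TABLE Student (StudentId INTEGER, TeacherId INTEGER)")
    conn.executemany("INSERT INTO Teacher VALUES (?, ?)", teachers)
    conn.executemany("INSERT INTO Student VALUES (?, ?)", students)
    query = """
        SELECT TeacherId, StudentId
        FROM (SELECT s.TeacherId, s.StudentId,
                     ROW_NUMBER() OVER (PARTITION BY t.TeacherId
                                        ORDER BY s.StudentId) AS rn
              FROM Teacher t
              INNER JOIN Student s ON t.TeacherId = s.TeacherId) AS ranked
        WHERE rn <= 3
        ORDER BY TeacherId, StudentId
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical data: teacher 1 has five students, teacher 2 has two.
teachers = [(1, "Smith"), (2, "Jones")]
students = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 2), (7, 2)]
print(first_three_students_per_teacher(teachers, students))
```

A teacher with fewer than three students simply contributes all of them.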

I guess you want the top 3 rows for each Id, not the top 3 rows of the entire result set:
WITH CTE AS
(
SELECT s.*,
ROW_NUMBER() OVER (PARTITION BY t.TeacherId ORDER BY ?) Seq
FROM Student s
INNER JOIN Teacher t ON t.TeacherId = s.TeacherId
)
SELECT * FROM CTE
WHERE Seq between 1 AND 3
The ? placeholder stands for the column name that defines the ordering used to number the rows; the outer query then keeps row numbers 1 to 3.

SQL: Find all rows in a table when the rows are a foreign key in another table

The caveat here is I must complete this with only the following tools:
1. The basic SQL construct: SELECT ... FROM ... AS ... WHERE ... (DISTINCT is OK).
2. Set operators: UNION, INTERSECT, EXCEPT.
3. Creating temporary relations: CREATE VIEW ... AS ...
4. Comparison operators like <, >, <=, == etc.
5. A subquery can be used only in the context of NOT IN or a subtraction operation, i.e. (select ... from ... where ... not in (select ...)).
I can NOT use any join, limit, max, min, count, sum, having, group by, not exists, exists, aggregate functions, or anything else not listed in 1-5 above.
Schema:
People (id, name, age, address)
Courses (cid, name, department)
Grades (pid, cid, grade)
I wrote a query that works, but it uses not exists (which I can't use). The SQL below returns only the people who took every class in the Courses table:
select People.name from People
where not exists
(select Courses.cid from Courses
where not exists
(select grades.cid from grades
where grades.cid = courses.cid and grades.pid = people.id))
Is there a way to solve this using not in or some other method that I am allowed to use? I've struggled with this for hours. If anyone can help with this goofy obstacle, I'll gladly upvote and accept your answer.

As Nick.McDermaid said you can use except to identify students that are missing classes and not in to exclude them.
1. Get the complete list with a Cartesian product of people x courses. This is what grades would look like if every student had taken every course.
create view complete_view as
select people.id as pid, courses.cid as cid
from people, courses
2. Use except to identify students that are missing at least one class.
create view missing_view as select distinct pid from (
select pid, cid from complete_view
except
select pid, cid from grades
) t
3. Use not in to select students that aren't missing any classes.
select * from people where id not in (select pid from missing_view)
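
The three steps can be tried end to end with a short sketch using Python's sqlite3 module. The sample rows are assumptions, and missing_view is written here without the derived table, as a plain EXCEPT between two selects, which is one way to stay closer to the stated restrictions.

```python
import sqlite3

def people_with_all_courses(people, courses, grades):
    """Names of people who have a grade in every course (EXCEPT + NOT IN)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER, name TEXT)")
    conn.execute("CREATE TABLE courses (cid INTEGER, name TEXT)")
    conn.execute("CREATE TABLE grades (pid INTEGER, cid INTEGER, grade TEXT)")
    conn.executemany("INSERT INTO people VALUES (?, ?)", people)
    conn.executemany("INSERT INTO courses VALUES (?, ?)", courses)
    conn.executemany("INSERT INTO grades VALUES (?, ?, ?)", grades)

    # Step 1: every (person, course) pair that would exist if all took all.
    conn.execute("""
        CREATE VIEW complete_view AS
        SELECT people.id AS pid, courses.cid AS cid
        FROM people, courses
    """)
    # Step 2: pairs that are absent from grades, i.e. missing classes.
    conn.execute("""
        CREATE VIEW missing_view AS
        SELECT pid, cid FROM complete_view
        EXCEPT
        SELECT pid, cid FROM grades
    """)
    # Step 3: keep the people who have no missing pair.
    query = """
        SELECT name FROM people
        WHERE id NOT IN (SELECT pid FROM missing_view)
        ORDER BY name
    """
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result

# Hypothetical data: Bob has no grade for course 20.
people = [(1, "Ann"), (2, "Bob"), (3, "Cy")]
courses = [(10, "Algebra"), (20, "Biology")]
grades = [(1, 10, "A"), (1, 20, "B"), (2, 10, "C"), (3, 10, "B"), (3, 20, "A")]
print(people_with_all_courses(people, courses, grades))
```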

As Nick suggests, you can use EXCEPT in this case. Here is the sample:
select People.name from People
EXCEPT
select p.name from People AS p
join Grades AS g on g.pid = p.id
join Courses as c on c.cid = g.cid

You can turn the first not exists into not in by selecting a constant value:
select *
from People a
where 1 not in (
select 1
from courses b
...

Finding pairs of repeating entries

Could you please help me with one SQL query?
Table : Students
Id | Name | Date of birth
1 Will 1991-02-10
2 James 1981-01-20
3 Sam 1991-02-10
I need to find pairs of students who have the same date of birth. However, we are not allowed to use GROUP BY, so simply grouping and counting records is not a solution.
I have been trying to do it with JOIN, however with no success.
Your help is greatly appreciated!

You can use a self join on the table, joining on the date_of_birth column:
select s1.name,
s2.name
from students s1
join students s2
on s1.date_of_birth = s2.date_of_birth
and s1.name < s2.name;
As wildplasser and dasblinkenlight pointed out, the < operator (or >) is better than <> because with <> in the join condition the combination Will/Sam is reported twice.
Another way of removing those duplicates is to use a distinct query:
select distinct greatest(s1.name, s2.name), least(s1.name, s2.name)
from students s1
join students s2
on s1.date_of_birth = s2.date_of_birth
and s1.name <> s2.name;
(although eliminating the duplicates in the join condition is almost certainly more efficient)
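
The effect of < versus <> in the join condition can be seen with a small sketch run through Python's sqlite3 module, using the three students from the question. The helper name and the column name date_of_birth are assumptions.

```python
import sqlite3

def birthday_pairs(students, comparison):
    """Pairs of names sharing a date of birth; comparison is '<' or '<>'."""
    if comparison not in ("<", "<>"):
        raise ValueError("comparison must be '<' or '<>'")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (id INTEGER, name TEXT, date_of_birth TEXT)")
    conn.executemany("INSERT INTO students VALUES (?, ?, ?)", students)
    # The operator is checked against a fixed whitelist above before being
    # placed into the SQL text.
    query = f"""
        SELECT s1.name, s2.name
        FROM students s1
        JOIN students s2
          ON s1.date_of_birth = s2.date_of_birth
         AND s1.name {comparison} s2.name
        ORDER BY s1.name, s2.name
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

students = [(1, "Will", "1991-02-10"), (2, "James", "1981-01-20"), (3, "Sam", "1991-02-10")]
print(birthday_pairs(students, "<"))   # each pair once
print(birthday_pairs(students, "<>"))  # each pair in both orders
```

Comparing on name assumes names are unique; if two students could share both a name and a birth date, comparing s1.id < s2.id instead would still report them.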

select st.name, stu.name
from students st, students stu
where st.date_of_birth = stu.date_of_birth and st.name <> stu.name;

This query reports all students who have a non-unique birthdate.
SELECT *
FROM students st
WHERE EXISTS (
SELECT *
FROM students ex
WHERE ex.dob = st.dob
AND ex.name <> st.name
)
ORDER BY dob
;

SQL Comparing COUNT values within same table

I'm trying to solve a seemingly simple problem, but I think I'm tripping over my understanding of how the EXISTS keyword works. The problem (a simplified version of the actual one): I have a table of students and a table of hobbies. The students table has the student ID and name. I want to return only the students who have the same number of hobbies as at least one other student (i.e. students with a unique number of hobbies should not be shown).
The difficulty I run into is working out how to compare the hobby counts. What I have tried is this:
SELECT sa.studentnum, COUNT(ha.hobbynum)
FROM student sa, hobby ha
WHERE sa.studentnum = ha.studentnum
AND EXISTS (SELECT *
FROM student sb, hobby hb
WHERE sb.studentnum = hb.studentnum
AND sa.studentnum != sb.studentnum
HAVING COUNT(ha.hobbynum) = COUNT(hb.hobbynum)
)
GROUP BY sa.studentnum
ORDER BY sa.studentnum;
What appears to be happening is that the count of hobbynums is identical in each test, so the whole original table is returned instead of only the students that share a hobby count with someone else.

Not tested, but maybe something like this (if I understand the problem correctly):
WITH h AS (
SELECT DISTINCT studentnum, COUNT(hobbynum) OVER (PARTITION BY studentnum) AS student_hobby_ct
FROM hobby)
SELECT DISTINCT h1.studentnum, h1.student_hobby_ct
FROM h h1 JOIN h h2 ON h1.student_hobby_ct = h2.student_hobby_ct AND
h1.studentnum <> h2.studentnum;

I think your query would only return students who have at least one other student with the same number of hobbies, but you're not returning anything about the students with whom they match. Is that intentional? I'd treat both queries as sub-queries and aggregate before a join on the counts. You could do several things... here it returns the number of students that have matching hobby counts, but you could limit with HAVING COUNT(distinct sb.studentnum) = 0 to get the result your query seemed to return...
with xx as
(SELECT sa.studentnum, count(ha.hobbynum) hobbycount
FROM student sa inner join hobby ha
on sa.studentnum = ha.studentnum
group by sa.studentnum
)
select sa.studentnum, sa.hobbycount, count(distinct sb.studentnum) as matchcount
from
xx sa inner join xx sb on
sa.hobbycount = sb.hobbycount
where
sa.studentnum != sb.studentnum
GROUP by sa.studentnum, sa.hobbycount
ORDER BY sa.studentnum;
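
As a variation on the answers above, here is a sketch that first aggregates the hobby count per student and then applies EXISTS to the aggregated counts rather than to the raw rows. This is a swapped-in formulation, not the query from either answer; it is run through Python's sqlite3 module, and the sample rows are assumptions.

```python
import sqlite3

def students_sharing_hobby_count(hobby_rows):
    """(studentnum, hobbycount) for students whose count matches another student's."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE hobby (studentnum INTEGER, hobbynum INTEGER)")
    conn.executemany("INSERT INTO hobby VALUES (?, ?)", hobby_rows)
    query = """
        WITH counts AS (
            SELECT studentnum, COUNT(hobbynum) AS hobbycount
            FROM hobby
            GROUP BY studentnum
        )
        SELECT a.studentnum, a.hobbycount
        FROM counts a
        WHERE EXISTS (SELECT 1
                      FROM counts b
                      WHERE b.hobbycount = a.hobbycount
                        AND b.studentnum <> a.studentnum)
        ORDER BY a.studentnum
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical data: students 1 and 2 have two hobbies each,
# student 3 has one and student 4 has three.
hobby_rows = [(1, 1), (1, 2), (2, 1), (2, 3), (3, 1), (4, 1), (4, 2), (4, 3)]
print(students_sharing_hobby_count(hobby_rows))
```

Because the counts are computed once per student before the comparison, each side of the EXISTS test sees a single, fixed count. Students with no rows in hobby do not appear at all in this sketch.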

We Keep Coding
