SQL Multiple Duplicate Row Detection - sql

I'm trying to determine a correct way to isolate rows within a table that have the same values in 2 columns.
There are two tables, one (Name) with the person's names and IDs, and the other one (Nation) with people's IDs and their nations. I join the two tables with inner join, and now the new table columns consist of an ID, first name, last name, and nation. If I want to find pairs of people who have the same last name and are from the same nation, why isn't
select ID, FName, LName, Nation
from (Name inner join Nation on Name.ID = Nation.ID)
group by Name, Nation
having count(Name) > 1 and count(Nation) > 1
working?
I'm aiming for the result to be a table with columns:
ID | First | Last | Nation
where the last names and nations will be identical pairs while first names will be different.
I feel like the GROUP BY part isn't appropriate, but is there an alternative? Thanks for any help.

If you are using MS SQL Server:
select
*
from
(
select
Name.*,
Nation.Nation,
cnt = count(*) over(partition by LName, Nation)
from Name
join Nation on Nation.ID = Name.ID
) t
where cnt > 1

Try this:
SELECT * FROM (
SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
FROM Name
INNER JOIN Nation ON (Name.ID = Nation.ID)
) a
INNER JOIN (
SELECT Name.ID, Name.FName, Name.LName, Nation.Nation
FROM Name
INNER JOIN Nation ON (Name.ID = Nation.ID)
) b ON (a.LName = b.LName AND a.Nation = b.Nation)
WHERE a.ID < b.ID
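To see what the self-join above returns, here is a minimal self-contained sketch. The sample rows are invented for illustration, and the CTE named people is a hypothetical stand-in for the result of joining Name and Nation; it assumes an engine that supports common table expressions.

```sql
-- Hypothetical sample data: "people" stands in for Name joined to Nation.
WITH people(ID, FName, LName, Nation) AS (
    SELECT 1, 'Anna', 'Smith', 'UK'
    UNION ALL SELECT 2, 'Bob',  'Smith', 'UK'
    UNION ALL SELECT 3, 'Carl', 'Smith', 'US'
    UNION ALL SELECT 4, 'Dora', 'Jones', 'US'
)
SELECT a.ID AS ID1, a.FName AS First1,
       b.ID AS ID2, b.FName AS First2,
       a.LName, a.Nation
FROM people a
JOIN people b
  ON a.LName = b.LName
 AND a.Nation = b.Nation
WHERE a.ID < b.ID;  -- keeps each pair once and drops self-matches
-- Only rows 1 and 2 share both last name and nation, so one pair comes back.
```

The GROUP BY attempt in the question fails on most engines because ID and FName appear in the select list without being grouped or aggregated, and because Name is a table there, not a column.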

As Simon Righarts hinted, something's not right with the design.
Scenario 1)
If a name can have multiple nations, you would have 3 tables implementing an n:m relationship.
CREATE TABLE name (name_id int, name text, ...);
CREATE TABLE nation (nation_id int, nation text, ...);
CREATE TABLE nationality (name_id int references name(name_id)
,nation_id int references nation(nation_id)
... );
Query for the scenario:
SELECT a.name_id, a.fname, a.lname, n.nation
FROM name a
JOIN nationality na USING (name_id)
JOIN nation n USING (nation_id)
JOIN (
SELECT a.lname, na.nation_id
FROM name a
JOIN nationality na USING (name_id)
GROUP BY 1,2
HAVING count(*) > 1) x USING (lname, nation_id)
Scenario 2)
If a name can only have one nation, there would be a column nation_id in the table name:
CREATE TABLE name (name_id int
,name text
,nation_id int references nation(nation_id), ...);
CREATE TABLE nation (nation_id int, nation text, ...);
Query for this scenario:
SELECT a.name_id, a.fname, a.lname, n.nation
FROM name a
JOIN nation n USING (nation_id)
JOIN (
SELECT a.lname, a.nation_id
FROM name a
GROUP BY 1,2
HAVING count(*) > 1) x USING (lname, nation_id);
All multiple occurrences are included here, not just "pairs" - assuming you meant that.
Your actual description doesn't fit either scenario.

Related

SQL find all items having a match for all items in another table

I have two tables:
Persons:
id,first_name,colorid
1,Mona,1
2,Davita,1
3,Mona,3
4,Davita,3
5,Marya,3
6,Mona,2
7,Whitby,3
8,Hardy,1
9,Hardy,2
10,Haskel,3
and colors table:
id,color
1,Green
2,Black
3,Red
I want to find first_names who have all the colors in the color table.
my attempt is:
SELECT DISTINCT P.first_name AS NAMES
FROM Persons P
JOIN Colors C ON C.id= P.colorid;
Is this correct?
select first_name
from Persons
group by first_name
having count(distinct colorid) = (select count(*) from Colors)
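The join with DISTINCT in the question returns every name that has at least one matching color, not only the names that have all of them. As a rough check of the counting approach, the sketch below runs it against the rows listed in the question, supplied through CTEs so the statement is self-contained (this assumes an engine with CTE support).

```sql
-- Sample rows copied from the question.
WITH Persons(id, first_name, colorid) AS (
    SELECT 1, 'Mona', 1
    UNION ALL SELECT 2, 'Davita', 1
    UNION ALL SELECT 3, 'Mona', 3
    UNION ALL SELECT 4, 'Davita', 3
    UNION ALL SELECT 5, 'Marya', 3
    UNION ALL SELECT 6, 'Mona', 2
    UNION ALL SELECT 7, 'Whitby', 3
    UNION ALL SELECT 8, 'Hardy', 1
    UNION ALL SELECT 9, 'Hardy', 2
    UNION ALL SELECT 10, 'Haskel', 3
),
Colors(id, color) AS (
    SELECT 1, 'Green'
    UNION ALL SELECT 2, 'Black'
    UNION ALL SELECT 3, 'Red'
)
SELECT P.first_name
FROM Persons P
GROUP BY P.first_name
-- A name qualifies when its distinct colors match the size of the Colors table.
HAVING COUNT(DISTINCT P.colorid) = (SELECT COUNT(*) FROM Colors);
-- Mona is the only name with rows for colors 1, 2 and 3.
```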

Display the names of each employees who works in both ‘IT’ and ‘SE’

Emp(sid(pk) : integer, sname: varchar(255))
Dep(sid(fk) : integer, dep : varchar(255))
SQL: How do I find the names of the employees who work in both ‘IT’ and ‘SE’?
To join two tables and return the rows that share a value in a common column (for example, id), use INNER JOIN.
The INNER JOIN keyword selects records that have matching values in both tables.
Solution
SELECT Emp.sid, Emp.sname FROM Emp
INNER JOIN
(SELECT sid FROM Dep WHERE dep='IT'
INTERSECT
SELECT sid FROM Dep WHERE dep='SE') as A
ON Emp.sid = A.sid
As I understand it, the data looks like this: emp holds sid and sname; dep holds sid, a foreign key to emp, and dep, which here is not a table but a column containing the department's abbreviation. The combination of sid and dep is unique in the dep table.
If that is the case, join the two tables on sid and filter with dep IN ('IT', 'SE'). Then put the two columns from emp into the select list, GROUP BY them, and apply the grouping filter HAVING COUNT(*) = 2 to keep only the employees that have a row for each of the two departments.
WITH
emp(sid, sname) AS (
SELECT 42,'Arthur'
UNION ALL SELECT 43,'Ford'
UNION ALL SELECT 44,'Zaphod'
)
,
dep(sid, dep) AS (
SELECT 42,'IT'
UNION ALL SELECT 42,'SE'
UNION ALL SELECT 42,'AC'
UNION ALL SELECT 43,'IT'
UNION ALL SELECT 43,'AC'
UNION ALL SELECT 44,'SE'
UNION ALL SELECT 44,'SA'
)
SELECT
emp.sid
, emp.sname
FROM emp JOIN dep USING(sid)
WHERE dep.dep IN ('IT','SE')
GROUP BY
emp.sid
, emp.sname
HAVING COUNT(*) = 2;
-- out sid|sname
-- out 42|Arthur

SQL derived table. Easy SQL

Hi, I get the error 'Every derived table must have its own alias'. What can I do about it?
My question is:
Show team ID and names of teams from Germany who have never played in the UEFA Champions League.
UEFA Champions league = 1
Cid = Competition ID
Tid = Team ID
SELECT teams.TID, teams.name from(
SELECT Tid1 FROM(
(SELECT tid1,cid FROM matches
WHERE tid1 IN (SELECT tid FROM teams WHERE country='Germany')
UNION
SELECT tid2,cid FROM matches
WHERE tid2 IN (SELECT tid FROM teams WHERE country='Germany'))
)WHERE cid <> (SELECT cid FROM competitions WHERE cid='1'))
INNER JOIN matches ON tid1=team.tid;
I have tried looking at other derived-table solutions but I can't get them to work with mine...
It looks like it's asking you to give an alias to each derived table you have created within the query. Try this.
UEFA Champions league = 1
Cid = Competition ID
Tid = Team ID
SELECT teams.TID, teams.name from(
SELECT Tid1 FROM(
(SELECT tid1,cid FROM matches
WHERE tid1 IN (SELECT tid FROM teams WHERE country='Germany')
UNION
SELECT tid2,cid FROM matches
WHERE tid2 IN (SELECT tid FROM teams WHERE country='Germany'))ALIAS1
)WHERE cid <> (SELECT cid FROM competitions WHERE cid='1')) ALIAS2
INNER JOIN teams ON ALIAS2.tid1 = teams.tid;
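Note that a cid <> 1 filter keeps teams that played at least one match outside competition 1, which is not the same as never having played in it. One possible way to express "never played" is NOT EXISTS, which also avoids derived tables and therefore the alias error. The sketch below is only an illustration: the column names follow the question, the sample rows are invented, and the CTEs that supply them assume MySQL 8.0 or later (the final SELECT on its own does not need CTE support).

```sql
-- Hypothetical sample data; column names follow the question.
WITH teams(tid, name, country) AS (
    SELECT 1, 'Team A', 'Germany'
    UNION ALL SELECT 2, 'Team B', 'Germany'
    UNION ALL SELECT 3, 'Team C', 'Netherlands'
),
matches(tid1, tid2, cid) AS (
    SELECT 1, 3, 1             -- Team A plays in competition 1
    UNION ALL SELECT 2, 1, 2   -- Team B only appears in competition 2
)
SELECT t.tid, t.name
FROM teams t
WHERE t.country = 'Germany'
  AND NOT EXISTS (             -- no match in competition 1 on either side
        SELECT 1
        FROM matches m
        WHERE m.cid = 1
          AND (m.tid1 = t.tid OR m.tid2 = t.tid)
      );
-- With these rows only team 2 (Team B) qualifies.
```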

sql query to select matching rows for all or nothing criteria

I have a table of cars where each car belongs to a company. In another table I have a list of company locations by city.
I want to select all cars from the cars table whose company has locations in all cities passed into the stored procedure, and otherwise exclude those cars altogether, even if the company falls short by just one city.
So, I've tried something like:
select id, cartype from cars where companyid in
(
select id from locations where cityid in
(
select id from cities
)
)
This doesn't work as it obviously satisfies the condition if ANY of the cities are in the list, not all of them.
It sounds like a group by count, but can't make it work with what I tried.
I'm using MS SQL Server 2005.
One example:
select id, cartype from cars c
where ( select count(1) from cities where id in (...))
= ( select count(distinct cityid)
from locations
where c.companyid = locations.id and cityid in (...) )
Maybe try counting all the cities, and then select the car if the company has the same number of distinct location cities as there are total cities.
SELECT id, cartype FROM cars
WHERE
--Subquery to find the number of distinct cities where the car's company has a location
(SELECT count(distinct cities.id) FROM cities
INNER JOIN locations on locations.cityid = cities.id
WHERE locations.companyId = cars.companyId)
=
--Subquery to find the total number of cities
(SELECT count(distinct cities.id) FROM cities)
I haven't tested this, and it may not be the most efficient query, but I think this might work.
Try this
SELECT e.*
FROM cars e
WHERE NOT EXISTS (
SELECT 1
FROM Cities p
WHERE p.location = e.Location
)
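Another way to phrase the all-or-nothing condition is a double NOT EXISTS: keep a car when there is no requested city that its company lacks. The sketch below is a guess at the schema, not the asker's actual tables: cars(id, cartype, companyid), locations(companyid, cityid), and a hypothetical wanted(cityid) set standing in for the cities passed to the stored procedure. The sample rows are invented.

```sql
-- Hypothetical sample data; table and column names are assumptions.
WITH cars(id, cartype, companyid) AS (
    SELECT 1, 'sedan', 10
    UNION ALL SELECT 2, 'van', 20
),
locations(companyid, cityid) AS (
    SELECT 10, 100
    UNION ALL SELECT 10, 200
    UNION ALL SELECT 20, 100
),
wanted(cityid) AS (            -- cities passed into the procedure
    SELECT 100
    UNION ALL SELECT 200
)
SELECT c.id, c.cartype
FROM cars c
WHERE NOT EXISTS (             -- there is no wanted city ...
        SELECT 1
        FROM wanted w
        WHERE NOT EXISTS (     -- ... that the car's company lacks
                SELECT 1
                FROM locations l
                WHERE l.companyid = c.companyid
                  AND l.cityid = w.cityid
              )
      );
-- Company 10 covers both cities and company 20 misses city 200,
-- so only car 1 is returned.
```

Unlike the count comparison, this form does not depend on DISTINCT to cope with a company that has several locations in the same city.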

SQL query for finding row with same column values that was created most recently

If I have three columns in my MySQL table people, say id, name, and created, where name is a string and created is a timestamp, what's the appropriate query for the following scenario? I have 10 rows, and each row has a name. Each row has a unique id, but names can repeat, so you can have three Bobs, two Marys, one Jack and four Phils.
There is also a hobbies table with the columns id, hobby, person_id.
Basically I want a query that will do the following:
Return all of the people with zero hobbies, but only check by the latest distinct person created, if that makes sense. Meaning if there is a Bob person that was created yesterday, and one created today.. I only want to know if the Bob created today has zero hobbies. The one from yesterday is no longer relevant.
select pp.id
from people pp, (select name, max(created) as created from people group by name) p
where pp.name = p.name
and pp.created = p.created
and pp.id not in ( select person_id from hobbies )
SELECT latest_person.* FROM (
SELECT p1.* FROM people p1
WHERE NOT EXISTS (
SELECT * FROM people p2
WHERE p1.name = p2.name AND p1.created < p2.created
)
) AS latest_person
LEFT OUTER JOIN hobbies h ON h.person_id = latest_person.id
WHERE h.id IS NULL;
Try This:
Select *
From people p
Where created =
(Select Max(created)
From people
Where name = p.Name
And not exists
(Select * From hobbies
Where person_id = p.id))