Oracle SQL query with GROUP BY clause - sql

MY_TABLE is a table with two columns, Number and City.
Desired output: each city with the count of unique Number values associated with it. Seattle and Bellevue should be reported together as "Combined". Even though four rows belong to Seattle and Bellevue, the output is 3 because there are only three distinct numbers: 123, 456, and 786.
MY_TABLE
Number City
123 Seattle
456 Bellevue
789 LosAngeles
780 LosAngeles
123 Bellevue
786 Bellevue
Desired Output:
Combined 3
LosAngeles 2
Query so far:
SELECT NUMBER, CITY FROM MY_TABLE WHERE LOOKUP_ID=100 AND CITY IN
('Seattle', 'Bellevue', 'LosAngeles')
GROUP BY NUMBER, CITY
I would highly appreciate any recommendations on how to do this.

You could do something like
SELECT (case when city IN ('Seattle', 'Bellevue')
then 'Combined'
else city
end) city,
count( distinct number )
FROM my_table
WHERE lookup_id = 100
AND city IN ('Seattle', 'Bellevue', 'LosAngeles')
GROUP BY (case when city IN ('Seattle', 'Bellevue')
then 'Combined'
else city
end)
Of course, my guess is that you have some other table that tells you which CITY values need to be combined rather than having a hard-coded CASE statement.
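If such a mapping table does exist, the hard-coded CASE can be replaced with a join. The following is a minimal sketch using Python's standard-library sqlite3 module rather than Oracle. The city_group table and its columns are hypothetical, the Number column is renamed num because NUMBER is a reserved word in Oracle, and the lookup_id filter is left out because the sample data does not show that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (num INTEGER, city TEXT);
    INSERT INTO my_table VALUES
        (123, 'Seattle'), (456, 'Bellevue'), (789, 'LosAngeles'),
        (780, 'LosAngeles'), (123, 'Bellevue'), (786, 'Bellevue');

    -- Hypothetical lookup table: which cities roll up into which group.
    CREATE TABLE city_group (city TEXT PRIMARY KEY, group_name TEXT);
    INSERT INTO city_group VALUES
        ('Seattle', 'Combined'), ('Bellevue', 'Combined');
""")

# Cities without a mapping row keep their own name via COALESCE.
rows = conn.execute("""
    SELECT COALESCE(g.group_name, t.city) AS city,
           COUNT(DISTINCT t.num)          AS distinct_numbers
    FROM my_table t
    LEFT JOIN city_group g ON g.city = t.city
    GROUP BY COALESCE(g.group_name, t.city)
    ORDER BY city
""").fetchall()

print(rows)  # [('Combined', 3), ('LosAngeles', 2)]
```

Adding another combined city then only needs a new row in the lookup table, not a change to the query.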

with t as (
SELECT (
case when city IN ('Seattle', 'Bellevue')
then 'Combined'
else city
end
) city, number from my_table
)
select city, count(distinct number) from t
group by city
Please let me know if this was useful.

Try this:
SELECT
(CASE CITY
WHEN 'Seattle' THEN 'Combined'
WHEN 'Bellevue' THEN 'Combined'
ELSE CITY
END) AS CITY, COUNT(DISTINCT NUMBER)
FROM
MY_TABLE
WHERE
LOOKUP_ID=100 AND CITY IN ('Seattle', 'Bellevue', 'LosAngeles')
GROUP BY
(CASE CITY
WHEN 'Seattle' THEN 'Combined'
WHEN 'Bellevue' THEN 'Combined'
ELSE CITY
END)
That should do what you asked for, but I suspect that you have some other tables that define which cities should be considered the same. In that case you'll need to join on those tables.

There are three answers already, and none of them generalizes to more cities.
Try this:
SELECT City, COUNT(Number) AS ExclusiveNumbers
FROM (SELECT q2.City, q2.CityNumCount, b.Number
FROM MY_Table b INNER JOIN
(SELECT c.City, MAX(NumOccurs) AS CityNumCount
FROM My_Table c INNER JOIN
(SELECT Number, COUNT(City) AS NumOccurs
FROM My_Table
GROUP BY Number) q1 ON c.Number = q1.Number
GROUP BY c.City) q2 ON b.City = q2.City) q3
WHERE CityNumCount = 1
GROUP BY City
UNION
SELECT 'Combined', COUNT(DISTINCT Number)
FROM (SELECT q2.City, q2.CityNumCount, b.Number
FROM MY_Table b INNER JOIN
(SELECT c.City, MAX(NumOccurs) AS CityNumCount
FROM My_Table c INNER JOIN
(SELECT Number, COUNT(City) AS NumOccurs
FROM My_Table
GROUP BY Number) q1 ON c.Number = q1.Number
GROUP BY c.City) q2 ON b.City = q2.City) q3
WHERE CityNumCount > 1
The top half of the union works out, for each City name that has no numbers in common with any other city, how many different numbers it has.
The bottom half works out the count of different numbers for cities that do have numbers in common with other cities. These 2 figures will always add up to the count of distinct numbers in the original table.
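As a quick check, the query above can be run against the sample data from the question. This sketch uses Python's sqlite3 module rather than Oracle, renames Number to num, and factors the repeated q3 subquery into a CTE; those changes are for brevity and are not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (num INTEGER, city TEXT);
    INSERT INTO my_table VALUES
        (123, 'Seattle'), (456, 'Bellevue'), (789, 'LosAngeles'),
        (780, 'LosAngeles'), (123, 'Bellevue'), (786, 'Bellevue');
""")

rows = conn.execute("""
    WITH q3 AS (
        -- Each row tagged with the largest number of cities that any
        -- of its city's numbers appears in.
        SELECT b.city, b.num, q2.city_num_count
        FROM my_table b
        JOIN (SELECT c.city, MAX(q1.num_occurs) AS city_num_count
              FROM my_table c
              JOIN (SELECT num, COUNT(city) AS num_occurs
                    FROM my_table
                    GROUP BY num) q1 ON c.num = q1.num
              GROUP BY c.city) q2 ON b.city = q2.city
    )
    -- Cities that share no number with another city.
    SELECT city, COUNT(num) FROM q3
    WHERE city_num_count = 1
    GROUP BY city
    UNION
    -- Cities that share at least one number with another city.
    SELECT 'Combined', COUNT(DISTINCT num) FROM q3
    WHERE city_num_count > 1
""").fetchall()

result = sorted(rows)
print(result)  # [('Combined', 3), ('LosAngeles', 2)]
```

Note that every city sharing a number with any other city ends up in a single "Combined" bucket, so this approach cannot distinguish two separate groups of overlapping cities.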

Related

How to get values of one column without the aggregate column?

I have this table:
first_name last_name age country
John Doe 31 USA
Robert Luna 22 USA
David Robinson 22 UK
John Reinhardt 25 UK
Betty Doe 28 UAE
How can I get only the names of the oldest per country?
When I do this query
SELECT first_name,last_name, MAX(age)
FROM Customers
GROUP BY country
I get this result:
first_name last_name MAX(age)
Betty Doe 31
John Reinhardt 22
John Doe 31
But I want to get only first name and last name without the aggregate function.
If window functions are an option, you can use ROW_NUMBER for this task.
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY country ORDER BY age DESC) AS rn
FROM Customers
)
SELECT first_name, last_name, age, country
FROM cte
WHERE rn = 1
It sounds like you want to get the oldest age per country first,
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
With that, you want to match that back to the original table (aka a join) to see which names they match up to.
So, something like this perhaps:
SELECT Customers.*
FROM Customers
INNER JOIN
(
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
) AS max_per_country_query
ON Customers.Country = max_per_country_query.Country
AND Customers.Age = max_per_country_query.MAX_AGE_IN_COUNTRY
If your database supports it, I prefer using the CTE style of handling these subqueries because it's easier to read and debug.
WITH cte_max_per_country AS (
SELECT Country, MAX(age) AS MAX_AGE_IN_COUNTRY
FROM Customers
GROUP BY Country
)
SELECT C.*
FROM Customers C
INNER JOIN cte_max_per_country
ON C.Country = cte_max_per_country.Country
AND C.Age = cte_max_per_country.MAX_AGE_IN_COUNTRY
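To compare the window-function and join approaches on the sample data, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The table and column names follow the question; the choice of SQLite is an assumption, since the question does not name a database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (first_name TEXT, last_name TEXT,
                            age INTEGER, country TEXT);
    INSERT INTO Customers VALUES
        ('John',   'Doe',       31, 'USA'),
        ('Robert', 'Luna',      22, 'USA'),
        ('David',  'Robinson',  22, 'UK'),
        ('John',   'Reinhardt', 25, 'UK'),
        ('Betty',  'Doe',       28, 'UAE');
""")

# Approach 1: number the rows within each country, oldest first.
by_window = conn.execute("""
    WITH cte AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY country
                                  ORDER BY age DESC) AS rn
        FROM Customers
    )
    SELECT first_name, last_name FROM cte
    WHERE rn = 1
    ORDER BY country
""").fetchall()

# Approach 2: find the maximum age per country, then join back.
by_join = conn.execute("""
    SELECT c.first_name, c.last_name
    FROM Customers c
    JOIN (SELECT country, MAX(age) AS max_age
          FROM Customers
          GROUP BY country) m
      ON c.country = m.country AND c.age = m.max_age
    ORDER BY c.country
""").fetchall()

print(by_window)  # [('Betty', 'Doe'), ('John', 'Reinhardt'), ('John', 'Doe')]
print(by_join)    # same rows for this data
```

The two agree here because no country has a tie for the oldest age. With ties, ROW_NUMBER keeps one arbitrary row per country, while the join returns every tied row.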

SQL query how to divide two selects that return multiple columns

I have the following table :
city
state
numOrder
date
deadlineDate
NY
NY
111
2022/11/05
2022/11/06
LA
CA
222
2022/11/01
2022/10/01
SD
CA
333
2022/05/05
2022/11/06
LA
CA
444
2022/11/01
2022/05/01
I need to calculate the number of orders placed before the deadline divided by the number of orders placed by each state and city:
(SELECT state, city ,count(*)
FROM orders
WHERE date <= deadlineDate
group by state, city) /
(SELECT state, city ,count(*)
FROM orders
group by state, city)
I tried:
SELECT (
SELECT state, city ,count(*)
FROM orders
WHERE serviceDate <= limitDate
group by state, city
)/
(
SELECT state, city ,count(*)
FROM orders
group by state, city
)
FROM orders
But I got this error:
Subquery must return only one column
Try the following:
SELECT state, city,
COUNT(*) FILTER (WHERE date <= deadlineDate)*1.0 / COUNT(*) AS result
FROM orders
GROUP BY state, city
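On databases without the FILTER clause, the same ratio can be computed with a CASE expression inside SUM. The sketch below uses Python's sqlite3 module and the sample rows from the question; the column names order_date and deadline_date are renamed stand-ins for date and deadlineDate, and the dates are stored as text in YYYY/MM/DD form, which compares correctly as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (city TEXT, state TEXT, numOrder INTEGER,
                         order_date TEXT, deadline_date TEXT);
    INSERT INTO orders VALUES
        ('NY', 'NY', 111, '2022/11/05', '2022/11/06'),
        ('LA', 'CA', 222, '2022/11/01', '2022/10/01'),
        ('SD', 'CA', 333, '2022/05/05', '2022/11/06'),
        ('LA', 'CA', 444, '2022/11/01', '2022/05/01');
""")

# One pass over the table: count on-time orders and all orders per group.
# Multiplying by 1.0 avoids integer division.
rows = conn.execute("""
    SELECT state, city,
           SUM(CASE WHEN order_date <= deadline_date THEN 1 ELSE 0 END)
               * 1.0 / COUNT(*) AS on_time_ratio
    FROM orders
    GROUP BY state, city
    ORDER BY state, city
""").fetchall()

print(rows)  # [('CA', 'LA', 0.0), ('CA', 'SD', 1.0), ('NY', 'NY', 1.0)]
```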
Join the two queries as subquery tables and do the math in the SELECT:
SELECT A.KEYCOL1, A.COL1/B.COL1 AS MY_RATIO_COL
FROM
(SELECT KEYCOL1, COL1 FROM MY_TABLE WHERE [BLA BLA BLA]) A
JOIN
(SELECT KEYCOL1, COL1 FROM MY_TABLE WHERE [yata yata]) B
ON A.KEYCOL1 = B.KEYCOL1

SQL Server: Duplicates but based on specific criteria

I am trying to find duplicates based on forename, surname, and dateofbirth in my database. Below is what I have
Customers table:
custid cust_refno forename surname dateofbirth
1 10 David John 10-02-1980
2 20 Peter Broad 15-08-1978
3 30 Sarah Holly 16-09-1982
4 40 Mathew Mark 25-08-2001
5 50 Matt Mark 25-08-2001
Address table:
addid cust_refno addresstype line1
1 10 address No. 10, Mineview Road
2 10 address No. 20, Mineview Lane
3 20 address Rockview cottage, blackthorn
4 30 mobile 0504135864
5 40 address No. 64, New Lane
6 40 mobile 0504896532
7 50 address No. 11, John's cottage
Some customers have multiple addresses, so they are not duplicates. I am trying to find a way to avoid displaying those as duplicates. Can you advise how I can do that?
my query:
SELECT DISTINCT t.FORENAME, t.SURNAME, t.CUST_REFNO, t.DATE_OF_BIRTH , a.LINE1 FROM CUSTOMERS AS t
LEFT OUTER JOIN dbo.ADDRESS a
ON t.CUST_REFNO = a.CUST_REFNO
INNER JOIN (
SELECT FORENAME, SURNAME, DATE_OF_BIRTH
FROM CUSTOMERS GROUP BY FORENAME, SURNAME, DATE_OF_BIRTH
HAVING COUNT(*) > 1) AS td
ON t.FORENAME = td.FORENAME AND t.DATE_OF_BIRTH = td.DATE_OF_BIRTH
AND t.SURNAME = td.SURNAME
WHERE a.addresstype = 'address'
my result is:
Forename surname cust_refno dateofbirth line1
David John 10 10-02-1980 No. 10, Mineview Road
David John 10 10-02-1980 No. 20, Mineview Lane
But in reality it is not a duplicate; it's just that the addresses are different. Is there a way to compare the cust_refno so that, even if the address is different, a customer with the same cust_refno is not shown again?
If you want to find customers with duplicate addresses, you can count how many times each customer has the same address and return only those with more than one:
SELECT t.FORENAME, t.SURNAME, t.CUST_REFNO, t.DATE_OF_BIRTH , a.LINE1
FROM CUSTOMERS AS t INNER JOIN ADDRESS a ON t.CUST_REFNO = a.CUST_REFNO
GROUP BY t.FORENAME, t.SURNAME, t.CUST_REFNO, t.DATE_OF_BIRTH , a.LINE1
HAVING COUNT(a.LINE1) > 1
You can use window functions to filter out customers with more than one address. Then aggregation can be used to return the duplicates:
select forename, surname, dateofbirth
from customers c join
(select a.*,
count(*) over (partition by cust_refno) as cnt
from addresses a
where addresstype = 'address'
) a
on c.cust_refno = a.cust_refno
where cnt = 1
group by forename, surname, dateofbirth
having count(*) > 1;
If you want the full customer record, just use window functions twice:
select c.*
from (select c.*,
count(*) over (partition by forename, surname, dateofbirth) as cnt
from customers c
) c join
(select a.*,
count(*) over (partition by cust_refno) as cnt
from addresses a
where addresstype = 'address'
) a
on c.cust_refno = a.cust_refno
where a.cnt = 1 and c.cnt > 1;
You can use the analytical function count and row_number as follows:
select * from
(SELECT t.FORENAME, t.SURNAME, t.CUST_REFNO, t.DATE_OF_BIRTH ,
a.LINE1,
row_number() over (partition by t.FORENAME, t.SURNAME, t.DATE_OF_BIRTH
order by t.CUST_REFNO) as rn,
count(1) over (partition by t.FORENAME, t.SURNAME, t.DATE_OF_BIRTH) as cnt
FROM CUSTOMERS AS t
LEFT OUTER JOIN dbo.ADDRESS a ON t.CUST_REFNO = a.CUST_REFNO
WHERE a.addresstype = 'address') t
where cnt > 1 and rn = 1
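Another way to keep multiple addresses from inflating the count is to count distinct cust_refno values per name and date of birth instead of counting joined rows. This is a different technique from the answers above, sketched here with Python's sqlite3 module and the sample data from the question (dates kept as text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid INTEGER, cust_refno INTEGER,
                            forename TEXT, surname TEXT, dateofbirth TEXT);
    INSERT INTO customers VALUES
        (1, 10, 'David',  'John',  '10-02-1980'),
        (2, 20, 'Peter',  'Broad', '15-08-1978'),
        (3, 30, 'Sarah',  'Holly', '16-09-1982'),
        (4, 40, 'Mathew', 'Mark',  '25-08-2001'),
        (5, 50, 'Matt',   'Mark',  '25-08-2001');

    CREATE TABLE address (addid INTEGER, cust_refno INTEGER,
                          addresstype TEXT, line1 TEXT);
    INSERT INTO address VALUES
        (1, 10, 'address', 'No. 10, Mineview Road'),
        (2, 10, 'address', 'No. 20, Mineview Lane'),
        (3, 20, 'address', 'Rockview cottage, blackthorn'),
        (4, 30, 'mobile',  '0504135864'),
        (5, 40, 'address', 'No. 64, New Lane'),
        (6, 40, 'mobile',  '0504896532'),
        (7, 50, 'address', 'No. 11, John''s cottage');
""")

base = """
    SELECT c.forename, c.surname, c.dateofbirth
    FROM customers c
    JOIN address a ON a.cust_refno = c.cust_refno
    WHERE a.addresstype = 'address'
    GROUP BY c.forename, c.surname, c.dateofbirth
    HAVING {condition}
"""

# Counting joined rows flags David John, who merely has two addresses.
naive = conn.execute(base.format(condition="COUNT(*) > 1")).fetchall()

# Counting distinct customer references ignores extra address rows.
distinct = conn.execute(
    base.format(condition="COUNT(DISTINCT c.cust_refno) > 1")
).fetchall()

print(naive)     # [('David', 'John', '10-02-1980')]
print(distinct)  # []
```

With the sample data there are no true duplicates, so the second query returns nothing; Mathew Mark and Matt Mark differ in forename and are not grouped together.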

SQL Server : finding duplicates based on first few characters on column

I want to find duplicates based on the first three characters of the surname. Is there a way to do that in SQL? I can compare the whole name, but how do we compare only the first few characters?
Below are my tables
custid forename surname dateofbirth
----------------------------------------
1 David John 16-09-1985
2 David Jon 16-09-1985
3 Sarah Smith 10-08-2015
4 Peter Proca 11-06-2011
5 Peter Proka 11-06-2011
This is my query that I am currently running to compare
SELECT
y.id, y.forename, y.surname
FROM
customers y
INNER JOIN
(SELECT
forename, surname, COUNT(*) AS CountOf
FROM customers
GROUP BY forename, surname
HAVING COUNT(*) > 1) dt ON y.forename = dt.forename
You can use left():
select c.*
from (select c.*, count(*) over (partition by left(surname, 3)) as cnt
from customers c
) c
where cnt > 1
order by surname;
You can include the forename as well in the partition by if you mean forename and first three letters of surname.
You can use exists as follows:
select t.* from t
Where exists
(select 1 from t tt
Where left(t.surname, 3) = left(tt.surname, 3) and t.custid <> tt.custid
)
order by t.surname;
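The EXISTS approach can be tried on the sample rows with Python's sqlite3 module. SQLite has no LEFT() function, so this sketch substitutes substr(surname, 1, 3), which returns the same first three characters; everything else follows the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (custid INTEGER, forename TEXT,
                            surname TEXT, dateofbirth TEXT);
    INSERT INTO customers VALUES
        (1, 'David', 'John',  '16-09-1985'),
        (2, 'David', 'Jon',   '16-09-1985'),
        (3, 'Sarah', 'Smith', '10-08-2015'),
        (4, 'Peter', 'Proca', '11-06-2011'),
        (5, 'Peter', 'Proka', '11-06-2011');
""")

# A row is a duplicate if some other row shares its surname prefix.
rows = conn.execute("""
    SELECT t.custid, t.surname
    FROM customers t
    WHERE EXISTS (
        SELECT 1 FROM customers tt
        WHERE substr(t.surname, 1, 3) = substr(tt.surname, 1, 3)
          AND t.custid <> tt.custid
    )
    ORDER BY t.custid
""").fetchall()

print(rows)  # [(4, 'Proca'), (5, 'Proka')]
```

Note that 'John' and 'Jon' are not matched: their first three characters are 'Joh' and 'Jon'. Matching those would need a shorter prefix or a fuzzier comparison.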

SQL find repeated values

I need to identify rows where a certain value is repeated. Here is a sample table:
COUNTRY CITY
Italy Milan
England London
USA New York
Canada London
USA Atlanta
The query should return...
COUNTRY CITY
England London
Canada London
...because London is repeated. Thank you in advance for your help.
The easiest way is to use a subquery that counts the number of times each city appears (and filter to those values that appear more than once):
SELECT * FROM Cities
WHERE City in
(
SELECT City FROM Cities
GROUP BY City
HAVING COUNT(*) > 1
)
If your DBMS supports windowed aggregates, you can use:
SELECT COUNTRY,
CITY
FROM (SELECT COUNTRY,
CITY,
COUNT(*) OVER (PARTITION BY CITY) AS Cnt
FROM Cities) T
WHERE Cnt > 1
select country, city
from aTable
where city in
(
select city
from aTable
group by city
HAVING count(1) > 1
)
Try it here: http://sqlfiddle.com/#!3/e9b1a/1
Or if the same city & country combo appears twice and you're only interested where the countries are different:
select distinct country, city
from aTable
where city in
(
select city
from aTable
group by city
HAVING count(distinct country) > 1
)
Try it here: http://sqlfiddle.com/#!3/2dfaa/2
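The difference between the two HAVING conditions only shows up when the same country and city pair appears more than once. The sketch below uses Python's sqlite3 module and adds one hypothetical duplicate row, a second ('USA', 'Atlanta'), to the sample data to make that visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aTable (country TEXT, city TEXT);
    INSERT INTO aTable VALUES
        ('Italy',   'Milan'),
        ('England', 'London'),
        ('USA',     'New York'),
        ('Canada',  'London'),
        ('USA',     'Atlanta'),
        ('USA',     'Atlanta');  -- hypothetical duplicate, not in the question
""")

query = """
    SELECT DISTINCT country, city
    FROM aTable
    WHERE city IN (SELECT city FROM aTable
                   GROUP BY city
                   HAVING {condition})
    ORDER BY country, city
"""

# Any repeated city name, including exact duplicate rows.
any_repeat = conn.execute(query.format(condition="COUNT(*) > 1")).fetchall()

# Only city names that occur in more than one country.
cross_country = conn.execute(
    query.format(condition="COUNT(DISTINCT country) > 1")
).fetchall()

print(any_repeat)
# [('Canada', 'London'), ('England', 'London'), ('USA', 'Atlanta')]
print(cross_country)
# [('Canada', 'London'), ('England', 'London')]
```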
This one works. Got it from my wife (she finally had time to look into this). Thought you might be interested.
SELECT * FROM Cities
WHERE City in ( select city
from (SELECT City,
count(distinct country)
FROM Cities
GROUP BY City
HAVING count(distinct country) > 1) a )