Just another SQL case (GROUP BY) - sql

I'm stuck on an SQL problem that I don't know how to solve.
Let's say I have a table like this (concerning estimations on house prices):
estimationID | estimationDate | userID | cityID
1 | '2020-01-01' | 123456 | 987654
2 | '2020-12-01' | 135790 | 975310
...
With estimationDate being the date when the estimation was made, userID the ID of the user who made the estimation and cityID the ID of the city where the estimation was made.
I need to get the maximum number of estimations made by one user (I don't care which one, I don't need an ID) for each city.
Something like
SELECT cityID,*maximum number of estimations made by one user from this city* FROM estimationsTable GROUP BY cityID
Any idea?

Step by step:
Get the number of estimations per user and city.
Get the maximum of these numbers per city.
The query:
select cityid, max(cnt)
from
(
select cityid, userid, count(*) as cnt
from estimationstable
group by cityid, userid
) counted
group by cityid
order by cityid;

try like below
with cte as (
select userid,cityid,count(*) as cnt
from table_name group by userid,cityid
)
, cte2 as (
select *,
row_number() over(partition by cityid order by cnt desc) rn
from cte
) select * from cte2 where rn=1

sol 1:
SELECT id, MAX(maximum_number_of_estimations)
FROM (SELECT id,COUNT(*) AS maximum_number_of_estimations
FROM TABLE x)group by id as final_query
sol2:
use order by Count DESC with group by`
something like this should work
the idea is you count all the occurrences in the inner query with the group by on your id and another query to get the max of it OR you use ORDER BY [Field] DESC
with GROUP BY which will automatically put the highest ones on the top

In BigQuery, I think you can do this without a subquery:
select distinct cityid,
(array_agg(userid order by count(*) desc, userid))[ordinal(1)] as userid,
max(count(*)) over (order by count(*) desc) as cnt
from estimationstable
group by cityid, userid

Related

Find first n sums of a column, grouped by two other columns

I have a table in SQL like this. Now I want to find sum of score grouped by column ID & Name, and show just two highest sums for each ID as below, so how can I solve this?
Can you try something like this:
Select ID, Name, Score From (
Select ID, Name, SUM(Score) score, row_number() over (partition by ID,Name order by SUM(Score) desc) rn
from Table
group by ID,Name) allScores
where rn > 2
You can Try This:
select top(2) Id , name , sum(Score) as sumScore
from table
group by id , name
order by sum(Score) desc

function that allows grouping of rows

I'm using SQL Server Management Studio 2012. I have a similar looking output from a query shown below. I want to eliminate someone from the query who has 2 contracts.
Select
Row_Number() over (partition by ID ORDER BY ContractypeDescription DESC) as [Row_Number],
Name,
ContractDescription,
Role
From table
Output
Row_Number ID Name Contract Description Role
1 1234 Mike FullTime Admin
2 1234 Mike Temp Manager
1 5678 Dave FullTime Admin
1 9785 Liz FullTime Admin
What I would like to see
Row_Number ID Name Contract Description Role
1 5678 Dave FullTime Admin
1 9785 Liz FullTime Admin
Is there a function rather than Row_Number that allows you to group rows together so I can then use something like 'where Row_Number not like 1 and 2'?
You can use HAVING as
SELECT ID,
MAX(Name) Name,
MAX(ContractDescription) ContractDescription,
MAX(Role) Role
FROM t
GROUP BY ID
HAVING COUNT(*) = 1;
Demo
Try this:
select * from (
Select
Count(*) over (partition by ID ) as [Row_Number],
Name,
ContractDescription,
Role
From table
)t where [Row_Number] = 1
You can check this option-
SELECT *
FROM table
WHERE ID IN
(
SELECT ID
FROM table
GROUP BY ID
HAVING COUNT(*) = 1
)
You can use a CTE to get all the ids of people who got only one contract and then just join the result of the CTE with your table.
;with cte as (
select
id
,COUNT(id) as no
from #tbl
group by id
having COUNT(id) = 1
)
select
t.id
,t.name
,t.ContractDescription
,t.role
from #tbl t
inner join cte
on t.id = cte.id
Basically you need those record who have exactly one contract.
Just extend your script, (My script is not tested)
;with CTE as
(
Select
Row_Number() over (partition by ID ORDER BY ContractypeDescription DESC) as [Row_Number],
Name,
ContractDescription,
Role
From table
)
select * from CTE c where [Row_Number]=1
and not exists(select 1 from CTE c1 where c.id=c1.id and c1.[Row_Number]>1 )
Is there a function rather than Row_Number that allows you to group
rows together so I can then use something like 'where Row_Number not
like 1 and 2'?
You can use a windowed COUNT(). The key is the OVER() clause.
;WITH WindowedCount AS
(
SELECT
T.*,
WindowCount = COUNT(1) OVER (PARTITION BY T.ID)
FROM
YourTable AS T
)
DELETE W FROM
WindowedCount AS W
WHERE
W.WindowCount > 1
The COUNT() will count the amount of rows for each different ID, so if the same ID appears in 2 or more rows, those rows will be deleted.

Return row with the highest count (like city with most transactions, when there are multiple cities)

I have a table with the following rows:
customer_id
city
transaction_id
date
I want to pull the city which has the highest number of transactions (count(transaction_id)) over a certain time period. A customer can have transactions in multiple cities.
What's the right syntax to achieve this?
UPDATE:
customer A can have 5 transactions in NYC and 10 transactions in Boston. I just want the customer record of the 10 transactions in Boston to be returned. How do I do this and what would have to cities that have the same number of transactions per customer_id?
if you need just top city name count then you can use below approach
select city,count(transaction_id) as cnt
From YourTable
where date=condition --- that you need
group by city
order by cnt desc
limit 1
if you need customerwise highest city name then use row_number()
with cte as
(
select customerid,city,count(*) as cnt
from table_name group by customerid,city
), cte1 as
(
select * ,row_number()over(partition by cusomerid order by cnt desc) rn
from cte
) select * from cte1 where rn=1

How to get the max value for each group in Oracle?

I've found some solutions for this problem, however, they don't seem to work with Oracle.
I got this:
I want a view to present only the informations about the oldest person for each team. So, my output should be something like this:
PERSON | TEAM | AGE
Sam | 1 | 23
Michael | 2 | 21
How can I do that in Oracle?
Here is an example without keep but with row_number():
with t0 as
(
select person, team, age,
row_number() over(partition by team order by age desc) as rn
from t
)
select person, team, age
from t0
where rn = 1;
select * from table
where (team, age) in (select team, max(age) from table group by team)
One method uses keep:
select team, max(age) as age,
max(person) keep (dense_rank first order by age desc) as person
from t
group by team;
There are other methods, but in my experience, keep works quite well.
select * from (select person,team,age,
dense_rank() over (partition by team order by age desc) rnk)
where rnk=1;
Using an Analytic Function returns all people with maximum age per team (needed if there are people with identical ages), selects Table one time only and is thus faster than other solutions that reference Table multiple times:
With MaxAge as (
Select T.*, Max (Age) Over (Partition by Team) MaxAge
From Table T
)
Select Person, Team, Age From MaxAge Where Age=MaxAge
;
This also works in MySQL/MariaDB.

Get most common value for each value of another column in SQL

I have a table like this:
Column | Type | Modifiers
---------+------+-----------
country | text |
food_id | int |
eaten | date |
And for each country, I want to get the food that is eaten most often. The best I can think of (I'm using postgres) is:
CREATE TEMP TABLE counts AS
SELECT country, food_id, count(*) as count FROM munch GROUP BY country, food_id;
CREATE TEMP TABLE max_counts AS
SELECT country, max(count) as max_count FROM counts GROUP BY country;
SELECT country, max(food_id) FROM counts
WHERE (country, count) IN (SELECT * from max_counts) GROUP BY country;
In that last statement, the GROUP BY and max() are needed to break ties, where two different foods have the same count.
This seems like a lot of work for something conceptually simple. Is there a more straight forward way to do it?
It is now even simpler: PostgreSQL 9.4 introduced the mode() function:
select mode() within group (order by food_id)
from munch
group by country
returns (like user2247323's example):
country | mode
--------------
GB | 3
US | 1
See documentation here:
https://wiki.postgresql.org/wiki/Aggregate_Mode
https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
PostgreSQL introduced support for window functions in 8.4, the year after this question was asked. It's worth noting that it might be solved today as follows:
SELECT country, food_id
FROM (SELECT country, food_id, ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
FROM ( SELECT country, food_id, COUNT('x') AS freq
FROM country_foods
GROUP BY 1, 2) food_freq) ranked_food_req
WHERE rn = 1;
The above will break ties. If you don't want to break ties, you could use DENSE_RANK() instead.
SELECT DISTINCT
"F1"."food",
"F1"."country"
FROM "foo" "F1"
WHERE
"F1"."food" =
(SELECT "food" FROM
(
SELECT "food", COUNT(*) AS "count"
FROM "foo" "F2"
WHERE "F2"."country" = "F1"."country"
GROUP BY "F2"."food"
ORDER BY "count" DESC
) AS "F5"
LIMIT 1
)
Well, I wrote this in a hurry and didn't check it really well. The sub-select might be pretty slow, but this is shortest and most simple SQL statement that I could think of. I'll probably tell more when I'm less drunk.
PS: Oh well, "foo" is the name of my table, "food" contains the name of the food and "country" the name of the country. Sample output:
food | country
-----------+------------
Bratwurst | Germany
Fisch | Frankreich
try this:
Select Country, Food_id
From Munch T1
Where Food_id=
(Select Food_id
from Munch T2
where T1.Country= T2.Country
group by Food_id
order by count(Food_id) desc
limit 1)
group by Country, Food_id
Try something like this
select country, food_id, count(*) cnt
into #tempTbl
from mytable
group by country, food_id
select country, food_id
from #tempTbl as x
where cnt =
(select max(cnt)
from mytable
where country=x.country
and food_id=x.food_id)
This could be put all into a single select, but I don't have time to muck around with it right now.
Good luck.
Here's how to do it without any temp tables:
Edit: simplified
select nf.country, nf.food_id as most_frequent_food_id
from national_foods nf
group by country, food_id
having
(country,count(*)) in (
select country, max(cnt)
from
(
select country, food_id, count(*) as cnt
from national_foods nf1
group by country, food_id
)
group by country
having country = nf.country
)
SELECT country, MAX( food_id )
FROM( SELECT m1.country, m1.food_id
FROM munch m1
INNER JOIN ( SELECT country
, food_id
, COUNT(*) as food_counts
FROM munch m2
GROUP BY country, food_id ) as m3
ON m1.country = m3.country
GROUP BY m1.country, m1.food_id
HAVING COUNT(*) / COUNT(DISTINCT m3.food_id) = MAX(food_counts) ) AS max_foods
GROUP BY country
I don't like the MAX(.) GROUP BY to break ties... There's gotta be a way to incorporate eaten date into the JOIN in some way to arbitrarily select the most recent one...
I'm interested on the query plan for this thing if you run it on your live data!
select country,food_id, count(*) ne
from food f1
group by country,food_id
having count(*) = (select max(count(*))
from food f2
where country = f1.country
group by food_id)
Here is a statement which I believe gives you what you want and is simple and concise:
select distinct on (country) country, food_id
from munch
group by country, food_id
order by country, count(*) desc
Please let me know what you think.
BTW, the distinct on feature is only available in Postgres.
Example, source data:
country | food_id | eaten
US 1 2017-1-1
US 1 2017-1-1
US 2 2017-1-1
US 3 2017-1-1
GB 3 2017-1-1
GB 3 2017-1-1
GB 2 2017-1-1
output:
country | food_id
US 1
GB 3