Find first n sums of a column, grouped by two other columns - sql

I have a table in SQL like this. Now I want to find the sum of Score grouped by the ID and Name columns, and show just the two highest sums for each ID, as below. How can I solve this?

Can you try something like this:
Select ID, Name, score From (
Select ID, Name, SUM(Score) score, row_number() over (partition by ID order by SUM(Score) desc) rn
from Table
group by ID,Name) allScores
where rn <= 2

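You can try this, though note that top(2) returns the two highest sums across the whole table, not the two highest for each ID: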
select top(2) Id , name , sum(Score) as sumScore
from table
group by id , name
order by sum(Score) desc
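To see the "top two sums per ID" idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module. The table name `scores` and the sample rows are assumptions made up for illustration, and it assumes an SQLite build with window-function support (3.25 or later). The sums are computed in a derived table first and ranked in a second step, which should be equivalent to ranking directly on SUM(Score).

```python
import sqlite3

# Hypothetical sample data: (ID, Name, Score). Not taken from the question.
ROWS = [
    (1, "a", 10), (1, "a", 5), (1, "b", 7), (1, "c", 20),
    (2, "x", 3), (2, "y", 4), (2, "y", 1), (2, "z", 2),
]

QUERY = """
SELECT ID, Name, total
FROM (
    SELECT ID, Name, total,
           -- Restart the numbering for every ID, highest sum first.
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY total DESC) AS rn
    FROM (
        SELECT ID, Name, SUM(Score) AS total
        FROM scores
        GROUP BY ID, Name
    ) AS sums
) AS ranked
WHERE rn <= 2
ORDER BY ID, total DESC
"""


def top_two_sums_per_id(rows):
    """Load rows into an in-memory table and return the two highest sums per ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE scores (ID INTEGER, Name TEXT, Score INTEGER)")
    conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


top_two = top_two_sums_per_id(ROWS)
print(top_two)
```

The important detail is that the window is partitioned by ID only. Partitioning by ID and Name would give every group rn = 1, so the filter would no longer pick the top two per ID.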

Related

Just another SQL case (GROUP BY)

I'm stuck on an SQL problem that I don't know how to solve.
Let's say I have a table like this (concerning estimations on house prices):
estimationID | estimationDate | userID | cityID
1 | '2020-01-01' | 123456 | 987654
2 | '2020-12-01' | 135790 | 975310
...
With estimationDate being the date when the estimation was made, userID the ID of the user who made the estimation and cityID the ID of the city where the estimation was made.
I need to get the maximum number of estimations made by one user (I don't care which one, I don't need an ID) for each city.
Something like
SELECT cityID,*maximum number of estimations made by one user from this city* FROM estimationsTable GROUP BY cityID
Any idea?
Step by step:
Get the number of estimations per user and city.
Get the maximum of these numbers per city.
The query:
select cityid, max(cnt)
from
(
select cityid, userid, count(*) as cnt
from estimationstable
group by cityid, userid
) counted
group by cityid
order by cityid;
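The two steps above (count per user and city, then take the maximum per city) can be sketched with Python's built-in sqlite3 module. The sample rows below are invented for illustration; only the table and column names follow the question.

```python
import sqlite3

# Hypothetical rows: (estimationID, estimationDate, userID, cityID).
ESTIMATIONS = [
    (1, "2020-01-01", 1, 10),
    (2, "2020-02-01", 1, 10),
    (3, "2020-03-01", 1, 10),
    (4, "2020-04-01", 2, 10),
    (5, "2020-05-01", 3, 20),
    (6, "2020-06-01", 3, 20),
    (7, "2020-07-01", 4, 20),
    (8, "2020-08-01", 4, 20),
]

QUERY = """
SELECT cityID, MAX(cnt) AS max_estimations
FROM (
    -- Step 1: number of estimations per user and city.
    SELECT cityID, userID, COUNT(*) AS cnt
    FROM estimationsTable
    GROUP BY cityID, userID
) AS counted
-- Step 2: the largest of those counts per city.
GROUP BY cityID
ORDER BY cityID
"""


def max_estimations_per_city(rows):
    """Return (cityID, highest per-user estimation count) for each city."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE estimationsTable "
        "(estimationID INTEGER, estimationDate TEXT, userID INTEGER, cityID INTEGER)"
    )
    conn.executemany("INSERT INTO estimationsTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


per_city = max_estimations_per_city(ESTIMATIONS)
print(per_city)
```

In city 20 two users tie with two estimations each; since only the count is requested, the tie does not matter here.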
Try something like the below:
with cte as (
select userid,cityid,count(*) as cnt
from table_name group by userid,cityid
)
, cte2 as (
select *,
row_number() over(partition by cityid order by cnt desc) rn
from cte
) select * from cte2 where rn=1
Solution 1:
SELECT cityID, MAX(number_of_estimations) AS maximum_number_of_estimations
FROM (SELECT cityID, userID, COUNT(*) AS number_of_estimations
FROM estimationsTable
GROUP BY cityID, userID) AS final_query
GROUP BY cityID
Solution 2:
Use ORDER BY on the count, descending, together with GROUP BY.
Something like this should work. The idea is to count all the occurrences in the inner query, grouping by your IDs, and then use an outer query to get the maximum of those counts. Alternatively, use ORDER BY [Field] DESC with GROUP BY, which puts the highest counts at the top.
In BigQuery, I think you can do this without a subquery:
select distinct cityid,
(array_agg(userid order by count(*) desc, userid))[ordinal(1)] as userid,
max(count(*)) over (order by count(*) desc) as cnt
from estimationstable
group by cityid, userid

How do I create a new SQL table with custom column names and populate these columns

I currently have an SQL statement that returns the most frequently occurring value and the least frequently occurring value in a table. However, the result has two rows, each with a name and its frequency. I need a result with two custom-named columns, one for the min and one for the max, and a single row holding one value for each.
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency DESC limit 1)
UNION
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency ASC limit 1);
So for the above query I would need the names of the min and max values in one row. I need to be able to define the name of new columns for the generated SQL query as well.
Min_Name | Max_Name
Certif_1 | Certif_2
I think this query should give you the results you want. It ranks each name according to the number of times it appears in the table, then uses conditional aggregation to select the min and max frequency names in one row:
with cte as (
select name,
row_number() over (order by count(*) desc) as maxr,
row_number() over (order by count(*)) as minr
from firefighter_certifications
group by name
)
select max(case when minr = 1 then name end) as Min_Name,
max(case when maxr = 1 then name end) as Max_Name
from cte
Postgres doesn't offer "first" and "last" aggregation functions. But there are other, similar methods:
select distinct first_value(name) over (order by cnt desc, name) as name_at_max,
first_value(name) over (order by cnt asc, name) as name_at_min
from (select name, count(*) as cnt
from firefighter_certifications
group by name
) n;
Or without any subquery at all:
select first_value(name) over (order by count(*) desc, name) as name_at_max,
first_value(name) over (order by count(*) asc, name) as name_at_min
from firefighter_certifications
group by name
limit 1;
Here is a db<>fiddle
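As a rough, runnable illustration of the conditional-aggregation approach, the sketch below uses Python's built-in sqlite3 module instead of Postgres. The certification names are made-up sample data, and an SQLite version with window functions (3.25 or later) is assumed. Counting is done in its own CTE and a name tiebreaker is added so the result is deterministic.

```python
import sqlite3

# Hypothetical certification names; each occurrence is one row in the table.
NAMES = ["EMT", "EMT", "EMT", "Rescue", "Rescue", "Hazmat"]

QUERY = """
WITH counted AS (
    SELECT name, COUNT(*) AS cnt
    FROM firefighter_certifications
    GROUP BY name
),
ranked AS (
    SELECT name,
           ROW_NUMBER() OVER (ORDER BY cnt DESC, name) AS maxr,
           ROW_NUMBER() OVER (ORDER BY cnt ASC, name) AS minr
    FROM counted
)
-- Collapse the ranked rows into one row with two custom-named columns.
SELECT MAX(CASE WHEN minr = 1 THEN name END) AS Min_Name,
       MAX(CASE WHEN maxr = 1 THEN name END) AS Max_Name
FROM ranked
"""


def min_and_max_names(names):
    """Return (least frequent name, most frequent name) as a single row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE firefighter_certifications (name TEXT)")
    conn.executemany(
        "INSERT INTO firefighter_certifications VALUES (?)",
        [(n,) for n in names],
    )
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row


min_max = min_and_max_names(NAMES)
print(min_max)
```

Each CASE expression is non-NULL for exactly one row, so MAX simply picks that name out.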

Group by two columns, take sum, then max

There are three columns: Id (char), Name (char), and Score (int).
First, we group by Id and Name and add Score for each group. Let us call the added score total_score.
Then, we group by Name and take only the maximum of total_score and its corresponding Id and Name. I've got everything else but I'm having a hard time figuring out how to get the Id. The error I get is
Column 'Id' is invalid in the select list because
it is not contained in either an aggregate function or the GROUP BY
clause.
WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
Name)
SELECT Name, -- Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name
ORDER BY max_score DESC
Just add row_number() partitioned by Name to your query and take the first row (ordered by total_score descending):
select *
from
(
-- your existing `total_score` query
SELECT Id, Name,
SUM(Score) AS total_score,
r = row_number() over (partition by Name order by SUM(Score) desc)
FROM Mytable
GROUP BY Id, Name
) d
where r = 1
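A small runnable sketch of this row_number() approach, using Python's built-in sqlite3 module rather than SQL Server, might look as follows. The sample rows are illustrative assumptions, the `r = ...` alias syntax is replaced by the portable `AS r`, and the totals are computed in a derived table before ranking. SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Illustrative sample rows: (Id, Name, Score).
ROWS = [
    (1, "abc", 100),
    (2, "abc", 200),
    (3, "def", 300),
    (3, "def", 400),
    (4, "pqr", 500),
]

QUERY = """
SELECT Id, Name, total_score
FROM (
    SELECT Id, Name, total_score,
           -- Rank the (Id, Name) totals within each Name, highest first.
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY total_score DESC) AS r
    FROM (
        SELECT Id, Name, SUM(Score) AS total_score
        FROM Mytable
        GROUP BY Id, Name
    ) AS totals
) AS d
WHERE r = 1
ORDER BY Name
"""


def best_id_per_name(rows):
    """Return the Id with the highest summed score for each Name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Mytable (Id INTEGER, Name TEXT, Score INTEGER)")
    conn.executemany("INSERT INTO Mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


best = best_id_per_name(ROWS)
print(best)
```

Because the filter runs on the ranked rows rather than on a second GROUP BY, the Id stays available without triggering the "not contained in either an aggregate function or the GROUP BY clause" error.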
WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
Name)
SELECT Name, Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name,id
ORDER BY max_score DESC
Try this. Hope this will help.
WITH Tmp AS
(
SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id,
NAME
)
SELECT Name, Id,
MAX(total_score) AS max_score
FROM Tmp
GROUP BY Name,id
ORDER BY max_score DESC
Note: when the select list contains an aggregate function, every other selected column must appear in the GROUP BY clause.
In your case, you are using SUM(Score) as the aggregate, so the other columns you select have to be listed in GROUP BY.
I am not sure about the performance of the query below, but we can use window functions to get the maximum value within each data partition.
SELECT
Id,
Name,
SUM(Score) AS total_score,
MAX(SUM(Score)) OVER(Partition by Name) AS max_score
FROM Mytable
GROUP BY Id, Name;
Tested -
declare @Mytable table (id int, name varchar(10), score int);
insert into @Mytable values
(1,'abc', 100),
(2,'abc', 200),
(3,'def', 300),
(3,'def', 400),
(4,'pqr', 500);
Output -
Id Name total_score max_score
1 abc 100 200
2 abc 200 200
3 def 700 700
4 pqr 500 500
You can compute DENSE_RANK() over the total_score column and then select the records with Rank = 1. This also handles ties, where several Ids under the same Name share the highest total_score.
WITH Tmp AS
(SELECT Id,
Name,
SUM(Score) AS total_score
FROM Mytable
GROUP BY Id, Name)
SELECT Id,
Name,
total_score AS max_score
FROM (SELECT Id,
Name,
total_score,
DENSE_RANK() OVER (PARTITION BY Name ORDER BY total_score DESC) AS Rank
FROM Tmp) AS Tmp2
WHERE Rank = 1
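To make the difference on ties concrete, the following sketch runs the same query once with DENSE_RANK and once with ROW_NUMBER, using Python's built-in sqlite3 module. The sample data is invented so that two Ids tie under the same Name, and the helper name `top_per_name` is hypothetical. SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical rows: Ids "1" and "2" both total 200 under Name "abc".
ROWS = [
    ("1", "abc", 120),
    ("1", "abc", 80),
    ("2", "abc", 200),
    ("3", "def", 700),
]

QUERY_TEMPLATE = """
WITH Tmp AS (
    SELECT Id, Name, SUM(Score) AS total_score
    FROM Mytable
    GROUP BY Id, Name
)
SELECT Id, Name, total_score AS max_score
FROM (
    SELECT Id, Name, total_score,
           {ranking}() OVER (PARTITION BY Name ORDER BY total_score DESC) AS rnk
    FROM Tmp
) AS Tmp2
WHERE rnk = 1
ORDER BY Name, Id
"""


def top_per_name(rows, ranking):
    """Run the query with the given ranking function (DENSE_RANK or ROW_NUMBER)."""
    if ranking not in ("DENSE_RANK", "ROW_NUMBER"):
        raise ValueError("unsupported ranking function")
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Mytable (Id TEXT, Name TEXT, Score INTEGER)")
    conn.executemany("INSERT INTO Mytable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY_TEMPLATE.format(ranking=ranking)).fetchall()
    conn.close()
    return result


with_ties = top_per_name(ROWS, "DENSE_RANK")
without_ties = top_per_name(ROWS, "ROW_NUMBER")
print(with_ties)
print(without_ties)
```

With DENSE_RANK both tied Ids come back for "abc"; with ROW_NUMBER only one of them does, and which one is not defined unless a tiebreaker is added to the ORDER BY.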
You can try this as well:
select id,name,max(total_score) over (partition by name) max_score from (
select id,name,sum(score) as total_score from YOURTABLE
group by id,name
) t

How to count record by specific column in SQL SELECT statement and putting that value in another column of that row

I am trying to put a count value in the AppNum column for the same student where the FIRST/TOP record in the result set is 1 and the subsequent record would be 2, 3, etc. I attempted to do this using GROUP BY but not getting the result I'm looking for. In the following PIC the FIRST resultset shows what I'm getting and the SECOND resultset is what I'm needing.
Here is a PIC of what I'm looking to do:
The query I've tried to get the correct resultset is below. Any HELP/DIRECTION would be appreciated:
SELECT
StudentID, Location, Status, EconomicDisadvantageCode,
StatusEffectiveDate, Enddate, SchoolYear, ApplicationTypeCode,
LastUpdated, UpdatedAppType, DataSource, COUNT(Status) AS AppNum
FROM
#MCS_5
WHERE
StudentID IN (SELECT StudentID
FROM #MCS_5
GROUP BY StudentID
HAVING COUNT(StudentID) > 1)
GROUP BY
Status, StudentID, Location, Status, EconomicDisadvantageCode,
StatusEffectiveDate, Enddate, SchoolYear, ApplicationTypeCode,
LastUpdated, UpdatedAppType, DataSource
ORDER BY
StudentID ASC, StatusEffectiveDate ASC;
Use the ROW_NUMBER function; it's quite handy in this scenario.
SELECT StudentID
,Location
,STATUS
,EconomicDisadvantageCode
,StatusEffectiveDate
,Enddate
,SchoolYear
,ApplicationTypeCode
,LastUpdated
,UpdatedAppType
,DataSource
,ROW_NUMBER() OVER (PARTITION BY StudentID ORDER BY StatusEffectiveDate) AS AppNum
FROM #MCS_5
WHERE StudentID IN (
SELECT StudentID
FROM #MCS_5
GROUP BY StudentID
HAVING COUNT(StudentID) > 1
)
Just use window functions for everything:
Select ...
From (select t.*,
Row_number() over (partition by studentid order by status) as seqnum,
Count(*) over (partition by studentid) as cnt
From #MCS_5 t
) t
Where cnt > 1
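Here is a minimal sketch of the window-function-only approach, run through Python's built-in sqlite3 module. The table name `applications`, the reduced column set, and the sample rows are all assumptions for illustration, and the numbering is ordered by StatusEffectiveDate on the assumption that this is the intended "first record" order. SQLite 3.25 or later is assumed.

```python
import sqlite3

# Hypothetical rows: (StudentID, StatusEffectiveDate). Student 102 has one record.
ROWS = [
    (101, "2020-08-01"),
    (101, "2019-08-01"),
    (101, "2021-08-01"),
    (102, "2020-08-01"),
    (103, "2020-01-15"),
    (103, "2020-09-01"),
]

QUERY = """
SELECT StudentID, StatusEffectiveDate, AppNum
FROM (
    SELECT t.*,
           -- 1, 2, 3, ... restarting for each student.
           ROW_NUMBER() OVER (PARTITION BY StudentID
                              ORDER BY StatusEffectiveDate) AS AppNum,
           -- Total records per student, repeated on every row.
           COUNT(*) OVER (PARTITION BY StudentID) AS cnt
    FROM applications AS t
) AS numbered
WHERE cnt > 1
ORDER BY StudentID, AppNum
"""


def number_applications(rows):
    """Number each student's records and keep students with more than one."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE applications (StudentID INTEGER, StatusEffectiveDate TEXT)"
    )
    conn.executemany("INSERT INTO applications VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


numbered = number_applications(ROWS)
print(numbered)
```

The windowed COUNT(*) replaces the IN (... HAVING COUNT(StudentID) > 1) subquery, so the table is read only once.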

how to get the distinct records based on maximum date?

I'm working with SQL Server 2008. I have a table that contains the following columns:
Id,
Name,
Date
This table contains more than one record for the same Id. I want to get each distinct Id with its maximum date. How can I write a SQL query for this?
Use the ROW_NUMBER() function and PARTITION BY clause. Something like this:
SELECT Id, Name, Date FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Date desc) AS ROWNUM
FROM [MyTable]
) x WHERE ROWNUM = 1
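For a quick way to try the ROW_NUMBER() pattern without a SQL Server instance, here is a sketch using Python's built-in sqlite3 module. The sample rows are assumptions, the dates are stored as ISO-8601 text so that they sort chronologically, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

# Hypothetical rows: (Id, Name, Date as ISO-8601 text).
ROWS = [
    (1, "A", "2014-04-20"),
    (1, "A", "2014-04-28"),
    (2, "A2", "2014-04-22"),
    (2, "A2", "2014-04-24"),
]

QUERY = """
SELECT Id, Name, "Date"
FROM (
    SELECT *,
           -- Newest row per Id gets rn = 1.
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY "Date" DESC) AS rn
    FROM MyTable
) AS x
WHERE rn = 1
ORDER BY Id
"""


def latest_row_per_id(rows):
    """Return the full row with the maximum date for each Id."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE MyTable (Id INTEGER, Name TEXT, "Date" TEXT)')
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


latest = latest_row_per_id(ROWS)
print(latest)
```

If two rows for the same Id share the maximum date, ROW_NUMBER keeps just one of them; add a tiebreaker column to the ORDER BY if the choice matters.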
If you need only ID column and other columns are NOT required, then you don't need to go with ROW_NUMBER or MAX or anything else. You just do a Group By over ID column, because whatever the maximum date is you will get same ID.
SELECT ID FROM table GROUP BY ID
--OR
SELECT DISTINCT ID FROM table
If you need ID and Date columns with maximum date, then simply do a Group By on ID column and select the Max Date.
SELECT ID, Max(Date) AS Date
FROM table
GROUP BY ID
If you need all the columns but only the one row with the maximum date, you can go with ROW_NUMBER or MAX as mentioned in other answers, or use an EXISTS subquery:
SELECT *
FROM table AS M
WHERE Exists(
SELECT 1
FROM table
WHERE ID = M.ID
HAVING M.Date = Max(Date)
)
One way, using ROW_NUMBER:
With CTE As
(
SELECT Id, Name, Date, Rn = Row_Number() Over (Partition By Id
Order By Date DESC)
FROM dbo.TableName
)
SELECT Id --, Name, Date
FROM CTE
WHERE Rn = 1
If multiple max-dates are possible and you want all you could use DENSE_RANK instead.
Here's an overview of sql-server's ranking function: http://technet.microsoft.com/en-us/library/ms189798.aspx
By the way, a CTE is a common table expression, which is similar to a named sub-query. I'm using it to be able to filter by the row_number. This approach lets you select all columns if you want.
select Max(Date) as "Max Date"
from table
group by Id
order by Id
Try Max(Date) and GROUP BY the other two columns (the ones with repeating data):
SELECT ID, Max(Date) as date, Name
FROM YourTable
GROUP BY ID, Name
You can try this:
DECLARE @T TABLE(ID INT, NAME VARCHAR(50),DATE DATETIME)
INSERT INTO @T VALUES(1,'A','2014-04-20'),(1,'A','2014-04-28')
,(2,'A2','2014-04-22'),(2,'A2','2014-04-24')
,(3,'A3','2014-04-20'),(3,'A3','2014-04-28')
,(4,'A4','2014-04-28'),(4,'A4','2014-04-28')
,(5,'A5','2014-04-28'),(5,'A5','2014-04-28')
SELECT T.ID FROM @T T
WHERE T.DATE=(SELECT MAX(A.DATE)
FROM @T A
WHERE A.ID=T.ID
GROUP BY A.ID )
GROUP BY T.ID
select id, max(date) from NameOfYourTable group by id;