Order table by the total count but do not lose the order by names - sql

I have a table with 3 columns (Person, Year and Count). For each person there are several rows with different years and counts, plus a final row with the total count. I want to keep the table ordered by Name, but also order it by the total count.
So the rows should be ordered by sum, but also grouped by Person and ordered by year. When I try to order by sum, of course, both person and years get messed up. Is there a way to sort like this?

You've stored those "total" rows as well? Gosh! Why did you do that?
Anyway: if you compute the rank for rows whose year column is equal to 'total' and add a case expression to the order by clause, you might get what you want:
SQL> with sorter as
       (select name, cnt,
               rank() over (order by cnt) rnk
          from test
         where year = 'total'
       )
     select t.*
       from test t join sorter s on s.name = t.name
      order by s.rnk, case when year = 'total' then '9'
                           else year
                      end;

NAME YEAR         CNT
---- ----- ----------
John 2018           3
John 2019           2
John total          5
Bob  2017           2
Bob  2019           4
Bob  total          6

6 rows selected.
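To try the idea outside SQL*Plus, here is a minimal sketch using Python's built-in sqlite3 module (it assumes SQLite 3.25 or newer for window functions). The table layout and sample rows mirror the output above; the extra t.name sort key is my own addition, so that two people with the same total do not get their rows interleaved.

```python
import sqlite3


def ordered_rows():
    """Return rows grouped by person, with groups sorted by their 'total' count."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE test (name TEXT, year TEXT, cnt INTEGER)")
    con.executemany(
        "INSERT INTO test VALUES (?, ?, ?)",
        [
            ("Bob", "2017", 2), ("Bob", "2019", 4), ("Bob", "total", 6),
            ("John", "2018", 3), ("John", "2019", 2), ("John", "total", 5),
        ],
    )
    rows = con.execute(
        """
        WITH sorter AS (
            -- rank each person by the count stored in their 'total' row
            SELECT name, RANK() OVER (ORDER BY cnt) AS rnk
            FROM test
            WHERE year = 'total'
        )
        SELECT t.name, t.year, t.cnt
        FROM test t
        JOIN sorter s ON s.name = t.name
        ORDER BY s.rnk,
                 t.name,  -- assumption: keeps tied totals from interleaving
                 CASE WHEN t.year = 'total' THEN '9' ELSE t.year END
        """
    ).fetchall()
    con.close()
    return rows


for row in ordered_rows():
    print(row)
```

The '9' trick relies on year being stored as text, so that '9' sorts after any four-digit year string.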

Related

Select car of max(date) for every employee [duplicate]

I need code for the following problem:
I have a table like this:
Employee  Year  Month  Car
--------  ----  -----  -------
Tom       2021  9      Ford
Tom       2021  10     Ford
Tom       2021  11     Ford
Tom       2021  12     Renault
Tom       2022  1      Renault
Mark      2021  12     VW
Mark      2022  1      VW
Mark      2022  2      VW
Joe       2021  8      Opel
Joe       2021  9      Tesla
Joe       2021  10     Ferrari
And I would need the car used by the employee for the last possible date. So the result should be:
Employee  Car
--------  -------
Tom       Renault
Mark      VW
Joe       Ferrari
With:
select employee, max(year || month) from table.cars
group by employee
I get the max(date) for every employee, but I do not know how to join the cars to the max(date).
How can I get the result I want?
You can use the ROW_NUMBER() analytic function, for example:
SELECT Employee, Car
FROM (SELECT ROW_NUMBER() OVER
(PARTITION BY Employee ORDER BY year DESC, month DESC) AS rn,
c.*
FROM cars c)
WHERE rn = 1
If the year and month columns are stored as strings, replace the part ORDER BY year DESC, month DESC with
ORDER BY TO_NUMBER(TRIM(year)) DESC, TO_NUMBER(TRIM(month)) DESC
with t as
(
select *,
row_number() over (partition by employee order by year desc, month desc) rn
from cars
)
select employee, car
from t
where rn = 1
Try this:
select employee, car
from (
select c.*, ROW_NUMBER() OVER (partition by employee order by year DESC, month DESC) as rn
from cars c
) a
where rn = 1
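The answers above all come down to the same pattern: number the rows per employee from newest to oldest and keep row 1. A minimal sketch with Python's sqlite3 module (SQLite 3.25+ assumed), loading the sample data from the question with year and month stored as integers, which is my assumption:

```python
import sqlite3

# Sample rows copied from the question: (employee, year, month, car).
CARS = [
    ("Tom", 2021, 9, "Ford"), ("Tom", 2021, 10, "Ford"),
    ("Tom", 2021, 11, "Ford"), ("Tom", 2021, 12, "Renault"),
    ("Tom", 2022, 1, "Renault"),
    ("Mark", 2021, 12, "VW"), ("Mark", 2022, 1, "VW"), ("Mark", 2022, 2, "VW"),
    ("Joe", 2021, 8, "Opel"), ("Joe", 2021, 9, "Tesla"),
    ("Joe", 2021, 10, "Ferrari"),
]


def latest_car_per_employee(rows):
    """Map each employee to the car of their most recent (year, month)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE cars (employee TEXT, year INTEGER, month INTEGER, car TEXT)"
    )
    con.executemany("INSERT INTO cars VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT employee, car
        FROM (
            SELECT employee, car,
                   ROW_NUMBER() OVER (
                       PARTITION BY employee
                       ORDER BY year DESC, month DESC  -- newest row gets rn = 1
                   ) AS rn
            FROM cars
        )
        WHERE rn = 1
        """
    ).fetchall()
    con.close()
    return dict(result)


print(latest_car_per_employee(CARS))
```

Both sort keys need DESC; with only month descending, the oldest year would win instead of the newest.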

Looking to select values grouped by one column but create a hierarchy of the different columns to find "the best" column

Mightn't make much sense but let's try.
I have a dataset that is quite large and I have a few "duplicates" in a column. Within that column, I want to group it but select the corresponding row that is the "best fit" based on the max/sum of other columns. Is this possible within SQL?
Input:
Name  Transactions  Date       Apple #  Orange #
----  ------------  ---------  -------  --------
John  10            today      10       10
John  15            Yesterday  10       10
Jack  10            Today      5        5
Output I expect:
Name  Transactions  Date       Apple #  Orange #  Total #
----  ------------  ---------  -------  --------  -------
John  15            Yesterday  10       10        20
Jack  10            Today      5        5         10
The hierarchy would be, max(transactions), max(date) and then sum(Apple, Orange).
I want to do it then for every unique name.
If I understand correctly, you can use row_number(). The key is setting up the order by to reflect the conditions you want:
select t.*
from (select t.*,
row_number() over (partition by name order by transactions desc, date desc, apple + orange desc) as seqnum
from t
) t
where seqnum = 1;
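Here is a small sketch of that query run through Python's sqlite3 module (SQLite 3.25+ assumed). The column names txn_date, apple and orange and the concrete ISO dates standing in for "today" and "yesterday" are hypothetical, chosen only so the example runs. It also adds the Total # column from the expected output as apple + orange.

```python
import sqlite3


def best_row_per_name():
    """Pick one row per name: most transactions, then latest date, then apple + orange."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE t (name TEXT, transactions INTEGER, txn_date TEXT, "
        "apple INTEGER, orange INTEGER)"
    )
    con.executemany(
        "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
        [
            ("John", 10, "2024-01-02", 10, 10),  # "today" (hypothetical date)
            ("John", 15, "2024-01-01", 10, 10),  # "yesterday" (hypothetical date)
            ("Jack", 10, "2024-01-02", 5, 5),
        ],
    )
    rows = con.execute(
        """
        SELECT name, transactions, txn_date, apple, orange,
               apple + orange AS total
        FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (
                       PARTITION BY name
                       ORDER BY transactions DESC,
                                txn_date DESC,
                                apple + orange DESC
                   ) AS seqnum
            FROM t
        )
        WHERE seqnum = 1
        ORDER BY name
        """
    ).fetchall()
    con.close()
    return rows


for row in best_row_per_name():
    print(row)
```

Each later key in the ORDER BY only matters when all earlier keys tie, which is what gives the hierarchy described in the question.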

Running Total by Year in SQL

I have a table of amounts broken out by year, and need to build a running total column that restarts with each new year.
The desired outcome is below
Amount | Year | Running Total
-----------------------------
1 2000 1
5 2000 6
10 2000 16
5 2001 5
10 2001 15
3 2001 18
I can do an ORDER BY to get a standard running total, but can't figure out how to base it just on the year such that it does the running total for each unique year.
SQL tables represent unordered sets. You need a column to specify the ordering. Once you have this, it is a simple cumulative sum:
select amount, year, sum(amount) over (partition by year order by <ordering column>)
from t;
Without a column that specifies ordering, "cumulative sum" does not make sense on a table in SQL.
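As a minimal sketch, suppose the table has an id column that records insertion order (a hypothetical ordering column; the question does not show one). With Python's sqlite3 module (SQLite 3.25+ assumed), the partitioned cumulative sum gives the desired result from the question:

```python
import sqlite3


def running_totals():
    """Cumulative sum of amount that restarts for every year."""
    con = sqlite3.connect(":memory:")
    # id is an assumed ordering column, not part of the original table.
    con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, amount INTEGER, year INTEGER)")
    con.executemany(
        "INSERT INTO t (id, amount, year) VALUES (?, ?, ?)",
        [
            (1, 1, 2000), (2, 5, 2000), (3, 10, 2000),
            (4, 5, 2001), (5, 10, 2001), (6, 3, 2001),
        ],
    )
    rows = con.execute(
        """
        SELECT amount, year,
               SUM(amount) OVER (PARTITION BY year ORDER BY id) AS running_total
        FROM t
        ORDER BY id
        """
    ).fetchall()
    con.close()
    return rows


for row in running_totals():
    print(row)
```

PARTITION BY year is what resets the sum; ORDER BY id decides in which order the amounts accumulate within a year.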

How to get top records of subset?

Say I have a table
StoreID TotalSales Month Year
-- ---------- ----- ----
1 10 1 2012
2 2 1 2012
3 15 1 2012
1 4 2 2012
2 5 2 2012
I need: For each unique "Month/Year", grab the top two StoreID's with the highest Sales.
I'm at a loss on how to do this. I tried with a cross apply but that doesn't seem to work. This is all way over my head so hopefully someone can give me a nudge in the right direction.
This query uses a Common Table Expression and a window function so that all the columns of each row are available. It works on SQL Server 2005 and up.
WITH records
AS
(
SELECT StoreID, TotalSales , Month, Year,
DENSE_RANK() OVER (PARTITION BY Month, Year
ORDER BY TotalSales DESC) rn
FROM tableName
)
SELECT StoreID, TotalSales , Month, Year
FROM records
WHERE rn <= 2
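For a quick local check, here is a sketch of the same query against the sample data, run through Python's sqlite3 module instead of SQL Server (SQLite 3.25+ assumed; the table name store_sales is hypothetical):

```python
import sqlite3


def top_two_per_month():
    """Top two stores by TotalSales within each (Year, Month)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE store_sales (StoreID INTEGER, TotalSales INTEGER, "
        "Month INTEGER, Year INTEGER)"
    )
    con.executemany(
        "INSERT INTO store_sales VALUES (?, ?, ?, ?)",
        [(1, 10, 1, 2012), (2, 2, 1, 2012), (3, 15, 1, 2012),
         (1, 4, 2, 2012), (2, 5, 2, 2012)],
    )
    rows = con.execute(
        """
        WITH records AS (
            SELECT StoreID, TotalSales, Month, Year,
                   DENSE_RANK() OVER (
                       PARTITION BY Year, Month
                       ORDER BY TotalSales DESC
                   ) AS rn
            FROM store_sales
        )
        SELECT StoreID, TotalSales, Month, Year
        FROM records
        WHERE rn <= 2
        ORDER BY Year, Month, rn
        """
    ).fetchall()
    con.close()
    return rows


for row in top_two_per_month():
    print(row)
```

Because DENSE_RANK gives tied sales the same rank, a month with ties can return more than two rows; ROW_NUMBER would cap it at exactly two, at the cost of picking arbitrarily among ties.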

selecting top N rows for each group in a table

I am facing a very common issue regarding "Selecting top N rows for each group in a table".
Consider a table with id, name, hair_colour, score columns.
I want a result set that gives me, for each hair colour, the names of the top 3 scorers.
To solve this, I found exactly what I need in Rick Osborne's blog post "sql-getting-top-n-rows-for-a-grouped-query".
That solution doesn't work as expected when scores are equal.
In the above example the result is as follows.
id name hair score ranknum
---------------------------------
12 Kit Blonde 10 1
9 Becca Blonde 9 2
8 Katie Blonde 8 3
3 Sarah Brunette 10 1
4 Deborah Brunette 9 2 -------> if
1 Kim Brunette 8 3
Consider the row 4 Deborah Brunette 9 2. If Deborah also had the same score (10) as Sarah, then the ranknum values for the "Brunette" hair type would be 2, 2, 3.
What's the solution to this?
If you're using SQL Server 2005 or newer, you can use the ranking functions and a CTE to achieve this:
;WITH HairColors AS
(SELECT id, name, hair, score,
        ROW_NUMBER() OVER(PARTITION BY hair ORDER BY score DESC) AS RowNum
 FROM YourTable -- placeholder: the snippet had no FROM clause; use your own table name
)
SELECT id, name, hair, score
FROM HairColors
WHERE RowNum <= 3
This CTE will "partition" your data by the value of the hair column, and each partition is then ordered by score (descending) and gets a row number; the highest score for each partition is 1, then 2 etc.
So if you want the top 3 of each group, select only those rows from the CTE that have a RowNum of 3 or less (1, 2, 3) --> there you go!
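Since the question is about ties, it may help to see how the three ranking functions differ on tied scores. A minimal sketch with Python's sqlite3 module (SQLite 3.25+ assumed; the table name girls and the tied score of 10 for Deborah are assumptions taken from the scenario in the question):

```python
import sqlite3


def compare_rankings():
    """Return (name, row_number, rank, dense_rank) for the Brunette group."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE girls (id INTEGER, name TEXT, hair TEXT, score INTEGER)")
    con.executemany(
        "INSERT INTO girls VALUES (?, ?, ?, ?)",
        [(3, "Sarah", "Brunette", 10),
         (4, "Deborah", "Brunette", 10),  # tied with Sarah, as in the question
         (1, "Kim", "Brunette", 8)],
    )
    rows = con.execute(
        """
        SELECT name,
               -- id is added as a tie-breaker so the numbering is deterministic
               ROW_NUMBER() OVER (PARTITION BY hair ORDER BY score DESC, id) AS row_num,
               RANK()       OVER (PARTITION BY hair ORDER BY score DESC)     AS rnk,
               DENSE_RANK() OVER (PARTITION BY hair ORDER BY score DESC)     AS dense_rnk
        FROM girls
        ORDER BY row_num
        """
    ).fetchall()
    con.close()
    return rows


for row in compare_rankings():
    print(row)
```

ROW_NUMBER always yields 1, 2, 3 and so breaks ties arbitrarily unless you add a tie-breaker; RANK yields 1, 1, 3; DENSE_RANK yields 1, 1, 2. Which one fits depends on whether tied scorers should share a place and whether the following place should be skipped.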
The way the algorithm comes up with the rank is to count the number of rows in the cross-product with a score equal to or greater than that of the girl in question. Hence in the problem case you're talking about, Sarah's grid would look like
a.name | a.score | b.name | b.score
-------+---------+---------+--------
Sarah | 9 | Sarah | 9
Sarah | 9 | Deborah | 9
and similarly for Deborah, which is why both girls get a rank of 2 here.
The problem is that when there's a tie, all the tied girls get the worst rank in the tied range because of this count, when you'd want them to get the best one instead. I think a simple change can fix this:
Instead of a greater-than-or-equal comparison, use a strict greater-than comparison to count the number of girls who are strictly better. Then, add one to that and you have your rank (which will deal with ties as appropriate). So the inner select would be:
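-- LEFT JOIN keeps top scorers, who have no strictly better row to match
SELECT a.id, COUNT(b.id) + 1 AS ranknum
FROM girl AS a
LEFT JOIN girl AS b ON (a.hair = b.hair) AND (a.score < b.score)
GROUP BY a.id
-- ranknum <= 3 means at most two girls are strictly better
HAVING COUNT(b.id) < 3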
Can anyone see any problems with this approach that have escaped my notice?
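One way to check the strict greater-than idea is to run it on a tied example. The sketch below uses Python's sqlite3 module and a LEFT JOIN, so that top scorers, who have nobody strictly above them, still appear in the result. The fourth girl, Amy, is hypothetical and is there only to show the cut-off.

```python
import sqlite3


def strict_count_ranks():
    """Rank = 1 + number of girls with the same hair and a strictly higher score."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE girl (id INTEGER, name TEXT, hair TEXT, score INTEGER)")
    con.executemany(
        "INSERT INTO girl VALUES (?, ?, ?, ?)",
        [(3, "Sarah", "Brunette", 10),
         (4, "Deborah", "Brunette", 10),
         (1, "Kim", "Brunette", 8),
         (5, "Amy", "Brunette", 7)],  # hypothetical fourth row, should be cut
    )
    rows = con.execute(
        """
        SELECT a.id, a.name, COUNT(b.id) + 1 AS ranknum
        FROM girl AS a
        LEFT JOIN girl AS b
               ON a.hair = b.hair AND a.score < b.score
        GROUP BY a.id, a.name
        HAVING COUNT(b.id) < 3      -- i.e. ranknum <= 3
        ORDER BY ranknum, a.id
        """
    ).fetchall()
    con.close()
    return rows


for row in strict_count_ranks():
    print(row)
```

Sarah and Deborah both come out with rank 1 and Kim with rank 3, which matches what RANK() would produce. COUNT(b.id) rather than COUNT(*) matters here, because the unmatched row from the LEFT JOIN must count as zero.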
Use this compound select, which handles the OP's problem properly:
SELECT g.* FROM girls as g
WHERE g.score > IFNULL( (SELECT g2.score FROM girls as g2
WHERE g.hair=g2.hair ORDER BY g2.score DESC LIMIT 3,1), 0)
Note that you need IFNULL here to handle the case where the girls table has fewer rows for some hair type than the number we want to return (3 in the OP's case).
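A rough sketch of how this behaves, using Python's sqlite3 module, which also accepts IFNULL and the MySQL-style LIMIT offset, count form. The fourth Blonde row (Zoe) is hypothetical, and the Brunette group deliberately has only two rows so that the IFNULL branch is exercised.

```python
import sqlite3


def top_three_by_threshold():
    """Keep rows scoring above the 4th-highest score of their hair group."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE girls (id INTEGER, name TEXT, hair TEXT, score INTEGER)")
    con.executemany(
        "INSERT INTO girls VALUES (?, ?, ?, ?)",
        [(12, "Kit", "Blonde", 10), (9, "Becca", "Blonde", 9),
         (8, "Katie", "Blonde", 8), (7, "Zoe", "Blonde", 7),  # Zoe is hypothetical
         (3, "Sarah", "Brunette", 10), (4, "Deborah", "Brunette", 10)],
    )
    rows = con.execute(
        """
        SELECT g.hair, g.name
        FROM girls AS g
        WHERE g.score > IFNULL(
            (SELECT g2.score
             FROM girls AS g2
             WHERE g.hair = g2.hair
             ORDER BY g2.score DESC
             LIMIT 3, 1),   -- skip 3 rows, take 1: the 4th-highest score
            0)              -- fewer than 4 rows in the group: keep them all
        ORDER BY g.hair, g.score DESC, g.name
        """
    ).fetchall()
    con.close()
    return rows


for row in top_three_by_threshold():
    print(row)
```

One thing to watch, as far as I can tell: if the 3rd and 4th scores in a group are equal, the strict comparison drops both, so that group returns fewer than three rows. The 0 fallback also assumes scores are positive.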