Not selecting duplicate records with unique keys - sql

Sample table:
Table Movies:
Title | Year | Price | Genre | ID
Batman | 2016 | 12 | Comic | 1
Avengers | 2014 | 7 | Comic | 2
Batman | 2016 | 7 | Comic | 3
Fast 5 | 2012 | 7 | Car | 4
Superman | NULL | NULL | NULL | 5
Star Wars | NULL | NULL | NULL | 6
Desired Result:
Title | Year | ID
Batman | 2016 | 1
Avengers | 2014 | 2
Fast 5 | 2012 | 4
Superman | NULL | 5
Star Wars | NULL | 6
So I need to select distinct ID, Title and Year where Title and Year aren't duplicates. Note Title and Year will always be the same if it is a duplicate so for example Batman 2014 wouldn't be a choice. If it was a duplicate both title and year would be the same as the duplicate record. Basically need to not select duplicate records that have unique keys. What is the most efficient way to do this?
Edit: one other thing. Be aware that null values might be present and I don't want those omitted. I updated the example to show this.

Use row_number(), partitioning by the fields that define a duplicate and ordering by another column:
with CTE as
(
select a1.*, row_number() over (partition by Title, Year order by ID) as r_ord
from Movies a1
)
select CTE.*
from CTE
where r_ord =1
or, if you only want the Title, Year and ID:
select Title, Year, min(ID) as ID
from Movies
group by Title, Year
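As a rough check of both approaches against the sample data, here is a minimal sketch using Python's built-in sqlite3 module. The choice of SQLite and the column types are assumptions, since the question does not name a database, and the window-function query needs SQLite 3.25 or newer. It also illustrates that rows with a NULL Year are kept: GROUP BY and PARTITION BY put NULLs into a group of their own instead of dropping them.

```python
import sqlite3

# In-memory database; the column types are assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Movies (Title TEXT, Year INTEGER, Price INTEGER, Genre TEXT, ID INTEGER)"
)
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?, ?, ?, ?)",
    [
        ("Batman", 2016, 12, "Comic", 1),
        ("Avengers", 2014, 7, "Comic", 2),
        ("Batman", 2016, 7, "Comic", 3),
        ("Fast 5", 2012, 7, "Car", 4),
        ("Superman", None, None, None, 5),
        ("Star Wars", None, None, None, 6),
    ],
)

# Approach 1: keep the smallest ID per (Title, Year) group.
group_by_rows = conn.execute("""
    SELECT Title, Year, MIN(ID) AS ID
    FROM Movies
    GROUP BY Title, Year
    ORDER BY MIN(ID)
""").fetchall()

# Approach 2: number the rows inside each (Title, Year) partition, keep the first.
window_rows = conn.execute("""
    WITH cte AS (
        SELECT Title, Year, ID,
               ROW_NUMBER() OVER (PARTITION BY Title, Year ORDER BY ID) AS r_ord
        FROM Movies
    )
    SELECT Title, Year, ID
    FROM cte
    WHERE r_ord = 1
    ORDER BY ID
""").fetchall()

for row in group_by_rows:
    print(row)
```

Both queries return the same five rows here. The window version is the one to reach for when other columns (Price, Genre) of the surviving row are needed as well.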

Related

find out the player with highest score in each year

I have a table like this:
country | gender | player | score | year
Germany | male | Michael | 14 | 1990
Austria | male | Simon | 13 | 1990
Germany | female | Mila | 16 | 1990
Austria | female | Simona | 15 | 1990
This is a table in the database. It covers 70 countries around the world, with player names and genders, and shows how many goals each player scored in each year. The years range from 1990 to 2015, so the table is large. Now I would like to know which female player and which male player scored the most in each year from 2010 to 2012.
I expect this:
gender | player | score | year
male | Michael | 24 | 2010
male | Simon | 19 | 2011
male | Milos | 19 | 2012
female | Mara | 16 | 2010
female | Simona | 16 | 2011
female | Dania | 17 | 2012
I used that code but got an error
SELECT gender,year,player, max(score) as score from (football) where player = max(score) and year in ('2010','2011','2012') group by 1,2,3
football is the table name
with main as (
select
gender,
player,
year,
sum(score) as total_score -- in case a player played multiple matches in a year
from football
where year between 2010 and 2012
group by 1,2,3
),
ranking as (
select *,
row_number() over(partition by year, gender order by total_score desc) as rank_
from main
)
select
gender,
player,
year,
total_score
from ranking where rank_ = 1
Filter on the years first.
Then sum the score per player and year, to cover the case where the same player played multiple matches in the same year.
Then rank the players by total score within each year and gender.
Finally, filter on rank_ = 1, since that row has the highest score.
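The steps above can be sketched end to end with Python's sqlite3 module. The rows below are made up for illustration (they are not the asker's data), and SQLite 3.25 or newer is assumed for window functions. Michael has two rows in 2010, so summing first matters: Simon has the highest single score, but Michael has the highest total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE football (country TEXT, gender TEXT, player TEXT, score INTEGER, year INTEGER)"
)
# Hypothetical sample rows, including one outside the 2010-2012 range.
conn.executemany(
    "INSERT INTO football VALUES (?, ?, ?, ?, ?)",
    [
        ("Germany", "male", "Michael", 10, 2010),
        ("Germany", "male", "Michael", 14, 2010),
        ("Austria", "male", "Simon", 20, 2010),
        ("Germany", "female", "Mara", 16, 2010),
        ("Austria", "male", "Simon", 19, 2011),
        ("Germany", "male", "Michael", 30, 1990),
    ],
)

top_by_total = conn.execute("""
    WITH main AS (
        -- total per player and year, restricted to the requested years
        SELECT gender, player, year, SUM(score) AS total_score
        FROM football
        WHERE year BETWEEN 2010 AND 2012
        GROUP BY gender, player, year
    ),
    ranking AS (
        -- rank players inside each (year, gender) group by their total
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY year, gender
                                  ORDER BY total_score DESC) AS rank_
        FROM main
    )
    SELECT gender, player, year, total_score
    FROM ranking
    WHERE rank_ = 1
    ORDER BY year, gender
""").fetchall()

for row in top_by_total:
    print(row)
```

Note that ROW_NUMBER() takes no arguments, and that it picks an arbitrary winner when two players tie on total score.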
You can use the dense_rank function to achieve this, if you are using sqlite version 3.25 or higher.
Query
select t.* from(
select *, dense_rank() over(
partition by year, gender
order by score desc
) as rn
from football
where year in ('2010','2011','2012')
) as t
where t.rn = 1;
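The practical difference between dense_rank and row_number shows up when two players tie. The sketch below uses invented rows with a deliberate tie and runs both variants through Python's sqlite3 module (SQLite 3.25+ assumed). The tie-breaker on player name in the row_number variant is an addition for determinism, not part of the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE football (country TEXT, gender TEXT, player TEXT, score INTEGER, year INTEGER)"
)
# Hypothetical rows: Michael and Simon tie for the male top score in 2010.
conn.executemany(
    "INSERT INTO football VALUES (?, ?, ?, ?, ?)",
    [
        ("Germany", "male", "Michael", 24, 2010),
        ("Austria", "male", "Simon", 24, 2010),
        ("Germany", "female", "Mila", 16, 2010),
        ("Austria", "female", "Simona", 15, 2010),
    ],
)

# dense_rank gives every tied player rank 1, so all of them are returned.
with_ties = conn.execute("""
    SELECT gender, player, score, year FROM (
        SELECT *, DENSE_RANK() OVER (PARTITION BY year, gender
                                     ORDER BY score DESC) AS rn
        FROM football
        WHERE year IN (2010, 2011, 2012)
    ) AS t
    WHERE t.rn = 1
    ORDER BY gender, player
""").fetchall()

# row_number returns exactly one row per group; the player name breaks ties.
one_per_group = conn.execute("""
    SELECT gender, player, score, year FROM (
        SELECT *, ROW_NUMBER() OVER (PARTITION BY year, gender
                                     ORDER BY score DESC, player) AS rn
        FROM football
        WHERE year IN (2010, 2011, 2012)
    ) AS t
    WHERE t.rn = 1
    ORDER BY gender, player
""").fetchall()

print(with_ties)
print(one_per_group)
```

Which one is right depends on whether joint top scorers should all be listed.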

Order table by the total count but do not lose the order by names

I have a table, consisting of 3 columns (Person, Year and Count), so for each person, there are several rows with different years and counts and the final row with total count. I want to keep the table ordered by Name, but also order it by the total count.
So the rows should be ordered by sum, but also grouped by the Person and ordered by year. When I am trying to order by sum, of course, both person and years are messed up. Is there a way to sort like this?
You've stored those "total" rows as well? Gosh! Why did you do that?
Anyway: if you
compute rank for rows whose year column is equal to 'total' and
add case expression into the order by clause,
you might get what you want:
SQL> with sorter as
2 (select name, cnt,
3 rank() over (order by cnt) rnk
4 from test
5 where year = 'total'
6 )
7 select t.*
8 from test t join sorter s on s.name = t.name
9 order by s.rnk, case when year = 'total' then '9'
10 else year
11 end;
NAME YEAR CNT
---- ----- ----------
John 2018 3
John 2019 2
John total 5
Bob 2017 2
Bob 2019 4
Bob total 6
6 rows selected.
SQL>
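For readers without an Oracle instance, the same idea can be tried with Python's sqlite3 module. This is only a sketch: the table layout (a text year column that also holds 'total') is inferred from the output above, the rows are inserted in scrambled order on purpose, and SQLite 3.25+ is assumed for rank().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# year is text because it also stores the literal 'total' (assumed layout).
conn.execute("CREATE TABLE test (name TEXT, year TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        ("Bob", "total", 6),
        ("John", "2019", 2),
        ("Bob", "2017", 2),
        ("John", "total", 5),
        ("Bob", "2019", 4),
        ("John", "2018", 3),
    ],
)

ordered = conn.execute("""
    WITH sorter AS (
        -- one rank per person, based only on the 'total' rows
        SELECT name, RANK() OVER (ORDER BY cnt) AS rnk
        FROM test
        WHERE year = 'total'
    )
    SELECT t.name, t.year, t.cnt
    FROM test t
    JOIN sorter s ON s.name = t.name
    ORDER BY s.rnk,
             -- '9' sorts after any four-digit year string, so 'total' comes last
             CASE WHEN t.year = 'total' THEN '9' ELSE t.year END
""").fetchall()

for row in ordered:
    print(row)
```

If two people can have the same total, their ranks tie and their rows may interleave; adding t.name right after s.rnk in the ORDER BY keeps each person's rows together.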

SQL oracle with joining tables and Max functions

Some help please? Just a noob here starting to learn how to write SQL and ran into this problem. I know how to use the MAX function but I can't figure out how to join all these requirements together. I have two tables, Accounts and Books (below is an example of the data)
Accounts
ID Series YesorNot Date Filed Plan Year
1 123 Yes 06/12/2015 2015
2 123 No 06/12/2015 2015
3 145 Yes 06/06/2015 2015
4 145 No 02/02/2015 2014
5 198 Yes 02/03/2015 2015
6 187 Yes 02/14/2013 2013
7 153 Yes 01/02/2011 2011
Books
Primary Key Date Created ID
1 06/13/2015 123
2 06/12/2015 123
3 06/07/2015 145
4 02/02/2015 145
5 02/03/2015 198
Two tables: Accounts and Books
Looking for:
1. Data that exists in both tables by the Project ID = Primary Key
2. I only want one unique Series (Series also = ID)
3. I want the MAX (most recent) value of Plan Year, and then if there are duplicates for Plan Year, I need the MAX (most recent) value of Date Created.
4. I just need the columns Project ID, Series, YesorNot, Date Filed, Plan Year so my output should be like this:
Project ID Series YesorNot Date Filed Plan Year
1 123 Yes 06/12/2015 2015
3 145 Yes 06/06/2015 2015
4 145 No 02/02/2015 2014
5 198 Yes 02/03/2015 2015
First join the tables:
SELECT B.Primary_Key as Project_ID, A.Series, A.YesorNot, A.Date_Filed, A.Plan_Year
FROM Books B
JOIN Accounts A ON B.ID = A.Series
You should have been able to get this far on your own (and you should have posted it as part of the question); if you can't, I'd say find a different career. Assuming you could, now for the slightly harder part.
Now we add a row number based on your criteria
ROW_NUMBER() OVER (PARTITION BY B.Primary_Key, A.Series, A.YesorNot, A.Date_Filed ORDER BY A.Plan_Year DESC, B.Date_Created DESC) AS RN
Now just take the first of the row number.
SELECT Project_ID, Series, YesorNot, Date_Filed, Plan_Year
FROM (
SELECT B.Primary_Key as Project_ID, A.Series, A.YesorNot, A.Date_Filed, A.Plan_Year,
ROW_NUMBER() OVER (PARTITION BY B.Primary_Key, A.Series, A.YesorNot, A.Date_Filed ORDER BY A.Plan_Year DESC, B.Date_Created DESC) AS RN
FROM Books B
JOIN Accounts A ON B.ID = A.Series
) X
WHERE RN = 1

How to group two fields together in SQL?

Say I have a SQL query that currently returns all soccer players who have played during each year, like so:
name year goals
john 2010 1
john 2006 2
john 2006 8
fred 2006 1
But I want the result to be grouped by the years they played, but do not compress player names if they are from different years, like so:
name year goals
john 2010 1
john 2006 10 <--- This is compressed, but there are still 2 johns
fred 2006 1 since they are from different years
say I have done this so far.
(select name, year, goals
from table) as T
If I just do
select *
from
(select name, year, goals
from table) as T
group by year;
Fred will disappear, but if I do "group by name", there are only 1 john left. Any help?
select name, year, sum(goals) as totalgoals
from table
group by name, year
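A quick way to see why grouping by both columns works is to run it on the sample rows. The sketch below uses Python's sqlite3 module; the table name player_goals is a stand-in, since the question's placeholder name, table, is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# player_goals is a hypothetical name standing in for the question's table.
conn.execute("CREATE TABLE player_goals (name TEXT, year INTEGER, goals INTEGER)")
conn.executemany(
    "INSERT INTO player_goals VALUES (?, ?, ?)",
    [
        ("john", 2010, 1),
        ("john", 2006, 2),
        ("john", 2006, 8),
        ("fred", 2006, 1),
    ],
)

# One output row per distinct (name, year) pair; goals are summed within each pair.
totals = conn.execute("""
    SELECT name, year, SUM(goals) AS totalgoals
    FROM player_goals
    GROUP BY name, year
    ORDER BY name, year
""").fetchall()

for row in totals:
    print(row)
```

The two john rows from 2006 collapse into one row with 10 goals, while john 2010 and fred 2006 stay separate because their (name, year) pairs differ.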

Retrieve highest value from sql table

How can I retrieve this data:
Name Title Profit
Peter CEO 2
Robert A.D 3
Michael Vice 5
Peter CEO 4
Robert Admin 5
Robert CEO 13
Adrin Promotion 8
Michael Vice 21
Peter CEO 3
Robert Admin 15
to get this:
Peter........4
Robert.......15
Michael......21
Adrin........8
I want to get the highest profit value from each name.
If there are multiple equal names always take the highest value.
select name,max(profit) from table group by name
Since this type of request almost always follows with "now can I include the title?" - here is a query that gets the highest profit for each name but can include all the other columns without grouping or applying arbitrary aggregates to those other columns:
;WITH x AS
(
SELECT Name, Title, Profit, rn = ROW_NUMBER()
OVER (PARTITION BY Name ORDER BY Profit DESC)
FROM dbo.table
)
SELECT Name, Title, Profit
FROM x
WHERE rn = 1;
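To try the ROW_NUMBER approach without SQL Server, here is a sketch using Python's sqlite3 module and the rows from the question. Two things differ from the query above and are assumptions of this sketch: the table is called profits (a hypothetical name), and the alias is written as AS rn, because the rn = ... form is T-SQL syntax. SQLite 3.25+ is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# profits is a hypothetical table name used for this illustration.
conn.execute("CREATE TABLE profits (Name TEXT, Title TEXT, Profit INTEGER)")
conn.executemany(
    "INSERT INTO profits VALUES (?, ?, ?)",
    [
        ("Peter", "CEO", 2),
        ("Robert", "A.D", 3),
        ("Michael", "Vice", 5),
        ("Peter", "CEO", 4),
        ("Robert", "Admin", 5),
        ("Robert", "CEO", 13),
        ("Adrin", "Promotion", 8),
        ("Michael", "Vice", 21),
        ("Peter", "CEO", 3),
        ("Robert", "Admin", 15),
    ],
)

# Number each person's rows from highest to lowest profit, then keep row 1.
best_rows = conn.execute("""
    WITH x AS (
        SELECT Name, Title, Profit,
               ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Profit DESC) AS rn
        FROM profits
    )
    SELECT Name, Title, Profit
    FROM x
    WHERE rn = 1
    ORDER BY Name
""").fetchall()

for row in best_rows:
    print(row)
```

Because the whole row survives, Title comes from the same record as the maximum Profit, which a plain MAX with GROUP BY Name cannot guarantee.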