SQL Query to find winners within a contest - sql

I have a query which returns to me results as follows:
Race | Candidate | Total Votes | MaxNoOfWinners
---------------------------------------------------
1 | 1 | 5000 | 3
1 | 2 | 6700 | 3
2 | 1 | 100 | 3
2 | 2 | 200 | 3
2 | 3 | 300 | 3
2 | 4 | 400 | 3
...
I was wondering if there was a query that could be written to return only the winners (based on the MaxNoOfWinners and TotalVotes) for a certain race. So for the above I would only get back:
Race | Candidate | Total Votes | MaxNoOfWinners
---------------------------------------------------
1 | 1 | 5000 | 3
1 | 2 | 6700 | 3
2 | 2 | 200 | 3
2 | 3 | 300 | 3
2 | 4 | 400 | 3
...

Here is a solution. I did not test it, so there may be typos. The idea is to use the RANK() function of SQL Server to rank candidates within each Race by votes and exclude those that don't meet the criteria. Note that using RANK() rather than ROW_NUMBER() will include ties in the result.
WITH RankedResult AS
(
SELECT Race, Candidate, [Total Votes], MaxNoOfWinners, RANK ( ) OVER (PARTITION BY Race ORDER BY [Total Votes] DESC) AS aRank
FROM Results
)
SELECT Race, Candidate, [Total Votes], MaxNoOfWinners
FROM RankedResult
WHERE aRank <= MaxNoOfWinners
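To make the note about ties concrete, here is a minimal sketch that runs the same CTE against an in-memory SQLite database (used only as a stand-in for SQL Server; it needs SQLite 3.25+ for window functions). The race 9 data and the column name TotalVotes (instead of [Total Votes]) are assumptions made up for the illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Results (Race INT, Candidate INT, TotalVotes INT, MaxNoOfWinners INT)"
)
conn.executemany(
    "INSERT INTO Results VALUES (?, ?, ?, ?)",
    # Hypothetical race 9: one seat, two candidates tied for first place.
    [(9, 1, 500, 1), (9, 2, 500, 1), (9, 3, 100, 1)],
)


def winners(window_function):
    """Return winning candidates using the given ranking function name."""
    sql = f"""
        WITH RankedResult AS (
            SELECT Race, Candidate, MaxNoOfWinners,
                   {window_function}() OVER (PARTITION BY Race
                                             ORDER BY TotalVotes DESC) AS aRank
            FROM Results
        )
        SELECT Candidate
        FROM RankedResult
        WHERE aRank <= MaxNoOfWinners
        ORDER BY Candidate
    """
    return [row[0] for row in conn.execute(sql)]


rank_winners = winners("RANK")              # both tied candidates get rank 1
row_number_winners = winners("ROW_NUMBER")  # only one of the tied rows gets 1

print(rank_winners, row_number_winners)
```

With RANK() both tied candidates are returned even though only one seat exists; with ROW_NUMBER() exactly one is returned, and which one is not deterministic unless a tie-breaker is added to the ORDER BY.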

Here's a complete working sample that assumes two tables, race and candidate:
Create Table #Race(Race_id int , MaxNoOfwinners int )
INSERT INTO #Race (Race_id , MaxNoOfwinners)
VALUES (1,3),
(2,3),
(3,1)
CREATE TABLE #Candidate (CandidateID int , Race_ID int , Total_Votes int )
INSERT INTO #Candidate (CandidateID , Race_ID , Total_Votes )
VALUES (1,1,5000),
(2,1,6700),
(1,2,100),
(2,2,200),
(3,2,300),
(4,2,400),
(1,3,42),
(2,3,22)
;WITH CTE as (
SELECT
RANK() OVER(PARTITION BY race_id ORDER BY total_votes DESC ) num,
CandidateID , Race_ID , Total_Votes
From
#Candidate)
SELECT * FROM cte inner join #Race r
on CTE.Race_ID = r.Race_id
and num <= r.MaxNoOfwinners
DROP TABLE #Race
DROP TABLE #Candidate
With the following results
num CandidateID Race_ID Total_Votes Race_id MaxNoOfwinners
-------------------- ----------- ----------- ----------- ----------- --------------
1 2 1 6700 1 3
2 1 1 5000 1 3
1 4 2 400 2 3
2 3 2 300 2 3
3 2 2 200 2 3
1 1 3 42 3 1

WITH q0 AS (SELECT qry.*, rank() OVER (PARTITION BY race ORDER BY total_votes DESC) AS r
FROM qry)
SELECT q0.race, q0.candidate, q0.total_votes FROM q0 WHERE r <= q0.max_winners;

Related

Adding a row number respecting the order of each row

I have a table like this
id, period, tag
1 1 A
1 2 A
1 3 B
1 4 A
1 5 A
1 6 A
2 1 A
2 2 B
2 3 B
2 4 B
2 5 B
2 6 A
I would like to add a new column with a ranking, respecting the order of the row given my column 'period' to obtain something like this
id, period, tag rank
1 1 A 1
1 2 A 1
1 3 B 2
1 4 A 3
1 5 A 3
1 6 A 3
2 1 A 1
2 2 B 2
2 3 B 2
2 4 B 2
2 5 B 2
2 6 A 3
What can I do?
I tried the rank and dense_rank functions without any success.
This is another candidate for CONDITIONAL_CHANGE_EVENT() (a Vertica analytic function):
less code, and quite effective, too.
WITH
input(id,period,tag) AS (
SELECT 1,1,'A'
UNION ALL SELECT 1,2,'A'
UNION ALL SELECT 1,3,'B'
UNION ALL SELECT 1,4,'A'
UNION ALL SELECT 1,5,'A'
UNION ALL SELECT 1,6,'A'
UNION ALL SELECT 2,1,'A'
UNION ALL SELECT 2,2,'B'
UNION ALL SELECT 2,3,'B'
UNION ALL SELECT 2,4,'B'
UNION ALL SELECT 2,5,'B'
UNION ALL SELECT 2,6,'A'
)
SELECT
*
, CONDITIONAL_CHANGE_EVENT(tag) OVER(PARTITION BY id ORDER BY period) + 1 AS rank
FROM input;
-- out id | period | tag | rank
-- out ----+--------+-----+------
-- out 1 | 1 | A | 1
-- out 1 | 2 | A | 1
-- out 1 | 3 | B | 2
-- out 1 | 4 | A | 3
-- out 1 | 5 | A | 3
-- out 1 | 6 | A | 3
-- out 2 | 1 | A | 1
-- out 2 | 2 | B | 2
-- out 2 | 3 | B | 2
-- out 2 | 4 | B | 2
-- out 2 | 5 | B | 2
-- out 2 | 6 | A | 3
-- out (12 rows)
-- out
-- out Time: First fetch (12 rows): 14.823 ms. All rows formatted: 14.874 ms
One method is a cumulative sum based on a lag():
select t.*,
sum(case when prev_tag = tag then 0 else 1 end) over (partition by id order by period) as rank
from (select t.*, lag(tag) over (partition by id order by period) as prev_tag
from t
) t;
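The lag-based approach above can be checked with a small sketch. It runs the same query shape against an in-memory SQLite database (an assumption for the demo; SQLite 3.25+ is needed for window functions). The output column is named rnk and the subquery alias lagged is hypothetical. On the first row of each id, prev_tag is NULL, so the comparison is not true and the CASE falls through to 1, which starts the counter at 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INT, period INT, tag TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 1, "A"), (1, 2, "A"), (1, 3, "B"), (1, 4, "A"), (1, 5, "A"), (1, 6, "A"),
        (2, 1, "A"), (2, 2, "B"), (2, 3, "B"), (2, 4, "B"), (2, 5, "B"), (2, 6, "A"),
    ],
)

sql = """
    SELECT id, period, tag,
           -- running count of rows whose tag differs from the previous row's tag
           SUM(CASE WHEN prev_tag = tag THEN 0 ELSE 1 END)
               OVER (PARTITION BY id ORDER BY period) AS rnk
    FROM (SELECT t.*,
                 LAG(tag) OVER (PARTITION BY id ORDER BY period) AS prev_tag
          FROM t) AS lagged
    ORDER BY id, period
"""
ranks = [row[3] for row in conn.execute(sql)]
print(ranks)
```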

Get sum over a column value that is determined by two other column values in the same table

I have the following table MY_TABLE
ID | SEQ | TYPE | VAL
1 | 2 | A | 100
1 | 3 | A | 100
1 | 2 | B | 200
1 | 3 | A | 100
1 | 3 | B | 200
2 | 25 | X | 100
2 | 24 | Y | 200
2 | 24 | X | 300
2 | 25 | Y | 400
2 | 25 | X | 50
Here in MY_TABLE, each ID has a set of SEQ values and TYPE values. For each ID, I want the sum of VAL per TYPE over only the rows that have that ID's max(SEQ).
Expected output:
ID| SEQ | TYPE | SUM(VAL)
1 | 3 | A | 200 <- 100 + 100
1 | 3 | B | 200
2 | 25 | X | 150 <- 100 + 50
2 | 25 | Y | 400
What I tried:
-- this sub query finds the max(seq) for each ID
with max_seq as (
select id, max(seq) max_seq
from my_table t
group by id)
-- select query on my_table
select
bd.id,
bd.seq,
bd.type,
sum(bd.val)
from my_table bd
-- joining on id-max_seq pair
inner join max_seq
on
(max_seq.id = bd.id)
and
(max_seq.max_seq = bd.seq)
-- sum(val) per ID, MAX(SEQ), TYPE
group by bd.id, bd.seq, bd.type;
Question:
The above query works well for smaller tables but gets slower when the table is bigger. Is there an efficient way of getting this output? (Maybe without using two joins on the same table with a sub query?)
You could avoid the self-join by using a subquery which gets a ranking for each row based on the id and seq:
select id, seq, type, sum(val)
from (
select id, seq, type, val, rank() over (partition by id order by seq desc) as rnk
from my_table
)
where rnk = 1
group by id, seq, type
order by id, seq, type;
ID SEQ T SUM(VAL)
---------- ---------- - ----------
1 3 A 200
1 3 B 200
2 25 X 150
2 25 Y 400
Because of the order by seq desc, the rnk value is 1 for the highest seq for each id. The outer query then just filters on rnk = 1, limiting the output and the aggregation to those lowest-rank (highest-seq) rows.
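A quick way to sanity-check this is to run the query on the sample rows. The sketch below uses an in-memory SQLite database as a stand-in (an assumption; the original appears to target Oracle, and SQLite 3.25+ is needed for window functions). The alias ranked is hypothetical. It also shows why rank() matters here: every row sharing the highest seq gets rnk = 1, whereas row_number() would keep only one row per id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INT, seq INT, type TEXT, val INT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 2, "A", 100), (1, 3, "A", 100), (1, 2, "B", 200),
        (1, 3, "A", 100), (1, 3, "B", 200),
        (2, 25, "X", 100), (2, 24, "Y", 200), (2, 24, "X", 300),
        (2, 25, "Y", 400), (2, 25, "X", 50),
    ],
)

sql = """
    SELECT id, seq, type, SUM(val)
    FROM (SELECT id, seq, type, val,
                 RANK() OVER (PARTITION BY id ORDER BY seq DESC) AS rnk
          FROM my_table) AS ranked
    WHERE rnk = 1              -- all rows tied at the highest seq per id
    GROUP BY id, seq, type
    ORDER BY id, seq, type
"""
totals = conn.execute(sql).fetchall()
print(totals)
```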

Check the first value according to first date

I have two tables
guid | id | Status
-----| -----| ----------
1 | 123 | 0
2 | 456 | 3
3 | 789 | 0
The other table is
id | modified date | Status
------| --------------| ----------
1 | 26-08-2017 | 3
1 | 27-08-2017 | 0
1 | 01-09-2017 | 0
1 | 02-09-2017 | 0
2 | 26-08-2017 | 3
2 | 01-09-2017 | 0
2 | 02-09-2017 | 3
3 | 01-09-2017 | 0
3 | 02-09-2017 | 3
3 | 03-09-2017 | 0
Every time the status in the first table changes for an id, a row with the modified date and status is also added to the second table. For example, the status for id 1 was changed 4 times.
I want to join both tables and select those ids whose status was 0 on their first modified date.
In this example it should return only id 3, because only id 3 has a status value of 0 on its first modified date, 01-09-2017. Ids 1 and 2 have the value 3 on their first modified date.
Any help is appreciated.
Try the query below (assuming the first table is A and the second table is B):
;with cte as (
Select a.id, b.Status, row_number() over(partition by a.id order by [modified date] asc) row_num
from A
inner join B
on a.id = b.id
)
Select * from cte where
status = 0 and row_num = 1
I think this will do what you're looking for.
WITH cte
AS (SELECT id
, ROW_NUMBER() OVER (PARTITION BY (id) ORDER BY [modified date]) RN
, Status
FROM SecondTable
)
SELECT *
FROM FirstTable
JOIN cte ON FirstTable.id = cte.id
AND RN = 1
WHERE cte.Status = 0
Just expand out the * and return what fields you need.
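Here is a minimal sketch of the ROW_NUMBER() idea, restricted to the second table so it does not depend on how the two tables are keyed. It uses an in-memory SQLite database as a stand-in for SQL Server (an assumption; SQLite 3.25+ is required), renames the column to modified_date, and stores the dates as ISO yyyy-mm-dd strings so that they sort chronologically. Strings in dd-mm-yyyy form would not sort correctly, so a real date type (or ISO text) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SecondTable (id INT, modified_date TEXT, Status INT)")
conn.executemany(
    "INSERT INTO SecondTable VALUES (?, ?, ?)",
    [
        (1, "2017-08-26", 3), (1, "2017-08-27", 0), (1, "2017-09-01", 0), (1, "2017-09-02", 0),
        (2, "2017-08-26", 3), (2, "2017-09-01", 0), (2, "2017-09-02", 3),
        (3, "2017-09-01", 0), (3, "2017-09-02", 3), (3, "2017-09-03", 0),
    ],
)

sql = """
    WITH cte AS (
        SELECT id, Status,
               -- rn = 1 marks the earliest modification for each id
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY modified_date) AS rn
        FROM SecondTable
    )
    SELECT id FROM cte WHERE rn = 1 AND Status = 0
"""
first_zero_ids = [row[0] for row in conn.execute(sql)]
print(first_zero_ids)
```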

Postgresql - Return (N) rows for each ID

I have a table like this
contact_id | phone_number
1 | 55551002
1 | 55551003
1 | 55551000
2 | 55552001
2 | 55552008
2 | 55552003
2 | 55552007
3 | 55553001
3 | 55553002
3 | 55553009
3 | 55553004
4 | 55554000
I want to return only 3 numbers of each contact_id, order by phone_number, like this:
contact_id | phone_number
1 | 55551000
1 | 55551002
1 | 55551003
2 | 55552001
2 | 55552003
2 | 55552007
3 | 55553001
3 | 55553002
3 | 55553004
4 | 55554000
Please, it needs to be an optimized query.
My Query
SELECT a.cod_cliente, count(a.telefone) as qtd
FROM crm.contatos a
LEFT JOIN (
SELECT *
FROM crm.contatos b
LIMIT 3
) AS sub_contatos ON sub_contatos.cod_contato = a.cod_cliente
group by a.cod_cliente;
This type of query can easily be solved using window functions:
select contact_id, phone_number
from (
select contact_id, phone_number,
row_Number() over (partition by contact_id order by phone_number) as rn
from crm.contatos
) t
where rn <= 3
order by contact_id, phone_number;
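The window-function query can be tried on the sample data with a short sketch. PostgreSQL is replaced by an in-memory SQLite database here (an assumption for the demo; SQLite 3.25+ is required), and the table is called contatos without the crm schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contatos (contact_id INT, phone_number INT)")
conn.executemany(
    "INSERT INTO contatos VALUES (?, ?)",
    [
        (1, 55551002), (1, 55551003), (1, 55551000),
        (2, 55552001), (2, 55552008), (2, 55552003), (2, 55552007),
        (3, 55553001), (3, 55553002), (3, 55553009), (3, 55553004),
        (4, 55554000),
    ],
)

sql = """
    SELECT contact_id, phone_number
    FROM (SELECT contact_id, phone_number,
                 ROW_NUMBER() OVER (PARTITION BY contact_id
                                    ORDER BY phone_number) AS rn
          FROM contatos) AS t
    WHERE rn <= 3              -- keep at most three numbers per contact
    ORDER BY contact_id, phone_number
"""
top_three = conn.execute(sql).fetchall()
print(top_three)
```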

SQL Server : query grouping

I have a question about a query in SQL Server. I have two tables:
keyword_text
Keyword_relate
Columns in keyword_text:
key_id
keywords
Columns in keyword_relate:
key_id
product_id
score
status
Sample data for keyword_text:
key_id | keywords
-------|----------
1 | Pencil
2 | Pen
3 | Books
Sample data for keyword_relate:
----------------------------
Sno| Product | SCore|status
---------------------------
1 | 124 | 2 | 1
1 | 125 | 3 | 1
2 | 124 | 3 | 1
2 | 125 | 2 | 1
From this I want to get the product_id, grouped by keywords and which have maximum score
Presuming that key_id in the first table corresponds to Sno in the second table, you can use ROW_NUMBER:
WITH CTE AS
(
SELECT Product AS ProductID, Score As MaxScore,
RN = ROW_NUMBER() OVER (PARTITION BY kt.key_id ORDER BY Score DESC)
FROM keyword_text kt INNER JOIN keyword_relate kr
ON kt.key_id = kr.Sno
)
SELECT ProductID, MaxScore
FROM CTE
WHERE RN = 1
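As a rough check, the sketch below runs a variant of this query on the sample data using an in-memory SQLite database (a stand-in for SQL Server; SQLite 3.25+ is required). It keeps the same presumption that Sno matches key_id, and additionally selects the keyword so each result row can be attributed; that extra column is an addition for illustration, not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE keyword_text (key_id INT, keywords TEXT)")
conn.execute("CREATE TABLE keyword_relate (Sno INT, Product INT, Score INT, status INT)")
conn.executemany(
    "INSERT INTO keyword_text VALUES (?, ?)",
    [(1, "Pencil"), (2, "Pen"), (3, "Books")],
)
conn.executemany(
    "INSERT INTO keyword_relate VALUES (?, ?, ?, ?)",
    [(1, 124, 2, 1), (1, 125, 3, 1), (2, 124, 3, 1), (2, 125, 2, 1)],
)

sql = """
    WITH CTE AS (
        SELECT kt.key_id, kt.keywords, kr.Product AS ProductID, kr.Score AS MaxScore,
               ROW_NUMBER() OVER (PARTITION BY kt.key_id
                                  ORDER BY kr.Score DESC) AS RN
        FROM keyword_text kt
        INNER JOIN keyword_relate kr ON kt.key_id = kr.Sno
    )
    SELECT keywords, ProductID, MaxScore
    FROM CTE
    WHERE RN = 1               -- highest-scoring product per keyword
    ORDER BY key_id
"""
best_products = conn.execute(sql).fetchall()
print(best_products)
```

Keywords with no related rows (Books here) drop out because of the INNER JOIN, and if two products tie on score ROW_NUMBER() keeps only one of them.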