Rank Visits SQL Server 2014 - sql

I have a sample table of doctor visits by ID. I'm looking to rank the problems by age, partitioned by ID, so I can do some statistical calculations on the 2nd and 3rd visit for the same problem by ID. Please note: I have a larger dataset, so I'm looking for something that will handle that.
So far I have
SELECT
ID, Age, Problem, COUNT(Problem) AS cnt,
RANK() OVER (PARTITION BY id ORDER BY Problem, Age ASC) AS rnk
FROM
#Test1
GROUP BY
ID, Problem, Age
ORDER BY
Age ASC
The code runs but the rank is not properly calculated. Please help.

As I understand it, you need to partition by ID and Problem:
CREATE TABLE #Test1 (ID int, Problem nvarchar(20), Age int)
INSERT INTO #Test1
VALUES
(1,'Arm',50),
(1,'Arm',52),
(1,'Foot',54),
(1,'Tongue',55),
(1,'Arm',59),
(2,'Toe',60),
(2,'Toe',60),
(2,'Arm',61),
(3,'Tooth',75),
(3,'Tooth',76),
(3,'Knee',78)
SELECT
ID,
Age,
Problem,
COUNT(*) OVER (PARTITION BY ID, Problem, Age) as cnt,
RANK() OVER (PARTITION BY ID, Problem ORDER BY Age) as rnk
FROM #Test1 AS t
ORDER BY t.Age
DROP TABLE #Test1
With this solution, the two identical rows (2,'Toe',60) both get rank = 1, because RANK assigns the same value to ties. To number them 1 and 2 instead, replace RANK with ROW_NUMBER.
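To see the tie behaviour side by side, here is a minimal sketch that runs both functions over the rows for ID 2. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or later), and the table name visits is hypothetical.

```python
import sqlite3

# In-memory SQLite database standing in for the #Test1 temp table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER, problem TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(2, "Toe", 60), (2, "Toe", 60), (2, "Arm", 61)],
)

# Same partition and ordering for both functions; only the tie handling differs.
rows = conn.execute(
    """
    SELECT id, problem, age,
           RANK()       OVER (PARTITION BY id, problem ORDER BY age) AS rnk,
           ROW_NUMBER() OVER (PARTITION BY id, problem ORDER BY age) AS rn
    FROM visits
    ORDER BY problem, rn
    """
).fetchall()

for row in rows:
    print(row)
# (2, 'Arm', 61, 1, 1)
# (2, 'Toe', 60, 1, 1)
# (2, 'Toe', 60, 1, 2)
```

Both Toe rows share rnk = 1, while rn numbers them 1 and 2, which is what you need to pick out a "2nd visit".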

I believe you want row_number() instead of rank():
select
id
, Age
, Problem
, cnt = count(*) over (partition by id, Problem)
, rnk = row_number() over (partition by id, Problem order by Age)
from t
order by id, Age, Problem
test setup: http://rextester.com/DUWG50873
returns:
+----+-----+---------+-----+-----+
| id | Age | Problem | cnt | rnk |
+----+-----+---------+-----+-----+
| 1 | 50 | Arm | 3 | 1 |
| 1 | 52 | Arm | 3 | 2 |
| 1 | 54 | Foot | 1 | 1 |
| 1 | 55 | Tongue | 1 | 1 |
| 1 | 59 | Arm | 3 | 3 |
| 2 | 60 | Toe | 2 | 1 |
| 2 | 60 | Toe | 2 | 2 |
| 2 | 61 | Arm | 1 | 1 |
| 3 | 75 | Tooth | 2 | 1 |
| 3 | 76 | Tooth | 2 | 2 |
| 3 | 78 | Knee | 1 | 1 |
+----+-----+---------+-----+-----+

Related

Get the position of X user in the ranking

I have these tables
RANKING
+-----------+----------+
| id_users | points |
+-----------+----------+
| 1 | 27 | //3rd
| 2 | 55 | //1st
| 3 | 9 | //5th
| 4 | 14 | //4th
| 5 | 38 | //2nd
+-----------+----------+
I would like to retrieve user's data along with its ranking position, filtering by id. So for example if I want info for id 3 I should get
+----------+--------+---------------+
| id_users | points | rank_position |
+----------+--------+---------------+
| 3 | 9 | 5 |
+----------+--------+---------------+
My query actually has the following:
SELECT
ROW_NUMBER() OVER (ORDER BY points ASC) AS RowNum,
id_users
FROM
RANKING
And I don't know how to continue
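If you use ROW_NUMBER(), you need a subquery, because the row number has to be computed over the whole table before you filter by id. Order by points DESC so that the highest score gets position 1:
SELECT r.*
FROM (SELECT r.*,
ROW_NUMBER() OVER (ORDER BY points DESC) AS rank_position
FROM RANKING r
) r
WHERE id_users = 3;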
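As a quick check of the subquery approach against the sample data, here is a small sketch using Python's built-in sqlite3 module (an assumption for portability; it needs SQLite 3.25 or later). The helper name rank_position is hypothetical, and the window is ordered by points DESC so that the highest score gets position 1, as in the question's expected output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranking (id_users INTEGER, points INTEGER)")
conn.executemany(
    "INSERT INTO ranking VALUES (?, ?)",
    [(1, 27), (2, 55), (3, 9), (4, 14), (5, 38)],
)

def rank_position(user_id):
    """Return (id_users, points, rank_position) for one user.

    The row number is computed over the whole table in the subquery;
    the filter on id_users is applied only afterwards, in the outer query.
    """
    return conn.execute(
        """
        SELECT id_users, points, rank_position
        FROM (SELECT id_users, points,
                     ROW_NUMBER() OVER (ORDER BY points DESC) AS rank_position
              FROM ranking) AS r
        WHERE id_users = ?
        """,
        (user_id,),
    ).fetchone()

print(rank_position(3))  # (3, 9, 5)
```

Putting the WHERE clause inside the subquery instead would filter first and always yield position 1, which is why the filter sits in the outer query.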

Joining 2 unrelated tables together

I have just delved into PostgreSQL and am currently trying to practice an unorthodox query whereby I want to join 2 unrelated tables, each with the same number of rows, together such that every row carries the combined columns of both tables.
These are what I have:
technical table
position | height | technical_id
----------+--------+-------------
Striker | 172 | 3
CAM | 165 | 4
(2 rows)
footballers table
name | age | country | game_id
----------+-----+-----------+--------
Pele | 77 | Brazil | 1
Maradona | 65 | Argentina | 2
(2 rows)
What I have tried:
SELECT name, '' AS position, null AS height, age, country, game_id, null as technical_id
from footballers
UNION
SELECT '' as name, position, height, null AS age,'' AS country, null as game_id, technical_id
from technical;
Output:
name | position | height | age | country | game_id | technical_id
----------+----------+--------+-----+-----------+---------+-------------
| Striker | 172 | | | | 3
| CAM | 165 | | | | 4
Maradona | | | 65 | Argentina | 2 |
Pele | | | 77 | Brazil | 1 |
(4 rows)
What I'm looking for (ideally):
name | position | height | age | country | game_id | technical_id
----------+----------+--------+-----+-----------+---------+-------------
Pele | Striker | 172 | 77 | Brazil | 1 | 3
Maradona | CAM | 165 | 65 | Argentina | 2 | 4
(2 rows)
Please use the query below. However, this is not the right way to design the schema; you should have a foreign key. The rows are paired by numbering each table in order of its id column:
select t1.position,t1.height,t1.technical_id,t2.name,t2.age,t2.country,t2.game_id
from
(select position,height,technical_id, row_number() over(order by
technical_id) as rnk from technical) t1
inner join
(select name,age,country,game_id, row_number() over(order by
game_id) as rnk from footballers) t2
on t1.rnk = t2.rnk;
You don't have a column to join on, so you can generate one. What works is a sequential number generated by row_number(). So:
select *
from (select t.*, row_number() over () as seqnum
from technical t
) t join
(select f.*, row_number() over () as seqnum
from footballers f
) f
using (seqnum);
Note: Postgres has extended the syntax of row_number() so it does not require an order by clause. The ordering of the rows is arbitrary and might change on different runs of the query.
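Because the numbering is arbitrary without an ORDER BY, the pairing is only repeatable if each window is given an explicit sort key. A minimal sketch of that variant follows, run through Python's built-in sqlite3 module rather than PostgreSQL (an assumption for portability; it needs SQLite 3.25 or later). Ordering by technical_id and game_id is also an assumption about how the rows should be paired.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE technical (position TEXT, height INTEGER, technical_id INTEGER);
    CREATE TABLE footballers (name TEXT, age INTEGER, country TEXT, game_id INTEGER);
    INSERT INTO technical VALUES ('Striker', 172, 3), ('CAM', 165, 4);
    INSERT INTO footballers VALUES ('Pele', 77, 'Brazil', 1),
                                   ('Maradona', 65, 'Argentina', 2);
    """
)

# Number each table independently, then join on the generated sequence number.
rows = conn.execute(
    """
    SELECT f.name, t.position, t.height, f.age, f.country, f.game_id, t.technical_id
    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY technical_id) AS seqnum
          FROM technical) AS t
    JOIN (SELECT *, ROW_NUMBER() OVER (ORDER BY game_id) AS seqnum
          FROM footballers) AS f
      ON t.seqnum = f.seqnum
    ORDER BY f.seqnum
    """
).fetchall()

for row in rows:
    print(row)
# ('Pele', 'Striker', 172, 77, 'Brazil', 1, 3)
# ('Maradona', 'CAM', 165, 65, 'Argentina', 2, 4)
```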

Combine PARTITION BY and GROUP BY

I have a (mssql) table like this:
+----+----------+---------+--------+--------+
| id | username | date | scoreA | scoreB |
+----+----------+---------+--------+--------+
| 1 | jim | 01/2020 | 100 | 0 |
| 2 | max | 01/2020 | 0 | 200 |
| 3 | jim | 01/2020 | 0 | 150 |
| 4 | max | 02/2020 | 150 | 0 |
| 5 | jim | 02/2020 | 0 | 300 |
| 6 | lee | 02/2020 | 100 | 0 |
| 7 | max | 02/2020 | 0 | 200 |
+----+----------+---------+--------+--------+
What I need is to get the best "combined" score per date. (By "combined" score I mean the sum of the best scoreA and the best scoreB per user and per date.)
The result should look like this:
+----------+---------+--------------------------------------------+
| username | date | combined_score (max(scoreA) + max(scoreB)) |
+----------+---------+--------------------------------------------+
| jim | 01/2020 | 250 |
| max | 02/2020 | 350 |
+----------+---------+--------------------------------------------+
I came this far:
I can group the scores by user like this:
SELECT
username, (max(scoreA) + max(scoreB)) AS combined_score
FROM score_table
GROUP BY username
ORDER BY combined_score DESC
And I can get the best score per date with PARTITION BY like this:
SELECT *
FROM
(SELECT t.*, row_number() OVER (PARTITION BY date ORDER BY scoreA DESC) rn
FROM score_table t) as tmp
WHERE tmp.rn = 1
ORDER BY date
Is there a proper way to combine these statements and get the result I need? Thank you!
Btw. Don't care about possible ties!
You can combine window functions and aggregate functions like this:
SELECT s.*
FROM (SELECT username, date, (max(scoreA) + max(scoreB)) AS combined_score,
ROW_NUMBER() OVER (PARTITION BY date ORDER BY max(scoreA) + max(scoreB) DESC) as seqnum
FROM score_table
GROUP BY username, date
) s
WHERE seqnum = 1
ORDER BY combined_score DESC;
Note that date needs to be part of the GROUP BY, and the outer WHERE seqnum = 1 keeps only the best user per date.
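The logical order of evaluation is: aggregate per user and date first, then number the aggregated rows within each date. The sketch below spells those two steps out with a CTE instead of a single SELECT, which computes the same result. It runs through Python's built-in sqlite3 module rather than SQL Server (an assumption for portability; it needs SQLite 3.25 or later), and the CTE name per_user is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE score_table "
    "(id INTEGER, username TEXT, date TEXT, scoreA INTEGER, scoreB INTEGER)"
)
conn.executemany(
    "INSERT INTO score_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "jim", "01/2020", 100, 0),
        (2, "max", "01/2020", 0, 200),
        (3, "jim", "01/2020", 0, 150),
        (4, "max", "02/2020", 150, 0),
        (5, "jim", "02/2020", 0, 300),
        (6, "lee", "02/2020", 100, 0),
        (7, "max", "02/2020", 0, 200),
    ],
)

rows = conn.execute(
    """
    WITH per_user AS (
        -- Step 1: one row per (username, date) with the combined score.
        SELECT username, date, MAX(scoreA) + MAX(scoreB) AS combined_score
        FROM score_table
        GROUP BY username, date
    )
    SELECT username, date, combined_score
    FROM (
        -- Step 2: number the aggregated rows within each date, best first.
        SELECT per_user.*,
               ROW_NUMBER() OVER (PARTITION BY date
                                  ORDER BY combined_score DESC) AS seqnum
        FROM per_user
    ) AS s
    WHERE seqnum = 1
    ORDER BY date
    """
).fetchall()

for row in rows:
    print(row)
# ('jim', '01/2020', 250)
# ('max', '02/2020', 350)
```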

How to apply TOP statement to only 1 column while selecting multiple columns from a table?

I am trying to select multiple columns from a table, but I want to select top certain number of records based on one column. I tried this :
select roll_no ,marks as Percentage
from database
where marks in (select top (3) *
from database
where subject = ''
order by marks desc) order by percentage desc
and I am getting the error:
Only one expression can be specified in the select list when the
sub-query is not introduced with EXISTS or more than specified number
of records.
I also tried :
select roll_no ,marks as Percentage
from database
where marks in (select top (3) marks
from database
where subject = ''
order by marks desc) order by percentage desc
which returns the right result for some subjects, but for others it also displays top marks from other subjects.
e.g.:
+---------+-------+
| roll_no | marks |
+---------+-------+
|10003 | 87 |
|10006 | 72 |
|10003 | 72 |
|10002 | 67 |
|10004 | 67 |
+---------+-------+
How to frame the query correctly?
sample data :
+---------+-------+---------+
| roll_no | marks |subject |
+---------+-------+---------+
|10001 | 45 | Maths |
|10001 | 72 | Science |
|10001 | 64 | English |
|10002 | 52 | Maths |
|10002 | 35 | Science |
|10002 | 75 | English |
|10003 | 52 | Maths |
|10003 | 35 | Science |
|10003 | 75 | English |
|10004 | 52 | Maths |
|10004 | 35 | Science |
|10004 | 75 | English |
+---------+-------+---------+
If I'm right and you are looking for the best 3 marks for a given subject, then you can get them with the following:
DECLARE @SelectedSubject VARCHAR(50) = 'Maths'
;WITH FilteredSubjectMarks AS
(
SELECT
D.Subject,
D.Roll_no,
D.Marks,
MarksRanking = DENSE_RANK() OVER (ORDER BY D.Marks DESC)
FROM
[Database] AS D
WHERE
D.Subject = @SelectedSubject
)
SELECT
F.*
FROM
FilteredSubjectMarks AS F
WHERE
F.MarksRanking <= 3
You can use window functions to rank your marks column (specifically dense_rank, which allows duplicate rankings whilst retaining sequential numbering) and then return all rows with a rank of 3 or less:
declare @t table(roll_no int identity(1,1),marks int);
insert into @t(marks) values(2),(4),(5),(8),(6),(1),(3),(2),(1),(8);
with t as
(
select roll_no
,marks
,dense_rank() over (order by marks desc) as r
from @t
)
select *
from t
where r <= 3;
Output:
+---------+-------+---+
| roll_no | marks | r |
+---------+-------+---+
| 4 | 8 | 1 |
| 10 | 8 | 1 |
| 5 | 6 | 2 |
| 3 | 5 | 3 |
+---------+-------+---+
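If the top marks are needed for every subject at once rather than one selected subject, adding PARTITION BY subject to the window restarts the ranking per subject. Below is a minimal sketch on the question's sample data, run through Python's built-in sqlite3 module rather than SQL Server (an assumption for portability; it needs SQLite 3.25 or later). The table name student_marks and the helper top_marks are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_marks (roll_no INTEGER, marks INTEGER, subject TEXT)"
)
conn.executemany(
    "INSERT INTO student_marks VALUES (?, ?, ?)",
    [
        (10001, 45, "Maths"), (10001, 72, "Science"), (10001, 64, "English"),
        (10002, 52, "Maths"), (10002, 35, "Science"), (10002, 75, "English"),
        (10003, 52, "Maths"), (10003, 35, "Science"), (10003, 75, "English"),
        (10004, 52, "Maths"), (10004, 35, "Science"), (10004, 75, "English"),
    ],
)

def top_marks(n):
    """Return every row whose mark is among the n best distinct marks of its subject."""
    return conn.execute(
        """
        SELECT subject, roll_no, marks, r
        FROM (SELECT subject, roll_no, marks,
                     DENSE_RANK() OVER (PARTITION BY subject
                                        ORDER BY marks DESC) AS r
              FROM student_marks) AS ranked
        WHERE r <= ?
        ORDER BY subject, r, roll_no
        """,
        (n,),
    ).fetchall()

for row in top_marks(1):
    print(row)
```

With n = 1 this returns seven rows, because three students tie for the best mark in both English and Maths; DENSE_RANK keeps all tied rows, which is the behaviour the answers above rely on.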

Getting distinct values with the highest value in a specific column

How can I get the highlighted rows from the table below in SQL? (Distinct rows based on User name with the highest Version are highlighted)
In case you need plain text table:
+----+-----------+---+
| 1 | John | 1 |
+----+-----------+---+
| 2 | Brad | 1 |
+----+-----------+---+
| 3 | Brad | 3 |
+----+-----------+---+
| 4 | Brad | 2 |
+----+-----------+---+
| 5 | Jenny | 1 |
+----+-----------+---+
| 6 | Jenny | 2 |
+----+-----------+---+
| 7 | Nick | 4 |
+----+-----------+---+
| 8 | Nick | 1 |
+----+-----------+---+
| 9 | Nick | 3 |
+----+-----------+---+
| 10 | Nick | 2 |
+----+-----------+---+
| 11 | Chris | 1 |
+----+-----------+---+
| 12 | Nicole | 2 |
+----+-----------+---+
| 13 | Nicole | 1 |
+----+-----------+---+
| 14 | James | 1 |
+----+-----------+---+
| 15 | Christine | 1 |
+----+-----------+---+
What I have so far is (works for one user)
SELECT USER, VERSION
FROM TABLE
WHERE USER = 'Brad'
AND VERSION = (SELECT MAX(VERSION ) FROM TABLE WHERE USER= 'Brad')
SELECT USER, max(VERSION) VERSION
FROM TABLE GROUP BY USER;
If you need an ID then
SELECT ID, USER, VERSION FROM (
SELECT ID, USER, VERSION,
RANK() OVER(PARTITION BY USER ORDER BY VERSION DESC) RNK
FROM TABLE
) WHERE RNK = 1;
if you have
| 2 | Brad | 5 |
+----+-----------+---+
| 3 | Brad | 3 |
+----+-----------+---+
| 4 | Brad | 5 |
The query with RANK gives you both rows
| 2 | Brad | 5 |
+----+-----------+---+
| 4 | Brad | 5 |
If you need only one row then replace RANK() with ROW_NUMBER()
In your query you're using AND VERSION = (SELECT MAX(VERSION ) FROM TABLE WHERE USER= 'Brad') which is equivalent to RANK() (all rows with the max VERSION)
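The difference shows up only when the highest version is tied. Here is a small sketch on the three Brad rows above, using Python's built-in sqlite3 module as a stand-in database (an assumption; it needs SQLite 3.25 or later). The table name user_versions and the column name name are hypothetical, and the extra id sort key in the ROW_NUMBER() variant is an assumed tie-breaker.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_versions (id INTEGER, name TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO user_versions VALUES (?, ?, ?)",
    [(2, "Brad", 5), (3, "Brad", 3), (4, "Brad", 5)],
)

# RANK(): tied rows share rank 1, so both version-5 rows survive the filter.
with_rank = conn.execute(
    """
    SELECT id, name, version
    FROM (SELECT id, name, version,
                 RANK() OVER (PARTITION BY name ORDER BY version DESC) AS rnk
          FROM user_versions) AS ranked
    WHERE rnk = 1
    ORDER BY id
    """
).fetchall()

# ROW_NUMBER(): exactly one row per name. Sorting by id as well makes the
# choice between the tied rows deterministic (lowest id wins here).
with_row_number = conn.execute(
    """
    SELECT id, name, version
    FROM (SELECT id, name, version,
                 ROW_NUMBER() OVER (PARTITION BY name
                                    ORDER BY version DESC, id) AS rn
          FROM user_versions) AS numbered
    WHERE rn = 1
    """
).fetchall()

print(with_rank)        # [(2, 'Brad', 5), (4, 'Brad', 5)]
print(with_row_number)  # [(2, 'Brad', 5)]
```

Without a tie-breaker in the ORDER BY, ROW_NUMBER() still returns one row per name, but which of the tied rows it picks is not guaranteed.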
The first_value analytic function should do the trick:
SELECT DISTINCT FIRST_VALUE (id)
OVER (PARTITION BY name ORDER BY version DESC),
name,
FIRST_VALUE (version)
OVER (PARTITION BY name ORDER BY version DESC)
FROM my_table
Another way to go would be to use the row_number function:
SELECT id, name, version
FROM (SELECT id, name, version,
ROW_NUMBER() OVER (PARTITION BY name ORDER BY version DESC) rn
FROM my_table)
WHERE rn = 1
Not sure which I prefer, personally. They each have their merit and their ugliness.
this might help you :
select id, user, version
from
(
select id, user, version, row_number() over (partition by user order by version desc) rownum
from yourtable
) as t
where t.rownum = 1