SQL ranking over two tables - sql

I have two tables with user rankings.
Table rankingA and rankingB.
Each table has the columns:
user_id
points
group_id
The higher the points, the higher the rank of the user/group.
Now I am trying to work out the group ranking, i.e. which rank my group has.
So far i have this SQL:
select sum(ra.points) as rapoints, sum(rb.points) as rbpoints from public.rankinga ra
LEFT JOIN public.rankingb rb ON ra.group_id=rb.group_id and ra.user_id=rb.user_id where
ra.group_id=200;
It returns the points from rankinga and rankingb for group 200.
How can I get the rankings of the group? I tried it with:
row_number() OVER (ORDER BY sum(rb.points) DESC) AS rankb
but got a wrong result.
My expected result for group_id 200 is:
rapoints,rbpoints,rarank, rbrank
420, 10, 3, same points as group_id 300 so rbrank 2 or 3
How can i get this?
Setup
CREATE TABLE rankinga
(
user_id bigint,
group_id bigint,
points integer
);
CREATE TABLE rankingb
(
user_id bigint,
group_id bigint,
points integer
);
insert into public.rankinga (user_id,group_id,points) values (1,100,120),(2,100,300), (3,100,20),(4,200,300),(5,200,120),(6,300,600);
insert into public.rankingb (user_id,group_id,points) values (1,100,5),(2,100,3),(3,100,10),(4,200,2),(5,200,8),(6,300,10);

I think you want to do this with union all, aggregation, and the window function. Joining the tables is likely to miss rows (if users are in one table but not the other) or over count (if you join on group). So this may do what you want:
select group_id, sum(rapoints) as rapoints, sum(rbpoints) as rbpoints,
sum(rapoints) + sum(rbpoints) as points,
dense_rank() over (order by sum(rapoints) + sum(rbpoints) desc) as ranking
from ((select ra.group_id, sum(ra.points) as rapoints, 0 as rbpoints
from public.rankinga ra
group by ra.group_id
) union all
(select rb.group_id, 0, sum(rb.points) as rbpoints
from public.rankingb rb
group by rb.group_id
)
) ab
group by group_id;
If you want to select just one group, then put this in a subquery (or CTE) and then select the group.
Here is a SQL Fiddle.
EDIT:
If you want just the result for one group, you still need to calculate the values for all groups. So:
select ab.*
from (<above query here>) ab
where group_id = 200;
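The query above ranks groups by their combined total, while the question asks for a separate rank per table (rarank, rbrank). A minimal sketch of that variant follows. It is run through Python's sqlite3 module only to keep it self-contained, and it assumes an SQLite build with window-function support (3.25 or later). The CTE names totals and ranked are illustrative and not part of the original answer. RANK() is used so that tied groups share a rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rankinga (user_id INTEGER, group_id INTEGER, points INTEGER);
CREATE TABLE rankingb (user_id INTEGER, group_id INTEGER, points INTEGER);
INSERT INTO rankinga VALUES (1,100,120),(2,100,300),(3,100,20),(4,200,300),(5,200,120),(6,300,600);
INSERT INTO rankingb VALUES (1,100,5),(2,100,3),(3,100,10),(4,200,2),(5,200,8),(6,300,10);
""")

query = """
WITH totals AS (
  -- one row per group with both point sums, built without joining the tables
  SELECT group_id, SUM(rapoints) AS rapoints, SUM(rbpoints) AS rbpoints
  FROM (
    SELECT group_id, points AS rapoints, 0 AS rbpoints FROM rankinga
    UNION ALL
    SELECT group_id, 0, points FROM rankingb
  ) ab
  GROUP BY group_id
),
ranked AS (
  -- ranks are computed over ALL groups before filtering to one group
  SELECT group_id, rapoints, rbpoints,
         RANK() OVER (ORDER BY rapoints DESC) AS rarank,
         RANK() OVER (ORDER BY rbpoints DESC) AS rbrank
  FROM totals
)
SELECT group_id, rapoints, rbpoints, rarank, rbrank
FROM ranked
WHERE group_id = ?
"""

row = conn.execute(query, (200,)).fetchone()
print(row)
```

With the setup data from the question, group 200 comes out with 420 and 10 points, rarank 3 and rbrank 2 (tied with group 300 on rankingb).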

Related

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I want to find the maximum Tot_Val for each CName. The rows with ID 1 and 2 have the same CName but different Tot_Val and PName values, so from that pair I expect the row with the larger Tot_Val.
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need the result to look like the one above.
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is the query I have tried, but I am not able to bring PName into the result. If I add PName to the select list and the group by list, the rows are multiplied: e.g. the result is 100 rows, but with PName added it shows 600 rows. That is the problem.
Can someone please help me resolve this?
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
See both solutions in action in this fiddle.
You can search top-n-per-group for this kind of a query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See a very detailed answer on dba.se, Retrieving n rows per group, or here: Get top 1 row of each group.
CROSS APPLY might be as fast as a correlated subquery, but the correlated subquery often has very good performance (and better than ROW_NUMBER()):
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).
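As a quick check of the approaches above, here is a small sketch that runs the ROW_NUMBER form and the correlated-subquery form against the sample rows from the question. It uses Python's sqlite3 module as a stand-in (an assumption made for portability; the question is about SQL Server, and SQLite 3.25 or later is needed for window functions). On this data both return the same two rows. If two rows tied on Tot_Val within a CName, the correlated subquery would return both while ROW_NUMBER would return only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (ID INTEGER PRIMARY KEY, CName INTEGER, Tot_Val INTEGER, PName TEXT);
INSERT INTO table1 VALUES (1,1,100,'P1'),(2,1,10,'P2'),(3,2,50,'P2'),(4,2,80,'P1');
""")

# Number the rows inside each CName, highest Tot_Val first, and keep row 1.
row_number_sql = """
SELECT ID, CName, Tot_Val, PName
FROM (
  SELECT t.*, ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
  FROM table1 t
) x
WHERE rn = 1
ORDER BY ID
"""

# Keep every row whose Tot_Val equals the maximum for its CName.
correlated_sql = """
SELECT t.ID, t.CName, t.Tot_Val, t.PName
FROM table1 t
WHERE t.Tot_Val = (SELECT MAX(t2.Tot_Val) FROM table1 t2 WHERE t2.CName = t.CName)
ORDER BY t.ID
"""

by_row_number = conn.execute(row_number_sql).fetchall()
by_correlated = conn.execute(correlated_sql).fetchall()
print(by_row_number)
print(by_correlated)
```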

Can I use a CTE data inside another CTE by joining both of them (Oracle SQL)

Requirement
I want to get the first four hundred GROUP_ID's from a table(greater than input GROUP_ID), and in the same table against each GROUP_ID, there are two LOG_ID's out of which I want the lowest one. Once I get the lowest LOG_ID, I will use that LOG_ID to get the data from another table where it is a foreign key.
APPROACH I USED
First I have formed a subset of top 400 GROUP_ID's which are greater than input GROUP_ID's
Then I used all the GROUP_IDs in my second subset where I will get the lowest LOG_ID against each GROUP_ID.
And finally, when I have the lowest LOG_ID, I used it to get the details from another table.
QUERY USED
WITH INIT AS (
SELECT GROUP_ID
FROM PV_ADAPTER_LOG
WHERE GROUP_ID > 2004141441192825
AND ADAPTER_ID IN ('2568','2602')
ORDER BY GROUP_ID
FETCH FIRST 400 ROWS ONLY
)
,INIT2 AS (
SELECT MIN(L.LOG_ID) AS LOG_ID
FROM PV_ADAPTER_LOG L
JOIN INIT ON INIT.GROUP_ID =L.GROUP_ID
GROUP BY L.GROUP_ID
)
SELECT A.LOG_ID,A.OPER_SEQ AS CALL_SEQUENCE,A.GROUP_ID ,B.INTERFACE_ID,A.INSTRUCTION_NAME, B.ADAPTER_DETAIL AS XML_CONTENT,B.SEQ AS XML_SEQUENCE
FROM INIT2
JOIN PV_ADAPTER_LOG A ON A.LOG_ID=INIT2.LOG_ID
JOIN PV_ADAPTER_LOG_DETAIL B ON B.LOG_ID=A.LOG_ID
Is my approach right or is there any other way to achieve this.
I think this is what you're looking for:
Use row_number ordered by group to find the first 400 rows
Use row_number partitioned by group and ordered by log to find the first log per group
Which is:
WITH INIT AS (
SELECT P.*,
ROW_NUMBER () OVER (
ORDER BY GROUP_ID
) RN,
ROW_NUMBER () OVER (
PARTITION BY GROUP_ID
ORDER BY LOG_ID
) MN
FROM PV_ADAPTER_LOG p
WHERE GROUP_ID > 2004141441192825
AND ADAPTER_ID IN ('2568','2602')
)
SELECT * FROM INIT
WHERE RN <= 400
AND MN = 1
You can use analytic functions to get the first 400 groups and then the record with the minimum log_id per group in a single query, as follows:
SELECT GROUP_ID, LOG_ID FROM
(SELECT P.GROUP_ID, P.LOG_ID,
ROW_NUMBER() OVER (ORDER BY GROUP_ID) AS RNGRP,
ROW_NUMBER() OVER (PARTITION BY GROUP_ID ORDER BY LOG_ID) AS RNLOG
FROM PV_ADAPTER_LOG P
WHERE GROUP_ID > 2004141441192825
AND ADAPTER_ID IN ('2568','2602'))
WHERE RNGRP <= 400 AND RNLOG = 1;
You can then use it wherever you need it (in a CTE or in an inline view).
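One point worth checking with either answer: ROW_NUMBER() OVER (ORDER BY GROUP_ID) numbers rows, not groups, so a limit of 400 on it keeps the first 400 rows, which may cover fewer than 400 distinct GROUP_IDs when each group has several log rows. If the goal is the first N distinct groups, DENSE_RANK() is one way to count groups instead. The sketch below illustrates the difference on a tiny hypothetical table (adapter_log, with a limit of 2 instead of 400), using Python's sqlite3 module rather than Oracle so that it is self-contained; it assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE adapter_log (group_id INTEGER, log_id INTEGER);
INSERT INTO adapter_log VALUES (1,10),(1,11),(2,20),(2,21),(3,30),(3,31);
""")

def first_groups(numbering, n):
    """Lowest log_id per group, limited to n by the given numbering function."""
    # numbering is restricted to two fixed function names, never user input
    if numbering not in ("ROW_NUMBER", "DENSE_RANK"):
        raise ValueError(numbering)
    sql = f"""
    SELECT group_id, log_id FROM (
      SELECT group_id, log_id,
             {numbering}() OVER (ORDER BY group_id) AS rngrp,
             ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY log_id) AS rnlog
      FROM adapter_log
    ) x
    WHERE rngrp <= ? AND rnlog = 1
    ORDER BY group_id
    """
    return conn.execute(sql, (n,)).fetchall()

by_rows = first_groups("ROW_NUMBER", 2)    # limit counts rows
by_groups = first_groups("DENSE_RANK", 2)  # limit counts distinct groups
print(by_rows)
print(by_groups)
```

Here the row-based limit only reaches group 1, because both of its rows use up the limit, while the DENSE_RANK version returns the lowest log_id for groups 1 and 2.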

Select a row with preceding and following rows

I have a table as follows:
CREATE TABLE results (
id uuid primary key UNIQUE,
score integer NOT NULL
)
I need to select a record with particular UUID and what's around it (say, 5 before and after) ordered by score
SELECT * FROM results
WHERE id = <SOME_UUID>
ORDERED BY score
OFFSET -5 LIMIT 10; -- apparently this is wrong
How can I effectively do that?
It's not 'efficient', but you could try this:
select a.* from (SELECT * FROM results
WHERE id <> <SOME_UUID> and score <= (select score from results WHERE id = <SOME_UUID>)
ORDER BY score desc, id desc -- closest lower scores first
LIMIT 5) as a
UNION ALL
SELECT * FROM results
WHERE id = <SOME_UUID>
UNION ALL
select b.* from (SELECT * FROM results
WHERE id <> <SOME_UUID> and score >= (select score from results WHERE id = <SOME_UUID>)
ORDER BY score asc, id asc -- closest higher scores first
LIMIT 5) as b
I tried this on SQL Server, which needed the 'ALL' to compute.
So you may get records with an equal score as duplicates. To avoid this, wrap the whole thing in another subquery and use select distinct.
One way of solving this is with a rank for each row assigned using a window function and then finding out which ranks you are interested in:
WITH ranked AS (
SELECT id, score, rank() OVER (ORDER BY score) AS rnk
FROM results),
this_rank AS (
SELECT rnk - 5 AS low_rnk FROM ranked
WHERE id = <some uuid>::uuid)
SELECT id, score
FROM ranked, this_rank
WHERE rnk >= low_rnk
ORDER BY rnk
LIMIT 11;
For very low or high scores you get fewer than 11 rows, rather than rows with NULLs.
SQLFiddle
One further detail: A PRIMARY KEY already implies uniqueness so you do not have to use the UNIQUE clause in your table definition.
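A variation on the window-function idea, sketched under some assumptions: ROW_NUMBER() with a tie-breaker on id gives every row a unique position, and the target row's position bounds the window on both sides, so no LIMIT is needed. The example uses Python's sqlite3 module with short text ids in place of UUIDs and a neighbourhood of 2 instead of 5; both are simplifications for illustration. It assumes SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE results (id TEXT PRIMARY KEY, score INTEGER NOT NULL);
INSERT INTO results VALUES ('a',10),('b',20),('c',30),('d',40),('e',50),('f',60),('g',70);
""")

neighbours_sql = """
WITH ranked AS (
  -- id is a tie-breaker so equal scores still get distinct positions
  SELECT id, score, ROW_NUMBER() OVER (ORDER BY score, id) AS rn
  FROM results
)
SELECT r.id, r.score
FROM ranked r
JOIN ranked t ON t.id = :target
WHERE r.rn BETWEEN t.rn - :k AND t.rn + :k
ORDER BY r.rn
"""

middle = conn.execute(neighbours_sql, {"target": "d", "k": 2}).fetchall()
edge = conn.execute(neighbours_sql, {"target": "a", "k": 2}).fetchall()
print(middle)
print(edge)
```

For a row near either end of the ordering, the window is simply cut short, as the edge case shows.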

SQL query to select distinct row with minimum value

I want an SQL statement to get the row with a minimum value.
Consider this table:
id game point
1 x 5
1 z 4
2 y 6
3 x 2
3 y 5
3 z 8
How do I select the ids that have the minimum value in the point column, grouped by game? Like the following:
id game point
1 z 4
2 y 5
3 x 2
Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point
This is another way of doing the same thing, which would allow you to do interesting things like select the top 5 winning games, etc.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Point) as RowNum, *
FROM Table
) X
WHERE RowNum = 1
You can now correctly get the actual row that was identified as the one with the lowest score and you can modify the ordering function to use multiple criteria, such as "Show me the earliest game which had the smallest score", etc.
This will work
select * from table
where (id,point) IN (select id,min(point) from table group by id);
As this is tagged with sql only, the following is using ANSI SQL and a window function:
select id, game, point
from (
select id, game, point,
row_number() over (partition by game order by point) as rn
from games
) t
where rn = 1;
Ken Clark's answer didn't work in my case. It might not work in yours either. If not, try this:
SELECT *
from table T
INNER JOIN
(
select id, MIN(point) MinPoint
from table T
group by id
) NewT on T.id = NewT.id and T.point = NewT.MinPoint
ORDER BY game desc
SELECT DISTINCT
FIRST_VALUE(ID) OVER (Partition by Game ORDER BY Point) AS ID,
Game,
FIRST_VALUE(Point) OVER (Partition by Game ORDER BY Point) AS Point
FROM #T
SELECT * from room
INNER JOIN
(
select DISTINCT hotelNo, MIN(price) MinPrice
from room
Group by hotelNo
) NewT
on room.hotelNo = NewT.hotelNo and room.price = NewT.MinPrice;
This alternative approach uses SQL Server's OUTER APPLY clause. This way, it
creates the distinct list of games, and
fetches and outputs the record with the lowest point number for that game.
The OUTER APPLY clause can be imagined as a LEFT JOIN, but with the advantage that you can use values of the main query as parameters in the subquery (here: game).
SELECT colMinPointID
FROM (
SELECT game
FROM table
GROUP BY game
) As rstOuter
OUTER APPLY (
SELECT TOP 1 id As colMinPointID
FROM table As rstInner
WHERE rstInner.game = rstOuter.game
ORDER BY point
) AS rstMinPoints
This is portable - at least between ORACLE and PostgreSQL:
select t.* from table t
where not exists(select 1 from table ti where ti.attr > t.attr);
Most of the answers use an inner query. I am wondering why the following isn't suggested.
select
*
from
table
order by
point
fetch next 1 row only -- ... or the appropriate syntax for the particular DB
This query is very simple to write with JPAQueryFactory (a Java Query DSL class).
return new JPAQueryFactory(manager).
selectFrom(QTable.table).
setLockMode(LockModeType.OPTIMISTIC).
orderBy(QTable.table.point.asc()).
fetchFirst();
Try:
select id, game, min(point) from t
group by id
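The answers above do not all group on the same column: some take the minimum per id, others per game. The sketch below runs the ROW_NUMBER form both ways on the sample rows from the question to show how the results differ. It uses Python's sqlite3 module (assuming SQLite 3.25 or later); the table name games and the helper min_row_per are illustrative names only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE games (id INTEGER, game TEXT, point INTEGER);
INSERT INTO games VALUES (1,'x',5),(1,'z',4),(2,'y',6),(3,'x',2),(3,'y',5),(3,'z',8);
""")

def min_row_per(column):
    """Return the full row with the lowest point within each value of column."""
    # column is restricted to known column names, never user input
    if column not in ("id", "game"):
        raise ValueError(column)
    sql = f"""
    SELECT id, game, point FROM (
      SELECT id, game, point,
             ROW_NUMBER() OVER (PARTITION BY {column} ORDER BY point) AS rn
      FROM games
    ) t
    WHERE rn = 1
    ORDER BY {column}
    """
    return conn.execute(sql).fetchall()

per_id = min_row_per("id")
per_game = min_row_per("game")
print(per_id)
print(per_game)
```

Which partition column is right depends on whether one row per id or one row per game is wanted.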

Select a Column in SQL not in Group By

I have been trying to find some info on how to select a non-aggregate column that is not contained in the Group By statement in SQL, but nothing I've found so far seems to answer my question. I have a table with three columns that I want from it. One is a create date, one is a ID that groups the records by a particular Claim ID, and the final is the PK. I want to find the record that has the max creation date in each group of claim IDs. I am selecting the MAX(creation date), and Claim ID (cpe.fmgcms_cpeclaimid), and grouping by the Claim ID. But I need the PK from these records (cpe.fmgcms_claimid), and if I try to add it to my select clause, I get an error. And I can't add it to my group by clause because then it will throw off my intended grouping. Does anyone know any workarounds for this? Here is a sample of my code:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
This is the result I'd like to get:
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, cpe.fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
The columns in the result set of a select query with group by clause must be:
an expression used as one of the group by criteria , or ...
an aggregate function , or ...
a literal value
So, you can't do what you want to do in a single, simple query. The first thing to do is state your problem statement in a clear way, something like:
I want to find the individual claim row bearing the most recent
creation date within each group in my claims table
Given
create table dbo.claims_table
(
claim_id int not null ,
group_id int not null ,
date_created datetime not null ,
constraint some_table_PK primary key ( claim_id ) ,
constraint some_table_AK01 unique ( group_id , claim_id ) ,
constraint some_Table_AK02 unique ( group_id , date_created ) ,
)
The first thing to do is identify the most recent creation date for each group:
select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
That gives you the selection criteria you need (1 row per group, with 2 columns: group_id and the high-water created date) to fulfill the first part of the requirement (selecting the individual row from each group). That needs to be a virtual table in your final select query:
select *
from dbo.claims_table t
join ( select group_id ,
date_created = max( date_created )
from dbo.claims_table
group by group_id
) x on x.group_id = t.group_id
and x.date_created = t.date_created
If the table is not unique by date_created within group_id (AK02), you can get duplicate rows for a given group.
You can do this with PARTITION and RANK:
select * from
(
select MyPK, fmgcms_cpeclaimid, createdon,
Rank() over (Partition BY fmgcms_cpeclaimid order by createdon DESC) as Rank
from Filteredfmgcms_claimpaymentestimate
where createdon < 'reportstartdate'
) tmp
where Rank = 1
The direct answer is that you can't. You must select either an aggregate or something that you are grouping by.
So, you need an alternative approach.
1). Take your current query and join the base data back on it
SELECT
cpe.*
FROM
Filteredfmgcms_claimpaymentestimate cpe
INNER JOIN
(yourQuery) AS lookup
ON lookup.MaxDate = cpe.createdOn
AND lookup.fmgcms_cpeclaimid = cpe.fmgcms_cpeclaimid
2). Use a CTE to do it all in one go...
WITH
sequenced_data AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY fmgcms_cpeclaimid ORDER BY CreatedOn DESC) AS sequence_id
FROM
Filteredfmgcms_claimpaymentestimate
WHERE
createdon < 'reportstartdate'
)
SELECT
*
FROM
sequenced_data
WHERE
sequence_id = 1
NOTE: Using ROW_NUMBER() will ensure just one record per fmgcms_cpeclaimid. Even if multiple records are tied with the exact same createdon value. If you can have ties, and want all records with the same createdon value, use RANK() instead.
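The difference between ROW_NUMBER() and RANK() on ties can be seen in a small sketch. The table claims and its rows are hypothetical, and Python's sqlite3 module stands in for SQL Server here (assuming SQLite 3.25 or later). Group 1 has two rows sharing the latest creation date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, group_id INTEGER, created TEXT);
INSERT INTO claims VALUES
  (1, 1, '2020-01-01'),
  (2, 1, '2020-01-02'),
  (3, 1, '2020-01-02'),
  (4, 2, '2020-01-05');
""")

def latest(numbering):
    """Rows numbered 1 within each group, newest creation date first."""
    # numbering is restricted to two fixed function names, never user input
    if numbering not in ("ROW_NUMBER", "RANK"):
        raise ValueError(numbering)
    sql = f"""
    SELECT claim_id, group_id, created FROM (
      SELECT c.*,
             {numbering}() OVER (PARTITION BY group_id ORDER BY created DESC) AS seq
      FROM claims c
    ) s
    WHERE seq = 1
    ORDER BY claim_id
    """
    return conn.execute(sql).fetchall()

one_per_group = latest("ROW_NUMBER")  # exactly one row per group
with_ties = latest("RANK")            # every row tied for the latest date
print(one_per_group)
print(with_ties)
```

With ROW_NUMBER(), which of the tied rows is kept is not defined unless the ORDER BY includes a tie-breaker such as the primary key.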
You can join the table on itself to get the PK:
Select cpe1.PK, cpe2.MaxDate, cpe1.fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate cpe1
INNER JOIN
(
select MAX(createdon) As MaxDate, fmgcms_cpeclaimid
from Filteredfmgcms_claimpaymentestimate
group by fmgcms_cpeclaimid
) cpe2
on cpe1.fmgcms_cpeclaimid = cpe2.fmgcms_cpeclaimid
and cpe1.createdon = cpe2.MaxDate
where cpe1.createdon < 'reportstartdate'
One thing I like to do is wrap the additional columns in an aggregate function, like max().
It works very well when you don't expect duplicate values.
Select MAX(cpe.createdon) As MaxDate, cpe.fmgcms_cpeclaimid, MAX(cpe.fmgcms_claimid) As fmgcms_claimid
from Filteredfmgcms_claimpaymentestimate cpe
where cpe.createdon < 'reportstartdate'
group by cpe.fmgcms_cpeclaimid
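A caveat on this approach, illustrated with hypothetical data: each MAX() is evaluated independently, so MAX(pk) is the largest key in the group, which is not necessarily the key of the row holding the latest date. The sketch below compares it with the join-back form, using Python's sqlite3 module and an illustrative claims table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE claims (claim_id INTEGER PRIMARY KEY, group_id INTEGER, created TEXT);
-- the row with the LOWER key has the LATER date
INSERT INTO claims VALUES (10, 1, '2020-03-01'), (20, 1, '2020-01-01');
""")

# Each aggregate is computed on its own: latest date and largest key.
wrapped = conn.execute("""
SELECT group_id, MAX(created), MAX(claim_id)
FROM claims
GROUP BY group_id
""").fetchall()

# Joining back on the max date returns the key of the row that actually has it.
joined = conn.execute("""
SELECT c.group_id, c.created, c.claim_id
FROM claims c
JOIN (SELECT group_id, MAX(created) AS max_created
      FROM claims
      GROUP BY group_id) m
  ON m.group_id = c.group_id AND m.max_created = c.created
""").fetchall()

print(wrapped)
print(joined)
```

The two forms agree only when the key and the date happen to increase together within each group.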
What you are asking for is covered by RedFilter's answer.
The following answer also helps in understanding why group by is, in a way, a simpler version of partition by / over:
SQL Server: Difference between PARTITION BY and GROUP BY
since partition by changes the way the returned value is calculated, and can therefore return columns that group by cannot.
You can use it as below:
Select X.a, X.sum_b, Y.c from (
Select a, sum(b) as sum_b from name_table
group by a) X
left join name_table Y on Y.a = X.a
Example:
CREATE TABLE #products (
product_name VARCHAR(MAX),
code varchar(3),
list_price [numeric](8, 2) NOT NULL
);
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
INSERT INTO #products VALUES ('Dinding', 'ADE', 2000)
INSERT INTO #products VALUES ('Kaca', 'AKB', 2000)
INSERT INTO #products VALUES ('paku', 'ACE', 2000)
--SELECT * FROM #products
SELECT distinct x.code, x.SUM_PRICE, product_name FROM (SELECT code, SUM(list_price) as SUM_PRICE From #products
group by code)x
left join #products y on y.code=x.code
DROP TABLE #products