SQL Select Rows with Highest Version - sql

I am trying to query the following table:
ID
ConsentTitle
ConsentIdentifier
Version
DisplayOrder
1
FooTitle
foo1
1
1
2
FooTitle 2
foo1
2
2
3
Bar Title
bar1
1
3
4
Bar Title 2
bar1
2
4
My table has entries with unique ConsentTemplateIdentifier. I want to bring back only the rows with the highest version number for that particular unique Identifier...
ID
ConsentTitle
ConsentIdentifier
Version
DisplayOrder
2
FooTitle 2
foo1
2
2
4
Bar Title 2
bar1
2
4
My current query doesn't seem to work. It is telling me:
Column 'ConsentTemplates.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
Select Distinct ID,ConsentTitle, DisplayOrder, ConsentTemplateIdentifier, MAX(Version) as Version
from FooDocs
group by ConsentTemplateIdentifier
How do I select the rows distinctly which have the highest Version number for their respective ConsentTemplateIdentifiers ordered by their display order?
Any help would be really appreciated. I am using SQL Server.

You can do this with CROSS APPLY.
SELECT DISTINCT ca.*
FROM FooDocs fd
CROSS APPLY (SELECT TOP 1 *
FROM FooDocs
WHERE ConsentIdentifier = fd.ConsentIdentifier
ORDER BY Version DESC) ca
If your unique identifier has it's own table.
SELECT ca.*
FROM ConsentTable ct
CROSS APPLY (SELECT TOP 1 *
FROM FooDocs
WHERE ConsentIdentifier = ct.Identifier
ORDER BY Version DESC) ca

Using CROSS APPLY works, but effectively invokes the sub-query for every row in in your table, then expend effort to de-duplicate the results with DISTINCT, resulting in a semi-cartesian-product / triangular-join.
It is usually much more efficient just to use ROW_NUMBER() and avoid the implicit join all together...
WITH
sorted_by_version AS
(
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY ConsentTemplateIdentifier
ORDER BY version DESC
)
AS version_ordinal
FROM
ConsentTemplates
)
SELECT
*
FROM
sorted_by_version
WHERE
version_ordinal = 1
ORDER BY
DisplayOrder

With a slight modification to Derrick's answer, I was able to get the data back the way I wanted to see it using CROSS APPLY
SELECT DISTINCT ca.*
FROM ConsentTemplates fd
CROSS APPLY (SELECT TOP 1 *
FROM ConsentTemplates
WHERE ConsentTemplateIdentifier = fd.ConsentTemplateIdentifier
ORDER BY Version DESC) ca
order by DisplayOrder

Related

How to return the category with max value for every user in postgresql?

This is the table
id
category
value
1
A
40
1
B
20
1
C
10
2
A
4
2
B
7
2
C
7
3
A
32
3
B
21
3
C
2
I want the result like this
id
category
1
A
2
B
2
C
3
A
For small tables or for only very few rows per user, a subquery with the window function rank() (as demonstrated by The Impaler) is just fine. The resulting sequential scan over the whole table, followed by a sort will be the most efficient query plan.
For more than a few rows per user, this gets increasingly inefficient though.
Typically, you also have a users table holding one distinct row per user. If you don't have it, created it! See:
Is there a way to SELECT n ON (like DISTINCT ON, but more than one of each)
Select first row in each GROUP BY group?
We can leverage that for an alternative query that scales much better - using WITH TIES in a LATERAL JOIN. Requires Postgres 13 or later.
SELECT u.id, t.*
FROM users u
CROSS JOIN LATERAL (
SELECT t.category
FROM tbl t
WHERE t.id = u.id
ORDER BY t.value DESC
FETCH FIRST 1 ROWS WITH TIES -- !
) t;
db<>fiddle here
See:
Get top row(s) with highest value, with ties
Fetching a minimum of N rows, plus all peers of the last row
This can use a multicolumn index to great effect - which must exist, of course:
CREATE INDEX ON tbl (id, value);
Or:
CREATE INDEX ON tbl (id, value DESC);
Even faster index-only scans become possible with:
CREATE INDEX ON tbl (id, value DESC, category);
Or (the optimum for the query at hand):
CREATE INDEX ON tbl (id, value DESC) INCLUDE (category);
Assuming value is defined NOT NULL, or we have to use DESC NULLS LAST. See:
Sort by column ASC, but NULL values first?
To keep users in the result that don't have any rows in table tbl, user LEFT JOIN LATERAL (...) ON true. See:
What is the difference between LATERAL JOIN and a subquery in PostgreSQL?
You can use RANK() to identify the rows you want. Then, filtering is easy. For example:
select *
from (
select *,
rank() over(partition by id order by value desc) as rk
from t
) x
where rk = 1
Result:
id category value rk
--- --------- ------ --
1 A 40 1
2 B 7 1
2 C 7 1
3 A 32 1
See running example at DB Fiddle.

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I just want to find the max total value only from the table. In that four row ID 1 and 2 have same value in CName but total val and PName has different values. What I am expecting is have to find the max value in ID 1 and 2
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need result same as like mention above
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is query I have tried but my problem is that I am not able to bring PName in this table. If I add PName in the select list means it will showing the rows doubled e.g. Result is 100 rows but when I add PName in selected list and group by list it showing 600 rows. That is the problem.
Can someone please help me to resolve this.
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
See both solutions in action in this fiddle.
You can search top-n-per-group for this kind of a query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See a very detailed answer on dba.se Retrieving n rows per group
, or here Get top 1 row of each group
.
CROSS APPLY might be as fast as a correlated subquery, but this often has very good performance (and better than ROW_NUMBER():
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).

get ROW NUMBER of random records

For a simple SQL like,
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
how to add row numbers to them so that the row numbers become 1,2, and 3?
UPDATE:
I thought I can simplify my question as above, but it turns out to be more complicated. So here is a fuller version -- I need to give three random picks (from MyTable) for each person, with pick/row number of 1, 2, and 3, and there is no logical joining between person and picks.
SELECT * FROM Person
LEFT JOIN (
SELECT top 3 MyId FROM MyTable ORDER BY NEWID()
) D ON 1=1
The problem with above SQL are,
Obviously, pick/row number of 1, 2, and 3 should be added
and what is not obvious is that, the above SQL will give each person the same picks, whereas I need to give different person different picks
Here is a working SQL to test it out:
SELECT TOP 15 database_id, create_date, cs.name FROM sys.databases
CROSS apply (
SELECT top 3 Row_number()OVER(ORDER BY (SELECT NULL)) AS RowNo,*
FROM (SELECT top 3 name from sys.all_views ORDER BY NEWID()) T
) cs
So, Please help.
NOTE: This is NOT about MySQL byt T-SQL as their syntax are different, Thus the solution is different as well.
Add Row_number to outer query. Try this
SELECT Row_number()OVER(ORDER BY (SELECT NULL)),*
FROM (SELECT TOP 3 MyId
FROM MyTable
ORDER BY Newid()) a
Logically TOP keyword is processed after Select. After Row Number is generated random 3 records will be pulled. So you should not generate Row Number in original query
Update
It can be achieved through CROSS APPLY. Replace the column names inside cross apply where clause with valid column name from Person table
SELECT *
FROM Person p
CROSS apply (SELECT Row_number()OVER(ORDER BY (SELECT NULL)) rn,*
FROM (SELECT TOP 3 MyId
FROM MyTable
WHERE p.some_col = p.some_col -- Replace it with some column from person table
ORDER BY Newid())a) cs

SQL query to select distinct row with minimum value

I want an SQL statement to get the row with a minimum value.
Consider this table:
id game point
1 x 5
1 z 4
2 y 6
3 x 2
3 y 5
3 z 8
How do I select the ids that have the minimum value in the point column, grouped by game? Like the following:
id game point
1 z 4
2 y 5
3 x 2
Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point
This is another way of doing the same thing, which would allow you to do interesting things like select the top 5 winning games, etc.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Point) as RowNum, *
FROM Table
) X
WHERE RowNum = 1
You can now correctly get the actual row that was identified as the one with the lowest score and you can modify the ordering function to use multiple criteria, such as "Show me the earliest game which had the smallest score", etc.
This will work
select * from table
where (id,point) IN (select id,min(point) from table group by id);
As this is tagged with sql only, the following is using ANSI SQL and a window function:
select id, game, point
from (
select id, game, point,
row_number() over (partition by game order by point) as rn
from games
) t
where rn = 1;
Ken Clark's answer didn't work in my case. It might not work in yours either. If not, try this:
SELECT *
from table T
INNER JOIN
(
select id, MIN(point) MinPoint
from table T
group by AccountId
) NewT on T.id = NewT.id and T.point = NewT.MinPoint
ORDER BY game desc
SELECT DISTINCT
FIRST_VALUE(ID) OVER (Partition by Game ORDER BY Point) AS ID,
Game,
FIRST_VALUE(Point) OVER (Partition by Game ORDER BY Point) AS Point
FROM #T
SELECT * from room
INNER JOIN
(
select DISTINCT hotelNo, MIN(price) MinPrice
from room
Group by hotelNo
) NewT
on room.hotelNo = NewT.hotelNo and room.price = NewT.MinPrice;
This alternative approach uses SQL Server's OUTER APPLY clause. This way, it
creates the distinct list of games, and
fetches and outputs the record with the lowest point number for that game.
The OUTER APPLY clause can be imagined as a LEFT JOIN, but with the advantage that you can use values of the main query as parameters in the subquery (here: game).
SELECT colMinPointID
FROM (
SELECT game
FROM table
GROUP BY game
) As rstOuter
OUTER APPLY (
SELECT TOP 1 id As colMinPointID
FROM table As rstInner
WHERE rstInner.game = rstOuter.game
ORDER BY points
) AS rstMinPoints
This is portable - at least between ORACLE and PostgreSQL:
select t.* from table t
where not exists(select 1 from table ti where ti.attr > t.attr);
Most of the answers use an inner query. I am wondering why the following isn't suggested.
select
*
from
table
order by
point
fetch next 1 row only // ... or the appropriate syntax for the particular DB
This query is very simple to write with JPAQueryFactory (a Java Query DSL class).
return new JPAQueryFactory(manager).
selectFrom(QTable.table).
setLockMode(LockModeType.OPTIMISTIC).
orderBy(QTable.table.point.asc()).
fetchFirst();
Try:
select id, game, min(point) from t
group by id

SQL Select highest values from table on two (or more) columns

not sure if there's an elegant way to acheive this:
Data
ID Ver recID (loads more columns of stuff)
1 1 1
2 2 1
3 3 1
4 1 2
5 1 3
6 2 3
So, we have ID as the Primary Key, the Ver as the version and recID as a record ID (an arbitary base ID to tie all the versions together).
So I'd like to select from the following data, rows 3, 4 and 6. i.e. the highest version for a given record ID.
Is there a way to do this with one SQL query? Or would I need to do a SELECT DISTINCT on the record ID, then a seperate query to get the highest value? Or pull the lot into the application and filter from there?
A GROUP BY would be sufficient to get each maximum version for every recID.
SELECT Ver = MAX(Ver), recID
FROM YourTable
GROUP BY
recID
If you also need the corresponding ID, you can wrap this into a subselect
SELECT yt.*
FROM Yourtable yt
INNER JOIN (
SELECT Ver = MAX(Ver), recID
FROM YourTable
GROUP BY
recID
) ytm ON ytm.Ver = yt.Ver AND ytm.recID = yt.RecID
or, depending on the SQL Server version you are using, use ROW_NUMBER
SELECT *
FROM (
SELECT ID, Ver, recID
, rn = ROW_NUMBER() OVER (PARTITION BY recID ORDER BY Ver DESC)
FROM YourTable
) yt
WHERE yt.rn = 1
Getting maximum ver for a given recID is easy. To get the ID, you need to join on a nested query that gets these maximums:
select ID, ver, recID from table x
inner join
(select max(ver) as ver, recID
from table
group by recID) y
on x.ver = y.ver and x.recID = y.recID
You could use a cte with ROW_NUMBER function:
WITH cte AS(
SELECT ID, Ver, recID
, ROW_NUMBER()OVER(PARTITION BY recID ORDER BY Ver DESC)as RowNum
FROM data
)
SELECT ID,Ver,recID FROM cte
WHERE RowNum = 1
straighforward example using a subquery:
SELECT a.*
FROM tab a
WHERE ver = (
SELECT max(ver)
FROM tab b
WHERE b.recId = a.recId
)
(Note: this assumes that the combination of (recId, ver) is unique. Typically there would be a primary key or unique constraint on those columns, in that order, and then that index can be used to optimize this query)
This works on almost all RDBMS-es, although the correlated subquery might not be handled very efficiently (depending on RDBMS). SHould work fine in MS SQL 2008 though.