SQL Select highest values from table on two (or more) columns - sql

not sure if there's an elegant way to acheive this:
Data
ID Ver recID (loads more columns of stuff)
1 1 1
2 2 1
3 3 1
4 1 2
5 1 3
6 2 3
So, we have ID as the Primary Key, the Ver as the version and recID as a record ID (an arbitary base ID to tie all the versions together).
So I'd like to select from the following data, rows 3, 4 and 6. i.e. the highest version for a given record ID.
Is there a way to do this with one SQL query? Or would I need to do a SELECT DISTINCT on the record ID, then a seperate query to get the highest value? Or pull the lot into the application and filter from there?

A GROUP BY would be sufficient to get each maximum version for every recID.
SELECT Ver = MAX(Ver), recID
FROM YourTable
GROUP BY
recID
If you also need the corresponding ID, you can wrap this into a subselect
SELECT yt.*
FROM Yourtable yt
INNER JOIN (
SELECT Ver = MAX(Ver), recID
FROM YourTable
GROUP BY
recID
) ytm ON ytm.Ver = yt.Ver AND ytm.recID = yt.RecID
or, depending on the SQL Server version you are using, use ROW_NUMBER
SELECT *
FROM (
SELECT ID, Ver, recID
, rn = ROW_NUMBER() OVER (PARTITION BY recID ORDER BY Ver DESC)
FROM YourTable
) yt
WHERE yt.rn = 1

Getting maximum ver for a given recID is easy. To get the ID, you need to join on a nested query that gets these maximums:
select ID, ver, recID from table x
inner join
(select max(ver) as ver, recID
from table
group by recID) y
on x.ver = y.ver and x.recID = y.recID

You could use a cte with ROW_NUMBER function:
WITH cte AS(
SELECT ID, Ver, recID
, ROW_NUMBER()OVER(PARTITION BY recID ORDER BY Ver DESC)as RowNum
FROM data
)
SELECT ID,Ver,recID FROM cte
WHERE RowNum = 1

straighforward example using a subquery:
SELECT a.*
FROM tab a
WHERE ver = (
SELECT max(ver)
FROM tab b
WHERE b.recId = a.recId
)
(Note: this assumes that the combination of (recId, ver) is unique. Typically there would be a primary key or unique constraint on those columns, in that order, and then that index can be used to optimize this query)
This works on almost all RDBMS-es, although the correlated subquery might not be handled very efficiently (depending on RDBMS). SHould work fine in MS SQL 2008 though.

Related

SQL Select Rows with Highest Version

I am trying to query the following table:
ID
ConsentTitle
ConsentIdentifier
Version
DisplayOrder
1
FooTitle
foo1
1
1
2
FooTitle 2
foo1
2
2
3
Bar Title
bar1
1
3
4
Bar Title 2
bar1
2
4
My table has entries with unique ConsentTemplateIdentifier. I want to bring back only the rows with the highest version number for that particular unique Identifier...
ID
ConsentTitle
ConsentIdentifier
Version
DisplayOrder
2
FooTitle 2
foo1
2
2
4
Bar Title 2
bar1
2
4
My current query doesn't seem to work. It is telling me:
Column 'ConsentTemplates.ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
Select Distinct ID,ConsentTitle, DisplayOrder, ConsentTemplateIdentifier, MAX(Version) as Version
from FooDocs
group by ConsentTemplateIdentifier
How do I select the rows distinctly which have the highest Version number for their respective ConsentTemplateIdentifiers ordered by their display order?
Any help would be really appreciated. I am using SQL Server.
You can do this with CROSS APPLY.
SELECT DISTINCT ca.*
FROM FooDocs fd
CROSS APPLY (SELECT TOP 1 *
FROM FooDocs
WHERE ConsentIdentifier = fd.ConsentIdentifier
ORDER BY Version DESC) ca
If your unique identifier has it's own table.
SELECT ca.*
FROM ConsentTable ct
CROSS APPLY (SELECT TOP 1 *
FROM FooDocs
WHERE ConsentIdentifier = ct.Identifier
ORDER BY Version DESC) ca
Using CROSS APPLY works, but effectively invokes the sub-query for every row in in your table, then expend effort to de-duplicate the results with DISTINCT, resulting in a semi-cartesian-product / triangular-join.
It is usually much more efficient just to use ROW_NUMBER() and avoid the implicit join all together...
WITH
sorted_by_version AS
(
SELECT
*,
ROW_NUMBER()
OVER (
PARTITION BY ConsentTemplateIdentifier
ORDER BY version DESC
)
AS version_ordinal
FROM
ConsentTemplates
)
SELECT
*
FROM
sorted_by_version
WHERE
version_ordinal = 1
ORDER BY
DisplayOrder
With a slight modification to Derrick's answer, I was able to get the data back the way I wanted to see it using CROSS APPLY
SELECT DISTINCT ca.*
FROM ConsentTemplates fd
CROSS APPLY (SELECT TOP 1 *
FROM ConsentTemplates
WHERE ConsentTemplateIdentifier = fd.ConsentTemplateIdentifier
ORDER BY Version DESC) ca
order by DisplayOrder

How to find Max value in a column in SQL Server 2012

I want to find the max value in a column
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
2 1 10 P2
3 2 50 P2
4 2 80 P1
Above is my table structure. I just want to find the max total value only from the table. In that four row ID 1 and 2 have same value in CName but total val and PName has different values. What I am expecting is have to find the max value in ID 1 and 2
Expected result:
ID CName Tot_Val PName
--------------------------------
1 1 100 P1
4 2 80 P1
I need result same as like mention above
select Max(Tot_Val), CName
from table1
where PName in ('P1', 'P2')
group by CName
This is query I have tried but my problem is that I am not able to bring PName in this table. If I add PName in the select list means it will showing the rows doubled e.g. Result is 100 rows but when I add PName in selected list and group by list it showing 600 rows. That is the problem.
Can someone please help me to resolve this.
One possible option is to use a subquery. Give each row a number within each CName group ordered by Tot_Val. Then select the rows with a row number equal to one.
select x.*
from ( select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt ) x
where x.No = 1;
An alternative would be to use a common table expression (CTE) instead of a subquery to isolate the first result set.
with x as
(
select mt.ID,
mt.CName,
mt.Tot_Val,
mt.PName,
row_number() over(partition by mt.CName order by mt.Tot_Val desc) as No
from MyTable mt
)
select x.*
from x
where x.No = 1;
See both solutions in action in this fiddle.
You can search top-n-per-group for this kind of a query.
There are two common ways to do it. The most efficient method depends on your indexes and data distribution and whether you already have another table with the list of all CName values.
Using ROW_NUMBER
WITH
CTE
AS
(
SELECT
ID, CName, Tot_Val, PName,
ROW_NUMBER() OVER (PARTITION BY CName ORDER BY Tot_Val DESC) AS rn
FROM table1
)
SELECT
ID, CName, Tot_Val, PName
FROM CTE
WHERE rn=1
;
Using CROSS APPLY
WITH
CTE
AS
(
SELECT CName
FROM table1
GROUP BY CName
)
SELECT
A.ID
,A.CName
,A.Tot_Val
,A.PName
FROM
CTE
CROSS APPLY
(
SELECT TOP(1)
table1.ID
,table1.CName
,table1.Tot_Val
,table1.PName
FROM table1
WHERE
table1.CName = CTE.CName
ORDER BY
table1.Tot_Val DESC
) AS A
;
See a very detailed answer on dba.se Retrieving n rows per group
, or here Get top 1 row of each group
.
CROSS APPLY might be as fast as a correlated subquery, but this often has very good performance (and better than ROW_NUMBER():
select t.*
from t
where t.tot_val = (select max(t2.tot_val)
from t t2
where t2.cname = t.cname
);
Note: The performance depends on having an index on (cname, tot_val).

Query historized data

To describe my query problem, the following data is helpful:
A single table contains the columns ID (int), VAL (varchar) and ORD (int)
The values of VAL may change over time by which older items identified by ID won't get updated but appended. The last valid item for ID is identified by the highest ORD value (increases over time).
T0, T1 and T2 are points in time where data got entered.
How do I get in an efficient manner to the Result set?
A solution must not involve materialized views etc. but should be expressible in a single SQL-query. Using Postgresql 9.3.
The correct way to select groupwise maximum in postgres is using DISTINCT ON
SELECT DISTINCT ON (id) sysid, id, val, ord
FROM my_table
ORDER BY id,ord DESC;
Fiddle
You want all records for which no newer record exists:
select *
from mytable
where not exists
(
select *
from mytable newer
where newer.id = mytable.id
and newer.ord > mytable.ord
)
order by id;
You can do the same with row numbers. Give the latest entry per ID the number 1 and keep these:
select sysid, id, val, ord
from
(
select
sysid, id, val, ord,
row_number() over (partition by id order by ord desc) as rn
from mytable
)
where rn = 1
order by id;
Left join the table (A) against itself (B) on the condition that B is more recent than A. Pick only the rows where B does not exist (i.e. A is the most recent row).
SELECT last_value.*
FROM my_table AS last_value
LEFT JOIN my_table
ON my_table.id = last_value.id
AND my_table.ord > last_value.ord
WHERE my_table.id IS NULL;
SQL Fiddle

SQL query to select distinct row with minimum value

I want an SQL statement to get the row with a minimum value.
Consider this table:
id game point
1 x 5
1 z 4
2 y 6
3 x 2
3 y 5
3 z 8
How do I select the ids that have the minimum value in the point column, grouped by game? Like the following:
id game point
1 z 4
2 y 5
3 x 2
Use:
SELECT tbl.*
FROM TableName tbl
INNER JOIN
(
SELECT Id, MIN(Point) MinPoint
FROM TableName
GROUP BY Id
) tbl1
ON tbl1.id = tbl.id
WHERE tbl1.MinPoint = tbl.Point
This is another way of doing the same thing, which would allow you to do interesting things like select the top 5 winning games, etc.
SELECT *
FROM
(
SELECT ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Point) as RowNum, *
FROM Table
) X
WHERE RowNum = 1
You can now correctly get the actual row that was identified as the one with the lowest score and you can modify the ordering function to use multiple criteria, such as "Show me the earliest game which had the smallest score", etc.
This will work
select * from table
where (id,point) IN (select id,min(point) from table group by id);
As this is tagged with sql only, the following is using ANSI SQL and a window function:
select id, game, point
from (
select id, game, point,
row_number() over (partition by game order by point) as rn
from games
) t
where rn = 1;
Ken Clark's answer didn't work in my case. It might not work in yours either. If not, try this:
SELECT *
from table T
INNER JOIN
(
select id, MIN(point) MinPoint
from table T
group by AccountId
) NewT on T.id = NewT.id and T.point = NewT.MinPoint
ORDER BY game desc
SELECT DISTINCT
FIRST_VALUE(ID) OVER (Partition by Game ORDER BY Point) AS ID,
Game,
FIRST_VALUE(Point) OVER (Partition by Game ORDER BY Point) AS Point
FROM #T
SELECT * from room
INNER JOIN
(
select DISTINCT hotelNo, MIN(price) MinPrice
from room
Group by hotelNo
) NewT
on room.hotelNo = NewT.hotelNo and room.price = NewT.MinPrice;
This alternative approach uses SQL Server's OUTER APPLY clause. This way, it
creates the distinct list of games, and
fetches and outputs the record with the lowest point number for that game.
The OUTER APPLY clause can be imagined as a LEFT JOIN, but with the advantage that you can use values of the main query as parameters in the subquery (here: game).
SELECT colMinPointID
FROM (
SELECT game
FROM table
GROUP BY game
) As rstOuter
OUTER APPLY (
SELECT TOP 1 id As colMinPointID
FROM table As rstInner
WHERE rstInner.game = rstOuter.game
ORDER BY points
) AS rstMinPoints
This is portable - at least between ORACLE and PostgreSQL:
select t.* from table t
where not exists(select 1 from table ti where ti.attr > t.attr);
Most of the answers use an inner query. I am wondering why the following isn't suggested.
select
*
from
table
order by
point
fetch next 1 row only // ... or the appropriate syntax for the particular DB
This query is very simple to write with JPAQueryFactory (a Java Query DSL class).
return new JPAQueryFactory(manager).
selectFrom(QTable.table).
setLockMode(LockModeType.OPTIMISTIC).
orderBy(QTable.table.point.asc()).
fetchFirst();
Try:
select id, game, min(point) from t
group by id

sql query to get earliest date

If I have a table with columns id, name, score, date
and I wanted to run a sql query to get the record where id = 2 with the earliest date in the data set.
Can you do this within the query or do you need to loop after the fact?
I want to get all of the fields of that record..
If you just want the date:
SELECT MIN(date) as EarliestDate
FROM YourTable
WHERE id = 2
If you want all of the information:
SELECT TOP 1 id, name, score, date
FROM YourTable
WHERE id = 2
ORDER BY Date
Prevent loops when you can. Loops often lead to cursors, and cursors are almost never necessary and very often really inefficient.
SELECT TOP 1 ID, Name, Score, [Date]
FROM myTable
WHERE ID = 2
Order BY [Date]
While using TOP or a sub-query both work, I would break the problem into steps:
Find target record
SELECT MIN( date ) AS date, id
FROM myTable
WHERE id = 2
GROUP BY id
Join to get other fields
SELECT mt.id, mt.name, mt.score, mt.date
FROM myTable mt
INNER JOIN
(
SELECT MIN( date ) AS date, id
FROM myTable
WHERE id = 2
GROUP BY id
) x ON x.date = mt.date AND x.id = mt.id
While this solution, using derived tables, is longer, it is:
Easier to test
Self documenting
Extendable
It is easier to test as parts of the query can be run standalone.
It is self documenting as the query directly reflects the requirement
ie the derived table lists the row where id = 2 with the earliest date.
It is extendable as if another condition is required, this can be easily added to the derived table.
Try
select * from dataset
where id = 2
order by date limit 1
Been a while since I did sql, so this might need some tweaking.
Using "limit" and "top" will not work with all SQL servers (for example with Oracle).
You can try a more complex query in pure sql:
select mt1.id, mt1."name", mt1.score, mt1."date" from mytable mt1
where mt1.id=2
and mt1."date"= (select min(mt2."date") from mytable mt2 where mt2.id=2)