SQL - Summarize column with maximum date value and other fields - sql

I have a table with the following fields:
Id|Date|Name
---------------
A|2019-04-24|"VALUE1"
A|2019-04-23|"VALUE2"
A|2019-06-11|"VALUE3"
A|2019-06-12|"VALUE4"
B|2019-05-21|"VALUE5"
B|2019-05-22|"VALUE6"
B|2019-03-13|"VALUE7"
C|2019-01-03|"VALUE8"
I would like to get one row per Id, containing the values from the row with the maximum date. This would be the output:
Id|Date|Name
---------------
A|2019-06-12|"VALUE4"
B|2019-05-22|"VALUE6"
C|2019-01-03|"VALUE8"
I have got as far as a GROUP BY that returns the Id and the MAX Date, but not the value associated with that date.
What I am trying now is to inner join that result with the input table on both date and id, but I have not managed to join on two fields.
Is there any way to bring the value field for the max date into the result of the GROUP BY?
Otherwise, how can I join those two tables on two different fields?
Any suggestions?
Thank you so much!!

You can use a correlated subquery:
select t.*
from table t
where t.date = (select max(t1.date) from table t1 where t1.id = t.id);
However, most DBMSs support analytic (window) functions, so you can also use:
select t.*
from (select t.*, row_number() over (partition by t.id order by t.date desc) as seq
from table t
) t
where seq = 1;
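The question also asks how to join on two fields. As a minimal sketch, the grouped result can be joined back on both id and date. Python's built-in sqlite3 module is used here only as a test harness, the table name `readings` is hypothetical, and the rows are the sample data from the question:

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# The table name "readings" is an assumption made for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id TEXT, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("A", "2019-04-24", "VALUE1"),
        ("A", "2019-04-23", "VALUE2"),
        ("A", "2019-06-11", "VALUE3"),
        ("A", "2019-06-12", "VALUE4"),
        ("B", "2019-05-21", "VALUE5"),
        ("B", "2019-05-22", "VALUE6"),
        ("B", "2019-03-13", "VALUE7"),
        ("C", "2019-01-03", "VALUE8"),
    ],
)

# Join the per-id maximum back to the table on two columns: id AND date.
rows = conn.execute(
    """
    SELECT t.id, t.date, t.name
    FROM readings t
    INNER JOIN (SELECT id, MAX(date) AS max_date
                FROM readings
                GROUP BY id) m
            ON t.id = m.id
           AND t.date = m.max_date
    ORDER BY t.id
    """
).fetchall()

for row in rows:
    print(row)
```

Note that this join, like the correlated subquery, returns every tied row when two rows share the maximum date for an id, whereas the row_number() version returns exactly one row per id.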

Related

select rows in sql with latest date from 3 tables in each group

I'm creating PREDICATE system for my application.
Please see image that I already
My question is how to select the rows with the latest date in the "Taken On" column for each "QuizESId" across three tables. I already understand how to do this with a single table, which I learned from this question:
select rows in sql with latest date for each ID repeated multiple times
Here is what I have already tried:
SELECT tt.*
FROM myTable tt
INNER JOIN
(SELECT ID, MAX(Date) AS MaxDateTime
FROM myTable
GROUP BY ID) groupedtt ON tt.ID = groupedtt.ID
AND tt.Date = groupedtt.MaxDateTime
What confuses me is how to select from 3 tables. I hope you can guide me; I need a solution with an efficient query.
Thanks
This is for SQL Server (you didn't specify exactly what RDBMS you're using):
if you want to get the "latest row for each QuizId" - this sounds like you need a CTE (Common Table Expression) with a ROW_NUMBER() value - something like this (updated: you obviously want to "partition" not just by QuizId, but also by UserName):
WITH BaseData AS
(
SELECT
mAttempt.Id AS Id,
mAttempt.QuizModelId AS QuizId,
mAttempt.StartedAt AS StartsOn,
mUser.UserName,
mDetail.Score AS Score,
RowNum = ROW_NUMBER() OVER (PARTITION BY mAttempt.QuizModelId, mUser.UserName
ORDER BY mAttempt.TakenOn DESC)
FROM
UserQuizAttemptModels mAttempt
INNER JOIN
AspNetUsers mUser ON mAttempt.UserId = mUser.Id
INNER JOIN
QuizAttemptDetailModels mDetail ON mDetail.UserQuizAttemptModelId = mAttempt.Id
)
SELECT *
FROM BaseData
WHERE QuizId = 10053
AND RowNum = 1
The BaseData CTE basically selects the data (as you did) - but it also adds a ROW_NUMBER() column. This will "partition" your data into groups of data - based on the QuizModelId and the UserName - and it will number all the rows inside each data group, starting at 1, and ordered by the second condition - the ORDER BY clause. You said you want to order by "Taken On" date - but there's no such date visible in your query - so I just guessed it might be on the UserQuizAttemptModels table - change and adapt as needed.
Now you can select from that CTE with your original WHERE condition - and you specify that you want only the first row for each data group (for each "QuizId" and user) - the one with the most recent "Taken On" date value.
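The same three-table pattern can be tried out on a small scale. The sketch below uses Python's sqlite3 module (window functions need SQLite 3.25 or later) with simplified, hypothetical tables `users`, `attempts` and `details` standing in for the real ones, and it assumes one detail row per attempt:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-ins for the three tables in the question.
conn.executescript(
    """
    CREATE TABLE users    (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE attempts (id INTEGER PRIMARY KEY, quiz_id INTEGER,
                           user_id INTEGER, taken_on TEXT);
    CREATE TABLE details  (attempt_id INTEGER, score INTEGER);

    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO attempts VALUES
        (1, 10053, 1, '2019-01-01'),
        (2, 10053, 1, '2019-02-01'),
        (3, 10053, 2, '2019-01-15'),
        (4, 20000, 1, '2019-03-01');
    INSERT INTO details VALUES (1, 50), (2, 80), (3, 70), (4, 90);
    """
)

# Number the joined rows per (quiz, user), newest attempt first,
# then keep only row number 1 for the quiz of interest.
rows = conn.execute(
    """
    WITH base AS (
        SELECT a.id AS attempt_id,
               a.quiz_id,
               u.user_name,
               d.score,
               ROW_NUMBER() OVER (PARTITION BY a.quiz_id, u.user_name
                                  ORDER BY a.taken_on DESC) AS row_num
        FROM attempts a
        INNER JOIN users u   ON a.user_id = u.id
        INNER JOIN details d ON d.attempt_id = a.id
    )
    SELECT attempt_id, user_name, score
    FROM base
    WHERE quiz_id = 10053
      AND row_num = 1
    ORDER BY user_name
    """
).fetchall()

for row in rows:
    print(row)
```

If an attempt can have several detail rows, the join multiplies the rows before they are numbered, so the partition or the ordering would need to account for that.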

SQL query for filtering duplicate rows of a column by the minimum DateTime of those corresponding rows

I have a SQL database table, "Helium_Test_Data", that has multiple entries based on the KeyID column (the KeyID represents a single tested part). I need to query the entries and only show one entry per KeyID (part) based on the earliest creation date-time (format example is 2018-12-29 08:22:11.123). This is because the same part was tested several times but the first reading is the one I need to use. Here is the query I have tried so far:
SELECT mt.*
FROM Helium_Test_Data mt
INNER JOIN
(
SELECT
KeyID,
MIN(DateTime) AS DateTime
FROM Helium_Test_Data
WHERE PSNo='11166565'
GROUP BY KeyID
) t ON mt.KeyID = t.KeyID AND mt.DateTime = t.DateTime
WHERE PSNo='11167197'
AND (mt.DateTime > '2018-12-29 07:00')
AND (mt.DateTime < '2018-12-29 18:00') AND OK=1
ORDER BY KeyId,DateTime
It returns only the rows whose KeyID has no duplicates in the table, whereas I need one row for every KeyID (duplicated or not). For the duplicated ones, I need the row with the earliest date.
Thanks in advance for the help.
Use the row_number() window function, which most DBMSs support:
select * from
(
select *,row_number() over(partition by KeyID order by DateTime) rn
from Helium_Test_Data
) t where t.rn=1
Or you could use a correlated subquery:
select t1.* from Helium_Test_Data t1
where t1.DateTime= (select min(DateTime)
from Helium_Test_Data t2
where t2.KeyID=t1.KeyID
)
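As a rough check that the two approaches agree, the sketch below runs both against a tiny in-memory SQLite table from Python. The table `helium_test` and its columns are hypothetical simplifications of Helium_Test_Data, and the data contains no ties on the timestamp:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical simplified table; key 1 was tested twice, key 2 once.
conn.execute(
    "CREATE TABLE helium_test (key_id INTEGER, created_at TEXT, note TEXT)"
)
conn.executemany(
    "INSERT INTO helium_test VALUES (?, ?, ?)",
    [
        (1, "2018-12-29 09:00:00.000", "retest"),
        (1, "2018-12-29 08:22:11.123", "first"),
        (2, "2018-12-29 10:00:00.000", "first"),
    ],
)

# Variant 1: number the rows per key, earliest first, and keep number 1.
by_row_number = conn.execute(
    """
    SELECT key_id, created_at, note
    FROM (SELECT h.*,
                 ROW_NUMBER() OVER (PARTITION BY key_id
                                    ORDER BY created_at) AS rn
          FROM helium_test h) ranked
    WHERE rn = 1
    ORDER BY key_id
    """
).fetchall()

# Variant 2: keep rows whose timestamp equals the minimum for their key.
by_subquery = conn.execute(
    """
    SELECT t1.key_id, t1.created_at, t1.note
    FROM helium_test t1
    WHERE t1.created_at = (SELECT MIN(t2.created_at)
                           FROM helium_test t2
                           WHERE t2.key_id = t1.key_id)
    ORDER BY t1.key_id
    """
).fetchall()

print(by_row_number)
print(by_subquery)
```

Any extra filters, such as a part number or a time window, would have to be applied inside the ranked subquery or the inner MIN query as well, so that the earliest row is chosen from the same set of rows that the outer query keeps.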

Query historized data

To describe my query problem, the following data is helpful:
A single table contains the columns ID (int), VAL (varchar) and ORD (int)
The values of VAL may change over time. When they do, the existing row for an ID is not updated; a new row is appended instead. The latest valid item for an ID is the one with the highest ORD value (ORD increases over time).
T0, T1 and T2 are points in time where data got entered.
How do I get to the result set efficiently?
A solution must not involve materialized views etc. but should be expressible in a single SQL query. Using PostgreSQL 9.3.
The idiomatic way to select a groupwise maximum in PostgreSQL is DISTINCT ON:
SELECT DISTINCT ON (id) sysid, id, val, ord
FROM my_table
ORDER BY id,ord DESC;
You want all records for which no newer record exists:
select *
from mytable
where not exists
(
select *
from mytable newer
where newer.id = mytable.id
and newer.ord > mytable.ord
)
order by id;
You can do the same with row numbers. Give the latest entry per ID the number 1 and keep these:
select sysid, id, val, ord
from
(
select
sysid, id, val, ord,
row_number() over (partition by id order by ord desc) as rn
from mytable
) ranked -- a subquery in FROM needs an alias in PostgreSQL 9.3
where rn = 1
order by id;
Left join the table (A) against itself (B) on the condition that B is more recent than A. Pick only the rows where B does not exist (i.e. A is the most recent row).
SELECT last_value.*
FROM my_table AS last_value
LEFT JOIN my_table
ON my_table.id = last_value.id
AND my_table.ord > last_value.ord
WHERE my_table.id IS NULL;
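To see that the NOT EXISTS form and the LEFT JOIN anti-join pick the same rows, here is a small sketch run through Python's sqlite3 module. The rows are made up for illustration (the question's original data is not reproduced here), and the aliases `cur` and `newer` are arbitrary:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (sysid INTEGER, id INTEGER, val TEXT, ord INTEGER)"
)
# Made-up history: ids 1 and 2 each have an older and a newer row.
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, "a", 1),
        (2, 2, "b", 2),
        (3, 1, "c", 3),
        (4, 3, "d", 4),
        (5, 2, "e", 5),
    ],
)

# Keep rows for which no newer row with the same id exists.
not_exists_rows = conn.execute(
    """
    SELECT cur.sysid, cur.id, cur.val, cur.ord
    FROM my_table cur
    WHERE NOT EXISTS (SELECT 1
                      FROM my_table newer
                      WHERE newer.id = cur.id
                        AND newer.ord > cur.ord)
    ORDER BY cur.id
    """
).fetchall()

# Same idea as an anti-join: unmatched left rows have NULLs on the right.
anti_join_rows = conn.execute(
    """
    SELECT cur.sysid, cur.id, cur.val, cur.ord
    FROM my_table cur
    LEFT JOIN my_table newer
           ON newer.id = cur.id
          AND newer.ord > cur.ord
    WHERE newer.id IS NULL
    ORDER BY cur.id
    """
).fetchall()

print(not_exists_rows)
print(anti_join_rows)
```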

Filter SQL data by repetition on a column

Very simple basic SQL question here.
I have this table:
Row Id __________Hour__Minute__City_Search
1___1409346767__23____24_____Balears (Illes)
2___1409346767__23____13_____Albacete
3___1409345729__23____7______Balears (Illes)
4___1409345729__23____3______Balears (Illes)
5___1409345729__22____56_____Balears (Illes)
What I want to get is only one distinct row by ID and select the last City_Search made by the same Id.
So, in this case, the result would be:
Row Id __________Hour__Minute__City_Search
1___1409346767__23____24_____Balears (Illes)
3___1409345729__23____7______Balears (Illes)
What's the easier way to do it?
Obviously I don't want to delete any data just query it.
Thanks for your time.
SELECT T.Row,
T.Id,
T.Hour,
T.Minute,
T.City_Search
FROM Table T
JOIN
(
SELECT MIN(Row) AS Row,
ID
FROM Table
GROUP BY ID
) AS M
ON M.Row = T.Row
AND M.ID = T.ID
Can you change hour/minute to a timestamp?
What you want in this case is to first select what uniquely identifies your row:
Select id, max(time) from [table] group by id
Then join that result back to the table to pick up the remaining columns:
SELECT tdata.id, tdata.city_search, tdata.time
FROM (SELECT id, max(time) as lasttime FROM [table] GROUP BY id) as tkey
INNER JOIN [table] as tdata
ON tkey.id = tdata.id AND tkey.lasttime = tdata.time
That should do it.
Two options that avoid a join:
Use the row_number() function to find the last row per ID:
Select * FROM
(Select *,
row_number() over(Partition BY ID Order BY Hour desc, Minute Desc) as RNB
from table) ranked
Where RNB=1
Or build a sortable string and take a simple MAX. Hour and Minute must be zero-padded, otherwise minute 7 sorts after minute 24 when compared as text:
Select ID,Right(MAX(Concat(LPAD(Hour,2,'0'),LPAD(Minute,2,'0'),RPAD(City_Search,20,' '))),20)
From Table
Group by ID
Avoiding joins is often faster, but check the execution plan on your own data.
Hope this helps
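For reference, the row-number approach can be checked against the sample rows from the question. This is only a sketch run through Python's sqlite3 module (SQLite 3.25 or later for window functions); the table name `searches` and the column name `row_no` are assumptions chosen to avoid reserved words:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE searches (row_no INTEGER, id INTEGER, hour INTEGER, "
    "minute INTEGER, city_search TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO searches VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1409346767, 23, 24, "Balears (Illes)"),
        (2, 1409346767, 23, 13, "Albacete"),
        (3, 1409345729, 23, 7, "Balears (Illes)"),
        (4, 1409345729, 23, 3, "Balears (Illes)"),
        (5, 1409345729, 22, 56, "Balears (Illes)"),
    ],
)

# Rank each id's searches from latest to earliest and keep the latest.
rows = conn.execute(
    """
    SELECT row_no, id, hour, minute, city_search
    FROM (SELECT s.*,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY hour DESC, minute DESC) AS rnb
          FROM searches s) ranked
    WHERE rnb = 1
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
```

Because hour and minute are compared as integers here, no padding is needed; this returns rows 3 and 1, matching the result the question asks for.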

Date of max id: sql/oracle optimization

What is a more elegant way of doing this:
select date from table where id in (
select max(id) from table);
Surely there is a better way...
You can use the ROWNUM pseudocolumn. The subquery is necessary to order the result before finding the first row:
SELECT date
FROM (SELECT * FROM table ORDER BY id DESC)
WHERE ROWNUM = 1;
You can use subquery factoring in Oracle 9i and later. Rank with ROW_NUMBER() rather than ROWNUM, because ROWNUM is assigned before the ORDER BY is applied:
WITH ranked_table AS (
SELECT ROW_NUMBER() OVER (ORDER BY id DESC) AS rn, date
FROM table
)
SELECT date FROM ranked_table WHERE rn = 1;
You can use a self-join, and find where no row exists with a greater id:
SELECT t1.date
FROM table t1
LEFT OUTER JOIN table t2
ON t1.id < t2.id
WHERE t2.id IS NULL;
Which solution is best depends on the indexes in your table, and the volume and distribution of your data. You should test each solution to determine what works best, is fastest, is most flexible for your needs, etc.
select date from (select date from table order by id desc)
where rownum < 2
assuming your ids are unique.
EDIT: using subquery + rownum