PostgreSQL Selecting Most Recent Entry for a Given ID - sql

The table essentially looks like:
Serial-ID, ID, Date, Data, Data, Data, etc.
There can be multiple rows for the same ID. I'd like to create a view of this table, for use in reports, that shows only the most recent entry for each ID. It should show all of the columns.
Can someone help me with the SQL SELECT? Thanks.

There are about five different ways to do this, but here's one:
SELECT *
FROM yourTable AS T1
WHERE NOT EXISTS(
SELECT *
FROM yourTable AS T2
WHERE T2.ID = T1.ID AND T2.Date > T1.Date
)
And here's another:
SELECT T1.*
FROM yourTable AS T1
LEFT JOIN yourTable AS T2 ON
(
T2.ID = T1.ID
AND T2.Date > T1.Date
)
WHERE T2.ID IS NULL
One more:
WITH T AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY Date DESC) AS rn
FROM yourTable
)
SELECT * FROM T WHERE rn = 1
OK, I'm getting carried away; here's the last one I'll post (for now):
WITH T AS (
SELECT ID, MAX(Date) AS latest_date
FROM yourTable
GROUP BY ID
)
SELECT yourTable.*
FROM yourTable
JOIN T ON T.ID = yourTable.ID AND T.latest_date = yourTable.Date
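To check that these variants agree, here is a minimal sketch that runs three of them against an in-memory SQLite database from Python. The sample rows and the SerialID column name are made up for illustration, and SQLite stands in for PostgreSQL only because it ships with Python. The ROW_NUMBER() variant is left out so the sketch also runs on older SQLite builds.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (SerialID INTEGER PRIMARY KEY, ID INTEGER, Date TEXT, Data TEXT);
    INSERT INTO yourTable (ID, Date, Data) VALUES
        (1, '2018-01-01', 'old'),
        (1, '2018-03-01', 'new'),
        (2, '2018-02-01', 'only');
""")

queries = {
    "not_exists": """
        SELECT * FROM yourTable AS T1
        WHERE NOT EXISTS (
            SELECT * FROM yourTable AS T2
            WHERE T2.ID = T1.ID AND T2.Date > T1.Date)
    """,
    "left_join": """
        SELECT T1.* FROM yourTable AS T1
        LEFT JOIN yourTable AS T2
          ON T2.ID = T1.ID AND T2.Date > T1.Date
        WHERE T2.ID IS NULL
    """,
    "max_join": """
        WITH T AS (SELECT ID, MAX(Date) AS latest_date FROM yourTable GROUP BY ID)
        SELECT yourTable.* FROM yourTable
        JOIN T ON T.ID = yourTable.ID AND T.latest_date = yourTable.Date
    """,
}

# Sort each result so the comparison does not depend on row order.
results = {name: sorted(conn.execute(sql).fetchall()) for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```

With this data, each query should return the 'new' row for ID 1 and the single row for ID 2.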

I would use DISTINCT ON
CREATE VIEW your_view AS
SELECT DISTINCT ON (id) *
FROM your_table a
ORDER BY id, date DESC;
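This works because DISTINCT ON keeps only the first row of each set of rows that share the same value for the expression in parentheses, and ORDER BY determines which row comes first. DESC in the ORDER BY means the row with the latest date, which would normally sort last, sorts first and is therefore the one that shows in the result.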
https://www.postgresql.org/docs/10/static/sql-select.html#SQL-DISTINCT
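DISTINCT ON is PostgreSQL-specific, so as a rough illustration of what it does, here is a small Python sketch of the same "sort, then keep the first row per id" logic. The rows are hypothetical, and this only mimics the semantics; it is not how PostgreSQL implements it.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows: (id, date, data). ISO date strings sort chronologically.
rows = [
    (1, "2018-01-01", "old"),
    (2, "2018-02-01", "only"),
    (1, "2018-03-01", "new"),
]

# ORDER BY id, date DESC: sort by date descending, then stable-sort by id.
ordered = sorted(rows, key=itemgetter(1), reverse=True)
ordered.sort(key=itemgetter(0))

# DISTINCT ON (id): keep only the first row of each run of equal ids.
latest = [next(group) for _, group in groupby(ordered, key=itemgetter(0))]
print(latest)
```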

This seems like a good use for correlated subqueries:
CREATE VIEW your_view AS
SELECT *
FROM your_table a
WHERE date = (
SELECT MAX(date)
FROM your_table b
WHERE b.id = a.id
)
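Your date column would need to uniquely identify each row within an ID (a TIMESTAMP type makes that more likely); if two rows for the same ID share the maximum date, both are returned.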
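To see why uniqueness of the date matters, here is a minimal sketch using SQLite from Python. The data is made up, and serial_id is an assumed name for the Serial-ID column from the question. The second query is one possible tie-breaker, not the only one; LIMIT is SQLite/PostgreSQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (serial_id INTEGER PRIMARY KEY, id INTEGER, date TEXT, data TEXT);
    INSERT INTO your_table (id, date, data) VALUES
        (1, '2018-03-01', 'first'),
        (1, '2018-03-01', 'second'),
        (2, '2018-02-01', 'only');
""")

# MAX(date) alone: both rows for id 1 share the maximum date, so both come back.
by_date = conn.execute("""
    SELECT * FROM your_table a
    WHERE date = (SELECT MAX(date) FROM your_table b WHERE b.id = a.id)
    ORDER BY serial_id
""").fetchall()

# Tie-breaker: pick exactly one serial_id per id, preferring the latest date,
# then the highest serial_id.
by_serial = conn.execute("""
    SELECT * FROM your_table a
    WHERE serial_id = (
        SELECT b.serial_id FROM your_table b
        WHERE b.id = a.id
        ORDER BY b.date DESC, b.serial_id DESC
        LIMIT 1)
    ORDER BY serial_id
""").fetchall()

print(len(by_date), len(by_serial))
```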

Related

Can I nest a select statement within an IF function in SQL?

Using Teradata.
I want to write a query that joins table 1 and table 2 on item code to the location in table 2.
There are multiple locations per item code and potentially multiple item code entries per location depending on date. I'm only interested in the most recent item per location. To achieve this I've used a nested query to select the max date per both location and item number. I'm still returning more rows of data than anticipated and suspect it is due to some duplicate locations slipping through, potentially with two different item numbers.
I'm wondering if it's possible to use the IF operator to say, "If there are duplicate locations, choose the location with the more recent date."
Is this possible?
Here is what I have written so far:
SELECT t1.item_no, t1.date, t2.location, t2.date
FROM table1 t1
JOIN table2 t2 ON t1.item_no = t2.item_no
WHERE (t1.item_no, t1.date) IN
(
SELECT item_no, MAX(date)
FROM table1
GROUP BY item_no
)
AND (t2.location, t2.date) IN
(
SELECT location, MAX(date)
FROM table2
GROUP BY location
)
Change your query to join two subqueries instead:
SELECT t1.item_no, t1.date, t2.location, t2.date FROM
(
SELECT item_no, MAX(date) AS date
FROM table1
GROUP BY item_no
) t1
JOIN
(
SELECT item_no, location, MAX(date) AS date
FROM table2
GROUP BY item_no, location
) t2
ON t1.item_no = t2.item_no
Without knowing the DBMS, one solution could be to use ROW_NUMBER(). I'm not sure whether there's a preference for nested queries over, say, CTEs, but a solution with CTEs could be:
WITH items AS (
SELECT
item_no
,date AS item_date
,row_number() OVER (PARTITION BY item_no ORDER BY date desc) as rn
FROM table1
),
locations AS (
SELECT
location
,item_no
,date AS location_date
,ROW_NUMBER() OVER(PARTITION BY item_no, location ORDER BY date desc) as rn
from table2
)
SELECT
t1.item_no
,t1.item_date
,t2.location
,t2.location_date
FROM items AS t1
JOIN locations AS t2 on t1.item_no = t2.item_no
AND t2.rn = 1
WHERE t1.rn = 1
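One thing worth checking is the PARTITION BY list. The sketch below (SQLite from Python, made-up rows, requires SQLite 3.25+ for window functions) compares partitioning by item_no and location with partitioning by location alone. If the goal is one row per location, the second form may be closer to what the question describes; that reading of the question is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (item_no TEXT, location TEXT, date TEXT);
    INSERT INTO table2 VALUES
        ('A', 'loc1', '2020-01-01'),
        ('B', 'loc1', '2020-02-01'),
        ('A', 'loc2', '2020-01-15');
""")

def latest(partition_cols):
    """Return the newest row of each partition as (location, item_no, date)."""
    sql = f"""
        SELECT location, item_no, date FROM (
            SELECT t.*,
                   ROW_NUMBER() OVER (PARTITION BY {partition_cols}
                                      ORDER BY date DESC) AS rn
            FROM table2 t)
        WHERE rn = 1
        ORDER BY location, item_no
    """
    return conn.execute(sql).fetchall()

per_item_and_location = latest("item_no, location")  # loc1 appears twice
per_location = latest("location")                    # one row per location
print(per_item_and_location)
print(per_location)
```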

How to get Full Record with MAX as aggregate function

I have a table with the schema (id, date, value, source, ticker). I want to get the record with the highest ID for each date in SQL Server.
Example Data
ID|date|value|source|ticker
3|10-Dec-2017|10|a|b
1|10-Dec-2017|11|p|q
The query below works in SQLite. Can I do the same in SQL Server?
select max(id), date, value, source, ticker from table group by date
Expected result:
ID|date|value|source|ticker
3|10-Dec-2017|10|a|b
Also, how can I do the same operation on a UNION of two tables with the same schema?
You can use a subquery:
select t.*
from table t
where id = (select max(t1.id) from table t1 where t1.date = t.date);
However, you can also use the row_number() function:
select top (1) with ties *
from table t
order by row_number() over (partition by [date] order by id desc);
You can also do it like this:
select t1.* from table1 t1
join (
select max(id) as id, [date] from table1
group by [date]
) as t2 on t1.id = t2.id
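For the UNION part of the question, one possible approach is to combine the two tables in a CTE first and then apply the same "highest id per date" filter to the combined set. The sketch below tries this in SQLite from Python; the second table's rows are made up, and it assumes ids do not collide across the two tables. In SQL Server the date column would likely need brackets, as in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, date TEXT, value INTEGER, source TEXT, ticker TEXT);
    CREATE TABLE table2 (id INTEGER, date TEXT, value INTEGER, source TEXT, ticker TEXT);
    INSERT INTO table1 VALUES (3, '10-Dec-2017', 10, 'a', 'b'), (1, '10-Dec-2017', 11, 'p', 'q');
    INSERT INTO table2 VALUES (2, '10-Dec-2017', 12, 'x', 'y'), (5, '11-Dec-2017', 13, 'm', 'n');
""")

rows = conn.execute("""
    WITH u AS (
        SELECT * FROM table1
        UNION ALL
        SELECT * FROM table2
    )
    SELECT * FROM u
    WHERE id = (SELECT MAX(u2.id) FROM u AS u2 WHERE u2.date = u.date)
    ORDER BY id
""").fetchall()
print(rows)
```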

SQL - Latest record based on time-stamp and ID

I've rewritten my whole question after getting a lot of complaints about posting images. I've also added code that is closer to my situation. My apologies, I am new to SO; I'll try to make this as easy as possible for you.
I use IBM DB2.
I have a query that selects a lot of records (messages), each of which always has an ID (which is supposed to be unique), a status (error, completed), and a timestamp. My query is the following:
select *
from tableone tr, tabletwo ms
where ms.TS BETWEEN '2017-09-15 00:00:00.000' and '2017-09-16 00:00:00.000'
and ms.ID=tr.ID
and ms.STATUS in ('ERROR','COMPLETED')
ORDER by tr.ID
The ID is unique to one message, but a message can get multiple statuses at different timestamps, which results in multiple records in the output of the query above.
I want only one record per message, showing its latest status.
I hope you guys and gals can help. Thanks in advance.
Postgres, Oracle, SQL Server:
with CTE as
(
select t1.*, row_number() over(partition by t1.ID order by t1.Timestamp desc) rn
from MyTable t1
where t1.STATUS in ('ERROR','COMPLETED')
)
select *
from CTE
where rn = 1
MySQL
select t1.*
from MyTable t1
inner join
(
select t2.ID, max(t2.Timestamp) as MaxT
from MyTable t2
where t2.STATUS in ('ERROR','COMPLETED')
group by t2.ID
) x3
on x3.ID = t1.ID
and x3.MaxT = t1.Timestamp
where t1.STATUS in ('ERROR','COMPLETED')
Try this:
select *
from table_name a
where a.STATUS in ('ERROR','COMPLETED')
and a.TimeStamp = (select max(b.TimeStamp)
from table_name b
where a.ID=b.ID)
ORDER by a.ID
or
select *
from table_name a
where a.STATUS in ('ERROR','COMPLETED')
and a.TimeStamp = (select Top(1) b.TimeStamp
from table_name b
where a.ID=b.ID
order by b.TimeStamp desc)
ORDER by a.ID
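With the correlated-subquery form, it may matter whether the status filter is also applied inside the subquery. The sketch below (SQLite from Python, made-up rows, and 'PENDING' as a hypothetical third status) shows the difference: without the inner filter, a message whose newest row has some other status drops out of the result entirely. Which behavior is wanted depends on the requirements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (ID INTEGER, STATUS TEXT, TimeStamp TEXT);
    INSERT INTO table_name VALUES
        (1, 'ERROR',     '2017-09-15 08:00:00'),
        (1, 'COMPLETED', '2017-09-15 09:00:00'),
        (2, 'COMPLETED', '2017-09-15 08:30:00'),
        (2, 'PENDING',   '2017-09-15 10:00:00');
""")

# Subquery looks at all statuses: message 2's newest row is PENDING,
# which the outer filter rejects, so message 2 disappears.
unfiltered = conn.execute("""
    SELECT a.ID, a.STATUS FROM table_name a
    WHERE a.STATUS IN ('ERROR', 'COMPLETED')
      AND a.TimeStamp = (SELECT MAX(b.TimeStamp) FROM table_name b
                         WHERE b.ID = a.ID)
    ORDER BY a.ID
""").fetchall()

# Subquery restricted to the same statuses: newest ERROR/COMPLETED row per message.
filtered = conn.execute("""
    SELECT a.ID, a.STATUS FROM table_name a
    WHERE a.STATUS IN ('ERROR', 'COMPLETED')
      AND a.TimeStamp = (SELECT MAX(b.TimeStamp) FROM table_name b
                         WHERE b.ID = a.ID
                           AND b.STATUS IN ('ERROR', 'COMPLETED'))
    ORDER BY a.ID
""").fetchall()

print(unfiltered)
print(filtered)
```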
Here is the code based on your values.
I took the last two characters of your ID (using REVERSE and SUBSTRING) to get only the numbers. I then applied ROW_NUMBER() partitioned by the new ID, sorted by timestamp descending, to get the newest row.
With pretty ID:
select * from (
select *,ROW_NUMBER() over(partition by TrueID order by timestamp DESC) as RN
from (
SELECT REVERSE(substring(reverse([ID]),1,2)) as TrueID
,[Status]
,[Timestamp]
FROM [LegOgSpass].[dbo].[statustable])x
)z where RN= 1
With original ID:
select * from (
select *,ROW_NUMBER() over(partition by ID order by timestamp DESC) as RN
from (
SELECT ID
,[Status]
,[Timestamp]
FROM [LegOgSpass].[dbo].[statustable])x
)z where RN= 1

Select duplicated data from table

Query
select * from table1
where having count(reference)>1
I want to select all columns of the rows that have duplicate data. Any idea why my query is not working?
Below is my expected result.
You can make use of window function count to find number of rows per id and reference and then filter to get those which have count more than 1.
;with cte as (
select t.*, count(*) over (partition by id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
In the above solution, I have assumed that name and id have a one-to-one correspondence (which is true for your given data). If that's not the case, add name to the partition by clause as well:
;with cte as (
select t.*, count(*) over (partition by name, id, reference) cnt
from table1 t
)
select * from cte where cnt > 1;
I might actually approach this by using a subquery with GROUP BY:
SELECT t1.*
FROM table1 t1
INNER JOIN
(
SELECT Name, ID, reference
FROM table1
GROUP BY Name, ID, reference
HAVING COUNT(*) > 1
) t2
ON t1.Name = t2.Name AND
t1.ID = t2.ID AND
t1.reference = t2.reference
Try this: first I get the count per partition, then I keep the rows with a count greater than 1.
select No, Name, ID, Reference
from (select count(*) over (partition by name, ID, reference) cnt, table1.* from table1) t
where cnt>1
The easy way (although maybe not the best for performance) would be:
select * from table1 where reference in (
select reference from table1 group by reference having count(*)>1
)
In the subselect you get the duplicated references, and in the outer select you get all the data for those references.
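The original query fails because HAVING filters groups produced by GROUP BY; it cannot follow WHERE directly. The two grouped approaches above also differ in what counts as a duplicate. Here is a minimal sketch with SQLite from Python, using made-up rows and only the name, id, and reference columns: the IN version matches on reference alone, while the join version requires name, id, and reference to all repeat.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT, id INTEGER, reference TEXT);
    INSERT INTO table1 VALUES
        ('ann', 1, 'r1'),
        ('ann', 1, 'r1'),
        ('cat', 3, 'r1'),
        ('bob', 2, 'r2');
""")

# Duplicate means "reference occurs more than once", regardless of name/id.
in_version = sorted(conn.execute("""
    SELECT * FROM table1 WHERE reference IN (
        SELECT reference FROM table1 GROUP BY reference HAVING COUNT(*) > 1)
""").fetchall())

# Duplicate means "the whole (name, id, reference) combination repeats".
join_version = sorted(conn.execute("""
    SELECT t1.* FROM table1 t1
    INNER JOIN (
        SELECT name, id, reference FROM table1
        GROUP BY name, id, reference
        HAVING COUNT(*) > 1
    ) t2 ON t1.name = t2.name AND t1.id = t2.id AND t1.reference = t2.reference
""").fetchall())

print(in_version)
print(join_version)
```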

reuse table alias in another select

I have a sql statement:
select id from table1 t1, table t2
where.....
order by ( select count(owner_id) from t2) ASC;
What I want to do here is select the id of the item whose owner has the fewest items.
Is this possible? If not, what can I do to achieve the goal?
Thanks in advance!
You don't mention what SQL you're using, but you can do this, or something similar, in PL (and My, I believe). I'm assuming you're linking table 1 and table 2 on id. I haven't ordered by count(owner_id) alone, as that would always be the same value. Partition by whatever gives you the count you're after.
select id
from ( select t1.id, t2.ct
from table1 t1
, ( select id, count(owner_id) over ( partition by id ) as ct
from table2 ) t2
where t1.id = t2.id
order by t2.ct ASC )
;
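An alternative that avoids window functions is to count items per owner in a grouped subquery and join it back. The sketch below runs in SQLite from Python; the items table, its columns, and the rows are all hypothetical, since the question does not give a schema. LIMIT is SQLite/MySQL/PostgreSQL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: one row per item, with the id of its owner.
    CREATE TABLE items (id INTEGER PRIMARY KEY, owner_id TEXT);
    INSERT INTO items VALUES (1, 'x'), (2, 'x'), (3, 'y');
""")

# Join each item to its owner's item count, then order by that count.
ordered_ids = [row[0] for row in conn.execute("""
    SELECT i.id
    FROM items i
    JOIN (SELECT owner_id, COUNT(*) AS ct
          FROM items
          GROUP BY owner_id) c ON c.owner_id = i.owner_id
    ORDER BY c.ct ASC, i.id ASC
""")]

# The first id belongs to the owner with the fewest items.
least_owned_item = ordered_ids[0]
print(ordered_ids, least_owned_item)
```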