Get records with the newest date in Oracle - sql

I need to find the email addresses of the last person who performed an action on a post. The database structure is a bit complicated, for reasons that are not important here.
SELECT u.address
FROM text t
JOIN post p ON (p.pid=t.pid)
JOIN node n ON (n.nid=p.nid)
JOIN user u ON (t.login=u.login)
WHERE n.nid='123456'
AND p.created IN (
SELECT max(p.created)
FROM text t
JOIN post p ON (p.pid=t.pid)
JOIN node n ON (n.nid=p.nid)
WHERE n.nid='123456');
I would like to know if there is a way to use the max function, or any other approach, to get the latest date without having to write a subquery that is almost the same as the main query.
Thank you very much

You can use a window function (aka "analytical" function) to calculate the max date.
Then you can select all rows where the created date equals the max. date.
select address
from (
SELECT u.address,
p.created,
max(p.created) over () as max_date
FROM text t
JOIN post p ON (p.pid=t.pid)
JOIN node n ON (n.nid=p.nid)
JOIN user u ON (t.login=u.login)
WHERE n.nid='123456'
) t
where created = max_date;
The over() clause is empty because you didn't use a GROUP BY in your question. But if you need, e.g., the max date per address, then you could use
max(p.created) over (partition by u.address) as max_date
The partition by works like a group by.
You can also extend that query to work for more than one n.nid. In that case you have to include it in the partition:
max(p.created) over (partition by n.nid, ....) as max_date
Btw: if n.nid is a numeric column you should not compare it to a string literal. '123456' is a string, 123456 is a number.
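To make the MAX() OVER (PARTITION BY ...) idea concrete, here is a minimal sketch that runs the same pattern against an in-memory SQLite database through Python's sqlite3 module. The table name actions and its sample rows are hypothetical stand-ins for the joined text/post/node/user rows, and SQLite 3.25 or newer is assumed for window-function support; Oracle syntax may differ in details.

```python
import sqlite3

# Hypothetical, flattened stand-in for the joined text/post/node/user rows.
ROWS = [
    # (nid, address, created)
    (123456, "alice@example.com", "2020-01-01"),
    (123456, "bob@example.com", "2020-03-01"),
    (123456, "carol@example.com", "2020-03-01"),  # ties with bob for the newest date
    (999999, "dave@example.com", "2021-06-01"),
]


def latest_addresses(nid=None):
    """Return (nid, address) for rows whose created equals the max of their nid partition."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE actions (nid INTEGER, address TEXT, created TEXT)")
        con.executemany("INSERT INTO actions VALUES (?, ?, ?)", ROWS)
        return con.execute(
            """
            SELECT nid, address
            FROM (
                SELECT nid, address, created,
                       MAX(created) OVER (PARTITION BY nid) AS max_date
                FROM actions
                WHERE :nid IS NULL OR nid = :nid   -- no filter when nid is None
            ) t
            WHERE created = max_date
            ORDER BY nid, address
            """,
            {"nid": nid},
        ).fetchall()
    finally:
        con.close()


print(latest_addresses(123456))  # one node
print(latest_addresses())        # every node, thanks to PARTITION BY nid
```

Because the filter is created = max_date, every row that shares the newest date is returned, so both tied addresses for node 123456 come back.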

SELECT address
FROM (
SELECT u.address,
row_number() OVER (PARTITION BY n.nid ORDER BY p.created DESC) AS rn
FROM text t JOIN post p ON (p.pid=t.pid)
JOIN node n ON (n.nid=p.nid)
JOIN user u ON (t.login=u.login)
WHERE n.nid='123456'
)
WHERE rn = 1;
The ROW_NUMBER function numbers the rows in descending order of p.created, and PARTITION BY n.nid restarts the numbering for each n.nid, so rn = 1 is the newest row of each node.
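The numbering can be inspected directly. The sketch below, again using SQLite through Python's sqlite3 (3.25+ assumed) and a hypothetical actions table in place of the real joins, first lists the row numbers per node and then keeps only rn = 1.

```python
import sqlite3

# Hypothetical sample data: two nodes, two actions each, no tied dates.
ROWS = [
    (123456, "alice@example.com", "2020-01-01"),
    (123456, "bob@example.com", "2020-03-01"),
    (999999, "dave@example.com", "2021-06-01"),
    (999999, "erin@example.com", "2021-07-15"),
]

NUMBERED = """
    SELECT nid, address,
           ROW_NUMBER() OVER (PARTITION BY nid ORDER BY created DESC) AS rn
    FROM actions
"""


def run(sql):
    """Load the sample rows into a fresh in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE actions (nid INTEGER, address TEXT, created TEXT)")
        con.executemany("INSERT INTO actions VALUES (?, ?, ?)", ROWS)
        return con.execute(sql).fetchall()
    finally:
        con.close()


# Every row with its number: the count restarts at 1 for each nid.
numbered = run(NUMBERED + " ORDER BY nid, rn")
# Only the newest row of each node.
newest = run("SELECT nid, address FROM (" + NUMBERED + ") WHERE rn = 1 ORDER BY nid")

print(numbered)
print(newest)
```

Unlike the MAX() OVER approach, rn = 1 matches exactly one row per partition, even if several rows share the newest date.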

Related

SQL - Selecting records by the newest date for each record

I have a table with some columns, amongst which there are two - number, joining_date.
What I want to select is the newest joining date and matching number. I created the following script:
SELECT ac.number, ac.joining_date
FROM accounts ac
INNER JOIN (
SELECT number, MAX(joining_date) as maxDate FROM accounts GROUP BY number
) iac ON ac.number = iac.number AND ac.joining_date = iac.maxDate;
It seems fine; however, I have noticed that when the joining_date is equal (e.g. 2020-04-02 10:17:00.000000) for more than one record, the number appears twice in the result, even though I expected MAX to return only one row.
Question: how do I retrieve only one row per number with the newest joining_date? Does DISTINCT guarantee that?
With the DISTINCT ON clause:
SELECT DISTINCT ON(number) number, joining_date
FROM accounts
ORDER BY number, joining_date DESC
EDIT: I overlooked the postgres tag, rendering my answer not very useful.
Original answer:
There are several options, and I usually use CROSS APPLY:
SELECT iac.number, iac.joining_date
FROM (SELECT DISTINCT number FROM accounts) ac
CROSS APPLY (SELECT TOP 1 a.number, a.joining_date
FROM accounts a
WHERE a.number = ac.number
ORDER BY a.joining_date DESC) iac
Another option is to use Row_Number() OVER (PARTITION BY a.number ORDER BY a.joining_date DESC) as rownum in the subquery, and then filter on WHERE rownum = 1 in the outer query to ensure only one match.
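The difference between joining on MAX and filtering on a row number shows up as soon as two rows tie on the newest date. This is a minimal sketch under assumptions: it uses SQLite via Python's sqlite3 (3.25+ for window functions) rather than PostgreSQL or SQL Server, and the sample rows are made up.

```python
import sqlite3

# Hypothetical accounts rows; number 1 has two records with the same newest date.
ROWS = [
    (1, "2020-04-02 10:17:00"),
    (1, "2020-04-02 10:17:00"),
    (1, "2020-01-01 00:00:00"),
    (2, "2019-05-05 09:00:00"),
]

MAX_JOIN = """
    SELECT ac.number, ac.joining_date
    FROM accounts ac
    JOIN (SELECT number, MAX(joining_date) AS max_date
          FROM accounts GROUP BY number) iac
      ON ac.number = iac.number AND ac.joining_date = iac.max_date
    ORDER BY ac.number
"""

ROW_NUMBER = """
    SELECT number, joining_date
    FROM (SELECT number, joining_date,
                 ROW_NUMBER() OVER (PARTITION BY number
                                    ORDER BY joining_date DESC) AS rownum
          FROM accounts)
    WHERE rownum = 1
    ORDER BY number
"""


def run(sql):
    """Load the sample rows into a fresh in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE accounts (number INTEGER, joining_date TEXT)")
        con.executemany("INSERT INTO accounts VALUES (?, ?)", ROWS)
        return con.execute(sql).fetchall()
    finally:
        con.close()


with_ties = run(MAX_JOIN)      # both tied rows match the max, so number 1 appears twice
one_per_key = run(ROW_NUMBER)  # exactly one row per number

print(with_ties)
print(one_per_key)
```

MAX itself returns one value per group; the duplicates come from the join matching every row that carries that value. If the tied rows differ in other columns, adding a tiebreaker column to the ORDER BY makes the choice deterministic.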

SQL Server : get latest date from 2 tables

I have two tables P and G and want to write a query that will get the latest date from table G and will not pull in duplicate client IDs:
Table P
Table G
I want to get this result from the query:
So far I have joined the tables, but I am unable to get the intended result.
Any help would be appreciated.
I'm not sure how your tables are related other than through the ClientID column, but you would want to join the two tables on that column:
select p.clientid,
max(g.created_on) latest_created_on,
max(p.info) as info
from tableP p
left join tableG g on p.ClientID = g.ClientID
group by p.clientid;
SQL Fiddle Demo
You can use OVER PARTITION to take the record with the most recent date for each ClientID.
In this case, I would write:
SELECT g.ClientID,
g.created_on,
g.INFO
FROM (
SELECT ClientID,
created_on,
INFO,
row_number() OVER ( PARTITION BY ClientID ORDER BY created_on DESC) AS RowNum
FROM Table_G
) AS g
WHERE g.RowNum = 1
The subquery creates a table with all the columns you want, and the row_number() function assigns each record a row_number. PARTITION BY says what to group by, and ORDER BY says how to sort within that partition.
In this case, you want the record with the most recent date for each ClientID. We group by ClientID, sort by date to assign row numbers, and then in the main query, we select only the first row in each group, using WHERE g.RowNum = 1
This is a guide for PostgreSQL, but it helped me understand OVER PARTITION.

Column is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

I'm trying to select the latest date and group by name and keep other columns.
For example:
name   status   date
-------------------------
a      l        13/19/04
a      n        13/09/05
a      dd       13/18/03
b      l        13/01/01
b      dd       13/01/02
b      n        13/01/03
and I want the result like:
name   status   date
-------------------------
a      n        13/09/05
b      n        13/01/03
Here's my code
SELECT
Name,
MAX(DATE) as Date,
Status
FROM
[ST].[dbo].[PS_RC_STATUS_TBL]
GROUP BY
Name
I know that I should put max(status), because there are a lot of possibilities in each case and nothing in the query makes it clear which value to choose for status in each group. Is there any way to use an inner join?
It's not clear to me that you want the max or min status. Rather, it seems to me you want the name and status as of a certain date. That is, you want the rows with the latest date for each name. So ask for that:
select * from PS_RC_STATUS_TBL as T
where exists (
select 1 from PS_RC_STATUS_TBL
where name = T.name
group by name
having max(date) = T.date
)
Another way to think about it is
select T.*
from PS_RC_STATUS_TBL as T
join (
select name, max(date) as date
from PS_RC_STATUS_TBL
group by name
) as D
on T.name = D.name
and T.date = D.date
SQL Server needs to know what to do with the rows that you are not grouping on (it has multiple rows to show on 1 line - so how?). If you have aggregated on them (MIN, MAX, AVG, etc) then you are telling it what to do with these rows. If not it will not know what to do - and will give you an error like the one you are getting.
From what you are saying, though, it sounds like you do not want to group by the status. It sounds like you are not interested in that column at all. Let me know if that assumption is wrong.
SELECT
Name,
MAX(Date) AS 'Date'
FROM
PS_RC_STATUS_TBL
GROUP BY
Name
If you really do want the status, but don't want to group on it - try this:
SELECT
MyTable1.Name,
MyTable2.Status,
MyTable1.Date
FROM
(SELECT Name, MAX(Date) AS 'Date' FROM PS_RC_STATUS_TBL GROUP BY Name) MyTable1
INNER JOIN
(SELECT Name, Date, Status FROM PS_RC_STATUS_TBL) MyTable2
ON MyTable1.Name = MyTable2.Name
AND MyTable1.Date = MyTable2.Date
That gives the exact results you've asked for - so does the method below using a CTE.
OR
WITH cte AS (
SELECT Name, MAX(Date) AS Date
FROM PS_RC_STATUS_TBL
GROUP BY Name)
SELECT cte.Name,
tbl.Status,
cte.Date
FROM cte INNER JOIN
PS_RC_STATUS_TBL tbl ON cte.Name = tbl.Name
AND cte.Date = tbl.Date
SQLFiddle example.
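The join-back pattern can be tried out with a small sketch. It uses SQLite through Python's sqlite3 instead of SQL Server, a hypothetical table name ps_rc_status, and made-up ISO dates (so that string comparison orders them correctly). SQLite is more permissive than SQL Server about non-aggregated columns, so the sketch only demonstrates the working query, not the error.

```python
import sqlite3

# Hypothetical rows: (name, status, date), with ISO dates so MAX() sorts correctly.
ROWS = [
    ("a", "l", "2013-04-19"),
    ("a", "n", "2013-05-09"),
    ("a", "dd", "2013-03-18"),
    ("b", "l", "2013-01-01"),
    ("b", "dd", "2013-02-01"),
    ("b", "n", "2013-03-01"),
]


def latest_status_per_name():
    """Aggregate to (name, max date) first, then join back to pick up the status."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE ps_rc_status (name TEXT, status TEXT, date TEXT)")
        con.executemany("INSERT INTO ps_rc_status VALUES (?, ?, ?)", ROWS)
        return con.execute(
            """
            SELECT t.name, t.status, t.date
            FROM ps_rc_status AS t
            JOIN (SELECT name, MAX(date) AS date
                  FROM ps_rc_status
                  GROUP BY name) AS d
              ON t.name = d.name AND t.date = d.date
            ORDER BY t.name
            """
        ).fetchall()
    finally:
        con.close()


print(latest_status_per_name())
```

Only Name is grouped, so Status never needs an aggregate; it is read from whichever row carries the latest date.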
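It just means that every non-aggregated column in the select list must also appear in the GROUP BY clause, so in this case you need to add Status as well. Note that this returns one row per Name and Status combination rather than one row per Name: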
Select Name ,
MAX(DATE) as Date ,
Status
FROM [ST].[dbo].[PS_RC_STATUS_TBL] PS
Group by Name, Status
This is a common problem with text fields in SQL aggregation scenarios. Using either MAX(Status) or MIN(Status) in your field list is a solution, usually MAX(Status) because of the lexical ordering:
"" < " " < "a"
In cases where you really need a more detailed ordering:
Join to a StatusOrder relation (*Status, OrderSequence) in your main query;
select Max(OrderSequence) in your aggregated query; and
Join back to your StatusOrder relation on OrderSequence to select the correct Status value for display.
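The three steps above can be sketched as follows. Everything here is an assumption for illustration: the status_order table, its order_sequence values (dd < n < l), and the sample rows are hypothetical, and the sketch runs on SQLite via Python's sqlite3 rather than SQL Server.

```python
import sqlite3

# Hypothetical data: (name, status) rows and an explicit, non-lexical status ranking.
ROWS = [("a", "l"), ("a", "n"), ("a", "dd"), ("b", "dd"), ("b", "n")]
STATUS_ORDER = [("dd", 1), ("n", 2), ("l", 3)]

LEXICAL_MAX = "SELECT name, MAX(status) FROM ps GROUP BY name ORDER BY name"

RANKED_MAX = """
    SELECT g.name, so2.status
    FROM (SELECT t.name, MAX(so.order_sequence) AS max_seq      -- step 2
          FROM ps t
          JOIN status_order so ON so.status = t.status          -- step 1
          GROUP BY t.name) g
    JOIN status_order so2 ON so2.order_sequence = g.max_seq     -- step 3
    ORDER BY g.name
"""


def run(sql):
    """Load both sample tables into a fresh in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE ps (name TEXT, status TEXT)")
        con.execute(
            "CREATE TABLE status_order (status TEXT PRIMARY KEY, order_sequence INTEGER)"
        )
        con.executemany("INSERT INTO ps VALUES (?, ?)", ROWS)
        con.executemany("INSERT INTO status_order VALUES (?, ?)", STATUS_ORDER)
        return con.execute(sql).fetchall()
    finally:
        con.close()


lexical = run(LEXICAL_MAX)  # plain MAX(status): string comparison decides
ranked = run(RANKED_MAX)    # the explicit order_sequence decides

print(lexical)
print(ranked)
```

For name a the two approaches disagree: MAX(status) picks 'n' because it sorts last as a string, while the ranking table picks 'l' because it has the highest order_sequence.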
Every field you select, other than those inside an aggregate function, must be listed in the GROUP BY clause.
SELECT
gf.app_id,
ma.name as name,
count(ma.name) as count
FROM [dbo].[geo_fen_notification_table] as gf
inner join dbo.mobile_applications as ma on gf.app_id = ma.id
GROUP BY gf.app_id, ma.name
Here I'm selecting app_id and name, so I need to list both in the GROUP BY clause; otherwise the query will throw an error.

Distinctly sum a column on a joined table?

This is a simple problem, and I'm not sure if it's possible here. Here's the problem:
=> http://sqlfiddle.com/#!12/584f1/7
Explanation:
A ticket belongs to an attendee
An attendee has a revenue
I need to group the tickets by section and get the total revenue.
This double counts attendees because 2 tickets can belong to the same attendee, thus double counting it. I'd like to grab the sum of the revenue, but only count the attendees once.
In my sqlfiddle example, I'd like to see:
section | total_revenue
------------------------
A | 40 <= 40 is correct, but I'm getting 50...
B | null
C | 40
I'd like to solve this without the use of sub queries. I need a scalable solution that will allow me to do this for multiple columns on different joins in a single query. So whatever allows me to accomplish this, I'm open to suggestions.
Thanks for your help.
Here is a version using row_number():
select section,
sum(revenue) Total
from
(
select t.section, a.revenue,
row_number() over(partition by a.id, t.section order by a.id) rn
from tickets t
left join attendees a
on t.attendee_id = a.id
) src
where rn = 1
group by section
order by section;
See SQL Fiddle with Demo
Again, without subquery:
Key element is to add PARTITION BY to the window function(s):
SELECT DISTINCT
t.section
-- ,sum(count(*)) OVER (PARTITION BY t.section) AS tickets_count
,sum(min(a.revenue)) OVER (PARTITION BY t.section) AS attendees_revenue
FROM tickets t
LEFT JOIN attendees a ON a.id = t.attendee_id
GROUP BY t.attendee_id, t.section
ORDER BY t.section;
-> sqlfiddle
Here, you GROUP BY t.attendee_id, t.section, before you run the result through the window function. And use PARTITION BY t.section in the window function as you want results partitioned by section this time.
Uncomment the second line if you want to get a count of tickets, too.
Otherwise, it works similarly to my answer to your previous question. I.e., the rest of the explanation applies.
You can do this:
select t.section, sum(d.revenue)
from
(
SELECT DISTINCT section, attendee_id FROM tickets
) t
left join attendees d on t.attendee_id = d.id
group by t.section
order by t.section;
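To see the double counting and the DISTINCT fix side by side, here is a small sketch. The rows are made up to reproduce the numbers in the question (50 versus 40 for section A), and it runs on SQLite via Python's sqlite3 rather than PostgreSQL.

```python
import sqlite3

# Hypothetical data: attendee 1 holds two tickets in section A.
ATTENDEES = [(1, 10), (2, 30)]            # (id, revenue)
TICKETS = [
    ("A", 1), ("A", 1), ("A", 2),         # (section, attendee_id)
    ("B", None),                          # ticket without an attendee
    ("C", 1), ("C", 2),
]

NAIVE = """
    SELECT t.section, SUM(a.revenue)
    FROM tickets t
    LEFT JOIN attendees a ON a.id = t.attendee_id
    GROUP BY t.section
    ORDER BY t.section
"""

DEDUPED = """
    SELECT t.section, SUM(d.revenue)
    FROM (SELECT DISTINCT section, attendee_id FROM tickets) t
    LEFT JOIN attendees d ON d.id = t.attendee_id
    GROUP BY t.section
    ORDER BY t.section
"""


def run(sql):
    """Load both sample tables into a fresh in-memory database and run one query."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE attendees (id INTEGER PRIMARY KEY, revenue INTEGER)")
        con.execute("CREATE TABLE tickets (section TEXT, attendee_id INTEGER)")
        con.executemany("INSERT INTO attendees VALUES (?, ?)", ATTENDEES)
        con.executemany("INSERT INTO tickets VALUES (?, ?)", TICKETS)
        return con.execute(sql).fetchall()
    finally:
        con.close()


naive = run(NAIVE)      # attendee 1 is counted once per ticket
deduped = run(DEDUPED)  # each (section, attendee) pair is counted once

print(naive)
print(deduped)
```

Collapsing tickets to distinct (section, attendee_id) pairs before the join is what removes the duplicate revenue; section B stays NULL because its only ticket has no attendee.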

One row of data for a max date only - transact SQL

I am trying to select the max date on a field joined with other tables, so that I get only one distinct row for the max date and no rows for other dates. The code I have for the max is
SELECT DISTINCT
Cust.CustId,
LastDate=(Select Max(Convert(Date,TreatmentFieldHstry.TreatmentDateTime))
FROM TreatmentFieldHstry
WHERE Cust.CustSer = Course.CustSer
AND Course.CourseSer = Session.CourseSer
AND Session.SessionSer = TreatmentFieldHstry.SessionSer)
This gives multiple rows depending on how many dates there are. I just want one row for the max. Can anyone help with this?
Thanks
You didn't specify exactly what database and version you're using, but if you're on SQL Server 2005 or newer, you can use something like this (a CTE with the ROW_NUMBER ranking function). I've simplified it a bit, since I don't know what those other tables in your query are; they never show up in any of the SELECT column lists.
;WITH TopData AS
(
SELECT c.CustId, t.TreatmentDateTime,
ROW_NUMBER() OVER(PARTITION BY c.CustId ORDER BY t.TreatmentDateTime DESC) AS 'RowNum'
FROM
dbo.TreatmentFieldHstry t
INNER JOIN
dbo.Customer c ON c.CustId = t.CustId -- or whatever JOIN condition you have
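-- add your own WHERE filter here; note that Course is not joined in this simplified query,
-- so a condition like c.CustSer = Course.CustSer would first need that table in the FROM clause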
)
SELECT
*
FROM
TopData
WHERE
RowNum = 1
Basically, the CTE (Common Table Expression) partitions your data by CustId and orders it by TreatmentDateTime (descending, newest first), numbering the entries consecutively within each partition (i.e. for each value of CustId). With this, the newest entry for each customer has RowNum = 1, which is what I use to select it from that CTE.