Get the latest last_modified_date for each target - sql

I have a table in an Oracle database:
Transaction_ID Target Status Last_modified_date
80913570 8536349 1 2018-10-03 03:40:36.0
80913540 8860342 1 2018-09-28 08:45:32.0
80913541 9135368 1 2018-09-28 08:45:42.0
80913532 8860342 1 2018-09-28 08:12:52.0
80913624 9256309 1 2018-10-05 01:25:06.0
80913573 9256309 0 2018-10-03 07:18:35.0
80913574 9256309 0 2018-10-03 07:21:26.0
80913576 9256309 1 2018-10-03 07:28:36.0
80913613 5429179 0 2018-10-08 05:45:00.0
80913614 5429179 1 2018-10-04 06:48:06.0
In this table, I want the most recent modified date for each Target. Some Targets have a single record, while others have several records with different modified dates.
I tried the following query:
select max(last_modified_date) from demoTable where target in (select distinct target from demoTable);
But this returns only one value across all Targets, because the IN condition matches every row, while I want one value per Target.
*PL/SQL could also be used to achieve this, but I'm new to the industry and don't know exactly how to do it.

Use group by
select target,max(last_modified_date) from demoTable
group by target

Use a correlated subquery. Since you need the most recent date for each target, you can choose either of the two methods below:
select t.* from demoTable t
where t.Last_modified_date in
( select max(Last_modified_date) from demoTable t1
where t1.Target=t.Target
)
Or use the row_number window function:
select Transaction_ID ,Target , Status, Last_modified_date from
(
select Transaction_ID ,Target , Status, Last_modified_date , row_number() over(partition by target order by Last_modified_date desc) as rn from demoTable
) t where t.rn=1
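The row_number approach can be checked against the sample rows from the question. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database rather than Oracle; it assumes the bundled SQLite is 3.25 or newer (needed for window functions), and the function name latest_per_target is hypothetical.

```python
import sqlite3

# Sample rows from the question:
# (Transaction_ID, Target, Status, Last_modified_date)
ROWS = [
    (80913570, 8536349, 1, "2018-10-03 03:40:36.0"),
    (80913540, 8860342, 1, "2018-09-28 08:45:32.0"),
    (80913541, 9135368, 1, "2018-09-28 08:45:42.0"),
    (80913532, 8860342, 1, "2018-09-28 08:12:52.0"),
    (80913624, 9256309, 1, "2018-10-05 01:25:06.0"),
    (80913573, 9256309, 0, "2018-10-03 07:18:35.0"),
    (80913574, 9256309, 0, "2018-10-03 07:21:26.0"),
    (80913576, 9256309, 1, "2018-10-03 07:28:36.0"),
    (80913613, 5429179, 0, "2018-10-08 05:45:00.0"),
    (80913614, 5429179, 1, "2018-10-04 06:48:06.0"),
]


def latest_per_target(rows):
    """Return the most recently modified row for each Target."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE demoTable (Transaction_ID INTEGER, Target INTEGER, "
        "Status INTEGER, Last_modified_date TEXT)"
    )
    con.executemany("INSERT INTO demoTable VALUES (?, ?, ?, ?)", rows)
    # Dates are stored as ISO-style text, so text ordering matches time ordering.
    result = con.execute(
        """
        SELECT Transaction_ID, Target, Status, Last_modified_date
        FROM (
            SELECT d.*,
                   ROW_NUMBER() OVER (PARTITION BY Target
                                      ORDER BY Last_modified_date DESC) AS rn
            FROM demoTable d
        ) t
        WHERE t.rn = 1
        ORDER BY Target
        """
    ).fetchall()
    con.close()
    return result


for row in latest_per_target(ROWS):
    print(row)
```

Unlike the plain GROUP BY query, this returns the whole row (including Transaction_ID and Status) for each Target, which is the reason to prefer it when more than the date is needed.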

Related

deleting specific duplicate and original entries in a table based on date

I have a table called "main" which has 4 columns: ID, name, DateID and Sign.
I want to create a query that will delete entries in this table if the same ID is recorded twice within a certain DateID range.
I have my where clause that searches the previous 3 weeks:
where DateID =((SELECT MAX( DateID)
WHERE DateID < ( SELECT MAX( DateID )-3))
An example of the dataset I'm working with:
id name DateID sign
12345 Paul 1915 Up
23658 Danny 1915 Down
37868 Jake 1916 Up
37542 Elle 1917 Up
12345 Paul 1917 Down
87456 John 1918 Up
78563 Luke 1919 Up
23658 Danny 1920 Up
In the case above, both entries for ID 12345 would need to be removed.
However, the entries for ID 23658 would need to be kept, as their DateIDs are more than 3 apart.
How would this be possible?
You can use window functions for this.
It's not quite clear, but it seems LAG and conditional COUNT should fit what you need.
DELETE t
FROM (
SELECT *,
CountWithinDate = COUNT(CASE WHEN t.PrevDate >= t.DateId - 3 THEN 1 END) OVER (PARTITION BY t.id)
FROM (
SELECT *,
PrevDate = LAG(t.DateID) OVER (PARTITION BY t.id ORDER BY t.DateID)
FROM YourTable t
) t
) t
WHERE CountWithinDate > 0;
Note that you do not need to re-join the table, you can delete directly from the t derived table.
Hope this works:
DELETE FROM test_tbl
WHERE id IN (
SELECT T1.id
FROM test_tbl T1
WHERE EXISTS (SELECT 1 FROM test_tbl T2 WHERE T1.id = T2.id AND ABS(T2.dateid - T1.dateid) < 3 AND T1.dateid <> T2.dateid)
)
If you need more logic for data processing, I would suggest using a stored procedure.
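To see what the EXISTS-based delete does with the sample data from the question, here is a rough sketch that runs the same query in an in-memory SQLite database via Python's sqlite3 module. The table name test_tbl follows the answer; the helper name delete_close_duplicates is hypothetical, and SQLite stands in for the asker's actual database.

```python
import sqlite3

# Sample rows from the question: (id, name, dateid, sign)
ROWS = [
    (12345, "Paul", 1915, "Up"),
    (23658, "Danny", 1915, "Down"),
    (37868, "Jake", 1916, "Up"),
    (37542, "Elle", 1917, "Up"),
    (12345, "Paul", 1917, "Down"),
    (87456, "John", 1918, "Up"),
    (78563, "Luke", 1919, "Up"),
    (23658, "Danny", 1920, "Up"),
]


def delete_close_duplicates(rows):
    """Delete every id that has two entries less than 3 DateIDs apart."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE test_tbl (id INTEGER, name TEXT, dateid INTEGER, sign TEXT)"
    )
    con.executemany("INSERT INTO test_tbl VALUES (?, ?, ?, ?)", rows)
    con.execute(
        """
        DELETE FROM test_tbl
        WHERE id IN (
            SELECT T1.id
            FROM test_tbl T1
            WHERE EXISTS (SELECT 1 FROM test_tbl T2
                          WHERE T1.id = T2.id
                            AND ABS(T2.dateid - T1.dateid) < 3
                            AND T1.dateid <> T2.dateid)
        )
        """
    )
    remaining = con.execute(
        "SELECT id, name, dateid, sign FROM test_tbl ORDER BY dateid, id"
    ).fetchall()
    con.close()
    return remaining


for row in delete_close_duplicates(ROWS):
    print(row)
```

Both rows for ID 12345 (DateIDs 1915 and 1917) are removed, while both rows for ID 23658 (1915 and 1920) survive. Note that this form removes all rows for a matching id, including any that are far from the close pair, and that it treats a gap of exactly 3 as outside the range.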

datediff for row that meets my condition only once per row

I want to do a datediff between 2 dates on different rows, but only if the rows meet a condition.
My table looks like the following, with additional columns (like guid):
Id | CreateDateAndTime | condition
---------------------------------------------------------------
1 | 2018-12-11 12:07:55.273 | with this
2 | 2018-12-11 12:07:53.550 | I need to compare this state
3 | 2018-12-11 12:07:53.550 | with this
4 | 2018-12-11 12:06:40.780 | state 3
5 | 2018-12-11 12:06:39.317 | I need to compare this state
With this example, I would like to have 2 rows in my selection, representing the difference between the dates from id 5-3 and from id 2-1.
So far I have come up with a query that gives me the difference between the dates from id 5-3, id 5-1 and id 2-1:
with t as (
SELECT TOP (100000)
*
FROM mydatatable
order by CreateDateAndTime desc)
select
DATEDIFF(SECOND, f.CreateDateAndTime, s.CreateDateAndTime) time
from t f
join t s on (f.[guid] = s.[guid] )
where f.condition like '%I need to compare this state%'
and s.condition like '%with this%'
and (f.id - s.id) < 0
My problem is that I cannot set f.id - s.id to a fixed value, since other rows can sit between the ones I want to diff.
How can I make the datediff only on the first rows that meet my conditions?
EDIT: To make it clearer:
My condition is an event name, and I want to calculate the time between the occurrence of my event 1 and my event 2 and fill a column named time, for example.
@Salman A's answer is really close to what I want, except that it does not work when my event 2 does not happen (which was not in my initial example),
i.e. in a table like the following, it will compute the datediff between row id 5 and row id 2:
Id | CreateDateAndTime | condition
---------------------------------------------------------------
1 | 2018-12-11 12:07:55.273 | with this
2 | 2018-12-11 12:07:53.550 | I need to compare this state
3 | 2018-12-11 12:07:53.550 | state 3
4 | 2018-12-11 12:06:40.780 | state 3
5 | 2018-12-11 12:06:39.317 | I need to compare this state
The code I modified:
WITH cte AS (
SELECT id
, CreateDateAndTime AS currdate
, LAG(CreateDateAndTime) OVER (PARTITION BY guid ORDER BY id desc ) AS prevdate
, condition
FROM t
WHERE condition IN ('I need to compare this state', 'with this ')
)
SELECT *
,DATEDIFF(second, currdate, prevdate) time
FROM cte
WHERE condition = 'I need to compare this state '
and DATEDIFF(second, currdate, prevdate) != 0
order by id desc
Perhaps you want to match ids with the nearest smaller id. You can use window functions for this:
WITH cte AS (
SELECT id
, CreateDateAndTime AS currdate
, CASE WHEN LAG(condition) OVER (PARTITION BY guid ORDER BY id) = 'with this'
THEN LAG(CreateDateAndTime) OVER (PARTITION BY guid ORDER BY id) END AS prevdate
, condition
FROM t
WHERE condition IN ('I need to compare this state', 'with this')
)
SELECT *
, DATEDIFF(second, currdate, prevdate)
FROM cte
WHERE condition = 'I need to compare this state'
The CASE expression will match 'I need to compare this state' with 'with this'. If you have mismatching pairs then it'll return NULL.
Try using the analytic function lead():
with cte as
(
select 1 as id, '2018-12-11 12:07:55.273' as CreateDateAndTime,'with this' as condition union all
select 2,'2018-12-11 12:07:53.550','I need to compare this state' union all
select 3,'2018-12-11 12:07:53.550','with this' union all
select 4,'2018-12-11 12:06:40.780','state 3' union all
select 5,'2018-12-11 12:06:39.317','I need to compare this state'
) select *,
DATEDIFF(SECOND,CreateDateAndTime,lead(CreateDateAndTime) over(order by Id))
from cte
where condition in ('with this','I need to compare this state')
Ideally you want LEADIF/LAGIF functions, because you are looking for the previous row where condition = 'with this'. Since there are no LEADIF/LAGIF functions, I think the best option is to use OUTER/CROSS APPLY with TOP 1, e.g.
CREATE TABLE #T (Id INT, CreateDateAndTime DATETIME, condition VARCHAR(28));
INSERT INTO #T (Id, CreateDateAndTime, condition)
VALUES
(1, '2018-12-11 12:07:55', 'with this'),
(2, '2018-12-11 12:07:53', 'I need to compare this state'),
(3, '2018-12-11 12:07:53', 'with this'),
(4, '2018-12-11 12:06:40', 'state 3'),
(5, '2018-12-11 12:06:39', 'I need to compare this state');
SELECT ID1 = t1.ID,
Date1 = t1.CreateDateAndTime,
ID2 = t2.ID,
Date2 = t2.CreateDateAndTime,
Difference = DATEDIFF(SECOND, t1.CreateDateAndTime, t2.CreateDateAndTime)
FROM #T AS t1
CROSS APPLY
( SELECT TOP 1 t2.CreateDateAndTime, t2.ID
FROM #T AS t2
WHERE t2.Condition = 'with this'
AND t2.CreateDateAndTime > t1.CreateDateAndTime
--AND t2.GUID = t1.GUID
ORDER BY CreateDateAndTime
) AS t2
WHERE t1.Condition = 'I need to compare this state';
Which gives:
ID1 Date1 ID2 Date2 Difference
-------------------------------------------------------------------------------
2 2018-12-11 12:07:53.000 1 2018-12-11 12:07:55.000 2
5 2018-12-11 12:06:39.000 3 2018-12-11 12:07:53.000 74
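The same "nearest later 'with this' row" lookup can be sketched outside SQL Server. The example below is only an approximation of the CROSS APPLY query: it uses Python's sqlite3 module, replaces TOP 1 inside APPLY with a correlated subquery using LIMIT 1, and replaces DATEDIFF with a difference of strftime('%s', ...) values. The column name cond (instead of condition) and the function name seconds_to_next_event are assumptions made for the sketch.

```python
import sqlite3

# Sample rows from the answer: (Id, CreateDateAndTime, cond)
ROWS = [
    (1, "2018-12-11 12:07:55", "with this"),
    (2, "2018-12-11 12:07:53", "I need to compare this state"),
    (3, "2018-12-11 12:07:53", "with this"),
    (4, "2018-12-11 12:06:40", "state 3"),
    (5, "2018-12-11 12:06:39", "I need to compare this state"),
]


def seconds_to_next_event(rows):
    """Pair each start row with the nearest later 'with this' row."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (Id INTEGER, CreateDateAndTime TEXT, cond TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT t1.Id,
               t2.Id,
               CAST(strftime('%s', t2.CreateDateAndTime) AS INTEGER)
                 - CAST(strftime('%s', t1.CreateDateAndTime) AS INTEGER)
        FROM t AS t1
        JOIN t AS t2
          ON t2.Id = (SELECT t3.Id
                      FROM t AS t3
                      WHERE t3.cond = 'with this'
                        AND t3.CreateDateAndTime > t1.CreateDateAndTime
                      ORDER BY t3.CreateDateAndTime
                      LIMIT 1)
        WHERE t1.cond = 'I need to compare this state'
        ORDER BY t1.Id
        """
    ).fetchall()
    con.close()
    return result


for id1, id2, seconds in seconds_to_next_event(ROWS):
    print(id1, id2, seconds)
```

Because the join is an inner join, a start row with no later 'with this' row simply drops out, which mirrors CROSS APPLY; an OUTER APPLY equivalent would use LEFT JOIN instead.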
I would enumerate the values with a window function and then aggregate to get the difference.
select min(id), max(id),
datediff(second, min(CreateDateAndTime), max(CreateDateAndTime)) as seconds
from (select t.*,
row_number() over (partition by condition order by CreateDateAndTime) as seqnum
from t
where condition in ('I need to compare this state', 'with this')
) t
group by seqnum;
I cannot tell what you want the results to look like. This version only outputs the differences, with the ids of the rows you care about. The difference can also be applied to the original rows, rather than put into summary rows.

How to get the first record from a group of records MS sql

I have several business unit ids with the dates on which records were added, and I want to get only the row for the last record added per business unit.
2018-06-26 22:54:51.190 1
2018-07-05 10:36:49.563 1
2018-07-16 10:14:04.093 1
2018-07-17 15:24:22.173 1
2018-07-19 10:40:24.700 1
2018-07-23 09:53:34.607 13
2018-07-23 09:53:57.107 13
2018-09-04 14:55:04.860 4
2018-09-04 14:56:34.147 4
should be
2018-07-19 10:40:24.700 1
2018-07-23 09:53:57.107 13
2018-09-04 14:56:34.147 4
I tried this
select
mnt.DateAdded,
bu.BuId
from BusinessUnit bu
inner join MagicNumbersTable mnt on mnt.BuId = bu.BuId
where bu.BuId in (select top 1 b.BuId
from BusinessUnit b inner join MagicNumbersTable m on b.BuId = m.BuId
group by b.BuId, m.DateAdded
order by m.DateAdded desc)
which returns only one record, rather than the latest record (by DateAdded) for each business unit.
What would the correct way be of achieving this?
You can use the row_number() function with the TOP (1) WITH TIES clause:
select top (1) with ties mnt.DateAdded, bu.BuId
from BusinessUnit bu inner join
MagicNumbersTable mnt
on mnt.BuId = bu.BuId
order by row_number() over (partition by bu.BuId order by mnt.DateAdded desc);
However, if you only need the latest date per BuId, a simple GROUP BY clause with MAX() is enough; there is no need for a subquery or row_number(). The query above is useful when you want additional columns from the same row.
This is what I wrote, based on my understanding. What do you think?
create table #maxDate (date datetime, id int)
insert #maxDate values
( '2018-06-26 22:54:51.190', 1)
, ( '2018-07-05 10:36:49.563', 1)
, ( '2018-07-16 10:14:04.093', 1)
, ( '2018-07-17 15:24:22.173', 1)
, ( '2018-07-19 10:40:24.700', 1)
, ( '2018-07-23 09:53:34.607', 13)
, ( '2018-07-23 09:53:57.107', 13)
, ( '2018-09-04 14:55:04.860', 4)
, ( '2018-09-04 14:56:34.147', 4)
select id, max(date) date from #maxDate
group by id
id date
-------------------------------
1 2018-07-19 10:40:24.700
4 2018-09-04 14:56:34.147
13 2018-07-23 09:53:57.107
Let me know if the query needs changes.

Remove duplicate rows based on field in a select query with PostgreSQL?

Considering the table mdl_files that contains the following fields: id, contenthash, timecreated, filesize.
This table stores attachment files.
We consider all rows with the same content hash to be duplicates, and I just want to keep the oldest row (or the first one if the dates are equal).
How can I do that?
The following query:
SELECT
id,
contenthash,
filesize,
to_timestamp(timecreated) :: DATE
FROM mdl_files
ORDER BY contenthash;
returns:
2480229 00002e87605311feb82b70473b61e81f0223c774 18178 2016-10-05
2997411 0000bfd20ef84948eee6811ce5bbac03de42ccb0 1293 2017-03-31
1304839 000280169fc78d704a2d4569bfb6f42ea4a1d5ae 8203 2015-11-10
1364656 000280169fc78d704a2d4569bfb6f42ea4a1d5ae 8203 2015-11-17
71568 0003c6aec5835964870902d697c06d21abf76bf7 139439 2013-04-19
2959945 000419c19d77df7285e669614075b47414e3ab2c 398 2017-03-20
3483049 00061dc0bc2452304107ddc75e7ee2908c729905 28618 2017-08-17
3483047 00061dc0bc2452304107ddc75e7ee2908c729905 28618 2017-08-17
I want to get this resultset:
2480229 00002e87605311feb82b70473b61e81f0223c774 18178 2016-10-05
2997411 0000bfd20ef84948eee6811ce5bbac03de42ccb0 1293 2017-03-31
1304839 000280169fc78d704a2d4569bfb6f42ea4a1d5ae 8203 2015-11-10
71568 0003c6aec5835964870902d697c06d21abf76bf7 139439 2013-04-19
2959945 000419c19d77df7285e669614075b47414e3ab2c 398 2017-03-20
3483049 00061dc0bc2452304107ddc75e7ee2908c729905 28618 2017-08-17
I want the following duplicated lines to be removed from the resultset:
1364656 000280169fc78d704a2d4569bfb6f42ea4a1d5ae 8203 2015-11-17
3483047 00061dc0bc2452304107ddc75e7ee2908c729905 28618 2017-08-17
Use DISTINCT ON:
SELECT DISTINCT ON (contenthash)
id,
contenthash,
filesize,
to_timestamp(timecreated) :: DATE
FROM mdl_files
ORDER BY contenthash, timecreated, id;
DISTINCT ON is a Postgres extension that returns one row for each unique combination of the keys in parentheses. The row returned is the first one encountered according to the ORDER BY clause.
You can also use the ROW_NUMBER() window function to number the rows within each group and keep only the first one:
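To make the "first row per key after sorting" behaviour concrete, here is a small sketch in plain Python that mimics what DISTINCT ON does. It is an illustration of the semantics only, not how Postgres implements it; the helper name distinct_on and the sample rows (short hash labels and small integer timestamps) are made up for the example.

```python
from itertools import groupby

# Hypothetical rows: (id, contenthash, timecreated)
ROWS = [
    (4, "hash_b", 20),
    (1, "hash_a", 30),
    (2, "hash_a", 10),
    (3, "hash_a", 10),
    (5, "hash_c", 5),
]


def distinct_on(rows, key, order):
    """Sort rows by `order`, then keep the first row of each `key` group.

    As with DISTINCT ON, `order` must sort by the key first so that rows
    sharing a key end up next to each other.
    """
    ordered = sorted(rows, key=order)
    return [next(group) for _, group in groupby(ordered, key=key)]


kept = distinct_on(
    ROWS,
    key=lambda r: r[1],                 # DISTINCT ON (contenthash)
    order=lambda r: (r[1], r[2], r[0])  # ORDER BY contenthash, timecreated, id
)
for row in kept:
    print(row)
```

For hash_a, rows 2 and 3 share the earliest timecreated, so the id in the sort key acts as the tie-breaker and row 2 is kept.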
SELECT t.*
FROM (
SELECT
id,
contenthash,
filesize,
ROW_NUMBER() OVER (PARTITION BY contenthash,filesize order by timecreated) rn
FROM mdl_files
) t
where t.rn = 1
If you want to DELETE the duplicate data, you can use EXISTS in the WHERE clause.
DELETE
FROM mdl_files f WHERE EXISTS(
SELECT 1
FROM (
SELECT
id,
contenthash,
filesize,
ROW_NUMBER() OVER (PARTITION BY contenthash,filesize order by timecreated) rn
FROM mdl_files
) t
where t.rn > 1 and t.id = f.id
)

SQL aggregate using DISTINCT on ID by latest date

Request
I have a section of data below, and my goal is to make the agent column distinct, so that each agent appears only once and the row kept is the one with the latest modified date.
Existing Data
modified agent rank
2016-10-18 346502 0
2013-06-04 346502 41
2011-10-31 346503 0
2012-08-13 346505 0
2016-04-18 346506 66
2015-01-27 346506 1
2016-01-21 346507 103
2015-01-27 346507 130
2012-01-30 346508 0
I'm trying to use this answer https://stackoverflow.com/a/29912858/461887 as a basis, but cannot work out where to aggregate properly.
SQL not working
SELECT DISTINCT
FLiex.agtprof.modify_date_time
,FLiex.agtprof.agent_id
,FLiex.agtprof.rank
,FLiex.agtprof.external_id
WHERE
FLiex.agtprof.modify_date_time = MAX( FLiex.agtprof.modify_date_time)
FROM
FLiex.agtprof
Desired Output
modify agent rank
18/10/2016 346502 0
18/04/2016 346506 66
21/01/2016 346507 103
13/08/2012 346505 0
30/01/2012 346508 0
31/10/2011 346503 0
You're attempting to get single row data, but based on the other rows. While this may be possible with aggregate functions, it's much easier to do with window (analytic) functions:
SELECT [modified], [agent], [rank], [id]
FROM (SELECT [modified], [agent], [rank], [id],
ROW_NUMBER() OVER (PARTITION BY [agent]
ORDER BY [modified] DESC) AS rn
FROM [agtprof]) t
WHERE rn = 1
SELECT DISTINCT max(id_date), agent, rank, id
FROM fliex.agtprof
GROUP BY 2,3,4;
Try this. I think if you chose the max id_date and then group by the rest, you should get the results you're looking for.
Try this:
SELECT
FLiex.agtprof.modify_date_time
,FLiex.agtprof.agent_id
,FLiex.agtprof.rank
,FLiex.agtprof.external_id
FROM
FLiex.agtprof
INNER JOIN (
SELECT
Max(FLiex.agtprof.modify_date_time) as max_mod_date_time
,FLiex.agtprof.agent_id as agent_id
FROM
FLiex.agtprof
GROUP BY FLiex.agtprof.agent_id
) Filter
ON FLiex.agtprof.agent_id = Filter.agent_id
AND FLiex.agtprof.modify_date_time = Filter.max_mod_date_time
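The join-on-max approach can be tried against the sample data from the question. This is a minimal sketch using Python's sqlite3 module with an in-memory table; the table is created without the FLiex schema prefix, external_id is left out because the question shows no values for it, and the function name latest_per_agent is hypothetical.

```python
import sqlite3

# Sample rows from the question: (modify_date_time, agent_id, rank)
ROWS = [
    ("2016-10-18", 346502, 0),
    ("2013-06-04", 346502, 41),
    ("2011-10-31", 346503, 0),
    ("2012-08-13", 346505, 0),
    ("2016-04-18", 346506, 66),
    ("2015-01-27", 346506, 1),
    ("2016-01-21", 346507, 103),
    ("2015-01-27", 346507, 130),
    ("2012-01-30", 346508, 0),
]


def latest_per_agent(rows):
    """Return each agent's row with the latest modify_date_time."""
    con = sqlite3.connect(":memory:")
    con.execute(
        """CREATE TABLE agtprof
           (modify_date_time TEXT, agent_id INTEGER, "rank" INTEGER)"""
    )
    con.executemany("INSERT INTO agtprof VALUES (?, ?, ?)", rows)
    result = con.execute(
        """
        SELECT a.modify_date_time, a.agent_id, a."rank"
        FROM agtprof AS a
        INNER JOIN (
            SELECT agent_id, MAX(modify_date_time) AS max_mod_date_time
            FROM agtprof
            GROUP BY agent_id
        ) AS f
          ON a.agent_id = f.agent_id
         AND a.modify_date_time = f.max_mod_date_time
        ORDER BY a.modify_date_time DESC
        """
    ).fetchall()
    con.close()
    return result


for row in latest_per_agent(ROWS):
    print(row)
```

The rows come back in the same order as the desired output in the question. One caveat of this approach: if an agent has two rows sharing the same latest modify_date_time, both are returned, whereas the ROW_NUMBER() version always returns exactly one row per agent.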