Removing duplicate rows by max date on ordered table

I have the table Example Table with the following data:
Example Table
ID  DATE                        NAME
--------------------------------------------
9   2021-04-13 21:39:00.569000  ABC
8   2020-12-17 16:49:17.903000  ABC
7   2020-12-16 16:49:17.903000  ABC
6   2020-06-09 09:55:52.005000  WER
5   2020-06-09 09:55:52.004000  WER
4   2020-06-08 09:48:43.318000  YTG
3   2020-06-05 14:51:42.860000  YTG
2   2020-04-28 13:58:30.972000  YTG
1   2020-04-25 13:58:30.972000  ABC
With the rows ordered by DATE, I want to get, for every consecutive run of the same NAME, the row with the max date, that is, the last row before a different NAME is found. So the result set must be as follows:
Expected Output
ID  DATE                        NAME
--------------------------------------------
9   2021-04-13 21:39:00.569000  ABC
6   2020-06-09 09:55:52.005000  WER
4   2020-06-08 09:48:43.318000  YTG
1   2020-04-25 13:58:30.972000  ABC
I tried the SQL query below and got no result.
select NAME
from
(
    select NAME, max(DATE) - min(DATE) as diff
    from Example_Table
    group by NAME
) ex
order by diff desc;

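Use lead() to look at the name on the next row (by date) and keep only the rows where the name changes or where there is no next row:
select t.*
from (select t.*,
             lead(name) over (order by date) as next_name
      from t
     ) t
where next_name is null or next_name <> name;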
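As a quick check of the lead() approach, here is a minimal sketch that runs the same query against the question's sample data. It assumes Python's built-in sqlite3 module with SQLite 3.25 or later (needed for window functions), stores DATE as ISO-formatted text so that it sorts correctly, and uses the table name t from the answer; none of this setup comes from the original post.

```python
import sqlite3

# In-memory database; the table name "t" follows the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, date TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (9, "2021-04-13 21:39:00.569000", "ABC"),
        (8, "2020-12-17 16:49:17.903000", "ABC"),
        (7, "2020-12-16 16:49:17.903000", "ABC"),
        (6, "2020-06-09 09:55:52.005000", "WER"),
        (5, "2020-06-09 09:55:52.004000", "WER"),
        (4, "2020-06-08 09:48:43.318000", "YTG"),
        (3, "2020-06-05 14:51:42.860000", "YTG"),
        (2, "2020-04-28 13:58:30.972000", "YTG"),
        (1, "2020-04-25 13:58:30.972000", "ABC"),
    ],
)

# Keep a row only when the next row (by date) has a different name,
# or when there is no next row at all.
query = """
SELECT id, date, name
FROM (SELECT t.*,
             LEAD(name) OVER (ORDER BY date) AS next_name
      FROM t) AS t
WHERE next_name IS NULL OR next_name <> name
ORDER BY date DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps IDs 9, 6, 4 and 1, which matches the expected output in the question.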

Related

Netezza add new field for first record value of the day in SQL

I'm trying to add new columns of first values of the day for location and weight.
For instance, the original data format is:
id dttm location weight
--------------------------------------------
1 1/1/20 11:10:00 A 40
1 1/1/20 19:07:00 B 41.1
2 1/1/20 08:01:00 B 73.2
2 1/1/20 21:00:00 B 73.2
2 1/2/20 10:03:00 C 74
I want each id to have only one record per day, such as:
id dttm location weight
--------------------------------------------
1 1/1/20 11:10:00 A 40
2 1/1/20 08:01:00 B 73.2
2 1/2/20 10:03:00 C 74
I have other columns in my data set that I'm using location and weight to create, so I don't think I can just filter for 'first' records of the day. Is it possible to write a query that recognizes the first record of the day for those two columns and creates new columns with those values?

You can use row_number():
select t.*
from (select t.*,
             row_number() over (partition by id, dttm::date order by dttm) as seqnum
      from t
     ) t
where seqnum = 1;
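The following sketch tries the row_number() query on the sample data. It is not Netezza: it assumes Python's sqlite3 module (SQLite 3.25+), rewrites the timestamps in ISO format, and uses SQLite's date() function in place of the dttm::date cast. It also adds a second query with FIRST_VALUE(), which is a variation not in the answer above, for the case where the first values of the day are wanted as new columns on every row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, dttm TEXT, location TEXT, weight REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01 11:10:00", "A", 40.0),
        (1, "2020-01-01 19:07:00", "B", 41.1),
        (2, "2020-01-01 08:01:00", "B", 73.2),
        (2, "2020-01-01 21:00:00", "B", 73.2),
        (2, "2020-01-02 10:03:00", "C", 74.0),
    ],
)

# Answer's approach: number the rows per id and day, keep the first one.
first_rows = conn.execute("""
SELECT id, dttm, location, weight
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY id, date(dttm)
                                ORDER BY dttm) AS seqnum
      FROM t) AS t
WHERE seqnum = 1
ORDER BY id, dttm
""").fetchall()

# Variation (an assumption, not from the answer): keep every row and
# attach the day's first location and weight as extra columns.
all_rows = conn.execute("""
SELECT id, dttm, location, weight,
       FIRST_VALUE(location) OVER (PARTITION BY id, date(dttm)
                                   ORDER BY dttm) AS first_location,
       FIRST_VALUE(weight) OVER (PARTITION BY id, date(dttm)
                                 ORDER BY dttm) AS first_weight
FROM t
ORDER BY id, dttm
""").fetchall()

for row in first_rows:
    print(row)
for row in all_rows:
    print(row)
```

The first query returns one row per id and day; the second keeps all five rows, so other columns derived from location and weight stay available.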

Hive: Query to get max count per word per date

Here's the data I have:
date | word | count
01/01/2020 #abc 1
01/01/2020 #xyz 2
02/05/2020 #ghi 2
02/05/2020 #def 1
02/04/2020 #pqr 4
02/04/2020 #cde 3
01/01/2020 #lmn 1
Here's the result that I want:
date | word | count
01/01/2020 #xyz 2
02/04/2020 #pqr 4
02/05/2020 #ghi 2
So basically, I want the word with maximum count on each particular date.
Can someone help me out with the query?

Use the row_number() window function with PARTITION BY date and ORDER BY count DESC, then keep only the first row of each partition, which is the one with the maximum count:
SELECT date, word, count
FROM (
    SELECT date, word, count,
           row_number() over (partition by date order by count desc) as rn
    FROM <table_name>) sq
WHERE sq.rn = 1;
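Here is a small sketch of that query on the sample data. It runs on SQLite through Python's sqlite3 module rather than on Hive, and the table name words is a hypothetical stand-in for <table_name>. Note that row_number() picks an arbitrary row when two words tie for the maximum count on the same date; the sample data has no such tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "words" is a hypothetical table name standing in for <table_name>.
conn.execute("CREATE TABLE words (date TEXT, word TEXT, count INTEGER)")
conn.executemany(
    "INSERT INTO words VALUES (?, ?, ?)",
    [
        ("01/01/2020", "#abc", 1),
        ("01/01/2020", "#xyz", 2),
        ("02/05/2020", "#ghi", 2),
        ("02/05/2020", "#def", 1),
        ("02/04/2020", "#pqr", 4),
        ("02/04/2020", "#cde", 3),
        ("01/01/2020", "#lmn", 1),
    ],
)

# Rank the words within each date by descending count and keep rank 1.
rows = conn.execute("""
SELECT date, word, count
FROM (SELECT date, word, count,
             ROW_NUMBER() OVER (PARTITION BY date
                                ORDER BY count DESC) AS rn
      FROM words) AS sq
WHERE sq.rn = 1
ORDER BY date
""").fetchall()
for row in rows:
    print(row)
```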

How to get latest records based on two columns of max

I have a table called Inventory with the below columns
item warehouse date sequence number value
111 100 2019-09-25 12:29:41.000 1 10
111 100 2019-09-26 12:29:41.000 1 20
222 200 2019-09-21 16:07:10.000 1 5
222 200 2019-09-21 16:07:10.000 2 10
333 300 2020-01-19 12:05:23.000 1 4
333 300 2020-01-20 12:05:23.000 1 5
Expected Output:
item warehouse date sequence number value
111 100 2019-09-26 12:29:41.000 1 20
222 200 2019-09-21 16:07:10.000 2 10
333 300 2020-01-20 12:05:23.000 1 5
For each item and warehouse, I need to pick the value with the latest date and, within that date, the latest sequence number.
I tried the code below:
select item,warehouse,sequencenumber,sum(value),max(date) as date1
from Inventory t1
where
t1.date IN (select max(date) from Inventory t2
where t1.warehouse=t2.warehouse
and t1.item = t2.item
group by t2.item,t2.warehouse)
group by t1.item,t1.warehouse,t1.sequencenumber
It works for the latest date but not for the latest sequence number.
Can you please suggest how to write a query that gives my expected output?

You can use row_number() for this, ordering each item/warehouse group by date and then by sequence number, both descending:
select *
from (
    select
        t.*,
        row_number() over(
            partition by item, warehouse
            order by date desc, sequencenumber desc, value desc
        ) rn
    from Inventory t
) t
where rn = 1
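A minimal sketch of this on the sample rows, assuming SQLite via Python's sqlite3 module, the question's table name Inventory, and a column called sequencenumber as in the asker's own query (the column is shown as "sequence number" in the table listing, so the exact name is an assumption):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inventory (item INTEGER, warehouse INTEGER, date TEXT, "
    "sequencenumber INTEGER, value INTEGER)"
)
conn.executemany(
    "INSERT INTO Inventory VALUES (?, ?, ?, ?, ?)",
    [
        (111, 100, "2019-09-25 12:29:41.000", 1, 10),
        (111, 100, "2019-09-26 12:29:41.000", 1, 20),
        (222, 200, "2019-09-21 16:07:10.000", 1, 5),
        (222, 200, "2019-09-21 16:07:10.000", 2, 10),
        (333, 300, "2020-01-19 12:05:23.000", 1, 4),
        (333, 300, "2020-01-20 12:05:23.000", 1, 5),
    ],
)

# Latest date wins; the sequence number breaks ties on the same date.
rows = conn.execute("""
SELECT item, warehouse, date, sequencenumber, value
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY item, warehouse
                                ORDER BY date DESC,
                                         sequencenumber DESC) AS rn
      FROM Inventory AS t) AS t
WHERE rn = 1
ORDER BY item
""").fetchall()
for row in rows:
    print(row)
```

Item 222 is the interesting case: both rows share the same date, so the second sort key decides and the row with sequence number 2 is kept.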

Need to update Current version of FROM_DT to previous version TO_DATE for same table

The requirement is as follows: within the same table, the TO_DT of each row should be updated to the FROM_DT of the next row, as in the expected output below.
Table name: TAB
Current Output:
PRIM_KEY| FROM_DT | TO_DT
11111 01-JAN-00 01-JAN-25
11112 01-MAR-16 01-JAN-25
Expected Output:
PRIM_KEY| FROM_DT | TO_DT
11111 01-JAN-00 01-MAR-16
11112 01-MAR-16 01-JAN-25

You can use a window function whose frame covers only the following row. For the last row the result is NULL, so that row is skipped.
MERGE INTO TAB AS A
USING (
    SELECT PRIM_KEY, FROM_DT,
           MIN(FROM_DT) OVER (PARTITION BY 1 ORDER BY PRIM_KEY
                              ROWS BETWEEN 1 FOLLOWING AND 1 FOLLOWING) AS NEXT_START_DT
    FROM TAB
) AS B
ON A.PRIM_KEY = B.PRIM_KEY AND B.NEXT_START_DT IS NOT NULL
WHEN MATCHED THEN
    UPDATE SET TO_DT = B.NEXT_START_DT;
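The same idea can be tried out with the sketch below. SQLite has no MERGE statement, so this is a swapped-in two-step version, not the statement above: it reads each row's next FROM_DT with LEAD() (which returns the same value as a one-row FOLLOWING frame) and then applies plain UPDATEs from Python. It assumes Python's sqlite3 module, the question's table name TAB, and dates kept as text exactly as shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TAB (PRIM_KEY INTEGER, FROM_DT TEXT, TO_DT TEXT)")
conn.executemany(
    "INSERT INTO TAB VALUES (?, ?, ?)",
    [
        (11111, "01-JAN-00", "01-JAN-25"),
        (11112, "01-MAR-16", "01-JAN-25"),
    ],
)

# Step 1: for each row, fetch the FROM_DT of the next row by PRIM_KEY.
next_rows = conn.execute("""
SELECT PRIM_KEY,
       LEAD(FROM_DT) OVER (ORDER BY PRIM_KEY) AS NEXT_START_DT
FROM TAB
""").fetchall()

# Step 2: skip the last row (its next value is NULL) and update the rest.
updates = [(nxt, key) for key, nxt in next_rows if nxt is not None]
conn.executemany("UPDATE TAB SET TO_DT = ? WHERE PRIM_KEY = ?", updates)

rows = conn.execute(
    "SELECT PRIM_KEY, FROM_DT, TO_DT FROM TAB ORDER BY PRIM_KEY"
).fetchall()
for row in rows:
    print(row)
```

Skipping the NULL case matters: without it, the last row's TO_DT would be overwritten with NULL instead of keeping 01-JAN-25.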

SQL Select the Columnid with a max column group by one column

I think this question has already been answered elsewhere, but those answers didn't solve my problem.
I'd like to select the ID of the row with the latest (MAX) Date for each Name in my table. Grouping by the Name column, I must get the ID, Name, and Date.
Here is my table
ID Name Date
---------------------------------------
1 Brent 2012-02-17
2 Ash 2012-08-02
3 Brent 2012-08-15
4 Harold 2012-09-30
5 Margaret 2012-10-10
6 Ash 2012-12-01
7 Harold 2013-02-14
8 Ash 2012-01-01
9 Brent 2013-05-11
Output must be:
ID Name Date
---------------------------------------
5 Margaret 2012-10-10
6 Ash 2012-12-01
7 Harold 2013-02-14
9 Brent 2013-05-11
I tried this statement:
SELECT
[ID], [Name], MAX([Date]) as [Date]
FROM
[SampleTable]
GROUP BY
[Name]
But I get this error:
Column 'ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

You can use a window function such as ROW_NUMBER(), partitioned by Name so that the rows of each name are numbered from the latest date down:
SELECT a.ID, a.Name, a.Date
FROM
(
    SELECT ID, Name, Date,
           ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Date DESC) rn
    FROM SampleTable
) a
WHERE a.rn = 1
If ID is the same for every row of a given Name, you can simply add ID to the GROUP BY clause:
GROUP BY ID, Name
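To check the ROW_NUMBER() approach on the sample data, here is a short sketch that partitions by Name, the grouping column from the question. It assumes SQLite through Python's sqlite3 module instead of SQL Server, and uses the question's table name SampleTable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SampleTable (ID INTEGER, Name TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO SampleTable VALUES (?, ?, ?)",
    [
        (1, "Brent", "2012-02-17"),
        (2, "Ash", "2012-08-02"),
        (3, "Brent", "2012-08-15"),
        (4, "Harold", "2012-09-30"),
        (5, "Margaret", "2012-10-10"),
        (6, "Ash", "2012-12-01"),
        (7, "Harold", "2013-02-14"),
        (8, "Ash", "2012-01-01"),
        (9, "Brent", "2013-05-11"),
    ],
)

# Number each name's rows from newest to oldest and keep the newest.
rows = conn.execute("""
SELECT a.ID, a.Name, a.Date
FROM (SELECT ID, Name, Date,
             ROW_NUMBER() OVER (PARTITION BY Name
                                ORDER BY Date DESC) AS rn
      FROM SampleTable) AS a
WHERE a.rn = 1
ORDER BY a.Date
""").fetchall()
for row in rows:
    print(row)
```

Partitioning by ID instead would put every row in its own partition, since ID is unique here, and all nine rows would come back with rn = 1.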
