SQL Update Latest rows for a distinct column

I want to set SGNumber = 2 on the most recently modified row for each unique MaterialeNo.
Start table:
Expected result:
If it is of any help, I made a select query that returns only the latest row for each MaterialeNo, which are the rows I want to update:
select distinct MaterialeNo, max(SGNumber) as tag
from Mydatabase
group by MaterialeNo

With a join of the table to the query that gets the max LastModified:
update m
set m.SGNumber = 2
from Mydatabase m inner join (
select MaterialeNo, max(LastModified) as MaxLastModified
from Mydatabase
group by MaterialeNo
) t on t.MaterialeNo = m.MaterialeNo and t.MaxLastModified = m.LastModified
See the demo.
Results:
> id | MaterialeNo | LastModified | SGNumber
> -: | :---------- | :------------------ | -------:
> 1 | test1 | 05/07/2019 00:00:00 | 1
> 2 | test1 | 06/07/2019 00:00:00 | 2
> 3 | test2 | 04/07/2019 00:00:00 | 1
> 4 | test2 | 05/07/2019 00:00:00 | 2

Use this. It relies on the row id, so that when several rows share the same LastModified and MaterialNo, only one of them is updated.
Update MyDatabase
Set SGNumber = 2
FROM
(select t1.id from
(select id, row_number() over (partition by MaterialNo order by LastModified desc) rownum from MyDatabase) t1
Where t1.rownum = 1) t2
WHERE MyDatabase.id = t2.id
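
The tie-breaking idea above can be tried out with a small sketch. This is not the answer's T-SQL. It assumes SQLite (3.25 or later, for window functions) driven from Python, and it adds a hypothetical third 'test1' row that ties on LastModified. Adding id to the ORDER BY is also an assumption made here so that the winner of a tie is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Mydatabase ("
    "id INTEGER PRIMARY KEY, MaterialNo TEXT, LastModified TEXT, SGNumber INTEGER)"
)
conn.executemany(
    "INSERT INTO Mydatabase (MaterialNo, LastModified, SGNumber) VALUES (?, ?, ?)",
    [
        ("test1", "2019-07-05", 1),
        ("test1", "2019-07-06", 1),
        ("test1", "2019-07-06", 1),  # hypothetical tie on LastModified
        ("test2", "2019-07-04", 1),
        ("test2", "2019-07-05", 1),
    ],
)

# Number the rows of each MaterialNo from newest to oldest and
# update only the row numbered 1; id breaks ties between equal dates.
conn.execute(
    """
    UPDATE Mydatabase
    SET SGNumber = 2
    WHERE id IN (
        SELECT id FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY MaterialNo
                       ORDER BY LastModified DESC, id DESC
                   ) AS rn
            FROM Mydatabase
        )
        WHERE rn = 1
    )
    """
)

updated_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM Mydatabase WHERE SGNumber = 2 ORDER BY id")
]
print(updated_ids)
```

Exactly one row per MaterialNo ends up with SGNumber = 2, even though two 'test1' rows share the latest date. A join on max(LastModified) would have updated both of them.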

If you are using MS SQL Server, you can do this with ROW_NUMBER(), which numbers the rows within each MaterialNo in the order you specify. In your data the latest dates appear to come last within each MaterialNo, so this is simple to achieve.
Here is a sample:
Declare @Mydatabase table (Id int identity(1,1),MaterialNo varchar(20),LastModified datetime,SGNumber int)
insert into @Mydatabase values ('TEST 1','2019.07.05',1)
insert into @Mydatabase values ('TEST 1','2019.07.06',1)
insert into @Mydatabase values ('TEST 2','2019.07.04',1)
insert into @Mydatabase values ('TEST 2','2019.07.05',1)
select Id,MaterialNo,LastModified,SGNumber, ROW_NUMBER() OVER(PARTITION BY MaterialNo ORDER BY LastModified ASC) Numbering
from @Mydatabase
Results:
Id MaterialNo LastModified SGNumber Numbering
-- ---------- ----------------------- -------- ---------
1 TEST 1 2019-07-05 00:00:00.000 1 1
2 TEST 1 2019-07-06 00:00:00.000 1 2
3 TEST 2 2019-07-04 00:00:00.000 1 1
4 TEST 2 2019-07-05 00:00:00.000 1 2

Use an updatable CTE and row_number():
with toupdate as (
select t.*,
row_number() over (partition by MaterialNo order by LastModified desc) as rownum
from mydatabase t
)
update toupdate
set sgnumber = 2
where rownum = 1;

Related

Delete of duplicate records

I have a table where I would like to identify duplicate records based on two columns (id and role), and I use a third column (unit) to select the subset of records to analyze and delete from. Here is the table with a few rows of example data:
id | role | unit
----------------
946| 1001 | 1
946| 1002 | 1
946| 1003 | 1
946| 1001 | 2
946| 1002 | 2
900| 1001 | 3
900| 1002 | 3
900| 1001 | 3
An analysis of unit 1 and 2 should identify two rows to delete 946/1001 and 946/1002. It doesn't matter if I delete the rows labeled unit 1 or 2. In a subsequent step I will update all records labeled unit=2 to unit=1.
I have a select statement that identifies the rows to delete:
SELECT * FROM (SELECT
unit,
id,
role,
ROW_NUMBER() OVER (
PARTITION BY
id,
role
ORDER BY
id,
role
) row_num
FROM thetable WHERE unit IN (1,2) ) as x
WHERE row_num > 1;
This query will give this result:
id | role | unit
----------------
946| 1001 | 2
946| 1002 | 2
Now I would like to combine this with DELETE to delete the identified records. I have come pretty close (I believe) with this statement:
DELETE FROM thetable tp1 WHERE EXISTS
(SELECT
unit,
id,
role,
ROW_NUMBER() OVER (
PARTITION BY
id,
role
ORDER BY
id,
role
) as row_num
FROM
thetable tp2
WHERE unit IN (1,2) AND
tp1.unit=tp2.unit AND
tp1.role=tp2.role AND
tp1.id=tp2.id AND row_num >1
)
However, row_num is not recognized as a column. How should I modify this statement to delete the two identified records?

It is very simple with EXISTS:
DELETE FROM thetable t
WHERE t.unit IN (1,2)
AND EXISTS (
SELECT 1 FROM thetable
WHERE (id, role) = (t.id, t.role) AND unit < t.unit
)
See the demo.
Results:
> id | role | unit
> --: | ---: | ---:
> 946 | 1001 | 1
> 946 | 1002 | 1
> 946 | 1003 | 1
> 900 | 1001 | 3
> 900 | 1002 | 3
> 900 | 1001 | 3
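
To check the EXISTS approach against the sample rows from the question, here is a minimal sketch. It assumes SQLite driven from Python rather than the asker's database. It also spells out the row-value comparison as two AND conditions and refers to the outer table by name instead of an alias, which is a portability choice made here, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thetable (id INTEGER, role INTEGER, unit INTEGER)")
conn.executemany(
    "INSERT INTO thetable VALUES (?, ?, ?)",
    [
        (946, 1001, 1), (946, 1002, 1), (946, 1003, 1),
        (946, 1001, 2), (946, 1002, 2),
        (900, 1001, 3), (900, 1002, 3), (900, 1001, 3),
    ],
)

# Delete a unit 1/2 row when another row has the same id and role
# but a smaller unit, so the copy with the lowest unit is kept.
conn.execute(
    """
    DELETE FROM thetable
    WHERE unit IN (1, 2)
      AND EXISTS (
          SELECT 1
          FROM thetable AS other
          WHERE other.id = thetable.id
            AND other.role = thetable.role
            AND other.unit < thetable.unit
      )
    """
)

remaining = conn.execute(
    "SELECT id, role, unit FROM thetable ORDER BY unit, id, role"
).fetchall()
print(remaining)
```

The two unit 2 rows are removed. The duplicated 900/1001 rows in unit 3 are left alone, because the delete is restricted to units 1 and 2.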

You could phrase this as:
delete from thetable t
where t.unit > (
select min(t1.unit)
from thetable t1
where t1.id = t.id and t1.role = t.role
)
This seems like a simple way to solve the assignment, basically phrasing as: delete rows for which another row exists with a smaller unit and the same id and role.
As for the query you wanted to write, using row_number(), I think that would be:
delete from thetable t
using (
select t.*, row_number() over(partition by id, role order by unit) rn
from thetable t
) t1
where t1.id = t.id and t1.role = t.role and t1.unit = t.unit and t1.rn > 1

SQL Ranking by blocks

I'm sure the answer to this is going to end up being really obvious, but I just can't get this bit of SQL to work.
I have a table that has 3 columns in:
User | Date | AchievedTarget
----------------------------------------
1 | 2018-01-01 | 1
1 | 2018-02-01 | 0
1 | 2018-03-01 | 1
1 | 2018-04-01 | 1
1 | 2018-05-01 | 0
I want to add a ranking based on the AchievedTarget column. Is it possible, with the data in the table above, to create the ranking shown in the table below?
User | Date | AchievedTarget | Rank
----------------------------------------
1 | 2018-01-01 | 1 | 1
1 | 2018-02-01 | 0 | 1
1 | 2018-03-01 | 1 | 1
1 | 2018-04-01 | 1 | 2
1 | 2018-05-01 | 0 | 1

This is a guess, based on the assumption that this is actually a gaps-and-islands question. If so, the following produces the second dataset the OP provided:
CREATE TABLE dbo.TestTable ([User] tinyint, --Avoid using keywords for column names
[date] date, --Avoid using datatypes for column names
AchievedTarget bit);
GO
INSERT INTO dbo.TestTable ([User],[date],AchievedTarget)
VALUES (1,'20180101',1),
(1,'20180201',0),
(1,'20180301',1),
(1,'20180401',1),
(1,'20180501',0);
GO
WITH Grps AS(
SELECT [User],[date],AchievedTarget,
ROW_NUMBER() OVER (ORDER BY [date]) -
ROW_NUMBER() OVER (PARTITION BY AchievedTarget ORDER BY [date]) AS Grp
FROM dbo.TestTable)
SELECT [User],[date],AchievedTarget,
ROW_NUMBER() OVER (PARTITION BY AchievedTarget, Grp ORDER BY [date]) AS [Rank] --Avoid using keywords for column names
FROM Grps
ORDER BY [date]
GO
DROP TABLE dbo.TestTable;
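
The difference-of-row-numbers trick used above can be reproduced outside SQL Server. The sketch below assumes SQLite 3.25 or later driven from Python. The column names user_id, dt and achieved are renamed stand-ins for [User], [date] and AchievedTarget, and the extra PARTITION BY user_id is an assumption added here so that several users would not interfere with each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (user_id INTEGER, dt TEXT, achieved INTEGER)")
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        (1, "2018-01-01", 1),
        (1, "2018-02-01", 0),
        (1, "2018-03-01", 1),
        (1, "2018-04-01", 1),
        (1, "2018-05-01", 0),
    ],
)

# Consecutive rows with the same achieved value share the same
# (overall row number - per-value row number) difference, which
# identifies the island; the rank is the position inside the island.
query = """
    WITH grps AS (
        SELECT user_id, dt, achieved,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY dt)
             - ROW_NUMBER() OVER (PARTITION BY user_id, achieved ORDER BY dt) AS grp
        FROM test_table
    )
    SELECT dt, achieved,
           ROW_NUMBER() OVER (PARTITION BY user_id, achieved, grp ORDER BY dt) AS rnk
    FROM grps
    ORDER BY dt
"""
ranks = [row[2] for row in conn.execute(query)]
print(ranks)
```

For the five sample rows this gives the ranks 1, 1, 1, 2, 1, matching the Rank column the question asks for.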

Other method:
with tmp as (
select row_number() over(order by date) ID, *
from dbo.TestTable
)
select f1.*, NbBefore + 1
from tmp f1
outer apply
(
select top 1 f2.ID IDLimit from tmp f2 where f2.ID<f1.ID and f2.AchievedTarget<>f1.AchievedTarget
order by f2.ID desc
) f3
outer apply
(
select count(*) NbBefore from tmp f4 where f4.ID<f1.ID and f4.ID> isnull(f3.IDLimit, 0) --treat "no earlier change" as 0 so the first block is counted too
) f5

SQL Server - Select Distinct of two columns, where the distinct column selected has a maximum value based on two other columns

I have 2 tables - TC and T, with columns specified below. TC maps to T on column T_ID.
TC
----
T_ID,
TC_ID
T
-----
T_ID,
V_ID,
Datetime,
Count
My current result set is:
V_ID TC_ID Datetime Count
----|-----|------------|--------|
2 | 1 | 2013-09-26 | 450600 |
2 | 1 | 2013-12-09 | 14700 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 15000 |
2 | 1 | 2014-01-22 | 7500 |
4 | 1 | 2014-01-22 | 1000 |
4 | 1 | 2013-12-05 | 0 |
4 | 2 | 2013-12-05 | 0 |
Using the following query:
select T.V_ID,
TC.TC_ID,
T.Datetime,
T.Count
from T
inner join TC
on TC.T_ID = T.T_ID
Result set I want:
V_ID TC_ID Datetime Count
----|-----|------------|--------|
2 | 1 | 2014-01-22 | 15000 |
4 | 1 | 2014-01-22 | 1000 |
4 | 2 | 2013-12-05 | 0 |
I want to write a query to select each distinct V_ID + TC_ID combination, but only with the maximum datetime, and for that datetime the maximum count. E.g. for the distinct combination of V_ID = 2 and TC_ID = 1, '2014-01-22' is the maximum datetime, and for that datetime, 15000 is the maximum count, so select this record for the new table. Any ideas? I don't know if this is too ambitious for a query and I should just handle the result set in code instead.

One method uses row_number():
select v_id, tc_id, datetime, count
from (select T.V_ID, TC.TC_ID, T.Datetime, T.Count,
row_number() over (partition by t.V_ID, tc.tc_id
order by datetime desc, count desc
) as seqnum
from t join
tc
on tc.t_id = t.t_id
) tt
where seqnum = 1;
The only issue is that some rows have the same maximum datetime value. SQL tables represent unordered sets, so there is no way to determine which is really the maximum -- unless the datetime really has a time component or another column specifies the ordering within a day.

It is possible to solve this using CTEs. First, extract the data with your query. Second, get the maximum date for each combination. Third, get the highest count for each maximum date:
;WITH Dataset AS
(
select T.V_ID,
TC.TC_ID,
T.[Datetime],
T.[Count]
from T
inner join TC
on TC.T_ID = T.T_ID
),
MaxDates AS
(
SELECT V_ID, TC_ID, MAX(t.[Datetime]) AS MaxDate
FROM Dataset t
GROUP BY t.V_ID, t.TC_ID
)
SELECT t.V_ID, t.TC_ID, t.[Datetime], MAX(t.[Count]) AS [Count]
FROM Dataset t
INNER JOIN MaxDates m ON t.V_ID = m.V_ID AND t.TC_ID = m.TC_ID AND m.MaxDate = t.[Datetime]
GROUP BY t.V_ID, t.TC_ID, t.[Datetime]

Just to keep it simple:
You need to group by T.V_ID, TC.TC_ID and select the max of the date. Then, to get the maximum count, use a subquery as follows:
select T.V_ID,
TC.TC_ID,
max(T.Datetime) as Date_Time,
(select max(Count) from T as tb where v_ID = T.v_ID and DateTime = max(T.DateTime)) as Count
from T
inner join TC
on TC.T_ID = T.T_ID
group by T.V_ID,TC.TC_ID

Selecting row with highest ID based on another column

In SQL Server 2008 R2, suppose I have a table layout like this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 1 | 1 | TEST 1 |
| 2 | 1 | TEST 2 |
| 3 | 3 | TEST 3 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 6 | 6 | TEST 6 |
| 7 | 6 | TEST 7 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Is it possible to select the row with the highest UniqueID number for each GroupID? So according to the table above, if I ran the query, I would expect this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 2 | 1 | TEST 2 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Been chomping on this for a while, but can't seem to crack it.
Many thanks,

SELECT *
FROM (SELECT uniqueid, groupid, title,
Row_number()
OVER ( partition BY groupid ORDER BY uniqueid DESC) AS rn
FROM table) a
WHERE a.rn = 1

With SQL-Server as rdbms you can use a ranking function like ROW_NUMBER:
WITH CTE AS
(
SELECT UniqueID, GroupID, Title,
RN = ROW_NUMBER() OVER (PARTITION BY GroupID
ORDER BY UniqueID DESC)
FROM dbo.TableName
)
SELECT UniqueID, GroupID, Title
FROM CTE
WHERE RN = 1
This returns exactly one record for each GroupID even if there are multiple rows with the highest UniqueID (the name suggests there are none). If you want to return all tied rows in that case, use DENSE_RANK instead of ROW_NUMBER.
Here you can see all functions and how they work: http://technet.microsoft.com/en-us/library/ms189798.aspx

Since you have not mentioned any RDBMS, this statement below will work on almost all RDBMS. The purpose of the subquery is to get the greatest uniqueID for every GROUPID. To be able to get the other columns, the result of the subquery is joined on the original table.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT GroupID, MAX(uniqueID) uniqueID
FROM tableName
GROUP By GroupID
) b ON a.GroupID = b.GroupID
AND a.uniqueID = b.uniqueID
If your RDBMS supports analytic functions, you can use ROW_NUMBER():
SELECT uniqueid, groupid, title
FROM
(
SELECT uniqueid, groupid, title,
ROW_NUMBER() OVER (PARTITION BY groupid
ORDER BY uniqueid DESC) rn
FROM tableName
) x
WHERE x.rn = 1
TSQL Ranking Functions
ROW_NUMBER() generates a sequential number that you can filter on. In this case the number restarts for each groupid and follows uniqueid in descending order, so the row with the greatest uniqueid in each group has rn = 1.

SELECT *
FROM the_table tt
WHERE NOT EXISTS (
SELECT *
FROM the_table nx
WHERE nx.GroupID = tt.GroupID
AND nx.UniqueID > tt.UniqueID
)
;
This should work in any DBMS (no window functions or CTEs are needed).
It is probably faster than a subquery with an aggregate.
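
As a quick check of the NOT EXISTS form on the sample data from the question, here is a minimal sketch. It assumes SQLite driven from Python rather than SQL Server 2008 R2, and it only checks the result, not the performance claim.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (UniqueID INTEGER, GroupID INTEGER, Title TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [
        (1, 1, "TEST 1"), (2, 1, "TEST 2"),
        (3, 3, "TEST 3"), (4, 3, "TEST 4"),
        (5, 5, "TEST 5"),
        (6, 6, "TEST 6"), (7, 6, "TEST 7"), (8, 6, "TEST 8"),
    ],
)

# Keep a row only if no other row in the same group has a larger UniqueID.
top_rows = conn.execute(
    """
    SELECT tt.UniqueID, tt.GroupID, tt.Title
    FROM the_table tt
    WHERE NOT EXISTS (
        SELECT 1
        FROM the_table nx
        WHERE nx.GroupID = tt.GroupID
          AND nx.UniqueID > tt.UniqueID
    )
    ORDER BY tt.UniqueID
    """
).fetchall()
print(top_rows)
```

The result is one row per GroupID, namely the rows with UniqueID 2, 4, 5 and 8, which is the output the question expects.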

Keeping it simple:
select * from test2
where UniqueID in (select max(UniqueID) from test2 group by GroupID)
Considering:
create table test2
(
UniqueID numeric,
GroupID numeric,
Title varchar(100)
)
insert into test2 values(1,1,'TEST 1')
insert into test2 values(2,1,'TEST 2')
insert into test2 values(3,3,'TEST 3')
insert into test2 values(4,3,'TEST 4')
insert into test2 values(5,5,'TEST 5')
insert into test2 values(6,6,'TEST 6')
insert into test2 values(7,6,'TEST 7')
insert into test2 values(8,6,'TEST 8')

SELECT ONE Row with the MAX() value on a column

I have a pretty simple dataset of monthly newsletters:
id | Name | PublishDate | IsActive
1 | Newsletter 1 | 10/15/2012 | 1
2 | Newsletter 2 | 11/06/2012 | 1
3 | Newsletter 3 | 12/15/2012 | 0
4 | Newsletter 4 | 1/19/2012 | 0
and so on.
The PublishDate is unique.
Result (based on above):
id | Name | PublishDate | IsActive
2 | Newsletter 2 | 11/06/2012 | 1
What I want is pretty simple. I just want the 1 newsletter that IsActive and PublishDate = MAX(PublishDate).

select top 1 * from newsletters where IsActive = 1 order by PublishDate desc

You can use row_number():
select id, name, publishdate, isactive
from
(
select id, name, publishdate, isactive,
row_number() over(order by publishdate desc) rn
from table1
where isactive = 1
) src
where rn = 1
See SQL Fiddle with Demo
You can even use a subquery that selects the max() date:
select t1.*
from table1 t1
inner join
(
select max(publishdate) pubdate
from table1
where isactive = 1
) t2
on t1.publishdate = t2.pubdate
See SQL Fiddle with Demo

CREATE TABLE Tmax(Id INT,NAME VARCHAR(15),PublishedDate DATETIME,IsActive BIT)
INSERT INTO Tmax(Id,Name,PublishedDate,IsActive)
VALUES(1,'Newsletter 1','10/15/2012',1),(2,'Newsletter 2','11/06/2012',1),(3,'Newsletter 3','12/15/2012',0),(4,'Newsletter 4','1/19/2012',0)
SELECT * FROM Tmax
SELECT t.Id
,t.NAME
,t.PublishedDate
,t.IsActive
FROM Tmax AS t
WHERE PublishedDate=
(
SELECT TOP 1 MAX(PublishedDate)
FROM Tmax
WHERE IsActive=1
)
