Delete rows where date is not minimum - sql

I have a table (structure below) that I need to clean up by deleting rows for each Object_ID:
WHERE Current_Step is NULL and Change = 'change'
and Date_of_Change <> MIN(Date_of_Change)
That is, I need to leave only the row with minimum date for each Object_ID.
Table sample
Object_ID | Current_Step | Change | Date_of_Change
0025307   | NULL         | change | 16.11.2021
0025307   | NULL         | change | 19.11.2021
0025307   | NULL         | change | 19.11.2021
I am using MS SQL.
There are no primary keys.
All columns are VARCHAR except Date_of_Change being of type DATE.
I need to clean up this table because it was filled incorrectly: the source query checked IF NULL = NULL and then marked those statuses as changed even though they had not changed. So I need to revert to the original date of the change, because a value that is still NULL means no actual status change happened.
Desired behavior
My attempt in identifying rows that I need to keep:
SELECT [Object_ID]
,MIN([Date_of_Change])
FROM table
WHERE [Current_Step] IS NULL
AND [Change] = 'change'
GROUP BY Object_ID
I just need to remove other rows with the same Object_ID whose Date_of_Change is not equal to the one identified in query above.

Do a self-join on the same table, like I did on 'Table1223' below.
Example:
DELETE tbl
FROM Table1223 tbl
JOIN (SELECT * FROM Table1223) objID
ON objID.Object_ID = tbl.Object_ID
WHERE tbl.Date_of_Change > objID.Date_of_Change

Schema
So you have a table with versioned objects which holds change records associated to the object with some details and a date.
Now you want to select the first change per object, that is, the oldest one within the GROUP of this object's changes, using the MIN function on a DATE column.
This oldest should be retained/kept and stay. All other object change-versions should be deleted.
Solving
A. Selecting the first/oldest changes per object in 2 steps.
Select the MIN(date) per object:
SELECT Object_ID, COUNT(Object_ID) AS Count_Changes, MIN(Date_of_Change) AS First_Change
FROM table
GROUP BY Object_ID
Resultset contains each object with the total count of changes and the date of the first change.
Select the first changes using previous result as subquery in a JOIN:
SELECT *
FROM table t
-- join with a table-subquery having only 2 columns to correlate
JOIN (
SELECT Object_ID, MIN(Date_of_Change) AS First_Change
FROM table
WHERE Current_Step is NULL and Change = 'change'
GROUP BY Object_ID
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change = m.First_Change
WHERE Current_Step is NULL and Change = 'change'
These are the rows to keep and not remove. The first change of each object should be retained and not cleaned.
B. Now we can invert the JOIN-condition to get all the rows, that we want to delete/clean:
Change the date-comparison
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change = m.First_Change
to not-equal:
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change <> m.First_Change
Run a dry-select first, to get at least the count before deleting.
SELECT COUNT(Object_ID) AS records_to_remove
FROM table t
-- join with a table-subquery having only 2 columns to correlate
JOIN (
SELECT Object_ID, MIN(Date_of_Change) AS First_Change
FROM table
WHERE Current_Step is NULL and Change = 'change'
GROUP BY Object_ID
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change <> m.First_Change
WHERE Current_Step is NULL and Change = 'change'
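The keep/remove split can be checked on the three sample rows before touching real data. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MS SQL, the table name object_changes is made up for the example, and the dates are stored as ISO strings so that MIN() orders them correctly.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for the MS SQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_changes ("
    "Object_ID TEXT, Current_Step TEXT, Change TEXT, Date_of_Change TEXT)"
)
# Sample rows from the question (dates rewritten as ISO strings).
conn.executemany(
    "INSERT INTO object_changes VALUES (?, ?, ?, ?)",
    [
        ("0025307", None, "change", "2021-11-16"),
        ("0025307", None, "change", "2021-11-19"),
        ("0025307", None, "change", "2021-11-19"),
    ],
)

# Same join as in the answer; only the date comparison operator varies.
JOIN_TEMPLATE = """
SELECT COUNT(*)
FROM object_changes t
JOIN (
    SELECT Object_ID, MIN(Date_of_Change) AS First_Change
    FROM object_changes
    WHERE Current_Step IS NULL AND Change = 'change'
    GROUP BY Object_ID
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change {op} m.First_Change
WHERE t.Current_Step IS NULL AND t.Change = 'change'
"""

rows_to_keep = conn.execute(JOIN_TEMPLATE.format(op="=")).fetchone()[0]
rows_to_remove = conn.execute(JOIN_TEMPLATE.format(op="<>")).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM object_changes").fetchone()[0]

print(rows_to_keep, rows_to_remove, total)  # prints: 1 2 3
```

If the two counts do not add up to the number of rows matching the filter, the join condition deserves a second look before running any DELETE.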
Prepare the DELETE statement with JOIN (if supported by the DBMS; in MS SQL the alias of the target table goes between DELETE and FROM):
DELETE t
FROM table t
JOIN (
SELECT Object_ID, MIN(Date_of_Change) AS First_Change
FROM table
WHERE Current_Step is NULL and Change = 'change'
GROUP BY Object_ID
) m ON t.Object_ID = m.Object_ID AND t.Date_of_Change <> m.First_Change
WHERE t.Current_Step is NULL AND t.Change = 'change'
Alternative to JOIN: try USING on other DBMS
Some DBMS (for example PostgreSQL) do not support JOIN in DELETE statements, but offer alternatives like USING:
DELETE FROM table t
USING (
SELECT Object_ID, MIN(Date_of_Change) AS First_Change
FROM table t2
WHERE t2.Current_Step is NULL AND t2.Change = 'change'
GROUP BY Object_ID
) AS m
WHERE ...
AND t.Object_ID = m.Object_ID AND t.Date_of_Change <> m.First_Change
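For a DBMS that supports neither JOIN nor USING in DELETE, a correlated subquery is another option. The sketch below is not the statement from this answer: it swaps in a correlated MIN() subquery and runs it on SQLite through Python's sqlite3 module. The table name object_changes and the extra row for object 0025400 are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in; SQLite has no DELETE ... JOIN / USING.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object_changes ("
    "Object_ID TEXT, Current_Step TEXT, Change TEXT, Date_of_Change TEXT)"
)
conn.executemany(
    "INSERT INTO object_changes VALUES (?, ?, ?, ?)",
    [
        ("0025307", None, "change", "2021-11-16"),
        ("0025307", None, "change", "2021-11-19"),
        ("0025307", None, "change", "2021-11-19"),
        # Hypothetical row that does not match the filter and must survive.
        ("0025400", "Review", "change", "2021-11-20"),
    ],
)

# Delete every filtered row whose date is not the minimum for its Object_ID.
deleted = conn.execute(
    """
    DELETE FROM object_changes
    WHERE Current_Step IS NULL
      AND Change = 'change'
      AND Date_of_Change <> (
          SELECT MIN(m.Date_of_Change)
          FROM object_changes m
          WHERE m.Object_ID = object_changes.Object_ID
            AND m.Current_Step IS NULL
            AND m.Change = 'change'
      )
    """
).rowcount

remaining = conn.execute(
    "SELECT Object_ID, Date_of_Change FROM object_changes ORDER BY Object_ID"
).fetchall()

print(deleted)    # prints: 2
print(remaining)  # prints: [('0025307', '2021-11-16'), ('0025400', '2021-11-20')]
```

If several rows share the minimum date for one object, all of them are kept, because none of them differs from the minimum.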

Related

SQL Oracle Update containing left join

I want to perform an update for the results of a select query.
SELECT
a.reason,
n.note
FROM applications a
LEFT JOIN notes n on n.app_id = a.app_id
AND n.note LIKE '%old%'
WHERE a.code = 'run' AND a.reason IS NULL
I thought I could perform these updates separately by wrapping the select in an update; however, I get the error ORA-01733: virtual column not allowed here. How can I go about performing these updates?
UPDATE (
SELECT
a.reason AS Reason
FROM applications a
LEFT JOIN notes n on n.app_id = a.app_id
AND n.note LIKE '%old%'
WHERE a.code = 'run' AND a.reason IS NULL
) SET Reason = null
UPDATE (
SELECT
n.note AS Note
FROM applications a
LEFT JOIN notes n on n.app_id = a.app_id
AND n.note LIKE '%old%'
WHERE a.code = 'run' AND a.reason IS NULL
) SET Note = null
You can't update the two tables at the same time. You need two different update statements as follows:
Updating the APPLICATIONS table is quite easy as all the records of the APPLICATIONS table having a.code = 'run' AND a.reason IS NULL will be there in your SELECT query.
UPDATE APPLICATIONS A
SET
REASON = NULL
WHERE A.CODE = 'run'
AND A.REASON IS NULL;
To update the NOTES table, you can use the EXISTS clause as follows:
UPDATE NOTES N
SET
NOTE = NULL
WHERE EXISTS (
SELECT 1
FROM APPLICATIONS A
WHERE N.APP_ID = A.APP_ID
AND A.CODE = 'run'
AND A.REASON IS NULL
)
AND N.NOTE LIKE '%old%'
You must update the NOTES table first and the APPLICATIONS table second: the NOTES update filters on A.REASON IS NULL, while the APPLICATIONS update modifies the REASON column.
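The effect of the EXISTS-based update on NOTES can be tried on a few rows. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on Oracle, and every row in it is invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Oracle schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE applications (app_id INTEGER, code TEXT, reason TEXT);
    CREATE TABLE notes (app_id INTEGER, note TEXT);
    -- Hypothetical data: only application 1 matches code = 'run' AND reason IS NULL.
    INSERT INTO applications VALUES (1, 'run', NULL), (2, 'run', 'manual'), (3, 'stop', NULL);
    INSERT INTO notes VALUES (1, 'old import'), (1, 'fresh'), (2, 'old import'), (3, 'old import');
    """
)

# Clear only the notes that contain 'old' and belong to a matching application.
conn.execute(
    """
    UPDATE notes
    SET note = NULL
    WHERE EXISTS (
        SELECT 1
        FROM applications a
        WHERE notes.app_id = a.app_id
          AND a.code = 'run'
          AND a.reason IS NULL
    )
    AND notes.note LIKE '%old%'
    """
)

notes_after = conn.execute("SELECT app_id, note FROM notes ORDER BY rowid").fetchall()
print(notes_after)
# prints: [(1, None), (1, 'fresh'), (2, 'old import'), (3, 'old import')]
```

Only the 'old' note of application 1 is cleared; notes of applications that fail the filter stay untouched.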

Null value is considered an existing value

I have a validation inside an IF condition like:
IF(EXISTS
(SELECT TOP 1 [TaskAssignationId]
FROM [Task] AS [T]
INNER JOIN #TaskIdTableType AS [TT] ON [T].[TaskId] = [TT].[Id]
))
But it returns a row whose TaskAssignationId is NULL, so the IF condition is true: the row exists, even though its value is NULL. I don't want to treat NULL as a value. How can I exclude NULLs? Regards
If you don't want to include rows where [TaskAssignationId] is null then add that to a WHERE clause.
IF(EXISTS
(SELECT TOP 1 [TaskAssignationId]
FROM [Task] AS [T]
INNER JOIN #TaskIdTableType AS [TT] ON [T].[TaskId] = [TT].[Id]
WHERE [TaskAssignationId] is not null
))
EXISTS asks "did the (sub)query return more than zero (correlated) rows?", not "did the (sub)query return a non-null value?"
These are perfectly valid exists:
SELECT * FROM person p
WHERE EXISTS (SELECT null FROM address a WHERE a.personid = p.id)
SELECT * FROM person p
WHERE EXISTS (SELECT 1 FROM address a WHERE a.personid = p.id)
SELECT * FROM person p
WHERE EXISTS (SELECT * FROM address a WHERE a.personid = p.id)
It doesn't matter what values you return, or how many columns; EXISTS only cares whether the row count is 0 or greater than 0 when determining whether results exist.
Hence you have to make sure your (sub)query returns no rows if you want the EXISTS check to fail. If addresses that have a null type are unacceptable, the (sub)query has to exclude them with WHERE a.type IS NOT NULL so that only rows with a non-null type are considered.
There's also little point in doing a TOP 1 in the (sub)query; the optimiser knows that the only condition it cares about is 0 or not-0 rows, so it automatically does a TOP 1 (i.e. it will stop retrieving data when it knows there is at least one row).
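The row-count behaviour of EXISTS is easy to confirm on a single row whose column is NULL. A minimal sketch, assuming SQLite (through Python's sqlite3 module) behaves like SQL Server on this point; the one-row Task table is invented for the example.

```python
import sqlite3

# In-memory SQLite database with a single task whose assignation is NULL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Task (TaskId INTEGER, TaskAssignationId INTEGER);
    INSERT INTO Task VALUES (1, NULL);
    """
)

# A row is returned, so EXISTS is true even though the selected value is NULL.
exists_any_row = conn.execute(
    "SELECT EXISTS (SELECT TaskAssignationId FROM Task WHERE TaskId = 1)"
).fetchone()[0]

# Filtering out NULLs leaves zero rows, so EXISTS is false.
exists_non_null = conn.execute(
    "SELECT EXISTS (SELECT TaskAssignationId FROM Task "
    "WHERE TaskId = 1 AND TaskAssignationId IS NOT NULL)"
).fetchone()[0]

print(exists_any_row, exists_non_null)  # prints: 1 0
```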
If you only want to check for existence, there is no need to select a column name; you can use SELECT 1:
IF(EXISTS
(SELECT TOP 1 1
FROM [Task] AS [T]
INNER JOIN #TaskIdTableType AS [TT] ON [T].[TaskId] = [TT].[Id]
))
begin
----code---
end

SQL How to add values of a select statement to a varchar

I'm writing a function that returns the names of a child table's one column's values in one varchar.
The relation is:
I have a parent table called Activity.
And a child table in N-1 relation with table Activity, called ActivityObjective.
And a third table where I keep the names of the objectives, called Objective.
This is the query that I make. It returns the names of the objectives of a specific Activity with ActivityID = @ActivityID
SELECT o.ObjectiveName
FROM Activity a
INNER JOIN ActivityObjective ao ON a.ActivityID = ao.ActivityID
INNER JOIN Objective o ON o.ObjectiveID = ao.ObjectiveID
WHERE a.ActivityID = @ActivityID
This returns something like:
ObjectiveName
|-------------------|
objName1
objName2
objName3
My aim is to have a varchar "objName1, objName2, objName3". I cannot create a temp table because I'm working in a function.
You will need to adjust the following to match what you specifically wanted, but this is a start:
Select substring((
SELECT (', ' + o.ObjectiveName)
FROM Activity a
INNER JOIN ActivityObjective ao ON a.ActivityID = ao.ActivityID
INNER JOIN Objective o ON o.ObjectiveID = ao.ObjectiveID
WHERE a.ActivityID = @ActivityID
FOR XML PATH( '' )
), 3, 1000 )
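The idea behind the query is to prefix every name with ', ', concatenate the pieces, and then cut off the first two characters. The sketch below shows only that idea: SQLite has no FOR XML PATH, so group_concat with an empty separator is swapped in as a stand-in, run through Python's sqlite3 module, and the rows are invented for the example.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the SQL Server schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Objective (ObjectiveID INTEGER, ObjectiveName TEXT);
    CREATE TABLE ActivityObjective (ActivityID INTEGER, ObjectiveID INTEGER);
    INSERT INTO Objective VALUES (1, 'objName1'), (2, 'objName2'), (3, 'objName3');
    INSERT INTO ActivityObjective VALUES (10, 1), (10, 2), (10, 3), (11, 1);
    """
)

# raw: every name prefixed with ', ' and glued together (the stand-in for FOR XML PATH('')).
# trimmed: the same string from character 3 onward, which drops the leading ', '.
raw, trimmed = conn.execute(
    """
    SELECT group_concat(', ' || o.ObjectiveName, ''),
           substr(group_concat(', ' || o.ObjectiveName, ''), 3, 1000)
    FROM ActivityObjective ao
    JOIN Objective o ON o.ObjectiveID = ao.ObjectiveID
    WHERE ao.ActivityID = ?
    """,
    (10,),
).fetchone()

print(repr(raw))
print(repr(trimmed))
```

The order of the names inside the result is not guaranteed here, since neither query specifies one. The length argument 1000 also caps the result, so a longer list would be truncated.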

SQL: Want to alter the conditions on a join depending on values in table

I have a table called Member_Id which has a column in it called Member_ID_Type. The select statement below returns the value of another column, id_value from the same table. The join on the tables in the select statement is on the universal id column. There may be several entries in that table with this same universal id.
I want to adjust the select statement so that it returns the id_values for entries that have member_id_type equal to '7'. However, if there is no such entry, I want to return records that have member_id_type equal to '1'.
Previously I had a condition on the join (commented out below), but that just returned records that had member_id_type equal to '7' and otherwise returned null.
I think I may have to use a case statement here, but I'm not 100% sure how to use it in this scenario.
SELECT TOP 1 cm.Contact_Relation_Gid,
mc.Universal_ID,
mi.ID_Value,
cm.First_Name,
cm.Last_Name,
cm.Middle_Name,
cm.Name_Suffix,
cm.Email_Address,
cm.Disability_Type_PKID,
cm.Race_Type_PKID,
cm.Citizenship_Type_PKID,
cm.Marital_Status_Type_PKID,
cm.Actual_SSN,
cm.Birth_Date,
cm.Gender,
mc.Person_Code,
mc.Relationship_Code,
mc.Member_Coverage_PKID,
sc.Subscriber_Coverage_PKID
FROM Contact_Member cm (NOLOCK)
INNER JOIN Member_Coverage mc (NOLOCK)
ON cm.contact_relation_gid = mc.contact_relation_gid
AND mc.Record_Status = 'A'
INNER JOIN Subscriber_Coverage sc (NOLOCK)
ON mc.Subscriber_Coverage_PKID = sc.Subscriber_Coverage_PKID
AND mc.Record_Status = 'A'
LEFT outer JOIN Member_ID mi ON mi.Universal_ID = cm.Contact_Gid
--AND mi.Member_ID_Type_PKID='7'
WHERE cm.Contact_Relation_Gid = @Contact_Relation_Gid
AND cm.Record_Status = 'A'
Join them both, and use one if the other is not present:
select bt.name
, coalesce(eav1.value, eav2.value) as Value1OrValue2
from BaseTable bt
left join EavTable eav1
on eav1.id = bt.id
and eav1.type = 1
left join EavTable eav2
on eav2.id = bt.id
and eav2.type = 2
This query assumes that there is never more than one record with the same ID and Type.
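Applied to the question's tables, the double LEFT JOIN with COALESCE could look like the sketch below. It is an illustration only: it runs on SQLite through Python's sqlite3 module, the tables are reduced to a few columns, the column name Member_ID_Type follows the question's prose, and all rows are invented.

```python
import sqlite3

# In-memory SQLite database with cut-down, hypothetical versions of the tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Contact_Member (Contact_Gid INTEGER, First_Name TEXT);
    CREATE TABLE Member_ID (Universal_ID INTEGER, Member_ID_Type TEXT, ID_Value TEXT);
    INSERT INTO Contact_Member VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    -- Ann has both types, Bob only type 1, Cy has no id at all.
    INSERT INTO Member_ID VALUES (1, '7', 'SEVEN-1'), (1, '1', 'ONE-1'), (2, '1', 'ONE-2');
    """
)

# Join the same table twice, once per type, and prefer type 7 over type 1.
rows = conn.execute(
    """
    SELECT cm.First_Name,
           COALESCE(mi7.ID_Value, mi1.ID_Value) AS ID_Value
    FROM Contact_Member cm
    LEFT JOIN Member_ID mi7
           ON mi7.Universal_ID = cm.Contact_Gid AND mi7.Member_ID_Type = '7'
    LEFT JOIN Member_ID mi1
           ON mi1.Universal_ID = cm.Contact_Gid AND mi1.Member_ID_Type = '1'
    ORDER BY cm.Contact_Gid
    """
).fetchall()

print(rows)  # prints: [('Ann', 'SEVEN-1'), ('Bob', 'ONE-2'), ('Cy', None)]
```

The type 1 value is used only when no type 7 row exists, and a member with neither type still appears, with NULL.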

selecting latest rows per distinct foreign key value

excuse the title, i couldn't come up with something short and to the point...
I've got a table 'updates' with three columns: text, typeid and created. text is a text field, typeid is a foreign key from a 'type' table and created is a timestamp. A user enters an update and selects the 'type' it corresponds to.
There's a corresponding 'type' table with columns 'id' and 'name'.
I'm trying to end up with a result set with as many rows as is in the 'type' table and the latest value from updates.text for the particular row in types. So if i've got 3 types, 3 rows would be returned, one row for each type and the most recent updates.text value for the type in question.
Any ideas?
thanks,
John.
select u.text, u.typeid, u.created, t.name
from (
select typeid, max(created) as MaxCreated
from updates
group by typeid
) mu
inner join updates u on mu.typeid = u.typeid and mu.MaxCreated = u.Created
left outer join type t on u.typeid = t.id
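A small run of this "join back on the per-group maximum" pattern, as a sketch only: it uses SQLite through Python's sqlite3 module, joins the type table on its id column as described in the question, and all rows are invented.

```python
import sqlite3

# In-memory SQLite database with the two tables described in the question.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE type (id INTEGER, name TEXT);
    CREATE TABLE updates (text TEXT, typeid INTEGER, created TEXT);
    INSERT INTO type VALUES (1, 'bug'), (2, 'feature');
    INSERT INTO updates VALUES
        ('first bug note',    1, '2021-01-01 10:00:00'),
        ('latest bug note',   1, '2021-01-02 10:00:00'),
        ('only feature note', 2, '2021-01-01 09:00:00');
    """
)

# Find the newest timestamp per type, then join back to fetch the matching row.
rows = conn.execute(
    """
    SELECT t.name, u.text
    FROM (
        SELECT typeid, MAX(created) AS MaxCreated
        FROM updates
        GROUP BY typeid
    ) mu
    JOIN updates u ON mu.typeid = u.typeid AND mu.MaxCreated = u.created
    LEFT JOIN type t ON u.typeid = t.id
    ORDER BY t.id
    """
).fetchall()

print(rows)  # prints: [('bug', 'latest bug note'), ('feature', 'only feature note')]
```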
What are the actual columns you want returned?
SELECT t.*,
y.*
FROM TYPE t
JOIN (SELECT u.typeid,
MAX(u.created) AS max_created
FROM UPDATES u
GROUP BY u.typeid) x ON x.typeid = t.id
JOIN UPDATES y ON y.typeid = x.typeid
AND y.created = x.max_created
SELECT
TYP.id,
TYP.name,
TXT.comment
FROM
dbo.Types TYP
INNER JOIN dbo.Type_Comments TXT ON
TXT.type_id = TYP.id
WHERE
NOT EXISTS
(
SELECT
*
FROM
dbo.Type_Comments TXT2
WHERE
TXT2.type_id = TYP.id AND
TXT2.created > TXT.created
)
Or:
SELECT
TYP.id,
TYP.name,
TXT.comment
FROM
dbo.Types TYP
INNER JOIN dbo.Type_Comments TXT ON
TXT.type_id = TYP.id
LEFT OUTER JOIN dbo.Type_Comments TXT2 ON
TXT2.type_id = TYP.id AND
TXT2.created > TXT.created
WHERE
TXT2.type_id IS NULL
In either case, if the created date can be identical between two rows with the same type_id then you would need to account for that.
I've also assumed at least one comment per type exists. If that's not the case then you would need to make a minor adjustment for that as well.
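Both caveats show up with very little data. The sketch below runs the NOT EXISTS variant on SQLite through Python's sqlite3 module; the rows are invented so that type 1 has two comments with the same created value and type 2 has no comment at all.

```python
import sqlite3

# In-memory SQLite database; table and column names follow the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Types (id INTEGER, name TEXT);
    CREATE TABLE Type_Comments (type_id INTEGER, comment TEXT, created TEXT);
    INSERT INTO Types VALUES (1, 'bug'), (2, 'feature');
    -- Type 1 has a tie on the newest date; type 2 has no comments.
    INSERT INTO Type_Comments VALUES
        (1, 'older', '2021-01-01'),
        (1, 'tie A', '2021-01-02'),
        (1, 'tie B', '2021-01-02');
    """
)

# Keep a comment only if no newer comment exists for the same type.
rows = conn.execute(
    """
    SELECT TYP.id, TXT.comment
    FROM Types TYP
    JOIN Type_Comments TXT ON TXT.type_id = TYP.id
    WHERE NOT EXISTS (
        SELECT *
        FROM Type_Comments TXT2
        WHERE TXT2.type_id = TYP.id
          AND TXT2.created > TXT.created
    )
    ORDER BY TXT.comment
    """
).fetchall()

print(rows)  # prints: [(1, 'tie A'), (1, 'tie B')]
```

Type 1 comes back twice because of the tie, and type 2 is missing because the inner join drops types without comments.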