SQL - delete record where sum = 0

I have a table with the values below:
If the sum of the values for the same ID is 0, I want to delete those rows from the table. The result should look like this:
The code I have:
DELETE FROM tmp_table
WHERE ID in
(SELECT ID
FROM tmp_table WITH(NOLOCK)
GROUP BY ID
HAVING SUM(value) = 0)
It only deletes the rows with ID = 2.
UPD: Including an additional example:
The rows in yellow need to be deleted.

Your query is working correctly: the only group that totals zero is ID 2. The other IDs contain subgroups that total zero (such as the first two rows with ID 1), but the total for all of those records is -3.
What you want requires a much more complex algorithm, along the lines of "bin packing", to remove the subgroups that sum to zero.

You can do what you want using window functions, by enumerating the values for each id. Taking your approach using a subquery:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
)
delete from t
where exists (select 1
from t t2
where t2.id = t.id and t2.value = - t.value and t2.seqnum = t.seqnum
);
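As a rough check of the pairing idea above, here is a minimal sketch that runs the same logic in SQLite through Python's sqlite3 module, not on SQL Server. The sample rows are invented for illustration. SQLite cannot delete through a CTE, so the sketch deletes by rowid instead, and the window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tmp_table (id INTEGER, value INTEGER)")

# Hypothetical sample data (not from the question).
conn.executemany(
    "INSERT INTO tmp_table VALUES (?, ?)",
    [(1, 5), (1, -5), (1, -3), (2, 4), (2, -4), (3, 7), (3, 7), (3, -7)],
)

# Number equal values within each id, then delete every row that has a
# partner with the opposite value and the same sequence number.
conn.execute(
    """
    WITH t AS (
        SELECT rowid AS rid, id, value,
               ROW_NUMBER() OVER (PARTITION BY id, value ORDER BY rowid) AS seqnum
        FROM tmp_table
    )
    DELETE FROM tmp_table
    WHERE rowid IN (
        SELECT t.rid
        FROM t
        WHERE EXISTS (
            SELECT 1
            FROM t AS t2
            WHERE t2.id = t.id
              AND t2.value = -t.value
              AND t2.seqnum = t.seqnum
        )
    )
    """
)

remaining = conn.execute(
    "SELECT id, value FROM tmp_table ORDER BY rowid"
).fetchall()
print(remaining)
```

With this data the unmatched -3 for id 1 and the second 7 for id 3 are the only rows left, because each has no opposite-signed partner with the same sequence number.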
You can also do this with a second layer of window functions:
with t as (
select t.*,
row_number() over (partition by id, value order by id) as seqnum
from tmp_table t
),
tt as (
select t.*, count(*) over (partition by id, abs(value), seqnum) as cnt
from t
)
delete from tt
where cnt = 2;

Related

How to select distinct records with preference depending on a value

I have this table:
Table 1
Notice that some IDs have duplicate records, and some of those have IsTop = 1.
In this kind of scenario I want to select the record that has IsTop = 1, and if there is none, I want to keep the one with IsTop = 0.
The goal is to have distinct IDs, but take the IsTop = 1 record if it exists.
How do I do that?
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by isTop desc) as seqnum
from t
) t
where seqnum = 1;
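A small sketch of how this behaves, run against SQLite via Python's sqlite3 module (an assumption made only so the example is self-contained; the query text is the same apart from an explicit column list). The table contents and the name column are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT, isTop INTEGER)")

# Hypothetical rows: ids 1 and 3 have both an isTop = 1 and an isTop = 0 row,
# id 2 only has an isTop = 0 row.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "a", 0), (1, "b", 1), (2, "c", 0), (3, "d", 1), (3, "e", 0)],
)

# Ordering by isTop DESC puts the isTop = 1 row first within each id,
# so seqnum = 1 picks it when it exists and falls back to isTop = 0 otherwise.
chosen = conn.execute(
    """
    SELECT id, name, isTop
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY isTop DESC) AS seqnum
        FROM t
    ) AS ranked
    WHERE seqnum = 1
    ORDER BY id
    """
).fetchall()
print(chosen)
```

If an id had two rows with the same isTop value, which one gets seqnum = 1 would be arbitrary unless a tie-breaking column is added to the ORDER BY.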

Delete Distinct column and latest date of other column

I have a table where the primary key is a composite key of ID and date. Is there a way that I can delete a single row where ID matches and the date is the latest date?
I am new to SQL, so I have tried a few things, but I either don't get the results I am looking for or can't get the syntax correct:
DELETE FROM Master
WHERE ((Identifier = 'SomeID')
AND (EffectiveDate = MAX(EffectiveDate));
There are multiple rows with the same ID but different dates, e.g.
ID EffectiveDate
-------------------------
A '2019-09-18'
A '2019-09-17'
A '2019-09-16'
Is there a way I can delete only the row with A | '2019-09-18'?
You can use window functions and an updatable CTE:
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1;
Note: If you want to limit this to a single id, then be sure to include a where id = 'a' in either the subquery or outer query.
Use row_number():
delete from (select *, row_number() over(partition by id order by effectivedate desc) rn from table_name
) a where a.rn=1
A correlated subquery might get the job done:
DELETE FROM Master
WHERE
Identifier = 'SomeID'
AND EffectiveDate = (
SELECT MAX(EffectiveDate) FROM Master WHERE Identifier = 'SomeID'
)
;
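To illustrate the correlated-subquery delete, here is a minimal sketch using SQLite through Python's sqlite3 module (an assumption; the question does not use SQLite). The three rows for A come from the question, the row for B is invented to show that other identifiers are untouched, and the dates are stored as ISO-8601 text so that MAX compares them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Master (
        Identifier    TEXT,
        EffectiveDate TEXT,
        PRIMARY KEY (Identifier, EffectiveDate)
    )
    """
)
conn.executemany(
    "INSERT INTO Master VALUES (?, ?)",
    [
        ("A", "2019-09-18"),
        ("A", "2019-09-17"),
        ("A", "2019-09-16"),
        ("B", "2019-09-18"),  # hypothetical extra identifier
    ],
)

# Delete only the latest row for the requested identifier.
conn.execute(
    """
    DELETE FROM Master
    WHERE Identifier = :id
      AND EffectiveDate = (
          SELECT MAX(EffectiveDate) FROM Master WHERE Identifier = :id
      )
    """,
    {"id": "A"},
)

remaining = conn.execute(
    "SELECT Identifier, EffectiveDate FROM Master "
    "ORDER BY Identifier, EffectiveDate DESC"
).fetchall()
print(remaining)
```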
Use a CTE to delete the row. Unlike the version above, the query below does not delete the latest-dated record for IDs that have only a single record:
with todelete as (
select t.*, row_number() over (partition by id order by effective_date desc) as seqnum
from t
)
delete from todelete
where seqnum = 1 and id in(select distinct id from todelete where seqnum<>1)
With correlated subquery for all IDs:
delete table1
from table1 t1
where t1.EffectiveDate =
(
select max(t2.EffectiveDate)
from table1 t2
where t2.ID = t1.ID
)

SQL query null value replaced based on another

I have a query that returns data and a row number using ROW_NUMBER() OVER(PARTITION BY), and I place the result into a temp table. The initial output looks like the screenshot.
From here I need to replace the nulls in the bt_newlabel column with the appropriate label: row numbers 1-4 would be in progress, 5-9 would be underwriting, 10-13 would be implementation, and so forth.
I am hitting a wall trying to determine how to do this. Thanks for any help or input on how to go about it.
One method is to assign groups, and then the value. Such as:
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t;
The group is simply the number of known values previously seen in the data.
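A minimal sketch of the grouping step, run in SQLite via Python's sqlite3 module (an assumption, used only to make the example runnable). The stamps and labels are invented, and the sketch assumes each label appears on the first row of its block, so the nulls that follow inherit it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (bt_stamp INTEGER, bt_newlabel TEXT)")

# Hypothetical data: a label marks the start of a block, later rows are NULL.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "In Progress"),
        (2, None),
        (3, None),
        (4, "Underwriting"),
        (5, None),
        (6, "Implementation"),
    ],
)

# COUNT(bt_newlabel) skips NULLs, so the running count only increases on rows
# that carry a label; every row up to the next label shares the same grp.
filled = conn.execute(
    """
    SELECT bt_stamp, grp,
           MAX(bt_newlabel) OVER (PARTITION BY grp) AS new_newlabel
    FROM (
        SELECT t.*, COUNT(bt_newlabel) OVER (ORDER BY bt_stamp) AS grp
        FROM t
    ) AS g
    ORDER BY bt_stamp
    """
).fetchall()

for row in filled:
    print(row)
```

Each group contains exactly one non-null label, so MAX simply returns that label for every row in the group.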
You can update the field with:
with toupdate as (
select t.*, max(bt_newlabel) over (partition by grp) as new_newlabel
from (select t.*, count(bt_newlabel) over (order by bt_stamp) as grp
from t
) t
)
update toupdate
set bt_newlabel = new_newlabel
where bt_newlabel is null;
If I understood what you are trying to do, this is the type of update you need to do on your temp table:
--This will update rows 1-4 to 'Pre-Underwritting'
UPDATE temp_table SET bt_newlabel = 'Pre-Underwritting'
WHERE rownumber between
1 AND (SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting');
--This will update rows 5-9 to 'Underwritting'
UPDATE temp_table SET bt_newlabel = 'Underwritting'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Pre-Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting');
--This will update rows 10-13 to 'Implementation'
UPDATE temp_table SET bt_newlabel = 'Implementation'
WHERE rownumber between
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Underwritting')
AND
(SELECT TOP 1 rownumber FROM temp_table WHERE bt_oldlabel = 'Implementation');
I made a working Fiddle to check out the results: http://sqlfiddle.com/#!18/1cae2/1/3

How to update rows based only on ROW_NUMBER()?

This SQL query:
SELECT ROW_NUMBER() OVER (PARTITION BY ID, YEAR order by ID ), ID, YEAR
from table t
gives me the following result set:
1 1000415591 2012
1 1000415591 2013
2 1000415591 2013
1 1000415591 2014
2 1000415591 2014
How can I update the records where ROW_NUMBER() equals 2? The other fields of these records are identical (select distinct from table where id = 1000415591 gives 3 records, while there are 5 without the distinct keyword), so I can rely only on the ROW_NUMBER() value.
I need a solution for Oracle; I saw something similar for SQL Server, but it won't work with Oracle.
You could use a MERGE statement which is quite verbose and easy to understand.
For example,
MERGE INTO t s
USING
(SELECT ROW_NUMBER() OVER (PARTITION BY ID, YEAR order by ID ) RN,
ID,
YEAR
FROM TABLE t
) u ON (s.id = u.id)
WHEN MATCHED THEN
UPDATE SET YEAR = some_value WHERE u.RN = 2
/
Note: You cannot update a column that is used to join in the ON clause.
Try using the ROWID pseudocolumn:
UPDATE T
SET t.year = t.year*1000
WHERE (rowid,2) in (SELECT rowid,
ROW_NUMBER()
OVER (PARTITION BY ID, t.YEAR order by ID )
FROM T)
SQLFiddle demo
If you need to update a range of row numbers, then:
UPDATE T
SET t.year = t.year*1000
WHERE rowid in ( SELECT rowid FROM
(
SELECT rowid,
ROW_NUMBER()
OVER (PARTITION BY ID, t.YEAR order by ID ) as RN
FROM T
) T2 WHERE RN >=2 AND RN <=10
)
SQLFiddle demo
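The rowid technique can be tried out with a small sketch in SQLite through Python's sqlite3 module. This is an assumption made for the sake of a runnable example: SQLite's rowid is only an analogue of Oracle's ROWID, and the sketch orders by rowid inside the window so that "row 2" is deterministic. The five rows are the ones listed in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, year INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1000415591, 2012),
        (1000415591, 2013),
        (1000415591, 2013),
        (1000415591, 2014),
        (1000415591, 2014),
    ],
)

# Identify the physical rows whose row number is 2 within (id, year),
# then update exactly those rows by rowid.
conn.execute(
    """
    UPDATE t
    SET year = year * 1000
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY id, year ORDER BY rowid) AS rn
            FROM t
        ) AS numbered
        WHERE rn = 2
    )
    """
)

years = [row[0] for row in conn.execute("SELECT year FROM t ORDER BY rowid")]
print(years)
```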
This is not the update statement, but this is how to get the 2 rows you wanted to update:
SELECT *
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY ID, YEAR order by ID ) as rn, ID, YEAR
from t )
where rn = 2
SQLFIDDLE
After posting the question, I realized that this could be the wrong approach. I can modify the table and add new fields, so a better solution is to create one more field, IDENTITY, and populate it with numbers from a new sequence, from 1 to the total number of rows. Then I can update fields based on this IDENTITY field.
I'll keep this question open in case someone comes up with a solution based on the ROW_NUMBER() analytic function.
update TABLE set NEW_ID = TABLE_SEQ.nextval
where IDENTITY in (
select IDENTITY from (
select row_number() over(PARTITION BY ID, YEAR order by ID) as row_num, t.ID, t."YEAR", t.IDENTITY
from TABLE t
) where row_num > 1
)

Select random values from each group, SQL

I have a project through which I'm creating a game powered by a database.
The database has data entered like this:
(ID, Name) || (1, PhotoID),(1,PhotoID),(1,PhotoID),(2,PhotoID),(2,PhotoID) and so on. There are thousands of entries.
This is my current SQL statement:
$sql = "SELECT TOP 8 * FROM Image WHERE Hidden = '0' ORDER BY NEWID()";
But this can also produce results with matching IDs, where I need to have each result have a unique ID (that is I need one result from each group).
How can I change my query to grab one result from each group?
Thanks!
Since ORDER BY NEWID() will result in a table scan anyway, you might use row_number() to isolate the first row in each group:
; with randomizer as (
select id,
name,
row_number() over (partition by id
order by newid()) rn
from Image
where hidden = 0
)
select top 8
id,
name
from randomizer
where rn = 1
-- Added by mellamokb's suggestion to allow groups to be randomized
order by newid()
Sql Fiddle playground thanks to mellamokb.
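The same pattern can be sketched in SQLite via Python's sqlite3 module, with random() standing in for SQL Server's NEWID() and LIMIT for TOP (both substitutions are assumptions made to keep the example runnable). The Image rows are generated, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Image (id INTEGER, name TEXT, hidden INTEGER)")

# Hypothetical data: 10 groups of 3 photos, the third photo of each is hidden.
rows = [
    (group_id, f"photo_{group_id}_{n}", 1 if n == 3 else 0)
    for group_id in range(1, 11)
    for n in range(1, 4)
]
conn.executemany("INSERT INTO Image VALUES (?, ?, ?)", rows)

# Pick one random visible photo per id, then pick 8 random ids from those.
picked = conn.execute(
    """
    WITH randomizer AS (
        SELECT id, name,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY random()) AS rn
        FROM Image
        WHERE hidden = 0
    )
    SELECT id, name
    FROM randomizer
    WHERE rn = 1
    ORDER BY random()
    LIMIT 8
    """
).fetchall()

picked_ids = [row[0] for row in picked]
print(picked)
```

The output differs from run to run, but every returned id is distinct and no hidden photo is selected.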
Looks like this may work, but I can't vouch for performance:
SELECT TOP 8 ID,
(select top 1 name from image i2
where i2.id = i1.id order by newid())
FROM Image i1
WHERE hidden = '0'
group by ID
ORDER BY NEWID();
Demo: http://www.sqlfiddle.com/#!3/657ad/6
If you have an index on the ID column and want to take advantage of the index and avoid a full table scan, do your randomization on the key values first:
WITH IDs AS
(
SELECT DISTINCT ID
FROM Image
WHERE Hidden = '0'
),
SequencedIDs AS
(
SELECT ID, ROW_NUMBER() OVER (ORDER BY NEWID()) AS Seq
FROM IDs
),
ImageGroups AS
(
SELECT i.*, ROW_NUMBER() OVER (PARTITION BY i.ID ORDER BY NEWID()) Seq
FROM SequencedIDs s
INNER JOIN Image i
ON i.ID = s.ID
WHERE s.Seq <= 8 -- Seq starts at 1, so <= 8 keeps eight IDs, matching TOP 8
AND i.Hidden = '0'
)
SELECT *
FROM ImageGroups
WHERE Seq = 1
This should drastically reduce the cost over the table scan approach, although I don't have a schema big enough that I can test with - so try running some statistics in SSMS and make sure ID is actually indexed for this to be effective.
select * from (select * from photos order by rand()) as _SUB group by _SUB.id;
select ID, Name from (select ID, Name, row_number() over
(partition by ID, Name order by ID) as ranker from Image where Hidden = 0 ) Z where ranker = 1
order by newID()