Removing entries from a table where its values already exist in the table - SQL

I'm starting with this example table (#temp2):
| a | b |
|---|---|
| 2 | 4 |
| 2 | 5 | x
| 3 | 1 |
| 6 | 4 | x
| 6 | 5 |
| 7 | 5 | x
| 7 | 4 | x
|---|---|
This is a table of transaction keys that I want to delete from another existing table. It represents transactions that negate other transactions, where a negates b or vice versa. So I cannot have a single a negating multiple b values, or a single b negating multiple a values. I have some logic that I thought would do it, but there is a problem: it checks whether a or b is a duplicate and deletes the row if it is. The trouble is that when a row is deleted, I would like it to 'free up' the value that wasn't the reason for the deletion. I hope this isn't too confusing. In my example I put an x next to each row I want deleted, but my current algorithm deletes too many rows. The row (6,5) gets deleted because a '5' already exists in the second row, but that second row is itself deleted (since '2' can't negate both '4' and '5'), which 'frees up' the '5' to negate the 6.
This is my current code, but it is deleting too many rows:
delete t
from #temp2 t
where exists (select * from #temp2
              where b = t.b
                and a < t.a)
   or exists (select * from #temp2
              where a = t.a
                and b < t.b)
Any help is much appreciated!
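One way to get the behaviour described above is to make the process explicitly greedy: walk the rows in (a, b) order, keep a row only if neither of its values has already been used by a kept row, and delete everything else. Below is a rough sketch of that idea, assuming SQL Server/T-SQL (suggested by the #temp2 temp table); the #keep table and the cursor are made up for illustration.
-- greedy pass: keep a row only if both its a and its b are still unused
create table #keep (a int, b int);
declare @a int, @b int;
declare cur cursor local fast_forward for
    select a, b from #temp2 order by a, b;
open cur;
fetch next from cur into @a, @b;
while @@fetch_status = 0
begin
    if not exists (select 1 from #keep k where k.a = @a or k.b = @b)
        insert into #keep (a, b) values (@a, @b);
    fetch next from cur into @a, @b;
end
close cur;
deallocate cur;
-- anything that was not kept is a row to delete
delete t
from #temp2 t
where not exists (select 1 from #keep k where k.a = t.a and k.b = t.b);
With the sample data this keeps (2,4), (3,1) and (6,5), and deletes the four rows marked with an x.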

Related

Suggestion for SQL scenario

I have two columns in a table:
InventoryId | RevisionId
-------------------------
1 | 1
2 | 1
2 | 2
2 | 2
3 | 1
3 | 2
3 | 3
3 | 3
but from now on I want to prevent the following records:
2 | 2
2 | 2
3 | 3
3 | 3
So I thought of creating a unique index on these two columns,
but the table already has so much existing data. Is there anything we can do in this situation?
Any suggestion?
You can use a trigger to prevent new rows with duplicate values from being added.
Look at this example:
create trigger TR_UI_YourTable on YourTable
for update, insert as
begin
    set nocount on
    if exists (select 1 from inserted i where i.InventoryId = i.RevisionId)
    begin
        ;throw 99001, 'no dupes allowed anymore...', 1
    end
end
A better solution would be to move the duplicates to a separate table for history, and then add a check constraint on these two columns.
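A rough sketch of that idea, treating rows with InventoryId = RevisionId as the unwanted ones (matching the trigger above); the YourTableHistory table is an assumption:
insert into YourTableHistory (InventoryId, RevisionId)
select InventoryId, RevisionId
from YourTable
where InventoryId = RevisionId;
delete from YourTable
where InventoryId = RevisionId;
alter table YourTable
add constraint chk_no_dupes check (InventoryId <> RevisionId);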
EDIT
You could do it with a check constraint like this:
alter table yourtable
add constraint chk_dupes
check ((InventoryId <> RevisionId) or (id <= 12345))
where 12345 is the current highest value of the id column.
You will have to test it a bit to see whether it works in all situations.
Also, it will only work if all new rows get an id value that is larger than the current highest value (12345 in my example).
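For example, assuming id values up to 12345 already exist (column names as used above), the behaviour would roughly be:
insert into yourtable (id, InventoryId, RevisionId) values (12346, 4, 1)  -- allowed
insert into yourtable (id, InventoryId, RevisionId) values (12347, 4, 4)  -- rejected by chk_dupes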

Insert data into a table, checking on every insert whether there is a match with the existing table values

I need to insert data from one table into another, but this insert must look into the table that receives the data to determine whether there is a match, and if there is, not insert the new data.
So, I have the following tables (NODE_ID refers to the values in NODE1 and NODE2; think of each arc as a line with two nodes):
Table A:
| ARC | NODE1 | NODE2 | STATE |
| x | 1 | 2 | A |
| y | 2 | 3 | A |
| z | 3 | 4 | B |
Table B:
| NODE_ID| VALUE |
| 1 | N |
| 2 | N |
| 3 | N |
| 4 | N |
And I want the following result, which relates NODE_ID with the arcs and writes the STATE value from the arcs table into the result table, with only one result for each node (otherwise I would have more than one row for the same node):
Table C result:
| NODE_ID| STATE |
| 1 | A |
| 2 | A |
| 3 |A(or B)|
I tried to do this with a CASE statement, EXISTS, IF, NVL2() and so on in the select, but I have had no result so far.
Any idea how I could write this query?
Thank you very much for your help
OK guys, I'm editing my message to explain how I finally did it. I've also changed my first message a little to make it clearer to understand, because we had problems with that.
So finally I used this query, which @mathguy introduced to me:
merge into Table_C c
using (select distinct b.NODE_ID as nodes, a.STATE
       from Table_A a, Table_B b
       where (b.NODE_ID = a.NODE1 or b.NODE_ID = a.NODE2)) s
on (s.nodes = c.NODE_ID)
when not matched then
insert (NODE_ID, STATE)
values (s.nodes, s.STATE)
That's all
This can be done with insert, but often when you update one table with values from another, the merge statement is more powerful (more flexible).
merge into table_c c
using ( select arc, min(state) as state from table_a group by arc ) s
on (s.arc = c.node_id)
when not matched then insert (node_id, state)
values (s.arc, s.state)
;
Thanks to @Boneist and @ThorstenKettner for pointing out several syntax errors (now fixed).
If table C does not yet exist, use a create table as select statement:
create table c as select arc as node_id, state from a;
In case there can be duplicate arc (not shown in your sample) you'd need aggregation:
create table c as select arc as node_id, min(state) as state from a group by arc;
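If you prefer the plain insert mentioned at the start of this answer, a sketch along the lines of the question's final merge might look like this; the NOT EXISTS check is what skips nodes already present in Table_C, and min(STATE) keeps a single state per node:
insert into Table_C (NODE_ID, STATE)
select s.nodes, s.state
from (select b.NODE_ID as nodes, min(a.STATE) as state
      from Table_A a, Table_B b
      where b.NODE_ID = a.NODE1 or b.NODE_ID = a.NODE2
      group by b.NODE_ID) s
where not exists (select 1 from Table_C c where c.NODE_ID = s.nodes);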

SQLite string replace/delete

I have a column named tags in my database table which contains comma-separated strings, and it has records like this:
index | tags
-------------
1 | a,b,c
2 | b
3 | c
4 | z
5 | b,a,c
6 | p,f,w
7 | a,c,b
(for simplicity I am denoting strings with single characters)
Now I want to replace/delete a particular string.
Delete - say I want to delete b from all rows. If the tags column becomes empty after this operation, that row/record should be deleted (index 2 in this case). My records should look like this after the operation:
index | tags
-------------
1 | a,c
3 | c
4 | z
5 | a,c
6 | p,f,w
7 | a,c
Replace - say I want to replace all a with k in the original records:
index | tags
-------------
1 | k,b,c
2 | b
3 | c
4 | z
5 | b,k,c
6 | p,f,w
7 | k,c,b
Question - I am thinking of using the replace function somehow, but I'm not sure how to meet the above requirements with it. Can I do this in a single SQL command? If not, please suggest the best way to do this (maybe with multiple SQL commands).
I use MSSQL, I'm not sure about SQLite. But you can use the REPLACE function, like this:
To remove b:
UPDATE Your_Table
SET tags = REPLACE(REPLACE(tags, ',b', ''), 'b,', '')
DELETE FROM Your_Table
WHERE tags = 'b'
To replace a with k:
UPDATE Your_Table
SET tags = REPLACE(tags, 'a', 'k')
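Since the question is about SQLite, here is a sketch of the same idea there, padding the value with commas so that only whole tags match; the table name is assumed, and the final delete follows the question's requirement of dropping rows whose tag list becomes empty:
-- delete tag b, then drop rows whose tag list became empty
UPDATE your_table
SET tags = trim(replace(',' || tags || ',', ',b,', ','), ',');
DELETE FROM your_table
WHERE tags = '';
-- replace tag a with k the same way
UPDATE your_table
SET tags = trim(replace(',' || tags || ',', ',a,', ',k,'), ',');
Note that replace() only handles non-overlapping matches, so a row containing the same tag twice in direct succession (e.g. b,b) would need a second pass.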

Finding the difference between two sets of data from the same table

My data looks like:
run | line | checksum | group
-----------------------------
1 | 3 | 123 | 1
1 | 7 | 123 | 1
1 | 4 | 123 | 2
1 | 5 | 124 | 2
2 | 3 | 123 | 1
2 | 7 | 123 | 1
2 | 4 | 124 | 2
2 | 4 | 124 | 2
and I need a query that returns me the new entries in run 2
run | line | checksum | group
-----------------------------
2 | 4 | 124 | 2
2 | 4 | 124 | 2
I tried several things, but I never got to a satisfying answer.
In this case I'm using H2, but of course I'm interested in a general explanation that would help me to wrap my head around the concept.
EDIT:
OK, it's my first post here, so please forgive me if I didn't state the question precisely enough.
Basically, given two run values (r1, r2, with r2 > r1), I want to determine which rows with run = r2 have a line, checksum and group combination that does not appear in any row with run = r1.
select * from yourtable
where run = 2
  and checksum = (select max(checksum) from yourtable)
Assuming your last run will have a higher run value than the others, the SQL below will help:
select * from table1 t1
where t1.run in
(select max(t2.run) from table1 t2)
Update:
The SQL above may not give you exactly the right rows because your requirement is not entirely clear, but the overall idea is to fetch the rows based on the latest run.
SELECT line, checksum, group
FROM TableX
WHERE run = 2
EXCEPT
SELECT line, checksum, group
FROM TableX
WHERE run = 1
or (with a slightly different result, since EXCEPT removes duplicate rows while NOT EXISTS keeps them):
SELECT *
FROM TableX x
WHERE run = 2
AND NOT EXISTS
( SELECT *
FROM TableX x2
WHERE run = 1
AND x2.line = x.line
AND x2.checksum = x.checksum
AND x2.group = x.group
)
A slightly different approach:
select min(run) run, line, checksum, group
from mytable
where run in (1,2)
group by line, checksum, group
having count(*)=1 and min(run)=2
Incidentally, I assume that the "group" column in your table isn't actually called group - this is a reserved word in SQL and would need to be enclosed in double quotes (or backticks or square brackets, depending on which RDBMS you are using).
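For example, in H2 the EXCEPT query above would then need the column quoted:
SELECT line, checksum, "group"
FROM TableX
WHERE run = 2
EXCEPT
SELECT line, checksum, "group"
FROM TableX
WHERE run = 1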

Deleting similar rows in SQL

In PostgreSQL 8.3, let's say I have a table called widgets with the following:
id | type | count
--------------------
1 | A | 21
2 | A | 29
3 | C | 4
4 | B | 1
5 | C | 4
6 | C | 3
7 | B | 14
I want to remove duplicates based upon the type column, leaving only those with the highest count column value in the table. The final data would look like this:
id | type | count
--------------------
2 | A | 29
3 | C | 4 /* `id` for this record might be '5' depending on your query */
7 | B | 14
I feel like I'm close, but I can't seem to wrap my head around a query that works to get rid of the duplicate rows.
count is a reserved word in standard SQL, so it will have to be escaped somehow; in Postgres you can do that with double quotes. In any case, the following should theoretically work (but I didn't actually test it):
delete from widgets where id not in (
    select max(w2.id) from widgets as w2 inner join
        (select max(w1."count") as "count", type from widgets as w1 group by w1.type) as sq
        on sq."count" = w2."count" and sq.type = w2.type
    group by w2.type
);
There is a slightly simpler answer than Asaph's, using the EXISTS SQL operator:
DELETE FROM widgets AS a
WHERE EXISTS
(SELECT * FROM widgets AS b
WHERE (a.type = b.type AND b.count > a.count)
OR (b.id > a.id AND a.type = b.type AND b.count = a.count))
The EXISTS operator returns TRUE if the subquery that follows it returns at least one record.
According to your requirements, it seems to me that this should work:
DELETE
FROM widgets
WHERE (type, count) NOT IN
(
    SELECT type, MAX(count)
    FROM widgets
    GROUP BY type
)
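For completeness, a PostgreSQL-specific sketch using DISTINCT ON (available in 8.3) that keeps exactly one row per type, breaking count ties in favour of the highest id; treat it as an untested alternative rather than the accepted approach:
DELETE FROM widgets
WHERE id NOT IN (
    SELECT DISTINCT ON (type) id
    FROM widgets
    ORDER BY type, count DESC, id DESC
);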