SQLServer Delete from table with list coming from another table - sql-server-2012

I have a DELETE requiring a where with an AND in the WHERE clause to pick the rows. One the of values in the WHERE is actually a bunch of ids. I could accomplish with programming to get the list of user ids and then looping using the list.
This query returns a list of GroupUserIds
SELECT Id FROM GroupUser WHERE GroupUserId = #GroupUserId
I this wish to delete from VariableTransaction for each GroupUserId
Foreach #GroupUserId in GroupUserIds
DELETE FROM VariableTransaction WHERE VariableId = #VariableId AND GroupUserId = #GroupUserId
There should be a way to combine these into a single SQL statement but looking at all the examples I cannot figure out a solution, the where with the AND complicates it.

Try using an IN clause with a subquery:
DELETE FROM VariableTransaction WHERE VariableId = #VariableId
AND GroupUserId IN (SELECT Id FROM GroupUser WHERE GroupUserId = #GroupUserId)

Related

Deleting value using SQlite while doing an INNER JOIN

I am trying to delete all voters from a voters table where they are not registered as a democrat or republican AND only voted once. I have a database with three tables, congress_members, voters, and votes and have to JOIN votes with voters in order to delete the right data.
This code finds the data I want to delete:
SELECT voters.*
FROM voters JOIN votes ON voters.id = votes.voter_id
WHERE party = 'green' OR party = 'na' OR party = 'independent'
GROUP BY votes.voter_id
HAVING COUNT(*) = 1;
But I am unable to delete it because I am getting an error everytime I try to delete with a JOIN statement
You can phrase this as a delete with a where clause:
delete from voters
where votes.party not in ('democrat', 'republican') and
voters.id in (select id from votes group by id having count(*) = 1);
You are getting the error because the join will query your database and create a temporary table that will hold your newly queried data. The delete staements are used to remove data that is stored inside your database on your disk and not inside your memory.
The delete statement syntax is "DELETE FROM table WHERE conditions". The table value will need to be one of the three tables in your database, and your target is voters. As of right now, you have half of your delete statement complete.
The where clause needs to evaluate to a boolean value for each row. There is a function called EXISTS (). This function can be used to delete this data. Essentially, you will place your select statement from your post inside of the EXISTS (). The function will compare each of your rows in the target delete table to a row in your table inside of exists. If there is a match, then the row exists, the function evaluates to true for that row, and it is deleted.
DELETE FROM voters
WHERE (party = 'green' OR party = 'na' OR party = 'independent')
AND EXISTS (
SELECT 1 FROM votes WHERE votes.id = voters.id
HAVING COUNT(*) = 1
)

SQL query to select * from table, but return ID column as looked-up usernames

I'm using Bugzilla, and I essentially want to SELECT * FROM bugs table in the "bugs" database. However, the "assigned_to" column actually contains integer values (IDs) instead of a string with the user name.
These IDs match primary keys in the "profiles" table (the "userid" column), and the string I want my query to return is actually stored in the "realname" column in that table.
How can I modify this query to capture all columns in "bugs," but perform a lookup on the assigned_to column and return usernames?
SELECT b.*, p.realname FROM bugs b
JOIN profiles p
ON b.assigned_to = p.userid

Delete duplicates with no primary key

Here want to delete rows with a duplicated column's value (Product) which will be then used as a primary key.
The column is of type nvarchar and we don't want to have 2 rows for one product.
The database is a large one with about thousands rows we need to remove.
During the query for all the duplicates, we want to keep the first item and remove the second one as the duplicate.
There is no primary key yet, and we want to make it after this activity of removing duplicates.
Then the Product columm could be our primary key.
The database is SQL Server CE.
I tried several methods, and mostly getting error similar to :
There was an error parsing the query. [ Token line number = 2,Token line offset = 1,Token in error = FROM ]
A method which I tried :
DELETE FROM TblProducts
FROM TblProducts w
INNER JOIN (
SELECT Product
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1
)Dup ON w.Product = Dup.Product
The preferred way trying to learn and adjust my code with something similar
(It's not correct yet):
SELECT Product, COUNT(*) TotalCount
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC
--
;WITH cte -- These 3 lines are the lines I have more doubt on them
AS (SELECT ROW_NUMBER() OVER (PARTITION BY Product
ORDER BY ( SELECT 0)) RN
FROM Word)
DELETE FROM cte
WHERE RN > 1
If you have two DIFFERENT records with the same Product column, then you can SELECT the unwanted records with some criterion, e.g.
CREATE TABLE victims AS
SELECT MAX(entryDate) AS date, Product, COUNT(*) AS dups FROM ProductsTable WHERE ...
GROUP BY Product HAVING dups > 1;
Then you can do a DELETE JOIN between ProductTable and Victims.
Or also you can select Product only, and then do a DELETE for some other JOIN condition, for example having an invalid CustomerId, or EntryDate NULL, or anything else. This works if you know that there is one and only one valid copy of Product, and all the others are recognizable by the invalid data.
Suppose you instead have IDENTICAL records (or you have both identical and non-identical, or you may have several dupes for some product and you don't know which). You run exactly the same query. Then, you run a SELECT query on ProductsTable and SELECT DISTINCT all products matching the product codes to be deduped, grouping by Product, and choosing a suitable aggregate function for all fields (if identical, any aggregate should do. Otherwise I usually try for MAX or MIN). This will "save" exactly one row for each product.
At that point you run the DELETE JOIN and kill all the duplicated products. Then, simply reimport the saved and deduped subset into the main table.
Of course, between the DELETE JOIN and the INSERT SELECT, you will have the DB in a unstable state, with all products with at least one duplicate simply disappeared.
Another way which should work in MySQL:
-- Create an empty table
CREATE TABLE deduped AS SELECT * FROM ProductsTable WHERE false;
CREATE UNIQUE INDEX deduped_ndx ON deduped(Product);
-- DROP duplicate rows, Joe the Butcher's way
INSERT IGNORE INTO deduped SELECT * FROM ProductsTable;
ALTER TABLE ProductsTable RENAME TO ProductsBackup;
ALTER TABLE deduped RENAME TO ProductsTable;
-- TODO: Copy all indexes from ProductsTable on deduped.
NOTE: the way above DOES NOT WORK if you want to distinguish "good records" and "invalid duplicates". It only works if you have redundant DUPLICATE records, or if you do not care which row you keep and which you throw away!
EDIT:
You say that "duplicates" have invalid fields. In that case you can modify the above with a sorting trick:
SELECT * FROM ProductsTable ORDER BY Product, FieldWhichShouldNotBeNULL IS NULL;
Then if you have only one row for product, all well and good, it will get selected. If you have more, the one for which (FieldWhichShouldNeverBeNull IS NULL) is FALSE (i.e. the one where the FieldWhichShouldNeverBeNull is actually not null as it should) will be selected first, and inserted. All others will bounce, silently due to the IGNORE clause, against the uniqueness of Product. Not a really pretty way to do it (and check I didn't mix true with false in my clause!), but it ought to work.
EDIT
actually more of a new answer
This is a simple table to illustrate the problem
CREATE TABLE ProductTable ( Product varchar(10), Description varchar(10) );
INSERT INTO ProductTable VALUES ( 'CBPD10', 'C-Beam Prj' );
INSERT INTO ProductTable VALUES ( 'CBPD11', 'C Proj Mk2' );
INSERT INTO ProductTable VALUES ( 'CBPD12', 'C Proj Mk3' );
There is no index yet, and no primary key. We could still declare Product to be primary key.
But something bad happens. Two new records get in, and both have NULL description.
Yet, the second one is a valid product since we knew nothing of CBPD14 before now, and therefore we do NOT want to lose this record completely. We do want to get rid of the spurious CBPD10 though.
INSERT INTO ProductTable VALUES ( 'CBPD10', NULL );
INSERT INTO ProductTable VALUES ( 'CBPD14', NULL );
A rude DELETE FROM ProductTable WHERE Description IS NULL is out of the question, it would kill CBPD14 which isn't a duplicate.
So we do it like this. First get the list of duplicates:
SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1;
We assume that: "There is at least one good record for every set of bad records".
We check this assumption by positing the opposite and querying for it. If all is copacetic we expect this query to return nothing.
SELECT Dups.Product FROM ProductTable
RIGHT JOIN ( SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1 ) AS Dups
ON (ProductTable.Product = Dups.Product
AND ProductTable.Description IS NOT NULL)
WHERE ProductTable.Description IS NULL;
To further verify, I insert two records that represent this mode of failure; now I do expect the query above to return the new code.
INSERT INTO ProductTable VALUES ( "AC5", NULL ), ( "AC5", NULL );
Now the "check" query indeed returns,
AC5
So, the generation of Dups looks good.
I proceed now to delete all duplicate records that are not valid. If there are duplicate, valid records, they will stay duplicate unless some condition may be found, distinguishing among them one "good" record and declaring all others "invalid" (maybe repeating the procedure with a different field than Description).
But ay, there's a rub. Currently, you cannot delete from a table and select from the same table in a subquery ( http://dev.mysql.com/doc/refman/5.0/en/delete.html ). So a little workaround is needed:
CREATE TEMPORARY TABLE Dups AS
SELECT Product, COUNT(*) AS Duplicates
FROM ProductTable GROUP BY Product HAVING Duplicates > 1;
DELETE ProductTable FROM ProductTable JOIN Dups USING (Product)
WHERE Description IS NULL;
Now this will delete all invalid records, provided that they appear in the Dups table.
Therefore our CBPD14 record will be left untouched, because it does not appear there. The "good" record for CBPD10 will be left untouched because it's not true that its Description is NULL. All the others - poof.
Let me state again that if a record has no valid records and yet it is a duplicate, then all copies of that record will be killed - there will be no survivors.
To avoid this can may first SELECT (using the query above, the check "which should return nothing") the rows representing this mode of failure into another TEMPORARY TABLE, then INSERT them back into the main table after the deletion (using transactions might be in order).
Create a new table by scripting the old one out and renaming it. Also script all objects (indexes etc..) from the old table to the new. Insert the keepers into the new table. If you're database is in bulk-logged or simple recovery model, this operation will be minimally logged. Drop the old table and then rename the new one to the old name.
The advantage of this over a delete will be that the insert can be minimally logged. Deletes do double work because not only does the data get deleted, but the delete has to be written to the transaction log. For big tables, minimally logged inserts will be much faster than deletes.
If it's not that big and you have some downtime, and you have Sql Server Management studio, you can put an identity field on the table using the GUI. Now you have the situation like your CTE, except the rows themselves are truly distinct. So now you can do the following
SELECT MIN(table_a.MyTempIDField)
FROM
table_a lhs
join table_1 rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
table_a.MyTempIDField <> table_b.MyTempIDField
GROUP BY
lhs.field1, rhs.field2 etc
This gives you all the 'good' duplicates. Now you can wrap this query with a DELETE FROM query.
DELETE FROM lhs
FROM table_a lhs
join table_b rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
lhs.MyTempIDField <> rhs.MyTempIDField
and lhs.MyTempIDField not in (
SELECT MIN(lhs.MyTempIDField)
FROM
table_a lhs
join table_a rhs
on lhs.field1 = rhs.field1
and lhs.field2 = rhs.field2 [etc]
WHERE
lhs.MyTempIDField <> rhs.MyTempIDField
GROUP BY
lhs.field1, lhs.field2 etc
)
Try this:
DELETE FROM TblProducts
WHERE Product IN
(
SELECT Product
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1)
This suffers from the defect that it deletes ALL the records with a duplicated Product. What you probably want to do is delete all but one of each group of records with a given Product. It might be worthwhile to copy all the duplicates to a separate table first, and then somehow remove duplicates from that table, then apply the above, and then copy remaining products back to the original table.

Update with multiple rows

i am using UPDATE to update all records in the table with value from another table:
UPDATE x.ADR_TEMP SET LOCATIONID = (SELECT ID FROM x.OBJECTRELATIONSHIP WHERE PROPERTYCLASS = 'PHYSICALLOCATION')
in both tables are the same number of records. I just need to read id from one table and put them into another.
I have error: ORA-01427: single-row subquery returns more than one row
I am new in this subject and suspect that solution is very simple. Please help me
UPDATE x.ADR_TEMP SET LOCATIONID = (
SELECT ID
FROM x.OBJECTRELATIONSHIP
WHERE PROPERTYCLASS = 'PHYSICALLOCATION' AND x.ADR_TEMP.relatedKey = x.OBJECTRELATIONSHIP.relatedKey )
if your tables are related to each other with some way like this(relatedKey is a column which relates to the second table mentioned), you can try this one.(not sure or perfect)

how to delete duplicates from a database table based on a certain field

i have a table that somehow got duplicated. i basically want to delete all records that are duplicates, which is defined by a field in my table called SourceId. There should only be one record for each source ID.
is there any SQL that i can write that will delete every duplicate so i only have one record per Sourceid ?
Assuming you have a column ID that can tie-break the duplicate sourceid's, you can use this. Using min(id) causes it to keep just the min(id) per sourceid batch.
delete from tbl
where id NOT in
(
select min(id)
from tbl
group by sourceid
)
delete from table
where pk in (
select i2.pk
from table i1
inner join table i2
on i1.SourceId = i2.SourceId
)
good practice is to start with
select * from … and only later replace to delete from …