Can I delete entries from two tables in one statement? - sql

I have to remove a row from each of two tables. They're linked by an ID, but not with a proper PK-FK relationship (this db has NO foreign keys!).
The tables have a supposed 1-1 relationship. I don't know why they weren't just put in the same table but I'm not at liberty to change it.
People
PersonId | Name | OwnsMonkey
----------------------------
1 Jim true
2 Jim false
3 Gaz true
Info
PersonId | FurtherInfo
-----------------------------
1 Hates his monkey
2 Wants a monkey
3 Loves his monkey
To decide what to delete, I have to find a username and whether or not they own a monkey:
Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false'
So I'm doing two separate statements using this idea, deleting from Info first and then from People:
delete from Info where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
delete from People where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
I found a promising answer here on StackOverflow
delete a.*, b.*
from People a
inner join Info b
where a.People = b.Info
and a.PersonId =
(Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false')
But it gives a syntax error in SQL Server (2012). I tried it without aliases too, but it doesn't seem possible to delete from two tables at once.

Can I delete entries from two tables in one statement?
No. One statement can delete rows only from one table in MS SQL Server.
The answer that you refer to is about MySQL, and MySQL does allow deleting from several tables with one statement, as can be seen in the MySQL docs. MS SQL Server doesn't support this, as can be seen in its docs: there is no syntax for including more than one table in a DELETE statement in SQL Server. If you try to delete from a view rather than a table, there is a limitation as well:
The view referenced by table_or_view_name must be updatable and
reference exactly one base table in the FROM clause of the view
definition.
I was hoping to avoid two separate statements on the off-chance the
second doesn't work for whatever reason, interrupted - concurrency
really, I guess the TRY/CATCH will work well for that.
This is what transactions are for.
You can put several statements in a transaction and either all of them would succeed, or all of them would fail. Either all or nothing.
In your case you not only can, but should, put both DELETE statements in a transaction.
TRY/CATCH helps to process possible errors in a more controlled way, but the primary concept is "transaction".
BEGIN TRANSACTION
delete from Info where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
delete from People where PersonId = (Select PersonId from People where Name = 'Jim' and OwnsMonkey = 'false');
COMMIT
I highly recommend reading the great article Error and Transaction Handling in SQL Server by Erland Sommarskog.
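To make the all-or-nothing behaviour concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server (an assumption made so the example is self-contained), and the `fail_midway` flag is a hypothetical switch that simulates an interruption between the two deletes. It shows the concept only, not the T-SQL TRY/CATCH syntax.

```python
import sqlite3

# Same selection logic as in the question; IN is used so it also copes with several matches.
SUBQUERY = "(SELECT PersonId FROM People WHERE Name = ? AND OwnsMonkey = ?)"

def delete_person(conn, name, owns_monkey, fail_midway=False):
    """Delete from Info, then People, inside one transaction."""
    params = (name, owns_monkey)
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute("DELETE FROM Info WHERE PersonId IN " + SUBQUERY, params)
        if fail_midway:  # hypothetical flag: simulate an interruption
            raise RuntimeError("simulated failure between the two deletes")
        conn.execute("DELETE FROM People WHERE PersonId IN " + SUBQUERY, params)

def count(conn, table):
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE People (PersonId INTEGER, Name TEXT, OwnsMonkey TEXT);
    CREATE TABLE Info (PersonId INTEGER, FurtherInfo TEXT);
    INSERT INTO People VALUES (1, 'Jim', 'true'), (2, 'Jim', 'false'), (3, 'Gaz', 'true');
    INSERT INTO Info VALUES (1, 'Hates his monkey'), (2, 'Wants a monkey'), (3, 'Loves his monkey');
""")

# First attempt fails after the Info delete: the rollback restores the Info row.
try:
    delete_person(conn, "Jim", "false", fail_midway=True)
except RuntimeError:
    pass
counts_after_failure = (count(conn, "Info"), count(conn, "People"))

# Second attempt succeeds: both rows are gone together.
delete_person(conn, "Jim", "false")
counts_after_success = (count(conn, "Info"), count(conn, "People"))

print(counts_after_failure, counts_after_success)
```

The failed attempt leaves both tables untouched, so you never end up with a People row whose Info row has been deleted.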
If you try to be tricky, like this:
WITH
CTE
AS
(
SELECT
Info.PersonId AS ID1, People.PersonId AS ID2
FROM
Info
INNER JOIN People ON Info.PersonId = People.PersonId
)
DELETE FROM CTE
WHERE ID1 = 1;
You'll get an error:
View or function 'CTE' is not updatable because the modification
affects multiple base tables.
Or like this:
WITH
CTE
AS
(
SELECT
PersonId
FROM Info
UNION ALL
SELECT
PersonId
FROM People
)
DELETE FROM CTE
WHERE PersonId = 1;
You'll get another error:
View 'CTE' is not updatable because the definition contains a UNION
operator.

Related

Postgres - How to find id's that are not used in different multiple tables (inactive id's) - badly written query

I have a table towns, which is the main table. It contains a lot of rows and has become so 'dirty' (someone inserted 5 million rows) that I would like to get rid of the unused towns.
There are 3 referencing tables that use my town_id as a reference to towns.
I know there are many towns that are not used in these tables. Only if a town_id is found in none of these 3 tables do I consider it inactive, and I would like to remove that town (because it's not used).
As you can see, towns is used in these 2 different tables:
employees
offices
For the table vendors there is a vendor_id column in towns, since one vendor can have multiple towns.
So if vendor_id in towns is null and town_id is not found in either of these 2 tables, it is safe to remove it :)
I created a query which might work but it is taking tooooo much time to execute, and it looks something like this:
select count(*)
from towns
where vendor_id is null
and id not in (select town_id from banks)
and id not in (select town_id from employees)
So basically I said: if vendor_id is null, this town is definitely not related to vendors, and if the same town is also not in banks and employees, then it is safe to remove it. But the query took too long and never completed successfully, since towns has 5 million rows, which is why it is so dirty.
In fact, I'm not able to execute the given query at all, since the server terminated abnormally.
Here is full error message:
ERROR: server closed the connection unexpectedly This probably means
the server terminated abnormally before or while processing the
request.
Any kind of help would be awesome
Thanks!
You can join the tables with LEFT JOIN and use the WHERE clause to identify the town_id values that have no row in the banks and employees tables:
WITH list AS
( SELECT t.town_id
FROM towns AS t
LEFT JOIN tbl.banks AS b ON b.town_id = t.town_id
LEFT JOIN tbl.employees AS e ON e.town_id = t.town_id
WHERE t.vendor_id IS NULL
AND b.town_id IS NULL
AND e.town_id IS NULL
LIMIT 1000
)
DELETE FROM tbl.towns AS t
USING list AS l
WHERE t.town_id = l.town_id ;
Before launching the DELETE, you can check the indexes on your tables.
Adding an index as follows can be useful:
CREATE INDEX town_id_nulls ON towns (town_id NULLS FIRST) ;
Last but not least, you can add a LIMIT clause in the CTE to limit the number of rows deleted each time you execute the DELETE and so avoid the unexpected termination. As a consequence, you will have to relaunch the DELETE several times until there are no more rows to delete.
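The "relaunch until nothing is left" loop can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's sqlite3 as a stand-in for Postgres, the function name `delete_unused_towns` and the tiny sample data are hypothetical, and the towns key is called town_id as in the query above.

```python
import sqlite3

def delete_unused_towns(conn, batch_size=1000):
    """Delete unreferenced towns in batches; return how many batches deleted rows."""
    batches = 0
    while True:
        with conn:  # one short transaction per batch
            cur = conn.execute(
                """
                DELETE FROM towns
                WHERE town_id IN (
                    SELECT t.town_id
                    FROM towns AS t
                    LEFT JOIN banks AS b ON b.town_id = t.town_id
                    LEFT JOIN employees AS e ON e.town_id = t.town_id
                    WHERE t.vendor_id IS NULL
                      AND b.town_id IS NULL
                      AND e.town_id IS NULL
                    LIMIT ?
                )
                """,
                (batch_size,),
            )
        if cur.rowcount == 0:  # nothing left to delete
            break
        batches += 1
    return batches

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE towns (town_id INTEGER PRIMARY KEY, vendor_id INTEGER);
    CREATE TABLE banks (town_id INTEGER);
    CREATE TABLE employees (town_id INTEGER);
    INSERT INTO banks VALUES (2);
    INSERT INTO employees VALUES (3);
""")
# Towns 1..10: town 1 belongs to a vendor, 2 and 3 are referenced, 4..10 are unused.
conn.executemany(
    "INSERT INTO towns VALUES (?, ?)",
    [(i, 99 if i == 1 else None) for i in range(1, 11)],
)
conn.commit()

batches_run = delete_unused_towns(conn, batch_size=3)
remaining = [r[0] for r in conn.execute("SELECT town_id FROM towns ORDER BY town_id")]
print(batches_run, remaining)
```

Keeping each batch in its own transaction means an interruption only loses the batch in progress, not the work already committed.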
You can try a JOIN; on big tables it should be faster than two INs.
You could also try UNION ALL and live with the duplicates, as it is faster than UNION.
Finally, you can use a combined index on id and vendor_id to speed up the query.
CREATE TABLE towns (id int, vendor_id int)
CREATE TABLE
CREATE TABLE banks (town_id int)
CREATE TABLE
CREATE TABLE employees (town_id int)
CREATE TABLE
select count(*)
from towns t1 LEFT JOIN (select town_id from banks UNION ALL select town_id from employees) t2 on t1.id = t2.town_id
where vendor_id is null and t2.town_id is null
count
0
SELECT 1
fiddle
The trick is to first make a list of all the town_id's you want to keep and then start removing those that are not there.
By looking in 2 tables you're making life harder for the server so let's just create 1 single list first.
-- build empty temp-table
CREATE TEMPORARY TABLE TEMP_must_keep
AS
SELECT town_id
FROM tbl.towns
WHERE 1 = 2;
-- get id's from first table
INSERT INTO TEMP_must_keep (town_id)
SELECT DISTINCT town_id
FROM tbl.banks;
-- add index to speed up the EXCEPT below
CREATE UNIQUE INDEX idx_uq_must_keep_town_id ON TEMP_must_keep (town_id);
-- add new ones from second table
INSERT INTO TEMP_must_keep (town_id)
SELECT town_id
FROM tbl.employees
EXCEPT -- auto-distincts
SELECT town_id
FROM TEMP_must_keep;
-- rebuild index simply to ensure little fragmentation
REINDEX TABLE TEMP_must_keep;
-- optional, but might help: create a temporary index on the towns table to speed up the delete
CREATE INDEX idx_towns_town_id_where_vendor_null ON tbl.towns (town_id) WHERE vendor_id IS NULL;
-- Now do actual delete
-- You can do a `SELECT COUNT(*)` rather than a `DELETE` first if you feel like it, both will probably take some time depending on your hardware.
DELETE
FROM tbl.towns as del
WHERE vendor_id is null
AND NOT EXISTS ( SELECT *
FROM TEMP_must_keep mk
WHERE mk.town_id = del.town_id);
-- cleanup
DROP INDEX tbl.idx_towns_town_id_where_vendor_null;
DROP TABLE TEMP_must_keep;
The idx_towns_town_id_where_vendor_null index is optional, and I'm not sure it will actually lower the total time, but IMHO it will help the DELETE operation, if only because the index should give the query optimizer a better view of what volumes to expect.

How to find matching values in two different tables SQL

I am trying to write a statement to find matching values from two different tables and set table 1 suppression to 1 if person_id from table 2 matches person_id from table 1 in SQL.
Table 1 = id_num, person_id, phone_number, suppression
Table 2 = person_id, uid
So if person_id from table 2 matches person_id from table 1 it should set each record to 1 in suppression column in table 1.
Sample data and dbms would be needed to give a reliable answer, but the general gist is like this
-- for mySql, syntax will vary by dbms
update table1
inner join table2
on table1.person_id = table2.person_id
set table1.suppression = 1;
http://sqlfiddle.com/#!9/42dbf6
Thank you for the first suggestion. I ended up going with the below
update in_12768_load1_all set "suppression" = '1' from (select * from "sup1027") as a where a."person_id" = in_12768_load1_all."person_id";
I thought I explained my question pretty well. No clue why someone would down arrow this.
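For readers on a DBMS without UPDATE ... JOIN, the same effect can usually be had with a correlated EXISTS. Below is a minimal sketch using Python's sqlite3 (an assumption, since the question doesn't name the DBMS up front); the sample rows are made up for illustration, and the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id_num INTEGER, person_id INTEGER,
                         phone_number TEXT, suppression INTEGER DEFAULT 0);
    CREATE TABLE table2 (person_id INTEGER, uid TEXT);
    -- made-up sample data
    INSERT INTO table1 (id_num, person_id, phone_number)
        VALUES (1, 100, '555-0100'), (2, 101, '555-0101'), (3, 102, '555-0102');
    INSERT INTO table2 VALUES (100, 'a'), (102, 'b');
""")

with conn:
    # Flag every table1 row whose person_id also appears in table2.
    conn.execute("""
        UPDATE table1
        SET suppression = 1
        WHERE EXISTS (SELECT 1 FROM table2
                      WHERE table2.person_id = table1.person_id)
    """)

rows = conn.execute(
    "SELECT person_id, suppression FROM table1 ORDER BY person_id"
).fetchall()
print(rows)  # [(100, 1), (101, 0), (102, 1)]
```

Rows with no match in table2 keep whatever suppression value they already had.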

Grabbing data from ORACLE DB

I have a database which has Author(id, name), AuthorPublication(aid, pid), Publication(id, title)
Author.id links to AuthorPublication.aid, AuthorPublication.pid links to Publication.id.
I am trying to write a query that returns the names of authors who have co-authored with "amol" or "samuel" BUT NOT BOTH.
So far I have
Select name
From Author, AuthorPublication, Publication
Where Publication.id = PID AND aid = Author.id
In the above code I need to filter to authors whose pid matches author "samuel" or "amol", but not both.
Being new to Oracle DB, I'm not sure how to implement this. Any help?
Thanks in advance!
Logic? Basically, get the id of the two authors, get any pid of their publications, those pids can't be the same....then use the resulting publication ids. There are several ways to do this in 1 query, here's one way using table aliases to use tables more than once in one query:
select auth_sam.name as sams_name
,auth_amol.name as amols_name
,nvl(auth_sam.name, auth_amol.name) as single_author_name
,s_pubinfo.title as sam_publication_titles
,a_pubinfo.title as amol_publication_titles
,nvl(s_pubinfo.title, a_pubinfo.title) as single_pub_name
from author auth_sam
,authorPublication sam_pubs
,publication s_pubinfo -- 3 aliases for samuel data
,author auth_amol
,authorPublication amol_pubs
,publication a_pubinfo -- 3 aliases for amol data
where auth_sam.name = 'samuel'
and auth_sam.id = sam_pubs.aid -- pubs by samuel
and sam_pubs.pid = s_pubinfo.id -- samuel titles
and auth_amol.name = 'amol'
and auth_amol.id = amol_pubs.aid -- pubs by amol
and amol_pubs.pid = a_pubinfo.id -- amol titles
and sam_pubs.pid != amol_pubs.pid -- not the same publication
Because of the !=, the query effectively returns 2 sets of results. Records for 'samuel' will have the 'sams_name' column populated and the 'amols_name' column will be null. Records for 'amol' will have his name column populated and the samuel name column value will be null. Because of these nulls, I included two columns using NVL() to demonstrate a method to choose which author field value and title field value to display. (Not a very "clean" solution, but I think it demonstrates several insights in the power of relational logic and SQL.)
btw - in this example, I really think the SQL is more readable with the Oracle SQL syntax. The ANSI SQL version, with all the JOIN and ON keywords, feels more difficult to read to me.
in your where clause try: and (author.name = 'samuel' and author.name != 'amol') OR (author.name != 'samuel' and author.name = 'amol') (According to https://community.oracle.com/thread/2342467?tstart=0).
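Another way to read "one or the other, but not both" is to count, per author, how many of the two target names show up among their co-authors and keep only those where the count is exactly 1. The sketch below is one possible reading, not a confirmed solution: it runs on Python's sqlite3 rather than Oracle, the sample authors and publications are invented, and it assumes samuel and amol themselves should be left out of the result.

```python
import sqlite3

QUERY = """
SELECT a.name
FROM Author AS a
JOIN AuthorPublication AS mine   ON mine.aid = a.id
JOIN AuthorPublication AS theirs ON theirs.pid = mine.pid
                                AND theirs.aid <> mine.aid   -- a co-author on the same publication
JOIN Author AS co                ON co.id = theirs.aid
WHERE co.name IN ('samuel', 'amol')
  AND a.name NOT IN ('samuel', 'amol')                       -- assumption: exclude the two themselves
GROUP BY a.id, a.name
HAVING COUNT(DISTINCT co.name) = 1                           -- exactly one of the two, not both
ORDER BY a.name
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Publication (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE AuthorPublication (aid INTEGER, pid INTEGER);
    -- invented sample data
    INSERT INTO Author VALUES (1,'samuel'),(2,'amol'),(3,'ann'),(4,'bob'),(5,'cat'),(6,'dan');
    INSERT INTO Publication VALUES (10,'P10'),(11,'P11'),(12,'P12'),(13,'P13'),(14,'P14');
    INSERT INTO AuthorPublication VALUES
        (1,10),(3,10),   -- samuel + ann
        (2,11),(4,11),   -- amol + bob
        (1,12),(5,12),   -- samuel + cat
        (2,13),(5,13),   -- amol + cat
        (3,14),(6,14);   -- ann + dan
""")

names = [row[0] for row in conn.execute(QUERY)]
print(names)  # ['ann', 'bob']
```

Here cat is dropped because she has written with both, and dan is dropped because he has written with neither.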

Combine query results from one table with the defaults from another

This is a dumbed-down version of the real table data, so it may look a bit silly.
Table 1 (users):
id INT
username TEXT
favourite_food TEXT
food_pref_id INT
Table 2 (food_preferences):
id INT
food_type TEXT
The logic is as follows:
Let's say I have this in my food preference table:
1, 'VEGETARIAN'
and this in the users table:
1, 'John', NULL, 1
2, 'Pete', 'Curry', 1
In which case John defaults to be a vegetarian, but Pete should show up as a person who enjoys curry.
Question, is there any way to combine the query into one select statement, so that it would get the default from the preferences table if the favourite_food column is NULL?
I can obviously do this in application logic, but would be nice just to offload this to SQL, if possible.
DB is SQLite3...
You could use COALESCE(X,Y,...) to select the first item that isn't NULL.
If you combine this with an inner join, you should be able to do what you want.
It should go something like this:
SELECT u.id AS id,
u.username AS username,
COALESCE(u.favourite_food, p.food_type) AS favourite_food,
u.food_pref_id AS food_pref_id
FROM users AS u INNER JOIN food_preferences AS p
ON u.food_pref_id = p.id
I don't have a SQLite database handy to test on, however, so the syntax might not be 100% correct, but it's the gist of it.
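For what it's worth, here is a quick check of the COALESCE idea against an in-memory SQLite database via Python's sqlite3 module, using the sample rows from the question. The column types are assumptions based on the table description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE food_preferences (id INTEGER PRIMARY KEY, food_type TEXT);
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT,
                        favourite_food TEXT, food_pref_id INTEGER);
    INSERT INTO food_preferences VALUES (1, 'VEGETARIAN');
    INSERT INTO users VALUES (1, 'John', NULL, 1), (2, 'Pete', 'Curry', 1);
""")

rows = conn.execute("""
    SELECT u.username,
           COALESCE(u.favourite_food, p.food_type) AS favourite_food
    FROM users AS u
    INNER JOIN food_preferences AS p ON u.food_pref_id = p.id
    ORDER BY u.id
""").fetchall()
print(rows)  # [('John', 'VEGETARIAN'), ('Pete', 'Curry')]
```

Note that with an INNER JOIN a user whose food_pref_id is NULL or unmatched would not appear at all; a LEFT JOIN would keep such users, with favourite_food possibly NULL.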

Merging contacts in SQL table without creating duplicate entries

I have a table that holds only two columns - a ListID and PersonID. When a person is merged with another in the system, I want to update all references from the "source" person to be references to the "destination" person.
Ideally, I would like to call something simple like
UPDATE MailingListSubscription
SET PersonID = @DestPerson
WHERE PersonID = @SourcePerson
However, if the destination person already exists in this table with the same ListID as the source person, a duplicate entry will be made. How can I perform this action without creating duplicated entries? (ListID, PersonID is the primary key)
EDIT: Multiple ListIDs are used. If SourcePerson is assigned to ListIDs 1, 2, and 3, and DestinationPerson is assigned to ListIDs 3 and 4, then the end result needs to have four rows - DestinationPerson assigned to ListID 1, 2, 3, and 4.
--out with the bad
DELETE
FROM MailingListSubscription
WHERE PersonId = @SourcePerson
and ListID in (SELECT ListID FROM MailingListSubscription WHERE PersonID = @DestPerson)
--update the rest (good)
UPDATE MailingListSubscription
SET PersonId = @DestPerson
WHERE PersonId = @SourcePerson
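Here is the delete-then-update idea run against the example from the question's EDIT (source on lists 1, 2, 3; destination on lists 3, 4). It is a sketch only: it uses Python's sqlite3 as a stand-in for the real database, the function name `merge_person` and the person IDs 10 and 20 are made up, and both statements are wrapped in one transaction so a failure can't leave the merge half done.

```python
import sqlite3

def merge_person(conn, source, dest):
    """Move all of source's subscriptions to dest without violating the primary key."""
    params = {"src": source, "dst": dest}
    with conn:  # both statements commit or roll back together
        # Out with the bad: source rows that would collide with an existing dest row.
        conn.execute("""
            DELETE FROM MailingListSubscription
            WHERE PersonID = :src
              AND ListID IN (SELECT ListID FROM MailingListSubscription
                             WHERE PersonID = :dst)
        """, params)
        # Update the rest.
        conn.execute("""
            UPDATE MailingListSubscription
            SET PersonID = :dst
            WHERE PersonID = :src
        """, params)

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MailingListSubscription (
        ListID INTEGER, PersonID INTEGER,
        PRIMARY KEY (ListID, PersonID)
    );
    -- source person 10 is on lists 1, 2, 3; destination person 20 is on lists 3, 4
    INSERT INTO MailingListSubscription VALUES (1,10),(2,10),(3,10),(3,20),(4,20);
""")

merge_person(conn, source=10, dest=20)
rows = conn.execute(
    "SELECT ListID, PersonID FROM MailingListSubscription ORDER BY ListID"
).fetchall()
print(rows)  # [(1, 20), (2, 20), (3, 20), (4, 20)]
```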
First you should subscribe DestPerson to all lists that SourcePerson is subscribed to and that DestPerson isn't already subscribed to. Then delete all of SourcePerson's subscriptions.
This will work with multiple ListIDs.
Insert into MailingListSubscription
(
ListID,
PersonID
)
Select
ListID,
@DestPerson
From
MailingListSubscription as t1
Where
PersonID = @SourcePerson and
Not Exists
(
Select *
From MailingListSubscription as t2
Where
PersonID = @DestPerson and
t1.ListID = t2.ListID
)
Delete From MailingListSubscription
Where
PersonID = @SourcePerson
I have to agree with David B here. Remove all the older stuff that shouldn't be there and then do your update.
Actually, I think you should go back and reconsider your database design as you really shouldn't be in circumstances where you're changing the primary key for a record as you're proposing to do - it implies that the PersonID column is not actually a suitable primary key in the first place.
My guess is your PersonID is exposed to your users, they've renumbered their database for some reason and you're syncing the change back in. This is generally a poor idea as it breaks audit trails and temporal consistency. In these circumstances, it's generally better to use your own non-changing primary key - usually an identity - and set up the PersonID that the users see as an attribute of that. It's extra work but will give you additional consistency and robustness in the long run.
A good rule of thumb is the primary key of a record should not be exposed to the users where possible and you should only do so after careful consideration. OK, I confess to breaking this myself on numerous occasions but it's worth striving for where you can :-)