Postgres: Update all columns from another table

I need to update a table from another one, and I need to update all columns. My question is: besides listing every column in the SET clause, is there a way to update them all at once, like this:
update tableA
set * = tableB.*
from tableB where tableA.id = tableB.id
I tried it in psql and it doesn't work; I have to list every column, like this:
update tableA
set c1 = tableB.c1, c2 = tableB.c2, ...
from tableB where tableA.id = tableB.id
tableB is created with create ... like tableA, so the two tables are essentially identical. The reason I'm doing this is that I need to load .csv data into the temp table tableB and then update tableA based on the new data in tableB. tableA needs to be locked as little as possible and needs to keep its integrity. I'm not sure whether 'delete then insert' would be a good option?
Thanks!

You could delete and re-insert, if the two tables have the same columns in the same order. Assuming that all records in tableB match tableA:
delete from tableA
where id in (select id from tableB);
insert into tableA
select *
from tableB;
(If all records do not match, you could use a temporary table to keep the necessary ids.)
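For example, a sketch of that temp-table variant (the name matched_ids is made up; it assumes id uniquely identifies rows in both tables):
begin;
-- keep only the ids that exist in both tables
create temp table matched_ids as
    select a.id
    from tableA a
    join tableB b on b.id = a.id;
delete from tableA
where id in (select id from matched_ids);
insert into tableA
select b.*
from tableB b
where b.id in (select id from matched_ids);
commit;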
Generally, I oppose doing an insert without a column list. In some cases it is tolerable -- such as here, where tableB was created from tableA with create ... like, so the column lists are identical.

This depends on your use case, and it will still cause locks, but of a different type. It's an approach that works well for some scenarios:
BEGIN;
DROP TABLE IF EXISTS tableA_old;
ALTER TABLE tableA RENAME TO tableA_old;
ALTER TABLE tableB RENAME TO tableA;
COMMIT;
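Combined with the CSV load from the question, the whole swap might look like the sketch below (the file path and COPY options are assumptions; server-side COPY needs file access on the server, with psql's \copy as the client-side alternative, and any views or foreign keys pointing at tableA will follow the renamed table rather than the new one):
BEGIN;
CREATE TABLE tableB (LIKE tableA INCLUDING ALL);
COPY tableB FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER);  -- hypothetical path
DROP TABLE IF EXISTS tableA_old;
ALTER TABLE tableA RENAME TO tableA_old;
ALTER TABLE tableB RENAME TO tableA;
COMMIT;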

Related

Updating a column with maximum matches from another table

I have two tables, let's say A & B, and I would like to update the Status column in table A with the maximum matching value from the Scores column in table B, comparing the Topics columns of the two tables.
I am using the script shown below, but it's taking a really long time, so I'd appreciate it if somebody could provide a better, faster alternative:
UPDATE tableA
SET status = (SELECT max(scores)
FROM tableB
WHERE tableB.topics = tableA.topics)
Try creating proper indexes for the columns involved and you should be fine, e.g.:
CREATE INDEX idx_tableb_topics_scores ON tableb (topics,scores);
An alternative to your query is to apply the aggregate function max() in a way that it only has to be executed once, but I doubt it will speed things up:
UPDATE tablea a SET status = j.max_scores
FROM (SELECT t.topics, max(b.scores) AS max_scores
      FROM tablea t
      JOIN tableb b ON t.topics = b.topics
      GROUP BY t.topics) j
WHERE a.topics = j.topics;
For this query:
UPDATE tableA
SET status = (SELECT max(scores)
FROM tableB
WHERE tableB.topics = tableA.topics
);
The only index you need is on tableB(topics, scores).
If you like, you can rewrite this as an aggregation, which looks like this:
UPDATE tableA a
SET status = b.max_scores
FROM (SELECT topics, MAX(scores) AS max_scores
      FROM tableB
      GROUP BY topics
     ) b
WHERE b.topics = a.topics;
Note that this is subtly different from your query. If there are topics in A that are not in B, then this will not update those rows. I do not know if that is desirable.
If many rows in A have the same topic, then pre-aggregating could be significantly faster than the correlated subquery.
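As a sketch, if you want the correlated form but without touching rows that have no match in tableB (matching the pre-aggregated version's behavior), you can add an EXISTS guard:
UPDATE tableA
SET status = (SELECT max(scores)
              FROM tableB
              WHERE tableB.topics = tableA.topics)
WHERE EXISTS (SELECT 1
              FROM tableB
              WHERE tableB.topics = tableA.topics);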

Conditional statement execution in sqlite

New to SQL and Sqlite, I'm afraid. I need to delete a row from table A, but only if that row isn't referenced in table B. So, I guess I need to do something like this:
delete from tableA
where (col1 = 94) and
(select count(*) from tableB (where col2 = 94) = 0);
What's the right way to do this?
Alternatively, I could just do this from C in two steps: first checking that the row isn't referenced, and then deleting it. Would this be better? I'm hesitant to do this because I would need to put an sqlite3_mutex around several steps in the C code, which might be less efficient than executing a single more complex statement. Thanks.
Your method is pretty close. The parentheses are in the wrong place:
delete from tableA
where (col1 = 94) and
(select count(*) from tableB where col2 = 94) = 0;
For instance, SQL doesn't allow a paren before the where.
I would however suggest that you learn about foreign keys. These are a useful construct in SQL. The SQLite documentation is here.
A foreign key constraint would have the database do this check automatically whenever a row is being deleted from tableA. The delete would return an error, but it would make sense -- something like "you can't delete this row because another row references it".
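A minimal sketch of that in SQLite (the table definitions are assumptions, and note that foreign key enforcement is off by default and must be enabled per connection):
PRAGMA foreign_keys = ON;
CREATE TABLE tableA (col1 INTEGER PRIMARY KEY);
CREATE TABLE tableB (col2 INTEGER REFERENCES tableA(col1));
-- this now fails with "FOREIGN KEY constraint failed" if any
-- tableB row still references col1 = 94:
DELETE FROM tableA WHERE col1 = 94;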
You can left join the two tables and delete only the rows where the link could not be established between them (tableB.col2 is null). Note, though, that this multi-table DELETE ... JOIN syntax is MySQL-style; SQLite does not accept joins in a DELETE statement:
delete tableA
from tableA
left join tableB on tableA.col1 = tableB.col2
where tableB.col2 is null
and tableA.col1 = 94
Alternatively, in SQLite you can do
delete from tableA
where col1 = 94
and not exists
(
select 1 from tableB where col2 = 94
)

How do I write an SQL statement (for DB2) to update a table with these conditions?

Here's what I need to do:
For each row in Table1 where Name is not null or blank, and where Table2 has a row with a matching Name, replace another column in Table1 with the contents of a column from Table2, and set the Name in Table1 to null.
I can't seem to wrap my head around getting that logic into SQL.
I don't really care if Table2 has multiple rows with matching Names, just grabbing the first one is good enough.
In DB2 you could use the MERGE statement to do the job:
MERGE INTO Table1 A
USING Table2 B
on A.name=B.name
WHEN MATCHED
THEN UPDATE SET A.another_column=B.Table2_Column
, A.name=NULL;
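Two caveats worth noting: DB2's MERGE raises an error if more than one Table2 row matches the same Table1 row, and the question also wants blank Names skipped. A sketch that handles both by picking an arbitrary row per name (the rn alias and its ordering are made up):
MERGE INTO Table1 A
USING (SELECT name, Table2_Column,
              ROW_NUMBER() OVER (PARTITION BY name ORDER BY name) AS rn
       FROM Table2) B
on A.name=B.name AND B.rn=1 AND A.name<>''
WHEN MATCHED
THEN UPDATE SET A.another_column=B.Table2_Column
, A.name=NULL;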
In T-SQL, you would use an UPDATE statement with join criteria that only affect the matching rows, something like the below:
Update A
set A.another_column=B.Table2_Column
, A.name=NULL
From Table1 A
inner join Table2 B
on A.name=B.name
Where isnull(A.name,'')<>''

Inheritance Problem in SQL Server 2005

I am having a problem whose solution I can't seem to figure out at the moment. Say I have a table
CREATE TABLE #Test
(
SomeValue1 INT,
SomeValue2 INT
)
Now, I have two other tables, say TableA and TableB, in which TableB inherits from TableA by sharing its primary key. I want to insert SomeValue1 into TableA and SomeValue2 into TableB, but in a way such that when you join TableA and TableB by their primary key you get #Test, but I can't figure out how to do it. Can anyone please help me?
SELECT SomeValue1, SomeValue2
FROM TableA A
JOIN TableB B ON A.PrimaryKey = B.PrimaryKey
That sounds like what you're asking for, from what I can tell. Maybe you're also saying you want to create a View on the two tables of existing data?
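If a view is what you're after, a minimal sketch (the view name is made up):
CREATE VIEW vTest AS
SELECT A.SomeValue1, B.SomeValue2
FROM TableA A
JOIN TableB B ON A.PrimaryKey = B.PrimaryKey;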
If TableB inherits, I assume TableB has a foreign key to an IDENTITY column in TableA:
INSERT TableA (SomeValue) VALUES (SomeValue1)
INSERT TableB (PrimaryKey, SomeValue) VALUES (SCOPE_IDENTITY(), SomeValue2)
Otherwise, you can adapt the OUTPUT clause from your other recent questions: Output to Temporary Table in SQL Server 2005
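If you need the set-based version, a sketch of that OUTPUT approach (the column names PrimaryKey and SomeValue are assumptions, and joining back on SomeValue1 assumes its values are unique in #Test):
DECLARE @keys TABLE (PrimaryKey INT, SomeValue1 INT);
-- insert the first column into TableA and capture the generated keys
INSERT TableA (SomeValue)
OUTPUT inserted.PrimaryKey, inserted.SomeValue INTO @keys
SELECT SomeValue1 FROM #Test;
-- pair each captured key with its SomeValue2 and insert into TableB
INSERT TableB (PrimaryKey, SomeValue)
SELECT k.PrimaryKey, t.SomeValue2
FROM @keys k
JOIN #Test t ON t.SomeValue1 = k.SomeValue1;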
Note:
How does this relate to Iterating Through Table in High-Performance Code? What is your actual end-to-end requirement? You are asking very specific questions that probably don't really help with the underlying problem.

What is the best way to delete millions of records in TSQL?

I have the following table structure:
Table1    Table2    Table3
------    ------    ------
sId       sId       sId
name      x         y
x1        x2        x3
I want to remove all records from Table1 that do not have a matching record in Table3 based on sId, and if the sId is present in Table2, then do not delete the record from Table1. There are about 20, 15 and 10 million records in Table1, Table2 & Table3 respectively.
I have done something like this:
DELETE TOP (3000000) A
FROM Table1 A
LEFT JOIN Table2 B
    ON B.sId = A.sId
LEFT JOIN Table3 C
    ON C.sId = A.sId
WHERE A.Name = 'XYZ'
  AND B.sId IS NULL
  AND C.sId IS NULL
(I have added an index on sId but not on Name.)
But this takes a long time to remove the records.
Is there any better way to delete millions records?
Thanks in advance.
Do it in batches of 5000 or 10000 if you need to delete less than about 40% of the data. If you need to delete more than that, dump what you want to keep into another table (or bcp out), truncate this table, and then insert the kept rows back (or bcp in):
SELECT 1  -- prime @@ROWCOUNT so the loop body runs at least once
WHILE @@ROWCOUNT > 0
BEGIN
    DELETE TOP (5000) A
    FROM Table1 A
    LEFT JOIN Table2 B
        ON B.sId = A.sId
    LEFT JOIN Table3 C
        ON C.sId = A.sId
    WHERE A.Name = 'XYZ'
      AND B.sId IS NULL
      AND C.sId IS NULL
END
A small example you can run to see what happens:
CREATE TABLE #test(id INT)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
INSERT #test VALUES(1)
WHILE @@ROWCOUNT > 0
BEGIN
DELETE TOP (2) FROM #test
END
One way to remove millions of records is to select the remaining records into new tables, then drop the old tables and rename the new ones. Depending on the foreign keys, you can either drop and recreate them, or truncate the data in the old tables and copy the selected data back.
If you need to delete just a few records, disregard this answer; this is for when you actually want to DELETE millions of records.
One other method is to insert the data that you want to keep into another table, say Table1_good.
Once that is completed and verified:
drop Table1, then rename Table1_good to Table1.
It's a dirty way to do it, but it works.
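A sketch of that swap for this question's rule (keep rows that match Table3 or appear in Table2; it assumes no foreign keys reference Table1):
-- keep only the rows that still qualify
SELECT a.*
INTO Table1_good
FROM Table1 a
WHERE EXISTS (SELECT 1 FROM Table3 c WHERE c.sId = a.sId)
   OR EXISTS (SELECT 1 FROM Table2 b WHERE b.sId = a.sId);
DROP TABLE Table1;
EXEC sp_rename 'Table1_good', 'Table1';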
Using the top clause is more for improving concurrency and may actually make the code run slower.
One suggestion is to delete the data from a derived table:
http://sqlblogcasts.com/blogs/simons/archive/2009/05/22/DELETE-TOP-x-rows-avoiding-a-table-scan.aspx
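As a sketch, the derived-table idea adapted to this question might look like the following (the CTE name and batch size are made up):
WITH doomed AS
(
    SELECT TOP (5000) a.sId
    FROM Table1 a
    WHERE a.Name = 'XYZ'
      AND NOT EXISTS (SELECT 1 FROM Table3 c WHERE c.sId = a.sId)
      AND NOT EXISTS (SELECT 1 FROM Table2 b WHERE b.sId = a.sId)
)
DELETE FROM doomed;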
Have you set up appropriate indexes on the relevant table fields? If not, it could take a long time to delete the records.
The DELETE operation you're performing is running an underlying SELECT statement to find the records that will be deleted. The operation you're doing is fundamentally a simple join. If you optimize that join, the final DELETE will be faster, too.
Make sure you have indexes on the columns on which you're doing the joins. Run an execution plan to make sure they are being used.
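For example, supporting indexes for this particular delete might look like this (the index names and the composite choice on Table1 are assumptions):
CREATE INDEX IX_Table2_sId ON Table2 (sId);
CREATE INDEX IX_Table3_sId ON Table3 (sId);
CREATE INDEX IX_Table1_Name_sId ON Table1 (Name, sId);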
Once you have cleaned up the data, I would put an AFTER DELETE trigger on table3 that automatically deleted the applicable records from table1. This way you keep the data cleaned up in real time and never have to delete huge chunks.
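A sketch of such a trigger (the trigger name is made up, and it assumes the same keep rule as the question, i.e. rows whose sId is still in Table2 survive):
CREATE TRIGGER trg_Table3_AfterDelete ON Table3
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- remove Table1 rows whose sId just lost its last Table3 match,
    -- unless the sId is still present in Table2
    DELETE t1
    FROM Table1 t1
    JOIN deleted d ON d.sId = t1.sId
    WHERE NOT EXISTS (SELECT 1 FROM Table3 c WHERE c.sId = t1.sId)
      AND NOT EXISTS (SELECT 1 FROM Table2 b WHERE b.sId = t1.sId);
END;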
I'd create a temp table, populate it with a select, add indexes to the temp table, and then delete from the table I want to remove records from. I'd drop the temp table when I'm done. Something like this:
SELECT * INTO #temp FROM mytable
WHERE blah blah -- or your query
-- add constraints if you want; I would just shove the primary key into the temp table
Then I would say:
DELETE FROM mytable
WHERE myPrimaryKey IN (SELECT myPrimaryKey FROM #temp)