I basically have a table A with 30 million records and I want to add a column entitled "TYPE" to the table. I have a look up table B that maps a code to a color. I want to iterate through table A and compare the code in TABLE A to the code in TABLE B and then add the color to the TYPE column in table A. Is this possible? What would be the best approach to this problem? The codes in table B don't match perfectly with the actual codes in table A.
hard to say without seeing the schema or knowing the DBMS but, if it's always a the first 2 digits of the code used to look up the color, why not
UPDATE table_a SET type = SUBSTR(code, 2)
and do a JOIN normally
you could do a join like
JOIN table_b ON table_b.id = SUBSTR(table_a.code,2)
but that would hardly be performant.
Give or take issues with size of transaction, isn't it a simple ALTER TABLE to add the column and an UPDATE to fix it?
ALTER TABLE TableA ADD (Type VARCHAR(10));
UPDATE TableA
SET Type = ((SELECT Colour FROM TableB WHERE TableA.Type = TableB.Type));
The only tricky bit might be the double parentheses; they're needed in some DBMS and may not be needed in others (single parentheses may be sufficient). You may also be able to use a UPDATE with a JOIN; the syntax isn't entirely standard there, either.
Note that this mapping relies on a change of NULL to NULL being a no-op. If you have values in some rows but not all of those rows match an entry in TableB, then you need to be careful with an full-table UPDATE like that. It will set the Type for any rows in TableA for which there is no match in TableB to NULL. If that wasn't what you needed, you'd either use an UPDATE with JOIN or you'd write:
UPDATE TableA
SET Type = ((SELECT Colour FROM TableB WHERE TableA.Type = TableB.Type));
WHERE Type IN (SELECT Colour FROM TableB WHERE TableA.Type = TableB.Type);
In the current case, where the column is newly added and therefore contains NULL anyway, there is no harm done omitting the WHERE clause on the UPDATE itself.
You can do the update just by looking at the first characters of the code in table B, as in the following statement:
update table_a
set table_a.type = table_b.type
from table_b
where table_b.code = substr(table_a.code, 1, length(table_b.code))
This may be a bit slow, since you cannot use any indexes to speed it up. However, if table_b is small, then the performance may be acceptable.
Related
I am attempting to optimise a bulk UPDATE statement in Postgres using the UPDATE..FROM syntax to update from a list of values. It works except when the same row might be updated more than once in the same query.
For example say I have a table "test" with columns "key" and "value".
update test as t set value = v.value from (values
('key1', 'update1'),
('key1', 'update2') )
as v (key, value)
where t.key = v.key;
My desired behavior is for the row with key 'key1' to be updated twice, finishing with value set to 'update2'. In practice sometimes the value is updated to update1 and sometimes to update2. Also an update trigger function on the table is only invoked once.
The documentation (http://www.postgresql.org/docs/9.1/static/sql-update.html) explains why:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Because of this indeterminacy, referencing other tables only within sub-selects is safer, though often harder to read and slower than using a join.
Is there any way to reformulate this query to achieve the behavior I'm looking for? Does the reference to sub-selects in the documentation give a hint?
Example (assuming id is a PK in the target table, and {id, date_modified} is a PK in the source table)
UPDATE target dst
Set a = src.a , b = src.b
FROM source src
WHERE src.id = dst.id
AND NOT EXISTS (
SELECT *
FROM source nx
WHERE nx.id = src.id
-- use an extra key field AS tie-breaker
AND nx.date_modified > src.date_modified
);
(in fact, this is deduplication of the source table -> forcing the source table to the same PK as the target table)
I've added a new column(NewValue) to my table which holds an int and allows nulls. Now I want to update the column but my insert statement only attempts to update the first column in the table not the one I specified.
I basically start with a temp table that I put my initial data into and it has two columns like this:
create table #tempTable
(
OldValue int,
NewValue int
)
I then do an insert into that table and based on the information NewValue can be null.
Example data in #tempTable:
OldValue NewValue
-------- --------
34556 8765432
34557 7654321
34558 null
Once that's complete I planned to insert NewValue into the primary table like so:
insert into myPrimaryTable(NewValue)
select tt.NewValue from #tempTable tt
left join myPrimaryTable mpt on mpt.Id = tt.OldValue
where tt.NewValue is not null
I only want the NewValue to insert into rows in myPrimaryTable where the Id matches the OldValue. However when I try to execute this code I get the following error:
Cannot insert the value NULL into column 'myCode', table 'myPrimaryTable'; column does not allow nulls. INSERT fails.
But I'm not trying to insert into 'myCode', I specified 'NewValue' as the column but it doesn't seem to see it. I've checked NewValue and it is set to allow int and is set to allow null and it does exist on the right table in the right database. The column 'myCode' is actually the second column in the table. Could someone please point me in the right direction with this error?
Thanks in advance.
INSERT always creates new rows, it never modifies existing rows. If you skip specifying a value for a column in an INSERT and that column has no DEFAULT bound to it and is not identity, that column will be NULL in the new row--thus your error. I believe you might be looking for an UPDATE instead of an INSERT.
Here's a potential query that might work for you:
UPDATE mpt
SET
mpt.NewValue = tt.NewValue
FROM
myPrimaryTable mpt
INNER JOIN #tempTable tt
ON mpt.Id = tt.OldValue -- really?
WHERE
tt.NewValue IS NOT NULL;
Note that I changed it to an INNER JOIN. A LEFT JOIN is clearly incorrect since you are filtering #tempTable for only rows with values, and don't want to update mpt where there is no match to tt--so LEFT JOIN expresses the wrong logical join type.
I put "really?" as a comment on the ON clause since I was wondering if OldValue is really an Id. It probably is--you know your table best. It just raised a mild red flag in my mind to see an Id column being compared to a column that does not have Id in its name (so if it is correct, I would suggest OldId as a better column choice than OldValue).
Also, I recommend that you never name a column just Id again--column names should be the same in every table in the database. Also, when it comes join time you will be more likely to make mistakes when your columns from different tables can coincide. It is much better to follow the format of SomethingId in the Something table, instead of just Id. Correspondingly, the suggested old column name would be OldSomethingId.
I am having a table which has about 17 fields. I need to perform frequent updates in this table. But the issue is each time I may be updating only a few fields. Whats the best way to write a query for updating in such a scenario? I am looking for an option in which the value gets updated only if it is not null.
For example I have four fields in database Say A,B,C,D.
User updates the value of say D. All other values remains the same. So I want an update query which updates only the value of D keeping the others unaltered.
SO if i put a,b and c as null and d with the value supplied by user I want to write an update query which only updates the value of d as a,b and c is null.
Is it something achievable?
I am using SQLite database.
Could someone please throw some light into it?
Without knowing your database it's tough to be specific. In SQL Server the syntax would be something like ...
UPDATE MyTable
SET
Field1 = IsNull(#Field1, Field1),
Field2 = IsNull(#Field2, Field2),
Field3 = IsNull(#Field3, Field3)
WHERE
<your criteria here>
EDIT
Since you specified SQLLite ...replace my IsNull function with COALESCE() or alternately look at the IfNull function.
Posting a SQL Server solution with 2 tables for posterity. Query joins two tables and updates the values that are present. Otherwise original value is maintained.
tables = table1, table2 each having field1 and field2
update t1 WITH (ROWLOCK)
set T1.Field2 = ISNULL(T2.Field2,T1.Field2)
from Table1 T1 Join Table2 T2
ON T1.Field1 = T2.Field1
I have to write a statement which fills a table (customers) with synthetically generated values. There is an addtional constraint that I should only fill those attributes (columns) with a special property (i.e. formally do a projection on them and then operate on them exclusively). These properties are stored in a second table, attributes.
My first draft consists of the following two statements:
-- Get the attributes (columns) we are interested in only
SELECT attributeID from attributes
WHERE tableID = 'customers'
-- Iterate over each row of customers, filling only those attributes (columns)
-- obtained by the above SELECT statement
UPDATE customers
SET (use the records from above select statement...)
Now my problem is how to put them together. I know there is the possibility of appending a WHERE clause to the SET clause, but that would select rows, not columns, as I need. I also read about PIVOT, but so far only inside one single table, not two, as is the case here. I would be very thankful for any hint, since I have no idea how to do this.
is not it you're looking for?
SQL Update Multiple Fields FROM via a SELECT Statement
UPDATE
Table
SET
Table.col1 = other_table.col1,
Table.col2 = other_table.col2
FROM
Table
INNER JOIN
other_table
ON
Table.id = other_table.id
Standard SQL-92 requires a scalar subquery:
UPDATE customers
SET attributeID = (
SELECT A1.attributeID
FROM attributes AS A1
WHERE A1.tableID = 'customers'
);
However, UPDATE customers...WHERE A1.tableID = 'customers' "smells" like you may be mixing data with metadata.
If only one row with a distinct field is to be updated,I can use:
insert into tab(..) value(..) on duplicate key update ...
But now it's not the case,I need to update 4 rows inside the same table,which have its field "accountId" equal to $_SESSION['accountId'].
What I can get out of my mind at the moment is:
delete from tab where accountId = $_SESSION['accountId'],
then insert the new rows.
Which obviously is not the best solution.
Has someone a better idea about this?
Use the update just like that!
update tab set col1 = 'value' where accountId = $_SESSION['accountId']
Moreover, MySQL allows you to do an update with a join, if that makes your life a bit easier:
update
tab t
inner join accounts a on
t.accountid = a.accountid
set
t.col1 = 'value'
where
a.accountname = 'Tom'
Based on your question, it seems like you should review the Update Statement.
Insert is used to put new rows in - not update them. Delete is used to remove. And Update is used to modify existing rows. Using "Insert On Duplicate Key Update" is a hackish way to modify rows, and is poor form to use when you know the row is already there.
load all of the values in to a temporary table.
UPDATE all of the values using a JOIN.
INSERT all of the values from the temp table that don't exist in the target table.
You can use replace statement. This will work as a DELETE followed by INSERT