Update if different/changed - sql

Is it possible to perform an UPDATE statement in SQL, but only update if the new values are different?
For example, if in the database col1 = 'hello', then
update table1 set col1 = 'hello'
should not perform any kind of update. However,
update table1 set col1 = 'bye'
should perform an update.

During query compilation and execution, SQL Server does not take the time to figure out whether an UPDATE statement will actually change any values or not. It just performs the writes as expected, even if unnecessary.
In a scenario like
update table1 set col1 = 'hello'
you might think SQL Server won't do anything, but it will: it performs all of the writes necessary as if you'd actually changed the value. This occurs for both the physical table (or clustered index) and any non-clustered indexes defined on that column, causing writes to the physical tables/indexes, recalculation of indexes, and transaction log writes. When working with large data sets, there are huge performance benefits to updating only the rows that will actually change.
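One easy way to see this for yourself is @@ROWCOUNT, which reports how many rows the UPDATE touched; the count is the same whether or not the values actually differ. A minimal sketch (the table and data are made up for illustration):
create table table1 (col1 varchar(20));
insert into table1 (col1) values ('hello'), ('hello'), ('bye');

update table1 set col1 = 'hello';
select @@ROWCOUNT; -- 3: every row was written, even the two already set to 'hello'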
If we want to avoid the overhead of these unnecessary writes, we have to devise a way to check whether an update is needed at all. One way would be to add something like WHERE col1 <> 'hello':
update table1 set col1 = 'hello' where col1 <> 'hello'
But this would not perform well in some cases, for example if you were updating multiple columns in a table with many rows and only a small subset of those rows would actually change. This is because you then need to filter on all of those columns, non-equality predicates are generally not able to use index seeks, and you still incur the overhead of the table/index writes and transaction log entries mentioned above.
But there is a much better alternative using a combination of an EXISTS clause with an EXCEPT clause. The idea is to compare the values in the destination row to the values in the matching source row to determine if an update is actually needed. Look at the modified query below and examine the additional query filter starting with EXISTS. Note how inside the EXISTS clause the SELECT statements have no FROM clause. That part is particularly important because this only adds on an additional constant scan and a filter operation in the query plan (the cost of both is trivial). So what you end up with is a very lightweight method for determining if an UPDATE is even needed in the first place, avoiding unnecessary write overhead.
update table1 set col1 = 'hello'
/* AVOID NET ZERO CHANGES */
where exists
(
    /* DESTINATION */
    select table1.col1
    except
    /* SOURCE */
    select col1 = 'hello'
)
This looks overly complicated compared with checking for updates in a simple WHERE clause for the simple scenario in the original question, where you are updating one value for all rows with a literal. However, this technique works very well if you are updating multiple columns in a table, the source of your update is another query, and you want to minimize writes and transaction log entries. It also performs better than testing every field with <>.
A more complete example might be
update table1
set col1 = 'hello',
    col2 = 'hello',
    col3 = 'hello'
/* Only update rows for CustomerId 100, 101, 102 & 103 */
where table1.CustomerId IN (100, 101, 102, 103)
/* AVOID NET ZERO CHANGES */
and exists
(
    /* DESTINATION */
    select table1.col1,
           table1.col2,
           table1.col3
    except
    /* SOURCE */
    select z.col1,
           z.col2,
           z.col3
    from #anytemptableorsubquery z
    where z.CustomerId = table1.CustomerId
)

The idea is to not perform any update if the new value is the same as the one in the DB right now:
WHERE col1 != @newValue
(obviously there should also be some Id field to identify the row):
WHERE Id = @Id AND col1 != @newValue
PS: Originally you wanted to update only if the value is 'bye', so you could just add AND col1 = 'bye', but that seems redundant to me, I just suppose.
PS 2: (From a comment) Also note, this won't update the value if col1 is NULL, so if NULL is a possibility, make it WHERE Id = @Id AND (col1 != @newValue OR col1 IS NULL).
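Putting the pieces together, a minimal sketch in T-SQL (assuming an Id key column and @Id/@newValue as parameters, as above):
update table1
set col1 = @newValue
where Id = @Id
  and (col1 != @newValue or col1 is null);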

If you want to change the field to 'hello' only if it is 'bye', use this:
UPDATE table1
SET col1 = 'hello'
WHERE col1 = 'bye'
If you want to update only if it is different from 'hello', use:
UPDATE table1
SET col1 = 'hello'
WHERE col1 <> 'hello'
Is there a reason for this strange approach? As Daniel commented, there is no special gain - except perhaps if you have thousands of rows with col1='hello'. Is that the case?

This is possible with a before-update trigger.
In this trigger you can compare the old with the new values and cancel the update if they don't differ. But this will then lead to an error on the caller's side.
I don't know, why you want to do this, but here are several possibilities:
Performance: There is no performance gain here, because the update would not only need to find the correct row but additionally compare the data.
Trigger: If you want the trigger to fire only when there was a real change, you need to implement it so that it compares all old values to the new values before doing anything.
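For illustration, a minimal sketch of such a change-detecting trigger in SQL Server (the table and column names are assumptions; the inserted and deleted pseudo-tables hold the new and old row versions):
create trigger tr_table1_changed on table1
after update
as
begin
    /* run the real logic only when col1 actually changed;
       EXCEPT also compares NULLs correctly */
    if exists (select col1 from inserted except select col1 from deleted)
    begin
        print 'col1 really changed'; -- real work goes here
    end
end;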

CREATE OR REPLACE PROCEDURE stackoverflow([your_value] IN TYPE) AS
BEGIN
    UPDATE [your_table] t
    SET t.[your_column] = [your_value]
    WHERE t.[your_column] != [your_value];
    COMMIT;
EXCEPTION
    [YOUR_EXCEPTION];
END stackoverflow;

You need a unique key id in your table (let's suppose its value is 1) to do something like:
UPDATE table1 SET col1='hello' WHERE id=1 AND col1!='hello'

Old question but none of the answers correctly address null values.
Using <> or != will get you into trouble when comparing values for differences if there is a potential NULL in the new or old value. To safely update only when the value has actually changed, use the IS DISTINCT FROM operator in PostgreSQL.
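A minimal sketch in PostgreSQL, reusing the names from the original question:
UPDATE table1
SET col1 = 'hello'
WHERE col1 IS DISTINCT FROM 'hello';
This updates every row whose col1 differs from 'hello', including rows where col1 is NULL, with no extra OR col1 IS NULL clause needed.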

I think this should do the trick for ya...
create trigger [trigger_name] on [table_name]
for insert
as
    declare @new_val datatype, @id int;
    -- note: this assumes a single-row insert; "inserted" can hold many rows
    select @new_val = i.column_name from inserted i;
    select @id = i.Id from inserted i;
    update table_name
    set column_name = @new_val
    where table_name.Id = @id and column_name != @new_val;

Update/Create table from select with changes in the column values in Oracle 11g (speed up)

At work we have an update script for an Oracle 11g database that takes around 20 hours, and some of the most demanding queries are updates where we change some values, something like:
UPDATE table1 SET
column1 = DECODE(table1.column1,null,null,'no info','no info','default value'),
column2 = DECODE(table1.column2,null,null,'no info','no info','another default value'),
column3 = 'default value';
We have many updates like this. The problem is that the tables have around 10 million rows. We also have some updates where columns are going to get a default value but are nullable (I know that if they had NOT NULL and DEFAULT constraints, adding such columns would be almost immediate because the values are stored in the catalog), so updating or adding such columns costs a lot of time.
My approach is to recreate the table (as Tom said in https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:6407993912330 ). But I have no idea how to retrieve some columns from the original table unchanged while changing others to a default value (where before the update the column held sensitive info), because we need to keep some info private.
So, my approach is something like this:
CREATE TABLE table1_tmp PARALLEL NOLOGGING
AS (select col1,col2,col3,col4 from table1);
ALTER TABLE table1_tmp ADD (col5 VARCHAR(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col6 VARCHAR(10) DEFAULT 'some info' NOT NULL);
ALTER TABLE table1_tmp ADD (col7 VARCHAR(10));
ALTER TABLE table1_tmp ADD (col8 VARCHAR(10));
MERGE INTO table1_tmp tt
USING table1 t
ON (t.col1 = tt.col1)
WHEN MATCHED THEN
UPDATE SET
    tt.col7 = 'some default value that may be null',
    tt.col8 = 'some other value that may be null';
I have also tried creating the nullable columns as NOT NULL to make it fast, and it worked; the problem is that when I change the columns back to nullable, that operation takes too much time. The code above also ended up consuming a great amount of time (more than one hour in the MERGE).
I hope someone has an idea on how to improve performance for things like this.
Thanks in advance!
Maybe you can try using NVL in the MERGE join condition:
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-3'))
WHEN MATCHED THEN ....
If you don't want to update NULL values, you can also do it like this:
MERGE INTO table1_tmp tt
USING table1 t
ON (nvl(t.col1,'-3') = nvl(tt.col1,'-2'))
WHEN MATCHED THEN .....
In the end, I finished by creating a temp table with the data from the original table, and while doing the CREATE I filled in the default values, the DECODEs, and everything else (for example, if I wanted to set something to NULL, I did the cast there). Something like:
CREATE TABLE table1_tmp (
    column1 DEFAULT 'default message', -- forced to a default value
    column2,                           -- unchanged
    column3                            -- takes its value from the DECODE below
) AS SELECT
    'default message' column1,
    column2,
    DECODE(column3, 'Something', NULL, 'A', 'B') column3
FROM table1;
That is how I solved the problem. Copying a 23-million-row table took about 3 to 5 minutes, while updating it used to take hours. Now I just need to set privileges, constraints, indexes, and comments, and that's it; that stuff takes only seconds.
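For completeness, the remaining steps might look roughly like this (a sketch only; the object names, privileges, and indexes here are assumptions, not the actual script):
ALTER TABLE table1 RENAME TO table1_old;
ALTER TABLE table1_tmp RENAME TO table1;
CREATE INDEX idx_table1_column2 ON table1 (column2);
GRANT SELECT ON table1 TO some_role;
COMMENT ON COLUMN table1.column1 IS 'forced to a default value';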
Thanks for the answer @thehazal: I could not check your approach, but it sounds interesting.

UPDATE/INSERT statement within an IF statement within a LOOP using sequence

CURSOR text IS
    SELECT *
    FROM DATA
CURSOR MATCHES IS
    SELECT NAME
    FROM DATA
    INTERSECT
    SELECT DESCRIPTION
    FROM my_table
BEGIN
    FOR i IN text
        OPEN MATCHES
        FETCH MATCHES INTO MATCH
        CLOSE MATCH
        IF i IN MATCH
        THEN
            UPDATE my_table
            SET col1 = correlating_new_column1, col2 = correlating_new_column2, col3 = correlating_new_column3
            WHERE table_im_trying_to_populate.code = my_seq.curval
        ELSE
            INSERT INTO TABLE_IM_TRYING_TO_POPULATE(CODE, NAME, DESCRIPTION, col1, col2, col3)
            VALUES(my_seq.nextval, other_name, other_description, correlating_new_column1, correlating_new_column2, correlating_new_column3)
        END IF;
END LOOP;
Basically I am trying to take an explicit cursor I made that is a SELECT statement on a table, loop over it row by row, and put the rows into my other existing table. If it comes across a name already in the existing table, it updates some of the columns; otherwise it inserts the whole record. I am attempting to use a sequence to update the 'code' column so that it updates where the code from the other existing table = my_seq.currval, and for the insert it just goes to the next value. I know this is complicated, but I'm really just trying to see if I have the setup correct. I just started using SQL Developer for Oracle not too long ago.
There are a huge number of problems with your code. However, it looks like what you're trying to do can be achieved with a single MERGE statement, along the lines of:
merge into table_im_trying_to_populate tgt
using data_table src
on (tgt.name = src.other_name
and tgt.description = src.other_description)
when matched then
update set tgt.col1 = src.correlating_new_column1,
tgt.col2 = src.correlating_new_column2,
tgt.col3 = src.correlating_new_column3
when not matched then
insert (tgt.code, tgt.name, tgt.description, tgt.col1, tgt.col2, tgt.col3)
values (my_seq.nextval, src.other_name, src.other_description, src.correlating_new_column1, src.correlating_new_column2, src.correlating_new_column3);
This assumes that the combination of the other_name and other_description columns in the data table is unique. I've also had to guess at what the join condition should be, since the join condition you had in your example update statement (table_im_trying_to_populate.code = my_seq.currval) didn't make any sense: you don't use currval to join against as a general rule, since it isn't populated unless you've previously pulled a value from the sequence in the same session.
If this doesn't match what you're trying to do, please update your question with some sample data in both tables and the expected output, and we should be able to help you further.

Database trigger

I have no experience in writing database trigger but I need one in my current project.
My use case is the following. I have two tables - Table 1 and Table 2.
These tables have a 1 : m relation.
My use case is: if all records in Table1 that belong to a row of Table2 have VALUE2, then the value in Table2 should be updated to VALUE2.
So if the record with ID 3 in Table1 is updated to VALUE2, then the value in Table2 should also be updated to VALUE2.
It would be great if someone could help me. Thanks a lot!
TABLE1:
ID   FK_Table2   VALUE
-----------------------------
1    77          VALUE2
2    77          VALUE2
3    77          VALUE1
4    54          OTHERVALUE

TABLE2:
ID   VALUE
---------------
77   VALUE1
So you need to learn and try a basic trigger first.
CREATE OR REPLACE TRIGGER trigger_name
AFTER UPDATE ON TABLE1
FOR EACH ROW
BEGIN
    /* trigger code goes here... */
    /* for this particular case you need to update the value of table2 */
    UPDATE TABLE2 SET VALUE = :new.VALUE WHERE TABLE2.ID = :new.FK_Table2;
END;
Try to write some code. If you get stuck... come back and let us know...
No matter which system, there are some basic rules or best practices you should know. One is that it is bad form (and outright prohibited in many systems) for a trigger to reach back out and query the very table the trigger is written for. Your use case requires the trigger on Table1 to go back out and read from Table1 during the Update operation. Not good.
One available option is to use a stored procedure to handle all the updates to this table. They are more awkward to work with (for example: if a parameter is NULL, does that mean put a NULL in the corresponding field or leave it unmodified?). For that reason, and with the understanding that this is based on the limited amount of information in the question, I would recommend one of two alternatives.
One is to have a stored procedure that is used only to change the VALUE field. That field is not changed in a vacuum, but as part of a larger process. The step in the process that actually ends up changing the field could then call the SP.
Another is to front the table with a view with an "instead of" trigger and perform all DML through the view. This is the method I prefer, at least on those systems that allow triggers on views. The view trigger may query the underlying table as needed.
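As a rough illustration of the view-fronting approach in SQL Server (all names hypothetical; the INSTEAD OF trigger applies the DML to the base table and may then query it freely):
create view v_table1
as
select id, FK_Table2, value from table1;
go
create trigger tr_v_table1_update on v_table1
instead of update
as
begin
    -- apply the update to the base table
    update t
    set t.value = i.value
    from table1 t
    join inserted i on i.id = t.id;
    -- table1 can now be queried here to propagate changes to table2
end;
go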
As for the logic (SP or trigger) here is some pseudo code:
-- Make the update
update table1 set value = @somevalue
where id = @someid;

-- Get the group that id is in
select FK_Table2 into @somegroupid
from Table1
where id = @someid;

-- Are all the values in that group the same?
select count(*) into @OtherValues
from Table1
where FK_Table2 = @somegroupid
and value <> @somevalue;

-- If so, notify the other table.
if @OtherValues = 0 then
    update table2 set value = @somevalue
    where id = @somegroupid;
I hope this answers your immediate question. However, based on what you have shown us here, the major cause of the problem would seem to be poor design. Let us know the higher level requirement you are trying to fill here and I'll bet we could come up with some modeling changes that would make this a whole lot easier without having to get really clever with SPs or triggers.

CASE vs Multiple UPDATE queries for large data sets - Performance

For performance what option would be better for large data sets that are to be updated?
Using a CASE statement or Individual update queries?
CASE Example:
UPDATE tbl_name SET field_name =
    CASE
        WHEN condition_1 THEN 'Blah'
        WHEN condition_2 THEN 'Foo'
        WHEN condition_x THEN 123
        ELSE 'bar'
    END
Individual Query Example:
UPDATE tbl_name SET field_name = 'Blah' WHERE field_name = condition_1
UPDATE tbl_name SET field_name = 'Foo' WHERE field_name = condition_2
UPDATE tbl_name SET field_name = 123 WHERE field_name = condition_x
UPDATE tbl_name SET field_name = 'bar' WHERE field_name = condition_y
NOTE: About 300,000 records are going to be updated, and the CASE statement would have about 10,000 WHEN conditions. Using individual queries, it would be about 10,000 statements as well.
The CASE version.
This is because there is a good chance you are altering the same row more than once with the individual statements. If row 10 has both condition_1 and condition_y then it will need to get read and altered twice. If you have a clustered index this means two clustered index updates on top of whatever the other field(s) that were modified were.
If you can do it as a single statement, each row will be read only once and it should run much quicker.
I changed a similar process about a year ago that used dozens of UPDATE statements in sequence to use a single UPDATE with CASE, and processing time dropped about 80%.
It seems logical to me that with the first option SQL Server will go through the table only once, evaluating the conditions for each row.
With the second, it will have to go through the whole table 4 times.
So, for a table with 1000 rows, the first option means about 1000 evaluations in the best case and 3000 in the worst.
With the second we'll always have 4000 evaluations.
So option 1 should be faster.
As pointed out by Mitch, try making a temp table and filling it with all the data you need; make a separate temp table for each column (field) you want to change. You should also add an index to the temp table(s) for a further performance improvement.
This way your update statement becomes (more or less):
UPDATE tbl_name SET field_name = COALESCE((SELECT value FROM temp_tbl WHERE tbl_name.conditional_field = temp_tbl.condition_value), field_name),
field_name2 = COALESCE((SELECT value FROM temp_tbl2 WHERE tbl_name.conditional_field2 = temp_tbl2.condition_value), field_name2)
and so on..
This should give you good performance while scaling up for large volumes of updates at once.
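A minimal sketch of a supporting temp table (the names and sizes are assumptions, chosen to match the UPDATE above):
CREATE TABLE temp_tbl (
    condition_value VARCHAR(50) NOT NULL,
    value VARCHAR(50) NOT NULL
);
-- the index lets the correlated subquery seek instead of scan
CREATE INDEX ix_temp_tbl ON temp_tbl (condition_value);

INSERT INTO temp_tbl (condition_value, value)
VALUES ('condition_1', 'Blah'),
       ('condition_2', 'Foo');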

select the rows affected by an update

If I have a table with these fields:
int:id_account
int:session
string:password
Now for a login statement I run this SQL UPDATE command:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple'
Then I check if a row was affected, and if one indeed was affected I know that the password was correct.
Next, what I want to do is retrieve all the info of this affected row so I'll have the rest of the fields' values.
I could use a simple SELECT statement, but I'm sure I'm missing something here; there must be a neater way that you gurus know and are going to tell me about (:
Besides, this has bothered me since the first login SQL statement I ever wrote.
Is there any performance-wise way to combine a SELECT into an UPDATE, so it runs only if the UPDATE did update a row?
Or am I better off leaving it simple with two statements? Atomicity isn't needed, so I'd probably better stay away from table locks, for example, no?
You should use the same WHERE statement for SELECT. It will return the modified rows, because your UPDATE did not change any columns used for lookup:
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple';
SELECT *
FROM tbl_name
WHERE id_account = 17 AND password = 'apple';
A word of advice: never store passwords as plain text! Use a hash function, like this:
MD5('apple')
There is ROW_COUNT() (do read the details in the docs).
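In MySQL, for example, that might look like this (a minimal sketch reusing the statement from the question):
UPDATE tbl_name
SET session = session + 1
WHERE id_account = 17 AND password = 'apple';

SELECT ROW_COUNT(); -- 1 if the credentials matched and the row changed, 0 otherwise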
Following up with a second SQL statement is OK and simple (which is always good), but it might unnecessarily stress the system.
This won't work for statements such as...
Update Table
Set Value = 'Something Else'
Where Value is Null
Select Value From Table
Where Value is Null
You would have changed the value with the update and would be unable to recover the affected records unless you stored them beforehand.
Select * Into #TempTable
From Table
Where Value is Null

Update Table
Set Value = 'Something Else'
Where Value is Null

Select T.UniqueValue, T.Value
From #TempTable TT
Join Table T
    On TT.UniqueValue = T.UniqueValue
If you're lucky, you may be able to join the temp table's records to a unique field within Table to verify the update. This is just one small example of why it is important to enumerate records.
You can get the affected rows by just using @@ROWCOUNT:
select top (select @@ROWCOUNT) * from YourTable order by 1 desc