I have a database table with about 100 columns (bulky, I know). I have about half of these columns which I will need to update iteratively to set Is Null or "" values to "TBD".
I compiled all 50 some columns which need to be updated into an update query with Access SQL code that looked something like this...
UPDATE tablename
SET tablename.column1="TBD", tablename.column2="TBD", tablename.column3="TBD"....
WHERE tablename.column1 Is Null OR tablename.column1="" OR tablename.column2 Is Null OR tablename.column2="" OR tablename.column3 Is Null OR tablename.column3=""....
Two issues: This query with 50 columns receives a "query is too complex" error.
This query is also just functionally wrong...because I'm losing data within these columns due to the WHERE statement. Records that had values populated which I did not want to update are being updated because of the OR clause.
My question is how can I go about updating all of these columns and setting their null or empty values to a particular value (in this case, "TBD")?
I know that I can just use a select query to select the columns I need to update, run it, and just CTRL+H to find & replace "" to "TBD". However, I'm worried about the potential for this to introduce errors into my dataset. I also know I could also go through column by column and update these values via an update query. However, this would be quite time consuming with 50+ columns & the iterative updates which I need to run on the entire dataset.
I'm leaning towards this latter route. I am still wondering if there are any other scripted options which I can build into a query to overcome such an issue, and that leads me here to you.
Thank you!
You could just run 50 queries:
UPDATE table SET column1="TBD" WHERE column1 IS NULL OR column1 = "";
An optimization could be:
Create a temporary table which determines which rows actually would need an update: Concatenate all column values such that a single NULL or empty would result in an record in your temp table. This way you only have to scan the base table once.
Use the keys from that table to focus on those rows only.
Etc.
That is safe and only updates your empty values (where as your previous query would have updated all columns unless you would have checked every value first with an IFNULL).
This query style also does not run into the too complex issue
You could issue one query as:
UPDATE tablename
SET column1 = iif(column1 is null or column1 = "", "TBD", column1),
column2 = iif(column2 is null or column2 = "", "TBD", column2),
. . .;
If you don't mind potentially updating all rows, you can leave out the where clause.
Related
I have a table which I dynamically fill with some data I want to create some statistics for. I have one value which has some values following a certain pattern, so I created an additional column where I map the values to other values so I can group them.
Now before I run my statistics, I need to check if I have to remap these values which means that I have to check if there are null values in that column.
I can do a select like this:
select distinct 1
from my-table t
where t.status_rd is not null
;
The disadvantage is, that this returns exactly one row, but it has to perform a full select. Is there some way that I can stop the select for the first encounter? I'm not interested in the exact row, because when there is at least one row, I have to run an update on all of them anyway, but I would like to avoid running the update unnecessarily everytime.
In Oracle I would do it with rownum, but this doesn't exist in SQLite
select 1
from my-table t
where t.status_rd is not null
and rownum <= 1
;
Use LIMIT 1 to select the first row returned:
SELECT 1
FROM my_table t
WHERE t.status_rd IS NULL
LIMIT 1
Note: I changed the where clause from IS NOT NULL to IS NULL based on your problem description. This may or may not be correct.
I'm swapping column values in a table using the following statement:
UPDATE SwapTable
SET ValueA=ValueB
,ValueB=ValueA
This works and the values do get swapped, as can be verified by this SQL Fiddle.
However, if we did such thing in (mostly any) other language, we would end up with both ValueA and ValueB having identical values.
So my question is why/how this works in SQL.
You can just see the execution plan.
Select all the rows from the table and make it as a row set.
Open a transaction
Update the table referenced (SwapTable) with corresponding row address, from the old values read from the row set to the field reference.
Commit -- done updating.
I want to update or insert a row in a table. I have also created an index on the column I am searching for in the WHERE clause.
The thing I want to insert into the table may or may not already exist in the table so it may be an update or an insert.
So first I define a Boolean variable like "already_exists" and a select statement that goes and searches for the value in the table, if it finds it it will set the Boolean variable to true, else it will remain false.
Then I say oh OK if that variable is true, then run this update command on the table, if it is false run this insert command.
So is it the right way of doing it or there are better ways?
Yes.
Depending on your SQL platform, MERGE or UPSERT...
wikipedia Merge (SQL)
For performance what option would be better for large data sets that are to be updated?
Using a CASE statement or Individual update queries?
CASE Example:
UPDATE tbl_name SET field_name =
CASE
WHEN condition_1 THEN 'Blah'
WHEN condition_2 THEN 'Foo'
WHEN condition_x THEN 123
ELSE 'bar'
END AS value
Individual Query Example:
UPDATE tbl_name SET field_name = 'Blah' WHERE field_name = condition_1
UPDATE tbl_name SET field_name = 'Foo' WHERE field_name = condition_2
UPDATE tbl_name SET field_name = 123 WHERE field_name = condition_x
UPDATE tbl_name SET field_name = 'bar' WHERE field_name = condition_y
NOTE: About 300,000 records are going to be updated and the CASE statement would have about 10,000 WHEN conditions. If using the individual queries it's about 10,000 as well
The CASE version.
This is because there is a good chance you are altering the same row more than once with the individual statements. If row 10 has both condition_1 and condition_y then it will need to get read and altered twice. If you have a clustered index this means two clustered index updates on top of whatever the other field(s) that were modified were.
If you can do it as a single statement, each row will be read only once and it should run much quicker.
I changed a similar process about a year ago that used dozens of UPDATE statements in sequence to use a since UPDATE with CASE and processing time dropped about 80%.
It seems logic to me that on the first option SQL Server will go through the table only once and for each row, it will evaluate the condition.
On the second, it will have to go through all table 4 times
So, for a table with 1000 rows, on the first option on the best case scenario we are talking about 1000 evaluations and worst case, 3000.
On the second we'll always have 4000 evaluations
So option 1 would be the faster.
As pointed out by Mitch, try making a temp table filling it with all the data you need, make a different temp table for each column (field) you want to change. You should also add an index to the temp table(s) for added performance improvement.
This way your update statement becomes (more or less):
UPDATE tbl_name SET field_name = COALESCE((SELECT value FROM temp_tbl WHERE tbl_name.conditional_field = temp_tbl.condition_value), field_name),
field_name2 = COALESCE((SELECT value FROM temp_tbl2 WHERE tbl_name.conditional_field2 = temp_tbl2.condition_value), field_name2)
and so on..
This should give you good performance while scaling up for large volumes of updates at once.
I have a table which contains multiple columns.
Column 1 Column 2 Column 3
unique identifier alphanumerical value numerical value
The unique identifier is currently using the values from Column 2. If I wanted to use the values from Column 3 instead, which would be better suited for my situation? Replace or Update? Or is there another way I should go about doing this.
I'm using TOAD for Oracle for what it is worth.
Thank you.
Turns out what the OP wanted to do was simply set column1 to the value in column3, no replace was necessary. Just a straight update, as in:
UPDATE TheTable SET column1 = column3;
IF
you want to make a change in the table THEN
use UPDATE
ELSE IF
you want to just view the column1 values mapped with column3 values for particular instance THEN
use INSERT
The otherway around use UPDATE to change the contents of table, whereas replace is a function which really doesnt makes any changes in the contents of the table but just shows u the changed output.
You would use both.
update sometable set column1 = replace(column1,column2,column3)
You might want to do the following first to make sure you're replacing what you want to replace:
select replace(column1,column2,column3) from sometable