How can you make a new, non-nullable column whose initial data will just clone over from that of another column? - sql

Let's say there's a table in SQL Server 2008 with two columns: id and name. Now let's say you want to add a non-nullable column to the table called "description", and in the process of adding this column, you want to use the data in name as the original batch of data in description. Is there not a way to do this directly, i.e., without dropping the table and re-creating it or filling in the values manually post-addition or something? If there is a way, how? Thanks!

You could do what you're asking with a computed column, but then you wouldn't be able to modify the description independently later.
What you can do to minimize the impact on the log etc. is to add it as nullable, then update the values in batches, then change it to not nullable.
ALTER TABLE dbo.foo ADD description VARCHAR(whatever) NULL;

BEGIN TRANSACTION;
SELECT 1; -- seed @@ROWCOUNT so the loop body runs at least once
WHILE @@ROWCOUNT > 0
BEGIN
    COMMIT TRANSACTION;
    BEGIN TRANSACTION;
    UPDATE TOP (1000) dbo.foo
        SET description = name
        WHERE description IS NULL;
END
COMMIT TRANSACTION;

ALTER TABLE dbo.foo ALTER COLUMN description VARCHAR(whatever) NOT NULL;
Or as Dems suggested, add it as not nullable with a default value, then later remove the default constraint.
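A rough sketch of that approach (the column length, constraint name and placeholder default below are my own, not from the question; the copy from name is still a separate update, which you could batch just like the loop above):
ALTER TABLE dbo.foo
    ADD description VARCHAR(100) NOT NULL
    CONSTRAINT DF_foo_description DEFAULT ('');

UPDATE dbo.foo
    SET description = name; -- copy the existing values over

ALTER TABLE dbo.foo
    DROP CONSTRAINT DF_foo_description;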

Related

Make sure only one record inserted in table with thousands of concurrent users

Recently, I needed to write a stored procedure that inserts only one record when the first user comes and ignores the others. I think the IF NOT EXISTS / INSERT approach will not work for me. Also, some people online say that MERGE introduces a race condition. Is there any quick way to achieve this? This is my code for now.
IF NOT EXISTS (SELECT ......)
INSERT
You might add another table to use as the lock mechanism.
Let's say your table's name is a, and the table that holds the lock value is check_a:
create table a (name varchar(10))
create table check_a (name varchar(10))
Insert only one record to the lock table:
insert into check_a values ('lock')
go
Then create a stored procedure that checks whether there is a value in the main table. If there is no record, we lock the only row in the table check_a and insert our value into the table a.
create proc insert_if_first
as
begin
    set nocount on
    if not exists (select name from a)
    begin
        declare @name varchar(10)
        begin tran
        select @name = name from check_a with (updlock)
        if not exists (select name from a)
        begin
            insert into a values ('some value')
        end
        commit
    end
end
go
The first check on the table a (before the transaction is opened) is there to use as few system resources as possible: if there is already a record in the table a, we can skip opening the transaction and locking the row.
The second check makes sure that no one inserted a row into the table a while we were waiting to obtain the lock.
This way, only the first user who manages to lock check_a will be able to insert a value into the table a.
I'm guessing you want a stored procedure that only one user can effectively run at a time. Then you need to use isolation levels. There are different isolation levels, so you need to decide which one you need.
READ UNCOMMITTED
READ COMMITTED
REPEATABLE READ
SERIALIZABLE
You can read what they do here:
https://msdn.microsoft.com/en-us/library/ms173763.aspx
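As a minimal sketch (reusing the table a from the answer above, and assuming SERIALIZABLE is the level you settle on), the existence check and insert can be protected without a separate lock table; the UPDLOCK hint makes concurrent callers block on the check instead of both seeing an empty table:
create proc insert_if_first_serializable
as
begin
    set nocount on
    set transaction isolation level serializable
    begin tran
        if not exists (select 1 from a with (updlock))
            insert into a values ('some value')
    commit
end
go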

Create a trigger that updates a column in a table when a different column in the same table is updated - SQL

How to create a trigger that updates a column in a table when a different column in the same table is updated.
So far I have done the following, which works when any new data is created. It's able to copy data from "Purchase Requisition" to "PO_Number"; however, when data has been modified in "Purchase Requisition", no change is made to "PO_Number" and the value becomes NULL. Any kind of help will be seriously appreciated.
ALTER TRIGGER [dbo].[PO_Number_Trigger]
ON [dbo].[TheCat2]
AFTER INSERT
AS BEGIN
UPDATE dbo.TheCat2 SET PO_Number=(select Purchase_Requisition from inserted) where DocNo= (Select DocNo from inserted);
END
You need to add 'UPDATE' as well as 'INSERT' to the trigger, otherwise it will only execute on new data, not updated data. I also added 'top 1' to the select statements from the inserted table to make this 'safe' on batch updates; however, it will then only update 1 record.
ALTER TRIGGER [dbo].[PO_Number_Trigger]
ON [dbo].[TheCat2]
AFTER INSERT, UPDATE
AS BEGIN
UPDATE dbo.TheCat2 SET PO_Number=(select top 1 Purchase_Requisition from inserted) where DocNo= (Select top 1 DocNo from inserted);
END
This might do what you want:
Your trigger is altering all rows in TheCat2. Presumably, you only want to alter the new ones:
ALTER TRIGGER [dbo].[PO_Number_Trigger]
ON [dbo].[TheCat2] AFTER INSERT
AS
BEGIN
UPDATE tc
    SET PO_Number = i.Purchase_Requisition
    FROM dbo.TheCat2 tc
    JOIN inserted i
        ON tc.DocNo = i.DocNo;
END;
However, perhaps a computed column is sufficient for your purposes:
alter table TheCat2 add PO_Number as Purchase_Requisition;

SQL Trigger to add values of Two columns and store it in Third column

I have a table with three columns, say pqty, prqty and balqty.
What I want to do is combine the values of pqty and prqty and store the result in balqty. While inserting into or updating this table, each row must be affected.
I used this trigger, and it worked sometimes, but most of the time it won't. I don't know why.
CREATE TRIGGER tsl on stockledger
FOR update
AS declare @pqty int, @prqty int;
select @pqty = i.pqty from inserted i;
select @prqty = i.prqty from inserted i;
update Stockledger set balqty = (@pqty - @prqty)
PRINT 'AFTER Update trigger fired.'
I don't think this is a good use of a trigger. Instead, if you have the capacity, consider using a computed column (with PERSISTED to enhance performance):
ALTER TABLE StockLedger DROP COLUMN balqty;
ALTER TABLE StockLedger ADD balqty AS pqty - prqty PERSISTED;

Creating a sequence on an existing table

How can I create a sequence on a table so that it goes from 0 -> Max value?
I've tried using the following SQL code, but it does not insert any values into the table that I am using:
CREATE SEQUENCE rid_seq;
ALTER TABLE test ADD COLUMN rid INTEGER;
ALTER TABLE test ALTER COLUMN rid SET DEFAULT nextval('rid_seq');
The table I am trying to insert the sequence in is the output from another query. I can't figure out if it makes more sense to add the sequence during this initial query, or to add the sequence to the table after the query is performed.
Set the default value when you add the new column:
create sequence rid_seq;
alter table test add column rid integer default nextval('rid_seq');
Altering the default value of an existing column does not change existing data, because the database has no way of knowing which values should be changed; there is no "this column has the default value" flag on column values. There is just the default value (originally NULL, since you didn't specify anything else) and the current value (also NULL), but no way to tell the difference between "NULL because it is the default" and "NULL because it was explicitly set to NULL". So, when you do it in two steps:
Add column.
Change default value.
PostgreSQL won't apply the default value to the column you just added. However, if you add the column and supply the default value at the same time then PostgreSQL does know which rows have the default value (all of them) so it can supply values as the column is added.
By the way, you probably want a NOT NULL on that column too:
create sequence rid_seq;
alter table test add column rid integer not null default nextval('rid_seq');
And, as a_horse_with_no_name notes, if you only intend to use rid_seq for your test.rid column, then you might want to set it as owned by test.rid so that the sequence will be dropped if the column is removed:
alter sequence rid_seq owned by test.rid;
In PostgreSQL:
UPDATE your_table SET your_column = nextval('your_sequence')
WHERE your_column IS NULL;
I'm not fluent in PostgreSQL, so I'm not familiar with the "CREATE SEQUENCE" statement. I would think, though, that you're adding the column definition correctly. However, adding the column doesn't automatically insert data for existing rows; a DEFAULT constraint only applies to new rows. Try adding something like this afterwards to populate data in the existing rows.
DECLARE @i Int
SET @i = 0
SET ROWCOUNT 1
WHILE EXISTS (SELECT 1 FROM test WHERE rid IS NULL) BEGIN
    UPDATE test SET rid = @i WHERE rid IS NULL
    SET @i = @i + 1 -- advance the counter so each row gets the next value
END
SET ROWCOUNT 0

Fastest way to update 120 Million records

I need to initialize a new field with the value -1 in a 120 Million record table.
Update table
set int_field = -1;
I let it run for 5 hours before canceling it.
I tried running it with transaction level set to read uncommitted with the same results.
Recovery Model = Simple.
MS SQL Server 2005
Any advice on getting this done faster?
The only sane way to update a table of 120M records is with a SELECT statement that populates a second table. You have to take care when doing this. Instructions below.
Simple Case
For a table w/out a clustered index, during a time w/out concurrent DML:
SELECT *, new_col = 1 INTO clone.BaseTable FROM dbo.BaseTable
recreate indexes, constraints, etc on new table
switch old and new w/ ALTER SCHEMA ... TRANSFER.
drop old table
If you can't create a clone schema, a different table name in the same schema will do. Remember to rename all your constraints and triggers (if applicable) after the switch.
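A rough sketch of the recreate/switch/drop steps for the simple case (the index name and the old schema name here are made up for illustration):
CREATE INDEX IX_BaseTable_Col2 ON clone.BaseTable (Col2);
-- ...recreate any remaining indexes and constraints on the new table...
GO
CREATE SCHEMA old;
GO
ALTER SCHEMA old TRANSFER dbo.BaseTable;   -- move the original out of the way
ALTER SCHEMA dbo TRANSFER clone.BaseTable; -- move the clone into its place
GO
DROP TABLE old.BaseTable;                  -- once you have verified the new table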
Non-simple Case
First, recreate your BaseTable with the same name under a different schema, eg clone.BaseTable. Using a separate schema will simplify the rename process later.
Include the clustered index, if applicable. Remember that primary keys and unique constraints may be clustered, but not necessarily so.
Include identity columns and computed columns, if applicable.
Include your new INT column, wherever it belongs.
Do not include any of the following:
triggers
foreign key constraints
non-clustered indexes/primary keys/unique constraints
check constraints or default constraints. Defaults don't make much of a difference, but we're trying to keep things minimal. (A rough sketch of the resulting clone-table definition follows this list.)
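Putting the above together, a very rough sketch of what the clone table might look like (the column names and types are placeholders matching the test insert below, not your real BaseTable):
CREATE SCHEMA clone;
GO
CREATE TABLE clone.BaseTable
(
    Col1 int IDENTITY(1,1) NOT NULL,  -- keep identity columns
    Col2 varchar(50) NOT NULL,
    Col3 int NOT NULL,                -- the new INT column
    CONSTRAINT PK_BaseTable_clone PRIMARY KEY CLUSTERED (Col1)  -- the clustered index only
);
GO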
Then, test your insert w/ 1000 rows:
-- assuming an IDENTITY column in BaseTable
SET IDENTITY_INSERT clone.BaseTable ON
GO
INSERT clone.BaseTable WITH (TABLOCK) (Col1, Col2, Col3)
SELECT TOP 1000 Col1, Col2, Col3 = -1
FROM dbo.BaseTable
GO
SET IDENTITY_INSERT clone.BaseTable OFF
Examine the results. If everything appears in order:
truncate the clone table
make sure the database is in the bulk-logged or simple recovery model
perform the full insert.
This will take a while, but not nearly as long as an update. Once it completes, check the data in the clone table to make sure everything is correct.
Then, recreate all non-clustered primary keys/unique constraints/indexes and foreign key constraints (in that order). Recreate default and check constraints, if applicable. Recreate all triggers. Recreate each constraint, index or trigger in a separate batch. eg:
ALTER TABLE clone.BaseTable ADD CONSTRAINT UQ_BaseTable UNIQUE (Col2)
GO
-- next constraint/index/trigger definition here
Finally, move dbo.BaseTable to a backup schema and clone.BaseTable to the dbo schema (or wherever your table is supposed to live).
-- -- perform first true-up operation here, if necessary
-- EXEC clone.BaseTable_TrueUp
-- GO
-- -- create a backup schema, if necessary
-- CREATE SCHEMA backup_20100914
-- GO
BEGIN TRY
BEGIN TRANSACTION
ALTER SCHEMA backup_20100914 TRANSFER dbo.BaseTable
-- -- perform second true-up operation here, if necessary
-- EXEC clone.BaseTable_TrueUp
ALTER SCHEMA dbo TRANSFER clone.BaseTable
COMMIT TRANSACTION
END TRY
BEGIN CATCH
SELECT ERROR_MESSAGE() -- add more info here if necessary
ROLLBACK TRANSACTION
END CATCH
GO
If you need to free up disk space, you may drop your original table at this time, though it may be prudent to keep it around a while longer.
Needless to say, this is ideally an offline operation. If you have people modifying data while you perform this operation, you will have to perform a true-up operation with the schema switch. I recommend creating a trigger on dbo.BaseTable to log all DML to a separate table. Enable this trigger before you start the insert. Then in the same transaction that you perform the schema transfer, use the log table to perform a true-up. Test this first on a subset of the data! Deltas are easy to screw up.
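A bare-bones sketch of what such a logging trigger might look like (the log table, trigger name and key column are invented here; a real version would need to capture every column required to replay the changes):
CREATE TABLE dbo.BaseTable_DmlLog
(
    LogId    int IDENTITY(1,1) PRIMARY KEY,
    Action   char(1) NOT NULL,             -- 'I', 'U' or 'D'
    Col1     int NOT NULL,                 -- key of the affected row
    LoggedAt datetime NOT NULL DEFAULT GETDATE()
);
GO
CREATE TRIGGER dbo.BaseTable_LogDml
ON dbo.BaseTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- rows only in inserted are inserts, only in deleted are deletes, in both are updates
    INSERT dbo.BaseTable_DmlLog (Action, Col1)
    SELECT CASE WHEN d.Col1 IS NULL THEN 'I'
                WHEN i.Col1 IS NULL THEN 'D'
                ELSE 'U' END,
           COALESCE(i.Col1, d.Col1)
    FROM inserted i
    FULL JOIN deleted d ON i.Col1 = d.Col1;
END
GO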
If you have the disk space, you could use SELECT INTO and create a new table. It's minimally logged, so it would go much faster.
select t.*, int_field = CAST(-1 as int)
into mytable_new
from mytable t
-- create your indexes and constraints
GO
exec sp_rename mytable, mytable_old
exec sp_rename mytable_new, mytable
drop table mytable_old
I break the task up into smaller units. Test with different batch size intervals for your table, until you find an interval that performs optimally. Here is a sample that I have used in the past.
declare @counter int
declare @numOfRecords int
declare @batchsize int
set @numOfRecords = (SELECT COUNT(*) AS NumberOfRecords FROM <TABLE> with(nolock))
set @counter = 0
set @batchsize = 2500
set rowcount @batchsize
while @counter < (@numOfRecords / @batchsize) + 1
begin
    set @counter = @counter + 1
    Update table set int_field = -1 where int_field <> -1;
end
set rowcount 0
If your int_field is indexed, remove the index before running the update, then create your index again afterwards (a sketch follows the snippet below)...
5 hours seems like a lot for 120 million recs.
set rowcount 1000000
Update table set int_field = -1 where int_field<>-1
see how long that takes, adjust and repeat as necessary
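For the index advice above, a minimal sketch (the index and table names are invented):
DROP INDEX IX_mytable_int_field ON mytable;
-- ...run the batched update...
CREATE INDEX IX_mytable_int_field ON mytable (int_field);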
What I'd try first is to drop all constraints, indexes, triggers and full-text indexes before you update.
If the above isn't performant enough, my next move would be to create a CSV file with the 120 million records and bulk import it using bcp.
Lastly, I'd create a new heap table (meaning a table with no primary key) with no indexes on a different filegroup and populate it with -1. Then partition the old table and add the new partition using "switch".
When adding a new column ("initialize a new field") and setting a single value to each existing row, I use the following tactic:
ALTER TABLE MyTable
add NewColumn int not null
constraint MyTable_TemporaryDefault
default -1
ALTER TABLE MyTable
drop constraint MyTable_TemporaryDefault
If the column is nullable and you don't include a default constraint like this, the newly added column will simply be NULL for all existing rows.
declare @cnt bigint
set @cnt = 1
while @cnt * 100 < 10000000
begin
    UPDATE top(100) [Imp].[dbo].[tablename]
    SET [col1] = xxxx
    WHERE [col1] is null
    print '@cnt: ' + convert(varchar, @cnt)
    set @cnt = @cnt + 1
end
Sounds like an indexing problem, like Pablo Santa Cruz mentioned. Since your update is not conditional, you can DROP the column and RE-ADD it with a DEFAULT value.
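A rough sketch of that idea (the constraint name is invented, and this assumes nothing else depends on the column, since it throws away whatever was in it before):
ALTER TABLE mytable DROP COLUMN int_field;

ALTER TABLE mytable
    ADD int_field int NOT NULL
    CONSTRAINT DF_mytable_int_field DEFAULT (-1);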
In general, the recommendations are:
Remove or just disable all INDEXES, TRIGGERS and CONSTRAINTS on the table;
Perform COMMIT more often (e.g. after every 1000 records updated);
Use select ... into.
But in your particular case you should choose the most appropriate solution, or a combination of them.
Also bear in mind that sometimes an index can be useful, e.g. when you update a non-indexed column filtered by some condition.
If the table has an index which you can iterate over, I would put an update top(10000) statement in a while loop moving over the data. That would keep the transaction log slim and won't have such a huge impact on the disk system. Also, I would recommend playing with the maxdop option (setting it closer to 1); a rough sketch is below.
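A minimal sketch of that pattern (the table and column names are placeholders, and the batch size is just an example):
DECLARE @rows int;
SET @rows = 1;
WHILE @rows > 0
BEGIN
    UPDATE TOP (10000) dbo.mytable
        SET int_field = -1
        WHERE int_field IS NULL
        OPTION (MAXDOP 1); -- limit parallelism as suggested above
    SET @rows = @@ROWCOUNT;
END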