If we have a giant fact table and want to add a new dimension, we can do it like this:
BEGIN TRANSACTION
ALTER TABLE [GiantFactTable]
ADD NewDimValueId INT NOT NULL
CONSTRAINT [temp_DF_NewDimValueId] DEFAULT (-1)
WITH VALUES -- table is not actually rebuilt!
ALTER TABLE [GiantFactTable]
WITH NOCHECK
ADD CONSTRAINT [FK_GiantFactTable_NewDimValue]
FOREIGN KEY ([NewDimValueId])
REFERENCES [NewDimValue] ([Id])
-- drop the default constraint, new INSERTs will specify a value for NewDimValueId column
ALTER TABLE [GiantFactTable]
DROP CONSTRAINT [temp_DF_NewDimValueId]
COMMIT TRANSACTION
NB: all of the above only manipulate table metadata and should be fast regardless of table size.
Then we can run a job to backfill GiantFactTable.NewDimValueId in small transactions, such that the FK is not violated. (At this point any INSERTs/UPDATEs - e.g. the backfill operation - are verified by the FK, since it is enabled but not "trusted".)
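A minimal sketch of such a backfill loop, assuming a hypothetical join condition (NaturalKey on the dimension, SomeSourceColumn on the fact table) and a batch size you would tune for your own log and blocking tolerance:

-- hypothetical batched backfill; -1 is the placeholder value from the default above
DECLARE @BatchSize INT = 10000;
WHILE 1 = 1
BEGIN
    UPDATE TOP (@BatchSize) f
    SET f.NewDimValueId = d.Id
    FROM [GiantFactTable] AS f
    JOIN [NewDimValue] AS d ON d.NaturalKey = f.SomeSourceColumn -- assumed lookup
    WHERE f.NewDimValueId = -1;

    IF @@ROWCOUNT = 0 BREAK; -- nothing left to backfill
END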
After the backfill we know the data is consistent; my question is: how can the SQL engine become enlightened too, without taking the table offline?
This command would make the FK trusted, but it requires a schema modification (Sch-M) lock and would likely take hours (days?), effectively taking the table offline:
ALTER TABLE [GiantFactTable]
WITH CHECK CHECK CONSTRAINT [FK_GiantFactTable_NewDimValue]
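For reference, you can confirm whether the engine currently trusts the constraint by querying sys.foreign_keys:

SELECT name, is_disabled, is_not_trusted
FROM sys.foreign_keys
WHERE name = 'FK_GiantFactTable_NewDimValue';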
About the workload: Table has a few hundred partitions (fixed number), data is appended to one partition at a time (in a round-robin fashion), never deleted. There is also a constant read workload that uses the clustering key to get a (relatively small) range of rows from one partition at a time.
Checking one partition at a time, taking it offline, would be acceptable. But I can't find any syntax to do this. Any other ideas?
A few ideas come to mind but they aren't pretty:
Redirect workloads and run check constraint offline
1. Create a new table with the same structure.
2. Change the "insert" workload to insert into the new table.
3. Copy the data from the partition used by the "read" workload to the new table (or a third table with the same structure).
4. Change the "read" workload to use the new table.
5. Run the ALTER TABLE to check the constraint and let it take as long as it needs.
6. Change both workloads back to the main table.
7. Insert the new rows back into the main table.
8. Drop the new table(s).
A variation on the above is to switch the relevant partition to the new table in step 3. That should be faster than copying the data but I think you will have to copy (and not just switch) the data back after the constraint has been checked.
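The switch itself is a metadata-only operation. A sketch, assuming partition 42 and a staging table [GiantFactTable_Staging] that is empty, on the same filegroup and identical in structure (the number and names are illustrative):

-- move the read workload's partition out to the staging table (metadata-only)
ALTER TABLE [GiantFactTable]
SWITCH PARTITION 42 TO [GiantFactTable_Staging];

-- NB: switching data back in (ALTER TABLE ... SWITCH TO ... PARTITION 42) is more restrictive:
-- the staging table needs matching constraints plus a trusted CHECK constraint covering the
-- partition boundary, which is why copying the rows back may end up being simpler.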
Insert all the data into a new table
1. Create a new table with the same structure and the constraint enabled.
2. Change the "insert" workload to the new table.
3. Copy all the data from the old to the new table in batches and wait as long as it takes to complete (a rough sketch follows after this list).
4. Change the "read" workload to the new table. If step 3 takes too long and the "read" workload needs rows that have only been inserted into the new table, you will have to manage this changeover manually.
5. Drop the old table.
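A rough sketch of steps 1 and 3, with a deliberately minimal column list and an assumed single-column clustering key (FactId); the real table obviously has many more columns:

-- the FK is created enabled here, so it is trusted from the start
CREATE TABLE [GiantFactTable_New] (
    FactId BIGINT NOT NULL, -- assumed clustering key column
    NewDimValueId INT NOT NULL
        CONSTRAINT [FK_GiantFactTable_New_NewDimValue]
        REFERENCES [NewDimValue] ([Id])
);

-- copy across in key-range batches so each transaction stays small
DECLARE @RangeStart BIGINT = 0, @RangeEnd BIGINT = 1000000;
INSERT INTO [GiantFactTable_New] (FactId, NewDimValueId)
SELECT FactId, NewDimValueId
FROM [GiantFactTable]
WHERE FactId >= @RangeStart AND FactId < @RangeEnd; -- advance the range each batch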
Use index to speed up constraint check?
I have no idea if this works but you can try to create a non-clustered index on the foreign key column. Also make sure there's an index on the relevant unique key on the table referenced by the foreign key. The alter table command might be able to use them to speed up the check (at least by minimizing IO compared to doing a full table scan). The indexes, of course, can be created online to avoid any disruption.
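For example (ONLINE = ON needs Enterprise Edition; the index name is illustrative):

-- supporting index on the FK column, built without blocking the existing workloads
CREATE NONCLUSTERED INDEX [IX_GiantFactTable_NewDimValueId]
ON [GiantFactTable] ([NewDimValueId])
WITH (ONLINE = ON);

-- the validation still needs its Sch-M lock, but may now read far fewer pages
ALTER TABLE [GiantFactTable]
WITH CHECK CHECK CONSTRAINT [FK_GiantFactTable_NewDimValue];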
Related
I have a massive job that runs nightly, and to have a smaller impact on the DB it runs on a table in a different schema (EmptySchema) that isn't in general use, and is then swapped out to the usual location (UsualSchema) using
ALTER SCHEMA TempSchema TRANSFER UsualSchema.BigTable
ALTER SCHEMA UsualSchema TRANSFER EmptySchema.BigTable
ALTER SCHEMA EmptySchema TRANSFER TempSchema.BigTable
Which effectively swaps the two tables.
However, I then need to set up indexes on the UsualSchema table. Can I do this by disabling them on the UsualSchema table and then re-enabling them once the swap has taken place? Or do I have to create them each time on the swapped out table? Or have duplicate indexes in both places and disable/enable them as necessary (leading to duplicates in source control, so not ideal)? Is there a better way of doing it?
There's one clustered index and five non-clustered indexes.
Thanks.
Indexes, including those that support constraints, are transferred by ALTER SCHEMA, so you can have them in both the source and target object schema.
Constraint names are scoped to the schema of their table, while other index names are scoped to the table/view itself. It is therefore possible to have identical index names in the same schema as long as they are on different tables; constraint names, however, must be unique within the schema.
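So the nightly job can, for example, build the indexes on the staging copy before the swap and they will travel with the table (the column name here is illustrative):

-- built while the table still lives in EmptySchema; carried along by the transfer
CREATE NONCLUSTERED INDEX [IX_BigTable_SomeColumn]
ON [EmptySchema].[BigTable] ([SomeColumn]);
-- ...then run the three ALTER SCHEMA ... TRANSFER statements from the question as usual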
Is it possible to create an autoserial column, numbered 1, 2, 3, 4..., in Informix, and what would be the syntax? I have a query and some of my timestamps are identical, so I was unable to query using a timestamp variable. Thanks!
These are the commands that I ran, while logged in to the dbaccess console, to add an id field to an existing table.
alter table my_table add id integer before some_field;
create sequence myseq;
update my_table set id = myseq.nextval;
drop sequence myseq;
alter table my_table modify (id serial not null);
Thanks to @ricardo-henriques for pointing me in the right direction. These commands will allow you to run the instructions explained in his answer on your database.
That would be the SERIAL data type.
You can use, as @RET mentioned, the SERIAL data type.
Next you will struggle with the fact that you can't add a SERIAL column to an existing table. Ways to work around this:
Add an INTEGER column, populate with sequential numbers and then alter the column to SERIAL.
Unload the data to a file, drop the table and recreate it with the new column.
Create a new table with the new column, populate the new table with the data from the old, drop the old and rename the new.
...
Bear in mind that the values may not be unique. Hence you have to create a unique index, a primary key, or a unique constraint on the column to prevent duplicates.
Other notes you should be aware of:
- A primary key doesn't allow NULLs; a unique index or unique constraint does (as long as there is only one such record), so you should specify NOT NULL in the column definition.
- If you use a primary key or a unique constraint you can create a foreign key that references it.
- With a primary key or a unique constraint, uniqueness is validated at the end of the DML statement; with a unique index it is checked row by row.
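For instance, continuing with the my_table/id names from the earlier answer (the index and constraint names are illustrative):

-- either a unique index...
create unique index ux_my_table_id on my_table (id);

-- ...or a named primary key constraint
alter table my_table add constraint (primary key (id) constraint pk_my_table);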
It seems this is your first contact with Informix, welcome. Yes, it can be a little hard at the beginning; just remember:
- Always search before asking, really search.
- When in doubt, or when you have reached a dead end, ask away.
- Try to trim down your scenario: build your test case as simply as you can. This will not only help us to help you, but you will also get practice and in some cases find the solution by yourself.
- When an error is involved, always give the error code; Informix gives at least one error code and sometimes an ISAM error too.
Kind regards.
I currently have a composite PK (clustered) consisting of 3 columns, let's call them A, B and C, all needed to ensure uniqueness. Due to external factors I need to modify this table by removing the current PK and adding a new index on a new column instead.
This is done by a standard
ALTER TABLE Table_Name DROP CONSTRAINT PK_Name
CREATE INDEX Index_Name ON Table_Name (NewColumn)
The problem is that the table is huge (some 70 million rows) and performing a drop on the current PK and then adding the new index takes over an hour. Is there any way to fix the situation in a more performance efficient way?
The table only has its composite PK, so no NCIs or FKs and other dependencies to worry about. I am working on SQL Server 2008.
One alternative is to create a new table with the right constraints. Then you copy over all the rows. At the end, you drop the old table, and sp_rename the new one. The rename is fast and that reduces downtime.
You might have to put some thought into rows that are being added during the copy operation. One way to deal with that is to rename the old table, then copy any new rows over, and only after that rename the new table. This still results in a much shorter downtime than altering the table in place.
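A minimal sketch of the swap, assuming illustrative column types (the question only names the columns A, B, C and NewColumn):

-- new table carries the new index instead of the old composite PK
CREATE TABLE dbo.Table_Name_New (
    A INT NOT NULL,
    B INT NOT NULL,
    C INT NOT NULL,
    NewColumn INT NULL
);
CREATE INDEX Index_Name ON dbo.Table_Name_New (NewColumn);

-- copy the rows (ideally in batches), then swap the names during a short outage
INSERT INTO dbo.Table_Name_New (A, B, C, NewColumn)
SELECT A, B, C, NewColumn
FROM dbo.Table_Name;

EXEC sp_rename 'dbo.Table_Name', 'Table_Name_Old';
EXEC sp_rename 'dbo.Table_Name_New', 'Table_Name';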
We have an 8 million row table and we need to add a sequential id column to it. It is used for data warehousing.
From testing, we know that if we remove all the indexes, including the primary key index, adding a new sequential id column is roughly 10x faster. I still haven't figured out why dropping the indexes would help with adding an identity column.
Here is the SQL that add identity column:
ALTER TABLE MyTable ADD MyTableSeqId BIGINT IDENTITY(1,1)
However, the table in question has dependencies, so I cannot drop the primary key index unless I remove all the FK constraints. As a result, adding the identity column is slow.
Are there other ways to improve the speed when adding an identity column, so that client downtime is minimal?
or
Is there a way to add an identity column without locking the table, so that the table can be accessed, or at least queried?
The database is SQL Server 2005 Standard Edition.
Adding a new column to a table will acquire a Sch-M (schema modification) lock, which prevents all access to the table for the duration of the operation.
You may get some benefit from switching the database into bulk-logged or simple mode for the duration of the operation, but of course, do so only if you're aware of the effects this will have on your backup / restore strategy.
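For example (the database name is illustrative; whether the identity add is actually minimally logged is something to verify on a test copy first):

ALTER DATABASE MyDatabase SET RECOVERY BULK_LOGGED;

ALTER TABLE MyTable ADD MyTableSeqId BIGINT IDENTITY(1,1);

ALTER DATABASE MyDatabase SET RECOVERY FULL;
-- take a log backup here so point-in-time restore capability is re-established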
I have a table that has grown quite large that we are replicating to about 120 subscribers. A FK on that table does not have an index and when I ran an Execution Plan on a query that was causing issues it had this to say -->
/*
Missing Index Details from CaseNotesTimeoutQuerys.sql - mylocal\sqlexpress.MATRIX (WWCARES\pschaller (54))
The Query Processor estimates that implementing the following index could improve the query cost by 99.5556%.
*/
/*
USE [MATRIX]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[tblCaseNotes] ([PersonID])
GO
*/
I would like to add this but I am afraid it will FORCE a reinitialization. Can anyone verify or validate my concerns? Does it even work that way or would I need to run the script on each subscriber?
Any insight would be appreciated.
Adding an index shouldn't change the table in a way that causes a reinitialisation, but I suggest you set up a test version to make sure.
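In other words, you would just run the suggested index at the publisher with a real name substituted for the placeholder (the name below is one I've made up), after trying it on a test copy:

USE [MATRIX]
GO
CREATE NONCLUSTERED INDEX [IX_tblCaseNotes_PersonID]
ON [dbo].[tblCaseNotes] ([PersonID])
GO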
Adding a FK constraint on a field WILL NOT force the reinitialisation of the subscription. This is the complete code that we use when (1) adding a field to a table and (2) defining this field as a FK:
use myDb

alter table Tbl_myTable
    add id_myOtherTable uniqueIdentifier Null
go

alter table Tbl_myTable
    add constraint Tbl_myTable_myOtherTableP
        foreign key (id_myOthertable)
        references dbo.tbl_myOtherTable (id_myOtherTable)
go
These instructions (adding a field, adding an FK constraint) are replicated to the subscribing databases without reinitialisation. If you are not adding the field (i.e. it already exists with data), you must imperatively check that the constraint will be valid on all databases. If, for one reason or another, a subscriber (S) does not respect the constraint set on the publisher (P), then the constraint will not be propagated and the subscription will stop. You'll then have to fix the data on (S) manually so that it accepts the constraint propagated from (P).
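A quick way to find offending rows on a subscriber before applying the constraint, reusing the illustrative names from the code above:

-- rows in Tbl_myTable whose id_myOtherTable has no match in tbl_myOtherTable
SELECT t.*
FROM dbo.Tbl_myTable AS t
WHERE t.id_myOtherTable IS NOT NULL
  AND NOT EXISTS (
      SELECT 1
      FROM dbo.tbl_myOtherTable AS o
      WHERE o.id_myOtherTable = t.id_myOtherTable
  );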