Adding FK Index to existing table in Merge Replication Topology - sql

I have a table that has grown quite large and that we replicate to about 120 subscribers. An FK on that table does not have an index, and when I ran an Execution Plan on a query that was causing issues it had this to say:
/*
Missing Index Details from CaseNotesTimeoutQuerys.sql - mylocal\sqlexpress.MATRIX (WWCARES\pschaller (54))
The Query Processor estimates that implementing the following index could improve the query cost by 99.5556%.
*/
/*
USE [MATRIX]
GO
CREATE NONCLUSTERED INDEX [<Name of Missing Index, sysname,>]
ON [dbo].[tblCaseNotes] ([PersonID])
GO
*/
I would like to add this but I am afraid it will FORCE a reinitialization. Can anyone verify or validate my concerns? Does it even work that way or would I need to run the script on each subscriber?
Any insight would be appreciated.

Adding an index shouldn't change the table in a way that causes a reinitialisation, but I suggest you set up a test version to make sure.
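For example, a minimal sketch of the suggested index with a concrete name filled in (the name is an assumption; the missing-index template above leaves it as a placeholder):
USE [MATRIX]
GO
-- hypothetical name for the index the optimizer suggested
CREATE NONCLUSTERED INDEX [IX_tblCaseNotes_PersonID]
ON [dbo].[tblCaseNotes] ([PersonID])
GO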

Adding an FK constraint on a field WILL NOT force a reinitialisation of the subscription. This is the complete code that we use when (1) adding a field to a table and (2) defining this field as an FK:
use myDb
-- (1) add the new field
alter table Tbl_myTable
    add id_myOtherTable uniqueidentifier null
go
-- (2) define the field as a foreign key
alter table Tbl_myTable
    add constraint Tbl_myTable_myOtherTableP
        foreign key (id_myOtherTable)
        references dbo.tbl_myOtherTable (id_myOtherTable)
go
These instructions (adding a field, adding an FK constraint) are replicated to the subscribing databases without reinitialisation. If you are only adding the constraint and not the field, you must first make absolutely sure that the constraint will be valid on all databases. If, for one reason or another, a subscriber (S) has data that does not respect the constraint set on the publisher (P), the constraint will not be propagated and the subscription will stop. You will then have to fix the data on (S) manually so that it accepts the constraint propagated from (P).
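Before adding the constraint, a hedged sketch (reusing the table and column names from the example above) of a query you could run on each subscriber to spot rows that would violate it:
-- child rows whose FK value has no matching parent row
select m.id_myOtherTable
from Tbl_myTable m
where m.id_myOtherTable is not null
  and not exists (select 1
                  from dbo.tbl_myOtherTable o
                  where o.id_myOtherTable = m.id_myOtherTable)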

Related

violated - parent key not found when using generated SQL

I used a plugin in IntelliJ to generate the SQL and it looks correct, but I keep getting an error saying "violated - parent key not found".
Added for clarity: 'LOCAL.fOO_LANGUAGE_FK) violated - parent key not found'
To make it I used this:
create table fOO
(
Foo_ID NUMBER not null
constraint CAMPAIGN_PK
primary key,
You are attempting to insert a record into a table which references 5 other tables:
SHOP
SEVERITY
CAMPAIGN_USAGE
LAYOUT
SALE_QUALIFICATION
These references are enforced by constraints. Constraints are rules that the database enforces on your data, based on your data model.
From the comments shared in your follow-up, you also have a LANGUAGE table referenced by your SHOP table.
So to insert a record into your table, you have to make sure the referenced values are all present in your other 5 or more tables.
If you cannot insert the data in the correct order, it's quite common when building out your schema to create the foreign key constraints DISABLED, INSERT all the data, COMMIT, and then ENABLE the constraints again.
To disable a constraint, for example 'CAMPAIGN_SHOP_FK', you can
alter table CAMPAIGN disable constraint CAMPAIGN_SHOP_FK;
It is VITAL that you re-enable your constraints if you do not want orphaned rows in your data model.
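For completeness, a minimal sketch of re-enabling the same constraint once the data is loaded (ENABLE VALIDATE also re-checks the rows inserted while it was disabled):
alter table CAMPAIGN enable validate constraint CAMPAIGN_SHOP_FK;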
Some folks will mistakenly rely on their software to ensure their data is 'clean'. If you do this, then you are betting your software is error free AND that no one is in the database touching your data. Both cases are rarely, if ever, true.

Checking foreign key constraint "online"

If we have a giant fact table and want to add a new dimension, we can do it like this:
BEGIN TRANSACTION

-- add the new column with a temporary default so existing rows get -1
ALTER TABLE [GiantFactTable]
    ADD NewDimValueId INT NOT NULL
        CONSTRAINT [temp_DF_NewDimValueId] DEFAULT (-1)
        WITH VALUES -- table is not actually rebuilt!

-- add the FK without checking existing rows (it starts out untrusted)
ALTER TABLE [GiantFactTable]
    WITH NOCHECK
    ADD CONSTRAINT [FK_GiantFactTable_NewDimValue]
        FOREIGN KEY ([NewDimValueId])
        REFERENCES [NewDimValue] ([Id])

-- drop the default constraint, new INSERTs will specify a value for NewDimValueId column
ALTER TABLE [GiantFactTable]
    DROP CONSTRAINT [temp_DF_NewDimValueId]

COMMIT TRANSACTION
NB: all of the above only manipulate table metadata and should be fast regardless of table size.
Then we can run a job to backfill GiantFactTable.NewDimValueId in small transactions, such that the FK is not violated. (At this point any INSERTs/UPDATEs - e.g. backfill operation - are verified by the FK since it's enabled, but not "trusted")
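For illustration, a hedged sketch of such a batched backfill; the lookup column joining the fact table to the dimension is an assumption and would be whatever your actual mapping is:
DECLARE @rows INT = 1;
WHILE @rows > 0
BEGIN
    -- update a small batch of not-yet-backfilled rows (-1 is the temporary default)
    UPDATE TOP (10000) f
    SET f.NewDimValueId = d.[Id]
    FROM [GiantFactTable] f
    JOIN [NewDimValue] d
        ON d.[SourceKey] = f.[SourceKey] -- hypothetical lookup column
    WHERE f.NewDimValueId = -1;

    SET @rows = @@ROWCOUNT;
END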
After the backfill we know the data is consistent; my question is how the SQL engine can become enlightened too, without taking the table offline.
This command will make the FK trusted, but it requires a schema modification (Sch-M) lock and will likely take hours (days?), effectively taking the table offline:
ALTER TABLE [GiantFactTable]
WITH CHECK CHECK CONSTRAINT [FK_GiantFactTable_NewDimValue]
About the workload: Table has a few hundred partitions (fixed number), data is appended to one partition at a time (in a round-robin fashion), never deleted. There is also a constant read workload that uses the clustering key to get a (relatively small) range of rows from one partition at a time.
Checking one partition at a time, taking it offline, would be acceptable. But I can't find any syntax to do this. Any other ideas?
A few ideas come to mind but they aren't pretty:
Redirect workloads and run the check constraint offline:
1. Create a new table with the same structure.
2. Change the "insert" workload to insert into the new table.
3. Copy the data from the partition used by the "read" workload to the new table (or a third table with the same structure).
4. Change the "read" workload to use the new table.
5. Run ALTER TABLE to check the constraint and let it take as long as it needs.
6. Change both workloads back to the main table.
7. Insert the new rows back into the main table.
8. Drop the new table(s).
A variation on the above is to switch the relevant partition to the new table in step 3. That should be faster than copying the data but I think you will have to copy (and not just switch) the data back after the constraint has been checked.
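For reference, a hedged sketch of that partition-switch variation (the partition number and staging table are assumptions, and SWITCH requires the staging table to have an identical structure, indexes, and filegroup placement):
-- hypothetical: move one partition out to a staging table for offline checking
ALTER TABLE [GiantFactTable]
    SWITCH PARTITION 42 TO [GiantFactTable_Staging] PARTITION 42;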
Insert all the data into a new table:
1. Create a new table with the same structure and the constraint enabled.
2. Change the "insert" workload to the new table.
3. Copy all the data from the old table to the new table in batches and wait as long as it takes to complete.
4. Change the "read" workload to the new table. If step 3 takes too long and the "read" workload needs rows that have only been inserted into the new table, you will have to manage this changeover manually.
5. Drop the old table.
Use index to speed up constraint check?
I have no idea if this works but you can try to create a non-clustered index on the foreign key column. Also make sure there's an index on the relevant unique key on the table referenced by the foreign key. The alter table command might be able to use them to speed up the check (at least by minimizing IO compared to doing a full table scan). The indexes, of course, can be created online to avoid any disruption.
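For example, a minimal sketch (the index name is an assumption; ONLINE index builds require Enterprise Edition):
CREATE NONCLUSTERED INDEX IX_GiantFactTable_NewDimValueId
    ON [GiantFactTable] ([NewDimValueId])
    WITH (ONLINE = ON);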

Primary Key and Unique Index -- sql scripts generated by SQL Developer

When exporting SQL scripts with SQL Developer there are multiple options available, but either way it generates a UNIQUE INDEX on the primary key like this:
CREATE UNIQUE INDEX "SYS_C0018099" ON "TRANSACTION" ("ID")
and adds a PRIMARY KEY on the same table and column:
ALTER TABLE "TRANSACTION" ADD PRIMARY KEY ("ID")
So the question is: isn't this a kind of redundancy? I thought creating a primary key on a column created a unique index on that column by default, so why is the first command necessary?
And could this cause data redundancy?
I am on Oracle 11g, so please share any ideas about why the script looks like this.
Thanks in advance.
There is no redundancy - or only a little bit :)
The second command will use the existing index if one is available; otherwise (if the first DDL has not been run) it will create an index.
Splitting this into two commands is useful when you have given the index a proper name and want to keep it.
UPDATE: The link indicated by Thomas Haratyk is a must read, I really like it: http://viralpatel.net/blogs/understanding-primary-keypk-constraint-in-oracle/
UPDATE2: a_horse_with_no_name is right, it can be done in a single statement like:
alter table TRANSACTION
add CONSTRAINT pk_test PRIMARY KEY (id);
So it will keep the name (it won't create a system-generated SYS_C... name), and if you use the USING INDEX clause you can specify index attributes, for example storage attributes.
But again, you will not have any problems with those two statements; only one index is created.
SQL Developer probably prefers to generate one DDL statement per object, and there may be cases where that approach is better.
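For example, a hedged sketch (the constraint and index names are assumptions) of creating the index yourself and attaching the primary key to it:
create unique index pk_transaction_ix on "TRANSACTION" ("ID");

alter table "TRANSACTION"
  add constraint pk_transaction primary key ("ID")
  using index pk_transaction_ix;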

How to bulk copy in parallel to a table with a clustered index?

There is a process that bulk inserts data into a SQL Server table from 3 sources in parallel. After adding a primary key to this table, 2 of the bulk insert queries get cancelled after a while due to being chosen as deadlock victims. This never happened until I added the primary key. I assume the problem has something to do with the clustered index that was created by adding the primary key.
For now I'm just going to remove the primary key and then create a non-clustered index on the table. I would like some more info on whether the problem is what I think it is, and if there is a way for me to add a clustered index without screwing the load process up.
Not sure if it is more poison than cure, but Robert suggests dropping the clustered index before a huge bulk insert:
http://www.simple-talk.com/sql/learn-sql-server/bulk-inserts-via-tsql-in-sql-server/
We just lock the table and minimally log transactions.
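As an illustration, a hedged sketch (the table name, file path, and recovery model are assumptions) of a bulk load that takes a table lock so it can be minimally logged:
-- minimal logging also requires the SIMPLE or BULK_LOGGED recovery model
BULK INSERT dbo.StagingTable
FROM 'C:\data\load_file.dat'
WITH (TABLOCK);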

Do I need to create indexes on foreign keys on Oracle?

I have a table A and a table B. A has a foreign key to B on B's primary key, B_ID.
For some reason (I know there are legitimate reasons) it is not using an index when I join these two tables on the key.
Do I need to separately create an index on A.B_ID or should the existence of a foreign key provide that?
On Oracle, the foreign key constraint alone does not provide an index - one must be created separately (and usually should be).
Creating a foreign key does not automatically create an index on A.B_ID. So it would generally make sense from a query performance perspective to create a separate index on A.B_ID.
If you ever delete rows in B, you definitely want A.B_ID to be indexed. Otherwise, Oracle will have to do a full table scan on A every time you delete a row from B to make sure that there are no orphaned records (depending on the Oracle version, there may be additional locking implications as well, but those are diminished in more recent Oracle versions).
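For example, a minimal sketch using the question's table and column (the index name is an assumption):
create index a_b_id_ix on A (B_ID);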
Just for more info: Oracle doesn't create an index automatically (as it does for unique constraints) because (a) it is not required to enforce the constraint, and (b) in some cases you don't need one.
Most of the time, however, you will want to create an index (in fact, in Oracle Apex there's a report of "unindexed foreign keys").
Whenever the application needs to be able to delete a row in the parent table, or update the PK value (which is rarer), the DML will suffer if no index exists, because it will have to lock the entire child table.
A case where I usually choose not to add an index is where the FK is to a "static data" table that defines the domain of a column (e.g. a table of status codes), where updates and deletes on the parent table are never done directly by the application. However, if adding an index on the column gives benefits to important queries in the application, then the index will still be a good idea.
SQL Server has never put indexes onto foreign key columns automatically - check out Kim Tripp's excellent blog post on the background and history of this urban myth.
It's usually a good idea to index your foreign key columns, however - so yes, I would recommend making sure each FK column is backed up by an index; not necessarily on that one column alone - maybe it can make sense to create an index on two or three columns with the FK column as the first one in there. Depends on your scenario and your data.
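For instance, a hedged sketch (the table and column names are hypothetical) of a composite index that leads with the FK column so it still supports FK lookups:
CREATE NONCLUSTERED INDEX IX_OrderLine_OrderID_ProductID
    ON dbo.OrderLine (OrderID, ProductID);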
For performance reasons an index should be created. It is used in delete operations on the primary table (to check that the record you are deleting is not referenced) and in joins, which usually involve the foreign key column. Only a few tables (I do not create these indexes on log tables) may not need the index, but in those cases you probably don't need the foreign key constraint either.
BUT there are some databases that do automatically create indexes on foreign keys:
Jet Engine (Microsoft Access files)
Firebird
MySQL
For sure, SQL Server and Oracle DO NOT.
As with anything relating to performance, it depends on many factors and there is no silver bullet; e.g. in a very high-activity environment the maintenance of an index may be unacceptable.
Most salient here would seem to be selectivity: if the values in the index would be highly duplicated then it may give better performance to drop the index (if possible) and allow a table scan.
UNIQUE, PRIMARY KEY, and FOREIGN KEY constraints generate indexes that enforce or "back" the constraint (and are sometimes called backing indexes). PRIMARY KEY constraints generate unique indexes. FOREIGN KEY constraints generate non-unique indexes. UNIQUE constraints generate unique indexes if all the columns are non-nullable, and they generate non-unique indexes if one or more columns are nullable. Therefore, if a column or set of columns has a UNIQUE, PRIMARY KEY, or FOREIGN KEY constraint on it, you do not need to create an index on those columns for performance.