I have a distributed data base with two nodes. I have a table like this one in node2 (only in this node):
CREATE TABLE table2
(
cod_proveedor CHAR(15) REFERENCES proveedor(cod_proveedor),
cod_articulo CHAR(15) REFERENCES articulo(cod_articulo),
);
Now, I have the tables "articulo" in node1 and node2.
As we see, I am doing REFERENCES to nodo2.proveedor and nodo2.articulo because my table "table2" is in this node "node2".
I gotta do reference to nodo1.proveedor when I am creating the table but I don't know how...
Can you help me?
If "distributed database" means that you have two separate databases, you cannot create foreign key constraints in one database that references a table in another database.
You could create a materialized view in database 2 that pulls all the proveedor data from database 1 to database 2 and then create a foreign key constraint in database 2 that references the materialized view. Of course, since there would be a lag between when new data was written to the table on database 1 and when the materialized view was updated on database 2 that you could have windows where a child row couldn't be written despite the parent row existing on database 1. And if you deleted a row in database 1, you wouldn't find out whether there were child rows that would be orphaned until you tried to replicate that change to database 2. You'll need to write a lot of code to detect and to resolve these sorts of errors.
In Oracle, it would generally make far more sense to create a single database using RAC (Real Application Clusters) that is mounted on multiple physical servers. That would allow you to distribute the load across the database servers where each server has access to the full contents of the database rather than distributing subsets of data to different nodes.
Related
I have a table on premise that is about 21 million rows with a primary key constraint and when I search that table, there are no duplicates. This table is in an OLTP application database that is constantly moving.
I have the exact same table in Azure which has the same primary key constraint. This table is not an application table, it's just a copy of the one that is on-premise (the goal is to use this one for ad hoc queries, as a source for other systems, etc.).
When I use Azure Data Factory to select all_columns from table on premise to the table in Azure, it returns a violation of the primary key constraint. No matter how many times I run this data factory pipeline, it comes back with a primary key violation for duplicate keys (the keys are always changing though).
So I dropped the primary key constraint in Azure and ran the pipeline again, and sure enough, duplication exists.
Upon investigation, it appears that the on-premise database is doing an insert new record then update the old record to inactivate it. So for a fraction of a second, there are two active rows that ADF is grabbing to then try to insert into the table in Azure which of course fails because of duplicate primary keys.
Now to the best of my knowledge, this shouldn't be possible. You can't insert a new row that violates the primary key constraint. But ADF seems to be grabbing all the data and some of those rows are mid-flight where the insert has happened and the update to inactivate the old row hasn't happened yet.
For those that are curious, the insert happens and the update of the old row happens within less than a second... it's typically 10-20 microseconds. I don't know how this is possible and I don't know how to fix it (because I can't modify the application code). The database for the on-premise database is a SQL Server 2000 database and Azure SQL is an Azure SQL database.
Try with readpast hint. It should not select any rows in locking state.
SELECT * FROM yourtable WITH (readpast)
Since you have create_date and updated_date column then you can select rows older than 5 seconds to avoid duplication.
select * from yourtable where created_date<=dateadd(second,-5,getdate()) and updated_date<=dateadd(second,-5,getdate());
Need to enable the Fault tolerance in a Pipeline Azure Data Factory
Copy data from a Source SQL to a Sink SQL database. A primary key is defined in the sink SQL database, but no such primary key is defined in the source SQL server. The duplicated rows that exist in the source cannot be copied to the sink. Copy activity copies only the first row of the source data into the sink. The subsequent source rows that contain the duplicated primary key value are detected as incompatible and are skipped.
To configure Json Definition skip the incompatible rows in copy activity "enableSkipIncompatibleRow": true
Please Refer: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance
If possible to modify your application, need to check the Primary key constraint before insert or update using EXISTS() function.
Example:
IF EXISTS(SELECT * FROM Table_Name WHERE primary key condition)
BEGIN
UPDATE Table_Name
SET Col_Name= value
WHERE condition
END
ELSE
BEGIN
INSERT INTO Table_Name ( col_Name1,col_Name2,,.. )
VALUES ( ‘’,’’,’’,….)
END
i have a large query (Web Dashboard Query) with many temporary tables.i have created indexes on the temp table.the application that is using the query has a user management module with different levels of permissions.My question is are indexes created per session like the temp db ?
i don't want the indexes to be shared across sessions.
i have been doing something like
EXEC('CREATE INDEX idx_test'+ #sessionId + 'ON #TempTable (id1,id2)');
is this necessary. i have seen it done by some developers.
Indexes on temporary tables (#t, not ##t) are not shared across the sessions, and there is no need to invent a unique index name to an index on a temporary table.
What is different (and may be you have seen in the code from other developers) is CONSTRAINT NAME. Index name can be repeated many times for different tables, but a constraint name must be unique within the database.
So maybe you see the code for stored procedures that create a constraint name with reference to a session, this is an attempt to give a unique name to a constraint. Because if you launch a stored procedure that creates a temp table #t in two sessions, every session create it's own table with it's own name(not just #t, the system is adding additional symbols to a table name that makes it unique)
But if the same proc tries to create a CONSTRAINT PK_t, the first session will succeeded but the second will get an error that the constraint PK_t already exists in the database(tempdb)
Situation:
I have quite a few tables created in SQL
These tables are linked to MS Access
I can very easily "add" new entries to all but one of the tables
This table includes a Foreign key reference (but so do others)
All tables are created the same way and linked the same way
Problem:
I cannot add entries in the Spreadsheet View in Access. Generally you have some sort of entry like (where there is a * as well as empty row beneath the table you can click in and begin typing)
However this table looks like:
Right clicking on a record has "New Record" and "Delete Record" grayed out, while I can use this on other tables
I am creating the table using:
CREATE TABLE ProjectApprovers (
ProjectCode varchar(50) FOREIGN KEY REFERENCES ProjectCodes(ProjectCode),
RACFApprover varchar(50)
);
The reason I am confused is that it does not appear to be a SQL permissions problem because I can run the following code in Access:
INSERT INTO ProjectApprovers (ProjectCode,RACFApprover) VALUES ('ValidProjectCode','test123');
It seems these restrictions are only limited to the spreadsheet view. Additionally, identical syntax is used to create other tables which do not have this problem.
I am using this code to link my database tables.
Is something like this a permission problem? I have never referenced this problem table with permissions.
If Access doesn't recognize a primary key in the linked table, it will present the table as read-only in Datasheet View.
Fix this by adding a primary key in SQL Server. Then recreate the link in Access so it can notice the changed table structure.
Scenario:
I have a set of test data that needs to be deployed to our build server daily (our build server database is first overwritten with the current live database, and has all data over a month old removed).
This test data has foreign key references within it which need to stay.
I can't simply switch on IDENTITY_INSERT as the primary keys may clash with data that is already in the database (because we aren't starting from a blank database).
The test data needs to be able to be regenerated fairly regularly, so the though of going through the deploy script and fudging the id columns to be something outlandish (or a negative number for instance) and then changing the related foreign key columns to be the same id every time we regenerate the data doesn't thrill me.
Ideally I would like to know if there is a tool which can scan a database, pick up the foreign key constraints and generate the insert scripts accordingly, something like:
INSERT INTO MyTable VALUES('TEST','TEST');
DECLARE #Id INT;
SET #Id = (SELECT ##IDENTITY)
INSERT INTO MyRelatedTable VALUES(#Id,'TEST')
It sounds like you want to look into an ETL process that copes with the change in id. As you're using SQL Server, you can look at the OUTPUT clause - use this to build up some temporary tables that can map the "old" id to the "new" id for each primary key to map the foreign keys when migrating the "child" tables.
We have developed a large data migration from one DB schema to the other. We had built it based on the idea that the destination DB would be empty, however months ago we started putting clients on the new application which means their data is being housed in the new schema (the destination DB).
Now we're in a situation where the primary keys could overlap from the source to the destination DB and we're struggling to come up with a solution. The only solution I can think of is to check if the ID exists in the destination, updated the ID in the source to be 1 more than the greatest ID in the destination, and then migrate the record. This seems really cumbersome to have to do for hundreds of tables. Any ideas?
Sorry I don't know anything about SSIS but the following are a few ways to solve the problem using SQL.
When inserting into the destination tables, do not insert identities. As rows are inserted, capture the newly inserted identities and the old identities in a mapping table, see MERGE + OUTPUT INTO. Use the mapping table to update the tables that haven't been inserted, substituting the old identities with the new identities.
Of course for this to work, insertion into tables has to be done in an order that won't cause foreign key or constraint violations.
If you're not into doing all that, and you can lock users out of tables for short periods of time, DBCC CHECK INDENT could be used to 'reserve' identities. These new identities can then be used to update the old data and then insert with SET IDENTITY_INSERT ON.