Inserting data into a SQL Server table with no primary key or clustered index intermittently fails with a strange bulk load error - sql

We have a stored procedure in SQL Server which inserts data from a staging table into a fact table. The staging data is first joined onto various dimension tables to pick up their id columns.
Intermittently, it will throw the following error:
('42000', '[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Cannot bulk load. The bulk data stream was incorrectly specified as sorted or the data violates a uniqueness constraint imposed by the target table. Sort order incorrect for the following two rows: primary key of first row: (305603, 0xedd9f90001001a00), primary key of second row: (245634, 0x0832680003003200). (4819) (SQLExecDirectW)')
(The two rows mentioned in the error message usually change each time the error occurs)
The fact table doesn't have any primary key or clustered index, although it does have a few non-clustered indexes, so the error message is really confusing me.
The dimension tables DO have primary keys, but the fact table has no foreign keys related to them.
I have searched for hours on the internet but don't seem to have found a solution.
Does anybody know why I would receive such an error in this case? Any help would be appreciated. We are using Azure SQL Database.
I am aware that the database is badly designed; however, it was created before my time.
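For anyone hitting the same message: the wording "the data violates a uniqueness constraint imposed by the target table" makes it worth double-checking that none of the non-clustered indexes is actually unique. A diagnostic sketch, assuming the fact table is dbo.FactTable (substitute the real table name):

-- List any unique indexes or constraints on the target table
SELECT i.name, i.type_desc, i.is_unique, i.is_primary_key
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'dbo.FactTable')
  AND i.is_unique = 1;

If this returns rows, the bulk load still has a uniqueness requirement to satisfy, even though the table has no primary key or clustered index.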

Related

SQL Server constraint enforcement being violated for split seconds

I have an on-premises table with about 21 million rows and a primary key constraint, and when I search that table, there are no duplicates. This table is in an OLTP application database that is constantly moving.
I have the exact same table in Azure with the same primary key constraint. This table is not an application table; it's just a copy of the on-premises one (the goal is to use it for ad hoc queries, as a source for other systems, etc.).
When I use Azure Data Factory to copy all columns from the on-premises table to the table in Azure, it returns a violation of the primary key constraint. No matter how many times I run this Data Factory pipeline, it comes back with a primary key violation for duplicate keys (though the offending keys change each time).
So I dropped the primary key constraint in Azure and ran the pipeline again, and sure enough, duplicates exist.
Upon investigation, it appears that the on-premises application inserts a new record and then updates the old record to inactivate it. So for a fraction of a second there are two active rows, which ADF grabs and then tries to insert into the table in Azure, which of course fails because of duplicate primary keys.
Now, to the best of my knowledge, this shouldn't be possible. You can't insert a new row that violates the primary key constraint. But ADF seems to be grabbing all the data, and some of those rows are mid-flight, where the insert has happened but the update to inactivate the old row hasn't happened yet.
For those who are curious, the insert and the update of the old row happen within less than a second, typically 10-20 microseconds apart. I don't know how this is possible, and I don't know how to fix it (because I can't modify the application code). The on-premises database is SQL Server 2000, and the destination is an Azure SQL database.
Try the READPAST hint. It skips rows that are currently locked by other transactions, so it should not pick up rows in a locking state.
SELECT * FROM yourtable WITH (READPAST);
Since you have created_date and updated_date columns, you can also select only rows older than five seconds to avoid the duplication:
SELECT * FROM yourtable
WHERE created_date <= DATEADD(second, -5, GETDATE())
  AND updated_date <= DATEADD(second, -5, GETDATE());
You need to enable fault tolerance in the Azure Data Factory pipeline.
Copy data from a source SQL database to a sink SQL database. A primary key is defined in the sink database, but no such primary key is defined in the source SQL Server, so the duplicated rows that exist in the source cannot be copied to the sink. The copy activity copies only the first row of the source data into the sink; the subsequent source rows that contain the duplicated primary key value are detected as incompatible and are skipped.
To skip the incompatible rows, set "enableSkipIncompatibleRow": true in the copy activity's JSON definition.
Please refer to: https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-fault-tolerance
If it is possible to modify your application, you should check for the primary key before each insert or update using EXISTS().
Example:
IF EXISTS (SELECT 1 FROM Table_Name WHERE primary key condition)
BEGIN
    UPDATE Table_Name
    SET Col_Name = @value
    WHERE primary key condition;
END
ELSE
BEGIN
    INSERT INTO Table_Name (Col_Name1, Col_Name2)
    VALUES (@value1, @value2);
END
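Note that a separate EXISTS check followed by an INSERT is racy under concurrency: two sessions can both pass the check and then collide on the insert. A sketch of a safer variant that holds a key-range lock for the duration of the transaction (the table and column names are placeholders, as above):

BEGIN TRANSACTION;
IF EXISTS (SELECT 1 FROM Table_Name WITH (UPDLOCK, HOLDLOCK)
           WHERE primary key condition)
BEGIN
    UPDATE Table_Name SET Col_Name = @value WHERE primary key condition;
END
ELSE
BEGIN
    INSERT INTO Table_Name (Col_Name1, Col_Name2) VALUES (@value1, @value2);
END
COMMIT TRANSACTION;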

Foreign key relationship can't be created (SQL Server)

I created a SQL Server database first (two tables) and then tried to load data through an SSIS data flow task. At the last step, an error occurred.
When I remove the relationship between the two tables in the database, the SSIS task completes successfully and the data is loaded! But after I load data into the tables, I can't create the relationship between them.
Based on this, it seems a relationship can only be created while there is no data in the table. Just to mention, the data types are the same in both tables.
How could I work out a solution?
Thank you!
It seems the error in SSIS is due to a foreign key violation. The purpose of the foreign key relationship is to prevent you from loading bad data. When you loaded without the FK, you inserted bad data and cannot create a (trusted) foreign key constraint afterward.
The solution is to either fix the source data or modify your package to avoid inserting data that doesn't exist in the referenced table. The latter can be done with a Lookup task, sending found rows down the happy path to the target table. You could either ignore the not-found rows or write them to an error table or file.
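If you want to see which rows would block the constraint, an anti-join against the referenced table finds them (a sketch; dbo.ChildTable, dbo.ParentTable, and ParentId are placeholders for your actual schema):

-- Rows in the referencing table whose key has no match in the referenced table
SELECT c.*
FROM dbo.ChildTable AS c
LEFT JOIN dbo.ParentTable AS p ON p.ParentId = c.ParentId
WHERE p.ParentId IS NULL;

Once those rows are corrected or removed, the foreign key can be created as a trusted constraint.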

PostgreSQL returns blank

I'm trying to create a foreign key. When I execute the query below, it returns blank, with neither an explanation nor an error.
alter table MySQL."GrossHomeSales"
add constraint fk_zip_code
foreign key (nhs_prev_zip) references MySQL."Location" (zip_code);
You have not mentioned what your SQL client is. If you are using psql, it does say ALTER TABLE when you successfully alter a table. If instead you are using pgAdmin III, it says something like:
Query returned successfully with no result in 345 msec.
One reason you might not see any output is that this is a really large table with several million rows; validating the new constraint against that much data can take a few minutes, so don't expect instant output.
The other likely reason for not receiving an immediate response is that the table has been locked by another long-running query.
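You can check whether your ALTER TABLE is stuck waiting on another session's lock (a sketch, assuming PostgreSQL 9.6 or later for pg_blocking_pids):

-- Sessions that are blocked, and the PIDs blocking them
SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;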
Finally, your question title says postgresql, but your table has MySQL in its name? (This is not related, just curious.)

SQL complains about a non-existing constraint

I'm really stumped on this one. We had a table with a two-part primary key; the parts were a review_id (a foreign key from another table) and a timestamp. Whoever designed this table didn't realize that some situations could result in two entries having identical timestamps, and I was getting "ORA-00001: unique constraint" errors.
However, as this table was a log, it had no real need for a primary key in the first place, so I removed the primary key constraint. Despite this constraint no longer existing, I'm still getting the same error.
I've tried adding elements to the PK to prevent the conflict, as well as restoring the constraint but disabling it. Oracle SQL Developer insists that the database reflects the changes I've made, but the behavior suggests it's still using the original PK. I thought it might be a caching problem, but even a complete reboot of my computer doesn't change anything.
Any advice is appreciated.
An example of the commands I've run:
alter table "DATABASE"."DB_REVIEW_LOG" drop constraint "DB_REVIEW_LOG_PK";
update database.db_review_log set review_id=17494 where review_id = 17495;
and this is what I get back:
Error starting at line : 2 in command -
update database.db_review_log set review_id=17494 where review_id = 17495
Error report -
SQL Error: ORA-00001: unique constraint (DATABASE.DB_REVIEW_LOG_PK) violated
00001. 00000 - "unique constraint (%s.%s) violated"
*Cause:    An UPDATE or INSERT statement attempted to insert a duplicate key.
           For Trusted Oracle configured in DBMS MAC mode, you may see
           this message if a duplicate entry exists at a different level.
*Action:   Either remove the unique restriction or do not insert the key.
It turns out that, despite the message complaining about a constraint, the problem was a unique index with the same name.
Thank you, Justin Cave
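For anyone who hits the same thing: in Oracle, dropping a constraint keeps its backing index if that index existed before the constraint (or was created independently), so a unique index can outlive the primary key. A sketch for finding and removing it, using the names from the example above:

-- Check whether a unique index with the old constraint name still exists
SELECT index_name, uniqueness
FROM all_indexes
WHERE table_name = 'DB_REVIEW_LOG';

-- Drop the leftover unique index explicitly
DROP INDEX "DATABASE"."DB_REVIEW_LOG_PK";

Alternatively, ALTER TABLE ... DROP CONSTRAINT ... DROP INDEX removes both in one statement.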

EF5 generates SQL Server CE constraints with dot in name

I am building a .NET disconnected client-server application that uses Entity Framework 5 (EF5) to generate a SQL Server CE 4.0 database from POCOs. The application allows the user to perform a bulk copy of data from the network SQL Server into the client's SQL Server CE database. This is very (VERY) slow, due to the constraints and indexes created by EF5. Temporarily dropping the constraints and indexes will reduce the 30-minute wait to 1 minute or less.
Before starting the bulk copy, the application executes queries to drop the constraints and indexes from the SQL Server CE tables. However, the commands fail, because the constraint names EF5 creates include the table's schema name, a dot, and the table name. The dot in the constraint name causes the DROP command to fail with a parsing error.
For example, POCO Customer creates table dbo.Customer with the primary key constraint PK_dbo.Customer_Id. The database performs as expected.
However, upon executing the non-query:
ALTER TABLE Customer DROP CONSTRAINT PK_dbo.Customer;
SQL Server Compact ADO.NET Data Provider returns an error:
There was an error parsing the query.
[ Token line number = 1, Token line offset = 57, Token in error = . ]
Of course, using a secondary DataContext that has no foreign keys to generate the database without the constraints, and then adding them later, works; but that requires maintaining two DataContext objects and hopefully not forgetting to keep both updated. Therefore, I am looking for one of two solutions:
Compose the DROP statement in such a way that the . character is parsed correctly
Prevent EF5 from using the . character in constraint and index names
Thank you in advance for your help!
Wrap that bad boy in []. The brackets tell the parser that everything inside is the identifier name.
ALTER TABLE Customer DROP CONSTRAINT [PK_dbo.Customer];
It should run fine.
Personally, I just wrap every identifier in brackets to avoid this exact issue, so I would write the query like this:
ALTER TABLE [Customer] DROP CONSTRAINT [PK_dbo.Customer];
I think it's more readable that way, because you can instantly see the identifiers.
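If you need to drop every EF-generated key constraint before the bulk copy, one option is to generate the bracketed DROP statements from the metadata views and execute them from your .NET code (a sketch, assuming SQL Server Compact exposes INFORMATION_SCHEMA.TABLE_CONSTRAINTS the way full SQL Server does):

-- Generate one bracketed DROP CONSTRAINT statement per key constraint
SELECT 'ALTER TABLE [' + TABLE_NAME + '] DROP CONSTRAINT [' + CONSTRAINT_NAME + '];'
FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS
WHERE CONSTRAINT_TYPE IN ('PRIMARY KEY', 'FOREIGN KEY');

Because every identifier is wrapped in brackets, the generated statements survive the dot in the EF5 names.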