MERGE Issue In Bulk Insert Sproc - sql

I am getting the following error from my MERGE statement:
The MERGE statement attempted to UPDATE or DELETE the same row more than once.
Can someone help me fix this? I do not understand how to correct it. Here is the procedure:
ALTER PROCEDURE [Files].[ImportFiles]
AS
-- Create a temporary table for the bulk import
CREATE TABLE #TempImportFileTable(
[fileID] [bigint] IDENTITY(1,1) NOT NULL,
[FileName] [nvarchar](max) NULL,
[FilePath] [nvarchar](max) NULL,
[FullPath] [nvarchar](max) NULL,
[FileSize] [nvarchar](max) NULL,
[FileExtension] [nvarchar](max) NULL,
[FileCreated] [nvarchar](max) NULL,
[FileLastAccessed] [nvarchar](max) NULL,
[FileModified] [nvarchar](max) NULL
CONSTRAINT [PK_fileID1] PRIMARY KEY CLUSTERED
(
[fileID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY];
-- Import into the temp table
BULK INSERT #TempImportFileTable FROM 'C:\Program Files\o7th FileSystem to DB\import.txt'
WITH(KEEPIDENTITY, FIELDTERMINATOR =',', ROWTERMINATOR = '\n');
-- Delete the Duplicate entries
DELETE FROM #TempImportFileTable WHERE fileID NOT IN (SELECT MAX(fileID) FROM #TempImportFileTable GROUP BY FullPath);
-- Now Merge the 2 tables
MERGE [Files].[File] AS TargetTable
USING #TempImportFileTable AS SourceTable
ON (TargetTable.FullPath = SourceTable.FullPath)
WHEN NOT MATCHED BY TARGET
THEN INSERT (FileName, FilePath, FileSize, FileExtension, FileCreated, FileLastAccessed, FileModified)
VALUES(SourceTable.FileName, SourceTable.FilePath, SourceTable.FileSize, SourceTable.FileExtension, SourceTable.FileCreated, SourceTable.FileLastAccessed, SourceTable.FileModified)
WHEN MATCHED
THEN UPDATE SET
TargetTable.FileName = SourceTable.FileName,
TargetTable.FilePath = SourceTable.FilePath,
TargetTable.FileSize = SourceTable.FileSize,
TargetTable.FileExtension = SourceTable.FileExtension,
TargetTable.FileCreated = SourceTable.FileCreated,
TargetTable.FileLastAccessed = SourceTable.FileLastAccessed,
TargetTable.FileModified = SourceTable.FileModified;

So you have duplicates in your #TempFileTable.
If FileName + FilePath were enough to make a row unique, you could use this condition for your MERGE:
MERGE [Files].[File] AS TargetTable
USING #TempFileTable AS SourceTable
ON (
ISNULL(TargetTable.FileName,'') = ISNULL(SourceTable.FileName,'')
AND ISNULL(TargetTable.FilePath,'') = ISNULL(SourceTable.FilePath,'')
)
(I don't know why the ISNULL() function didn't work when you tried to call it earlier, but it should definitely work this way, and handle the issues that arise with null values.)
If you really have duplicate rows in your original file and you want to get rid of them, you may use this kind of code:
DELETE FROM #TempFileTable WHERE fileID IN (
SELECT u.fileID FROM (
SELECT fileID, ROW_NUMBER() OVER (PARTITION BY FileName, FilePath ORDER BY fileID) AS r_number
FROM #TempFileTable
) u WHERE u.r_number > 1
)
It is ugly, but when several rows have the same FileName and the same FilePath, it will remove the rows with the higher fileID values and keep only the one with the lowest fileID.
EDIT: About the performance issue, first look at the estimated execution plan and see whether you can add indexes to your tables. You can also try to break the MERGE into one INSERT and one UPDATE statement. MERGE should be better, but in some situations it is actually worse than separate statements.
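As a rough sketch of the two-statement alternative mentioned in the EDIT, the MERGE could be split as shown here. It assumes the temp table name from the question (#TempImportFileTable), the same FullPath match condition and the same column list. The transaction wrapper is an addition for illustration and is not part of the original procedure.

```sql
BEGIN TRANSACTION;

-- Step 1: update the rows that already exist in the target.
UPDATE t
SET t.FileName         = s.FileName,
    t.FilePath         = s.FilePath,
    t.FileSize         = s.FileSize,
    t.FileExtension    = s.FileExtension,
    t.FileCreated      = s.FileCreated,
    t.FileLastAccessed = s.FileLastAccessed,
    t.FileModified     = s.FileModified
FROM [Files].[File] AS t
INNER JOIN #TempImportFileTable AS s
    ON t.FullPath = s.FullPath;

-- Step 2: insert the rows that are not in the target yet.
INSERT INTO [Files].[File]
    (FileName, FilePath, FileSize, FileExtension, FileCreated, FileLastAccessed, FileModified)
SELECT s.FileName, s.FilePath, s.FileSize, s.FileExtension,
       s.FileCreated, s.FileLastAccessed, s.FileModified
FROM #TempImportFileTable AS s
WHERE NOT EXISTS (SELECT 1 FROM [Files].[File] AS t WHERE t.FullPath = s.FullPath);

COMMIT TRANSACTION;
```

Unlike MERGE, UPDATE ... FROM does not raise an error when several source rows match the same target row; it silently applies one of them. Removing duplicates from the temp table first is therefore still worthwhile.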

The MERGE statement has a limitation: it cannot update the same row more than once, or update and delete the same row. So please make sure that each file name appears only once in your source data.
Also, while doing the update you don't have to update the filename column, as it is going to be the same in both the source and the target tables.
You may find a workaround at this link: http://support.microsoft.com/kb/976316
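To confirm whether the source really contains duplicates before running the MERGE, a query along these lines can help. It is only a sketch, and it assumes the temp table and the FullPath match column from the question.

```sql
-- List every FullPath that occurs more than once in the staging table.
-- Any row returned here can make the MERGE touch the same target row twice.
SELECT FullPath, COUNT(*) AS occurrences
FROM #TempImportFileTable
GROUP BY FullPath
HAVING COUNT(*) > 1
ORDER BY occurrences DESC;
```

If this returns no rows, the source is distinct on the match column and the cause of the error lies elsewhere, for example in a match condition that differs from the column used for de-duplication.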

Related

How to improve performance

I have the following table structure:
CREATE TABLE [dbo].[TableABC](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[FieldA] [nvarchar](36) NULL,
[FieldB] [int] NULL,
[FieldC] [datetime] NULL,
[FieldD] [nvarchar](255) NULL,
[FieldE] [decimal](19, 5) NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
I do two types of CRUD operations on this table:
SELECT * FROM [dbo].[TableABC] WHERE FieldA = @FieldA
INSERT INTO [dbo].[TableABC](FieldA,FieldB,FieldC,FieldD,FieldE) VALUES (@FieldA,@FieldB,@FieldC,@FieldD,@FieldE)
FieldA holds unique values, but there is no constraint on the table.
Currently there are 6070755 rows in the table. As the data grows, performance is getting slower.
Any suggestions on how to improve performance? How can I make the CREATE and READ operations faster?
The problem I now face is that SELECT and INSERT take too long, sometimes more than 60 seconds.
Read up on SQL basics; indices are DEFINITELY one of them. If you have a unique value and no index on the field (a constraint is not required; a unique index is good enough), then yes, it will get slower: SQL Server has to check the whole table.
So:
Add a unique index to FieldA.
Given your 2 statements and the remark "FieldA has a unique value, but there is no constraint in the table.", I assume you are trying to enforce unique values there by selecting first. This will slow you down.
Instead, create the index and then try/catch the duplicate-key SQL errors. That is WAY faster. The index will make the insert a LITTLE slower, but you can drop the very slow SELECT entirely.
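The suggestion above could look roughly like this. It is a sketch only: the index name and the sample values are hypothetical, and THROW assumes SQL Server 2012 or later.

```sql
-- Enforce uniqueness in the engine. This statement fails if FieldA already
-- contains duplicates; because FieldA is nullable, at most one NULL is allowed.
CREATE UNIQUE NONCLUSTERED INDEX UX_TableABC_FieldA   -- hypothetical index name
    ON dbo.TableABC (FieldA);
GO

-- Illustrative values; in practice these would be procedure parameters.
DECLARE @FieldA nvarchar(36)   = N'example-key',
        @FieldB int            = 1,
        @FieldC datetime       = GETDATE(),
        @FieldD nvarchar(255)  = N'example',
        @FieldE decimal(19, 5) = 0;

BEGIN TRY
    -- No SELECT beforehand: the unique index rejects duplicates.
    INSERT INTO dbo.TableABC (FieldA, FieldB, FieldC, FieldD, FieldE)
    VALUES (@FieldA, @FieldB, @FieldC, @FieldD, @FieldE);
END TRY
BEGIN CATCH
    -- 2601 and 2627 are the duplicate-key errors for unique indexes and constraints.
    IF ERROR_NUMBER() NOT IN (2601, 2627)
    BEGIN
        THROW;  -- re-raise anything that is not a duplicate-key error
    END
    -- Otherwise the row already exists; handle that as the application requires.
END CATCH;
```

The same index also serves the SELECT on FieldA, which then becomes an index seek instead of a scan of the whole table.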

SQL Server 2014\2016 Instead of reading ONE partition, reading two partitions

I have a problem with partitions.
Instead of reading ONE partition, my query reads two partitions.
I want the query to read only the partitions it needs.
SQL Server version: 2014\2016
--create 2 partitions: the first for value 0, the second for value 1
CREATE PARTITION FUNCTION PF_CreditRequest(bit)
AS RANGE LEFT FOR VALUES(0);
--CREATE SCHEME
CREATE PARTITION SCHEME PS_CreditRequest
AS PARTITION PF_CreditRequest
ALL TO ([PRIMARY]); --I want both partitions in one file.
CREATE TABLE [dbo].[CreditRequest](
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY NONCLUSTERED,
[IsDeleted] [bit] NOT NULL CONSTRAINT [DF_CreditRequest_IsDeleted] DEFAULT ((0)),
[FIO] [nvarchar](100) )
CREATE CLUSTERED INDEX [CI_CreditRequest] ON [dbo].[CreditRequest]
(
[IsDeleted] ASC,
[Id] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
ON [PS_CreditRequest]([IsDeleted]) --split on field IsDeleted
--some test data
insert into CreditRequest ([IsDeleted], [FIO]) values
(1,'Nike'), (0, 'Jane'), (1, 'Patrik')
--here it is ok
SELECT $PARTITION.[PF_CreditRequest](IsDeleted), * FROM CreditRequest
WHERE IsDeleted = 0
--here is the problem: instead of reading ONE partition, it reads two partitions
SELECT $PARTITION.[PF_CreditRequest](IsDeleted), * FROM CreditRequest
WHERE IsDeleted = 1
In order to have an actual split on some column, it should be in the first position in the index:
CREATE CLUSTERED INDEX [CI_CreditRequest] ON [dbo].[CreditRequest]
(IsDeleted, Id)
Also note that in order to effectively use such an index, you need to specify a value for the IsDeleted column in every query that touches this table. Otherwise, all partitions will be affected.
Another subtle point: the literal must be explicitly cast to bit:
SELECT $PARTITION.PF_CreditRequest(IsDeleted), * FROM CreditRequest
WHERE IsDeleted = cast(1 as bit)
In this case, only one partition will be read.
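Two small checks may help when experimenting with this. They are a sketch based on the objects defined in the question; the variable name is hypothetical. The first query shows how rows are distributed across the partitions of the clustered index, and the second uses a typed variable instead of a cast on the literal.

```sql
-- Row counts per partition of the clustered index (index_id = 1).
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'dbo.CreditRequest')
  AND p.index_id = 1
ORDER BY p.partition_number;

-- A bit variable avoids comparing the bit column with an int literal.
DECLARE @IsDeleted bit = 1;  -- hypothetical variable name
SELECT $PARTITION.PF_CreditRequest(IsDeleted) AS PartitionNumber, *
FROM dbo.CreditRequest
WHERE IsDeleted = @IsDeleted;
```

Whether only one partition is actually read is best confirmed from the partition information in the actual execution plan rather than assumed from the query text.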

temporary table | multi-part identifier could not be bound

Every other article I see has something to do with JOINs... I'm not even trying to do a join! I'm just trying to run a simple UPDATE based on information in a temporary table. Here's the code...
BEGIN TRAN ArchiveMigration
-- insert into temporary table
CREATE TABLE #tblTemp(
[theID] [int] NOT NULL,
[ScheduleID] [int] NOT NULL,
[OverridingCustomerID] [int] NOT NULL,
[Timestamp] [datetime] NOT NULL,
[DeviceName] [nvarchar](max) NULL,
[DestinationTempCool] [int] NULL,
[DestinationMode] [nvarchar](max) NULL,
[DestinationTempHeat] [int] NULL,
CONSTRAINT [PK_#tblTemp] PRIMARY KEY CLUSTERED
(
[theID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
INSERT INTO #tblTemp ([theID], [ScheduleID], [OverridingCustomerID], Timestamp, DeviceName, DestinationTempCool, DestinationMode, DestinationTempHeat)
SELECT Id, ScheduleId, OverridingCustomerId, Timestamp, DeviceName, DestinationTempCool, DestinationMode, DestinationTempHeat
FROM CustomerScheduleOverride
WHERE Id = 836;
-- modify the extended info table
UPDATE ExtendedOverrideInfo
SET ExtendedOVerrideInfo.OverrideId = Null
WHERE ExtendedOverrideInfo.OverrideId = #tblTemp.[theID];
COMMIT TRAN
All I want to do is nullify the values of ExtendedOverrideInfo.OverrideId if said ID exists in #tblTemp (the statement is towards the bottom of the script). Any idea why I might be getting this message? Thanks in advance!
Your current UPDATE syntax is incorrect; you will need to use a JOIN on your temporary table. This article from Pinal Dave provides a more detailed explanation.
UPDATE ExtendedOverrideInfo
SET ExtendedOverrideInfo.OverrideId = Null
FROM ExtendedOverrideInfo
INNER JOIN #tblTemp t on t.[theID]=ExtendedOverrideInfo.OverrideId
Your UPDATE statement is wrong: the WHERE clause refers to #tblTemp, which is not part of the statement. You have several options to resolve the problem:
join to the temp table
use EXISTS in your WHERE clause
Or, if the only purpose of the temp table is to nullify these values, why not use a cursor, or change your WHERE clause to look up the record by ID?
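The EXISTS option could be written roughly as follows. It is a sketch that uses the table and column names from the question.

```sql
-- Nullify OverrideId for every row whose ID is present in the temp table.
UPDATE ExtendedOverrideInfo
SET OverrideId = NULL
WHERE EXISTS (
    SELECT 1
    FROM #tblTemp AS t
    WHERE t.theID = ExtendedOverrideInfo.OverrideId
);
```

Because the temp table appears inside the subquery, it is in scope for the comparison, which is what the original WHERE clause was missing.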

SQL Statement takes a long time to execute

I have a SQL Server database with a table containing a very large number of records. It used to work fine, but now the SQL statement takes a long time to execute.
Sometimes it also causes SQL Server to use too much CPU.
This is the definition of the table.
CREATE TABLE [dbo].[tblPAnswer1](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[AttrID] [int] NULL,
[Kidato] [int] NULL,
[Wav] [int] NULL,
[Was] [int] NULL,
[ShuleID] [int] NULL,
[Mwaka] [int] NULL,
[Swali] [float] NULL,
[Wilaya] [int] NULL,
CONSTRAINT [PK_tblPAnswer1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
And below is the stored procedure that runs the statement.
ALTER PROC [dbo].[uspGetPAnswer1](@ShuleID int, @Mwaka int, @Swali float, @Wilaya int)
as
SELECT ID,
AttrID,
Kidato,
Wav,
Was,
ShuleID,
Mwaka,
Swali,
Wilaya
FROM dbo.tblPAnswer1
WHERE [ShuleID] = @ShuleID
AND [Mwaka] = @Mwaka
AND [Swali] = @Swali
AND Wilaya = @Wilaya
What is wrong with my SQL statement? I need help.
Just add an index on the ShuleID, Mwaka, Swali and Wilaya columns. The order of the columns in the index should depend on the distribution of the data (the column with the most diverse values should come first, and so on).
And if you need it super fast, also include all the remaining columns used in the query, so that you have a covering index for this particular query.
EDIT: You should probably move the float column (Swali) from the indexed columns to the included columns.
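A covering index along these lines would match that description. It is a sketch: the index name and the key column order are assumptions, and the order should be adjusted to the actual data distribution.

```sql
-- Key columns: the equality predicates from the WHERE clause.
-- Included columns: Swali (per the EDIT) plus the remaining selected columns.
-- ID is the clustered key, so it is carried in the index automatically.
CREATE NONCLUSTERED INDEX IX_tblPAnswer1_ShuleID_Mwaka_Wilaya   -- hypothetical name
    ON dbo.tblPAnswer1 (ShuleID, Mwaka, Wilaya)
    INCLUDE (Swali, AttrID, Kidato, Wav, Was);
```

With this index the procedure can seek on the three key columns, filter on Swali within the index, and return every selected column without touching the clustered index.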
Add an Index on the ID column and include ShuleID, Mwaka, Swali and Wilaya columns. That should help improve the speed of the query.
CREATE NONCLUSTERED INDEX IX_ID_ShuleID_Mwaka_Swali_Wilaya
ON tblPAnswer1 (ID)
INCLUDE (ShuleID, Mwaka, Swali, Wilaya);
What is the size of the table? You may need additional indices as you are not using the primary key to query the data. This article by Pinal Dave provides a script to identify missing indices.
http://blog.sqlauthority.com/2011/01/03/sql-server-2008-missing-index-script-download/
It provides a good starting point for index optimization.

sometimes Identity isn't working

I have the following table:
CREATE TABLE [dbo].[test_table]
(
[ShoppingCartID] [int] IDENTITY(1,1) NOT NULL,
[CartTimeoutInMinutes] [int] NOT NULL,
[MaximumOrderLimitPerUser] [int] NOT NULL,
[MaximumOrderLimitPerSession] [int] NOT NULL,
CONSTRAINT [PK_test_table] PRIMARY KEY CLUSTERED
(
[ShoppingCartID] ASC
)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
ON [PRIMARY]
GO
Sometimes the identity column isn't working as expected: sometimes it starts at 0 and sometimes it starts at 1.
Thank you in advance.
How are you putting the data in there? If you are using regular INSERT it should start at 1. You can, however, bulk-insert into the table, or otherwise use identity-insert; in which case all bets are off:
create table test (
id int not null identity(1,1),
name varchar(20) not null)
set identity_insert test on
insert test (id, name) values (0, 'abc')
insert test (id, name) values (27, 'def')
set identity_insert test off
select * from test
with output:
id name
----------- --------------------
0 abc
27 def
Or is the problem related to @@IDENTITY (in which case: use SCOPE_IDENTITY() instead)?
Possible
Are you using DBCC CHECKIDENT? This is invoked by some data compare tools (eg Red Gate) and has the following behaviour:
DBCC CHECKIDENT ( table_name, RESEED, new_reseed_value )
Current identity value is set to the new_reseed_value.
If no rows have been inserted into the table since the table was created, or if all rows have been removed by using the TRUNCATE TABLE statement, the first row inserted after you run DBCC CHECKIDENT uses new_reseed_value as the identity. Otherwise, the next row inserted uses new_reseed_value + the current increment value.
Or: are you using SET IDENTITY_INSERT?
These assume you are looking at the table, rather than using @@IDENTITY (as Mark suggested).
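To narrow down which of these cases applies, something like the following can be run against the table from the question. It is a sketch, and the inserted values are purely illustrative.

```sql
-- Report the current identity value without changing it.
DBCC CHECKIDENT ('dbo.test_table', NORESEED);

-- Insert a row and read back the identity value generated in this scope.
INSERT INTO dbo.test_table
    (CartTimeoutInMinutes, MaximumOrderLimitPerUser, MaximumOrderLimitPerSession)
VALUES (30, 5, 2);  -- illustrative values

SELECT SCOPE_IDENTITY() AS NewShoppingCartID;
```

If a tool has reseeded an empty table to 0, the first inserted row gets 0, which is consistent with the DBCC CHECKIDENT behaviour quoted above.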