Why is a SQL Server join taking too long to execute? - sql

I am populating a temporary table with data; below is its definition.
DECLARE @PurgeFilesList TABLE
(
JobFileID BIGINT,
ClientID INT,
StatusID INT,
IsPurgeSuccessfully BIT,
ReceivedDate DATETIME,
FilePath VARCHAR(2000),
StatementPath VARCHAR(2000)
)
After the insert logic that populates the temp table, I make an additional join to a table named Account:
SELECT
JobFileID,
PFL.ClientID,
StatusID,
IsPurgeSuccessfully,
ReceivedDate,
CASE
WHEN FilePath IS NULL THEN StatementPath
ELSE FilePath
END 'FilePath'
FROM
@PurgeFilesList PFL
INNER JOIN
Account(NOLOCK) A ON ISNULL(PFL.ClientID, 0) = ISNULL(A.ClientID, 0)
AND A.HoldStatementPurge = 0
However, this join takes far too long, even though the Account table has fewer than 5,000 rows.
Account table schema:
Column_name          Type      Computed  Length
-----------------------------------------------
AccountID            bigint    no        8
AccountNumber        varchar   no        32
PrimaryCustomerName  varchar   no        100
LastName             varchar   no        100
ClientName           varchar   no        32
BankID               varchar   no        32
UpdatedDate          datetime  no        8
IsPurged             bit       no        1
PurgeDate            datetime  no        8
ClientID             int       no        4
HoldStatementPurge   bit       no        1
Kindly let me know, if any other info is required.
Execution Plan:

Since you are not using any columns from Account, I would use EXISTS:
select fl.JobFileID, fl.ClientID, fl.StatusID,
fl.IsPurgeSuccessfully, fl.ReceivedDate,
isnull(FilePath, StatementPath) as FilePath
from @PurgeFilesList fl
where fl.ClientID is null or
exists (select 1
from Account a
where a.clientid = fl.clientid and a.HoldStatementPurge = 0
);
For performance, an index on Account(ClientID, HoldStatementPurge) would help, and likewise an index on ClientID in the table variable. Make sure the table variable holds only a small amount of data; if it does not, use a temporary table instead and give it an appropriate index.

Your Account schema is missing the nullable yes/no information. That said, I assume Account.ClientID is not nullable, in which case ISNULL(PFL.ClientID, 0) = A.ClientID would do as well.
My guess is that you are missing a couple of well-placed indexes, such as:
CREATE INDEX IX_Account_ClientID_HoldStatementPurge ON Account(ClientID, HoldStatementPurge)
Or just
CREATE INDEX IX_Account_ClientID ON Account(ClientID)
I'd say try each of them, checking the query plan before and after.
Also, you might want to use a temporary table (CREATE TABLE #TempTable ...) for this scenario instead of a table variable (DECLARE @TempTable TABLE ...) so you can apply an additional index to speed things up:
CREATE TABLE #PurgeFilesList
(
JobFileID BIGINT PRIMARY KEY,
ClientID INT,
StatusID INT,
IsPurgeSuccessfully BIT,
ReceivedDate DATETIME,
FilePath VARCHAR(2000),
StatementPath VARCHAR(2000)
)
CREATE INDEX #IX_PurgeFilesList_ClientID ON #PurgeFilesList(ClientID)
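The reason for this is that you cannot run CREATE INDEX against a table variable. Its indexes can only be declared inline in the DECLARE statement, through PRIMARY KEY or UNIQUE constraints (or, from SQL Server 2014 onward, inline INDEX clauses).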
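As a hedged aside: on SQL Server 2014 and later, a table variable can declare a nonclustered index inline as part of its definition, which may be enough when switching to a temp table is not an option. A minimal sketch, assuming that version; the index name IX_ClientID is made up for illustration:

```sql
-- Assumes SQL Server 2014 or later (inline index syntax on table variables).
DECLARE @PurgeFilesList TABLE
(
    JobFileID BIGINT PRIMARY KEY,
    ClientID INT INDEX IX_ClientID NONCLUSTERED, -- hypothetical index name
    StatusID INT,
    IsPurgeSuccessfully BIT,
    ReceivedDate DATETIME,
    FilePath VARCHAR(2000),
    StatementPath VARCHAR(2000)
);
```

Even with the index, a table variable carries no column statistics, so the optimizer's row estimates can stay poor once it holds more than a handful of rows.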

Please check how many records the table @PurgeFilesList holds.
Try using a temp table instead of a table variable.

Related

Store more characters than the assigned column size in SQL Server

Can I do something like this? The column is of type nchar(8), but the string I want to store in it is longer than that.
The reason is that I want to copy data from one table to another. Table A is nchar(8) and Table B is nvarchar(100). I want all the characters in Table B transferred to Table A without losing a single character.
If the nvarchar(100) contains only latin characters with a length up to 16 chars, then you can squeeze the nvarchar(100) into the nchar(8):
declare @t table
(
col100 nvarchar(100),
col8 nchar(8)
);
insert into @t(col100) values('1234567890123456');
update @t
set col8 = cast(cast(col100 as varchar(100)) as varbinary(100));
select *, cast(cast(cast(col8 as varbinary(100)) as varchar(100)) as nvarchar(100)) as from8to100_16charsmax
from @t;
If you cannot modify A, then you cannot use it to store the data. Create another table for the overflow . . . something like:
create table a_overflow (
a_pk int primary key references a(pk),
column nvarchar(max) -- why stop at 100?
);
Then, you can construct a view to bring in the data from this table when viewing a:
create view vw_a as
select . . . , -- all the other columns
coalesce(ao.column, a.column) as column
from a left join
a_overflow ao
on ao.a_pk = a.pk;
And, if you really want to "hide" the view, you can create an INSTEAD OF INSERT trigger on vw_a, which inserts the appropriate values into the two tables.
This is a lot of work. Simply modifying the table is much simpler. That said, this approach is sometimes needed when you need to modify a large table and altering a column would incur too much locking overhead.
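For illustration, the overflow table, view, and trigger might fit together roughly as follows. This is only a sketch under assumed names: the column val, the trigger name trg_vw_a_insert, and the two-column layout of a are hypothetical stand-ins for the real schema, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical schema: a.val is the short column that cannot be altered.
CREATE TABLE dbo.a (pk INT PRIMARY KEY, val NCHAR(8));
CREATE TABLE dbo.a_overflow (
    a_pk INT PRIMARY KEY REFERENCES dbo.a(pk),
    val NVARCHAR(MAX)
);
GO
-- The view prefers the overflow value when one exists.
CREATE VIEW dbo.vw_a AS
SELECT a.pk, COALESCE(ao.val, a.val) AS val
FROM dbo.a AS a
LEFT JOIN dbo.a_overflow AS ao ON ao.a_pk = a.pk;
GO
CREATE TRIGGER dbo.trg_vw_a_insert ON dbo.vw_a
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- The base table always gets the first 8 characters.
    INSERT INTO dbo.a (pk, val)
    SELECT pk, LEFT(val, 8) FROM inserted;
    -- Values longer than 8 characters are also stored in full.
    INSERT INTO dbo.a_overflow (a_pk, val)
    SELECT pk, val FROM inserted WHERE LEN(val) > 8;
END;
GO
-- Usage: callers insert through the view only.
INSERT INTO dbo.vw_a (pk, val) VALUES (1, N'short'), (2, N'a value longer than eight');
```

Updates and deletes through the view would need their own INSTEAD OF triggers, which is part of why this approach is more work than altering the column.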

Return the value of a column in another table matching a criteria

I have two tables that share 3 columns, although their rows differ. One of those columns holds names, and the same name can appear with different values in the other 2 columns, which causes ambiguity in my reports.
My plan was to take my first table, table A, and pull the values from table B into it through a calculated column backed by a function (since tables A and B are both updated weekly).
Now this is my attempt to create a function:
CREATE FUNCTION dbo.ReturnFlag
(
@name nvarchar(max)
, @date smalldatetime
, @DeptID nvarchar(max)
)
RETURNS bit as
begin
return [SpecialFlag]
from TABLEB
where [Name] = @name and [Date] = @date and [department] = @DeptID
end
However, the syntax is of course incorrect. Joining the tables might be an option, but again, both table A and table B are updated weekly and I do not want to redo the join every week. Tables A and B are also used by other reporting software. My objective is to add a computed column that uses the function above to get the corresponding value from table B. I need all three variables because Name is not unique, but searching table B by name, date, and department together returns a unique row, from which I can pull the value of the SpecialFlag column. I believe this is the right approach in the sense that I want a one-time fix: whenever tables A and B are updated, the new computed column updates itself without my needing to run another query.
I think you want a scalar function that returns the result of a subquery. Note that in T-SQL a scalar function body must be wrapped in begin/end; only inline table-valued functions can omit them:
create function dbo.ReturnFlag (
    @name nvarchar(max),
    @date smalldatetime,
    @DeptID nvarchar(max)
) returns bit as
begin
    -- Return the flag of the single row matching all three keys.
    return (select [SpecialFlag]
            from TABLEB
            where [Name] = @name and [Date] = @date and
                  [department] = @DeptID
           );
end;
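To tie the function to the stated goal, a computed column on table A could call it along these lines. This is a sketch only: the question does not give table A's real name or column names, so dbo.TableA and the three column names below are placeholders.

```sql
-- Hypothetical: dbo.TableA and its column names are placeholders.
ALTER TABLE dbo.TableA
ADD SpecialFlag AS dbo.ReturnFlag([Name], [Date], [department]);
```

Because the function reads from another table, the column cannot be marked PERSISTED or indexed, and the function runs once per row each time the column is selected. That can be slow on large tables, so it is worth comparing against a view that joins the two tables.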

How to update xml column node value with another column new value at same update query?

I want to change the value of 2 columns in one table. One column is varchar and the other is XML. First of all, I want to replace the value of the RECIPIENT column with the new value and replace the node value named as RecipientNo in the XML column with the new value of RecipientNo. How can I do these two operations in the same update function? The query below works. Secondly, DATARECORD table includes too many records. Does modify function take too much time to update the records? If so, how can I increase the performance of modify function or can you suggest another alternative solution? By the way, I cannot add index to DATARECORD table. Thanks.
Here is the sample row;
ID RECIPIENT RECORDDETAILS
1 1 <?xml version="1.0"?>
<MetaTag xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/XMLSchema">
<Code>123</Code>
<RecipientNo>123</RecipientNo>
<Name>xyz</Name>
</MetaTag>'
CREATE TABLE #TEMPTABLE(
ID bigint,
RECIPIENT nvarchar(max),
RECORDDETAILS xml
)
INSERT INTO #TEMPTABLE
SELECT ID,RECIPIENT,RECORDDETAILS
FROM DATARECORD WITH (NOLOCK)
WHERE cast(RECORDDETAILS as varchar(max)) LIKE '%<Code>123</Code>%' and cast(RECORDDETAILS as varchar(max)) LIKE '%MetaTag%'
UPDATE #TEMPTABLE SET RECIPIENT = CONCAT('["queryType|1","recipientNoIDENTIFICATION|',RECIPIENT,']')
UPDATE #TEMPTABLE SET RECORDDETAILS.modify('replace value of (MetaTag/RecipientNo/text())[1] with sql:column("RECIPIENT")')
UPDATE d
SET d.RECORDDETAILS =Concat('<?xml version="1.0"?>', CAST(t.RECORDDETAILS AS VARCHAR(max))),
d.RECIPIENT = t.RECIPIENT
FROM dbo.DATARECORD as d
Join #TEMPTABLE as t
ON t.ID = d.ID
It's certainly possible to update an SQL column and an XML node in the same update statement, e.g.:
create table DataRecord (
ID bigint not null primary key,
Recipient nvarchar(max) not null,
RecordDetails xml not null
);
insert DataRecord values
(1, N'1', N'<?xml version="1.0"?>
<MetaTag xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/XMLSchema">
<Code>123</Code>
<RecipientNo>123</RecipientNo>
<Name>xyz</Name>
</MetaTag>');
create table #TempTable (
ID bigint not null primary key,
Recipient nvarchar(max) not null,
RecordDetails xml not null
);
insert #TempTable
select ID, Recipient, RecordDetails
from DataRecord with (nolock)
where cast(RecordDetails as varchar(max)) like '%<Code>123</Code>%' and cast(RecordDetails as varchar(max)) like '%MetaTag%'
-- Change an SQL value and an XML node in the one update statement...
update tt set
Recipient = NewRecipient,
RecordDetails.modify('replace value of (/MetaTag/RecipientNo/text())[1] with sql:column("NewRecipient")')
from #TempTable tt
outer apply (
select NewRecipient = concat('["queryType|1","recipientNoIDENTIFICATION|', Recipient, '"]')
) Calc
select * from #TempTable
Which yields:
ID Recipient RecordDetails
1 ["queryType|1","recipientNoIDENTIFICATION|1"] <MetaTag
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xsd="http://www.w3.org/XMLSchema">
<Code>123</Code>
<RecipientNo>["queryType|1","recipientNoIDENTIFICATION|1"]</RecipientNo>
<Name>xyz</Name>
</MetaTag>
There are a couple of things contributing to your performance problem:
Converting XML, which SQL Server essentially stores in UTF-16 encoding, to varchar (twice) is expensive. It will also trash any Unicode characters outside your database's collation.
Performing LIKE matches on the XML (converted to varchar) causes TABLE SCAN operations, converting and testing every row in your table.
Some things to consider:
Add XML index(es) to the RecordDetails column and use something like WHERE RecordDetails.exist('/MetaTag/Code[.="123"]') = 1 to shortlist the rows to be updated.
Alternatively, pre-shred your RecordDetails, persist the value of /MetaTag/Code/text() in a table column (e.g. MetaTagCode), and use something like WHERE MetaTagCode='123' in your query. Adding an index to that column will allow SQL Server to do a much cheaper index seek when searching for the desired value instead of a TABLE SCAN.
Since you say you cannot add indexes you're basically going to have to tolerate TABLE SCANs and just wait it out.
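Even without indexes, filtering with the XML exist() method avoids the two casts to varchar. A small self-contained sketch, using a table variable with made-up sample rows in place of the real DATARECORD table:

```sql
-- Made-up sample data standing in for DATARECORD.
DECLARE @DataRecord TABLE (
    ID BIGINT NOT NULL PRIMARY KEY,
    RecordDetails XML NOT NULL
);
INSERT @DataRecord VALUES
(1, N'<MetaTag><Code>123</Code><RecipientNo>123</RecipientNo></MetaTag>'),
(2, N'<MetaTag><Code>456</Code><RecipientNo>789</RecipientNo></MetaTag>');

-- exist() returns 1 when the XQuery expression matches at least one node.
SELECT ID
FROM @DataRecord
WHERE RecordDetails.exist('/MetaTag/Code[.="123"]') = 1;
```

This still reads every row when there is no XML index, but each row is tested once against the stored XML rather than being converted to a string twice.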

is it possible to "clone" a table variable?

I have a table variable with about 20 columns. I'd like to essentially reuse a single table variable structure for 2 different result sets. The 2 result sets should be represented in different table variables so I can't reuse a single table variable. Therefore, I was wondering if there was a way to clone a single table variable for reuse. For example, something like this:
DECLARE @MyTableVar1 TABLE(
Col1 INT,
Col2 INT
)
DECLARE @MyTableVar2 TABLE = @MyTableVar1
I'd like to avoid creating duplicate SQL if I can reuse existing SQL.
That is not possible; use a temp table instead:
if object_id('tempdb..#MyTempTable1') is not null drop table #MyTempTable1
Create TABLE #MyTempTable1 (
Col1 INT,
Col2 INT
)
if object_id('tempdb..#MyTempTable2') is not null drop table #MyTempTable2
select * into #MyTempTable2 from #MyTempTable1
Update:
As Eric suggested in a comment, if you want only the table schema and not the data in the first table, then:
select * into #MyTempTable2 from #MyTempTable1 where 1 = 0
You can create a user-defined table type, which is typically meant for passing table-valued parameters to stored procedures. Once the type is created, you can use it to declare any number of table variables, just like a built-in type. This comes closest to your requirement.
Example:
CREATE TYPE MyTableType AS TABLE
( COL1 int
, COL2 int )
DECLARE @MyTableVar1 AS MyTableType
DECLARE @MyTableVar2 AS MyTableType
A few things to note with this solution:
MyTableType becomes a database-level type. It is not local to a specific stored procedure.
If you ever have to change the definition of the table, you have to drop the code/sprocs using the TVP type, then recreate the table type with the new definition along with the related sprocs. Typically this is a non-issue, as the code and the type are created/recreated together.
You could use a temp table and SELECT ... INTO. Temp tables often perform better because SQL Server keeps column statistics for them, which it does not do for table variables.
create table #myTable(
Col1 INT null,
Col2 INT null
)
...
select *
into #myTableTwo
from #myTable
You can create one table variable, add a type column to it, and filter on that column in your queries.
This way, one table holds more than one kind of data.
Hope this helps.
declare @myTable table(
Col1 INT null,
Col2 INT null,
....
Type INT NULL
)
insert into @myTable(...,type)
select ......,1
insert into @myTable(...,type)
select ......,2
select * from @myTable where type =1
select * from @myTable where type =2

Optimizing SQL query to return Record with tags

I was looking for help to optimize a query I am writing for SQL Server. Given this database schema:
TradeLead object, a record in this table is a small article.
CREATE TABLE [dbo].[TradeLeads]
(
[TradeLeadID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
Title nvarchar(250),
Body nvarchar(max),
CreateDate datetime,
EditDate datetime,
CreateUser nvarchar(250),
EditUser nvarchar(250),
[Views] INT NOT NULL DEFAULT(0)
)
Here's the cross reference table to link a TradeLead article to an Industry record.
CREATE TABLE [dbo].[TradeLeads_Industries]
(
[ID] INT NOT NULL PRIMARY KEY IDENTITY(1,1),
[TradeLeadID] INT NOT NULL,
[IndustryID] INT NOT NULL
)
Finally, the schema for the Industry object. These are essentially just tags, but a user is unable to enter these. The database will have a specific amount.
CREATE TABLE [dbo].[Industries]
(
IndustryID INT NOT NULL PRIMARY KEY identity(1,1),
Name nvarchar(200)
)
The procedure I'm writing is used to search for specific TradeLead records. The user would be able to search for keywords in the title of the TradeLead object, search using a date range, and search for a TradeLead with specific Industry Tags.
The database will most likely be holding around 1,000,000 TradeLead articles and about 30 industry tags.
This is the query I have come up with:
DECLARE @Title nvarchar(50);
SET @Title = 'Testing';
-- User defined table type containing a list of IndustryIDs. Would prob have around 5 selections max.
DECLARE @Selectedindustryids IndustryIdentifierTable_UDT;
DECLARE @Start DATETIME;
SET @Start = NULL;
DECLARE @End DATETIME;
SET @End = NULL;
SELECT *
FROM(
-- Subquery to return all the tradeleads that match a user's criteria.
-- These fields can be null.
SELECT TradeLeadID,
Title,
Body,
CreateDate,
CreateUser,
Views
FROM TradeLeads
WHERE(@Title IS NULL OR Title LIKE '%' + @Title + '%') AND (@Start IS NULL OR CreateDate >= @Start) AND (@End IS NULL OR CreateDate <= @End)) AS FTL
INNER JOIN
-- Subquery to return the TradeLeadID for each TradeLead record with related IndustryIDs
(SELECT TI.TradeLeadID
FROM TradeLeads_Industries TI
-- Left join the selected IndustryIDs to the Cross reference table to get the TradeLeadIDs that are associated with a specific industry.
LEFT JOIN @SelectedindustryIDs SIDS
ON SIDS.IndustryID = TI.IndustryID
-- It's possible the user has not selected any IndustryIDs to search for.
WHERE (NOT EXISTS(SELECT 1 FROM @SelectedIndustryIDs) OR SIDS.IndustryID IS NOT NULL)
-- Group by to reduce the amount of records.
GROUP BY TI.TradeLeadID) AS SelectedIndustries ON SelectedIndustries.TradeLeadID = FTL.TradeLeadID
With about 600,000 TradeLead records and with an average of 4 IndustryIDs attached to each one, the query takes around 8 seconds to finish on a local machine. I would like to get it as fast as possible. Any tips or insight would be appreciated.
There are a few points here.
Using constructs like (@Start IS NULL OR CreateDate >= @Start) can cause a problem called parameter sniffing, where a plan cached for one combination of supplied criteria is reused for combinations it suits poorly. Two ways of working around it are:
Add OPTION (RECOMPILE) to the end of the query.
Use dynamic SQL to include only the criteria that the user has asked for.
I would favour the second method for this data.
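The dynamic SQL option can be sketched as below, building the WHERE clause only from the criteria that were supplied and passing the values as parameters through sp_executesql so nothing user-entered is concatenated into the statement. The @sql variable name is an assumption, and the industry filter is left out for brevity.

```sql
DECLARE @Title NVARCHAR(50) = N'Testing',
        @Start DATETIME = NULL,
        @End DATETIME = NULL;

-- Start from a predicate that is always true so each filter can be appended with AND.
DECLARE @sql NVARCHAR(MAX) = N'
SELECT TradeLeadID, Title, Body, CreateDate, CreateUser, [Views]
FROM dbo.TradeLeads
WHERE 1 = 1';

-- Append only the criteria the user actually supplied.
IF @Title IS NOT NULL SET @sql += N' AND Title LIKE N''%'' + @Title + N''%''';
IF @Start IS NOT NULL SET @sql += N' AND CreateDate >= @Start';
IF @End IS NOT NULL SET @sql += N' AND CreateDate <= @End';

-- Parameters that the final statement does not reference are simply ignored.
EXEC sys.sp_executesql
    @sql,
    N'@Title NVARCHAR(50), @Start DATETIME, @End DATETIME',
    @Title = @Title, @Start = @Start, @End = @End;
```

Each distinct combination of criteria produces its own statement text and therefore its own cached plan, which is the point of the technique.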
Next, the query can be rewritten to be more efficient by using exists (assuming the user has entered industry ids)
select
TradeLeadID,
Title,
Body,
CreateDate,
CreateUser,
[Views]
from
dbo.TradeLeads t
where
Title LIKE '%' + @Title + '%' and
CreateDate >= @Start and
CreateDate <= @End and
exists (
select
'x'
from
dbo.TradeLeads_Industries ti
inner join
@Selectedindustryids sids
on ti.IndustryID = sids.IndustryID
where
t.TradeLeadID = ti.TradeLeadID
);
Finally you will want at least one index on the dbo.TradeLeads_Industries table. The following are candidates.
(TradeLeadID, IndustryID)
(IndustryID, TradeLeadID)
Testing will tell you whether one or both is useful.
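Written out as statements, the two candidates would look something like this; the index names are made up for illustration:

```sql
-- Hypothetical index names; the key column order matches the two candidates above.
CREATE INDEX IX_TradeLeads_Industries_TradeLeadID_IndustryID
    ON dbo.TradeLeads_Industries (TradeLeadID, IndustryID);

CREATE INDEX IX_TradeLeads_Industries_IndustryID_TradeLeadID
    ON dbo.TradeLeads_Industries (IndustryID, TradeLeadID);
```

Comparing the actual execution plan with each index in place shows which one the optimizer picks for the EXISTS lookup.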