SQL Server 2008 inconsistent results - sql

I just released some code into production that is randomly causing errors. I have already fixed the problem by completely changing the way I run the query. However, it still bothers me that I don't know what caused the problem in the first place, so I was wondering if someone might know the answer. I have the following query inside a stored procedure. I'm not looking for comments about how it's not good practice to write queries with nested function calls and things like that :-). I just want to find out why it doesn't work consistently. Randomly, the function in the query returns a non-numeric value and causes an error on the join. However, if I immediately rerun the query, it works fine.
SELECT cscsf.cloud_server_current_software_firewall_id,
dbo.fn_GetCustomerFriendlyFromRuleName(cscsf.rule_name, np.policy_name) as rule_name,
cscsf.rule_action,
cscsf.rule_direction,
cscsf.source_address,
cscsf.source_mask,
cscsf.destination_address,
cscsf.destination_mask,
cscsf.protocol,
cscsf.port_or_port_range,
cscsf.created_date_utc,
cscsf.created_by
FROM CLOUD_SERVER_CURRENT_SOFTWARE_FIREWALL cscsf
LEFT JOIN CLOUD_SERVER cs
ON cscsf.cloud_server_id = cs.cloud_server_id
LEFT JOIN CLOUD_ACCOUNT cla
ON cs.cloud_account_id = cla.cloud_account_id
LEFT JOIN CONFIGURATION co
ON cla.configuration_id = co.configuration_id
LEFT JOIN DEDICATED_ACCOUNT da
ON co.dedicated_account_id = da.dedicated_account_id
LEFT JOIN CORE_ACCOUNT ca
ON da.core_account_number = ca.core_account_id
LEFT JOIN NETWORK_POLICY np
ON np.network_policy_id = (select dbo.fn_GetIDFromRuleName(cscsf.rule_name))
WHERE cs.cloud_server_id = @cloud_server_id
AND cs.current_software_firewall_confg_guid = cscsf.config_guid
AND ca.core_account_id IS NOT NULL
ORDER BY cscsf.rule_direction, cscsf.cloud_server_current_software_firewall_id
Notice that the join
ON np.network_policy_id = (select dbo.fn_GetIDFromRuleName(cscsf.rule_name))
calls a function.
Here is that function:
ALTER FUNCTION [dbo].[fn_GetIDFromRuleName]
(
@rule_name varchar(100)
)
RETURNS varchar(12)
AS
BEGIN
DECLARE @value varchar(12)
SET @value = dbo.fn_SplitGetNthRow(@rule_name, '-', 2)
SET @value = dbo.fn_SplitGetNthRow(@value, '_', 2)
SET @value = dbo.fn_SplitGetNthRow(@value, '-', 1)
RETURN @value
END
Which then calls this function:
ALTER FUNCTION [dbo].[fn_SplitGetNthRow]
(
@sInputList varchar(MAX),
@sDelimiter varchar(10) = ',',
@sRowNumber int = 1
)
RETURNS varchar(MAX)
AS
BEGIN
DECLARE @value varchar(MAX)
SELECT @value = data_split.item
FROM
(
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) as row_num FROM dbo.fn_Split(@sInputList, @sDelimiter)
) AS data_split
WHERE
data_split.row_num = @sRowNumber
IF @value IS NULL
SET @value = ''
RETURN @value
END
which finally calls this function:
ALTER FUNCTION [dbo].[fn_Split] (
@sInputList VARCHAR(MAX),
@sDelimiter VARCHAR(10) = ','
) RETURNS @List TABLE (item VARCHAR(MAX))
BEGIN
DECLARE @sItem VARCHAR(MAX)
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
BEGIN
SELECT @sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))), @sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))
IF LEN(@sItem) > 0
INSERT INTO @List SELECT @sItem
END
IF LEN(@sInputList) > 0
INSERT INTO @List SELECT @sInputList -- Put the last item in
RETURN
END

The reason it is "randomly" returning different things has to do with how SQL Server optimizes queries, and where they get short-circuited.
One way to fix the problem is to change the return value of fn_GetIDFromRuleName:
return (case when isnumeric(@value) = 1 then @value end)
Or, change the join condition:
on np.network_policy_id = (select case when isnumeric(dbo.fn_GetIDFromRuleName(cscsf.rule_name)) = 1
then dbo.fn_GetIDFromRuleName(cscsf.rule_name) end)
The underlying problem is order of evaluation. The reason the "case" statement fixes the problem is because it checks for a numeric value before it converts and SQL Server guarantees the order of evaluation in a case statement. As a note, you could still have problems with converting numbers like "6e07" or "1.23" which are numeric, but not integers.
Why does it work sometimes? Well, clearly the query execution plan is changing, either statically or dynamically. The failing case is probably on a row that is excluded by the WHERE condition. Why does it try to do the conversion? The question is where the conversion happens.
Where the conversion happens depends on the query plan. This may, in turn, depend on when the table in question (cscsf) is read. If it is already in memory, then it might be read and the conversion attempted as a first step in the query. Then you would get the error. In another scenario, another table might be filtered first, and the rows removed before they are converted.
In any case, my advice is:
NEVER have implicit conversion in queries.
Use the case statement for explicit conversions.
Do not rely on WHERE clauses to filter data to make conversions work. Use the case statement.
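The advice above can be sketched as follows. This is only an illustration, not the original poster's code: the inline sample values and the column name candidate are assumptions made for the example. A digits-only LIKE test stands in for ISNUMERIC so that values such as '6e07' and '1.23' are rejected before the cast, and the length check keeps the value inside the int range.

```sql
-- Hypothetical sample values; in the real query the candidate would be
-- the output of dbo.fn_GetIDFromRuleName(cscsf.rule_name).
SELECT t.candidate,
       CASE
           -- Digits only and 1 to 9 characters long, so the value fits in an int.
           WHEN t.candidate NOT LIKE '%[^0-9]%'
                AND LEN(t.candidate) BETWEEN 1 AND 9
           THEN CAST(t.candidate AS int)
           -- No ELSE branch: anything else yields NULL rather than a conversion error.
       END AS network_policy_id
FROM (VALUES ('42'), ('6e07'), ('1.23'), ('abc'), ('')) AS t(candidate);
```

Of the sample values, only '42' passes the check. The others are expected to come back as NULL, so a join on the result simply finds no match for them.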

Related

Stored Procedure was working fine but suddenly stopped for certain values

So I've been working on a number of stored procedures for an SSRS report I'm building and have an odd error and need a pair of fresh eyes to see what I could be missing.
My procedure is pretty simple - SELECT various columns from some JOINed tables, INSERT them into a #temp table and SELECT all of the contents of the table to display as detail rows in my report.
My complete procedure is shown here:
USE [DB]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[rpt_select_ACHW_Ob]
@PR_ID INT
AS BEGIN
SET NOCOUNT ON;
CREATE TABLE #temp
(BuildingNum INT,
MTHD VARCHAR(50),
ObDescript NVARCHAR(50),
SizeRemarks VARCHAR(50),
ObLength NUMERIC,
ObWidth NUMERIC,
ObArea NUMERIC,
Stories NUMERIC,
Grade VARCHAR(50),
YearBuilt SMALLINT,
Condition VARCHAR(50),
Phys NUMERIC,
FC VARCHAR(50),
PercentDone NUMERIC,
TaxValue NUMERIC)
DECLARE @RowCounter INT, @h INT
SET @RowCounter = 11
INSERT INTO #temp(BuildingNum, MTHD, ObDescript, SizeRemarks, ObLength, ObWidth, ObArea, Stories, Grade, YearBuilt, Condition, Phys, FC, PercentDone, TaxValue)
SELECT ob.Ob_LineNumber AS BuildingNum,
CASE WHEN ob.SP IS NOT NULL THEN 'S' ELSE 'P' END AS MTHD,
som.Id AS ObDescript,
CASE WHEN ob.SF IS NULL THEN ob.CN ELSE CAST((ob.SF + '/' + ob.CN) AS VARCHAR) END AS SizeRemarks,
CASE WHEN ob.ob_Length IS NULL THEN 0 ELSE ob.Ob_Length END AS ObLength,
CASE WHEN ob.ob_Width IS NULL THEN 0 ELSE ob.Ob_Width END AS ObWidth,
CASE WHEN ob.ob_Length IS NULL THEN 0 WHEN ob.ob_Width IS NULL THEN 0 ELSE (ob.Ob_Length * ob.Ob_Width) END AS ObArea,
ob.Ob_NStories AS OBStories,
sovg.Grade AS ObGrade,
ob.Ob_YearBuilt As ObYearBuilt,
ob.Ob_ConditionCode AS ObConditionCode,
ob.DR AS phys,
ob.FC AS FC,
ob.Ob_PercentComplete AS ObPercentComplete,
ob.Ob_ValueTax AS TaxValue
FROM t_Ob ob WITH (NOLOCK)
LEFT JOIN t_ObToPR otpr WITH (NOLOCK) ON ob.Ob_ID=otpr.Ob_ID
LEFT JOIN t_PR pr WITH (NOLOCK) ON otpr.PR_ID=pr.PR_ID
LEFT JOIN t_S_Grade sovg WITH (NOLOCK) ON ob.S_Grade_ID=sovg.S_Grade_ID
LEFT JOIN t_SObD sod WITH (NOLOCK) ON ob.SObD_ID=sod.SObD_ID
LEFT JOIN t_SObM som WITH (NOLOCK) ON sod.SObM_ID=som.SObM_ID
WHERE pr.PR_Id = @PR_ID
SET @h = (SELECT COUNT(*) FROM #temp)
WHILE @h < @RowCounter OR @h % @RowCounter > 0
BEGIN
INSERT INTO #temp (BuildingNum) VALUES (NULL)
SET @h = @h + 1
END
SELECT * FROM #temp
ORDER BY CASE WHEN BuildingNum IS NULL THEN 1 ELSE 0 END, BuildingNum
END
As I said, I've been having a strange issue with this code. It's been working fine for the past two weeks for all test cases. I'm using EXEC to select the records based on the parameter @PR_ID and it was working fine. Yesterday, without my having touched ANYTHING in the code, it began raising an error for only certain PR_ID values:
Msg 8114, Level 16, State 5, Procedure rpt_select_ACHW_Ob, Line 28 [Batch Start Line 2]
Error converting data type varchar to numeric.
Line 28 leads you to FC VARCHAR(50) which I've checked 10 times. All of the data types declared in the #temp table match up perfectly with the values being selected. Does anyone have any ideas as to why this has stopped working?
Here's a dbFiddle link with some sample data
Currently working with SQL Server 2012.
One of two things happened. The less likely is your original query was written correctly and someone changed an underlying table -- changing an integer column to a string column -- and then populated the string column with non-numeric data.
The more likely scenario is that your original query has implicit conversion. This is just a problem waiting to happen -- and now you know why. You have an error message from SQL Server and it doesn't specify the table, the row, or the column where the problem occurs. Arrggg!
My suggestion is to go through the query and check every expression and comparison to be sure the types are compatible (number/number, string/string, datetime/datetime is sufficient). If they are not, add explicit conversions. You can add the conversions using try_convert() or try_cast(), which will at least avoid the error (at the expense of producing NULLs).
I wish SQL Server had a "no-implicit conversions" mode where it would warn you that a query was using such conversions. Alas, no. So, get into the habit of writing your queries so all conversions are explicit.
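As a rough sketch of how try_convert() can help locate the offending rows, the query below lists values that are present but cannot be converted. The table variable @sample and its columns are hypothetical stand-ins for whichever real column is suspected, and TRY_CONVERT requires SQL Server 2012 or later.

```sql
-- @sample and its columns are hypothetical stand-ins for a real table.
DECLARE @sample TABLE (id int, raw_value varchar(20));

INSERT INTO @sample (id, raw_value)
VALUES (1, '12'), (2, '3.5'), (3, 'N/A');

-- Rows whose value is present but cannot be converted to numeric.
SELECT id, raw_value
FROM @sample
WHERE raw_value IS NOT NULL
  AND TRY_CONVERT(numeric(18, 2), raw_value) IS NULL;
```

With this sample data only the row holding 'N/A' is returned. Running the same filter against each varchar column used in a numeric context narrows down where the bad data lives.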
EDIT:
For instance (based on the comments), this expression:
CAST((ob.SF + '/' + ob.CN) AS VARCHAR)
should be:
CAST(CAST(ob.SF AS VARCHAR(255)) + '/' + ob.CN AS VARCHAR(255))
Note that you should include lengths in all CHAR()/VARCHAR() references in SQL Server.
Per your comment on @Gordon's answer: If accurate, mark Gordon's response as your solve.
This will fail:
Declare @SF Int = 12
Declare @CN VarChar(10) = '2'
Select CAST((@SF + '/' + @CN) AS VARCHAR)
Unless you add a Cast:
Select CAST((Cast(@SF As VarChar(10)) + '/' + @CN) AS VARCHAR)
Result
12/2

How to process SQL string char-by-char to build a match weight?

The problem: I need to display fields for user entry on a form, dynamic to some lookup criteria.
My current solution: I've created a SQL table with some field entry criteria, based on a relatively simple matching criteria. The match criteria basically is such that Lookup Value starts with Match Code, and the most precise match is found by doing a LEN comparison.
select
f.[IS_REQUIRED]
, f.[MASK]
, f.[MAX_LENGTH]
, f.[MIN_LENGTH]
, f.[RESOURCE_KEY]
, f.[SEQUENCE]
from [dbo].[MY_RECORD] r with(nolock)
inner join [dbo].[ENTRY_FORMAT] f with(nolock)
on r.[LOOKUP_VALUE] like f.[MATCH_CODE]
-- Logic to filter by single, most-precise record match.
cross apply (
select f1.[SEQUENCE]
from [dbo].[ENTRY_FORMAT] f1 with(nolock)
where f.[SEQUENCE] = f1.[SEQUENCE]
and s.[MATCH_CODE] like f1.[MATCH_CODE]
group by f1.[SEQUENCE]
having len(f.[MATCH_CODE]) = max(len(f1.[MATCH_CODE]))
) tFilter
where r.[ID] = @RecordId
The current issue with this is that the most precise match has to be calculated on each and every call, against each and every match. Additionally, I'm currently only able to support the % wildcard in MATCH_CODE. (E.g., '%' is the default for all LOOKUP_VALUE, while an entry of '12%' would be the more precise match for a LOOKUP_VALUE of '12345', and a MATCH_CODE of '12345' should obviously be the most precise match.) However, I would like to add support for [4-7], etc. wildcards. Going just off of LEN, this would definitely be wrong, because '[4-7]' adds a lot to the length, but, for example, '12345' is still the desired match over '123[4-7]'.
My desired update: To add a MATCH_WEIGHT column to ENTRY_FORMAT, which I can update via a trigger on insert/update. For my initial implementation, I'm just looking for something that can go through MATCH_CODE, character by character, increasing MATCH_WEIGHT, but treating [..] as just a single character when doing so. Is there a good mechanism (UDF - either SQL or CLR? CURSOR?) for iterating through characters of a varchar field to calculate a value in this way? Something like increasing MATCH_WEIGHT by two per non-wildcard, and perhaps by one on a wildcard; with details to be further thought out and worked out...
The goal being to use a query more like:
select
f.[IS_REQUIRED]
, f.[MASK]
, f.[MAX_LENGTH]
, f.[MIN_LENGTH]
, f.[RESOURCE_KEY]
, f.[SEQUENCE]
from [dbo].[MY_RECORD] r with(nolock)
-- Logic to filter by single, most-precise record match.
cross apply (
select top 1
f1.[MATCH_CODE]
, f1.[SEQUENCE]
from [dbo].[ENTRY_FORMAT] f1 with(nolock)
where r.[LOOKUP_VALUE] like f1.[MATCH_CODE]
group by f1.[SEQUENCE]
order by f1.[MATCH_WEIGHT] desc
) tFilter
inner join [dbo].[ENTRY_FORMAT] f with(nolock)
on f.[MATCH_CODE] = tFilter.[MATCH_CODE]
and f.[SEQUENCE] = tFilter.[SEQUENCE]
where r.[ID] = @RecordId
Note: I realize this is a relatively fragile setup. The ENTRY_FORMAT records are only entered by developers, who are aware of the restrictions, so for now assume that valid data is entered, and which does not cause match collisions.
With some help, I've come up with one implementation (answer below), but am still unsure as to my total design, so welcoming better answers or any criticism.
From Steve's answer on another question, I've used much of the body to create a function to accomplish support for the [..] wildcard at the end of a match code.
CREATE FUNCTION CalculateMatchWeight
(
-- Add the parameters for the function here
@MatchCode varchar(100)
)
RETURNS smallint
AS
BEGIN
-- Declare the return variable here
DECLARE @Result smallint = 0;
-- Add the T-SQL statements to compute the return value here
DECLARE @Pos int = 1, @N0 int = ascii('0'), @N9 int = ascii('9'), @AA int = ascii('A'), @AZ int = ascii('Z'), @Wild int = ascii('%'), @Range int = ascii('[');
DECLARE @Asc int;
DECLARE @WorkingString varchar(100) = upper(@MatchCode)
WHILE @Pos <= LEN(@WorkingString)
BEGIN
SET @Asc = ascii(substring(@WorkingString, @Pos, 1));
If ((@Asc between @N0 and @N9) or (@Asc between @AA and @AZ))
SET @Result = @Result + 2;
ELSE
BEGIN
-- Check wildcard matching, update value according to match strength, and stop calculating further.
-- TODO: In the future we may wish to have match codes with wildcards not just at the end; try to figure out a mechanism to calculating that case.
IF (@Asc = @Range)
BEGIN
SET @Result = @Result + 2;
SET @Pos = 100;
END
IF (@Asc = @Wild)
BEGIN
SET @Result = @Result + 1;
SET @Pos = 100;
END
END
SET @Pos = @Pos + 1
END
-- Return the result of the function
RETURN @Result
END
I've checked that this can generate desired output for the current cases that I'm trying to cover:
SELECT [dbo].[CalculateMatchWeight] ('12345'); -- Most precise (10)
SELECT [dbo].[CalculateMatchWeight] ('123[4-5]'); -- Middle (8)
SELECT [dbo].[CalculateMatchWeight] ('123%'); -- Least (7)
Now I can call this function in a trigger on INSERT/UPDATE to update the MATCH_WEIGHT:
CREATE TRIGGER TRG_ENTRY_FORMAT_CalcMatchWeight
ON ENTRY_FORMAT
AFTER INSERT,UPDATE
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- Insert statements for trigger here
DECLARE @NewMatchWeight smallint = (select dbo.CalculateMatchWeight(inserted.MATCH_CODE) from inserted),
@CurrentMatchWeight smallint = (select inserted.MATCH_WEIGHT from inserted);
IF (@CurrentMatchWeight <> @NewMatchWeight)
BEGIN
UPDATE ENTRY_FORMAT
SET MATCH_WEIGHT = @NewMatchWeight
FROM inserted
WHERE ENTRY_FORMAT.[MATCH_CODE] = inserted.[MATCH_CODE]
AND ENTRY_FORMAT.[SEQUENCE] = inserted.[SEQUENCE]
END
END
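One possible refinement, offered as a sketch rather than a tested replacement: the trigger above reads a single value from inserted, so a statement that inserts or updates several rows at once would make the scalar subqueries fail, and a row inserted with a NULL MATCH_WEIGHT would never be updated because NULL <> value is not true. A set-based version avoids both. It assumes, as the trigger above does, that MATCH_CODE plus SEQUENCE identifies a row.

```sql
ALTER TRIGGER TRG_ENTRY_FORMAT_CalcMatchWeight
ON ENTRY_FORMAT
AFTER INSERT, UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- Recalculate the weight for every affected row in one statement.
    UPDATE ef
    SET MATCH_WEIGHT = dbo.CalculateMatchWeight(i.MATCH_CODE)
    FROM ENTRY_FORMAT AS ef
    INNER JOIN inserted AS i
        ON ef.[MATCH_CODE] = i.[MATCH_CODE]
       AND ef.[SEQUENCE] = i.[SEQUENCE]
    -- Skip rows whose stored weight is already correct; treat NULL as "needs a value".
    WHERE ef.MATCH_WEIGHT IS NULL
       OR ef.MATCH_WEIGHT <> dbo.CalculateMatchWeight(i.MATCH_CODE);
END
```

Joining to inserted handles one row or many in the same way, which matters if match codes are ever loaded in bulk.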

**Occasional** Arithmetic overflow error converting expression to data type int

I'm running an update script to obfuscate data and am occasionally experiencing the arithmetic overflow error message, as in the title. The table being updated has 260k records and yet the update script will need to be run several times to produce the error. Although it's so rare I can't rely on the code until it's fixed as it's a pain to debug.
Looking at other similar questions, this is often resolved by changing the data type e.g from INT to BIGINT either in the table or in a calculation. However, I can't see where this could be required. I've reduced the script to the below as I've managed to pin point it to the update of one column.
A function is being called by the update and I've included this below. I suspect that, due to the randomness of the error, the use of the NEWID function could be causing it, but I haven't been able to re-create the error when just running this part of the function multiple times. The NEWID function can't be used in functions, so it's being called from a view, also included below.
Update script:
UPDATE dbo.Addresses
SET HouseNumber = CASE WHEN LEN(HouseNumber) > 0
THEN dbo.fn_GenerateRandomString (LEN(HouseNumber), 1, 1, 1)
ELSE HouseNumber
END
NEWID view and random string function:
CREATE VIEW dbo.vw_GetNewID
AS
SELECT NEWID() AS New_ID
CREATE FUNCTION dbo.fn_GenerateRandomString (
@stringLength int,
@upperCaseBit bit,
@lowerCaseBit bit,
@numberBit bit
)
RETURNS nvarchar(100)
AS
BEGIN
-- Sanitise string length values.
IF ISNULL(@stringLength, -1) < 0
SET @stringLength = 0
-- Generate a random string from the specified character sets.
DECLARE @string nvarchar(100) = ''
SELECT
@string += c2
FROM
(
SELECT TOP (@stringLength) c2 FROM (
SELECT c1 FROM
(
VALUES ('A'),('B'),('C')
) AS T1(c1)
WHERE @upperCaseBit = 1
UNION ALL
SELECT c1 FROM
(
VALUES ('a'),('b'),('c')
) AS T1(c1)
WHERE @lowerCaseBit = 1
UNION ALL
SELECT c1 FROM
(
VALUES ('0'),('1'),('2'),('3'),('4'),('5'),('6'),('7'),('8'),('9')
) AS T1(c1)
WHERE @numberBit = 1
)
AS T2(c2)
ORDER BY (SELECT ABS(CHECKSUM(New_ID)) from vw_GetNewID)
) AS T2
RETURN @string
END
Addresses table (for testing):
CREATE TABLE dbo.Addresses(HouseNumber nchar(32) NULL)
INSERT Addresses(HouseNumber)
VALUES ('DSjkmf jkghjsh35hjk h2jkhj3h jhf'),
('SDjfksj3548 ksjk'),
(NULL),
(''),
('2a'),
('1234567890'),
('An2b')
Note: only 7k of the rows in the addresses table have a value entered i.e. LEN(HouseNumber) > 0.
An arithmetic overflow in what is otherwise string-based code is confounding. But there is one thing that could be causing the arithmetic overflow. That is your ORDER BY clause:
ORDER BY (SELECT ABS(CHECKSUM(New_ID)) from vw_GetNewID)
CHECKSUM() returns an integer, whose range is -2,147,483,648 to 2,147,483,647. Note the absolute value of the smallest number is 2,147,483,648, and that is just outside the range. You can verify that SELECT ABS(CAST('-2147483648' as int)) generates the arithmetic overflow error.
You don't need the checksum(). Alas, you do need the view because this logic is in a function and NEWID() is side-effecting. But, you can use:
ORDER BY (SELECT New_ID from vw_GetNewID)
I suspect that the reason you are seeing this every million or so rows rather than every 4 billion rows or so is because the ORDER BY value is being evaluated multiple times for each row as part of the sorting process. Eventually, it is going to hit the lower limit.
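The overflow described above can be reproduced in isolation with a short script. This is a sketch for illustration; the variable name @smallest is made up for the example. It also shows that widening to bigint before calling ABS avoids the error, which is an alternative if you prefer to keep CHECKSUM().

```sql
DECLARE @smallest int = -2147483648;  -- the lowest value an int can hold

-- Widening to bigint first leaves room for 2147483648, so this succeeds.
SELECT ABS(CAST(@smallest AS bigint)) AS widened_abs;

-- ABS on the int itself raises
-- "Arithmetic overflow error converting expression to data type int."
SELECT ABS(@smallest) AS overflowing_abs;
```

Applied to the original function, the widened form would read ORDER BY (SELECT ABS(CAST(CHECKSUM(New_ID) AS bigint)) FROM vw_GetNewID), although ordering by New_ID directly is simpler.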
EDIT:
If you care about efficiency, it is probably faster to do this using string operations rather than tables. I might suggest this version of the function:
CREATE VIEW vw_rand AS SELECT rand() as rand;
GO
CREATE FUNCTION dbo.fn_GenerateRandomString (
@stringLength int,
@upperCaseBit bit,
@lowerCaseBit bit,
@numberBit bit
)
RETURNS nvarchar(100)
AS
BEGIN
DECLARE @string NVARCHAR(255) = '';
-- Sanitise string length values.
IF ISNULL(@stringLength, -1) < 0
SET @stringLength = 0;
DECLARE @lets VARCHAR(255) = '';
IF (@upperCaseBit = 1) SET @lets = @lets + 'ABC';
IF (@lowerCaseBit = 1) SET @lets = @lets + 'abc';
IF (@numberBit = 1) SET @lets = @lets + '0123456789';
DECLARE @len int = len(@lets);
WHILE @stringLength > 0 BEGIN
SELECT @string += SUBSTRING(@lets, 1 + CAST(rand * @len as INT), 1)
FROM vw_rand;
SET @stringLength = @stringLength - 1;
END;
RETURN @string
END;
As a note: rand() is documented as being exclusive of the end of its range, so you don't have to worry about it returning exactly 1.
Also, this version is subtly different from your version because it can pull the same letter more than once (and as a consequence can also handle longer strings). I think this is actually a benefit.

SSRS Multivalue Parameter in Dataset Query issue

I have an embedded dataset in my report which I pass parameters into.
This works fine for a single selection using the = sign in my AND condition.
I would have thought (and Google results seem to agree) that I can just change the = sign to IN.
FROM [database].[dbo].[itemTable]
right Outer Join [database].[dbo].[CategoryTable]
on [database].[dbo].[itemTable].Category= [database].[dbo].[CategoryTable].Category And ([database].[dbo].[itemTable].Region = @pRegion) And ([database].[dbo].[itemTable].CategoryLN = @pCategoryLN )
where [database].[dbo].[CategoryTable].Category != 'RETIRED'
Above works fine but if I change to
[database].[dbo].[itemTable].Region IN @pRegion
The query window says Incorrect syntax near '@pRegion'.
Looks like all you are missing is brackets around the parameter.
[database].[dbo].[itemTable].Region IN (@pRegion)
Also make sure you don't edit/parse the parameter values.
We've resolved this issue by using a database table-valued function (probably found somewhere on the internet, but I can't remember where)
CREATE FUNCTION [dbo].[ParamSplit] -- run in the target database; CREATE FUNCTION does not accept a database-name prefix
(
@List nvarchar(max), -- string returned from multivalue report parameter
@SplitOn nvarchar(5) -- separator character
)
RETURNS @RtnValue table
(
Id int identity(1,1),
Value nvarchar(100)
)
AS
BEGIN
While (Charindex(@SplitOn,@List)>0)
Begin
Insert Into @RtnValue (value)
Select Value = ltrim(rtrim(Substring(@List,1,Charindex(@SplitOn,@List)-1)))
Set @List = Substring(@List,Charindex(@SplitOn,@List)+len(@SplitOn),len(@List))
End
Insert Into @RtnValue (Value)
Select Value = ltrim(rtrim(@List))
Return
END
Then you can use it in your dataset query.
where [database].[dbo].[itemTable].Region IN (Select [dbo].[ParamSplit].[Value] from [database].[dbo].[ParamSplit](@pRegion,','))

sql server cursor

I want to copy data from one table (rawdata, all columns are VARCHAR) to another table (formatted with corresponding column format).
For copying data from the rawdata table into formatted table, I'm using cursor in order to identify which row is affected. I need to log that particular row in an error log table, skip it, and continue copying remaining rows.
Copying the data this way takes a long time. Is there any other way to achieve this?
This is my query:
DECLARE @EntityId Varchar(16),
@PerfId Varchar(16),
@BaseId Varchar(16),
@UpdateStatus Varchar(16)
DECLARE CursorSample CURSOR FOR
SELECT EntityId, PerfId, BaseId, UpdateStatus
FROM RawdataTable
--Returns 204,000 rows
OPEN CursorSample
FETCH NEXT FROM CursorSample INTO @EntityId,@PerfId,@BaseId,@UpdateStatus
WHILE @@FETCH_STATUS = 0
BEGIN
BEGIN TRY
--try inserting row in formatted table
Insert into FormattedTable
(EntityId,PerfId,BaseId,UpdateStatus)
Values
(Convert(int,@EntityId),
Convert(int,@PerfId),
Convert(int,@BaseId),
Convert(int,@UpdateStatus))
END TRY
BEGIN CATCH
--capture Error EntityId in errorlog table
Insert into ERROR_LOG
(TableError_Message,Error_Procedure,Error_Log_Time)
Values
(Error_Message()+@EntityId,'xxx', GETDATE())
END CATCH
FETCH NEXT FROM CursorSample INTO @EntityId,@PerfId,@BaseId,@UpdateStatus
END
CLOSE CursorSample
DEALLOCATE CursorSample --cleanup CursorSample
You should be able to use a single INSERT INTO ... SELECT statement to put the records directly into the formatted table. A set-based INSERT will perform much better than a cursor.
INSERT INTO FormattedTable
SELECT
CONVERT(int, EntityId),
CONVERT(int, PerfId),
CONVERT(int, BaseId),
CONVERT(int, UpdateStatus)
FROM RawdataTable
WHERE
IsNumeric(EntityId) = 1
AND IsNumeric(PerfId) = 1
AND IsNumeric(BaseId) = 1
AND IsNumeric(UpdateStatus) = 1
Note that IsNumeric can sometimes return 1 for values that will then fail on CONVERT. For example, IsNumeric('$e0') will return 1, so you may need to create a more robust user defined function for determining if a string is a number, depending on your data.
Also, if you need a log of all records that could not be moved into the formatted table, just modify the WHERE clause:
INSERT INTO ErrorLog
SELECT
EntityId,
PerfId,
BaseId,
UpdateStatus
FROM RawdataTable
WHERE
NOT (IsNumeric(EntityId) = 1
AND IsNumeric(PerfId) = 1
AND IsNumeric(BaseId) = 1
AND IsNumeric(UpdateStatus) = 1)
EDIT
Rather than using IsNumeric directly, it may be better to create a custom UDF that will tell you if a string can be converted to an int. This function worked for me (albeit with limited testing):
CREATE FUNCTION IsInt(@value VARCHAR(50))
RETURNS bit
AS
BEGIN
DECLARE @number AS INT
DECLARE @numeric AS NUMERIC(18,2)
SET @number = 0
IF IsNumeric(@value) = 1
BEGIN
SET @numeric = CONVERT(NUMERIC(18,2), @value)
IF @numeric BETWEEN -2147483648 AND 2147483647
SET @number = 1 -- flag the value as convertible; returning the converted number itself would report '0' as not an int
END
RETURN @number
END
GO
The updated SQL for the insert into the formatted table would then look like this:
INSERT INTO FormattedTable
SELECT
CONVERT(int, CONVERT(NUMERIC(18,2), EntityId)),
CONVERT(int, CONVERT(NUMERIC(18,2), PerfId)),
CONVERT(int, CONVERT(NUMERIC(18,2), BaseId)),
CONVERT(int, CONVERT(NUMERIC(18,2), UpdateStatus))
FROM RawdataTable
WHERE
dbo.IsInt(EntityId) = 1
AND dbo.IsInt(PerfId) = 1
AND dbo.IsInt(BaseId) = 1
AND dbo.IsInt(UpdateStatus) = 1
There may be a little weirdness around handling NULLs (my function will return 0 if NULL is passed in, even though an INT can certainly be null), but that can be adjusted depending on what is supposed to happen with NULL values in the RawdataTable.
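On SQL Server 2012 or later, TRY_CONVERT may be a simpler alternative to a custom UDF, since it returns NULL instead of raising an error when a value cannot be converted. The following is a sketch under that version assumption; the table variables @raw and @formatted are hypothetical stand-ins for RawdataTable and FormattedTable, with only two columns shown for brevity.

```sql
-- Table variables stand in for RawdataTable and FormattedTable.
DECLARE @raw TABLE (EntityId varchar(16), PerfId varchar(16));
DECLARE @formatted TABLE (EntityId int, PerfId int);

INSERT INTO @raw (EntityId, PerfId)
VALUES ('1', '10'), ('2', 'x'), ('1.5', '30');

-- Copy only the rows in which every column converts cleanly to int.
INSERT INTO @formatted (EntityId, PerfId)
SELECT TRY_CONVERT(int, EntityId), TRY_CONVERT(int, PerfId)
FROM @raw
WHERE TRY_CONVERT(int, EntityId) IS NOT NULL
  AND TRY_CONVERT(int, PerfId) IS NOT NULL;

SELECT EntityId, PerfId FROM @formatted;
```

With this sample data only the first row is copied. Note two differences from the IsInt approach: '1.5' is rejected rather than truncated to 1, and an empty string converts to 0 rather than NULL, so empty strings may need their own check. Inverting the WHERE clause gives the rows to write to the error log.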
You can put a WHERE clause in your cursor definition so that only valid records are selected in the first place. You might need to create a function to determine validity, but it should be faster than looping over them.
Actually, you might want to create a temp table of the invalid records, so that you can log the errors, then define the cursor only on the rows that are not in the temp table.
A set-based INSERT INTO will perform much better than a cursor.
Cursors process rows one at a time, which prevents SQL Server from optimizing the work as a single set-based operation. Avoid cursors where you can, although there are (of course) situations where a cursor cannot be avoided.