I have a big SQL query like this:
Select Distinct [Student].[Class].roll_nbr as [PERIOD-NBR],[Student].[Class].ent_nbr as [CLASS-NBR],
IsNull(Stuff((SELECT CAST(', ' AS Varchar(MAX)) + CAST([Student].[Subject].ent_nbr AS Varchar(MAX))
FROM [Student].[Subject]
WHERE [Student].[Subject].roll_nbr = [Student].[Class].roll_nbr
and ([Student].[Subject].class_nbr = [Student].[Class].roll_assignment_nbr
or ([Student].[Class].roll_assignment_nbr = '0'
and [Student].[Subject].class_nbr = [Student].[School].bus_stop) )
AND [Student].[Subject].ent_nbr <> ''
FOR XML PATH ('')), 1, 2, ''), '')
AS [OLD-STUDENT-NBR.OLD],IsNull(Stuff((SELECT CAST(', ' AS Varchar(MAX)) + ....
It goes on and on; it's a page-long query that builds a report. The problem I am having is that some variable is erroring out with the message:
Error converting data type varchar to numeric.
This is a very generic error and does not tell me which variable. Is there any way to pinpoint which variable is erroring out in SQL Server 2008?
Comment out half the columns; if the error continues, comment out another half. If the error stops, it's in the section you just commented out. Rinse, repeat.
When faced with this type of error in the past, I've narrowed it down by commenting out portions of the query, seeing if it executes, and then uncommenting portions of the query until it points right at the error.
Not that I know of. However, you could try the following procedure:
1) Identify what columns are being converted.
2) Execute the select with half of them. If it executes well, then the problem is in the other half.
3) Repeat 2 (halving the number of columns) until you have come to a single candidate.
If query execution is long, keep track of all the combinations tried and their results, as the problem could be affecting more than one column. This leads to:
4) If the problem continues, then there is a second affected column. Discard all columns present in queries that executed without problems, plus the faulty one just discovered, and start again at step 2 with this reduced set.
5) Repeat until the original query (with the necessary modifications) executes with no issue.
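Once the halving has pointed you at a suspect varchar column, a quick data check can show exactly which values refuse to convert. The following is only a sketch that uses [Student].[Subject].ent_nbr from your query as a stand-in for whichever column turns out to be the culprit; ISNUMERIC() is available on SQL Server 2008 (TRY_CONVERT only arrived in 2012) and, despite its quirks, is usually enough to surface the offending rows:
-- List the values in the suspect varchar column that will not convert to a numeric type.
SELECT DISTINCT [Student].[Subject].ent_nbr
FROM [Student].[Subject]
WHERE [Student].[Subject].ent_nbr <> ''
  AND ISNUMERIC([Student].[Subject].ent_nbr) = 0;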
I am using SSIS to transform a raw data row into a transaction. Everything was going well until I added logic for a new field called "SplitPercentage" to the SQL command. The new field simply converts the value to a decimal; for example, 02887 would transform into 0.2887.
The new logic works as intended, but now it takes 8 hours to run instead of 5 minutes.
Please see entire original code vs new code here:
Greatly appreciate any help!
New logic resulting in poor performance:
IF TRIM(SUBSTRING(@line, 293, 1)) = 1
BEGIN
SET @SplitPercentage = 1
END
ELSE
BEGIN
SET @SplitPercentage = CAST(''.'' + TRIM(SUBSTRING(@line, 294, 4)) AS decimal(7, 4))
END
While your current code is not ideal, I don't see anything in your new expression (SUBSTRING(), TRIM(), concatenation, CAST) that would account for such a drastic performance hit. I suspect the cause lies elsewhere.
However, I believe your expression can be simplified to eliminate the IF. Given a 5-character field "nnnnn" that you wish to treat as a decimal n.nnnn, you should be able to do this in a single statement using STUFF() to inject the decimal point:
SET @SplitPercentage = CAST(STUFF(SUBSTRING(@line, 293, 5), 2, 0, '.') AS decimal(7, 4))
The STUFF() injects the decimal point at position 2 (replacing 0 characters). I see no need for the TRIM().
(You would need to double up the quotes for use within your Exec ('...') statement.)
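As a quick sanity check of the STUFF() approach outside the dynamic SQL (so the quotes are not doubled), here is a sketch using the 02887 sample value from your question; the padding is only there so the digits land at position 293:
-- Hypothetical test harness: pad a dummy line so '02887' starts at position 293.
DECLARE @line varchar(300) = REPLICATE(' ', 292) + '02887';
SELECT CAST(STUFF(SUBSTRING(@line, 293, 5), 2, 0, '.') AS decimal(7, 4)); -- returns 0.2887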
Please try changing the IF/ELSE block of code as follows:
SET @SplitPercentage = IIF(TRIM(SUBSTRING(@line, 293, 1)) = ''1''
, 1.0000
, CAST(''.'' + TRIM(SUBSTRING(@line, 294, 4)) AS DECIMAL(7, 4)));
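As a sketch outside the Exec ('...') wrapper (so the quotes are not doubled), both branches can be checked against made-up sample lines; the two padded values below are assumptions chosen to exercise each path:
DECLARE @whole varchar(300) = REPLICATE(' ', 292) + '10000'; -- position 293 = '1' -> 1.0000
DECLARE @part  varchar(300) = REPLICATE(' ', 292) + '02887'; -- position 293 = '0' -> 0.2887
SELECT IIF(TRIM(SUBSTRING(@whole, 293, 1)) = '1', 1.0000,
           CAST('.' + TRIM(SUBSTRING(@whole, 294, 4)) AS DECIMAL(7, 4))) AS WholeCase,
       IIF(TRIM(SUBSTRING(@part, 293, 1)) = '1', 1.0000,
           CAST('.' + TRIM(SUBSTRING(@part, 294, 4)) AS DECIMAL(7, 4))) AS FractionCase;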
A challenge you've run into is "I have a huge dynamic query process that I cannot debug." When I run into these issues, I try to break the problem down into smaller, solvable, set-based options.
Reading that wall of code, my pseudocode would be something like:
For all the data in Inbound_Transaction_Source by a given Source value (@SourceName)
Do all this data validation, type correction and cleanup by slicing out the current line into pieces
You can then lose the row-based approach by slicing your data up. I favor using CROSS APPLY at this point in my life, but a CTE, Derived Table, or whatever makes sense in your head is valid.
Why I favor this approach, though, is that you can see what you're building, test it, and then modify it without worrying that you're going to upset a house of cards.
-- Column ordinal declaration and definition is offsite
SELECT
*
FROM
[dbo].[Inbound_Transaction_Source] AS ITS
CROSS APPLY
(
SELECT
CurrentAgentNo = SUBSTRING(ITS.line, @CurrentAgentStartColumn, 10)
, CurrentCompMemo = SUBSTRING(ITS.line, @CompMemoStartColumn + @Multiplier, 1)
, CurrentCommAmount = SUBSTRING(ITS.line, @CommAmountStartColumn + @Multiplier, 9)
, CurrentAnnCommAmount = SUBSTRING(ITS.line, @AnnCommAmountStartColumn + @Multiplier, 9)
, CurrentRetainedCommAmount = SUBSTRING(ITS.line, @RetainedCommAmountStartColumn + @Multiplier, 9)
, CurrentRetainedSwitch = SUBSTRING(ITS.line, @RetainedSwitchStartColumn + @Multiplier, 9)
-- etc
-- A sample of your business logic
, TransactionSourceSystemCode = SUBSTRING(ITS.line, 308, 3)
)NamedCols
CROSS APPLY
(
SELECT
-- There's some business rules to be had here for first year processing
-- Something special with position 102
SUBSTRING(ITS.line,102 , 1) AS SeniorityBit
-- If department code? is 0079, we have special rules
, TRIM(SUBSTRING(ITS.line,141, 4)) As DepartmentCode
)BR0
CROSS APPLY
(
SELECT
CASE
WHEN NamedCols.TransactionSourceSystemCode in ('LVV','UIV','LMV') THEN
CASE WHEN BR0.SeniorityBit = '0' THEN '1' ELSE '0' END
WHEN NamedCols.TransactionSourceSystemCode in ('CMP','FAL') AND BR0.DepartmentCode ='0079' THEN
CASE WHEN BR0.SeniorityBit = '1' THEN '0' ELSE '1' END
WHEN NamedCols.TransactionSourceSystemCode in ('UIA','LMA','RIA') AND BR0.SeniorityBit > '1' THEN
'1'
WHEN NamedCols.TransactionSourceSystemCode in ('FAL') THEN
'1'
ELSE '0'
END
)FY(IsFirstYear)
WHERE Source = @SourceName
ORDER BY Id;
Why did processing time increase from 5 minutes to 8 hours?
It likely had nothing to do with the change to the dynamic SQL. When an SSIS package run is "taking forever" relative to normal, then, preferably while it's still running, look at your sources and destinations and make note of what is happening, as it's likely one of the two.
A cursor complicates your life and is not needed once you start thinking in sets, but it's unlikely to be the source of the performance problems, given that you have a solid baseline of what normal is. Plus, this query is a single-table query with a single filter.
Your SSIS package's data flow is probably a chip-shot Source to Destination Extract and Load, or Slurp and Burp, with no intervening transformation (as the logic is all in the stored procedure). If that's the case, then the only two possible performance points of contention are the source and the destination. Since the source appears trivial, it's likely that some other process had the destination tied up for those 8 hours. Had you run something like sp_whoisactive on the source and destination, you could have identified the process that was blocking your run.
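If sp_whoisactive isn't installed on those servers, a rough sketch against the built-in DMVs will show the same thing; run it on the source and destination while the package is executing:
-- Any request with a non-zero blocking_session_id is being held up by another session.
SELECT session_id, blocking_session_id, wait_type, wait_time, status, command
FROM sys.dm_exec_requests
WHERE blocking_session_id <> 0;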
This query has been used successfully before, but it now throws this error and I can't seem to avoid it. The query runs for around half its usual circa 20-minute runtime and then fails with this error:
Msg 537, Level 16, Line 3
Invalid length parameter passed to LEFT or SUBSTRING function.
Query:
USE [INDEXES]
GO
SELECT DISTINCT
[ADDRESS]
,[POSTCODE]
,[POSTCODE DISTRICT]
,[LKP_FULL_FULL_PC]
,[LKP_FULL_PCS]
,[LKP_NO_ST_FULL_PCS]
INTO [INDEXES].[dbo].[LR_LOOKUP]
FROM [PROP_DATA].[dbo].[LR_Standardised_Lookups]
WHERE LEN(POSTCODE) > 0
I'm assuming this is a data issue because the query runs for around 10 minutes before failing, but I can't fathom what the issue is, as there are no column manipulations in the query. It's simply saying: if there is a POSTCODE value, include it.
Note that I've also tried using:
WHERE p.POSTCODE IS NOT NULL (with AS p on the FROM clause), but it gives me the same result.
WHERE DATALENGTH(POSTCODE) - same result
I've seen lots of posts on this error but they all seem to be using string manipulation which results in invalid results on some rows, whereas I am not - it's just a simple match clause here.
UPDATE: I've tried many functions in the WHERE, then I dropped the WHERE altogether - same error message.
...now I'm really confused, the error makes no sense in context.
This might run a little better for you:
USE [INDEXES]
GO
SELECT DISTINCT
[ADDRESS],
[POSTCODE],
[POSTCODE DISTRICT],
[LKP_FULL_FULL_PC],
[LKP_FULL_PCS],
[LKP_NO_ST_FULL_PCS]
INTO [INDEXES].[dbo].[LR_LOOKUP]
FROM [PROP_DATA].[dbo].[LR_Standardised_Lookups]
WHERE POSTCODE <> ''
AND POSTCODE IS NOT NULL
I am comparing data from two different databases (one MariaDB and one SQL Server) within my Node project, and am then doing inserts and updates as necessary depending on the comparison results.
I have a question about this code that I use to iterate through results in Node, going one at a time and passing in values to check against (note - I am more familiar with Node and JS than with SQL, hence this question):
SELECT TOP 1
CASE
WHEN RM00101.CUSTCLAS LIKE ('%CUSR%')
THEN CAST(REPLACE(LEFT(LR301.DOCNUMBR, CHARINDEX('-', LR301.DOCNUMBR)), '-', '') AS INT)
ELSE 0
END AS Id,
CASE
WHEN LR301.RMDTYPAL = 7 THEN LR301.ORTRXAMT * -1
WHEN LR301.RMDTYPAL = 9 THEN LR301.ORTRXAMT * -1
ELSE LR301.ORTRXAMT
END DocumentAmount,
GETDATE() VerifyDate
FROM
CRDB..RM20101 AS LR301
INNER JOIN
CRDB..RM00101 ON LR301.CUSTNMBR = RM00101.CUSTNMBR
WHERE
CONVERT(BIGINT, (REPLACE(LEFT(LR301.DOCNUMBR, CHARINDEX('-', LR301.DOCNUMBR)), '-', ''))) = 589091
Currently, the above works for me for finding records that match. However, if I enter a value that doesn't yet exist - in this line below, like so:
WHERE CONVERT(BIGINT, (REPLACE(LEFT( LR301.DOCNUMBR, CHARINDEX('-', LR301.DOCNUMBR)), '-', ''))) = 789091
I get this error:
Error converting data type varchar to bigint.
I assume the issue is that, if the value isn't found, it can't cast it to an INTEGER, and so it errors out. Sound right?
What I ideally want is for the query to execute successfully, but just return 0 results when a match is not found. In JavaScript I might do something like an OR clause to handle this:
const array = returnResults || [];
But I'm not sure how to handle this with SQL.
By the way, the value in SQL Server that's being matched is of type char(21), and the values look like this: 00000516542-000. The value in MariaDB is of type INT.
So two questions:
Will this error out when I enter a value that doesn't currently match?
If so, how can I handle this so as to just return 0 rows when a match isn't found?
By the way, as an added note, someone suggested using TRY_CONVERT, but while this works in SQL Server, it doesn't work when I use it with the NODE mssql package.
I think the issue is happening because the varchar value is not always made of numbers. You can make the comparison in varchar format itself to avoid this issue:
WHERE (REPLACE(LEFT( LR301.DOCNUMBR, CHARINDEX('-', LR301.DOCNUMBR)), '-', '')) = '789091'
Hope this helps.
Edit: based on the format in the comment, this should do the trick:
WHERE REPLACE(LTRIM(REPLACE(REPLACE(LEFT( LR301.DOCNUMBR, CHARINDEX('-', LR301.DOCNUMBR)),'0',' '),'-','')),' ','0') = '789091'
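For what it's worth, here is that edited expression applied as a sketch to the sample value from the question, just to show the leading zeros being stripped (the local variable is only a stand-in for LR301.DOCNUMBR):
DECLARE @DOCNUMBR char(21) = '00000516542-000';
-- Blank out the zeros, drop the dash, trim the leading blanks, then turn the remaining blanks back into zeros.
SELECT REPLACE(LTRIM(REPLACE(REPLACE(LEFT(@DOCNUMBR, CHARINDEX('-', @DOCNUMBR)), '0', ' '), '-', '')), ' ', '0');
-- returns '516542', which can then be compared to the MariaDB INT rendered as a string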
I am self-taught in T-SQL, so I am sure I can gain efficiency in my code writing; any pointers are welcome, even if unrelated to this specific problem.
I am having a problem with a nightly routine I wrote. The database program that creates the initial data is out of my control and is loosely written, so I have bad data that can blow up my script from time to time. I am looking for assistance in adding error checking to my script so that I lose one record instead of the whole thing blowing up.
The code looks like this:
SELECT convert(bigint,(SUBSTRING(pin, 1, 2)+ SUBSTRING(pin, 3, 4)+ SUBSTRING(pin, 7, 5) + SUBSTRING(pin, 13, 3))) AS PARCEL, taxyear, subdivisn, township, propclass, paddress1, paddress2, pcity
INTO [ASSESS].[dbo].[vpams_temp]
FROM [ASSESS].[dbo].[Property]
WHERE parcelstat='F'
GO
The problem is in the first part of this where the concatenation occurs. I am attempting to convert this string (11-1111-11111.000) into this number (11111111111000). If they put their data in correctly, there is punctuation in exactly the correct spots and numbers in the right spots. If they make a mistake, then I end up with punctuation in the wrong spots and it creates a string that cannot be converted into a number.
How about simply replacing "-" and "." with "" before CONVERT to BIGINT?
To do that you would simply replace part of your code with
SELECT CONVERT(BIGINT, REPLACE(REPLACE(pin, '-', ''), '.', '')) AS PARCEL, ...
Hope it helps.
First, I would use replace() (twice). Second, I would use try_convert():
SELECT try_convert(bigint,
replace(replace(pin, '-', ''), '.', '')
) as PARCEL,
taxyear, subdivisn, township, propclass, paddress1, paddress2, pcity
INTO [ASSESS].[dbo].[vpams_temp]
FROM [ASSESS].[dbo].[Property]
WHERE parcelstat = 'F' ;
You might want to check if there are other characters in the value:
select pin
from [ASSESS].[dbo].[Property]
where pin like '%[^-0-9.]%';
Why not just:
select cast(replace(replace('11-1111-11111.000','-',''),'.','') as bigint)
Simply use the following code:
declare @var varchar(100)
set @var = '11-1111-11111.000'
select convert(bigint, replace(replace(@var,'-',''),'.',''))
Result:
11111111111000
I am taking text input from the user, then converting it into 2-character-length strings (2-grams).
For example
RX480 becomes
"rx","x4","48","80"
Now if I directly query the server like below, can they somehow perform SQL injection?
select *
from myTable
where myVariable in ('rx', 'x4', '48', '80')
SQL injection is not a matter of the length of anything.
It happens when someone adds code to your existing query. They do this by sending in the malicious extra code as a form submission (or something). When your SQL code executes, it doesn't realize that there is more than one thing to do. It just executes what it's told.
You could start with a simple query like:
select *
from thisTable
where something=$something
So you could end up with a query that looks like:
select *
from thisTable
where something=; DROP TABLE employees;
This is an odd example. But it does more or less show why it's dangerous. The first query will fail, but who cares? The second one will actually work. And if you have a table named "employees", well, you don't anymore.
Two characters in this case are sufficient to cause an error in the query and possibly reveal some information about it. For example, try the string ')480 and watch how your application behaves.
Although not much of an answer, this really doesn't fit in a comment.
Your code scans a table checking to see if a column value matches any pair of consecutive characters from a user supplied string. Expressed in another way:
declare #SearchString as VarChar(10) = 'Voot';
select Buffer, case
when DataLength( Buffer ) != 2 then 0 -- NB: Len() right trims.
when PatIndex( '%' + Buffer + '%', #SearchString ) != 0 then 1
else 0 end as Match
from ( values
( 'vo' ), ( 'go' ), ( 'n ' ), ( 'po' ), ( 'et' ), ( 'ry' ),
( 'oo' ) ) as Samples( Buffer );
In this case you could simply pass the value of #SearchString as a parameter and avoid the issue of the IN clause.
Alternatively, the character pairs could be passed as a table parameter and used with IN: where Buffer in ( select CharacterPair from @CharacterPairs ).
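A minimal sketch of that table-parameter route, assuming a user-defined table type (the dbo.CharacterPairList name is made up here) and the myTable/myVariable names from the question:
-- One-time setup: a table type to carry the character pairs.
CREATE TYPE dbo.CharacterPairList AS TABLE (CharacterPair char(2) PRIMARY KEY);
GO
-- At query time, the application passes all the pairs as a single parameter.
DECLARE @CharacterPairs dbo.CharacterPairList;
INSERT INTO @CharacterPairs (CharacterPair) VALUES ('rx'), ('x4'), ('48'), ('80');

SELECT *
FROM myTable
WHERE myVariable IN (SELECT CharacterPair FROM @CharacterPairs);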
As far as SQL injection goes, limiting the text to character pairs does preclude adding complete statements. It does, as others have noted, allow for corrupting the query and causing it to fail. That, in my mind, constitutes a problem.
I'm still trying to imagine a use-case for this rather odd pattern matching. It won't match a column value longer (or shorter) than two characters against a search string.
There definitely should be a canonical answer to all these innumerable "if I have [some special kind of data treatment], will my query still be vulnerable?" questions.
First of all, you should ask yourself why you are looking to buy yourself such an indulgence. What is the reason? Why do you want to add an exception to your data processing? Why separate your data into the sheep and the goats, telling yourself "this data is 'safe', I won't process it properly, and that data is unsafe, so I'll have to do something"?
The only reason such a question could even appear is your application architecture. Or rather, the lack of it. Only in spaghetti code, where user input is added directly to the query, can such a question ever occur. Otherwise, your database layer should be able to process any kind of data, being totally ignorant of its nature, origin, or alleged "safety".
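To make that concrete with a minimal sketch (the table and column names are taken from the question above): once every user-supplied value travels as a parameter, the database layer genuinely does not care what the data looks like.
DECLARE @gram char(2) = 'rx'; -- any user-supplied pair, malicious or not
EXEC sp_executesql
    N'SELECT * FROM myTable WHERE myVariable = @gram',
    N'@gram char(2)',
    @gram = @gram;
For the full set of pairs, the table parameter shown a couple of answers up does the same job.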