I have this code snippet I'm playing with (forgive the generic names):
create function GetList
(@d1 varchar(3), @d2 varchar(3), @d3 varchar(3))
returns table
as
return
with List
as
(
select x.pattern
from (values (@d1), (@d2), (@d3)) as x(pattern)
)
select * from List
This is eventually going to be a user-supplied list which will be used to query something else, but playing around with it made me curious. If I were to run
select * from GetList('1111111','222','333')
I will get the same results as if I had only entered 3 characters for each. Since I limited the varchar parameter to 3 characters, are the extra characters completely ignored? Is there any potential nastiness that can happen if a varchar parameter is 'overflowed' like this (other than the loss of data at the end of the string, of course)?
The other characters are totally ignored, since you limited the parameter to a length of 3.
The only issue that you could have is if you actually wanted to return the characters that are over the length of 3.
For example, if you pass in 1234567 and actually want the whole value, you will only get 123.
If you are limiting the input parameter to 3, then there would be no reason to try and pass in a longer value. If there is a chance that you will pass in longer values, then you should increase the length of the parameter.
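To see the truncation in isolation, here is a minimal T-SQL sketch (variable assignment binds the same way a parameter does):
DECLARE @d1 varchar(3) = '1111111';  -- silently truncated to the declared length
SELECT @d1;  -- returns '111'
Note that this silent truncation applies to variables and parameters; inserting an oversized value directly into a varchar(3) column would instead raise a "String or binary data would be truncated" error (under the default ANSI_WARNINGS ON setting).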
Initial situation
I have a relatively large table (ca. 0.7 million records) where an nvarchar field "MediaID" largely contains media IDs in proper hexadecimal notation (as they should).
Within my "sequential" query (each query depends on the output of the query before, this is all in pure T-SQL) I have to convert these hexadecimal values into decimal bigint values in order to do further calculations and filtering on these calculated values for the subsequent queries.
--> So far, no problem. The "sequential" query works fine.
Problem
Unfortunately, some of these media IDs contain non-hex characters - most probably because of typing errors by the people who added them, or through import errors from the previous business system.
Because of these non-hex chars, the whole query fails (of course) because the conversion hits an error.
For my current purpose, such rows must be skipped/ignored, as they are clearly wrong and cannot be used (there are no media / data carriers in use with the current business system which can have non-hex character IDs).
Manual editing of the data is not an option as there are too many errors and it is not clear with what the data must be replaced.
Challenge
To create a query which only returns records which have valid hex values within the media ID field.
(Unfortunately, my SQL skills are not enough to create the above query. Your help is highly appreciated.)
The relevant section of the larger query looks like this (xxxx is where your help comes in :-))
select
pureMediaID
, mediaID
, CUSTOMERID
, CONTRACT_CUSTOMERID
from
(
select concat('0x', Replace(Ltrim(Replace(mediaID, '0', ' ')), ' ', '0')) AS pureMediaID
--, CUSTOMERID
, *
from M_T_CONTRACT_CUSTOMERS
where mediaID is not null
and mediaID like '0%'
and xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
) as inner1
EDIT: As requested, I have added some good and some bad data here:
Good:
4335463357
4335459809
1426427996
4335463509
4335515039
4335465134
4427370396
4335415661
4427369036
4335419089
004BB03433
004e7cf9c6
00BD23133
00EE13D8C1
00CCB5522C
00C46522C
00dbbe3433
Bad:
4564589+
AB6B8BFC.8
7B498DFCnm
DB218DFChb
d<tgfh8CFC
CB9E8AFCzj
B458DFCjhl
rytzju8DFC
BFCtdsjshj
DB9888FCgf
9BC08CFCyx
EB198DFCzj
4B628CFChj
7B2B8DFCgg
After I upgraded the compatibility level of the SQL instance to SQL 2016 (it was below 2012 before), I could use TRY_CONVERT with the same syntax as the original CONVERT function, as donPablo pointed out. With that, the query could run fully through, and every MediaID which is not a correct hex value gets nicely converted into a NULL value - really, really nice.
Exactly what I needed.
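For illustration, a minimal sketch of the TRY_CONVERT behavior (assuming compatibility level 110 or higher, and style 1, which expects a '0x' prefix on the hex string):
SELECT TRY_CONVERT(varbinary(8), '0x004BB03433', 1);  -- 0x004BB03433
SELECT TRY_CONVERT(varbinary(8), '0x7B498DFCnm', 1);  -- NULL instead of a conversion error
Invalid rows can then be excluded with a WHERE TRY_CONVERT(...) IS NOT NULL filter.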
Unfortunately, the solution of ALICE... didn't work out for me, as it was also (strangely) returning records which had the "+" character within them.
Edit: The added comment of Alice... where you create a calculated field like this:
CASE WHEN "KEY" LIKE '%[^0-9A-F]%' THEN 0 ELSE 1 end as xyz
and then filter in the next query like this:
where xyz = 1
works also with SQL Instances with compatibility level < SQL 2012.
A great addition for people who still have to work with older SQL instances.
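Put together, a sketch of that computed-field approach against the table from the question might look like this:
select *
from
(
select *, CASE WHEN mediaID LIKE '%[^0-9A-F]%' THEN 0 ELSE 1 END as xyz
from M_T_CONTRACT_CUSTOMERS
) as inner1
where xyz = 1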
An option (although not ideal in terms of performance) is to check the characters in the MediaID with a CASE expression and a LIKE pattern.
Hexadecimal values cannot contain characters other than the digits 0-9 and the letters A-F, so flag any value that contains a character outside that set:
CASE WHEN MediaID NOT LIKE '%[^0-9A-F]%' THEN 1 ELSE 0 END
I would recommend first writing a function that can be used to evaluate MediaID and check whether it is hexadecimal, and then running the query for the conversion.
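A minimal sketch of such a function; the name fnIsHex is hypothetical, and it assumes the NOT LIKE test above (plus a case-insensitive default collation, so lowercase hex digits also pass):
CREATE FUNCTION dbo.fnIsHex (@value nvarchar(50))
RETURNS bit
AS
BEGIN
-- 1 when no character falls outside 0-9/A-F; note that an empty string also yields 1
RETURN CASE WHEN @value NOT LIKE '%[^0-9A-F]%' THEN 1 ELSE 0 END;
END
It could then be used as: select * from M_T_CONTRACT_CUSTOMERS where dbo.fnIsHex(mediaID) = 1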
Can you comment on which of the approaches shown below is preferable? I hope the question will not be blocked as "opinionated"; I would like to believe there is an explanation that makes it clear.
Context: I have code for mirroring 3rd-party table contents to my own table (an optimization). It worked flawlessly for some time, until the size/modification of the database reached some threshold.
The optimization is based on the row-version values of several source tables, remembering the maximum of those values. This way I am able to update my local table incrementally, which is much faster than rebuilding it from scratch from time to time.
The problem started to appear when the row-version value exceeded the 4-byte range. After some effort, I spotted that the upper 4 bytes of the binary(8) value were set to 0. Later, the suspect was found to have the form COALESCE(MAX(row_version), 1).
The COALESCE was used to cover the case when the local table is fresh, containing no data -- so that the MAX(row_version) of the source tables could be compared with something meaningful.
The examples to show the bug: To simulate the situation just mentioned, I want to convert the NULL value of the binary(8) column to 1. I am also adding the ISNULL usage that was added later; the original code contained only the COALESCE.
DECLARE @bin8null binary(8) = NULL
SELECT 'bin NULL' AS the_variable, @bin8null AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8null, 1) AS good_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8null, 1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8null, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8null, 0x1) AS bad_value
(There is a bug in the image: coalesce 0x1 + convert was fixed later in the code to coalesce 1 + convert, but not fixed in the image.)
The application bug appeared when the binary value was bigger than what can be stored in 4 bytes. Here 0xAAAAAAAA was used as the upper 4 bytes. (Actually, 0x00000001 was the case, and it was difficult to spot that the single 1 was changed to 0.)
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35
SELECT 'bin' AS the_variable, @bin8 AS value
SELECT 'coalesce 1' AS op, COALESCE(@bin8, 1) AS bad_value
SELECT 'coalesce 1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 1)) AS bad_value
SELECT 'coalesce 0x1 + convert' AS op, CONVERT(binary(8), COALESCE(@bin8, 0x1)) AS good_value
SELECT 'isnull 1' AS op, ISNULL(@bin8, 1) AS good_value
SELECT 'isnull 0x1' AS op, ISNULL(@bin8, 0x1) AS good_value
When executed in Microsoft SQL Server Management Studio on MS-SQL Server 2014, the result looks like this:
Description -- my understanding: COALESCE() seems to derive the type of the result from the type of the last processed argument. This way, the non-NULL binary(8) was converted to int, and that led to the loss of the upper 4 bytes. (See the 2nd and 3rd red bad_value in the picture; the difference between the two cases is only the decimal/hexadecimal form of display.)
On the other hand, ISNULL() seems to preserve the type of the first argument, converting the second value to that type. One should be careful to understand that binary(8) is more like a series of bytes; the interpretation as one large integer is only an interpretation. Hence, 0x1 as the default value does not expand to an 8-byte integer, and produces a bad value.
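A quick way to see the two padding behaviors side by side (a sketch):
SELECT CONVERT(binary(8), 0x1) AS pad_right  -- 0x0100000000000000: binary-to-binary conversion pads on the right
SELECT CONVERT(binary(8), 1) AS pad_left  -- 0x0000000000000001: int-to-binary conversion pads on the left
This is why ISNULL(@bin8null, 0x1) above is labeled bad_value while ISNULL(@bin8null, 1) is good_value.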
My solution: So, I have fixed the bug using ISNULL(MAX(row_version), 1). Is that correct?
This is not a bug. They're documented to handle data type precedence differently. COALESCE determines the data type of the output based on examining all of the arguments, while ISNULL has a more simplistic approach of inspecting only the first argument. (Both still need to contain values which are all compatible, meaning they are all possible to convert to the determined output type.)
From the COALESCE topic:
Returns the data type of expression with the highest data type precedence.
The ISNULL topic does not make this distinction in the same way, but implicitly states that the first expression determines the type:
replacement_value must be of a type that is implicitly convertible to the type of check_expression.
I have a similar example (and describe several other differences between COALESCE and ISNULL) here. Basically:
DECLARE @int int, @datetime datetime;
SELECT COALESCE(@int, CURRENT_TIMESTAMP);
-- works because datetime has the highest precedence among the arguments, so it becomes the output type
2020-08-20 09:39:41.763
GO
DECLARE @int int, @datetime datetime;
SELECT ISNULL(@int, CURRENT_TIMESTAMP);
-- fails because int, the first (and chosen) output type, has a lower precedence than datetime
Msg 257, Level 16, State 3
Implicit conversion from data type datetime to int is not allowed. Use the CONVERT function to run this query.
Let me start off by saying:
This is not a "bug".
ISNULL and COALESCE are not the same function, and operate quite differently.
ISNULL takes 2 parameters, and returns the second parameter if the first is NULL. If the 2 parameters have different data types, then the data type of the first parameter is returned (implicitly casting the second value).
COALESCE takes 2+ parameters, and returns the first non-NULL parameter. COALESCE is shorthand for a CASE expression, and uses Data Type Precedence to determine the returned data type.
As a result, ISNULL returns what you expect: there is no implicit conversion of the non-NULL variable in your query.
For the COALESCE there is implicit conversion. binary has the lowest precedence of all the data types, with a rank of 30 (at time of writing). The value 1 is an int, and has a precedence of 16; far higher than 30.
As a result, COALESCE(@bin8, 1) will implicitly convert the value 0xAAAAAAAA01BB3A35 to an int and then return that value. You can see this for yourself: SELECT CONVERT(int, 0xAAAAAAAA01BB3A35) returns 29047349, which is your first "bad" value; it's not "bad", it's correct for what you wrote.
Then for the latter "bad" value, we can convert that int value (29047349) back to a binary(8), which results in 0x0000000001BB3A35 - again, the result you get.
TL;DR: checking the return types of functions is important. ISNULL returns the data type of the first parameter, and will implicitly convert the second if needed. COALESCE uses Data Type Precedence, and will implicitly convert the returned value to the data type with the highest precedence among all the possible return values.
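If you want to verify the chosen return type directly, here is a sketch using SQL_VARIANT_PROPERTY:
DECLARE @bin8 binary(8) = 0xAAAAAAAA01BB3A35;
SELECT SQL_VARIANT_PROPERTY(ISNULL(@bin8, 1), 'BaseType') AS isnull_type  -- binary
SELECT SQL_VARIANT_PROPERTY(COALESCE(@bin8, 1), 'BaseType') AS coalesce_type  -- int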
I've been tasked with creating a report for my company. The report is generated from the results returned by the Stored Procedure spGenerateReport, which has multiple filters.
Inside the SP, this is how the filter is expected to work:
SELECT * FROM MyTable WHERE column1 IN (
'filters', 'for', 'this', 'report'
)
Entering the code above yields ~30000 rows in 9s. However, I want to be able to change my SP's filter by passing it a single argument (since I may use 1 or 2 or n filters), like so:
spGenerateReport 'Filters,for,this,report'
For this I have the user-defined function fnSplitString (yes, I do know that there is a STRING_SPLIT function, but I can't use it due to the lower compatibility level of my database), which splits a single string into a table, like so:
SELECT splitData FROM fnSplitString('Filters,for,this,report')
Returns:
splitData
------
Filters
for
this
report
Thus the final code in my SP is:
SELECT * FROM MyTable WHERE column1 IN (
SELECT * FROM fnSplitString('Filters,for,this,report')
)
However, this instead yields ~10000 rows in 60s. The time taken to complete this SP is weird but isn't too much of a problem; however, nearly a quarter of my rows disappearing into the void certainly is. The results only have rows from the first couple of filters (for example, 'Filters' and 'for'); if I change the order of the arguments (e.g. fnSplitString('report,for,Filters,this')), I get a different number of rows, and only from the filters 'report', 'for' and 'Filters'! I don't understand why using the function returns different results than those obtained when using the literal strings. Is there some inside gimmick that I'm not aware of?
PS - I'm sorry in advance for being bad at explaining myself, and for any grammar mistakes
You should definitely be getting the same results with both techniques. Something is wrong.
You haven't posted the fnSplitString code, but I suspect fnSplitString is not outputting the last string in the list, or maybe the last string in the list is being truncated before it reaches fnSplitString, so that no matches are found.
E.g. if the parameter going into your spGenerateReport stored procedure is varchar(20), then what will reach the function is 'Filters,for,this,rep', with the last bit truncated.
SSRS, for example, will silently truncate strings that are passed into an SP instead of warning you with an error message.
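A quick sketch to reproduce that failure mode (the varchar(20) length is just illustrative):
DECLARE @filter varchar(20) = 'Filters,for,this,report';
SELECT @filter;  -- 'Filters,for,this,rep' - silently truncated, so 'report' can never match
So check the declared length of the stored procedure's parameter and of fnSplitString's input parameter, and compare the output of SELECT splitData FROM fnSplitString(...) against the literal list.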
I created a rather simple view to return only rows where there is a number in the CONTRACT_ID column. CONTRACT_ID has data type number(8).
CREATE OR REPLACE VIEW cid AS
SELECT *
FROM transactions
WHERE contract_id IS NOT NULL
AND LENGTH(contract_id) > 0;
The view works just fine until I scroll down to row ~2950, where I get ORA-01722 (invalid number). The same thing happens if I want to export the data to Excel: my file gets only ~2950 rows instead of the expected ~20k.
Any idea what might be causing this and how to resolve this issue?
Many thanks!
You wrote too much SQL. The following will provide all the results you require:
CREATE OR REPLACE VIEW cid AS
SELECT *
FROM transactions
WHERE contract_id IS NOT NULL
You can't LENGTH() a number - a number is either null or it's a value, so you don't need this kind of check.
Passing a number to LENGTH() will turn it into a string first, i.e. LENGTH(TO_CHAR(numbercolumn)). You don't even need a LENGTH() check for null strings: to Oracle, a NULL string and a zero-length string are equivalent, and calling LENGTH() on an empty string or a null returns null, not 0. So LENGTH(myNullStr) = 0 doesn't work out; it's not comparing 0 = 0, it's comparing null = 0, and null compared with anything is never true.
The only time this seems to cause confusion is when the string columns in the table are CHAR types rather than VARCHAR types, and people forget that assigning an empty string to a CHAR causes it to become space-padded out to the CHAR length - hence, no longer a zero-length string.
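A small illustration of those rules (Oracle):
SELECT LENGTH('') AS len_empty,  -- NULL: the empty string is treated as NULL
LENGTH(NULL) AS len_null,  -- NULL
LENGTH(' ') AS len_space  -- 1: a space (e.g. from CHAR padding) is a real character
FROM dual;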
First of all, you should remove the redundant LENGTH() condition; it's senseless. I'm not sure how it can produce such an error, but check whether the error disappears after that.
If not, replace the star (*) with some field names, say, contract_id. If that fixes the error, it would indicate that the error source is somewhere in the removed fields (say, if a generated column is used).
I cannot imagine how the error could still be alive after that, but if so, I'd try moving the table into another tablespace and adding to the field list a call to a logging function which stores the rowids of the rows read - thus checking which row produces the error.
I am using an INSERT statement to convert a BDE table (source) to a Firebird table (destination) using IB Datapump, so the INSERT statement is fed by source table values via parameters. One of the source field parameters is alphanumeric (SOURCECHAR10, char(10)); it holds mostly integers and needs to be converted to integer in the (integer type) destination column NEWINTFLD. If SOURCECHAR10 is not numeric, I want to assign 0 to NEWINTFLD.
I use IIF and SIMILAR TO to test whether the string is numeric, and assign 0 if it is not, as follows:
INSERT INTO "DEST_TABLE" (......, "NEWINTFLD",.....)
VALUES(..., IIF( :"SOURCECHAR10" SIMILAR TO '[[:DIGIT:]]*', :"SOURCECHAR10", 0),..)
For every non-numeric string, however, I still get conversion errors (DSQL error code = -303).
I tested with only constants in the IIF result fields, like IIF(:"SOURCECHAR10" SIMILAR TO '[[:DIGIT:]]*', 1, 0), and that works fine, so somehow the :"SOURCECHAR10" in the true result field of the IIF generates the error.
Any ideas how to get around this?
When your query is executed, the parser will notice that the second use of :"SOURCECHAR10" is in a place where an integer is expected. Therefore it will always convert the contents of :"SOURCECHAR10" into an integer for that position, even though that branch is not used if the string is non-numeric.
In reality Firebird does not use :"SOURCECHAR10" as named parameters; your connection library will convert them to two separate parameter placeholders ?, and the type of the second placeholder will be INTEGER. So the conversion happens before the actual query is executed.
The solution is probably (I didn't test it, might contain syntax errors) to use something like (NOTE: see second example for correct solution):
CASE
WHEN :"SOURCECHAR10" SIMILAR TO '[[:DIGIT:]]*'
THEN CAST(:"SOURCECHAR10" AS INTEGER)
ELSE 0
END
This doesn't work, as it is interpreted as a cast of the parameter itself; see the CAST() documentation, item 'Casting input fields'.
If this does not work, you could also attempt to add an explicit cast to VARCHAR around :"SOURCECHAR10" to make sure the parameter is correctly identified as being VARCHAR:
CASE
WHEN :"SOURCECHAR10" SIMILAR TO '[[:DIGIT:]]*'
THEN CAST(CAST(:"SOURCECHAR10" AS VARCHAR(10)) AS INTEGER)
ELSE 0
END
Here the inner cast is applied to the parameter itself; the outer cast is applied when the CASE expression evaluates to true.
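Applied back to the original IIF form, the same double cast would look like this (an untested sketch, same caveats as above):
INSERT INTO "DEST_TABLE" (......, "NEWINTFLD",.....)
VALUES(..., IIF(:"SOURCECHAR10" SIMILAR TO '[[:DIGIT:]]*', CAST(CAST(:"SOURCECHAR10" AS VARCHAR(10)) AS INTEGER), 0),..)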