How many ways can you generate an error converting varchar to numeric that won't be caught by ISNUMERIC()? - sql

I am in the process of loading a bunch of tables into SQL Server and converting them from varchar to specific data types (int, date, etc.). One frustration is how many different ways there are to break the conversion from string to numeric (int, decimal, etc) and that there is not an easy diagnostic tool to find the offending rows (besides ISNUMERIC() which doesn't work all the time).
Here is my list of ways to break the conversion that won't get caught by ISNUMERIC().
The string contains scientific notation (e.g. 3.55E-10)
The string contains a blank ('')
The string contains a non-alphanumeric symbol ('$', '-', ',')
Here's what I'm currently using to compensate:
SELECT
CASE
WHEN [MyColumn] IN ('','-') THEN NULL -- deals with blanks
WHEN [MyColumn] LIKE '%E%' THEN CONVERT(DECIMAL(20, 4), CONVERT(FLOAT(53), [MyColumn])) -- deals with scientific notation
ELSE CAST(REPLACE(REPLACE([MyColumn] , '$', ''), '-', '') AS DECIMAL(20, 4)) -- deals with special characters
END [MyColumn]
FROM
MyTable
Does anyone else have others? Or good ways to diagnose?

Don't use ISNUMERIC(). If you are on 2012+ then you could use TRY_CAST or TRY_CONVERT.
If you are on older versions, you could use some syntax like this:
SELECT *
FROM #TableA
WHERE ColA NOT LIKE '%[^0-9]%'
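As a rough diagnostic on SQL Server 2012 or later, TRY_CONVERT can list the rows that would break a conversion. This is only a sketch: MyTable and MyColumn are the names used in the question, and DECIMAL(20, 4) is assumed to be the target type.

```sql
-- Rows whose text is present but cannot be converted to the assumed target type.
-- TRY_CONVERT returns NULL instead of raising an error when the conversion fails.
SELECT [MyColumn]
FROM MyTable
WHERE [MyColumn] IS NOT NULL
  AND TRY_CONVERT(DECIMAL(20, 4), [MyColumn]) IS NULL;
```

Running this before the real load shows the offending values directly, so you can decide case by case whether to clean them or reject them.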

You can try NOT LIKE '%[^0-9]%' (the string contains no character other than a digit) instead of ISNUMERIC():
SELECT col, CASE WHEN col NOT LIKE '%[^0-9]%' and col<>''
THEN 1
ELSE 0
END
FROM T

You can use NOT LIKE to exclude anything that isn't a digit... and REPLACE for commas and periods. Naturally, you can add other nested REPLACE functions for values you want to accept.
declare @var varchar(64) = '55,5646'
SELECT
CASE
WHEN replace(replace(@var,'.',''),',','') NOT LIKE '%[^0-9]%'
THEN 1
ELSE 0
END
This allows you to accept decimals for your decimal / numeric / float conversions.
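One caveat: stripping every period and comma also lets through strings such as '1.2.3'. A stricter sketch follows, assuming only plain digits with at most one decimal point should pass; the variable name and sample value are illustrative.

```sql
DECLARE @var varchar(64) = '55.5646';

SELECT CASE
         WHEN @var NOT LIKE '%[^0-9.]%'   -- only digits and periods
          AND @var NOT LIKE '%.%.%'       -- at most one period
          AND @var LIKE '%[0-9]%'         -- at least one digit
         THEN 1
         ELSE 0
       END AS is_plain_decimal;
```

Signs, thousands separators and currency symbols are deliberately rejected here; add them to the character class only if your target conversion accepts them.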

Related

Removing Leading Zeros Part 2

SQL Server 2012
I have 3 columns in my table that will use a function, [usr].[Udf_OverPunch], together with SUBSTRING.
Here is my code:
[usr].[Udf_OverPunch](SUBSTRING(col001, 184, 11)) as REPORTED_GAP_DISCOUNT_PREVIOUS_AMOUNT
This function works appropriately for what I need it to do. It is basically converting symbols or letters to a designated number based on a data dictionary.
The problem I am having is that there are leading zeros. I just asked a question about leading zeros, but that approach won't work with the function columns because the symbols cannot be converted to int.
This is what I am using to get rid of leading zeros (but leave one zero) in my code for the other columns:
cast(cast(SUBSTRING(col001, 217, 6) as int) as varchar(25)) as PREVIOUS_REPORTING_PERIOD
This works well at turning a value of '000000' to just one '0' or a value of '000060' to '60' but will not work with the function because of the symbol or letter (when trying to convert to int).
As I mentioned, I have 3 columns which produce values that look something like this when the function is not being used:
'0000019753{'
'0000019748G'
'0000019763H'
My goal here is to use the function while also removing the leading zeros (unless they are all zeros then keep one zero).
This is what I attempted that isn't working because the value contains a character that isn't an integer:
[usr].[Udf_OverPunch]cast(cast(SUBSTRING(col001, 184, 6) as int) as varchar(25)) as REPORTED_GAP_DISCOUNT_PREVIOUS_AMOUNT,
Please let me know if you have any ideas or need more information. :)
select case when col like '%[^0]%' then substring(col,patindex('%[^0]%',col),len(col)) when col like '%0%' then '0' else col end
from tab
or
select case when col like '%[^0]%' then right(col,len(ltrim(replace(col,'0',' ')))) when col like '%0%' then '0' else col end
from tab
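To see how the PATINDEX variant behaves on values like the ones in the question, here is a small self-contained sketch. The inline VALUES list and the column name raw_value are made up for illustration.

```sql
-- Strip leading zeros but keep a single '0' when the value is all zeros.
SELECT v.raw_value,
       CASE
           WHEN v.raw_value LIKE '%[^0]%'   -- contains something other than '0'
               THEN SUBSTRING(v.raw_value, PATINDEX('%[^0]%', v.raw_value), LEN(v.raw_value))
           WHEN v.raw_value LIKE '%0%'      -- nothing but zeros
               THEN '0'
           ELSE v.raw_value                 -- empty string or NULL: leave as is
       END AS trimmed_value
FROM (VALUES ('0000019753{'), ('0000019748G'), ('000000'), ('000060')) AS v(raw_value);
-- trimmed_value: '19753{', '19748G', '0', '60'
```

Because this works purely on the string, the trailing overpunch character never has to be converted to int. Whether [usr].[Udf_OverPunch] should be applied before or after trimming depends on whether that function expects fixed-width input, which the question does not say.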
I am handling such replacement with T-SQL CLR function that allows replacement using regular expressions. So, the solution will be like this:
[dbo].[fn_Utils_RegexReplace] ([value], '^0{1,}(?=.)', '')
You need to create such a function yourself because T-SQL has no built-in regular expression support.
How to create regex replace function in T-SQL?
For example:
try this,
declare @i varchar(50)='0000019753}'--'0000019753'
select case when ISNUMERIC(@i)=1 then
cast(cast(@i as bigint) as varchar(50)) else @i end
or
[usr].[Udf_OverPunch]( case when ISNUMERIC(col001)=1 then
cast(cast(col001 as bigint) as varchar(50)) else col001 end)

Is it possible to Compare two columns in Microsoft SQL server so that the comparison skips punctuation marks and other character like %, ' etc?

I have two columns having data like below.
Column1
AMC Standard, School
Column2
AMC Standard School.
I need to compare these two columns so that only the words are compared and not any additional characters. In the example above, Column1 and Column2 match, but because of the comma "," and the period "." a simple comparison of Column1 and Column2 reports a mismatch.
You can replace the characters that should not be compared (in your case , and .) with an empty string and then compare the results. Something like this:
SELECT 1 WHERE REPLACE('AMC Standard, School',',','') = REPLACE('AMC Standard School.','.','')
Based on jarlh's comments: you should (if possible) update the columns and remove the punctuation marks if they are not used in any comparison or display.
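Applied to both columns at once, the same idea could look like the sketch below. The inline VALUES row stands in for the real table, whose name the question does not give.

```sql
-- Strip commas and periods from both sides before comparing.
SELECT t.Column1, t.Column2
FROM (VALUES ('AMC Standard, School', 'AMC Standard School.')) AS t(Column1, Column2)
WHERE REPLACE(REPLACE(t.Column1, ',', ''), '.', '')
    = REPLACE(REPLACE(t.Column2, ',', ''), '.', '');
```

Each extra character to ignore needs one more nested REPLACE on both sides, so this stays manageable only for a short list of punctuation marks.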
One option is to use SQL Server's SoundEx() and Difference() functions (https://msdn.microsoft.com/en-us/library/ms187384.aspx and https://msdn.microsoft.com/en-us/library/ms188753.aspx respectively)
DECLARE @val1 varchar(50) = 'AMC Standard, School'
, @val2 varchar(50) = 'AMC Standard School.'
;
SELECT @val1
, @val2
, SoundEx(@val1)
, SoundEx(@val2)
, Difference(@val1, @val2) -- Difference() compares the SoundEx() values of its arguments itself
;
The return value of Difference() is between 0 and 4, with a higher number signifying a closer match.
IMPORTANT NOTE: This type of comparison is not as exacting as a method that cleans up your data beforehand as in those scenarios you can use an exact (a=a) comparison, whereas this method looks for similar values.
Try like this
DECLARE @column1 VARCHAR(100)='AMC Standard, School (Near to ABC Building)'
DECLARE @column2 VARCHAR(100)='AMC Standard, School (Opposite KFC)'
SELECT 'MATCHED' AS COLUMN_COMPARE
WHERE replace(replace(replace(@column1, ',', ''), '.', ''), substring(@column1, CHARINDEX('(', @column1), CHARINDEX(')', @column1) - 1), '') = replace(replace(replace(@column2, ',', ''), '.', ''), substring(@column2, CHARINDEX('(', @column2), CHARINDEX(')', @column2) - 1), '')

How to convert Varchar to Int in sql server 2008?

How do I convert Varchar to Int in SQL Server 2008?
I have the following code, but when I run it, it won't let me convert Varchar to Int.
Select Cast([Column1] as INT)
Column1 is of type Varchar(21) NOT NULL and I want to convert it to Int.
I am actually trying to insert Column1 into another table whose field is INT.
Can someone please help me convert this?
Spaces will not be a problem for cast, however characters like TAB, CR or LF will appear as spaces, will not be trimmed by LTRIM or RTRIM, and will be a problem.
For example try the following:
declare @v1 varchar(21) = '66',
@v2 varchar(21) = ' 66 ',
@v3 varchar(21) = '66' + char(13) + char(10),
@v4 varchar(21) = char(9) + '66'
select cast(@v1 as int) -- ok
select cast(@v2 as int) -- ok
select cast(@v3 as int) -- error
select cast(@v4 as int) -- error
Check your input for these characters and if you find them, use REPLACE to clean up your data.
Per your comment, you can use REPLACE as part of your cast:
select cast(replace(replace(@v3, char(13), ''), char(10), '') as int)
If this is something that will be happening often, it would be better to clean up the data and modify the way the table is populated to remove the CR and LF before it is entered.
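To check whether a column actually contains these invisible characters, a LIKE pattern with a character class can help. This is a sketch only; YourTable is a placeholder name, and Column1 is the column from the question.

```sql
-- Find rows that contain a TAB, LF or CR anywhere in the value.
SELECT [Column1]
FROM YourTable
WHERE [Column1] LIKE '%[' + CHAR(9) + CHAR(10) + CHAR(13) + ']%';
```

If the query returns rows, those are the values to clean with REPLACE before casting.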
you can use convert function :
Select convert(int,[Column1])
That is how you would do it, is it throwing an error? Are you sure the value you are trying to convert is convertible? For obvious reasons you cannot convert abc123 to an int.
UPDATE
Based on your comments I would remove any spaces that are in the values you are trying to convert.
That is the correct way to convert it to an INT as long as you don't have any alpha characters or NULL values.
If you have any NULL values, use
ISNULL(column1, 0)
Try the following code. In most cases, the error is caused by commas in the value.
cast(replace([FIELD NAME],',','') as float)
Try the command below; it casts values that look like integers to INT and returns NULL for the rest:
select case when isnumeric(YourColumn + '.0e0') = 1
then cast(YourColumn as int)
else NULL
end /* case */
from YourTable
There are two conversion functions in SQL Server.
CAST and CONVERT have similar functionality. CONVERT is specific to SQL Server, and allows for a greater breadth of flexibility when converting between date and time values, fractional numbers, and monetary signifiers. CAST is the more ANSI-standard of the two functions.
Using Convert
Select convert(int,[Column1])
Using Cast
Select cast([Column1] as int)
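The extra flexibility of CONVERT mostly comes from its optional third argument, the style code. A brief sketch, assuming SQL Server 2008 or later for the DATE type; the literal values are illustrative.

```sql
SELECT CONVERT(DATE, '31/12/2020', 103)      AS parsed_british_date, -- style 103 = dd/mm/yyyy
       CONVERT(VARCHAR(10), GETDATE(), 120)  AS today_as_iso_text;   -- style 120 = yyyy-mm-dd hh:mi:ss, cut to 10 characters
```

CAST has no equivalent of the style argument, which is the main practical reason to reach for CONVERT.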

SQL IsNumeric not working

The Reserve column is a varchar; to perform sums on it I want to cast it to a decimal.
But the SQL below gives me an error
select
cast(Reserve as decimal)
from MyReserves
Error converting data type varchar to numeric.
I added the isnumeric and not null to try and avoid this error but it still persists, any ideas why?
select
cast(Reserve as decimal)
from MyReserves
where isnumeric(Reserve ) = 1
and Reserve is not null
See here: CAST and IsNumeric
Try this:
WHERE IsNumeric(Reserve + '.0e0') = 1 AND reserve IS NOT NULL
UPDATE
Default of decimal is (18,0), so
declare @i nvarchar(100)='12121212121211212122121'--length is>18
SELECT ISNUMERIC(@i) --gives 1
SELECT CAST(@i as decimal)--throws an error
Gosh, nobody seems to have explained this correctly. SQL is a declarative language. It does not specify the order of operations.
The problem that you are (well, were) having is that the where does not do the filtering before the conversion takes place. Order of operations, though, is guaranteed for a case statement. So, the following will work:
select cast(case when isnumeric(Reserve) = 1 then Reserve end as decimal)
from MyReserves
where isnumeric(Reserve ) = 1 and Reserve is not null
The issue has nothing to do with the particular numeric format you are converting to or with the isnumeric() function. It is simply that the ordering of operations is not guaranteed.
It seems that isnumeric has some Problems:
http://www.sqlhacks.com/Retrieve/Isnumeric-problems
(via internet archive)
According to that link, you can solve it like this:
select
cast(Reserve as decimal)
from MyReserves
where Reserve is not null
and Reserve * 1 = Reserve
Use TRY_CAST (SQL Server 2012 and later):
select
try_cast(Reserve as decimal)
from MyReserves
IsNumeric is a problem child -- SQL 2012 and later has TRY_CAST and TRY_CONVERT
If you're on an earlier version then you can write a function that'll convert to a decimal (or NULL if it won't convert). This uses the XML conversion functions that don't throw errors when the number won't fit ;)
-- Create function to convert a varchar to a decimal (returns null if it fails)
IF EXISTS( SELECT * FROM sys.objects WHERE object_id = OBJECT_ID( N'[dbo].[ToDecimal]' ) AND type IN( N'FN',N'IF',N'TF',N'FS',N'FT' ))
DROP FUNCTION [dbo].[ToDecimal];
GO
CREATE FUNCTION [dbo].[ToDecimal]
(
@Value VARCHAR(MAX)
)
RETURNS DECIMAL(18,8)
AS
BEGIN
-- Uses XML/XPath to convert @Value to Decimal because it returns NULL if it doesn't cast correctly
DECLARE @ValueAsXml XML
SELECT @ValueAsXml = Col FROM (SELECT (SELECT @Value as Value FOR XML RAW, ELEMENTS) AS Col) AS test
DECLARE @Result DECIMAL(38,10)
-- XML/XPath will return NULL if the VARCHAR can't be converted to a DECIMAL(38,10)
SET @Result = @ValueAsXml.value('(/row/Value)[1] cast as xs:decimal?', 'DECIMAL(38,10)')
RETURN CASE -- Check if the number is within the range for a DECIMAL(18,8): 10 integer digits, 8 decimal digits
WHEN @Result >= -9999999999.99999999 AND @Result <= 9999999999.99999999
THEN CONVERT(DECIMAL(18,8),@Result)
ELSE
NULL
END
END
Then just change your query to:
select dbo.ToDecimal(Reserve) from MyReserves
isnumeric is not 100% reliable in SQL - see this question Why does ISNUMERIC('.') return 1?
I would guess that you have value in the reserve column that passes the isnumeric test but will not cast to decimal.
Just a heads up on isnumeric; if the string contains some numbers and an 'E' followed by some numbers, this is viewed as an exponent. Example, select isnumeric('123E0') returns 1.
I had this same problem and it turned out to be scientific notation such as '1.72918E-13' To find this just do where Reserve LIKE '%E%'. Try bypassing these and see if it works. You'll have to write code to convert these to something usable or reformat your source file so it doesn't store any numbers using scientific notation.
IsNumeric is possibly not ideal in your scenario as from the highlighted Note on this MSDN page it says "ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($)."
Also there is a nice article here which further discusses ISNUMERIC.
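A quick way to see this behaviour for yourself is to probe a few suspicious strings side by side. The sample list below is illustrative, and the TRY_CONVERT column assumes SQL Server 2012 or later; drop it on older versions.

```sql
-- Compare what ISNUMERIC claims with what actually converts to the type you need.
SELECT v.sample,
       ISNUMERIC(v.sample)                    AS isnumeric_result,
       TRY_CONVERT(DECIMAL(18, 2), v.sample)  AS as_decimal   -- NULL when the conversion fails
FROM (VALUES ('.'), ('$'), ('-'), ('+'), ('123E0'), ('1,000'), ('12.5')) AS v(sample);
```

Any row where isnumeric_result is 1 but as_decimal is NULL is a value that would pass an ISNUMERIC filter and still break the cast.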
Try (for example):
select
cast(Reserve as decimal(10,2))
from MyReserves
Numeric/Decimal generally want a precision and scale.
I also faced this issue and solved it with the method below. I am sharing it because it may be helpful to someone.
declare @g varchar (50)
set @g=char(10)
select isnumeric(@g),@g, isnumeric(replace(replace(@g,char(13),char(10)),char(10),''))

SQL Server, where field is int?

how can I accomplish:
select * from table where column_value is int
I know I can probably inner join to the system tables and type tables but I'm wondering if there's a more elegant way.
Note that column_value is a varchar that "could" have an int, but not necessarily.
Maybe I can just cast it and trap the error? But again, that seems like a hack.
select * from table
where column_value not like '%[^0-9]%'
If negative ints are allowed, you need something like
where column_value like '[+-]%'
and substring(column_value, 2, len(column_value)) not like '%[^0-9]%'
You need more code if column_value can be an integer that exceeds the limits of the "int" type, and you want to exclude such cases.
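One way that extra code could look for unsigned values is sketched below. YourTable is a placeholder name, and the check assumes only non-negative integers written as plain digits should qualify.

```sql
-- The CASE forces the digit and length checks to run before the BIGINT cast,
-- so the cast never sees a value that could raise an error.
SELECT column_value
FROM YourTable
WHERE CASE
          WHEN column_value NOT LIKE '%[^0-9]%'   -- digits only
           AND column_value <> ''                 -- not empty
           AND LEN(column_value) <= 10            -- short enough to fit in a BIGINT
          THEN CASE WHEN CAST(column_value AS BIGINT) <= 2147483647 THEN 1 ELSE 0 END
          ELSE 0
      END = 1;
```

Putting the checks in a plain WHERE ... AND ... list would not be safe, because the engine is free to evaluate the cast before the pattern filters.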
Here is a version if you want to implement a custom function:
CREATE Function dbo.IsInteger(@Value VARCHAR(18))
RETURNS BIT
AS
BEGIN
RETURN ISNULL(
(SELECT CASE WHEN CHARINDEX('.', @Value) > 0 THEN
CASE WHEN CONVERT(int, PARSENAME(@Value, 1)) <> 0 THEN 0 ELSE 1 END
ELSE 1
END
WHERE ISNUMERIC(@Value + 'e0') = 1), 0)
END
ISNUMERIC returns 1 when the input expression evaluates to a valid integer, floating point number, money or decimal type; otherwise it returns 0. A return value of 1 guarantees that expression can be converted to one of these numeric types.
I would do a UDF as Svetlozar Angelov suggests, but I would check for ISNUMERIC first (and return 0 if not), and then check for column_value % 1 = 0 to see if it's an integer.
Here's what the body might look like. You have to put the modulo logic in a separate branch because it will throw an exception if the value isn't numeric.
DECLARE @RV BIT
IF ISNUMERIC(@value) = 1 BEGIN
IF CAST(@value AS NUMERIC) % 1 = 0 SET @RV = 1
ELSE SET @RV = 0
END
ELSE SET @RV = 0
RETURN @RV
This should handle all cases without throwing any exceptions:
--This handles dollar-signs, commas, decimal-points, and values too big or small,
-- all while safely returning an int.
DECLARE @IntString as VarChar(50) = '$1,000.'
SELECT CAST((CASE WHEN --This IsNumeric check here does most of the heavy lifting. The rest is Integer-Specific
ISNUMERIC(@IntString) = 1
--Only allow Int-related characters. This will exclude things like 'e' and other foreign currency characters.
AND @IntString NOT LIKE '%[^ $,.\-+0-9]%' ESCAPE '\'--'
--Checks that the value is not out of bounds for an Integer.
AND CAST(REPLACE(REPLACE(@IntString,'$',''),',','') as Decimal(38)) BETWEEN -2147483648 AND 2147483647
--This allows values with decimal-points for count as an Int, so long as there it is not a fractional value.
AND CAST(REPLACE(REPLACE(@IntString,'$',''),',','') as Decimal(38)) = CAST(REPLACE(REPLACE(@IntString,'$',''),',','') as Decimal(38,2))
--This will safely convert values with decimal points to casting later as an Int.
THEN CAST(REPLACE(REPLACE(@IntString,'$',''),',','') as Decimal(10))
END) as Int)[Integer]
Throw this into a Scalar UDF and call it ReturnInt().
If the value comes back as NULL, then it's not an int (so there's your IsInteger() requirement)
If you don't like typing "WHERE ReturnInt(SomeValue) IS NOT NULL", you could throw it into another scalar UDF called IsInt() to call this function and simply return "ReturnInt(SomeValue) IS NOT NULL".
The cool thing is, the UDF can serve double duty by returning the "safely" converted int value.
Just because something can be an int doesn't mean casting it as an int won't throw a huge exception. This takes care of that for you.
Also, I'd avoid the other solutions because this universal approach will handle commas, decimals, dollar signs, and checks the acceptable Int value's range while the other solutions do not - or they require multiple SET operations that prevent you from using the logic in a Scalar-Function for maximum performance.
See the examples below and test them against my code and others:
--Proves that appending "e0" or ".0e0" is NOT a good idea.
select ISNUMERIC('$1' + 'e0')--Returns: 0.
select ISNUMERIC('1,000' + 'e0')--Returns: 0.
select ISNUMERIC('1.0' + '.0e0')--Returns: 0.
--While these are numeric, they WILL break your code
-- if you try to cast them directly as int.
select ISNUMERIC('1,000')--Returns: 1.
select CAST('1,000' as Int)--Will throw exception.
select ISNUMERIC('$1')--Returns: 1.
select CAST('$1' as Int)--Will throw exception.
select ISNUMERIC('10.0')--Returns: 1.
select CAST('10.0' as Int)--Will throw exception.
select ISNUMERIC('9999999999223372036854775807')--Returns: 1. This is why I use Decimal(38) as Decimal defaults to Decimal(18).
select CAST('9999999999223372036854775807' as Int)--Will throw exception.
Update:
I read a comment here that you want to be able to parse a value like '123.' into an Integer. I have updated my code to handle this as well.
Note: This converts "1.0", but returns null on "1.9".
If you want to allow for rounding, then tweak the logic in the "THEN" clause to add Round() like so:
ROUND(CAST(REPLACE(REPLACE(@IntString,'$',''),',','') as Decimal(10)), 0)
You must also remove the "AND" that checks for "decimal-points" to allow for Rounding or Truncation.
Why not use the following and test for 1?
DECLARE @TestValue nvarchar(MAX)
SET @TestValue = '1.04343234e5'
SELECT CASE WHEN ISNUMERIC(@TestValue) = 1
THEN CASE WHEN ROUND(@TestValue,0,1) = @TestValue
THEN 1
ELSE 0
END
ELSE null
END AS Analysis
If you are purely looking to verify a string is all digits and not just CAST-able to INT you can do this terrible, terrible thing:
select LEN(
REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE( REPLACE(
'-1.223344556677889900e-1'
,'0','') ,'1','') ,'2','') ,'3','') ,'4','') ,'5','') ,'6','') ,'7','') ,'8','') ,'9','')
)
It returns 0 when the string was empty or pure digits.
To make it a useful check for "poor-man's" Integer you'd have to deal with empty string, and an initial negative sign. And manually make sure it isn't too long for your variety of INTEGER.