How to select node from potentially not well-formed xml as a varchar?

I have a 'message' column, a varchar that should contain XML, but some of the values may not be well-formed or valid. I am trying to pick out the rows that have a given input value in a node, like this:
Select * from messagelog where message like '%1234567%'
But when I filter those to try and lift another node (1234567) whose value I do not know, I run into a problem.
Casting every entry to xml won't work, since about 1% of the messages are not valid.
This code doesn't parse the varchar into xml, but returns a substring if it exists. However, I get a conversion error on the charindex = 0 case. Some MessageIds are these large varchars.
Is there anything that I'm missing here? Am I SOL for using SQL to parse not well-formed XML varchars?
select case when CAST(charindex('<RelatesToMessageID>', message) as varchar(100)) = 0
then 1
else substring(message, charindex('<RelatesToMessageID>', message)+20, charindex('</RelatesToMessageID>', message)-charindex('<RelatesToMessageID>', message)-20)
end
from messagelog
Conversion failed when converting the varchar value '959B91D824324108948261EC2A81CD92' to data type int.

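Your CASE is returning both a VARCHAR and an INT. Because INT has the higher data type precedence, SQL Server tries to convert the substring to INT and fails. Change your then 1 to then '1' so that both branches of the CASE return a VARCHAR.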
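A minimal sketch of the corrected CASE, assuming SQL Server. The table variable @messagelog and its two sample rows are hypothetical stand-ins for the real messagelog table; only the '1' literal differs from the logic in the question.

```sql
-- Hypothetical stand-in for the messagelog table from the question.
DECLARE @messagelog TABLE (message varchar(max));

INSERT INTO @messagelog (message) VALUES
    ('<Msg><RelatesToMessageID>959B91D824324108948261EC2A81CD92</RelatesToMessageID></Msg>'),
    ('<Msg><Broken>');  -- not well-formed, and the tag is absent

SELECT CASE
         WHEN CHARINDEX('<RelatesToMessageID>', message) = 0
           THEN '1'  -- string literal, so every branch is a varchar
         ELSE SUBSTRING(message,
                        CHARINDEX('<RelatesToMessageID>', message) + 20,
                        CHARINDEX('</RelatesToMessageID>', message)
                          - CHARINDEX('<RelatesToMessageID>', message) - 20)
       END AS RelatesToMessageID
FROM @messagelog;
```

Note that a row with an opening tag but no closing tag would still hand SUBSTRING a negative length, so real data may need an extra WHEN branch for that case.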

I found that I could select the substring only from rows that contain the given NCPDPID. This gets rid of the CASE altogether.
if exists(Select * from messagelog where message like '%<NCPDPID>1234567</NCPDPID>%')
select substring(message, charindex('<MessageID>', message)+11, charindex('</MessageID>', message)-charindex('<MessageID>', message)-11) from messagelog where message like '%<NCPDPID>1234567</NCPDPID>%'


Select case returning an error when both elements not varchar in some cases

I wanted to return a value formatted with commas at every thousand if a number or just the value if it wasn't a number
I used the following statement which returned the error:
Conversion failed when converting the nvarchar value '1,000' to data type int.
Declare @QuantityToDelete int = 1000
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=1
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE @QuantityToDelete
END [Result]
I can get it to work by using the following
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=1
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE cast(@QuantityToDelete as varchar)
END [Result]
Why doesn't the first example work when the ELSE @QuantityToDelete part of the statement isn't returned?
If I use the below, switching the logic condition
SELECT CASE
WHEN ISNUMERIC(@QuantityToDelete)=0
THEN format(cast(@QuantityToDelete as int),'N0')
ELSE @QuantityToDelete
END [Result]
I get the result I expect and no error, even though the case statement still has the same mismatched return types (an nvarchar and an int) as the first example, just with different logic. Why?
The important point to note is that a case expression returns a single scalar value, and that value has a single data type.
A case expression is fixed: it must evaluate the same way for that query at runtime no matter what data flows through the query. In other words, the result of the case expression cannot be an int for some rows and a string for others.
Remember that the result of a query can be thought of, and used as, a table. Just as with a table, where you define a column as being a specific data type, you cannot have a column whose data type differs from row to row.
Therefore with a case expression, SQL Server must determine at compile time what the resulting data type will be, which it does (if necessary) using data type precedence. If the case expression has different data types returned in different execution paths then it will attempt to implicitly cast them to the type with the highest precedence.
Hence your case expression that attempts to return two different data types fails because it's trying to return both a nvarchar and int and SQL Server is implicitly casting the nvarchar value to an int - and failing.
The second one works because you are controlling the casting and both paths result in the same varchar data type which works fine.
Also note that when defining a varchar it's good practice to specify its length. It's easy to get complacent because it works here: the default length is 30 when casting, but it is 1 when declaring a variable or column.
See the relevant part of the documentation
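A minimal sketch of keeping both branches in the same type, assuming SQL Server 2012 or later (for FORMAT and TRY_CONVERT). The varchar variable @Input is a hypothetical variation on the question, where the variable was declared as int.

```sql
-- Hypothetical input; try '1000' and then 'abc' to exercise both branches.
DECLARE @Input varchar(20) = '1000';

SELECT CASE
         WHEN TRY_CONVERT(int, @Input) IS NOT NULL
           THEN FORMAT(TRY_CONVERT(int, @Input), 'N0')  -- FORMAT returns nvarchar
         ELSE CAST(@Input AS nvarchar(20))              -- explicit cast with an explicit length
       END AS [Result];
```

Because both branches are nvarchar, data type precedence never has to convert the formatted string back to int.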

Converting nvarchar to int, converting phone with symbols with only numbers

I am trying to convert phone number from the column 'phone' from the table 'Clients'. I have tried the following syntaxes, but I still get error messages -
1. SELECT CAST(phone as int)
FROM Clients
Error: Conversion failed when converting the nvarchar value '030-3456789' to data type int
2. SELECT CONVERT(int, phone)
FROM Clients
Conversion failed when converting the nvarchar value '030-3456789' to data type int.
FROM Clients
The query doesn't return error but there is no result, the column is empty.
It looks (from your example syntax) like you might be using SQL Server.
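If that's the case and it's 2017+, you can do the following, which copes with any combination of non-numeric values.
Based on your comments the following should work. Note that the second and third arguments of Translate must be the same length, so the three characters '()-' are mapped to three spaces:
select Try_Convert(bigint, Replace(Translate('(5) 789-0123','()-','   '),' ',''))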
Result: 57890123
If you are using SQL Server 2016 or earlier you have to nest multiple replacements:
select Try_Convert(bigint, Replace(Replace(Replace(Replace('(5) 789-0123','-',''),'(',''),')',''),' ',''))
The error occurs because at least some of your records cannot be converted to a numeric type as they are, such as the one indicated, 030-3456789.
You basically need to remove the dash in between:
SELECT cast(replace('12-3', '-', '') as int)

Return null value or numeric value of 2

I'm needing to return values in SQL query that are either null or 2 for broker reason codes. I've tried using a.BROKER_REASON in (2,null), but it only pulls back 2's. I've tried using "a.BROKER_REASON is null or a.BROKER_REASON = 2" and get error msg "Conversion failed when converting the varchar value '+MULTI' to data type int." Is there an easy way to return rows with null values or values of 2?
That error suggests the stored value isn't an actual number but rather a short string, so comparing it with the integer 2 forces a conversion of every value, including '+MULTI', to int. Compare against a string instead:
a.BROKER_REASON is null or a.BROKER_REASON = '2'
If that doesn't match, the stored values may contain extra spaces, which can happen depending on the storage engine or table definition. In that case you can use LTRIM or LEFT (depending on which SQL database you use) or an equivalent to trim off the excess spaces.
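A minimal sketch of the string comparison, assuming SQL Server. The table variable @Claims and its sample values are hypothetical; only the column name BROKER_REASON and the value '+MULTI' come from the question.

```sql
-- Hypothetical sample data; the real column is assumed to be a varchar.
DECLARE @Claims TABLE (BROKER_REASON varchar(10) NULL);

INSERT INTO @Claims (BROKER_REASON) VALUES ('2'), (NULL), ('+MULTI'), ('7');

SELECT a.BROKER_REASON
FROM @Claims AS a
WHERE a.BROKER_REASON IS NULL      -- NULL must be tested with IS NULL, not IN (...)
   OR a.BROKER_REASON = '2';       -- string literal, so no conversion to int happens
```

This also shows why IN (2, null) only returned the 2s: a comparison with NULL is never true, so NULL rows can only be picked up with IS NULL.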

sql convert error on view tables

SELECT logicalTime, traceValue, unitType, entName
FROM vwSimProjAgentTrace
WHERE valueType = 10
AND agentName ='AtisMesafesi'
AND ( entName = 'Hawk-1')
AND simName IN ('TipSenaryo1_0')
AND logicalTime IN (
SELECT logicalTime
FROM vwSimProjAgentTrace
WHERE valueType = 10 AND agentName ='AtisIrtifasi'
AND ( entName = 'Hawk-1')
AND simName IN ('TipSenaryo1_0')
AND CONVERT(FLOAT , traceValue) > 123
) ORDER BY simName, logicalTime
This is my SQL command; the table is a view.
Each time I add the CONVERT(FLOAT, ...) part, I get this error:
Msg 8114, Level 16, State 5, Line 1
Error converting data type nvarchar to float.
One (or more) of the rows has data in the traceValue field that cannot be converted to a float.
Make sure you've used the right combination of dots and commas to signal floating point values, as well as making sure you don't have pure invalid data (text for instance) in that field.
You can try this SQL to find the invalid rows, but there might be cases it won't handle:
SELECT * FROM vwSimProjAgentTrace WHERE ISNUMERIC(traceValue) = 0
You can find the documentation of ISNUMERIC here.
If you look in BoL (books online) at the convert command, you see that a nvarchar conversion to float is an implicit conversion. This means that only "float"-able values can be converted into a float. So, every numeric value (that is within the float range) can be converted. A non-numeric value can not be converted, which is quite logical.
Probably you have some non numeric values in your column. You might see them when you run your query without the convert. Look for something like comma vs dot. In a test scenario a comma instead of a dot gave me some problems.
For an example of isnumeric, look at this sqlfiddle
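As an alternative sketch, assuming SQL Server 2012 or later, TRY_CONVERT can locate the offending rows directly, since ISNUMERIC can return 1 for strings that still fail a float conversion. The table variable @Trace and its sample values are hypothetical stand-ins for vwSimProjAgentTrace.

```sql
-- Hypothetical sample data standing in for the view's traceValue column.
DECLARE @Trace TABLE (traceValue nvarchar(50) NULL);

INSERT INTO @Trace (traceValue) VALUES (N'123.5'), (N'123,5'), (N'abc'), (NULL);

-- Rows that hold a value but cannot be converted to float.
SELECT traceValue
FROM @Trace
WHERE traceValue IS NOT NULL
  AND TRY_CONVERT(float, traceValue) IS NULL;
```

The same TRY_CONVERT(float, traceValue) > 123 expression could replace the CONVERT in the original subquery, because unconvertible values become NULL instead of raising an error.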

How does one filter based on whether a field can be converted to a numeric?

I've got a report that has been in use quite a while - in fact, the company's invoice system rests in a large part upon this report (Disclaimer: I didn't write it). The filtering is based upon whether a field of type VarChar(50) falls between two numeric values passed in by the user.
The problem is that the field the data is being filtered on now not only has simple non-numeric values such as '/A', 'TEST' and a slew of other non-numeric data, but also has numeric values that seem to be defying any type of numeric conversion I can think of.
The following (simplified) test query demonstrates the failure:
Declare @StartSummary Int,
@EndSummary Int
Select @StartSummary = 166285,
@EndSummary = 166289
Select SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And IsNumeric(SummaryInvoice) = 1
And Convert(int, SummaryInvoice) Between @StartSummary And @EndSummary
I've also attempted conversions using bigint, real and float and all give me similar errors:
Msg 8115, Level 16, State 2, Line 7
Arithmetic overflow error converting expression to data type int.
I've tried other larger numeric datatypes such as BigInt with the same error. I've also tried using sub-queries to sidestep the conversion issue by only extracting fields that have numeric data and then converting those in the wrapper query, but then I get other errors which are all variations on a theme indicating that the value stored in the SummaryInvoice field can't be converted to the relevant data type.
Short of extracting only those records with numeric SummaryInvoice fields to a temporary table and then querying against the temporary table, is there any one-step solution that would solve this problem?
Edit: Here's the field data that I suspect is causing the problem:
IsNumeric states that this field is numeric - which it is. But attempting to convert it to BigInt causes an arithmetic overflow. Any ideas? It doesn't appear to be an isolated incident, there seems to have been a number of records populated with data that causes this issue.
It seems that you are going to have problems with the ISNUMERIC function, since it returns 1 if the value can be cast to any numeric type (including ., ,, e0, etc). If you have numbers larger than 2^63-1 (the bigint maximum), you can use DECIMAL or NUMERIC. I'm not sure how far PATINDEX goes as a regex substitute, but its wildcard patterns can test whether SummaryInvoice contains any non-digit character, so you should try this:
SELECT SummaryInvoice
FROM Invoice
WHERE ISNULL(SummaryInvoice, '') <> ''
AND CASE WHEN PATINDEX('%[^0-9]%',SummaryInvoice) = 0 THEN CONVERT(DECIMAL(30,0), SummaryInvoice) ELSE -1 END
BETWEEN @StartSummary And @EndSummary
You can't guarantee what order the WHERE clause filters will be applied.
One ugly option to decouple inner and outer.
Select SummaryInvoice
From (
Select TOP 2000000000 SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And IsNumeric(SummaryInvoice) = 1
ORDER BY SummaryInvoice
) foo
Where Convert(int, SummaryInvoice) Between @StartSummary And @EndSummary
Another using CASE
Select SummaryInvoice
From Invoice
Where IsNull(SummaryInvoice, '') <> ''
And CASE WHEN IsNumeric(SummaryInvoice) = 1 THEN Convert(int, SummaryInvoice) ELSE -1 END
Between @StartSummary And @EndSummary
Edit: after question update
use decimal(38,0) not int
Change ISNUMERIC(SummaryInvoice) to ISNUMERIC(SummaryInvoice + '0e0')
Combining the conditions with AND, as in IsNumeric(SummaryInvoice) = 1 AND ..., will not short circuit in SQL Server.
But maybe you can use
AND (CASE WHEN IsNumeric(SummaryInvoice) = 1 THEN Convert(int, SummaryInvoice) ELSE 0 END)
Between @StartSummary And @EndSummary
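On SQL Server 2012 or later, a one-step filter can also be sketched with TRY_CONVERT, which returns NULL instead of raising an error, so the evaluation order of the WHERE predicates no longer matters. This is an assumption about the available version, not something the question states; the table variable @Invoice and its sample rows are hypothetical.

```sql
DECLARE @StartSummary int = 166285,
        @EndSummary   int = 166289;

-- Hypothetical sample data standing in for the Invoice table.
DECLARE @Invoice TABLE (SummaryInvoice varchar(50) NULL);

INSERT INTO @Invoice (SummaryInvoice) VALUES
    ('166286'), ('/A'), ('TEST'), ('11111111111111111111111111'), (NULL);

-- Unconvertible values become NULL, and NULL never satisfies BETWEEN.
-- decimal(38,0) holds the 26-digit value that overflows int and bigint.
SELECT SummaryInvoice
FROM @Invoice
WHERE TRY_CONVERT(decimal(38,0), SummaryInvoice)
      BETWEEN @StartSummary AND @EndSummary;  -- only '166286' qualifies
```

The conversion still runs for every row, so this is a workaround for the query, not a substitute for fixing the column's data type.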
Your first issue is to fix your database structure so bad data cannot get into the field. You are putting a band-aid on a wound that needs stitches and wondering why it doesn't heal.
Database refactoring is not fun, but it needs to be done when there is a data integrity problem. I assume you aren't really invoicing someone for 11,111,111,111,111,111,111,111,111 or 'test'. So don't allow those values to ever get entered (if you can't change the structure to the correct data type, consider a trigger to prevent bad data from going in) and delete the ones you do have that are bad.