SQL Server - simple select and conversion between int and string - sql

I have a simple select statement like this:
SELECT [dok__Dokument].[dok_Id],
[dok__Dokument].[dok_WartUsNetto],
[dok__Dokument].[dok_WartUsBrutto],
[dok__Dokument].[dok_WartTwNetto],
[dok__Dokument].[dok_WartTwBrutto],
[dok__Dokument].[dok_WartNetto],
[dok__Dokument].[dok_WartVat],
[dok__Dokument].[dok_WartBrutto],
[dok__Dokument].[dok_KwWartosc]
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = 2753
AND [dok_PlatnikId] = 174
AND [dok_OdbiorcaId] = 174
AND [dok_PlatnikAdreshId] = 625
AND [dok_OdbiorcaAdreshId] = 624
Column dok_NrPelnyOryg is of type varchar(30), and not null.
The table contained both integer and string values in this column and this select statement was fired millions of times.
However recently this started crashing with message:
Conversion failed when converting the varchar value 'garbi czerwiec B' to data type int.
Little explanation: the table contains multiple "document" records and the mentioned column contains document original number (which comes from multiple different sources).
I know I can fix this by adding '' around the number, but I'm looking for an explanation of why this used to work and why it now crashes even though nothing was changed.

It's possible that a plan change (due to changed statistics, recompile etc) led to this data being evaluated earlier (full scan for example), or that this particular data was not in the table previously (maybe before this started happening, there wasn't bad data in there). If it is supposed to be a number, then make it a numeric column. If it needs to allow strings as well, then stop treating it like a number. If you properly parameterize your statements and always pass a varchar you shouldn't need to worry about whether the value is enclosed in single quotes.

All those equality comparison operations are subject to the Data Type Precedence rules of SQL Server:
When an operator combines two
expressions of different data types,
the rules for data type precedence
specify that the data type with the
lower precedence is converted to the
data type with the higher precedence.
Since character types have lower precedence than int types, the query is basically the same as:
SELECT ...
FROM [dok__Dokument]
WHERE cast([dok_NrPelnyOryg] as int) = 2753
...
This has two effects:
it makes all indexes on columns involved in the WHERE clause useless
it can cause conversion errors.
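A minimal sketch of the conversion problem, using a hypothetical table variable @Docs in place of the real table (the two sample values come from the question). The last statement is expected to fail with the conversion error quoted above:

```sql
-- Hypothetical stand-in for [dok__Dokument]; only the relevant column is modelled.
DECLARE @Docs TABLE (dok_NrPelnyOryg varchar(30) NOT NULL);
INSERT INTO @Docs (dok_NrPelnyOryg) VALUES ('2753'), ('garbi czerwiec B');

-- String literal: both sides are character data, so no conversion takes place.
SELECT dok_NrPelnyOryg FROM @Docs WHERE dok_NrPelnyOryg = '2753';

-- Integer literal: the column is implicitly converted to int for every row the
-- engine evaluates, so reaching 'garbi czerwiec B' raises a conversion error.
SELECT dok_NrPelnyOryg FROM @Docs WHERE dok_NrPelnyOryg = 2753;
```

Whether the failing row is actually evaluated in a real query depends on the plan and on the other predicates, which is why the same statement can appear to work for a long time.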
You're not the first to have this problem, in fact several CSS cases I faced had me eventually write an article about this: On SQL Server boolean operator short-circuit.
The correct solution to your problem is that if the field value is numeric then the column type should be numeric. Since you say that the data comes from a 3rd party application you cannot change, the best solution is to abandon the vendor of this application and pick one that knows what it is doing. Short of that, you need to search for character values on character columns:
SELECT ...
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = '2753'
...
In ADO.NET terms this means you use a SqlCommand like this:
SqlCommand cmd = new SqlCommand(@" SELECT ...
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] = @nrPelnyOryg
... ");
// Pass the value as VARCHAR(30) so it matches the column type exactly
cmd.Parameters.Add("@nrPelnyOryg", SqlDbType.VarChar, 30).Value = "2754";
...
Just make sure you don't fall into the easy trap of passing in an NVARCHAR (Unicode) parameter for comparison with a VARCHAR column, since the same data type precedence rules quoted before will coerce the comparison to occur on the NVARCHAR type, thus rendering indexes, again, useless. The easiest way to fall into this trap is to use the dreaded AddWithValue and pass in a string value.
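The difference between the two parameter types can be sketched in plain T-SQL with sp_executesql. This is only an illustration: the nvarchar(4) declaration is an assumption about roughly what a client sends when a .NET string is passed without an explicit type, and how much the conversion hurts index usage can depend on the column's collation.

```sql
-- Parameter declared as varchar(30): same type as the column, no implicit conversion.
EXEC sp_executesql
    N'SELECT [dok_Id] FROM [dok__Dokument] WHERE [dok_NrPelnyOryg] = @nrPelnyOryg',
    N'@nrPelnyOryg varchar(30)',
    @nrPelnyOryg = '2753';

-- Parameter declared as nvarchar (assumed shape of an untyped string parameter):
-- the varchar column is the side that gets converted, which may show up as
-- CONVERT_IMPLICIT in the execution plan.
EXEC sp_executesql
    N'SELECT [dok_Id] FROM [dok__Dokument] WHERE [dok_NrPelnyOryg] = @nrPelnyOryg',
    N'@nrPelnyOryg nvarchar(4)',
    @nrPelnyOryg = N'2753';
```

Comparing the actual execution plans of the two calls is the simplest way to see whether the conversion matters on a given table.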

Your query stopped working because someone inserted a text string into the field you are querying with an INT value. Up until that time it was possible to implicitly convert the data, but now that's no longer the case.
I'd go check your data and, more importantly, the model; as Aaron said, do you need to allow strings in that field? If not, change the data type to prevent this happening in the future.
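One possible way to check the data, assuming valid document numbers consist of digits only, is a pattern match that lists every row containing any other character:

```sql
-- Rows whose document number contains anything other than the digits 0-9.
-- Empty strings are not matched by this pattern and would need a separate check.
SELECT [dok_Id], [dok_NrPelnyOryg]
FROM [dok__Dokument]
WHERE [dok_NrPelnyOryg] LIKE '%[^0-9]%';
```

If this returns rows that are legitimate (as 'garbi czerwiec B' may well be), the column really is a string column and should be queried as one.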

Related

I have an issue trying to UNION All in SQL Server 2008

I am having to create a second header line and am using the first record of the Query to do this. I am using a UNION All to create this header record and the second part of the UNION to extract the Data required.
I have one issue on one column.
,'Active Energy kWh'
UNION ALL
,SUM(cast(invc.UNITS as Decimal (15,0)))
There are 11 lines on each side of the UNION, and I have tried all sorts of combinations, but it always results in an error message.
The above gives me "Error converting data type varchar to numeric."
Any help would be much appreciated.
The error message indicates that one of the values in the UNITS column of the INVC table is non-numeric. I would hazard a guess that it's a string (VARCHAR or similar) column, and one of the values has ended up in a state where it cannot be parsed.
Unfortunately there is no way other than checking small ranges of the table to gradually locate the 'bad' row (i.e. try running the query for a few million rows at a time, then reducing the number until you home in on the bad data). SQL Server 2012 and later have the TRY_CONVERT function, which returns NULL instead of raising an error when a conversion fails, enabling a more direct check, but on 2008 you would need to restore the database to another system to use it. (I'm assuming that an upgrade for this feature is out of the question, so your best bet is likely just looking for the bad row.)
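As a possible shortcut to bisecting the table, and assuming UNITS really is a character column, two hedged sketches follow. The table name INVC is taken from the answer above and may differ in the real schema; TRY_CONVERT is available from SQL Server 2012 onwards.

```sql
-- Works on SQL Server 2008: flags values containing characters other than digits.
-- It is deliberately strict, so values with a sign, decimal point or spaces are
-- also listed and need to be judged by eye.
SELECT invc.UNITS
FROM INVC AS invc            -- table name assumed from the answer above
WHERE invc.UNITS LIKE '%[^0-9]%';

-- SQL Server 2012 and later: TRY_CONVERT returns NULL when the conversion fails.
SELECT invc.UNITS
FROM INVC AS invc
WHERE invc.UNITS IS NOT NULL
  AND TRY_CONVERT(decimal(15, 0), invc.UNITS) IS NULL;
```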
The problem is that you are trying to mix header information with data information in a single query.
Obviously, all your header columns will be strings. But not all your data columns will be strings, and SQL Server is unhappy when you mix data types this way.
What you are doing is equivalent to this:
select 'header1' as col1 -- string
union all
select 123.5 -- decimal
The above query produces the following error:
Error converting data type varchar to numeric.
...which makes sense, because you are trying to mix both a string (the header) with a decimal field.
So you have 2 options:
Remove the header columns from your query, and deal with header information outside your query.
Accept the fact that you'll need to convert the data type of every column to a string type. So when you have numeric data, you'll need to cast the column to varchar(n) explicitly.
In your case, it would mean adding the cast like this:
,'Active Energy kWh'
UNION ALL
,CAST(SUM(cast(invc.UNITS as Decimal (15,0))) AS VARCHAR(50)) -- Change 50 to appropriate value for your case
EDIT: Based on comment feedback, changed the cast to varchar to have an explicit length (varchar(n)) to avoid relying on the default length, which may or may not be long enough. OP knows the data, so OP needs to pick the right length.
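Put together, option 2 might look like the following self-contained sketch. The table variable @Invc, its sample values, and the sort_key column are assumptions added for illustration; sort_key is only there to keep the header row first, since UNION ALL on its own does not guarantee row order.

```sql
-- Hypothetical sample data standing in for the invoice table.
DECLARE @Invc TABLE (UNITS varchar(20) NOT NULL);
INSERT INTO @Invc (UNITS) VALUES ('100'), ('250');

SELECT col1
FROM (
    -- Header row; sort_key 0 keeps it first.
    SELECT 0 AS sort_key, CAST('Active Energy kWh' AS varchar(50)) AS col1
    UNION ALL
    -- Data row, cast to the same string type as the header.
    SELECT 1, CAST(SUM(CAST(UNITS AS decimal(15, 0))) AS varchar(50))
    FROM @Invc
) AS t
ORDER BY sort_key;
```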

Why is SQL Server giving me wrong output?

Accidentally I noticed a bug-like behaviour in SSMS. I was querying a table named Candidate with the query below.
select CandidateId, CandidateName from Candidate
where CandidateId='73415005-77C6-4D4B-9947-02D6B148E03F2'
I was copy-pasting the CandidateId, which is a uniqueidentifier, but somehow I added a two (2) at the end. The candidate id I actually meant to query was '73415005-77C6-4D4B-9947-02D6B148E03F' and there is no candidate with CandidateId 73415005-77C6-4D4B-9947-02D6B148E03F2 (that is not even a GUID, I suppose).
But still, I was getting the result back.
You can see in the query and the result, the CandidateId's are different. Why is it happening so? Anyone please explain.
The top-level description is that the string is being converted to a unique identifier, so the last digit is ignored.
This logic is documented. First, uniqueidentifier has a slightly higher data type precedence than the string types. The relevant part of the documentation:
uniqueidentifier
nvarchar (including nvarchar(max) )
nchar
varchar (including varchar(max) )
char
This is why the conversion is to uniqueidentifier rather than to a string.
Second, this is a case where SQL Server does "silent conversion". That is, it converts the first 36 characters and doesn't generate an error for longer strings. This is also documented:
The following example demonstrates the truncation of data when the
value is too long for the data type being converted to. Because the
uniqueidentifier type is limited to 36 characters, the characters that
exceed that length are truncated.
So, the behavior that you see is not a bug. It is documented behavior, combining two different aspects of documented SQL Server functionality.
Because your column CandidateId is of type uniqueidentifier, the right-hand (string) side of the condition gets converted to the uniqueidentifier data type and truncated. You can see this in your execution plan: there will be a Scalar Operator(CONVERT_IMPLICIT(uniqueidentifier,[@1],0)) in your index seek/scan operator.
That's because you probably have a CONVERT_IMPLICIT in your execution plan and SQL Server converted '73415005-77C6-4D4B-9947-02D6B148E03F2' into a GUID.
SQL Server truncates data when the value is too long for the data type being converted to.
Since you are comparing a uniqueidentifier field with a text value, SQL Server converts the text to uniqueidentifier. It is not a bug.
Ex:
select Cast('73415005-77C6-4D4B-9947-02D6B148E03F2' as uniqueidentifier)
Result :
73415005-77C6-4D4B-9947-02D6B148E03F
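If the silent truncation is unwanted, one possible guard is to check the length of the incoming string before comparing. This is only a sketch: the variable @input is hypothetical, and the check does not protect against strings that are not valid GUIDs at all, which can still raise a conversion error.

```sql
DECLARE @input varchar(50) = '73415005-77C6-4D4B-9947-02D6B148E03F2';

-- Only compare when the input has exactly the 36 characters of a GUID string;
-- a 37-character value like the one above then matches nothing.
SELECT CandidateId, CandidateName
FROM Candidate
WHERE LEN(@input) = 36
  AND CandidateId = @input;
```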

PROC SQL error: "ERROR: Expression using equals (=) has components that are of different data types."

I am trying to subset my data with PROC SQL, and it is giving me an error when I use my variable TNM_CLIN_STAGE_GROUP. Example below:
PROC SQL;
create table subset as
select ncdb.*
from ncdb
where YEAR_OF_DIAGNOSIS>2002
AND SEX = 2
AND LATERALITY IN (1,2,3)
AND HISTOLOGY = 8500
AND TNM_CLIN_STAGE_GROUP = 1;
quit;
ERROR: Expression using equals (=) has components that are of different data types.
When I run the same code, but take out the variable TNM_CLIN_STAGE_GROUP, the code works. Anyone know what the problem with that variable's name is?
That error indicates a difference in type. SAS has only two types, numeric and character, so the variable is probably character; verify the specific values, but in general it likely needs quotations (single or double, doesn't matter in this case).
If it is not a hardcoded value, but a value of another variable, you can use PUT to convert to character or INPUT to convert to numeric, whichever is easier to convert based on the data.
SAS in a data step will happily convert this for you, but in SQL and SQL-like (WHERE statements) it does not automatically convert character to numeric and vice versa; you must provide the correct type.
Before doing equality, check what you are trying to compare.
Check the structure of your ncdb table, in particular the type of the TNM_CLIN_STAGE_GROUP field.
There you will see the real type. If it's a character type, you need to use single quotes, as @JChao suggested in his comment.
If it's another type, you need to adapt the comparison value or use a conversion if you have no choice.

Should I tell NHibernate/FNH to explicitly use a string data type for a param mapped to a string column?

A cohort of mine is building a somewhat long search query based on various input from the user. We've got NHibernate mapped up using Fluent NHibernate, and aside from some noob mistakes, all seems to be going well.
One issue we can't resolve in the case of this search is that for a particular parameter, NHibernate is creating sql that treats the input as int when we explicitly need it to be a string. We have a string property mapped to an nvarchar(255) column which mostly contains integer numbers, excluding some arbitrary inputs like "temporary" or long numbers like 4444444444444444 which is beyond the int limit.
In the course of testing, I've seen a couple of things: 1) If I prepend a 0 to the incoming value, NH generates the sql param as a string, appropriately so; 2) If the value can realistically be converted to an int, the resulting sql treats it as one. In case #2, if I run the generated sql directly through SQL Server, I get an exception when the query comes across a non-integer value (such as the examples I listed above). For some reason, when I just let NH do its thing, I'm getting appropriate records back, but it doesn't make sense; I would expect it to fail or at least tell me that something is wrong with some records that can't be evaluated by SQL Server.
The mapping is simple, the data store is simple; I would be ok leaving well enough alone if I at least understood why/how NHibernate is making this work when running the same statement manually fails... Any thoughts?
Are you running the exact same code directly into SQL Server?
NHibernate parameterises all of its queries, and in doing so defines the type of each value passed through to SQL Server in the parameters. As for what you're probably asking about: the reason the SQL may fail when run by hand is that, by default, SQL Server only knows the difference from how you write the literal:
select * from table_name
where col_name = 5
in comparison with
select * from table_name
where col_name = '5'
If you do not mark it as a string with single quotes, it will search for an int and try to convert all the varchar values to int, which will obviously fail for some strings.
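The same distinction applies to parameterised statements, where the declared parameter type plays the role of the quotes. The two calls below are an illustration written by hand, not captured NHibernate output; table_name and col_name are the placeholders from the answer, and @p0 is an assumed parameter name.

```sql
-- Parameter typed as int: col_name is converted to int row by row, so the query
-- fails as soon as it evaluates a value such as 'temporary'.
EXEC sp_executesql
    N'SELECT * FROM table_name WHERE col_name = @p0',
    N'@p0 int',
    @p0 = 5;

-- Parameter typed as nvarchar: plain string comparison, col_name is not converted.
EXEC sp_executesql
    N'SELECT * FROM table_name WHERE col_name = @p0',
    N'@p0 nvarchar(4000)',
    @p0 = N'5';
```

Capturing the statement NHibernate really sends (for example with a server-side trace) and checking the declared parameter type would show which of the two cases applies.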

Force numerical order on a SQL Server 2005 varchar column, containing letters and numbers?

I have a column containing the strings 'Operator (1)' and so on until 'Operator (600)' so far.
I want to get them numerically ordered and I've come up with
select colname from table order by
cast(replace(replace(colname,'Operator (',''),')','') as int)
which is very very ugly.
Better suggestions?
It's that, InStr()/SubString(), changing Operator(1) to Operator(001), storing the n in Operator(n) separately, or creating a computed column that hides the ugly string manipulation. What you have seems fine.
If you really have to leave the data in the format you have - and adding a numeric sort order column is the better solution - then consider wrapping the text manipulation up in a user defined function.
select colname from table order by dbo.udfSortOperator(colname)
It's less ugly and gives you some abstraction. There's an additional overhead of the function call but on a table containing low thousands of rows in a not-too-heavily hit database server it's not a major concern. Make notes in the function to optimise later as required.
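A minimal sketch of what such a function could look like, reusing the expression from the question. The parameter name and its varchar(50) length are assumptions; the function fails for any value that does not follow the 'Operator (n)' pattern.

```sql
CREATE FUNCTION dbo.udfSortOperator (@name varchar(50))
RETURNS int
AS
BEGIN
    -- Strip the fixed 'Operator (' prefix and the closing ')' and return the number.
    RETURN CAST(REPLACE(REPLACE(@name, 'Operator (', ''), ')', '') AS int);
END;
```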
My answer would be to change the problem. I would add an operatorNumber field to the table if that is possible. Change the update/insert routines to extract the number and store it. That way the string conversion hit is only once per record.
The ordering logic would require the string conversion every time the query is run.
Well, first define the meaning of that column. Is operator a name so you can justify using chars? Or is it a number?
If the field is a name then you will use chars, and you will want to decide on a fixed length. Pad all operator names with zeros on the left. Define naming rules for operators (e.g. no letters, or the codes you would use in a series like "A001").
An index will sort the physical data in the server, and a properly defined text naming scheme will sort them in a query. You would want both.
If the operator is a number, then you got the data type for that column wrong and it needs to be changed.
Indexed computed column
If you find yourself ordering on or otherwise querying the operator column often, consider creating a computed column for its numeric value and adding an index on it. This will give you a persisted computed column (which sounds like an oxymoron, but isn't).
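A sketch of how that might be set up, assuming a hypothetical table dbo.Operators and reusing the expression from the question. OperatorNumber and the index name are made up for illustration, and adding the column fails if any existing row does not follow the 'Operator (n)' pattern.

```sql
-- dbo.Operators and OperatorNumber are hypothetical names; colname is from the question.
ALTER TABLE dbo.Operators
ADD OperatorNumber AS
    CAST(REPLACE(REPLACE(colname, 'Operator (', ''), ')', '') AS int) PERSISTED;
GO  -- batch separator (SSMS/sqlcmd) so later statements can see the new column

CREATE INDEX IX_Operators_OperatorNumber ON dbo.Operators (OperatorNumber);
GO

-- The ordering no longer needs any string manipulation at query time.
SELECT colname FROM dbo.Operators ORDER BY OperatorNumber;
```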