Determine varchar content in nvarchar columns - sql

I have a bunch of NVARCHAR columns which I suspect contain data that could be stored perfectly well in VARCHAR columns. However, I can't just change the columns' type to VARCHAR and hope for the best; I need to do some sort of check first.
I want to do the conversion because the data is static (it won't change in the future) and the columns are indexed, so they would benefit from a smaller (varchar) index compared to the current (nvarchar) index.
If I simply say
ALTER TABLE TableName ALTER COLUMN columnName VARCHAR(200)
then I won't get an error or a warning, but Unicode data will be silently truncated/lost.
How do I check?

Why not cast there and back to see what data gets lost?
This assumes the column is nvarchar(200) to start with:
SELECT *
FROM TableName
WHERE columnName <> CAST(CAST(columnName AS varchar(200)) AS nvarchar(200))
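If that returns no rows, nothing is lost by the round trip. As a quick go/no-go sketch before the ALTER (still assuming an nvarchar(200) column), the same check in count form:
-- If this returns 0, the varchar(200) round trip loses nothing
-- and the ALTER COLUMN should be safe for this column
SELECT COUNT(*) AS lossy_rows
FROM TableName
WHERE columnName <> CAST(CAST(columnName AS varchar(200)) AS nvarchar(200))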

Hmm, interesting.
I'm not sure you can do this in a SQL query itself. Are you happy to do it in code? If so, you can get all the records, then loop over all the chars in each string and check. But that would be a slow way to do it.

Related

Can a computed field be set to anything other that VARCHAR(MAX)?

I have a table with a field District which is VARCHAR(5)
When I create a computed field:
ALTER TABLE
Postcode
ADD
DistrictSort1
AS
(dbo.fn_StripCharacters(District, '^A-Z'))
PERSISTED;
The computed field DistrictSort1 is added as NVARCHAR(MAX)
Is it possible to change the NVARCHAR to anything other than (MAX)?
Are there any performance issues?
The obvious answer would be to CAST/CONVERT the value explicitly in your computed column:
ALTER TABLE dbo.Postcode
ADD DistrictSort1 AS CONVERT(varchar(5), dbo.fn_StripCharacters(District, '^A-Z')) PERSISTED;
I would, however, suggest looking at your function fn_StripCharacters, which is currently set up to return an nvarchar(MAX). User defined functions, unlike those built into SQL Server, cannot return different data types based on their input parameter(s). As a result, whenever you reference your function, you will get an nvarchar(MAX) back.
Because of this, it's sometimes best to have multiple similar versions of the same function. For one like this, for example, you might want four that return varchar and nvarchar values in non-MAX and MAX lengths.
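For illustration only, here is a minimal sketch of what a varchar variant might look like. The body of fn_StripCharacters isn't shown in the question, so this assumes the common PATINDEX/STUFF implementation, and the name fn_StripCharactersV is made up:
-- Hypothetical varchar counterpart of fn_StripCharacters (assumed implementation).
-- With @Pattern = '^A-Z' it removes every character that is not A-Z.
CREATE FUNCTION dbo.fn_StripCharactersV
(
    @String  varchar(8000),
    @Pattern varchar(50)
)
RETURNS varchar(8000)
WITH SCHEMABINDING  -- helps the function count as deterministic for PERSISTED columns
AS
BEGIN
    WHILE PATINDEX('%[' + @Pattern + ']%', @String) > 0
        SET @String = STUFF(@String, PATINDEX('%[' + @Pattern + ']%', @String), 1, '');
    RETURN @String;
END
The explicit CONVERT(varchar(5), ...) in the computed column is still worth keeping, since a function like this returns varchar(8000).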

Alter column from varchar to decimal when nulls exist

How do I alter a sql varchar column to a decimal column when there are nulls in the data?
I thought:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6)
But I just get an error, I assume because of the nulls:
Error converting data type varchar to numeric. The statement has been terminated.
So I thought to remove the nulls I could just set them to zero:
ALTER TABLE table1
ALTER COLUMN data decimal(19,6) NOT NULL DEFAULT 0
but I don't seem to have the correct syntax.
What's the best way to convert this column?
edit
People have suggested it's not the nulls that are causing me the problem, but non-numeric data. Is there an easy way to find the non-numeric data and either disregard it, or highlight it so I can correct it?
If it were just the presence of NULLs, I would just opt for doing this before the alter column:
update table1 set data = '0' where data is null
That would ensure all nulls are gone and you could successfully convert.
However, I wouldn't be too certain of your assumption. It seems to me that your new column is perfectly capable of handling NULL values since you haven't specified not null for it.
What I'd be looking for is values that aren't NULL but also aren't something you could turn in to a real numeric value, such as what you get if you do:
insert into table1 (data) values ('paxdiablo is good-looking')
though some may argue that it should be treated as 0, a falsy value :-)
The presence of non-NULL, non-numeric data seems far more likely to be causing your specific issue here.
As to how to solve that, you're going to need a where clause that can recognise whether a varchar column is a valid numeric value and, if not, change it to '0' or NULL, depending on your needs.
I'm not sure if SQL Server has regex support but, if so, that'd be the first avenue I'd investigate.
Alternatively, provided you understand the limitations (a), you could use isnumeric() with something like:
update table1 set data = NULL where isnumeric(data) = 0
This will force all non-numeric values to NULL before you try to convert the column type.
And, please, for the love of whatever deities you believe in, back up your data before attempting any of these operations.
If none of the above works, it may be worth adding a brand new column and populating it bit by bit. In other words, set it to NULL to start with, and then find a series of updates that will copy the data into this new column.
Once you're happy that all the data has been copied, you'll then have a series of updates you can run in a single transaction if you want to do the conversion in one fell swoop. Drop the trial column and then do the whole lot in a single operation:
create new column;
perform all updates to copy data;
drop old column;
rename new column to old name.
(a) From the linked page:
ISNUMERIC returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).
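If you're on SQL Server 2012 or later, TRY_CONVERT sidesteps those ISNUMERIC quirks; a sketch of using it to list (or blank out) the offending rows before the ALTER COLUMN:
-- Rows that are neither NULL nor convertible to decimal(19,6)
SELECT data
FROM table1
WHERE data IS NOT NULL
  AND TRY_CONVERT(decimal(19,6), data) IS NULL

-- ...or NULL them out in one go (after backing up, as above)
UPDATE table1
SET data = NULL
WHERE data IS NOT NULL
  AND TRY_CONVERT(decimal(19,6), data) IS NULL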
Possible solution:
CREATE TABLE test
(
data VARCHAR(100)
)
GO
INSERT INTO test VALUES ('19.01');
INSERT INTO test VALUES ('23.41');
ALTER TABLE test ADD data_new decimal(19,6)
GO
UPDATE test SET data_new = CAST(data AS decimal(19,6));
ALTER TABLE test DROP COLUMN data
GO
EXEC sp_RENAME 'test.data_new' , 'data', 'COLUMN'
As people have said, that error doesn't come from nulls; it comes from varchar values that can't be converted to decimal. The most typical reason for this I've found (after checking that the column doesn't contain any obviously invalid values, like non-digit characters or double commas) is that the varchar values use a comma as the decimal separator instead of a period.
For instance, if you run the following:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
SELECT @T, CAST(@T AS DEC(32,2))
You will get an error.
Instead:
DECLARE @T VARCHAR(256)
SET @T = '5,6'
-- Let's change the comma to a period
SELECT @T = REPLACE(@T, ',', '.')
SELECT @T, CAST(@T AS DEC(32,2)) -- Now it works!
It should be easy enough to check whether your column has these cases, and run the appropriate update before your ALTER COLUMN, if this is the cause.
You could also use a similar idea and do a regex-style search on the column for all values that don't match a digit / digit+'.'+digit pattern, but I suck with regex so someone else can help with that. :)
Also, the American system uses thousands separators, so the number '123100.5' would appear as '123,100.5'; in those cases you might want to just replace the commas with empty strings and try again.
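If that thousands-separator case is what you have, a sketch of the cleanup (assuming the commas are only grouping separators, not decimal commas):
-- Strip grouping commas so '123,100.5' becomes '123100.5' before the conversion
UPDATE table1
SET data = REPLACE(data, ',', '')
WHERE data LIKE '%,%'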

Converting varchar to nvarchar in SQL Server failed

I have a SQL Server table that contains columns of type varchar(50), the result of a CSV import using the SQL Server Import wizard.
I would like to know how I can change this data type to nvarchar(9) without getting a SQL Server truncation error.
I tried doing a bulk update to set the data types and column sizes that I need, but I still got the truncation error message when I tried to load the CSV into the empty database table I had created (with my required data types).
Grateful for any help.
Since you are willing to lose data and nvarchar(9) will only be able to store 9 characters, select only the first 9 characters from your source table; that way you do the truncation rather than SQL Server doing it for you.
The following query trims any leading white space from the strings, then takes only the first 9 characters of each string and converts them to NVARCHAR(9) for you:
CREATE TABLE New_TABLE (Col1 NVARCHAR(9), Col2 NVARCHAR(9))
GO
INSERT INTO New_TABLE (Col1, Col2)
SELECT CONVERT(NVARCHAR(9),LEFT(LTRIM(Col1), 9))
,CONVERT(NVARCHAR(9),LEFT(LTRIM(Col2), 9))
FROM Existing_Table
GO
Bulk insert into a temp table with varchar(50) columns, then insert into the actual table:
insert into tableName
select cast(tempcolumn as nvarchar(9)) from temptable
It is also important to check the field types of the destination table. I just spent 3 hours on this same error, trying random casting, trimming and substrings, before noticing that a colleague had created the table with field lengths that were too short.
I hope it helps somebody...
If you encounter this error during Import/Export Tasks, you can use the select cast(xxx as nvarchar(yyy)) as someName in the "Write a query to specify the data to transfer" option
varchar and nvarchar only use the length needed for the data stored. If you need Unicode support, certainly convert to nvarchar, but modifying the length from 50 to 9 - what is the point?
If your data is ALWAYS exactly 9 characters, consider using char(9), and follow one of the transformation suggestions above...

Comparing nvarchar with bigint in sql server

I am investigating a performance issue for the following sql statement:
Update tableA
set columnA1 = columnB1
from tableB
where tableA.columnA2 = tableB.columnB2
The problem is that tableA.columnA2 is of type nvarchar(50) while tableB.columnB2 is of type bigint. My question is how SQL Server executes such a query: does it cast the bigint to nvarchar and compare using nvarchar comparison operators, or does it cast the nvarchar to bigint and compare with bigint comparison operators?
Another thing: if I had to leave those column types (tableA.columnA2, tableB.columnB2) as they are, how could I rewrite this query to improve performance?
Note: this query only touches around 100,000 records, but it takes forever.
Thanks in advance, really appreciate your help.
In the comparison, the nvarchar will be converted to bigint, because bigint has a higher precedence.
See http://msdn.microsoft.com/en-us/library/ms190309.aspx
EDIT:
I was assuming that the conversion is always to the data type of the updated table. But this is wrong! @podiluska's answer is correct. I tested with a statement similar to the one in the question, and in the plan for the update statement you can see that the conversion is always to bigint when you compare a bigint and an nvarchar column, no matter whether the bigint or the nvarchar column belongs to the updated table: the query plan always contains an expression Scalar Operator(CONVERT_IMPLICIT(bigint,[schema1].[table1].[col1],0)) for the nvarchar column.
To help performance, you can create a computed column on the table that holds the nvarchar column, using the expression cast(columnA2 as bigint). You can then build an index on it to support the join; a sketch follows.
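A minimal sketch of that idea; the computed-column and index names are made up, and it assumes every columnA2 value really is numeric (otherwise the conversion will fail):
-- Computed column that exposes the nvarchar value as bigint
ALTER TABLE tableA
    ADD columnA2_AsBigint AS CAST(columnA2 AS bigint);

-- Index the computed column so the join can seek on it
CREATE INDEX IX_tableA_columnA2_AsBigint
    ON tableA (columnA2_AsBigint);

-- The update then compares bigint to bigint, with no implicit conversion
-- on the nvarchar side of the join
UPDATE a
SET    a.columnA1 = b.columnB1
FROM   tableA AS a
JOIN   tableB AS b
    ON a.columnA2_AsBigint = b.columnB2;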

Modifying a column type with data, without deleting the data

I have a column which I believe has been declared wrongly. It contains data and I do not wish to lose the data.
I wish to change the definition from varchar(max) to varchar(n) for some fixed length n. I was under the impression that I cannot just alter the column type?
Is the best method to create a temp column, "column2", transfer the data into it from the column with the problematic type, delete the problem column, and then rename the temp column to the original column's name?
If so, how do I copy the values from the problem column to the new column?
EDIT: For anyone with the same problem: you can just use the ALTER statements.
As long as the data types are somewhat "related" - yes, you can absolutely do this.
You can change an INT to a BIGINT - the value range of the second type is larger, so you're not in danger of "losing" any data.
You can change a VARCHAR(50) to a VARCHAR(200) - again, types are compatible, size is getting bigger - no risk of truncating anything.
Basically, you just need
ALTER TABLE dbo.YourTable
ALTER COLUMN YourColumn VARCHAR(200) NULL
or whatever. As long as you don't have any strings longer than those 200 characters, you'll be fine. I'm not sure what happens if you do have longer strings - either the conversion will fail with an error, or it will go ahead and tell you that some data might have been truncated. So I suggest you first try this on a copy of your data :-)
It gets a bit trickier if you need to change a VARCHAR to an INT or something like that - obviously, if you have column values that don't "fit" into the new type, the conversion will fail. But even using a separate "temporary" new column won't fix this - you need to deal with those "non-compatible" cases somehow (ignore them, leave NULL in there, set them to a default value - something).
Also, switching between VARCHAR and NVARCHAR can get tricky if you have e.g. non-Western European characters - you might lose certain entries upon conversion, since they can't be represented in the other format, or the "default" conversion from one type to the other doesn't work as expected.
Calculate the max data length stored in that column of that table:
Select max(len(fieldname)) from tablename
Now you can decrease the size of that column down to the result you got from the previous query:
ALTER TABLE dbo.YourTable
ALTER COLUMN YourColumn VARCHAR(200) NULL
According to the PostgreSQL docs, you can simply alter the table:
ALTER TABLE products ALTER COLUMN price TYPE numeric(10,2);
But here's the thing
This will succeed only if each existing entry in the column can be converted to the new type by an implicit cast. If a more complex conversion is needed, you can add a USING clause that specifies how to compute the new values from the old.
Add a temp column2 with type varchar(NN), run update tbl set column2 = column, and check whether any error occurs; if everything is fine, alter your original column, copy the data back, and remove column2.
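A minimal sketch of that sequence; the table name tbl, the column names, and the varchar(100) target length are all placeholders:
-- 1. add the trial column
ALTER TABLE tbl ADD column2 varchar(100) NULL
GO
-- 2. copy the data; an error here means some values won't fit in varchar(100)
UPDATE tbl SET column2 = col1
GO
-- 3. if the copy succeeded, shrink the original column, copy back and tidy up
ALTER TABLE tbl ALTER COLUMN col1 varchar(100) NULL
UPDATE tbl SET col1 = column2
ALTER TABLE tbl DROP COLUMN column2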