SQL Server NText field limited to 43,679 characters? - sql

I working with SQL Server data base in order to store very long Unicode string. The field is from type 'ntext', which theoretically should be limit to 2^30 Unicode characters.
From MSDN documentation:
ntext
Variable-length Unicode data with a maximum string length of 2^30 - 1 (1,073,741,823) bytes. Storage size, in bytes, is two times the string length that is entered. The ISO synonym for ntext is national
text.
I'm made this test:
Generate 50,000 characters string.
Run an Update SQL statement
UPDATE [table]
SET Response='... 50,000 character string...'
WHERE ID='593BCBC0-EC1E-4850-93B0-3A9A9EB83123'
Check the result - what actually stored in the field at the end.
The result was that the field [Response] contain only 43,679 characters. All the characters at the end of the string was thrown out.
Why this happens? How I can fix this?
If this is really the capacity limit of this data type (ntext), which another data type can store longer Unicode string?

Based on what I've seen, you may just only be able to copy 43679 characters. It is storing all the characters, they're in the db(check this with Select Len(Reponse) From [table] Where... to verify this), and SSMS has problem copying more than when you go to look at the full data.

NTEXT datatype is deprecated and you should use NVARCHAR(MAX).
I see two possible explanations:
Your ODBC driver you use to connect to database truncate parameter value when it is too long (try using SSMS)
You write you generate your input string. I suspect you generate CHAR(0) which is Null literal
If second is your case make sure you cannot generate \0 char.
EDIT:
I don't know how you check the length but keep in mind that LEN does not count trailing whitespaces
SELECT LEN('aa ') AS length -- 2
,DATALENGTH('aa ') AS datalength -- 7
Last possible solution I see you do sth like:
SELECT 'aa aaaa'
-- result in SSMS `aa aaaa`: so when you count you lose all multiple whitespaces
Check query below if returns 100k:
SELECT DATALENGTH(ntext_column)

For all bytes; Grid result on right click and click save result to file.

Can confirm. The actual limit is 43679. Had a problem with a subscription service for a week now. Every data looked good, but it still gave us an error that one of the fields have invalid values, even tho, it got correct values in. It turned out that the parameters was stored in NText and it maxed out at 43679 characters. And because we cannot change the database design, we had to make 2 different subscriptions for the same thing and put half of the entities to the other one.

Related

How to get rid of special character in Netezza columns

I am transferring data from one Netezza database to another using Talend, an ETL tool. When I pull data from a varchar(30) field and try to put it in the new database's varchar(30) field, it gives an error saying it's too long. Logs show the field has whitespace at the end followed by a square, representing some character I can't figure out. I attached a screenshot of the logs below. I have tried writing SQL to pull this field and replace what I thought was a CRLF, but no luck. When I do a select on the field and get the length, it has a few extra characters than what you see, so something is there and I want to get rid of it. Trimming does not do anything.
This SQL does not return a length shorter than simply doing length() on the column itself. Does anyone know what else it could be?
SELECT LENGTH(trim(translate(TRANSLATE(<column>, chr(13), ''), chr(10), ''))) as len_modified
Note that the last column in the logs, where you see a square in brackets, is supposed to show the last character examined.
Save the data to a larger target table size that works. If 30 character data put it in a 500 character table. Get it to work. Then look through character by character on the fields that are the longest to determine what character is being added. Use commands like ascii() to determine the ascii value of the individual characters and the beginning and end. Most likely you are getting some additional character in the beginning or the end. Determine what the extra character data is and then write code to remove it or to never load it so that it fits in the 30 character column. Or just leave your target column with longer and include the additional characters. For example Varchar(30) becomes Varchar(32) (waste the space but don't alter the data as it comes in to you).

SQL NVARCHAR(MAX) returning ASCII and Weird Characters instead of Text

I have an SQL Table and I'm trying to return the values as a string.
The values should be city names like Sydney, Melbourne, Port Maquarie etc.
But When I run a select I either get black results or as detailed in the first picture some strange backwards L character. The column is an NVARCHAR(MAX)
SELECT ctGlobalName FROM Crm.Cities
Then I tried using MSSQL's Edit top 200 rows feature and I could see the names of the cities, but also all these weird ascii characters.
Now I didn't create the database, I'm just running queries on it. Some things I've read have suggested it is a problem with the Collation. But the table is SQL_Latin1_General_CP1_CI_AS which matches the server collation.
I'm sure there must be something I can add to my select query to return the values as an ordinary string. Is there something I can do to my select query to return the expected format without the weird characters?
An NVARCHAR datatype can store Unicode characters, which are used for languages that are not supported by the ASCII character set i.e. non-English (or related) languages such as Chinese or Indonesian. If your SQL Server or Windows doesn't have that language installed then you might see strange-looking representations of the data.
On the other hand, it could also be that the application that updates this table has just stored bad data in that column.
Either way you might need to do some string manipulation to strip out the characters you don't want.

Text datatype in SQL got trimmed to 8000 characters

In SQL I have a table with around 2000 rows and i store XML's in a column. The data type of the column is text. Around 80 rows are currently trimmed off to exact 8000 characters. There are values in that column which has more than 8000 characters.what could be the issue.
8000 characters is actually the maximum amount of characters that SSMS will display (by default), it will return to your front end the full amount - at least in the case of varchar(max). I think that is the case with text as well, but I don't use text type anymore :P
It should be noted, that you shouldn't be using the text data type for sql server, as it is in the process of deprecation.
See: https://msdn.microsoft.com/en-us/library/ms143729.aspx for information on texts deprecation.

count number of characters in nvarchar column

Does anyone know a good way to count characters in a text (nvarchar) column in Sql Server?
The values there can be text, symbols and/or numbers.
So far I used sum(datalength(column))/2 but this only works for text. (it's a method based on datalength and this can vary from a type to another).
You can find the number of characters using system function LEN.
i.e.
SELECT LEN(Column) FROM TABLE
Use
SELECT length(yourfield) FROM table;
Use the LEN function:
Returns the number of characters of the specified string expression, excluding trailing blanks.
Doesn't SELECT LEN(column_name) work?
text doesn't work with len function.
ntext, text, and image data types will be removed in a future version
of Microsoft SQL Server. Avoid using these data types in new
development work, and plan to modify applications that currently use
them. Use nvarchar(max), varchar(max), and varbinary(max) instead. For
more information, see Using Large-Value Data Types.
Source
I had a similar problem recently, and here's what I did:
SELECT
columnname as 'Original_Value',
LEN(LTRIM(columnname)) as 'Orig_Val_Char_Count',
N'['+columnname+']' as 'UnicodeStr_Value',
LEN(N'['+columnname+']')-2 as 'True_Char_Count'
FROM mytable
The first two columns look at the original value and count the characters (minus leading/trailing spaces).
I needed to compare that with the true count of characters, which is why I used the second LEN function. It sets the column value to a string, forces that string to Unicode, and then counts the characters.
By using the brackets, you ensure that any leading or trailing spaces are also counted as characters; of course, you don't want to count the brackets themselves, so you subtract 2 at the end.

text or varchar?

I have 2 columns containing text one will be max 150 chars long and the other max 700 chars long,
My question is, should I use for both varchar types or should I use text for the 700 chars long column ? why ?
Thanks,
The varchar data type in MySQL < 5.0.3 cannot hold data longer than 255 characters. While in MySQL >= 5.0.3 it has a maximum of 65,535 characters.
So, it depends on the platform you're targeting, and your deployability requirements. If you want to be sure that it will work on MySQL versions less than 5.0.3, go with a text type data field for your longer column
http://dev.mysql.com/doc/refman/5.0/en/char.html
An important consideration is that with varchar the database stores the data directly in the table, and with text the database stores a pointer to a separate tablespace in which the data is stored. So, unless you run into the limit of a row length (64K in MySQL, 4-32K in DB2, 8K in SQL Server 2000) I would normally use varchar.
Not sure about mysql specifically, but in MS SQL you definitely should use a VARCHAR for anything under 8000 characters long, if you want to be able to run any sort of comparison on the value in the field. For example this would be possible with a VARCHAR:
select your_column from your_table
where your_column like '%dogs%'
but not with a TEXT field.
More information regarding TEXT field in mysql 5.4 can be found here and more information about the VARCHAR field can be found here.
I'd pick the option based on the type of data being entered. For example, if it's a (potentially) super long username and a biography, I'd do varchar and text. Sure, you're limiting the bio to 700 chars, but what if you add HTML formatting down the road, keeping the 700 char limit but allowing HTML tags for formatting?
Twitter may use text for their tweets, since it could be quicker to add meta data to the posts (e.g. url href's and #user PK's) to cache the additional data instead of caluclate it every rendering.