Can Unicode chars in SQL queries cause slow DB performance?

Can the use of Unicode characters in queries cause the database to slow down?
I am using a query like
Select * from table where name='xyz�'
After running this query, my application slows down permanently until I restart it.
I am using the c3p0 connection pool with Hibernate.

A modern database should support Unicode, but that support may be restricted to certain data types.
For example SQL Server only supports Unicode for the following data types:
nchar
nvarchar
nvarchar(max) – new in SQL Server 2005
ntext
Unicode string constants (say, within stored procedures or functions) should be prefixed with the letter N, e.g. N'abcd'.
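To illustrate the difference, here is a minimal T-SQL sketch (the table and column names are made up):

-- Unicode-capable column
CREATE TABLE customers (
    id   int           NOT NULL PRIMARY KEY,
    name nvarchar(100) NOT NULL
);

-- Without the N prefix, the literal is interpreted in the current code page,
-- so non-ASCII characters may be mangled before the comparison even runs.
SELECT * FROM customers WHERE name = 'xyzé';

-- With the N prefix, the literal is a true Unicode string constant.
SELECT * FROM customers WHERE name = N'xyzé';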

I found that Sybase does not use an index when the query contains Unicode characters. This may be due to charset settings in my version.
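One way to verify whether that is what is happening is to compare query plans; on Sybase ASE that would look roughly like this (the table and column names are placeholders):

set showplan on
go
-- plain literal: does the plan show the index being used?
select * from customers where name = 'xyz'
go
-- literal containing Unicode characters: does the plan fall back to a table scan?
select * from customers where name = 'xyzé'
go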

Related

SQL Server datetime2 in OPENQUERY

We're migrating from SQL Server 2005 to 2014 in a pretty large environment, and we've noticed that OPENQUERY behaves differently with a MySQL database when it comes to datetime. Previously, it would translate just fine to a DATETIME column. With 2014 (I assume this started in 2008 or so), it now converts to DATETIME2 (with maximum precision). This causes problems when comparing to or inserting into DATETIME columns.
Is there a way to specify at the server level (or set a default) which type these will translate to? Rewriting all of the queries would be quite an undertaking, and I'd like to avoid that for now if possible (and rather rewrite as we edit or introduce new things).
Try using the VARCHAR data type when migrating date fields; it is always easy to CONVERT/CAST to other types as needed.
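Alternatively, casting the value back down at the point of use also works; a hedged sketch, assuming a linked server named MYSQL_LINK and made-up table and column names:

-- OPENQUERY now surfaces the value as DATETIME2; cast it back to DATETIME
-- before comparing with or inserting into DATETIME columns.
SELECT CAST(oq.created_at AS datetime) AS created_at
FROM OPENQUERY(MYSQL_LINK, 'SELECT created_at FROM orders') AS oq;

INSERT INTO dbo.local_orders (created_at)
SELECT CAST(oq.created_at AS datetime)
FROM OPENQUERY(MYSQL_LINK, 'SELECT created_at FROM orders') AS oq;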

Unicode collation for SQL Server 2005?

I need a collation for a database that correctly stores any Unicode character in a SQL Server 2005 instance. The column is currently of type nvarchar (this can be changed). How can I do that?
Collation has no bearing on the storage of N[VAR]CHAR data; it defines the rules for comparing and sorting strings.
So you made the right choice: NVARCHAR.
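To make the separation concrete, a small sketch (the table name is made up; Latin1_General_CI_AS and Latin1_General_CS_AS are standard SQL Server collations):

-- The collation governs comparison and sorting only; NVARCHAR stores
-- any Unicode character regardless of which collation is chosen.
CREATE TABLE documents (
    title nvarchar(200) COLLATE Latin1_General_CI_AS NOT NULL
);

-- The same data can be compared under a different collation on the fly:
SELECT * FROM documents
WHERE title = N'Strasse' COLLATE Latin1_General_CS_AS;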

Oracle-->SQL - forced conversion from non-unicode to unicode?

I have an ETL that is importing tables from Oracle to SQL 2008 using the OLEDB FastLoad.
The data in Oracle is non-unicode.
When the table is created in SQL it is created with unicode datatypes.
For some reason the datatypes are being forced from non-unicode to unicode.
Do any of you know of a way to stop this from happening?
Possibly an Oracle driver problem?
I'm presuming you are using SSIS?
Guess what: SSIS wants everything to be Unicode, so it assumes that all incoming data is Unicode. If you don't want it to be Unicode, you will need to convert each field using a Data Conversion task.
Here is something you might want to try: check the value of the NLS_LANG variable for the Oracle database you are importing from. Changing this variable before running the ETL could help.
Check the NLS_LANG FAQ here:
http://www.oracle.com/technology/tech/globalization/htdocs/nls_lang%20faq.htm
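NLS_LANG is a client-side environment variable, but you can at least confirm what the Oracle database itself is set to; this query against a standard Oracle data dictionary view shows the relevant settings:

-- Character-set settings on the Oracle side of the ETL
SELECT parameter, value
FROM   nls_database_parameters
WHERE  parameter IN ('NLS_CHARACTERSET', 'NLS_NCHAR_CHARACTERSET');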

What is the most "database independent" way of creating a variable-length text field in a database?

I want to create a text field in the database with no specific size (it will store text of unknown length in some cases); the particular texts are serialized simple objects (~ JSON).
What is the most database-independent way to do this:
- a varchar with no size specified (I don't think all DBs support this)
- a 'text' field; this seems to be common, but I don't believe it's a standard
- a blob or another object of that kind?
- a varchar of a very large size (that's probably inefficient and wastes disk space)
- other?
I'm using JDBC, but I'd like to use something that is supported in most DBs (Oracle, MySQL, PostgreSQL, Derby, HSQL, H2, etc.).
Thanks.
a varchar of a very large size (that's probably inefficient and wastes disk space)
That's gonna be the most portable option. Limit yourself to 2000 characters and you should be fine for most databases (Oracle being the current 2000-character limiter, but be wary of old MySQL versions as well). I wouldn't worry too much about disk space, either: most databases only use disk for the actual data saved in the field.
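A hedged sketch of what that looks like (the table and column names are made up); this DDL should run unchanged on all of the databases listed in the question:

-- 2000 characters stays within the limits of Oracle, old MySQL, etc.
CREATE TABLE serialized_objects (
    id      INTEGER       NOT NULL PRIMARY KEY,
    payload VARCHAR(2000) NOT NULL
);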
Do you really need to support all six of those databases? (hint: No.)
I've come to the opinion that writing universally portable SQL DDL is not worth the trouble. YAGNI.
You should support the databases you are currently using, and be prepared to adapt to a database that you adopt in the future.
Re your comment: The only standard SQL variable-length data types are VARCHAR and BLOB. VARCHAR is for string data and its declaration includes a character set and collation. BLOB is for binary data and does not support charset/collation.
Other data types such as VARCHAR(max), CLOB, or TEXT are vendor extensions:
VARCHAR(max): MS SQL Server
NVARCHAR(max): MS SQL Server
LONGVARCHAR: Derby, H2, HSQLDB
CLOB: Derby, H2, HSQLDB, Oracle, SQLite
NCLOB: Oracle
TEXT: MS SQL Server, MySQL, PostgreSQL, SQLite
NTEXT: MS SQL Server
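As an illustration of the per-column character set and collation clause mentioned above, here is the MySQL flavor (the table and column names, and the choice of utf8mb4, are just examples):

CREATE TABLE notes (
    id   INT NOT NULL PRIMARY KEY,
    body VARCHAR(2000)
         CHARACTER SET utf8mb4
         COLLATE utf8mb4_unicode_ci
);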
Use a BLOB. The JDBC 2.0 API supports it, so any driver that supports JDBC 2.0 (J2SE 5.0 on) should support it.
The advantages of BLOB are:
1. Size can be as large as 4 GB minus 1 byte (in Oracle; not so sure about other databases)
2. It can store any data you wish (even images serialized into some field in your JSON structure)
3. It is completely neutral to transport across OSes
4. You can still take advantage of indexes on keys that reference the BLOB, so searches on IDs etc. don't have to be done by digging into the structure (see the sketch below).
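A hedged sketch of point 4 (the schema is made up; note that PostgreSQL spells the type BYTEA rather than BLOB):

-- Lookups go through indexed scalar columns, so the database never
-- has to inspect the serialized payload itself.
CREATE TABLE serialized_objects (
    id       INTEGER     NOT NULL PRIMARY KEY,
    obj_type VARCHAR(50) NOT NULL,
    payload  BLOB
);
CREATE INDEX idx_serialized_objects_type ON serialized_objects (obj_type);

SELECT payload FROM serialized_objects WHERE obj_type = 'order';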
Use a framework like Hibernate, so you won't have to find a universal solution yourself. I don't think you can use one universal type in every database mentioned; the databases differ too much, I guess.
TEXT is perhaps the best fit, but it is deprecated and slated for removal from SQL Server, and there is no DBMS-independent option covering everything you listed.
That said, portability is overrated when it comes to SQL. You're more likely to change your client code before you change your DBMS. Pick one and go with it.

Converting non-unicode SQL Server data and stored procs to Unicode

I need to convert a non-Unicode SQL Server 2005 database to a Unicode-based database. I have hundreds of stored procs, and of course the data is stored in varchar. I know that I need to change all the data types to the Unicode equivalent (varchar to nvarchar), but don't I also have to change how the stored procs are written, or will they continue to work as before? I am trying to figure out what is necessary to change from non-Unicode to Unicode for a large database with many stored procs.
Yes, you need to update your data and stored procedures, but an important thing to remember is that you only need to change some of your columns to UNICODE. For anything that is "internal", you don't need to pay the UNICODE cost.
There is a lot of work to do for this change, but don't change everything blindly. I've been on the receiving end of that kind of change before, and it's painful. (Using nvarchar(1) to store 'y' and 'n' is stupid.)
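A hedged sketch of the two halves of the change (table, column, and procedure names are made up):

-- 1. Widen a column that actually needs Unicode:
ALTER TABLE dbo.customers ALTER COLUMN name nvarchar(100) NOT NULL;

-- 2. Stored procedure parameters and variables need the same treatment,
--    and string literals compared against them should get the N prefix:
ALTER PROCEDURE dbo.find_customer
    @name nvarchar(100)
AS
BEGIN
    SELECT id, name FROM dbo.customers WHERE name = @name;
END;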