SQL Query and Unicode Issue

I have a really weird issue with SQL queries on Unicode data. Here's what I've got:
SQL Server Express 2008 R2 AS
Table containing Chinese characters/words/phrases (100,000 rows)
When I run the following, I get the correct row + 36 other rows returned... when it should only be the one row:
SELECT TOP 1000 [ID]
,[MyChineseColumn]
,UNICODE([MyChineseColumn])
FROM [dbo].[MyTableName]
WHERE [MyChineseColumn]= N'㐅'
As you'd expect, the row with 㐅 is returned, but also the following: 〇, 宁, 㮸 and a bunch of others...
Anyone have any ideas what is going on here? This has really got me confused and I am not sure how to solve this one (tried "Googling" already)...
Thanks

Please check that the column is using an appropriate Chinese collation, as that determines the semantics used in this type of comparison.

You may want to try a binary collation; these characters seem to be somehow matched as identical (possibly because case and/or accents are ignored, depending on the collation in use).
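
To make the suggestion concrete, here is a minimal Python sketch that uses the standard sqlite3 module as a stand-in, since SQL Server's collations cannot be reproduced here. The collation name toy_loose and its weight table are made up for illustration: it deliberately gives 㐅, 〇, 宁 and 㮸 the same weight, so it only imitates the symptom and says nothing about what SQL Server does internally. The extra row 中 is also invented. The point is that the same equality test returns several rows under a weight-based collation and a single row under a binary one.

```python
import sqlite3

# Hypothetical weight table: these characters are deliberately given one and
# the same weight, imitating a collation that cannot tell them apart.
SAME_WEIGHT = set("㐅〇宁㮸")

def toy_key(text):
    # Characters in SAME_WEIGHT collapse to weight 0; others keep their code point.
    return [0 if ch in SAME_WEIGHT else ord(ch) for ch in text]

def toy_loose(a, b):
    # sqlite3 expects a negative, zero, or positive integer from a collation.
    ka, kb = toy_key(a), toy_key(b)
    return (ka > kb) - (ka < kb)

conn = sqlite3.connect(":memory:")
conn.create_collation("toy_loose", toy_loose)
conn.execute("CREATE TABLE MyTableName (ID INTEGER PRIMARY KEY, MyChineseColumn TEXT)")
conn.executemany(
    "INSERT INTO MyTableName (MyChineseColumn) VALUES (?)",
    [(ch,) for ch in "㐅〇宁㮸中"],
)

def find(collation):
    # The collation name is interpolated because a SQL parameter cannot name one.
    sql = (
        "SELECT MyChineseColumn FROM MyTableName "
        f"WHERE MyChineseColumn = ? COLLATE {collation} ORDER BY ID"
    )
    return [row[0] for row in conn.execute(sql, ("㐅",))]

loose_rows = find("toy_loose")  # weight-based: four characters compare equal
binary_rows = find("BINARY")    # code-point comparison: only the exact match
print(loose_rows, binary_rows)
```

In SQL Server the equivalent experiment would be to append a COLLATE clause naming a binary collation (for example Latin1_General_BIN2) to the comparison in the WHERE clause and check whether the extra rows disappear.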

Related

Data filtered differently in sql and crystal reports

The problem arises when filtering string columns that contain the symbol '-'.
For example, the query below returns ~280 rows:
"SELECT code FROM client WHERE code >= 'M-SOLUTIONS' AND code <= 'MUZIKOS'"
but CR with the record selection below returns only 20 rows:
{client.code} >= 'M-SOLUTIONS' AND {client.code} <= 'MUZIKOS'
If I put 'Lxxx' instead of 'M-SOLUTIONS', the returned data is correct. Any ideas how to overcome this issue? I am using a PostgreSQL database over an ODBC connection.

Apparently they use different collations. Some collations will ignore punctuation on a first pass, using it only if the values are otherwise equal. Figure out which collation you want to use, then make sure both CR and PostgreSQL use that one.
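
The effect described above can be reproduced with a small Python sketch. It uses the standard sqlite3 module as a stand-in for both systems; the collation punct_last is a hypothetical implementation of "ignore punctuation on a first pass", and every client code except 'M-SOLUTIONS' and 'MUZIKOS' is invented. It does not establish which of CR or PostgreSQL behaves which way, only that the same range filter selects different rows under the two orderings.

```python
import sqlite3

def punct_last(a, b):
    # First pass: compare with punctuation stripped. Second pass: fall back to
    # the raw strings only when the first pass says the values are equal.
    def key(s):
        return ("".join(ch for ch in s if ch.isalnum()), s)
    ka, kb = key(a), key(b)
    return (ka > kb) - (ka < kb)

conn = sqlite3.connect(":memory:")
conn.create_collation("punct_last", punct_last)
conn.execute("CREATE TABLE client (code TEXT)")
codes = ["LAMA", "M-SOLUTIONS", "MAXIMA", "MERKURIJUS", "MUZIKOS", "NAMAI"]
conn.executemany("INSERT INTO client VALUES (?)", [(c,) for c in codes])

def in_range(collation):
    # Same filter as in the question, evaluated under the given collation.
    sql = (
        "SELECT code FROM client "
        f"WHERE code >= 'M-SOLUTIONS' COLLATE {collation} "
        f"AND code <= 'MUZIKOS' COLLATE {collation} "
        f"ORDER BY code COLLATE {collation}"
    )
    return [row[0] for row in conn.execute(sql)]

byte_order = in_range("BINARY")         # '-' sorts before every letter
punct_ignored = in_range("punct_last")  # 'M-SOLUTIONS' sorts like 'MSOLUTIONS'
print(byte_order)
print(punct_ignored)
```

Under byte order the lower bound sits before 'MA...', so nearly every code starting with M falls in the range; with punctuation ignored the range starts at 'MS...', which would explain a much smaller result. It also fits the observation that 'Lxxx', which contains no punctuation, behaves the same in both systems.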

how to convert image type to varchar sybase

After some time I have landed in the Sybase world (ASE 15.. to be specific), and I am getting a bit terrified.
The missing functions and functionality that I know from SQL Server make me feel like I am in the early '90s.
To the point:
I have to prepare a one-off report
and have some text stored in an image column (I don't know why someone did that).
So what I did was:
select CAST(CAST(REQUEST AS VARBINARY(16384)) AS VARCHAR(16384)) as RequestBody
from table
The problem emerges because some requests are longer than 16384,
and I have no idea how to get the data.
What is even worse, I don't know where to look for information, as the Sybase documentation is scarce at best and, compared with the MS world, nonexistent.

According to the docs, you need to use the CONVERT function like this:
SELECT CONVERT(VARBINARY(2048), raw_data) as raw_data_str FROM table;

Instead of using varbinary(16384) and varchar(16384), try using varbinary(max) and varchar(max). In that case, the maximum datalength will be 2 GB.
See:
http://msdn.microsoft.com/en-us/library/ms176089.aspx and http://msdn.microsoft.com/en-us/library/ms188362.aspx
What is the length of the REQUEST column in the table?

SQL find-and-replace regular-expression capturing-group limit?

I need to convert data from a spreadsheet into insert statements in SQL. I've worked out most of the regular expressions for using the find and replace tool in SSMS, but I'm running into an issue when trying to reference the 9th parenthesized item in my final replace.
Here is the original record:
Blue Doe 12/21/1967 1126 Queens Highway Torrance CA 90802 N 1/1/2012
And this is what I need (for now):
select 'Blue','Doe','19671221','1126 Queens Highway','Torrance','CA','90802','N','20120101'
Due to limitations on the number of parenthesized items allowed, I have to run through the replace three times. This may work into a stored procedure if I can first make this work as a POC.
This is the first matching expression:
^{:w:b:w:b}{:z}/{:z}/{:z:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b}{:z}/{:z}/{:z}
And the replace: \10\2/0\3/\40\5/0\6/\7
This adds zeros to the months and days so that they have at least two characters.
The next match reformats the dates into the format required in the query (no comments about not using a date field. This is a client requirement for the database).
Matching expression:
^{:w:b:w:b}[0-9]*{[0-9]^2}/[0-9]*{[0-9]^2}/{:z}{:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b}[0-9]*{[0-9]^2}/[0-9]*{[0-9]^2}/{:z}
And the replace: \1\4\(2,2)\(2,3)\5\8\(2,6)\(2,7)
Finally, the final match inserts the results into the SQL statement that will get used in an insert statement.
Matching expression:
^{:w}:b{:w}:b{:z}:b{[0-9A-Za-z:b]+}:b{:w}:b{[A-Z]+}:b{:z}:b{:w}:b{:z}
And the replace: select '\1','\2','\3','\4','\5','\6','\7','\8','\9'
It all works except the last replacement. For some reason the \9 is NOT getting the data from the match. If I just replace the whole replace expression with \9 I get a blank space. If I use \8, I get N. If I eliminate the 8th parenthesized item, thus making my 9th item eighth, it returns what I want, 20120101.
So my question is, does SSMS / SQL allow for 9 tagged expressions when using find / replace and regular expressions? Or am I missing something here? I know there are other ways to do this. I'm just trying to get it done quickly as a POC before we move this into a sproc or application.
Thanks for any assistance.
-Peter

None of your matching expressions work with the record you provided in my MS SQL Server Management Studio 2008 R2.
From your description it sounds like there is an issue with the Tagged Expression 9 since the desired result is returned when using Tagged Expression 8, but not 9. You may want to ask Microsoft or report it as a bug.
A quicker solution would be to move the text you are performing the Find/Replace on in SSMS to a spreadsheet and use cell formulas to parse the data into insert commands. If you have MS Excel, the CONCATENATE, FIND, and MID functions will probably be useful. Also, it helps to split the values into their own columns so you can format the date, then use one concatenate to build your insert.
Please let me know if you need an example.
Update: I tried your example in MS SQL Server Management Studio 2008 R2, Visual Studio 2005, and Visual Studio 2010 with the same result you get: \9 returns an empty string. Checking around, I found that others are also having this issue (see the community content from Henrique Evaristo) and that the whole system has been replaced in the new editors.
So in answer to your question, SSMS does not support 9 tagged expressions due to a bug.
If you are unable to use the spreadsheet idea, you could try splitting the action into two parts, setting the first 8 values, then swinging back again to do the last. For example:
^{:w}:b{:w}:b{:z}:b{[0-9A-Za-z:b]+}:b{:w}:b{[A-Z]+}:b{:z}:b{:w}:b:z
select '\1','\2','\3','\4','\5','\6','\7','\8','\0'
:w:b:w:b:z:b[0-9A-Za-z:b]+:b:w:b[A-Z]+:b:z:b:w:b{:z}
\1
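
If leaving SSMS is an option, the whole conversion can also be done in one pass with a regex engine that has no nine-group limit. The following is a rough Python sketch and not something the answer above proposes; the pattern is inferred from the single sample record (one-word city, two-letter state, five-digit zip, single-letter flag), and the names RECORD and to_select are made up for this example. Real data would likely need a looser pattern.

```python
import re

# Pattern inferred from the one sample record; adjust for real data.
RECORD = re.compile(
    r"^(\w+)\s+(\w+)\s+"                # first name, last name
    r"(\d{1,2})/(\d{1,2})/(\d{4})\s+"   # first date as M/D/YYYY
    r"(.+?)\s+(\w+)\s+([A-Z]{2})\s+"    # street, one-word city, state
    r"(\d{5})\s+(\w)\s+"                # zip code, single-letter flag
    r"(\d{1,2})/(\d{1,2})/(\d{4})$"     # second date as M/D/YYYY
)

def to_select(line):
    m = RECORD.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized record: {line!r}")
    (first, last, m1, d1, y1, street, city, state,
     zip_code, flag, m2, d2, y2) = m.groups()
    values = [
        first, last, f"{y1}{int(m1):02d}{int(d1):02d}", street, city, state,
        zip_code, flag, f"{y2}{int(m2):02d}{int(d2):02d}",
    ]
    # Double any single quotes so the generated SQL literal stays valid.
    quoted = ",".join("'" + v.replace("'", "''") + "'" for v in values)
    return "select " + quoted

record = "Blue Doe 12/21/1967 1126 Queens Highway Torrance CA 90802 N 1/1/2012"
print(to_select(record))
```

Zero-padding and date reordering happen in the formatting step, so the three separate find/replace passes collapse into one function call per line.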

WHERE Clause Requires Like When Equals Should Work

I have a query that I think should look like this:
select *
from Requesters
where CITIZEN_STATUS = 'OS-IE ';
The field CITIZEN_STATUS, whose data type is varchar(15), has a trailing space for this particular value. I have pasted it into Notepad++ and looked at it with a hex editor, and the final space is indeed 0x20.
For the query to work, I have to write it like this:
select *
from Requesters
where CITIZEN_STATUS like 'OS-IE%';
So, obviously, I have a workaround and the question is not urgent. But I would really like to know why the first query fails to do what I expect. Does anyone have any ideas?
I should mention I am using SQL Server 2005 and can provide more information about the configuration if needed.

In MySQL 5, this query works. However, MySQL does not distinguish trailing whitespace in the comparison: the query matches 'OS-IE ' as well as 'OS-IE'. SQL Server 2005 has no built-in regular expressions, but with CLR-based regex functions you can use a pattern anchored at the end of the line. The correct character for this is the dollar sign '$', placed after the space to indicate that you do want the space. See http://msdn.microsoft.com/en-us/magazine/cc163473.aspx
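
How much the comparison rules matter for trailing spaces can be seen in a small Python sketch. It uses the standard sqlite3 module, whose built-in BINARY and RTRIM collations stand in for "trailing spaces count" and "trailing spaces are ignored"; this is not SQL Server or MySQL behaviour, and the sample values other than 'OS-IE ' are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Requesters (CITIZEN_STATUS TEXT)")
conn.executemany(
    "INSERT INTO Requesters VALUES (?)",
    [("OS-IE ",), ("OS-IE",), ("OS-IEX",)],  # invented sample values
)

def statuses(where):
    # Return the matching values in insertion order.
    sql = f"SELECT CITIZEN_STATUS FROM Requesters WHERE {where} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]

# Byte-for-byte equality: the trailing space must be present.
exact = statuses("CITIZEN_STATUS = 'OS-IE ' COLLATE BINARY")
# Trailing spaces ignored on both sides of the comparison.
trimmed = statuses("CITIZEN_STATUS = 'OS-IE ' COLLATE RTRIM")
# The LIKE workaround from the question: a plain prefix match.
prefix = statuses("CITIZEN_STATUS LIKE 'OS-IE%'")
print(exact, trimmed, prefix)
```

Note that the LIKE 'OS-IE%' workaround is broader than either equality test: it also picks up values such as 'OS-IEX', which may or may not be acceptable.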

Full Text Searching for single characters

I have a table with a TEXT column whose contents are just strings of CSV numbers. Example: ",1,76,77,115,". Each string can contain an arbitrary number of numbers.
I am trying to set up Full Text Indexing so that I can search this column rapidly. This works great. Instead of running queries with
where MY_COL LIKE '%,77,%' and MY_COL LIKE '%,115,%'
I can do
where CONTAINS(MY_COL,'77 and 115')
However, when I try to search for a single character it doesn't work.
where CONTAINS(MY_COL,'1')
But I know that there should be records returned! I quickly found that I need to edit the Noise file and rebuild the index. But even after doing that it still doesn't work.

Working with relational databases that way is going to hurt.
Use a proper schema. Either store the values in different rows or use an array datatype for the column.
That will make solving the problem trivial.
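
As a rough illustration of the "different rows" option, here is a Python sketch using the standard sqlite3 module. The table record_values, its columns, the index name, and the sample strings are all hypothetical; the question's real schema is not shown, so this only demonstrates the shape of the approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per (record, number) instead of one CSV string per record.
conn.execute("CREATE TABLE record_values (record_id INTEGER, value INTEGER)")
conn.execute("CREATE INDEX idx_value ON record_values (value, record_id)")

# Invented sample rows in the question's ",n,n,n," format.
legacy = {1: ",1,76,77,115,", 2: ",12,15,33,", 3: ",77,115,200,"}
for record_id, csv_text in legacy.items():
    numbers = [int(part) for part in csv_text.split(",") if part]
    conn.executemany(
        "INSERT INTO record_values VALUES (?, ?)",
        [(record_id, n) for n in numbers],
    )

def records_with_all(*wanted):
    # A record qualifies when it contains every requested number.
    marks = ",".join("?" * len(wanted))
    sql = (
        f"SELECT record_id FROM record_values WHERE value IN ({marks}) "
        "GROUP BY record_id HAVING COUNT(DISTINCT value) = ? ORDER BY record_id"
    )
    return [row[0] for row in conn.execute(sql, (*wanted, len(set(wanted))))]

both = records_with_all(77, 115)  # replaces CONTAINS(MY_COL,'77 and 115')
single = records_with_all(1)      # single-digit search needs no noise-word tuning
print(both, single)
```

Because the values are stored as integers, a search for 1 is an ordinary indexed equality test and cannot accidentally match 12 or 15.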

I fixed my own problem, although I'm not exactly sure what fixed it.
I dropped my table and populated a new one (my program does batch processing) and created a new Full Text Index. Maybe I wasn't being patient enough to allow the indexing to fully rebuild.

Agreed. How does 12,15,33 not return that record for a search for 1 with fulltext? Use an actual table schema to accomplish this.
