Like with Regular Expression not giving right result in sql server - sql

declare @test varchar(50)
set @test='sad#fd'
if @test LIKE '%[a-zA-Z0-9 ./,()?''+-]%'
print 'yes'
else
print 'no'
The code above prints yes, but it should print no, because '#' is not in the allowed set in my regular expression. Is there anything wrong?
I want to handle this in a stored procedure, where the string must be alphanumeric with a specified list of special characters allowed. What should I do?

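The result is "yes" because the string contains the letter s, which matches the condition: the pattern only requires at least one character from the set somewhere in the string.
To see this more clearly, try running the code below: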
declare @test varchar(1000)
set @test='####'
if @test LIKE '%[a-zA-Z0-9 ./,()?''+-]%'
print 'yes'
else
print 'no'

SQL Server doesn't really have native regular expressions [1], but what you're trying to achieve can still be done with LIKE by introducing a double negative:
declare @test varchar(50)
set @test='sad#fd'
if @test NOT LIKE '%[^a-zA-Z0-9 ./,()?''+-]%'
print 'yes'
else
print 'no'
% matches any number of characters. ^ inverts a character range. So now we're asking: is the string any number of characters, then a character not in the set a-zA-Z0-9 ./,()?''+-, then any number of characters? Or, to put it another way, does this string contain any characters outside of the given set of characters? NOT LIKE then succeeds only when no such character exists.
[1] You can access a fully featured regex engine from the .NET framework by using the CLR integration. It's one of the usual samples given when talking about CLR integration, but it's not really needed here.
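To check the pattern against a few inputs at once, something like the following could be used. It is only a sketch: apart from 'sad#fd', the sample values are made up. Character ranges such as a-z follow the collation's sort order, so under some collations they may also match accented letters. Note too that an empty string passes the test, since it contains no disallowed characters.

```sql
-- Sample values are hypothetical; only 'sad#fd' comes from the question.
-- A row is reported as 'yes' when it contains no character outside the allowed set.
SELECT v.val,
       CASE WHEN v.val NOT LIKE '%[^a-zA-Z0-9 ./,()?''+-]%'
            THEN 'yes' ELSE 'no' END AS only_allowed_chars
FROM (VALUES ('sad#fd'), ('sadfd'), ('a.b (c)'), ('####')) AS v(val);
```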

Related

SQL statement to check for empty string - T-SQL

I am trying to write a WHERE clause for where a certain string variable is not null or empty. The problem I am running into is that certain non-empty strings equal the N'' literal. For instance:
declare @str nvarchar(max) = N'㴆';
select case when @str = N'' then 1 else 0 end;
Yields 1. From what I can gather on Wikipedia, this particular unicode character is a pictograph for submerging something, which is not semantically equal to an empty string. Also, the string length is 1, at least in T-SQL.
Is there a better (accurate) way to check a T-SQL variable for the empty string?
I found a blog, https://bbzippo.wordpress.com/2013/09/10/sql-server-collations-and-string-comparison-issues/
which explained that
The problem is because the “default” collation setting
(SQL_Latin1_General_CP1_CI_AS) for SQL Server cannot properly compare
Unicode strings that contain so called Supplementary Characters
(4-byte characters).
A fix is to use a collation that doesn't have problems with the supplementary characters. For example:
select case when N'㴆' COLLATE Latin1_General_100_CI_AS_KS_WS = N'' then 1 else 0 end;
will return 0. See the blog for more examples.
Since you are comparing to the empty string, another solution would be to test the string length.
declare @str1 nvarchar(max) = N'㴆';
select case when len(@str1) = 0 then 1 else 0 end;
This will return 0 as expected.
This also yields 0 when the string is null.
EDIT:
Thanks to devio's comment, I dug a bit deeper and found a comment from Erland Sommarskog https://groups.google.com/forum/#!topic/microsoft.public.sqlserver.server/X8UhQaP9KF0
that in addition to not supporting Supplementary Characters, the Latin1_General_CP1_CI_AS collation doesn't handle new Unicode characters correctly. So I'm guessing that the 㴆 character is a new Unicode character.
Specifying the collation Latin1_General_100_CI_AS will also fix this issue.
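One more option, offered as a sketch and not as part of the answers above: DATALENGTH counts bytes and does not compare strings, so it should not depend on the collation at all. Keep in mind that LEN ignores trailing spaces, so a string made only of spaces has LEN 0 but a non-zero DATALENGTH. Which behaviour you want is an assumption you need to settle for your own case.

```sql
-- Sketch: treat a value as "non-empty" when it is not NULL and occupies at least one byte.
DECLARE @str nvarchar(max) = N'㴆';

SELECT CASE WHEN @str IS NOT NULL AND DATALENGTH(@str) > 0
            THEN 1 ELSE 0 END AS is_non_empty;
```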

SQL Server's ISNUMERIC function

I need to check whether a column is numeric or not in SQL Server 2012.
This is my CASE expression:
CASE
WHEN ISNUMERIC(CUST_TELE) = 1
THEN CUST_TELE
ELSE NULL
END AS CUSTOMER_CONTACT_NO
But when the '78603D99' value is reached, it returns 1 which means SQL Server considered this string as numeric.
Why is that?
How can I avoid this kind of issue?
Unfortunately, the ISNUMERIC() function in SQL Server has many quirks. It's not exactly buggy, but it rarely does what people expect it to when they first use it.
However, since you're using SQL Server 2012 you can use the TRY_PARSE() function which will do what you want.
This returns NULL:
SELECT TRY_PARSE('7860D399' AS int)
This returns 7860399:
SELECT TRY_PARSE('7860399' AS int)
https://msdn.microsoft.com/en-us/library/hh213126.aspx
Obviously, this works for datatypes other than INT as well. You say you want to check that a value is numeric, but I think you mean INT.
Although try_convert() or try_parse() works for a built-in type, it might not do exactly what you want. For instance, it might allow decimal points, negative signs, and limit the length of digits.
Also, isnumeric() is going to recognize negative numbers, decimals, and exponential notation.
If you want to test a string only for digits, then you can use not like logic:
(CASE WHEN CUST_TELE NOT LIKE '%[^0-9]%'
THEN CUST_TELE
END) AS CUSTOMER_CONTACT_NO
This simply says that CUST_TELE contains no characters that are not digits.
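To compare the approaches discussed here side by side, a query along these lines may help. It is a sketch with made-up sample values (only '78603D99' comes from the question), and it adds an explicit empty-string test because an empty string also contains no non-digit characters.

```sql
-- Sample values are hypothetical, apart from '78603D99'.
SELECT v.val,
       ISNUMERIC(v.val)        AS isnumeric_result,  -- 1 for many things that are not plain integers
       TRY_CONVERT(int, v.val) AS as_int,            -- NULL when the conversion to int fails
       CASE WHEN v.val <> '' AND v.val NOT LIKE '%[^0-9]%'
            THEN 1 ELSE 0 END  AS digits_only        -- 1 only for non-empty, all-digit strings
FROM (VALUES ('78603D99'), ('7860399'), ('-12'), ('1e5'), ('')) AS v(val);
```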
Nothing substantive to add but a couple warnings.
1) ISNUMERIC() won't catch blanks but they will break numeric conversions.
2) If there is a single non-numeric character in the field and you use REPLACE to get rid of it you still need to handle the blank (usually with a CASE statement).
For instance if the field contains a single '-' character and you use this:
cast(REPLACE(myField, '-', '') as decimal(20,4)) myNumField
it will fail and you'll need to use something like this:
CASE WHEN myField IN ('','-') THEN NULL ELSE cast(REPLACE(myField, '-', '') as decimal(20,4)) END myNumField

Cast as INT check SQL

I need to update a column in a table. But only where the cast to an INT fails.
I have the following so far - but this updates all the records.
begin try
select cast(customerid as int) from Table_staging;
end try
begin catch
update Table_staging
set incorrectformat = 0
end catch
update Table_staging
set incorrectformat = 0
where not customerid like '%[^0-9]%'
should be sufficient. Basically, we mark incorrectformat for any row where customerid is not a string that contains any number of characters, then a character not in the set 0-9, then any number of characters.
I.e. the values that don't match this are precisely the ones only containing digits.
And the main issue with ISNUMERIC is that it answers a question that I don't believe anyone would ever, rightfully, ask - "Can this character string be converted to any of the numeric data types? I don't care which of those types it can be converted to, and there's no need to tell me which ones in response either"
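One caveat worth checking before relying on the pattern alone: a string of digits can still be too large for an int. The sketch below uses a hypothetical table variable in place of Table_staging and assumes SQL Server 2012 or later for TRY_CONVERT (the question does not state a version). It only previews the two predicates and does not update anything.

```sql
-- @staging is a stand-in for Table_staging; the sample values are made up.
DECLARE @staging TABLE (customerid varchar(20));
INSERT @staging VALUES ('123'), ('12a'), ('-5'), ('99999999999');

SELECT customerid,
       CASE WHEN customerid NOT LIKE '%[^0-9]%' THEN 1 ELSE 0 END      AS digits_only,
       CASE WHEN TRY_CONVERT(int, customerid) IS NULL THEN 1 ELSE 0 END AS int_cast_fails
FROM @staging;
```

Rows where the two columns disagree, such as a digits-only value beyond the int range, are the ones to decide on explicitly.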

How to write SQL query with many % wildcard characters

I have a column in a SQL Server table as:
companystring = {"CompanyId":0,"CompanyType":1,"CompanyName":"Test
215","TradingName":"Test 215","RegistrationNumber":"Test
215","Email":"test215#tradeslot.com","Website":"Test
215","DateStarted":"2012","CompanyValidationErrors":[],"CompanyCode":null}
I want to query the column to search for
companyname like '%CompanyName":"%test 2%","%'
I want to know if I'm querying correctly, because for some search string it does not yield the proper result. Could anyone please help me with this?
Edit: I have removed the format bold
% is a special character that means a wildcard. If you want to find the actual character inside a string, you need to escape it.
DECLARE @d TABLE(id INT, s VARCHAR(32));
INSERT @d VALUES(1,'foo%bar'),(2,'fooblat');
SELECT id, s FROM @d WHERE s LIKE 'foo[%]%'; -- returns only 1
SELECT id, s FROM @d WHERE s LIKE 'foo%'; -- returns both 1 and 2
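Besides wrapping the wildcard in brackets, LIKE also accepts an ESCAPE clause. The sketch below repeats the same sample data; the choice of '!' as the escape character is arbitrary.

```sql
DECLARE @d TABLE(id INT, s VARCHAR(32));
INSERT @d VALUES(1,'foo%bar'),(2,'fooblat');

-- '!%' is a literal percent sign; the trailing '%' is still a wildcard.
SELECT id, s FROM @d WHERE s LIKE 'foo!%%' ESCAPE '!'; -- returns only 1
```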
Depending on your platform, you might be able to use some combination of regular expressions and/or lambda expressions that are built into its main libraries. For example, .NET has LINQ, which is a powerful tool that abstracts querying and can help with searches.
It looks like you have JSON data stored in a column called "companystring". If you want to search within the JSON data from SQL things get very tricky.
I would suggest you look at doing some extra processing at insert/update to expose the properties of the JSON you want to search on.
If you search in the way you describe, you would actually need to use Regular Expressions or something else to make it reliable.
In your example you say you want to search for:
companystring like '%CompanyName":"%test 2%","%'
I understand this as searching inside the JSON for the string "test 2" somewhere inside the "CompanyName" property. Unfortunately this would also return results where "test 2" was found in any other property after "CompanyName", such as the following:
-- formatted for readability
companystring = '{
"CompanyId":0,
"CompanyType":1,
"CompanyName":"Test Something 215",
"TradingName":"Test 215",
"RegistrationNumber":"Test 215",
"Email":"test215#tradeslot.com",
"Website":"Test 215",
"DateStarted":"2012",
"CompanyValidationErrors":[],
"CompanyCode":null}'
Even though "test 2" isn't in the CompanyName, it is in the text following it (TradingName), which is also followed by the string "," so it would meet your search criteria.
Another option would be to create a view that exposes the value of CompanyName using a column defined as follows:
LEFT(
SUBSTRING(companystring, CHARINDEX('"CompanyName":"', companystring) + LEN('"CompanyName":"'), LEN(companystring)),
CHARINDEX('"', SUBSTRING(companystring, CHARINDEX('"CompanyName":"', companystring) + LEN('"CompanyName":"'), LEN(companystring))) - 1
) AS CompanyName
Then you could query that view using WHERE CompanyName LIKE '%test 2%' and it would work, although performance could be an issue.
The logic of the above is to get everything after "CompanyName":":
SUBSTRING(companystring, CHARINDEX('"CompanyName":"', companystring) + LEN('"CompanyName":"'), LEN(companystring))
Up to but not including the first " in the sub-string (which is why it is used twice).
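If the server happens to be SQL Server 2016 or later (an assumption, since the question does not state a version), the built-in JSON_VALUE function can pull out a single property without the string arithmetic. A minimal sketch, using a shortened copy of the example data in a variable:

```sql
-- Shortened sample document; JSON_VALUE requires SQL Server 2016 or later.
DECLARE @companystring nvarchar(max) =
    N'{"CompanyId":0,"CompanyName":"Test Something 215","TradingName":"Test 215"}';

SELECT JSON_VALUE(@companystring, '$.CompanyName') AS CompanyName,
       CASE WHEN JSON_VALUE(@companystring, '$.CompanyName') LIKE '%test 2%'
            THEN 1 ELSE 0 END AS name_matches;  -- only CompanyName is searched, not TradingName
```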

advanced word searching in sql

I need to write a query in SQL Server that selects rows containing two words with (at least / at most / exactly) a specified number of words between them.
I wrote this code for an exact number of words in between:
SELECT simpledtext
FROM booktexts
WHERE simpledtext LIKE '%hello [^ ] [^ ] search%'
and this code for a minimum number of words in between:
SELECT simpledtext
FROM booktexts
WHERE simpledtext LIKE '%hello [^ ] [^ ] % search%'
but I don't know how to write the T-SQL for a maximum number of words in between.
My other question: is it possible to implement these kinds of queries with full-text search in SQL Server 2012?
Your like string would only match single character words. If this is what you need, you could put something together like this:
declare @str1 varchar(1024) = 'and hello w w w search how are you',
@str2 varchar(1024) = 'and hello w w search how are you',
@likeStr varchar(512),
@pos int,
@maxMatch int;
set @maxMatch = 2;
set @pos = 0;
set @likeStr = '%hello';
while (@pos < @maxMatch)
begin
set @likeStr += ' [^ ]';
set @pos += 1;
end
set @likeStr += ' search%';
select @likeStr, (case when @str1 like @likeStr then 1 else 0 end), (case when @str2 like @likeStr then 1 else 0 end)
If this isn't what you need, and you know how many characters the words are going to be, you could use [a-zA-Z] in the like string in the loop.
However, I expect this also will not be what you're after. My suggestion would then be to abandon like strings, and move on to the more sophisticated regular expressions.
Unfortunately you can't load System.dll directly into SQL Server 2008 (I think this also applies to SQL Server 2012), so you would need to create a custom .NET assembly and load this into your database. You should use the IsDeterministic annotation in your .NET code, and load the custom assembly into SQL Server with permission_set = safe. This should ensure you get parallelism for your function, and that you can use it in places like computed columns.
SQL Server is very good at running .NET code, i.e. it can be very
performant. Writing what you need in regular expressions should be
very easy.
As for Full Text Search, contains() is basically a Full Text predicate, and you would have to enable this in SQL Server to use it. near() is used inside contains() predicates. I think this is bulky for what you want to do, both in terms of supported functionality (it does inflections of words for fuzzy matching), and what you need to enable to use it (runs an extra windows service).
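For reference, the "at most N words in between" case might look roughly like this with the custom proximity form of NEAR, which was introduced in SQL Server 2012. This is a sketch under assumptions: it requires a full-text index on booktexts.simpledtext, and how stopwords and word breaking count toward the distance depends on the full-text configuration, so the results should be checked against real data.

```sql
-- Assumes a full-text index exists on booktexts(simpledtext).
-- NEAR((terms), 2, TRUE): at most 2 non-search terms between 'hello' and 'search',
-- with the terms required to appear in the order given.
SELECT simpledtext
FROM booktexts
WHERE CONTAINS(simpledtext, 'NEAR((hello, search), 2, TRUE)');
```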