SQL Server LIKE caret (^) for NOT does not work as expected - sql

I was reading the article at mssqltips and wanted to try the caret in regex. I understand regex pretty well and use it often, although not much in SQL Server queries.
For the following list of names, I had thought that 1) select * from people where name like '%[^m]%'; would return those names that do not contain 'm'. But it doesn't work like that. I know I can do 2) select * from people where name not like '%m%'; to get the result I want, but I'm just baffled why 1) doesn't work as expected.
Amy
Jasper
Jim
Kathleen
Marco
Mike
Mitchell
I am using SQL Server 2017, but here is a fiddle:
sql fiddle

'%[^m]%' would be true for any string containing a character that is not m. An expanded version would be '%[Any character not m]%'. Since all of those strings contain a character other than m, they are valid results.
If you had a string like mmm, where name like '%[^m]%' would not return that row.
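To see the difference concretely, here is a minimal Python sketch that models the two patterns with regular expressions rather than running them on SQL Server. It assumes a case-insensitive collation (modelled by lowercasing), adds the mmm string to the list from the question, and uses helper names made up for illustration.

```python
import re

names = ["Amy", "Jasper", "Jim", "Kathleen", "Marco", "Mike", "Mitchell", "mmm"]

def like_any_non_m(name):
    # Models: name LIKE '%[^m]%'
    # True when at least one character is something other than m.
    return re.search(r"[^m]", name.lower()) is not None

def not_like_m(name):
    # Models: name NOT LIKE '%m%'
    # True only when no character at all is m.
    return "m" not in name.lower()

matches_1 = [n for n in names if like_any_non_m(n)]
matches_2 = [n for n in names if not_like_m(n)]

print(matches_1)  # every name except mmm
print(matches_2)  # only the names without any m
```

Pattern 1) asks "is there a character that is not m?", while pattern 2) asks "is there no m anywhere?", which is why only the second one filters the list.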

Related

Find phone numbers with unexpected characters using SQL

I posted the same question for SQL in Oracle here and was given SQL that works.
However, I now need to do the same in a DB2 database, and when I run the same SQL there it errors out.
I need to find rows where the phone number field contains unexpected characters.
Most of the values in this field look like:
123456-7890
This is expected. However, we are also seeing character values in this field such as * and #.
I want to find all rows where these unexpected character values exist.
Expected:
Numbers are expected
Hyphen with numbers is expected (hyphen alone is not)
NULL is expected
Empty is expected
This SQL works in Oracle:
...
WHERE regexp_like(phone_num, '[^ 0123456789-]|^-|-$')
When using the same SQL above in DB2, the statement errors out.
I found it easiest to answer your question by phrasing a regex which matches the positive cases. Then, we can just use NOT to find the negative cases. DB2 supports a REGEXP_LIKE function:
SELECT *
FROM yourTable
WHERE
NOT REGEXP_LIKE(phone_num, '^[0-9]+(-?[0-9]+)*$') AND
COALESCE(phone_num, '') <> '';
Here is a demo of the regex:
Demo
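If it helps to sanity-check the pattern outside the database, the positive-match idea can be sketched in Python. This is only a model of the query's logic, not DB2 itself: the function name is made up, and Python's re engine is assumed to treat this simple pattern the same way as REGEXP_LIKE.

```python
import re

# Same pattern as in the query: digit groups separated by single hyphens.
VALID_PHONE = re.compile(r"^[0-9]+(-?[0-9]+)*$")

def is_unexpected(phone_num):
    # NULL (None) and empty strings are expected values, so never flag them.
    if phone_num is None or phone_num == "":
        return False
    # Flag everything that does not match the positive pattern.
    return VALID_PHONE.fullmatch(phone_num) is None

samples = ["123456-7890", "12345*-7890", "-123", "123-", "#", "", None]
flagged = [s for s in samples if is_unexpected(s)]
print(flagged)
```

Note that, unlike the Oracle pattern in the question, this positive pattern does not accept spaces, so values containing a space would be flagged too.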
For newer versions of Db2, regexp is the way to go. If you are on an older version (perhaps why you get an error), you can replace all accepted chars with '' and check if the result is an empty string. Can't check right now, but from memory, it would be
WHERE TRANSLATE(phone_num, '', '0123456789-')<>''
EDIT:
For what it's worth, your regexp works for V11, so you probably have an older version of Db2. Example of translate and regexp side by side:
]$ db2 "with t(s) as ( values '123456-7890', '12345*-7890' )
select s, 'regexp' as method from t
where regexp_like(s, '[^ 0123456789-]|^-|-$')
union all
select s, 'translate' as method
from t where TRANSLATE(s, '', '0123456789-')<>''"
S METHOD
----------- ---------
12345*-7890 translate
12345*-7890 regexp
2 record(s) selected.
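The "remove every accepted character and see what is left" idea can also be sketched in Python. This models the logic only, under the assumption that digits and the hyphen are the full accepted set; the names are hypothetical and nothing here runs against Db2.

```python
ACCEPTED = "0123456789-"

# Translation table that deletes every accepted character.
DELETE_ACCEPTED = str.maketrans("", "", ACCEPTED)

def has_unexpected_chars(phone_num):
    # NULL (None) and empty values are expected, so they are never flagged.
    if not phone_num:
        return False
    # Anything that survives the deletion is an unexpected character.
    return phone_num.translate(DELETE_ACCEPTED) != ""

samples = ["123456-7890", "12345*-7890", None, "", "-"]
flagged = [s for s in samples if has_unexpected_chars(s)]
print(flagged)
```

One trade-off of this approach: in this sketch a value that is only a hyphen, or that starts or ends with one, is not flagged, whereas the regexp version catches those cases.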

SQL Server and Oracle query ignoring accents

I have a table called A and a column called Keywords Varchar(255). The Keywords column can contain strings like "TEST, CÃO, ódio" and so on... with or without accents:
ID Keywords
1 TEST, CÃO, ódio, oracle, SQL, açaí
2 Valor, Deputado Rafael, Costelão, estilo
3 São Sebastião, cao, projeto de lei
I'm trying to create a SQL query that compares strings ignoring Brazilian accents (áéíóúç and so on...). So if the user searches for "cao", it should return rows 1 and 3 in the example.
I tried something like:
SELECT keywords
FROM A WHERE UPPER(TRANSLATE(keywords,
'ÁÇÉÍÓÚÀÈÌÒÙÂÊÎÔÛÃÕËÜáçéíóúàèìòùâêîôûãõëü','ACEIOUAEIOUAEIOUAOEUaceiouaeiouaeiouaoeu'))
LIKE UPPER((TRANSLATE('%cao%',
'ÁÇÉÍÓÚÀÈÌÒÙÂÊÎÔÛÃÕËÜáçéíóúàèìòùâêîôûãõëü', 'ACEIOUAEIOUAEIOUAOEUaceiouaeiouaeiouaoeu')));
But it doesn't work.
I also tried using NLS_SORT, but it is only for Oracle, and I need a query that works both on SQL Server and Oracle (it's a client requirement). How can I do that?
One issue is that Microsoft SQL Server did not have a translate function until 2017. It does now, but since it doesn't work for you, you are probably not on this version yet.
You can do a nested replace instead. This is not difficult but is tedious to write. Once it is written and tested, it will be fine.
The Microsoft SQL Server documentation explains this: https://learn.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql
You also should be aware of the character encoding that is being used in Oracle and SQL Server. With the translate and replace functions you should be OK, but if you ever transfer data via files it will be important. I have described more of this at: http://www.thedatastudio.net/dodgy_characters.htm
Here's an example for the first few characters you want to translate:
select
    replace
    (
        replace
        (
            replace
            (
                replace
                (
                    'ABÇDÉFGHÍJÁBÇDÉFGHÍJ', 'Á', 'A'
                ), 'Ç', 'C'
            ), 'É', 'E'
        ), 'Í', 'I'
    ) as clean_keyword;
Just substitute your keyword for 'ABÇDÉFGHÍJÁBÇDÉFGHÍJ'.
The result is:
ABCDEFGHIJABCDEFGHIJ
There is an example on https://learn.microsoft.com/en-us/sql/t-sql/functions/translate-transact-sql too.

SQL Server full text search few words at a time

I understand that in order to find records that contain more than one word, I need to use AND between them, like this:
select *
from table1
where contains(name , '"bob" AND "marly"')
The problem is that my user doesn't type "bob" AND "marly", he types bob marly.
Before I start parsing this string and splitting it, is there a cleaner way to do this?
I'm using SQL Server 2012.
Thanks.
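The query below checks whether name contains bob followed somewhere later by marly. Note that, unlike the AND form, it does not match when the words appear in the opposite order.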
SELECT * FROM table1 WHERE name LIKE '%'+'bob'+'%'+'marly'+'%'
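If some splitting on the application side is acceptable after all, it can be kept very small. The following Python sketch turns the user's input into the quoted AND form shown in the question; the function name is hypothetical, and it assumes the result is then passed to CONTAINS as a parameter rather than concatenated into the SQL text.

```python
def to_contains_condition(user_input):
    # Split on whitespace and drop double quotes so that each term
    # can be safely wrapped in its own pair of quotes.
    terms = [term.replace('"', "") for term in user_input.split()]
    terms = [term for term in terms if term]
    # Join the quoted terms with AND, e.g. "bob" AND "marly".
    return " AND ".join(f'"{term}"' for term in terms)

condition = to_contains_condition("bob marly")
print(condition)
```

For input that is empty or only whitespace the function returns an empty string, so the caller would need to skip the search in that case.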

How to pull a string of numbers out of a table that are placed randomly

I'm attempting to isolate eight digits from a cell that contains other numbers as well as text and no rhyme or reason to where it is placed. An example return would look something like this:
will deliver 11/07 in USA at 12:30 with conf# 12345678
I need the conf# only, but it could be at the end, beginning, or middle of the string, and I don't know how to isolate it. I'm working in DB2 so I can't use functions such as PATINDEX or CHARINDEX, so what are my other options for pulling out only "12345678" regardless of where it is located?
While DB2 doesn't have PATINDEX or CHARINDEX, it does have LOCATE.
If your DB2 version supports pureXML, you can use the regular expression support in XQuery, something like:
select xmlcast(
    xmlquery(
        ' if (fn:matches( $YOURCOLUMN, "(^|.*[^\d])(\d{8})([^\d].*$|$)")) then fn:replace( $YOURCOLUMN,"(^|.*[^\d])(\d{8})([^\d].*$|$)","$2") else "" '
    )
    as varchar(20)
)
from YOURTABLE
This assumes that the 8-digit sequence appears only once in the column. You may need to tweak the regex to support some border cases.
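To experiment with the border cases before touching the database, the same "exactly eight digits, not part of a longer number" rule can be sketched in Python. This uses lookarounds instead of the three capture groups above, which is a swapped-in formulation of the same idea; the function name is made up.

```python
import re

# Eight digits that are not preceded or followed by another digit.
EIGHT_DIGITS = re.compile(r"(?<!\d)\d{8}(?!\d)")

def extract_conf_number(text):
    # Return the first standalone 8-digit run, or "" when there is none.
    match = EIGHT_DIGITS.search(text)
    return match.group(0) if match else ""

sample = "will deliver 11/07 in USA at 12:30 with conf# 12345678"
print(extract_conf_number(sample))
```

Like the XQuery version, this returns an empty string when no standalone 8-digit sequence is present, and it ignores longer digit runs such as a 9-digit number.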

Matching exactly 2 characters in string - SQL

How can I query a column with names of people to get only the names that contain exactly 2 "a"s?
I am familiar with the % symbol that's used with LIKE, but that finds all names with even 1 a when I write %a, and I need to find only those that have exactly 2.
Please explain. Thanks in advance.
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for exactly two a characters, search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and names not like '%a%a%a%'
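A quick way to try the "two but not three" condition is to run it against a few made-up names. The sketch below uses SQLite through Python's sqlite3 module as a stand-in engine; the sample names are invented, and they avoid a capital A because SQLite's LIKE ignores ASCII case by default, which may differ from your database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (names TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?)",
    [("Sara",), ("Hannah",), ("Barbara",), ("Bob",), ("Mark",)],
)

# At least two a's, but not three or more.
rows = conn.execute(
    """
    SELECT names
    FROM people
    WHERE names LIKE '%a%a%'
      AND names NOT LIKE '%a%a%a%'
    ORDER BY names
    """
).fetchall()

exactly_two = [row[0] for row in rows]
print(exactly_two)
```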
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
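The length-difference trick can be tried the same way. This is a sketch on SQLite via Python's sqlite3 module, with invented sample data; COALESCE is used in place of Oracle's NVL as an assumption about what is portable, and in SQLite the empty string has length 0, so the fallback only matters on engines like Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (Names TEXT)")
conn.executemany(
    "INSERT INTO People VALUES (?)",
    [("Sara",), ("Barbara",), ("aa",), ("Bob",)],
)

# A name has exactly two a's when removing them shortens it by 2.
rows = conn.execute(
    """
    SELECT Names
    FROM People
    WHERE length(Names) - 2 = COALESCE(length(replace(Names, 'a', '')), 0)
    """
).fetchall()

two_a_names = {row[0] for row in rows}
print(two_a_names)
```

The row 'aa' is included on purpose, since it is the case where the stripped string becomes empty.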
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
select * from People where names like '__'; will also work (it matches names that are exactly two characters long).