Regular Expression "a{2}" not working - sql

I have a record with emp_name = "Rajat" and it is not getting returned.
My query is -
SELECT * FROM employees WHERE emp_name regexp "a{2}"
Please explain why it is not working

Your regex search for aa
https://www.regex101.com/r/yA1qA2/1
What you need is:
a.*a
https://www.regex101.com/r/hA5yS8/1

a{2} means two consecutive as. Your string doesn't match it.
To match a string with two a characters that might not be consecutive characters you should use a.*a.

a{2} will match two straight a characters, i.e.: aa.
To match 2 alternate a characters you can use:
select * from employees where emp_name REGEXP "^.*a.*a.*$";

I think all you need is LIKE with %:
LIKE - Simple pattern matching
Character Description
% Matches any number of characters, even zero
characters
_ Matches exactly one character
LIKE pattern match... succeeds only if the pattern matches the entire value
So, you can use a more specific
select * from employees where emp_name like '%a_a%';
Or, a more "generic" (allowing more characters than 1 between a and a:
select * from employees where emp_name like '%a%a%';
However, since in MySQL, SQL patterns are case-insensitive by default, so, you might have to use REGEXP with BINARY to narrow down your search results:
Prior to MySQL 3.23.4, REGEXP is case sensitive.
From MySQL 3.23.4 on, if you really want to force a REGEXP comparison
to be case sensitive, use the BINARY keyword to make one of the
strings a binary string.
SELECT * FROM employees WHERE emp_name REGEXP BINARY 'a.*a';

Related

Find phone numbers with unexpected characters using SQL in Oracle?

I need to find rows where the phone number field contains unexpected characters.
Most of the values in this field look like:
123456-7890
This is expected. However, we are also seeing character values in this field such as * and #.
I want to find all rows where these unexpected character values exist.
Expected:
Numbers are expected
Hyphen with numbers is expected (hyphen alone is not)
NULL is expected
Empty is expected
Tried this:
WHERE phone_num is not like ' %[0-9,-,' ' ]%
Still getting rows where phone has numbers.
from https://regexr.com/3c53v address you can edit regex to match your needs.
I am going to use example regex for this purpose
select * from Table1
Where NOT REGEXP_LIKE(PhoneNumberColumn, '^[+]*[(]{0,1}[0-9]{1,4}[)]{0,1}[-\s\./0-9]*$')
You can use translate()
...
WHERE translate(Phone_Number,'a1234567890-', 'a') is NOT NULL
This will strip out all valid characters leaving behind the invalid ones. If all the characters are valid, the result would be NULL. This does not validate the format, for that you'd need to use REGEXP_LIKE or something similar.
You can use regexp_like().
...
WHERE regexp_like(phone_num, '[^ 0123456789-]|^-|-$')
[^ 0123456789-] matches any character that is not a space nor a digit nor a hyphen. ^- matches a hyphen at the beginning and -$ on the end of the string. The pipes are "ors" i.e. a|b matches if pattern a matches of if pattern b matches.
Oracle has REGEXP_LIKE for regex compares:
WHERE REGEXP_LIKE(phone_num,'[^0-9''\-]')
If you're unfamiliar with regular expressions, there are plenty of good sites to help you build them. I like this one

What is this Oracle regexp matching in this production code?

Here's the code that is in production:
dynamic_sql := q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]';
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two characters that are not Y6
What's the ']' after the 6?
Then what?
I think that StackOverflow's formatting is causing some of the confusion in the answers. Oracle has a syntax for a string literal, q'[...]', which means that the ... portion is to be interpreted exactly as-is; so for instance it can include single quotes without having to escape each one individually.
But the code formatting here doesn't understand that syntax, so it is treating each single-quote as a string delimiter, which makes the result look different that how Oracle really sees it.
The expression is concatenating two such string literals together. (I'm not sure why - it looks like it would be possible to write this as a single string literal with no issues.) As pointed out in another answer/comment, the resulting SQL string is actually:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
And also as pointed out in another answer, the [^Y6] portion of the regex matches a single character, not two. So this expression should simply match any string whose first character is not 'Z' and whose second character is neither 'Y' nor '6'.
When not in couples ] means... Well... Itself:
^[^Z][^Y6]]/
^ assert position at start of the string
[^Z] match a single character not present in the list below
Z the literal character Z (case sensitive)
[^Y6] match a single character not present in the list below
Y6 a single character in the list Y6 literally (case sensitive)
] matches the character ] literally
Start at the beginning and search bizz_buzz
Match any one character that is NOT Z
Match any two one characters that is not Y or 6
What's the ']' after the 6? it's a ]
I'm afraid I have to post this here as the comment section is inappropriate for the formatting required. After your edit above that shows the entire statement, I ran this to see what the string ends up being:
select q'[ with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]]' || q'[') AND
user_code not in ('A','E','I')
order by 1]' txt
from dual;
It ended up yielding this:
with cte as
select user_id,
user_name
from user_table
where regexp_like (bizz_buzz,'^[^Z][^Y6]') AND
user_code not in ('A','E','I')
order by 1
It is apparent now that the closing bracket and quote at the end of the regex belong to the first alternate quote string and not to the regex. This is concatenating 2 alternate quoted strings which is a tad confusing as it sure looked like part of the regex. If anything you are learning the importance of comments for the poor person behind you! Please comment this accordingly when you are done figuring this out. Even include a link to this post.

Matching exactly 2 characters in string - SQL

How can i query a column with Names of people to get only the names those contain exactly 2 “a” ?
I am familiar with % symbol that's used with LIKE but that finds all names even with 1 a , when i write %a , but i need to find only those have exactly 2 characters.
Please explain - Thanks in advance
Table Name: "People"
Column Names: "Names, Age, Gender"
Assuming you're asking for two a characters search for a string with two a's but not with three.
select *
from people
where names like '%a%a%'
and name not like '%a%a%a%'
Use '_a'. '_' is a single character wildcard where '%' matches 0 or more characters.
If you need more advanced matches, use regular expressions, using REGEXP_LIKE. See Using Regular Expressions With Oracle Database.
And of course you can use other tricks as well. For instance, you can compare the length of the string with the length of the same string but with 'a's removed from it. If the difference is 2 then the string contained two 'a's. But as you can see things get ugly real soon, since length returns 'null' when a string is empty, so you have to make an exception for that, if you want to check for names that are exactly 'aa'.
select * from People
where
length(Names) - 2 = nvl(length(replace(Names, 'a', '')), 0)
Another solution is to replace everything that is not an a with nothing and check if the resulting String is exactly two characters long:
select names
from people
where length(regexp_replace(names, '[^a]', '')) = 2;
This can also be extended to deal with uppercase As:
select names
from people
where length(regexp_replace(names, '[^aA]', '')) = 2;
SQLFiddle example: http://sqlfiddle.com/#!4/09bc6
select * from People where names like '__'; also ll work

regexp for all accented characters in Oracle

I am trying to find data that has accented characters. I've tried this:
select *
from xml_tmp
where regexp_like (XMLTYpe.getClobVal(xml_tmp.xml_data), unistr('\0090'))
And it works. It finds all records where the XML data field contains É. The problem is that it only matches the upper-case E with an accent. I tried to write a more generic query to find ALL data with accented vowels (a, e, i, o, u, upper and lowercase, with any accents) using equivalence classes. I wanted a regex to match only accented vowels, but I'm not sure how to get it, as equivalence classes such as [[=e=]] match all e's (with or without accents).
Also, this does not actually work:
select *
from xml_tmp
where regexp_like (XMLTYpe.getClobVal(xml_data),'É');
(using Oracle 10g)
How about
SELECT *
FROM xml_tmp
WHERE REGEXP_LIKE
( REGEXP_REPLACE
( XMLTYpe.getClobVal(xml_tmp.xml_data),
'[aeiouAEIOU]',
'-'
)
'[[=a=][=e=][=i=][=o=][=u=]]'
)
;
? That will eliminate any unaccented vowels before performing the REGEXP_LIKE.
(It's ugly, I know. But it should work.)
After some more experimenting, I have found that this seems to work ok:
select *
from xml_tmp
where regexp_like(XMLTYpe.getClobVal(xml_data),'[^[:graph:][:space:]]')
I had thought that [:graph:] would include all upper and lower case characters, with or without accents, but it seems that it only matches unaccented characters.
Further experimentation shows that this might not work in all cases. Try these queries:
select *
from dual
where regexp_like (unistr('\0090'),'[^[:graph:][:space:]]');
DUMMY
-------
X
(the match succeeded)
So it looks like the character that's been causing me trouble matches this pattern.
select *
from dual
where regexp_like ('É','[^[:graph:][:space:]]');
DUMMY
-------
(the match failed)
When I try to run this query with the accented E as copied-and-pasted, the match fails! I guess whatever I copied-and-pasted is actually different. Ugh, I think I now hate working with changing character encodings.

SQL Wildcard Question

Sorry I am new to working with databases - I am trying to perform a query
that will get all of the characters that are similar to a string in SQL.
For example,
If I am looking for all users that begin with a certain string, something like S* or Sm* that would return "Smith, Smelly, Smiles, etc..."
Am I on the right track with this?
Any help would be appreciated, and Thanks in advance!
Sql can't do this ... placing the percentage symbol in the middle.
SELECT * FROM users WHERE last_name LIKE 'S%th'
you would need to write a where clause and an and clause.
SELECT * FROM users WHERE last_name LIKE 'S%' and last_name LIKE '%th'
The LIKE operator is what you are searching for, so for your example you would need something like:
SELECT *
FROM [Users]
WHERE LastName LIKE 'S%'
The % character is the wild-card in SQL.
to get all the users with a lastname of smith
SELECT *
FROM [Users]
WHERE LastName ='Smith'
to get all users where the lastname contains smith do this, that will also return blasmith, smith2 etc etc
SELECT *
FROM [Users]
WHERE LastName LIKE '%Smith%'
If you want everything that starts with smith do this
SELECT *
FROM [Users]
WHERE LastName LIKE 'Smith%'
Standard (ANSI) SQL has two wildcard characters for use with the LIKE keyword:
_ (underscore). Matches a single occurrence of any single character.
% (percent sign). Matches zero or more occurrences of any single character.
In addition, SQL Server extends the LIKE wildcard matching to include character set specification, rather like a normal regular expresion character set specifier:
[character-set] Matches a single character from the specified set
[^character-set] Matches a single character not in the specified set.
Character sets may be specified in the normal way as a range as well:
[0-9] matches any decimal digit.
[A-Z] matches any upper-case letter
[^A-Z0-9-] matches any character that isn't a letter, digit or hyphen.
The semantics of letter matching of course, a dependent on the collation sequence in use. It may or may not be case-sensitive.
Further, to match a literal left square bracket ('[]'), you must use the character range specifier. You won't get a syntax error, but you won't get a match, either.
where x.field like 'x[[][0-9]]'
will match text that looks like 'x[0]' , 'x[8]', etc. But
where 'abc[x' like 'abc[x'
will always be false.
you might also like the results of SOUNDEX, depending on your preference for last name similarity.
select *
from [users]
where soundex('lastname') = soundex( 'Smith' )
or upper(lastname) like 'SM%'
Your question isn't entirely clear.
If you want all the users with last name Smith, a regular search will work:
SELECT * FROM users WHERE last_name = 'Smith'
If you want all the users beginning with 'S' and ending in 'th', you can use LIKE and a wildcard:
SELECT * FROM users WHERE last_name LIKE 'S%th'
(Note the standard SQL many wildcard of '%' rather than '*').
If you really want a "sounds like" match, many databases support one or more SOUNDEX algorithm searches. Check the documentation for your specific product (which you don't mention in the question).