How to use dash (-) character in LIKE query - sql

In my database I have to filter records where the name ends with -N,
but when I write the WHERE clause as in the following query it returns no records, presumably because - is a wildcard character.
I am using this query in Oracle database:
select * from product where productname like '%-N'
but the table does contain records whose product name ends this way.

At first I thought that Oracle allowed a range such as [a-z] in the LIKE operator, which would mean that - needs special treatment. So my first suggestion was to escape the dash:
select * from product where productname like '%\-N' ESCAPE '\'
https://docs.oracle.com/cd/B13789_01/server.101/b10759/conditions016.htm
On the other hand, as @Amadan correctly said in the comments, Oracle's LIKE operator recognises only two wildcard characters: _ and %.
That means the - never needed escaping. In fact, Oracle accepts the escape character only before %, _ or the escape character itself, so the query above raises ORA-01424 rather than returning rows.
So most likely the dash in your query is not the same dash character that is stored in your table. There are many different dashes and hyphens in Unicode. The most common are Hyphen-Minus (U+002D), En Dash (U+2013, Alt+0150) and Em Dash (U+2014, Alt+0151):
- – —
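To make the look-alike problem concrete, here is a small Python sketch (not Oracle SQL; the helper names describe and ends_with_hyphen_n are made up for this illustration). It prints the code point of each of the three characters above and shows that a suffix test written with a plain hyphen-minus, like the pattern '%-N', does not match a value that was stored with an en dash or em dash.

```python
import unicodedata

# The three look-alike characters listed above.
DASHES = ["\u002d", "\u2013", "\u2014"]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def ends_with_hyphen_n(name):
    """Rough analogue of LIKE '%-N' with a plain hyphen-minus in the pattern."""
    return name.endswith("-N")


for ch in DASHES:
    # Only the hyphen-minus row reports True.
    print(describe(ch), ends_with_hyphen_n("ABC" + ch + "N"))
```

Inside Oracle, something like DUMP(productname, 1016) on a suspect row should show the stored bytes in hex, which would let you check which dash is really in the table.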

'-' is not a wildcard for like (as mentioned elsewhere).
So, start with names that end in 'N':
where productname like '%N'
Does this do what you want?
If not, you can then go to a regular expression. For instance, to find anything other than a digit or letter before the 'N':
where regexp_like(productname, '[^a-zA-Z0-9]N$')
You can refine regexp_like() if this doesn't return what you expect.

Your query should work as expected. Here's an example:
WITH cteData as (SELECT 'ABC-N' AS PRODUCTNAME FROM DUAL UNION ALL
SELECT 'ABCN' AS PRODUCTNAME FROM DUAL UNION ALL
SELECT 'ABC' AS PRODUCTNAME FROM DUAL UNION ALL
SELECT 'DEFGHI-J-K-L-M-N' AS PRODUCTNAME FROM DUAL UNION ALL
SELECT 'DEFGHI-J-K-L-M-' AS PRODUCTNAME FROM DUAL UNION ALL
SELECT 'MY DOG HAS FLEAS' AS PRODUCTNAME FROM DUAL)
SELECT *
FROM cteData
WHERE PRODUCTNAME LIKE '%-N';
As expected, this returns:
ABC-N
DEFGHI-J-K-L-M-N
If you're not getting the results you expected, there's something else going on that you haven't shown us.
SQLFiddle here
Best of luck.

Related

ORACLE: How to use regexp_like to find a string with single quotes between two characters?

I need to query the DB for all records that have two single quotes between characters. Example: We've, who's.
I have the regex https://regex101.com/r/6MtB9j/1 but it doesn't work with REGEXP_LIKE.
Tried this
SELECT content
FROM MyTable
WHERE REGEXP_LIKE (content, '(?<=[a-zA-Z])''(?=[a-zA-Z])')
Appreciate the help!
Oracle regex does not support lookarounds.
You do not actually need lookaround in this case, you can use
SELECT content
FROM MyTable
WHERE REGEXP_LIKE (content, '[a-zA-Z]''[a-zA-Z]')
This works because REGEXP_LIKE only tests whether a match exists: it returns true if there is one and false otherwise, which decides whether the row is fetched.
Lookarounds are useful in case you need to replace or extract values, when matches may overlap.
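A quick way to convince yourself of both points is the Python sketch below. It uses Python's re module, which does support lookarounds, purely as a stand-in; the sample strings and the function name has_quote_between_letters are invented for the illustration. For a yes/no test the two patterns agree, while for counting matches they differ as soon as matches would overlap.

```python
import re

# Pattern from the question (needs lookaround support) and the plain rewrite.
LOOKAROUND = re.compile(r"(?<=[a-zA-Z])'(?=[a-zA-Z])")
PLAIN = re.compile(r"[a-zA-Z]'[a-zA-Z]")


def has_quote_between_letters(text, pattern=PLAIN):
    """Existence test, analogous to what REGEXP_LIKE answers."""
    return pattern.search(text) is not None


for sample in ["We've", "who's", "'quoted'", "no quote"]:
    # Both patterns give the same yes/no answer for every sample.
    print(sample,
          has_quote_between_letters(sample, LOOKAROUND),
          has_quote_between_letters(sample, PLAIN))

# Counting is where they differ: the plain pattern consumes the letters,
# so the second quote in a'b'c is no longer matched.
print(len(LOOKAROUND.findall("a'b'c")), len(PLAIN.findall("a'b'c")))
```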
If you just need a single quote in a string, you can use:
where content like '%''%'
If they specifically need to be letters, then you need a regular expression:
regexp_like(content, '[a-zA-Z][''][a-zA-Z]')
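or, using Oracle's q-quote syntax so the quote does not have to be doubled (a backslash does not escape a quote inside an Oracle string literal):
regexp_like(content, q'{[a-zA-Z]'[a-zA-Z]}')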
If I understand well, you may need something like
regexp_count(content, '[a-zA-Z]''[a-zA-Z]') = 2.
For example, this
with myTable(content) as
(
select q'[what's]' from dual union all
select q'[who's, what's]' from dual union all
select q'[who's, what's, I'm]' from dual
)
select *
from myTable
where regexp_count(content, '[a-zA-Z]''[a-zA-Z]') = 2
gives
CONTENT
------------------
who's, what's
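The same counting logic, sketched in Python for anyone who wants to try it outside the database. This is only an analogue of regexp_count, and count_quoted_pairs is a name made up here. It counts non-overlapping matches and keeps the rows where the count is exactly 2.

```python
import re

# Same pattern as in the query above: a quote with a letter on each side.
PAIR = re.compile(r"[a-zA-Z]'[a-zA-Z]")


def count_quoted_pairs(text):
    """Count non-overlapping letter-quote-letter matches."""
    return len(PAIR.findall(text))


rows = ["what's", "who's, what's", "who's, what's, I'm"]

# Keep only the rows with exactly two matches.
matching = [row for row in rows if count_quoted_pairs(row) == 2]
print(matching)
```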

How to remove leftmost group of numbers from string in Oracle SQL?

I have a string like T_44B56T4 that I'd like to make T_B56T4. I can't use positional logic because the string could instead be TE_2BMT that I'd like to make TE_BMT.
What is the most concise Oracle SQL logic to remove the leftmost group of consecutive digits from the string?
EDIT:
regexp_replace is unavailable, but I have LTRIM, REPLACE, SUBSTR, etc.
Would this fit the bill? I am assuming there are alphanumeric characters, then an underscore, and then the digits you want to remove, followed by anything.
select regexp_replace(s, '^([[:alnum:]]+)_\d*(.*)$', '\1_\2')
from (
select 'T_44B56T4' s from dual union all
select 'TXM_1JK7B' from dual
)
It uses a regular expression with capture groups.
The alphanumeric characters before the underscore are matched and stored in the first group. Then comes the underscore followed by zero or more digits (as many as possible are matched), and everything after that is stored in the second group.
If there is a match, the string is replaced by the content of the first group, followed by an underscore and the content of the second group.
If there is no match, the string is not changed.
It seems that you must use standard string functions, as regular expression functions are not available to you. (Comment under Gordon Linoff's answer; it would help if you would add the same at the bottom of your original question, marked clearly as EDIT).
Also, it seems that the input will always have at least one underscore, and any digits that must be removed will always be immediately after the first underscore.
If so, here is one way you could solve it:
select s, substr(s, 1, instr(s, '_')) ||
ltrim(substr(s, instr(s, '_') + 1), '0123456789') as result
from (
select 'T_44B56T4' s from dual union all
select 'TXM_1JK7B' from dual union all
select '34_AB3_1D' from dual
)
S         RESULT
--------- ------------------
T_44B56T4 T_B56T4
TXM_1JK7B TXM_JK7B
34_AB3_1D 34_AB3_1D
I added one more test string, to show that only digits immediately following the first underscore are removed; any other digits are left unchanged.
Note that this solution would very likely be faster than regexp solutions, too (assuming that matters; sometimes it does, but often it doesn't).
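For comparison, here is the same idea as a Python sketch: split at the first underscore, then strip leading digits from what follows. The function name is invented for this illustration. One difference is called out in the comments: when there is no underscore at all, this sketch returns the input unchanged, which is an assumption of mine and not something the SQL above guarantees.

```python
def strip_digits_after_first_underscore(s):
    """Remove the run of digits that immediately follows the first underscore."""
    head, sep, tail = s.partition("_")
    if not sep:
        # Assumption for this sketch: no underscore means nothing to do.
        return s
    # lstrip with a character set plays the role of LTRIM(..., '0123456789').
    return head + sep + tail.lstrip("0123456789")


for value in ["T_44B56T4", "TXM_1JK7B", "34_AB3_1D", "TE_2BMT"]:
    print(value, strip_digits_after_first_underscore(value))
```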
If I understand correctly, you can use regexp_replace():
select regexp_replace('T_44B56T4', '_[0-9]+', '_') from dual
Here is a db<>fiddle with your two examples.
Note: Your question says the leftmost grouping, but the examples all have the number following an underscore, so the underscore seems to be important.
EDIT:
If you really just want the first string of digits replaced without reference to the underscore:
select regexp_replace(code, '[0-9]+', '', 1, 1)
from (select 'T_44B56T4' as code from dual union all select 'TE_2BMT' from dual ) t

get everything before a string including itself oracle

I need to get everything before a string, including the string itself, and then replace it with something else. For example, if I have a column value such as 28/29/81/732536/1496071, then I want to select everything up to and including 81, i.e. I want 28/29/81 from it, and replace that with some other string. I have tried the below, but I am getting only 28/29.
SELECT SUBSTR(eda.ATTRIBUTE_VALUE, 0, INSTR(eda.ATTRIBUTE_VALUE, '81')-2) AS output, ATTRIBUTE_VALUE
FROM EVENT_DYNAMIC_ATTRIBUTE eda
The solution has to work when the "token" (the '81' in your example) appears between two slashes, at the very beginning of the string before a slash, or right after the last slash at the end of the string. It should not match if '81' appears only as part of a longer token (between slashes, before the first slash, or after the last one). Also, if the token appears more than once, it should be replaced (together with everything before it) only once, and if it doesn't appear at all, the original string should be unchanged.
If these are the rules, then you can do something like I show below. If any of the rules are different, the solution can be modified to accommodate.
I created a few input strings to test all these cases in a WITH clause. I also created the "search token" and the "replacement text" in a second subquery in the WITH clause. The entire WITH clause should be replaced: it is not part of the solution, it is only for my own testing. In the main query you should use your actual table and column names (and/or hardcoded text).
I use REGEXP_REPLACE to find the token and replace it and everything that comes before it (but not the slash after it, if there is one) with the replacement text. I must be careful with that slash after the search token; I use a backreference in the replacement string in REGEXP_REPLACE for that purpose.
with
event_dynamic_attribute ( attribute_value ) as (
select '28/29/81/732536/1496071' from dual union all
select '29/33/530813/340042/88' from dual union all
select '81/6883/3902/81/993' from dual union all
select '123/45/6789/81' from dual
),
substitution ( token, replacement ) as (
select '81', 'mathguy is great' from dual
)
select attribute_value,
regexp_replace (attribute_value, '(^|.*?/)' || token || '(/|$)',
replacement || '\2', 1, 1) new_attrib_value
from event_dynamic_attribute cross join substitution
;
ATTRIBUTE_VALUE         NEW_ATTRIB_VALUE
----------------------- ----------------------------------------
28/29/81/732536/1496071 mathguy is great/732536/1496071
29/33/530813/340042/88  29/33/530813/340042/88
81/6883/3902/81/993     mathguy is great/6883/3902/81/993
123/45/6789/81          mathguy is great
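If you want to experiment with the pattern outside the database, the rules above can be sketched in Python. This is an analogue using Python's re module, not Oracle's regex engine, and replace_through_token is a hypothetical helper name. The group after the token carries the trailing slash (or nothing at end of string) back into the result, and count=1 limits the replacement to the first occurrence.

```python
import re


def replace_through_token(value, token, replacement):
    """Replace everything up to and including the first whole-token match."""
    # (^|.*?/) : start of string, or the shortest prefix ending in a slash
    # (/|$)    : the slash after the token, or end of string
    pattern = r"(^|.*?/)" + re.escape(token) + r"(/|$)"
    # A function is used for the replacement so its text is taken literally.
    return re.sub(pattern, lambda m: replacement + m.group(2), value, count=1)


samples = [
    "28/29/81/732536/1496071",
    "29/33/530813/340042/88",
    "81/6883/3902/81/993",
    "123/45/6789/81",
]
for s in samples:
    print(s, "->", replace_through_token(s, "81", "mathguy is great"))
```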
You can use something like this:
SELECT 'STRING_TO_REPLACE_WITH' || SUBSTR(eda.ATTRIBUTE_VALUE, INSTR(eda.ATTRIBUTE_VALUE, '81') + 2) AS output
FROM EVENT_DYNAMIC_ATTRIBUTE eda;

How to match and replace sections of a string in SQL

I'm pulling a list of popular sites from my database, but I want to combine results that are from the same domain. I've been able to do this partially by using:
REGEXP_REPLACE(site, '%|^www([123])?\.|^m\.|^mobile\.|^desktop\.')) as site
so that "www.facebook.com" and "facebook.com" or "m.facebook.com"
- all of which appear in the database - are treated as the same when I do a select distinct.
However, I want to take this a step further by writing an expression that looks at each string between periods. If a match is found consecutively in three or more strings between periods, then I want to treat those as the same. I simply can't predict every possible string that could come before "facebook.com", or any other site.
So for example:
"my.careerone.com.au" and
"careerone.com.au" match in three places.
Or "yahoo.realestate.com.au" and "rs.realestate.com.au" match in three places.
Any ideas on how to achieve this?
@David's code will work in Vertica as well, though perhaps not as well performance-wise.
You can use Vertica's own built-in functions such as TRIM and REGEXP_REPLACE.
After borrowing @David Faber's regexp, I ended up with this:
select TRIM(LEADING '.' from REGEXP_REPLACE(col_name,'^.*((\.[^.]+){3})$', '\1')) AS fixed_dn from table_name;
I don't have Vertica available so I tested this in Oracle SQL (which does have REGEXP_REPLACE() that is similar to Vertica's). Not sure what the CTE syntax would be in Vertica but you'll be querying against a table anyway:
WITH d1 AS (
SELECT 'my.careerone.com.au' AS domain_nm FROM dual
UNION ALL
SELECT 'careerone.com.au' FROM dual
UNION ALL
SELECT 'yahoo.realestate.com.au' FROM dual
UNION ALL
SELECT 'rs.realestate.com.au' FROM dual
)
SELECT domain_nm, TRIM('.' FROM REGEXP_REPLACE(domain_nm, '^.*((\.[^.]+){3})$', '\1')) AS domain_nm_fix
FROM d1;
What REGEXP_REPLACE() does here is strip the highest-level subdomains from the domain name, provided there are more than three levels. If there are only three levels, nothing is replaced because the regex won't match. When it does match, the captured group starts with a . character, which is why the leading . then has to be trimmed. So, for example, careerone.com.au will be unaltered, while my.careerone.com.au will be changed to .careerone.com.au by REGEXP_REPLACE() and then to careerone.com.au by the trim.
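Here is the same regex tried out in a Python sketch, for readers who have neither Vertica nor Oracle at hand. It is an analogue only, and base_domain is a name invented for this example. It keeps the last three dot-separated labels and then trims the leading dot that the capture group brings along.

```python
import re

# Same pattern as above: keep the last three ".label" segments.
LAST_THREE = re.compile(r"^.*((\.[^.]+){3})$")


def base_domain(domain):
    """Reduce a host name to its last three labels, if it has more than three."""
    # With three labels or fewer the pattern does not match and sub() returns
    # the input unchanged; strip(".") removes the captured leading dot.
    return LAST_THREE.sub(r"\1", domain).strip(".")


for name in ["my.careerone.com.au", "careerone.com.au",
             "yahoo.realestate.com.au", "rs.realestate.com.au"]:
    print(name, "->", base_domain(name))
```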

RegEx: Repeated identical vowels in a string - Oracle SQL

I need to only display those strings (name of manufacturers) that contain 2 or more identical vowels in Oracle11g. I am using a RegEx to find this.
SELECT manuf_name "Manufacturer", REGEXP_LIKE(manuf_name,'([aeiou])\2') Counter FROM manufacturer;
For example:
The RegEx accepts
OtterBox
Abca
abcA
The RegEx rejects
Samsung
Apple
I am not sure how to proceed ahead.
I think you want something like this:
WITH mydata AS (
SELECT 'OtterBox' AS manuf_name FROM dual
UNION ALL
SELECT 'Apple' FROM dual
UNION ALL
SELECT 'Samsung' FROM dual
)
SELECT * FROM mydata
WHERE REGEXP_LIKE(manuf_name, '([aeiou]).*\1', 'i');
I am not sure why you used \2 as a backreference instead of \1 -- \2 doesn't refer to anything in this regex. Also, note the wildcard and quantifier .* to indicate that there can be any number of any character between the first occurrence of the vowel and the second. Third, note the 'i' parameter to indicate a case-insensitive search (which I think is what you want since you say that the regex should match "OtterBox").
SQL Fiddle here.
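The backreference idea can be checked quickly with a Python sketch. Python's re module stands in for Oracle's engine here, and has_repeated_vowel is a made-up name. The group captures one vowel, .* allows anything in between, and \1 requires the same vowel again, compared case-insensitively because of the IGNORECASE flag (the counterpart of the 'i' parameter).

```python
import re

# ([aeiou]) captures a vowel, .* skips anything, \1 demands that vowel again.
REPEATED_VOWEL = re.compile(r"([aeiou]).*\1", re.IGNORECASE)


def has_repeated_vowel(name):
    """True if some vowel occurs at least twice, ignoring case."""
    return REPEATED_VOWEL.search(name) is not None


for name in ["OtterBox", "Abca", "abcA", "Samsung", "Apple"]:
    print(name, has_repeated_vowel(name))
```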
David, yours wasn't quite working for me. What about this?
\w*([aeiou])\w*\1+\w*
https://regex101.com/r/eE3iC2/3
EDIT: updated one per suggestions:
.*([aeiou]).*\1.*
https://regex101.com/r/eE3iC2/5