MySQL LIKE IN()? - sql

My current query looks like this:
SELECT * FROM fiberbox f WHERE f.fiberBox LIKE '%1740 %' OR f.fiberBox LIKE '%1938 %' OR f.fiberBox LIKE '%1940 %'
I did some looking around and can't find anything similar to a LIKE IN() - I envision it working like this:
SELECT * FROM fiberbox f WHERE f.fiberBox LIKE IN('%1740 %', '%1938 %', '%1940 %')
Any ideas? Am I thinking about the problem the wrong way, or is there some obscure command I've never seen?
MySQL 5.0.77-community-log

A REGEXP might be more efficient, but you'd have to benchmark it to be sure, e.g.
SELECT * from fiberbox where field REGEXP '1740|1938|1940';

Paul Dixon's answer worked brilliantly for me. To add to this, here are some things I observed for those interested in using REGEXP:
To Accomplish multiple LIKE filters with Wildcards:
SELECT * FROM fiberbox WHERE field LIKE '%1740 %'
OR field LIKE '%1938 %'
OR field LIKE '%1940 %';
Use REGEXP Alternative:
SELECT * FROM fiberbox WHERE field REGEXP '1740 |1938 |1940 ';
REGEXP matches its pattern anywhere in the value unless the pattern is anchored, so each alternative between the | (OR) operators behaves as if it were wrapped in % wildcards. You do not need to write (.*)1740 (.*) to get the effect of %1740 %.
If you need more control over placement of the wildcard, use some of these variants:
To Accomplish LIKE with Controlled Wildcard Placement:
SELECT * FROM fiberbox WHERE field LIKE '1740 %'
OR field LIKE '%1938 '
OR field LIKE '%1940 % test';
Use:
SELECT * FROM fiberbox WHERE field REGEXP '^1740 |1938 $|1940 (.*) test$';
Placing ^ in front of the value indicates start of the line.
Placing $ after the value indicates end of line.
Placing (.*) behaves much like the % wildcard.
The . indicates any single character, except line breaks. Placing . inside () with * (.*) adds a repeating pattern indicating any number of characters till end of line.
There are more efficient ways to narrow down specific matches, but that requires more review of Regular Expressions. NOTE: Not all regex patterns appear to work in MySQL statements. You'll need to test your patterns and see what works.
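As a rough way to check anchored patterns before running them against a table, here is a minimal sketch using Python's re module. It assumes Python's regex dialect behaves like MySQL's for these basic constructs (^, $, |, .*), and the sample values are made up. The pattern mirrors the three LIKE filters above: '1740 %', '%1938 ' and '%1940 % test'.

```python
import re

# Made-up sample values for illustration only.
rows = ["1740 A", "B 1938 ", "x 1940 y test", "1940 y test z", "A 1740 "]

# Mirrors: LIKE '1740 %' OR LIKE '%1938 ' OR LIKE '%1940 % test'
pattern = re.compile(r"^1740 |1938 $|1940 (.*) test$")

# re.search succeeds if the pattern matches anywhere in the value,
# so only the explicit ^ and $ anchors restrict the position.
matches = [r for r in rows if pattern.search(r)]
print(matches)
```

The last two sample values are rejected because "1940 y test z" does not end in " test" and "A 1740 " does not start with "1740 ".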
Finally, To Accomplish Multiple LIKE and NOT LIKE filters:
SELECT * FROM fiberbox WHERE field LIKE '%1740 %'
OR field LIKE '%1938 %'
OR field NOT LIKE '%1940 %'
OR field NOT LIKE 'test %'
OR field = '9999';
Use REGEXP Alternative:
SELECT * FROM fiberbox WHERE field REGEXP '1740 |1938 |^9999$'
OR field NOT REGEXP '1940 |^test ';
OR Mixed Alternative:
SELECT * FROM fiberbox WHERE field REGEXP '1740 |1938 '
OR field NOT REGEXP '1940 |^test '
OR field NOT LIKE 'test %'
OR field = '9999';
Notice that I put the NOT set in a separate condition. I experimented with negating patterns, look-ahead patterns, and so on, but these expressions did not appear to yield the desired results. Also note that NOT REGEXP '1940 |^test ' is true only when neither alternative matches, so it corresponds to combining the two NOT LIKE conditions with AND rather than OR. In the first example above, I use ^9999$ to indicate an exact match. This allows you to combine exact matches with wildcard matches in the same expression. However, you can also mix these types of statements, as you can see in the second example.
Regarding performance, I ran some minor tests against an existing table and found no differences between my variations. However, I imagine performance could be an issue with bigger databases, larger fields, greater record counts, and more complex filters.
As always, apply the logic above where it makes sense.
If you want to learn more about regular expressions, I recommend www.regular-expressions.info as a good reference site.

Regexp way with list of values
SELECT * FROM table WHERE field regexp concat_ws("|",
"111",
"222",
"333");

You can create an inline view or a temporary table, fill it with your values and issue this:
SELECT *
FROM fiberbox f
JOIN (
SELECT '%1740%' AS cond
UNION ALL
SELECT '%1938%' AS cond
UNION ALL
SELECT '%1940%' AS cond
) c
ON f.fiberBox LIKE cond
This, however, can return multiple rows for a fiberbox value such as '1740, 1938', so this query may suit you better:
SELECT *
FROM fiberbox f
WHERE EXISTS
(
SELECT 1
FROM (
SELECT '%1740%' AS cond
UNION ALL
SELECT '%1938%' AS cond
UNION ALL
SELECT '%1940%' AS cond
) c
WHERE f.fiberbox LIKE cond
)
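To see the difference between the two queries, here is a small sketch that runs both against an in-memory SQLite database from Python. SQLite stands in for MySQL here, which is an assumption, and the three sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fiberbox (fiberBox TEXT)")
# Made-up rows: the first one matches two of the patterns.
conn.executemany(
    "INSERT INTO fiberbox VALUES (?)",
    [("1740, 1938",), ("1940",), ("2000",)],
)

# Inline view holding the LIKE patterns, as in the answer above.
patterns = """
    SELECT '%1740%' AS cond
    UNION ALL SELECT '%1938%'
    UNION ALL SELECT '%1940%'
"""

# JOIN: one output row per (row, matching pattern) pair.
join_rows = conn.execute(
    f"SELECT f.fiberBox FROM fiberbox f JOIN ({patterns}) c "
    "ON f.fiberBox LIKE c.cond"
).fetchall()

# EXISTS: each row appears at most once, however many patterns match.
exists_rows = conn.execute(
    "SELECT f.fiberBox FROM fiberbox f WHERE EXISTS "
    f"(SELECT 1 FROM ({patterns}) c WHERE f.fiberBox LIKE c.cond)"
).fetchall()

print(len(join_rows), len(exists_rows))
```

With this data the JOIN version returns the '1740, 1938' row twice, while the EXISTS version returns it once.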

Sorry, there is no operation similar to LIKE IN in MySQL.
If you want to use the LIKE operator without a join, you'll have to do it this way:
(field LIKE value OR field LIKE value OR field LIKE value)
FYI, MySQL will not optimize that query.

Just a note to anyone trying to use REGEXP for "LIKE IN" functionality.
IN allows you to do:
field IN (
'val1',
'val2',
'val3'
)
In REGEXP this won't work
REGEXP '
val1$|
val2$|
val3$
'
It has to be in one line like this:
REGEXP 'val1$|val2$|val3$'
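The likely reason is that the line breaks become part of the pattern. A minimal sketch with Python's re module (an assumption: MySQL's REGEXP treats the embedded newlines as literal characters in the same way):

```python
import re

# The line breaks inside the quotes are part of the pattern,
# so every alternative now requires a newline character.
multi_line = """
val1$|
val2$|
val3$
"""
single_line = "val1$|val2$|val3$"

print(re.search(multi_line, "val2"))    # no match
print(re.search(single_line, "val2"))   # match
```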

This would be correct:
SELECT * FROM table WHERE field regexp concat_ws("|",
"111",
"222",
"333"
);

Flip operands
'a,b,c' LIKE CONCAT('%', field, '%')

Just a little tip:
I prefer to use the variant RLIKE (exactly the same command as REGEXP) as it sounds more like natural language, and is shorter; well, just 1 char.
The "R" prefix is for Reg. Exp., of course.

You can get the desired result with the help of regular expressions.
SELECT fiberbox FROM fiberbox WHERE fiberbox REGEXP '1740|1938|1940';
Do not wrap the alternatives in square brackets. '[1740|1938|1940]' and '[174019381940]' are character classes that match any single listed character, so they would also match a value such as '5000'.
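A bracketed pattern such as '[1740|1938|1940]' is a character class, not a list of alternatives. The sketch below uses Python's re module to show the difference (assuming character classes work the same way in MySQL's REGEXP; the sample values are made up).

```python
import re

# Made-up sample values.
values = ["1740 A", "1938 B", "5000 C", "2256"]

# Character class: matches any ONE of the characters 1 7 4 0 | 9 3 8.
char_class = re.compile(r"[1740|1938|1940]")
# Alternation: matches one of the three whole numbers.
alternation = re.compile(r"1740|1938|1940")

class_hits = [v for v in values if char_class.search(v)]
alt_hits = [v for v in values if alternation.search(v)]
print(class_hits)
print(alt_hits)
```

"5000 C" is picked up by the character class only because it contains a 0.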

You can use like this too:
SELECT
*
FROM
fiberbox f
JOIN (
SELECT
substring_index( substring_index( '1740,1938,1940', ',', help_topic_id + 1 ), ',',- 1 ) AS sub_
FROM
mysql.help_topic
WHERE
help_topic_id < (
length( '1740,1938,1940' ) - length(
REPLACE( '1740,1938,1940', ',', '' )) + 1
)
) AS b ON f.fiberBox LIKE concat('%',
b.sub_,
'%')

If the column must equal one of the values exactly, IN is enough (it does no wildcard matching):
SELECT * FROM fiberbox WHERE fiberBox IN('1740 ', '1938 ', '1940 ')

Related

SQL find '%' between %s

I need to find (in fact, exclude) any results that contain a '%' sign anywhere in a string field. That would mean something like ... WHERE string LIKE '%%%'. Googling about escaping gave me the following ideas. The first throws a syntax error; the second returns rows, but some of the returned records actually contain '%'.
1st:
SELECT * FROM table
WHERE string NOT LIKE '%!%%' ESCAPE '!'
///tried with different escape characters
2nd:
SELECT * FROM table
WHERE string NOT LIKE '%[%]%'
Trying on GCP BigQuery.
Try:
SELECT *
FROM table
WHERE string NOT LIKE '%!%%' {ESCAPE '!'}
With curly braces as shown in the Microsoft SQL Server docs.
Or also:
WITH indata(s) AS (
SELECT 'not excluded'
UNION ALL SELECT '%excluded'
UNION ALL SELECT 'Ex%cluded'
UNION ALL SELECT 'Excluded%'
)
SELECT * FROM indata WHERE INSTR(s,'%') = 0;
-- out s
-- out --------------
-- out not excluded
find (exclude in fact) any results that contain '%'
Consider below simple approach
select *
from your_table
where not regexp_contains(string , '%')

RTRIM a pattern, not all the characters

I have strings like these:
JAPANNO
CHINANO
BROOKLYNNO
I want to delete the 'NO' from all of the strings. I tried this:
rtrim(string, 'NO')
but for example in the case of BROOKLYNNO, I got this:
BROOKLY.
It deletes all the N-s from the end. How can I delete just the pattern of 'NO'?
I know I can do it with substr, but the TechOnTheNet says there is a way to delete a pattern with RTRIM, and I really want to know the way.
Thank you in advance!
We may consider doing a regex replacement via REGEXP_REPLACE, if you give a context for when NO should be removed and when it should not. For example, if you wanted to remove NO from the ends of your strings only, we could do the following:
UPDATE yourTable
SET col = REGEXP_REPLACE(col, 'no$', '', 1, 0, 'i');
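The difference between trimming a set of characters and removing a suffix can be sketched in Python, whose str.rstrip treats its argument as a character set in the same way the question describes for rtrim. This is only an illustration of the idea, not of any particular database's functions.

```python
import re

names = ["JAPANNO", "CHINANO", "BROOKLYNNO"]

# rstrip removes ANY trailing run of the characters 'N' and 'O',
# which is the behaviour the question ran into with rtrim.
stripped = [s.rstrip("NO") for s in names]

# An anchored regex removes only one trailing "NO" (case-insensitive,
# like the 'i' flag in the REGEXP_REPLACE call above).
cleaned = [re.sub(r"no$", "", s, flags=re.IGNORECASE) for s in names]

print(stripped)
print(cleaned)
```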
You could use TRIM(TRAILING ... FROM):
SELECT col_name,
REPLACE(TRIM(TRAILING '^' FROM REPLACE(col_name, 'NO', '^')), '^', 'NO') AS res
FROM tab;
DBFiddle Demo
Have a look at this, maybe?
declare @string varchar(150) = 'BROOKLYNNO'
select LEN(@string)
select LEFT(@string,(LEN(@string)-2))
You can then update your column with the output from the final select statement, which trims the last two letters from the string.
I suppose it might be worth asking how you're getting the data that you have here, strings appended with "NO"?

What's the equivalent of Excel's `left(find(), -1)` in BigQuery?

I have names in my dataset and they include parentheses. But, I am trying to clean up the names to exclude those parentheses.
Example: ABC Company (Somewhere, WY)
What I want to turn it into is: ABC Company
I'm using standard SQL with google big query.
I've done some research and I know big query has left(), but I do not know the equivalent of find(). My plan was to do something that finds the ( and then gives me everything to the left of -1 characters from the (.
My plan was to do something that finds the ( and then gives me everything to the left of -1 characters from the (.
Good plan! In BigQuery Standard SQL - equivalent of LEFT is SUBSTR(value, position[, length]) and equivalent of FIND is STRPOS(value1, value2)
With this in mind your query can look like (which is exactly as you planned)
#standardSQL
WITH names AS (
SELECT 'ABC Company (Somewhere, WY)' AS name
)
SELECT SUBSTR(name, 1, STRPOS(name, '(') - 1) AS clean_name
FROM names
Usually, string functions are less expensive than regular expression functions, so if your data follows the pattern in your example, you should go with the version above.
In more generic cases, where the pattern to clean is more dynamic, as in Graham's answer, you should go with the solution in Graham's answer.
Just use REGEXP_REPLACE + TRIM. This will work with all variants (just not nested parentheses):
#standardSQL
WITH
names AS (
SELECT
'ABC Company (Somewhere, WY)' AS name
UNION ALL
SELECT
'(Somewhere, WY) ABC Company' AS name
UNION ALL
SELECT
'ABC (Somewhere, WY) Company' AS name)
SELECT
TRIM(REGEXP_REPLACE(name,r'\(.*?\)',''), ' ') AS cleaned
FROM
names
Use REGEXP_EXTRACT:
SELECT
RTRIM(REGEXP_EXTRACT(names, r'([^(]*)')) AS new_name
FROM yourTable
The regex used here will greedily consume and match everything up until hitting an opening parenthesis. I used RTRIM to remove any unwanted whitespace picked up by the regex.
Note that this approach is robust with respect to the edge case of an address record not having any term with parentheses. In this case, the above query would just return the entire original value.
I can't test this solution at the moment, but you can combine SUBSTR and INSTR. Like this:
SELECT CASE WHEN INSTR(name, '(') > 0 THEN SUBSTR( name, 1, INSTR(name, '(') - 1 ) ELSE name END as name FROM table;
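The "find the parenthesis, then take everything to its left" logic can be sketched outside SQL as well. The Python helper below is hypothetical (the name strip_parenthetical is made up for this example) and also handles names without any parenthesis.

```python
def strip_parenthetical(name: str) -> str:
    """Return the part of name before the first '(', without trailing spaces."""
    pos = name.find("(")  # -1 when there is no opening parenthesis
    if pos == -1:
        return name
    return name[:pos].rstrip()


print(strip_parenthetical("ABC Company (Somewhere, WY)"))
print(strip_parenthetical("ABC Company"))
```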

SQL special group by on list of strings ending with *

I would like to perform a "special group by" on strings in SQL, some of which end with "*". I use PostgreSQL.
I cannot formulate this problem clearly, even though I have partially solved it with SELECT, UNION, and nested queries that are not elegant.
For example:
1) INPUT : I have a list of strings :
thestrings
varchar(9)
--------------
1000
1000-0001
1000-0002
2000*
2000-0001
2000-0002
3000*
3000-00*
3000-0001
3000-0002
2) OUTPUT : What I would like my "special group by" to return :
1000
1000-0001
1000-0002
2000*
3000*
Because 2000-0001 and 2000-0002 are included in 2000*,
and because 3000-00*, 3000-0001 and 3000-0002 are included in 3000*
3) SQL query I do :
SELECT every strings ending with *
UNION
SELECT every string where the beginning NOT IN (SELECT every string ending with *) <-- with multiple inelegant left functions and NOT IN subqueries
4) What my query returns :
1000
1000-0001
1000-0002
2000*
3000*
3000-00* <-- the problem
The problem is : 3000-00* stays in my result.
So my question is :
How can I generalize my solution to remove every string whose beginning matches another string in the list that ends with *?
I thought of regular expressions, but how can I pass a list from a SELECT into a regex?
Thanks for help.
Select only strings for which no master string exists in the table:
select str
from mytable
where not exists
(
select *
from mytable master
where master.str like '%*'
and master.str <> mytable.str
and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
);
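As a quick check of this query against the sample data from the question, here is a sketch that runs it in an in-memory SQLite database from Python. SQLite is used as a stand-in for PostgreSQL, which is an assumption; rtrim with a character argument, LIKE and || behave the same way for this data. An ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (str TEXT)")
strings = [
    "1000", "1000-0001", "1000-0002",
    "2000*", "2000-0001", "2000-0002",
    "3000*", "3000-00*", "3000-0001", "3000-0002",
]
conn.executemany("INSERT INTO mytable VALUES (?)", [(s,) for s in strings])

# Keep a string only if no OTHER string ending in '*' is a prefix of it
# (after the trailing '*' is trimmed from both sides).
query = """
    SELECT str
    FROM mytable
    WHERE NOT EXISTS (
        SELECT 1
        FROM mytable master
        WHERE master.str LIKE '%*'
          AND master.str <> mytable.str
          AND rtrim(mytable.str, '*') LIKE rtrim(master.str, '*') || '%'
    )
    ORDER BY str
"""
kept = [row[0] for row in conn.execute(query)]
print(kept)
```

Here 3000-00* is dropped because 3000* is a master string for it, which is the case the original UNION approach missed.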
Assuming that only one general pattern can match any given string, the following should do what you want:
select coalesce(tpat.thestring, t.thestring) as thestring
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
group by coalesce(tpat.thestring, t.thestring);
However, that is not your case, because 3000-0001 matches both 3000* and 3000-00*. You can adjust for this with DISTINCT ON:
select distinct on (t.thestring) coalesce(tpat.thestring, t.thestring)
from t left join
t tpat
on t.thestring like replace(tpat.thestring, '*', '%') and
t.thestring <> tpat.thestring
order by t.thestring, length(tpat.thestring)

Select statement with column contains '%'

I want to select names from a table where the 'name' column contains '%' anywhere in the value. For example, I want to retrieve the name 'Approval for 20 % discount for parts'.
SELECT NAME FROM TABLE WHERE NAME ... ?
You can use like with escape. The default is a backslash in some databases (but not in Oracle), so:
select name
from table
where name like '%\%%' ESCAPE '\'
This is standard, and works in most databases. The Oracle documentation is here.
Of course, you could also use instr():
where instr(name, '%') > 0
One way to do it is using replace with an empty string and checking to see if the difference in length of the original string and modified string is > 0.
select name
from table
where length(name) - length(replace(name,'%','')) > 0
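The three approaches so far (LIKE with ESCAPE, instr, and the length difference) can be compared side by side. The sketch below runs them in an in-memory SQLite database from Python; SQLite stands in for Oracle here, which is an assumption, and the table name t and the second sample row are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [("Approval for 20 % discount for parts",), ("No discount",)],
)

# Raw strings keep the backslash intact for the SQL engine.
queries = {
    "like_escape": r"SELECT name FROM t WHERE name LIKE '%\%%' ESCAPE '\'",
    "instr": "SELECT name FROM t WHERE instr(name, '%') > 0",
    "length_diff": (
        "SELECT name FROM t "
        "WHERE length(name) - length(replace(name, '%', '')) > 0"
    ),
}

results = {
    label: [row[0] for row in conn.execute(sql)]
    for label, sql in queries.items()
}
print(results)
```

All three return only the row that contains a literal % sign.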
Make life easy on yourselves and just use REGEXP_LIKE( )!
SQL> with tbl(name) as (
select 'ABC' from dual
union
select 'E%FS' from dual
)
select name
from tbl
where regexp_like(name, '%');
NAME
----
E%FS
SQL>
I read the documentation mentioned by Gordon. The relevant sentence is:
An underscore (_) in the pattern matches exactly one character (as opposed to one byte in a multibyte character set) in the value
Here was my test:
select c
from (
select 'a%be' c
from dual) d
where c like '_%'
The value a%be was returned.
While the suggestions of using instr() or length in the other two answers will lead to the correct answer, they will do so slowly. Filtering on function results simply takes longer than filtering on fields.