Where clause to exclude specific email domains


I have a column holding lists of emails, and I want to write a WHERE clause that excludes rows containing only addresses in the domains %#icloud.com or %#mac.com.
For example, the email lists look like this:
abc#gmail.com; 123#hotmail.com
123#outlook.com; abc#icloud.com
123#icloud.com;
123#icloud.com; abc#mac.com
The desired output should look like this:
abc#gmail.com; 123#hotmail.com
123#outlook.com; abc#icloud.com (this row should be returned because it also contains '#outlook.com' which isn't on my exclude list)

Given that negative lookaheads are not supported, a way to achieve that is to remove the unwanted matches, and then look for an "any email left" match:
SELECT column1
,REGEXP_REPLACE(column1, '#((icloud)|(mac))\\.com', '') as cleaned
,REGEXP_LIKE(cleaned, '.*#.*\\.com.*') as logic
FROM VALUES
('abc#gmail.com; 123#hotmail.com'),
('123#outlook.com; abc#icloud.com'),
('123#icloud.com;'),
('123#icloud.com; abc#mac.com');
gives:
COLUMN1                         | CLEANED                        | LOGIC
--------------------------------|--------------------------------|------
abc#gmail.com; 123#hotmail.com  | abc#gmail.com; 123#hotmail.com | TRUE
123#outlook.com; abc#icloud.com | 123#outlook.com; abc           | TRUE
123#icloud.com;                 | 123;                           | FALSE
123#icloud.com; abc#mac.com     | 123; abc                       | FALSE
which can be merged into one line:
,REGEXP_LIKE(REGEXP_REPLACE(column1, '#((icloud)|(mac))\\.com'), '.*#.*\\.com.*') as logic
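The remove-then-check idea is not tied to Snowflake. As a minimal sketch, here is the same logic in Python using the standard `re` module; the function name `has_other_domain` and the `rows` list are illustrative names, not part of the original answer:

```python
import re

# Same idea as the REGEXP_REPLACE above: strip the excluded domains first.
EXCLUDED = re.compile(r"#(icloud|mac)\.com")


def has_other_domain(emails: str) -> bool:
    """Return True if some address is left after removing excluded domains."""
    cleaned = EXCLUDED.sub("", emails)
    # Mirrors the '.*#.*\.com.*' check: a '#' that is later followed by '.com'.
    return re.search(r"#.*\.com", cleaned) is not None


rows = [
    "abc#gmail.com; 123#hotmail.com",
    "123#outlook.com; abc#icloud.com",
    "123#icloud.com;",
    "123#icloud.com; abc#mac.com",
]

# Keep only the rows that still contain a non-excluded address.
kept = [row for row in rows if has_other_domain(row)]
```

With the sample rows from the question, `kept` holds the first two rows only.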

If you prefer a more vanilla approach than Simeon's solution:
where replace(replace(col,'#icloud.com',''), '#mac.com','') like '%#%'
In Snowflake, the replacement string is optional, which shortens that to
where replace(replace(col,'#icloud.com'), '#mac.com') like '%#%'

This is based on the string-split approach from SQL Server, using the split_to_table function. You will probably have to tweak the syntax a little:
select *
from t
where exists (
    select *
    from split_to_table(t.emails, ';') as sv
    where trim(sv.value) <> '' -- skip the empty piece left by a trailing ';'
    and sv.value not like '%#icloud.com'
    and sv.value not like '%#mac.com'
)
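To make the split-and-test logic concrete, here is a rough Python sketch of the same idea. The names `keep_row` and `EXCLUDED_SUFFIXES` are made up for illustration. One detail matters: a trailing ';' produces an empty piece when splitting, so the sketch skips empty pieces; otherwise a row such as '123#icloud.com;' would be kept by mistake.

```python
# Domains to exclude, written as suffixes of a single address.
EXCLUDED_SUFFIXES = ("#icloud.com", "#mac.com")


def keep_row(emails: str) -> bool:
    """Return True if at least one address is outside the excluded domains."""
    for piece in emails.split(";"):
        address = piece.strip()
        if not address:
            # A trailing ';' leaves an empty piece; it is not an address.
            continue
        if not address.endswith(EXCLUDED_SUFFIXES):
            return True
    return False


rows = [
    "abc#gmail.com; 123#hotmail.com",
    "123#outlook.com; abc#icloud.com",
    "123#icloud.com;",
    "123#icloud.com; abc#mac.com",
]
kept = [row for row in rows if keep_row(row)]
```

Here `kept` again contains only the first two sample rows.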

Related

BigQuery - Using regexp with LIKE operator (?)

I'd like to extract product ids from urls. I've almost fine-tuned a query to do it, but there is still an issue I cannot solve.
The url usually looks like this:
/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/
or
/harry-potter-es-a-tuz-serlege-2019-m19247107/
As you can see there are two types of ids:
in general, ids start with '-p'
ids of some special products start with '-m'
I created this case when statement:
CASE
WHEN MAX(hits.page.pagePath) LIKE '%-p%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-p[0-9]+/'), '\\-|p|/', ''))
WHEN MAX(hits.page.pagePath) LIKE '%-m%'
THEN MAX(REGEXP_REPLACE(REGEXP_EXTRACT(
hits.page.pagePath, '-m[0-9]+/'), '\\-|m|/', ''))
ELSE NULL
END AS productId
It looks a little complicated at first, but I really needed both regexp_replace and regexp_extract because the '-p' or '-m' characters don't appear only before the id; they can occur multiple times in a url.
The problem with my code is that there are some special cases where the url looks like this:
/elveszett-profeciak-2019-m17855487/
As you can see, the id starts with '-m' but the url also contains '-p'. In this case the query returns an empty value.
I think it could be solved by modifying the LIKE operator in the WHEN part of the CASE statement: LIKE '%-p%' or LIKE '%-m%'
It would be great to have a regular expression after or instead of the LIKE operator, something similar to the '-p[0-9]+/' pattern I used in the regexp_extract function.
So what I would need is to define in the WHEN part of the statement that the '-p' or '-m' text is followed by numbers in the url.
I'm not sure whether that's possible in BQ.

So what I would need is to define in the when part of the statement that if the '-p' or '-m' text is followed by numbers in the urls
I think you want '-p' and '-m' followed by digits. If so, I think this does what you want:
select regexp_extract(url, '-[pm][0-9]+')
from (select '/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/' as url union all
select '/elveszett-profeciak-2019-m17855487/' union all
select '/harry-potter-es-a-tuz-serlege-2019-m19247107/'
) x
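The pattern above returns the match including its '-p' or '-m' prefix. If only the digits are wanted, a capture group around `[0-9]+` is one option. The sketch below tries that variation in Python's `re` module on the three sample urls; the capture group and the helper name `product_id` are my additions, not part of the answer:

```python
import re
from typing import Optional

# '-p' or '-m' followed by digits; the group captures the digits only.
PRODUCT_ID = re.compile(r"-[pm]([0-9]+)")


def product_id(url: str) -> Optional[str]:
    """Return the numeric id after '-p' or '-m', or None if there is none."""
    match = PRODUCT_ID.search(url)
    return match.group(1) if match else None


urls = [
    "/xp-pen/toll-spe43-deco-pro-small-medium-spe43-tobuy-p665088831/",
    "/elveszett-profeciak-2019-m17855487/",
    "/harry-potter-es-a-tuz-serlege-2019-m19247107/",
]
ids = [product_id(url) for url in urls]
```

Fragments such as '-pen', '-pro', '-medium' or '-profeciak' are skipped because the letter is not followed by a digit.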

How to use LIKE in WHERE clause to get first 5 characters of variable?

I have a varchar variable that always holds 10 digits. How can I use the LIKE operator to match on only the first 5 digits of the variable?
My query:
variable IN VARCHAR2
SELECT * FROM items WHERE name LIKE SUBSTRING(variable, 1, 5)

... WHERE name LIKE '12345%'
will match any string that starts with 12345. The '%' is a wildcard. You can also use the wildcard to match anywhere in the string: ... WHERE name LIKE '%12345%' will match a string with 12345 anywhere within it.
Edit for completeness: WHERE name LIKE '%12345' will match any string that ends with those five characters.

Try this:
SELECT * FROM items WHERE name LIKE (SUBSTRING(variable, 1, 5) + '%')

I guess you can use LEFT() like this:
SELECT * FROM items WHERE LEFT(name,5)=LEFT(variable,5);
Or, if you want to use LIKE with a wildcard, you can do this:
SELECT * FROM items WHERE name LIKE CONCAT(LEFT(variable,5),'%')
A few more examples are in the demo fiddle.
Edit: The above solution is for MySQL/MariaDB, because the question was originally tagged MySQL; I also failed to notice that the VARCHAR2 datatype in the OP's description points to Oracle. So here is a suggestion for that RDBMS as well.
My first suggestion used LEFT(), but Oracle doesn't have that function, so use SUBSTR() instead:
SELECT * FROM items WHERE SUBSTR(name,1,5)=SUBSTR(variable,1,5);
or using concatenation operator
SELECT * FROM items WHERE name LIKE SUBSTR(variable,1,5)||'%'
Demo fiddle
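To try the prefix-plus-wildcard pattern without an Oracle or MySQL instance, the sketch below runs the equivalent query against an in-memory SQLite database from Python. SQLite happens to support both `substr()` and the `||` operator, so the query text stays close to the Oracle version. The table contents and the helper name `names_with_prefix` are invented for the demonstration:

```python
import sqlite3


def names_with_prefix(conn: sqlite3.Connection, variable: str) -> list:
    """Return names that start with the first 5 characters of `variable`."""
    sql = (
        "SELECT name FROM items "
        "WHERE name LIKE substr(?, 1, 5) || '%' "
        "ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql, (variable,))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("1234567890",), ("1234500000",), ("9912345000",)],
)

# Only names beginning with '12345' should come back.
matches = names_with_prefix(conn, "1234599999")
```

The row '9912345000' is not returned, because the wildcard is only appended after the prefix.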

asterisk or percentage sign in impala

The percentage sign (%) is used as the "everything" wildcard instead of an asterisk. It will match zero or more characters.
As @onedaywhen said, the two have the same function.
But in Impala, I find that each one only works in a specific situation.
show tables like ' '
Suppose my database opd contains three tables:
opd.haha
opd.haha1
opd.abc
input:
show tables like 'haha*'
output:
opd.haha
opd.haha1
input:
show tables like 'haha%'
output:
Done. 0 results.
select ... like
select 'haha' like 'ha%' -- true
select 'haha' like 'ha*' -- false
select 'haha' like 'ha__' -- true
select 'haha' like 'haha%' -- true
My question is this. To summarise:
the asterisk only works in the show tables statement, and
the percentage sign only works in select ... like expressions.
Is this summary right?

The standard wildcards for like are:
_ which represents a single character
% which represents zero or more characters
like does not implement regular expressions.
If you want regular expressions, then use regexp_like().
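The LIKE results from the question can be reproduced outside Impala. As a small sketch, the snippet below evaluates the same comparisons with SQLite's LIKE operator from Python; SQLite is a stand-in here, on the assumption that its handling of `%`, `_` and `*` in LIKE matches the standard behaviour described above. The helper name `sql_like` is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def sql_like(value: str, pattern: str) -> bool:
    """Evaluate `value LIKE pattern` in SQLite and return it as a bool."""
    row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(row[0])


results = {
    "ha%": sql_like("haha", "ha%"),      # % matches zero or more characters
    "ha*": sql_like("haha", "ha*"),      # * is an ordinary character in LIKE
    "ha__": sql_like("haha", "ha__"),    # each _ matches exactly one character
    "haha%": sql_like("haha", "haha%"),  # % may match nothing at all
}
```

The only false result is the `ha*` pattern, since LIKE treats the asterisk literally.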

Oracle LIKE Not working

I'm trying to get to grips with Oracle, coming from a SQL environment.
Does anyone know why this query returns 0?
SELECT COUNT( * ) FROM MORGS.LOGS l
WHERE ( l.LOCATION = 'X:\Import\XXX006' ) AND
( l.DIRECTION = 'IN' ) AND
( 'XXX006-Test.txt' LIKE '%XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$%' ) -- It fails on this condition
Please note that 'XXX006-Test.txt' on the left-hand side of LIKE is the value of the column in the table. I've just hard-coded it here for the demo.
Thanks in advance.

Actually, LIKE is working. I'm afraid it's your logic that's faulty. LIKE tests whether the whole of the first operand matches the pattern in the second operand, where wildcards stand in for the characters you don't care about.
So this is TRUE ...
where 'ABC' like 'ABC%'
... and this is FALSE ...
where 'ABC' like 'ABCDEF'
Looking at your actual test:
( 'XXX006-Test.txt' LIKE '%XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$%' )
we notice that the pattern requires the literal text XXX006.D$Date,YYYYMMDD$.T$Date,HHNNSS$ to appear somewhere in XXX006-Test.txt, which it does not, so LIKE quite rightly returns FALSE.
" Do you know how I can split the RHS on a '.' and grab only the first index of the split results which is 'XXX006'?"
If the required match is always six characters long the simplest thing is
substr('XXX006-Test.txt', 1, 6)
If the length of the leading part varies, you can use regular expressions. To extract everything before the dot:
regexp_replace ( 'XXX006-Test.txt', '(.+)\.txt$','\1' )
Although given the values in the two strings you might want to match on the dash instead ...
regexp_replace ( 'XXX006-Test.txt', '([a-z0-9]+)\-(.*)','\1' )
Depends how stable the pattern is.
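The direction of the LIKE comparison is easy to check by hand. The sketch below uses an in-memory SQLite database from Python as a stand-in for Oracle. Because SQLite has no built-in regexp_replace, the prefix extraction uses `substr()` with `instr()` instead of the regular expressions above, which is a swapped-in technique and not the answer's own. The helper names are illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def sql_like(value: str, pattern: str) -> bool:
    """Evaluate `value LIKE pattern`: the pattern is the right-hand operand."""
    row = conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()
    return bool(row[0])


def prefix_before_dash(filename: str) -> str:
    """Return the part of `filename` before the first '-'."""
    # instr() gives the 1-based position of '-'; keep everything before it.
    row = conn.execute(
        "SELECT substr(?, 1, instr(?, '-') - 1)", (filename, filename)
    ).fetchone()
    return row[0]


matches_prefix = sql_like("ABC", "ABC%")        # value fits the pattern
matches_longer = sql_like("ABC", "ABCDEF")      # value is shorter than the pattern
prefix = prefix_before_dash("XXX006-Test.txt")  # leading part of the file name
```

The helper `prefix_before_dash` assumes the file name contains a dash, as in the example from the question.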

SQL Find names that contain a letter (without using Like)

I need to write a select statement that returns all last names from a column that contains the letter A. I can't use LIKE. I am trying to do so with SUBSTR.

I don't think substr is the way to go. instr, on the other hand, may do the trick:
SELECT last_name
FROM mytable
WHERE INSTR(last_name, 'A') > 0
EDIT:
As David Bachmann Jeppesen mentioned, Oracle is case sensitive, so if you want to find last names containing any case of "A", you could do something like this:
SELECT last_name
FROM mytable
WHERE INSTR(UPPER(last_name), 'A') > 0
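For a quick check of the difference between the two queries, here is a sketch that runs both against an in-memory SQLite database from Python. SQLite's `instr()` is case sensitive as well, so it serves as a rough stand-in for Oracle here. The sample names and the helper `names_containing_a` are invented for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (last_name TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("Adams",), ("baker",), ("Smith",), ("O'Neil",)],
)


def names_containing_a(ignore_case: bool) -> list:
    """Return last names containing 'A', optionally ignoring case."""
    column = "UPPER(last_name)" if ignore_case else "last_name"
    sql = (
        "SELECT last_name FROM mytable "
        f"WHERE INSTR({column}, 'A') > 0 ORDER BY last_name"
    )
    return [row[0] for row in conn.execute(sql)]


exact_case = names_containing_a(ignore_case=False)
any_case = names_containing_a(ignore_case=True)
```

Only 'Adams' contains an uppercase 'A', while the UPPER() variant also finds 'baker'.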
