How to check for multiple patterns in Google BigQuery SQL? (LIKE + IN) - google-bigquery

I have to search a BigQuery table for a set of names that I collect periodically in another dataset. That set has grown to almost 60k names, so I can no longer write
SELECT * FROM base.table WHERE name LIKE '%name1%' OR name LIKE '%name2%' ...
I tried generating the query in a Python script with:
"SELECT * FROM base.table WHERE " + " OR ".join(f"ulv4.full_name LIKE '%{name}%'" for name in names)
But with this many names the query exceeds the character limit. I looked at solutions like this and at other answers to the same question, but none seems to work in BigQuery Standard SQL. Any help is highly appreciated.

Keep the names in another table and check against it with EXISTS:
SELECT *
FROM base.table t1
WHERE EXISTS (SELECT 1 FROM other.table t2
WHERE t1.name LIKE CONCAT('%', t2.name, '%'));
Then a record in base.table matches only if its name contains at least one name from the other table as a substring.
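The EXISTS approach above can be tried out locally. This is only a minimal sketch: an in-memory SQLite database stands in for BigQuery, the table names base_table and search_names and the sample rows are made up, and SQLite's `||` replaces CONCAT. Note also that SQLite's LIKE is case-insensitive for ASCII by default, while BigQuery's is case-sensitive.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery; tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE base_table (name TEXT);
    CREATE TABLE search_names (name TEXT);
    INSERT INTO base_table VALUES ('John Smith Jr'), ('Alice Jones'), ('Bob Stone');
    INSERT INTO search_names VALUES ('Smith'), ('Stone'), ('Smi');
""")

# EXISTS returns each base row at most once, even when several
# search names ('Smith' and 'Smi') match the same row.
rows = conn.execute("""
    SELECT t1.name
    FROM base_table t1
    WHERE EXISTS (
        SELECT 1 FROM search_names t2
        WHERE t1.name LIKE '%' || t2.name || '%'
    )
    ORDER BY t1.name
""").fetchall()

matched = [r[0] for r in rows]
print(matched)
```

Because the names live in a table, the query text stays the same size no matter how many names there are, which is what sidesteps the character limit.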

Related

Excluding Records Based on Another Column's Value

I'm working in Redshift and have two columns from an Adobe Data feed:
post_evar22 and post_page_url.
Each post_evar22 has multiple post_page_url values as they are all the pages that the ID visited.
(It's basically a visitor ID and all the pages they visited)
I want to write a query where I can list distinct post_evar22 values that have never been associated with a post_page_url that contains '%thank%' or '%confirm%'.
In the dataset below, ID1 would be completely omitted from the query results because it was associated with a thank-you page and a confirmation page.
This is a case for NOT EXISTS:
select distinct post_evar22
from table t1
where not exists (
select 1
from table t2
where t2.post_evar22 = t1.post_evar22
and (t2.post_page_url like '%thank%' or t2.post_page_url like '%confirm%')
)
Or use MINUS, if your DBMS supports it:
select post_evar22 from table
minus
select post_evar22 from table where (post_page_url like '%thank%' or post_page_url like '%confirm%')
Seems fairly straightforward. Am I missing something?
SELECT DISTINCT post_evar22
FROM table
WHERE post_page_url NOT LIKE '%thank%'
AND post_page_url NOT LIKE '%confirm%'
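To see how the three queries above differ, here is a small sketch that runs them side by side. It assumes an in-memory SQLite database in place of Redshift, a hypothetical table named visits, and invented sample rows; SQLite spells MINUS as EXCEPT.

```python
import sqlite3

# Hypothetical sample data: ID1 visited a thank-you and a confirmation page.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (post_evar22 TEXT, post_page_url TEXT);
    INSERT INTO visits VALUES
        ('ID1', '/home'), ('ID1', '/thank-you'), ('ID1', '/confirm'),
        ('ID2', '/home'), ('ID2', '/products');
""")

def ids(sql):
    """Run a query and return the first column as a sorted list."""
    return sorted(r[0] for r in conn.execute(sql))

# Row-level filter: keeps an ID if ANY of its rows is a non-matching page.
row_filter = ids("""
    SELECT DISTINCT post_evar22 FROM visits
    WHERE post_page_url NOT LIKE '%thank%'
      AND post_page_url NOT LIKE '%confirm%'
""")

# NOT EXISTS: keeps an ID only if NONE of its rows is a matching page.
not_exists = ids("""
    SELECT DISTINCT post_evar22 FROM visits t1
    WHERE NOT EXISTS (
        SELECT 1 FROM visits t2
        WHERE t2.post_evar22 = t1.post_evar22
          AND (t2.post_page_url LIKE '%thank%' OR t2.post_page_url LIKE '%confirm%')
    )
""")

# Set difference: all IDs minus the IDs that ever hit a matching page.
except_ids = ids("""
    SELECT post_evar22 FROM visits
    EXCEPT
    SELECT post_evar22 FROM visits
    WHERE post_page_url LIKE '%thank%' OR post_page_url LIKE '%confirm%'
""")

print(row_filter, not_exists, except_ids)
```

On this sample the row-level NOT LIKE filter still returns ID1, because its '/home' row passes the filter, while NOT EXISTS and the set difference both drop it.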

Select query to check whether each part of a delimited string exists in another table

I don't want to use any function or procedure.
I want a simple SELECT query that checks whether each part of a string exists in another table.
For example, I have a table dummy with a name column:
Id  name
1   as;as;as
2   asd;rt
and a child table:
child_id  name
23        as
24        asd
25        rt
Is there any way I can do that?
I have tried:
select substr(first_name,1,instr(first_name,';')-1) from dummy;
select substr(first_name,instr(first_name,';')+1,instr(first_name,';')-1)
from dummy;
These return only the first and second parts. How do I get the remaining parts?
If I understand correctly, you need to join these tables wherever the child's name is one of the parts of dummy.name:
select t1.*,
t2.child_id,
t2.name as t2name
from t1
left join t2 on (';'||t1.name||';' like '%;'||t2.name||';%')
I would need more information about this question: we do not know whether you have to detect more than one of the possible strings in a single field.
You could use three LIKE clauses for the three possible positions (first, last, or middle part):
LIKE column_name || ';%'
LIKE '%;' || column_name
LIKE '%;' || column_name || ';%'
But in the long run it would probably pay off to learn how to build regular expressions. Here is a web page that helped me a lot: txt2re.com
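The delimiter-wrapping join from the first answer can be checked against the sample data in the question. This is a sketch only, using an in-memory SQLite database instead of Oracle and the table names dummy and child from the question; wrapping both sides in ';' lets one LIKE cover the first, middle, last, and only-part cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dummy (id INTEGER, name TEXT);
    CREATE TABLE child (child_id INTEGER, name TEXT);
    INSERT INTO dummy VALUES (1, 'as;as;as'), (2, 'asd;rt');
    INSERT INTO child VALUES (23, 'as'), (24, 'asd'), (25, 'rt');
""")

# ';as;as;as;' LIKE '%;as;%' matches, but ';asd;rt;' LIKE '%;as;%' does not,
# so 'as' is not mistaken for a prefix of 'asd'.
pairs = conn.execute("""
    SELECT d.id, c.child_id
    FROM dummy d
    JOIN child c
      ON ';' || d.name || ';' LIKE '%;' || c.name || ';%'
    ORDER BY d.id, c.child_id
""").fetchall()

print(pairs)
```

One caveat: if a child name itself contains % or _, LIKE treats those characters as wildcards.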

How to search multiple columns in MySQL?

I'm trying to make a search feature that will search multiple columns to find a keyword based match. This query:
SELECT title FROM pages LIKE %$query%;
works only for searching one column, I noticed separating column names with commas results in an error. So is it possible to search multiple columns in mysql?
If it is just for searching, then you may be able to use CONCAT_WS.
This would allow wild card searching.
There may be performance issues depending on the size of the table.
SELECT *
FROM pages
WHERE CONCAT_WS('', column1, column2, column3) LIKE '%keyword%'
You can use the AND or OR operators, depending on what you want the search to return.
SELECT title FROM pages WHERE my_col LIKE '%$param1%' AND another_col LIKE '%$param2%';
Both clauses have to match for a record to be returned. Alternatively:
SELECT title FROM pages WHERE my_col LIKE '%$param1%' OR another_col LIKE '%$param2%';
If either clause matches, the record will be returned.
For more about what you can do with MySQL SELECT queries, try the documentation.
If your table is MyISAM:
SELECT *
FROM pages
WHERE MATCH(title, content) AGAINST ('keyword' IN BOOLEAN MODE)
This will be much faster if you create a FULLTEXT index on your columns, but it will work even without the index:
CREATE FULLTEXT INDEX fx_pages_title_content ON pages (title, content)
1)
select *
from employee em
where CONCAT(em.firstname, ' ', em.lastname) like '%parth pa%';
2)
select *
from employee em
where CONCAT_ws('-', em.firstname, em.lastname) like '%parth-pa%';
The first is useful when the data looks like 'firstname lastname'.
e.g.
parth patel
parth p
patel parth
The second is useful when the data looks like 'firstname-lastname'. You can also use other special characters as the separator.
e.g.
parth-patel
parth_p
patel#parth
Here is a query you can use to search several columns of a table for a search term:
SELECT * FROM tbl_customer
WHERE CustomerName LIKE '%".$search."%'
OR Address LIKE '%".$search."%'
OR City LIKE '%".$search."%'
OR PostalCode LIKE '%".$search."%'
OR Country LIKE '%".$search."%'
This query lets you search multiple columns easily.
SELECT * FROM persons WHERE (`LastName` LIKE 'r%') OR (`FirstName` LIKE 'a%');
Please try with above query.
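Several of the queries above splice the search term straight into the SQL string, which breaks on quotes and invites SQL injection. A hedged sketch of the same multi-column OR search with bound parameters follows; it uses Python's sqlite3 rather than MySQL, the rows are invented, the helper search_customers is hypothetical, and '!' is an arbitrary choice of LIKE escape character.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_customer (CustomerName TEXT, Address TEXT, City TEXT);
    INSERT INTO tbl_customer VALUES
        ('Ana Berlin', '1 Main St', 'Madrid'),
        ('Tom Hardy', '2 Oak Ave', 'Berlin'),
        ('Liz Wood', '3 Elm Rd', 'Paris');
""")

def search_customers(term):
    """Return customer names where any searched column contains term."""
    # Escape LIKE wildcards so user input is matched literally.
    escaped = term.replace("!", "!!").replace("%", "!%").replace("_", "!_")
    pattern = f"%{escaped}%"
    sql = """
        SELECT CustomerName FROM tbl_customer
        WHERE CustomerName LIKE ? ESCAPE '!'
           OR Address LIKE ? ESCAPE '!'
           OR City LIKE ? ESCAPE '!'
        ORDER BY CustomerName
    """
    # The driver binds the values; the SQL text never contains user input.
    return [r[0] for r in conn.execute(sql, (pattern, pattern, pattern))]

print(search_customers("berlin"))
```

The same idea applies in PHP with prepared statements: the pattern, wildcards included, is passed as a bound value instead of being concatenated into the query.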

Is there any way to combine IN with LIKE in an SQL statement?

I am trying to find a way, if possible, to use IN and LIKE together. What I want to accomplish is putting a subquery that pulls up a list of data into an IN statement. The problem is the list of data contains wildcards. Is there any way to do this?
Just something I was curious about.
Example of data in the 2 tables
Parent table
ID Office_Code Employee_Name
1 GG234 Tom
2 GG654 Bill
3 PQ123 Chris
Second table
ID Code_Wildcard
1 GG%
2 PQ%
Clarifying note (via third-party)
Since I'm seeing several responses which don't seem to address what Ziltoid asks, I thought I'd try clarifying what I think he means.
In SQL, "WHERE col IN (1,2,3)" is roughly the equivalent of "WHERE col = 1 OR col = 2 OR col = 3".
He's looking for something which I'll pseudo-code as
WHERE col IN_LIKE ('A%', 'TH%E', '%C')
which would be roughly the equivalent of
WHERE col LIKE 'A%' OR col LIKE 'TH%E' OR col LIKE '%C'
The Regex answers seem to come closest; the rest seem way off the mark.
I'm not sure which database you're using, but with Oracle you could accomplish something equivalent by aliasing your subquery in the FROM clause rather than using it in an IN clause. Using your example:
select p.*
from
(select code_wildcard
from second
where id = 1) s
join parent p
on p.office_code like s.code_wildcard
In MySQL, use REGEXP:
WHERE field1 REGEXP('(value1)|(value2)|(value3)')
Same in Oracle:
WHERE REGEXP_LIKE(field1, '(value1)|(value2)|(value3)')
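The regex route amounts to turning the list of LIKE patterns into one alternation. As a rough illustration outside the database, the sketch below does that translation in Python; like_to_regex and in_like are hypothetical helper names, and LIKE escape characters are not handled. Keep in mind that LIKE must match the whole value, whereas REGEXP matches anywhere in the value unless the pattern is anchored.

```python
import re

def like_to_regex(pattern):
    """Translate one SQL LIKE pattern: % -> .*, _ -> ., the rest literal."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

def in_like(value, patterns):
    """True if value matches any of the LIKE patterns (whole-string match)."""
    combined = "|".join(f"(?:{like_to_regex(p)})" for p in patterns)
    return re.fullmatch(combined, value, flags=re.DOTALL) is not None

patterns = ["A%", "TH%E", "%C"]
print([w for w in ["APPLE", "THE", "MUSIC", "BANANA"] if in_like(w, patterns)])
```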
Do you mean something like:
select * FROM table where column IN (
SELECT column from table where column like '%%'
)
Really this should be written like:
SELECT * FROM table where column like '%%'
A subquery is really useful when you have to pull records based on logic that you don't want in the main query, something like:
SELECT * FROM TableA WHERE TableA_IdColumn IN
(
SELECT TableA_IdColumn FROM TableB WHERE TableA_IDColumn like '%%'
)
Update to the question:
You can't combine an IN statement with a LIKE statement.
You'll have to write three separate LIKE conditions to search on the various wildcards.
You could use a LIKE statement to obtain a list of IDs and then use that list in the IN statement.
But you can't directly combine IN and LIKE.
Perhaps something like this?
SELECT DISTINCT
my_column
FROM
My_Table T
INNER JOIN My_List_Of_Value V ON
T.my_column LIKE '%' + V.search_value + '%'
In this example I've used a table with the values for simplicity, but you could easily change that to a subquery. If you have a large list (like tens of thousands) then performance might be rough.
select *
from parent
where exists( select *
from second
where office_code like trim( code_wildcard ) );
Trim code_wildcard just in case it has trailing blanks.
You could do the Like part in a subquery perhaps?
Select * From TableA Where X in (Select A from TableB where B Like '%123%')
T-SQL has the CONTAINS predicate for tables with full-text search enabled:
CONTAINS(Description, '"sea*" OR "bread*"')
If I'm reading the question correctly, we want all Parent rows that have an Office_code that matches any Code_Wildcard in the "Second" table.
In Oracle, at least, this query achieves that:
SELECT *
FROM parent, second
WHERE office_code LIKE code_wildcard;
Am I missing something?
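One practical difference between the plain join and the EXISTS form shows up when wildcards overlap. The sketch below uses the Parent and Second tables from the question in an in-memory SQLite database; the extra 'GG2%' wildcard and the ZZ999 row are assumptions added only to make the difference visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER, office_code TEXT, employee_name TEXT);
    CREATE TABLE second (id INTEGER, code_wildcard TEXT);
    INSERT INTO parent VALUES
        (1, 'GG234', 'Tom'), (2, 'GG654', 'Bill'),
        (3, 'PQ123', 'Chris'), (4, 'ZZ999', 'Dana');
    -- 'GG2%' is an added, overlapping wildcard (not in the question).
    INSERT INTO second VALUES (1, 'GG%'), (2, 'PQ%'), (3, 'GG2%');
""")

# Join: one output row per (parent, matching wildcard) pair.
join_names = sorted(r[0] for r in conn.execute("""
    SELECT p.employee_name
    FROM parent p
    JOIN second s ON p.office_code LIKE s.code_wildcard
"""))

# EXISTS: one output row per parent that has at least one match.
exists_names = sorted(r[0] for r in conn.execute("""
    SELECT p.employee_name
    FROM parent p
    WHERE EXISTS (
        SELECT 1 FROM second s
        WHERE p.office_code LIKE s.code_wildcard
    )
"""))

print(join_names, exists_names)
```

Here Tom appears twice in the join result because GG234 matches both 'GG%' and 'GG2%', so the join form needs DISTINCT whenever the wildcards can overlap.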