How to implement "like" in BigQuery? - google-bigquery

I am trying to run a simple query making a restriction of like % in BigQuery, but LIKE is not in their syntax, so how can it be implemented?

You can use the REGEXP_MATCH function (see the query reference page):
REGEXP_MATCH('str', 'reg_exp')
Instead of using the % syntax used by LIKE, you should use regular expressions (detailed syntax definition here)
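To make the translation from LIKE wildcards to a regular expression concrete, here is a minimal Python sketch. The helper name like_to_regex is hypothetical, it ignores the LIKE ESCAPE clause, and it uses Python's re module as a stand-in for BigQuery's regex engine, so treat it as an illustration rather than anything BigQuery-specific.

```python
import re

def like_to_regex(like_pattern: str) -> str:
    """Translate a SQL LIKE pattern into an anchored regular expression.

    Hypothetical helper for illustration; the ESCAPE clause is not handled.
    """
    parts = []
    for ch in like_pattern:
        if ch == "%":
            parts.append(".*")           # % matches any run of characters
        elif ch == "_":
            parts.append(".")            # _ matches exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is a literal
    # Anchor both ends because LIKE always matches the whole string.
    return "^" + "".join(parts) + "$"

pattern = like_to_regex("sa%")
names = ["sam", "lisa", "sa", "s.a"]
matches = [n for n in names if re.search(pattern, n)]
print(pattern, matches)
```

The resulting pattern string is what you would pass as the second argument to a regex function in place of the LIKE pattern.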

LIKE is officially supported in BigQuery Standard SQL -
https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#comparison_operators
And I think it also works in Legacy SQL!

REGEXP_MATCH returns true if str matches the regular expression. For string matching without regular expressions, use CONTAINS instead of REGEXP_MATCH.
https://developers.google.com/bigquery/docs/query-reference#stringfunctions

REGEXP_MATCH is great if you know how to use it, but if you can't be sure the lookup string is free of regex special characters such as '.', '$' or '?', you can use LEFT('str', numeric_expr) or RIGHT('str', numeric_expr) instead.
For example, if you had a list of names and wanted to return all those that are LIKE 'sa%'
you'd use:
select name from list where LEFT(name,2)='sa'; (with 2 being the length of 'sa')
Additionally, if you wanted to say where one column's values are LIKE another's, you could swap out the 2 for LENGTH(column_with_lookup_strings) and ='sa' for =column_with_lookup_strings, leaving it looking something like this:
select name from list where LEFT(name,LENGTH(column_with_lookup_strings))= column_with_lookup_strings;
https://cloud.google.com/bigquery/query-reference
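The point about special characters can be sketched in Python. The function starts_with below is a hypothetical mirror of LEFT(name, LENGTH(lookup)) = lookup, and the sample names are invented for illustration; the comparison shows how an unescaped lookup string used as a regex can match more than intended.

```python
import re

def starts_with(value: str, prefix: str) -> bool:
    """Python mirror of LEFT(value, LENGTH(prefix)) = prefix."""
    return value[:len(prefix)] == prefix

# Invented sample data; the lookup string contains the regex metacharacter '.'.
names = ["a.b Smith", "aXb Jones", "sam"]
lookup = "a.b"

# Literal prefix comparison: only the exact prefix matches.
by_prefix = [n for n in names if starts_with(n, lookup)]

# Using the lookup string as a regex without escaping: '.' matches any character.
by_naive_regex = [n for n in names if re.match(lookup, n)]

print(by_prefix)
print(by_naive_regex)
```

If a regex is still preferred, escaping the lookup string first (re.escape in Python) avoids the over-matching.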

Related

Looking for information on specific operators in BigQuery: whether they exist and, if not, which other operators perform similar functions

I am looking to understand whether or not the below functions are supported in bigquery. I have tried to use them and they are not recognized. If they are not supported, could you recommend what could be used to replace them?
ILIKE operator - case insensitive version of LIKE operator
IGNORE CASE - way to get around not having ILIKE, bigquery does not seem to support
CONTAINS operator - way to get around using wildcard operators with LIKE
Is the only way to do this with the LOWER() operator?
Thanks for the help!
String comparison in BigQuery is case sensitive, so LIKE on its own will not behave like ILIKE. To get a case-insensitive match, normalize the case of the column first.
For IGNORE CASE the same applies: use UPPER or LOWER in combination with LIKE, e.g. UPPER(column) LIKE '%BLAH%'.
For CONTAINS there is REGEXP_CONTAINS, and there is more info here: https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#regexp_contains
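As a rough illustration of the two case-insensitive options, the Python sketch below compares case normalization (the LOWER/UPPER approach) with an inline (?i) regex flag. The function names and sample rows are hypothetical. RE2, the regex library behind REGEXP_CONTAINS, also accepts (?i), so the second variant should carry over to a pattern such as r'(?i)blah', but verify that against your own data.

```python
import re

def contains_ci_lower(text: str, needle: str) -> bool:
    """Case-insensitive containment by normalizing case, like LOWER(col) LIKE '%x%'."""
    return needle.lower() in text.lower()

def contains_ci_regex(text: str, needle: str) -> bool:
    """Case-insensitive containment using the inline (?i) regex flag."""
    # re.escape keeps regex metacharacters in the needle literal.
    return re.search("(?i)" + re.escape(needle), text) is not None

# Invented sample rows.
rows = ["some BLAH here", "Blah", "nothing"]

via_lower = [contains_ci_lower(r, "blah") for r in rows]
via_regex = [contains_ci_regex(r, "blah") for r in rows]
print(via_lower, via_regex)
```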

How can I select all the occurrences of a Username with numbers at the end?

I'm trying to create a query to find all the occurrences of a name in my Usernames table:
for example I have:
SAnderso
BBobby
SAnderso1
SAnderso2
SAnderso99
and I'd like to get all the SAnderso entries (here rows 1, 3, 4 and 5)
here's what I tried:
SELECT * FROM Utilisateur WHERE NomUtilisateur LIKE 'SAnderso%[0123456789]' OR NomUtilisateur = 'SAnderso'
but when I do this nothing is shown in the results
can you help me ?
In any database, this might do what you want:
WHERE NomUtilisateur LIKE 'SAnderso%'
If you specifically need to check for numbers after the name, then the method depends on the database. In most databases, you'll need to use the extension for regular expressions.
In previous questions you have asked about MySQL so, assuming MySQL, you can use REGEXP_LIKE which allows regular expressions to be used:
SELECT * FROM Utilisateur
WHERE REGEXP_LIKE(NomUtilisateur, '^(SAnderso|SAnderso.*[0-9])$')
This is based on your attempt where you used % in LIKE so that there may be characters following "SAnderso" before the digit(s). If you only want "SAnderso" optionally followed by digits then you would change the regex pattern to ^SAnderso[0-9]*$
Thanks to @mhawke, who pointed me to regular expressions in SQL, the solution was to use a regex. REGEXP_LIKE doesn't exist in MariaDB, though, so I looked at the documentation and found out I could do it like this:
SELECT * FROM Utilisateur WHERE NomUtilisateur REGEXP '^(SAnderso|SAnderso.*[0-9])$'
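To see how the two patterns from the answers differ, here is a small Python check against the usernames from the question. Python's re module stands in for the database regex engine (an assumption; the engines agree on syntax this simple), and the last username is a hypothetical extra row added only to expose the difference.

```python
import re

usernames = [
    "SAnderso", "BBobby", "SAnderso1", "SAnderso2", "SAnderso99",
    "SAnderso_x7",  # hypothetical row, not in the question's data
]

# Pattern from the answer: anything may sit between the name and a final digit.
loose = re.compile(r"^(SAnderso|SAnderso.*[0-9])$")
# Stricter variant: the name followed only by optional digits.
strict = re.compile(r"^SAnderso[0-9]*$")

loose_hits = [u for u in usernames if loose.search(u)]
strict_hits = [u for u in usernames if strict.search(u)]

print(loose_hits)
print(strict_hits)
```

The original LIKE 'SAnderso%[0123456789]' returns nothing because MySQL and MariaDB treat the brackets in a LIKE pattern as literal characters, not as a character class.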

DB2 complex like

I have to write a select statement following the following pattern:
[A-Z][0-9][0-9][0-9][0-9][A-Z][0-9][0-9][0-9][0-9][0-9]
The only thing I'm sure of is that the first A-Z WILL be there. All the rest is optional and the optional part is the problem. I don't really know how I could do that.
Some example data:
B/0765/E 3
B/0765/E3
B/0764/A /02
B/0749/K
B/0768/
B/0784//02
B/0807/
My guess is that I should first remove all the whitespace and the / characters from the data and then execute the select statement. But I'm having some problems writing the LIKE pattern. Could anyone help me out?
The underlying reason for this is that I'm migrating a database. In the old database the values are in a single field, but in the new one they are split into several fields. I first have to write a "control script" to find out which records in the old database are not correct.
Even the following isn't working:
where someColumn LIKE '[a-zA-Z]%';
You can use regular expressions via XQuery to define this pattern. There are many questions on Stack Overflow about pattern matching in DB2 that have been solved with regular expressions:
DB2: find field value where first character is a lower case letter
Emulate REGEXP like behaviour in SQL
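A control script of the kind described in the question could be sketched outside the database, for example in Python. This is only one reading of "all the rest is optional": it assumes the separators are stripped first and that each group after the leading letter may be shorter or missing, so adjust the quantifiers if your rules differ. The names PATTERN, normalize and is_valid are hypothetical.

```python
import re

# Assumed interpretation: one required letter, then up to 4 digits,
# an optional letter, and up to 5 digits.
PATTERN = re.compile(r"^[A-Z][0-9]{0,4}[A-Z]?[0-9]{0,5}$")

def normalize(raw: str) -> str:
    """Remove whitespace and '/' separators, as the question proposes."""
    return re.sub(r"[\s/]", "", raw)

def is_valid(raw: str) -> bool:
    """True if the normalized value fits the assumed pattern."""
    return PATTERN.match(normalize(raw)) is not None

# Sample data from the question.
samples = ["B/0765/E 3", "B/0765/E3", "B/0764/A /02", "B/0749/K",
           "B/0768/", "B/0784//02", "B/0807/"]

# Records that would need manual review before migration.
invalid = [s for s in samples if not is_valid(s)]
print(invalid)
```

Under this reading every sample row from the question passes, while a value such as "0765/E" (no leading letter) is flagged.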

Can you use DOES NOT CONTAIN in SQL to replace not like?

I have a table called logs.lognotes, and I want to find a faster way to search for customers who do not have a specific word or phrase in the note. I know I can use NOT LIKE, but my question is: can you use DOES NOT CONTAIN to replace NOT LIKE, in the same way you can use:
SELECT *
FROM table
WHERE CONTAINS (column, 'searchword')
Yes, you should be able to use NOT on any boolean expression, as mentioned in the SQL Server Docs here. And, it would look something like this:
SELECT *
FROM table
WHERE NOT CONTAINS (column, 'searchword')
This returns the records that do not contain 'searchword' in the column. And, according to
Performance of like '%Query%' vs full text search CONTAINS query
this method should be faster than using LIKE with wildcards.
You can also simply use this:
select * from tablename where not(columnname like '%value%')
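One detail worth checking with the NOT (... LIKE ...) form is how rows with a NULL note behave. The sketch below uses Python's built-in sqlite3 module as a stand-in database (an assumption; the question is about SQL Server, where CONTAINS additionally requires a full-text index), with a hypothetical lognotes table and invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only.
conn.execute("CREATE TABLE lognotes (customer TEXT, note TEXT)")
conn.executemany(
    "INSERT INTO lognotes VALUES (?, ?)",
    [("alice", "asked for a refund"),
     ("bob", "changed address"),
     ("carol", None)],
)

# NOT (note LIKE ...) evaluates to NULL when note is NULL, so carol is skipped.
rows = conn.execute(
    "SELECT customer FROM lognotes WHERE NOT (note LIKE ?) ORDER BY customer",
    ("%refund%",),
).fetchall()

# Include customers without any note by handling NULL explicitly.
rows_with_null = conn.execute(
    "SELECT customer FROM lognotes "
    "WHERE note IS NULL OR NOT (note LIKE ?) ORDER BY customer",
    ("%refund%",),
).fetchall()

print(rows, rows_with_null)
conn.close()
```

Whether customers with no note at all should count as "not containing" the word is a design decision; the second query shows one way to include them.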

REGEXP_LIKE in SQLAlchemy

Does anyone know how I could use the equivalent of REGEXP_LIKE in SQLAlchemy? For example, I'd like to be able to do something like:
sa.Session.query(sa.Table).filter(sa.Table.field.like(regex-to match))
Thanks for your help!
It should (I have no access to Oracle) work like this:
sa.Session.query(sa.Table) \
.filter(sa.func.REGEXP_LIKE(sa.Table.c.column, '[[:digit:]]'))
When you need a database-specific function or operator that SQLAlchemy does not support, you can use a literal (textual) filter. SQLAlchemy still builds the rest of the query for you, i.e. it takes care of joins etc.
Here is an example of a literal filter with the PostgreSQL regex matching operator ~. The bind parameter must not be wrapped in quotes, otherwise it is treated as the literal string ':regex_pattern' and never bound; recent SQLAlchemy versions also require wrapping the string in text():
session.query(sa.Table).filter(sa.text("%s ~ :regex_pattern" % sa.Table.c.column.name)).params(regex_pattern='stack')
or you can manually specify the table and column as part of the literal string to avoid ambiguous column names:
session.query(sa.Table).filter(sa.text("table.column ~ :regex_pattern")).params(regex_pattern='[123]')
This is not fully portable, but here is a Postgres solution, which uses a ~ operator. We can use arbitrary operators thus:
sa.Session.query(sa.Table).filter(sa.Table.field.op('~', is_comparison=True)(regex-to match))
Or, assuming a default precedence of 0,
sa.Session.query(sa.Table).filter(sa.Table.field.op('~', 0, True)(regex-to match))
This also works with ORM constructs:
sa.Session.query(SomeClass).filter(SomeClass.field.op('~', 0, True)(regex-to match))