How do you query an array in Standard SQL that meets a certain condition? - sql

I am trying to pull only the records whose arrays meet a certain condition.
For example, I want only the rows whose array contains "IAB3".
Here is what the table looks like.
Table name: bids
BidderBanner | WinCat
1600402 | null
1911048 | null
1893069 | [IAB3-11, IAB3]
1214894 | IAB3
I initially thought this would work:
SELECT * FROM bids WHERE WinCat = "IAB3"
but I get an error that says no match for operator types array, string.
The database is in Google BigQuery.

Below is for BigQuery Standard SQL
#standardSQL
SELECT * FROM `project.dataset.bids` WHERE 'IAB3' IN UNNEST(WinCat)
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.bids` AS (
SELECT 1600402 BidderBanner, NULL WinCat UNION ALL
SELECT 1911048, NULL UNION ALL
SELECT 1893069, ['IAB3-11', 'IAB3'] UNION ALL
SELECT 1214894, ['IAB3']
)
SELECT * FROM `project.dataset.bids` WHERE 'IAB3' IN UNNEST(WinCat)
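The result contains only the two rows whose WinCat array includes 'IAB3': BidderBanner 1893069 and 1214894. The rows with a NULL WinCat are filtered out.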

You need to use single quotes for all string literals in SQL. It should be WHERE WinCat = 'IAB3', not WHERE WinCat = "IAB3".

One method uses unnest(), something like this:
SELECT b.*
FROM bids b
WHERE 'IAB3' IN (SELECT unnest(b.WinCat))
However, array syntax varies among the databases that support arrays, and they are not part of "standard SQL".

this will work:
SELECT * FROM bids WHERE REGEXP_LIKE (WinCat, '(.)*(IAB3)+()*');

Related

Group By Using Wildcards in Big Query

I have this query:
SELECT SomeTableA.*
FROM SomeTableB
LEFT JOIN SomeTableA USING (XYZ)
GROUP BY SomeTableA.*
I know that I cannot use wildcards in the GROUP BY part. At the same time, I don't really like listing all the columns (there can be up to 20) manually.
Could this be added as a new feature? Or is there an easy way to get the list of all 20 columns from SomeTableA for the GROUP BY part?
If you really have the exact query shown in your question, try the one below instead. No grouping is required.
#standardSQL
SELECT DISTINCT *
FROM `project.dataset.tableA`
WHERE xyz IN (SELECT xyz FROM `project.dataset.tableB`)
As for GROUP BY with wildcards in BigQuery: this sounds more like grouping by a struct, which is not supported, so you can submit a feature request if you want: https://issuetracker.google.com/issues/new?component=187149&template=0

Using SQL Query to return value from BigQuery User Defined Function

Can I use a query in Google BigQuery User Defined Function to return some value? I've been searching docs and stackoverflow for hours without any luck and I have a very specific use case where I need to return a single scalar value based on the values of multiple columns.
Following will be the use case for the query:
SELECT campaign,source,medium, get_channel(campaign,source,medium)
FROM table_name
The get_channel() UDF will use these parameters and a complex SELECT statement to return a single scalar value for the row. I've prepared the query; I just need to find a way to use that query in the UDF, and I am honestly at a loss.
Is my use case correct? Is this even possible? Are there any alternatives to do this?
It looks like you want to use a UDF to select a scalar value from some lookup table. If so, no: you cannot reference a table in a UDF. See more in Limits and Limitations.
But if you just want some complex manipulation of the arguments, then sure. See the dummy example below:
#standardSQL
CREATE TEMPORARY FUNCTION get_channel(campaign INT64, source INT64, medium INT64) AS ((
SELECT campaign + source + medium as result_of_complex_select_statement
));
WITH `project.dataset.table_name` AS (
SELECT 1 AS campaign, 2 AS source, 3 AS medium UNION ALL
SELECT 4, 5, 6 UNION ALL
SELECT 7, 8, 9
)
SELECT
campaign,
source,
medium,
get_channel(campaign,source,medium) AS channel
FROM `project.dataset.table_name`
You should rather use JOIN to achieve your goal

BigQuery Wildcard tables with Regex and date range

Is it possible to combine the table wildcard functions as documented here?
I've taken a look through the Table Query functions SO answer, but it doesn't quite seem to cover my use case.
I have table names in the format: s_CUSTOMER_ID_YYYYMMDD
I can find all the tables for a customer ID using:
SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")')
And I can find all the tables for a date range via:
SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01')))
But how do I query for both at the same time?
I tried using sub queries like this:
SELECT * FROM
(SELECT *
FROM TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")'))
,(SELECT *
FROM (TABLE_DATE_RANGE([project:dataset],
TIMESTAMP('2016-01-01'),
TIMESTAMP('2016-03-01'))))
...but the parser complains of Error: Can't parse table: project:dataset.
Adding a dot, so that it reads project:dataset., produces a different error: Error: Error preparing subsidiary query: Dataset project:dataset. not found
Are my table names poorly done? What would be a better way of organising them if so?
Below is a quick "solution". It should work, and you can improve it based on any real or extra requirements you probably have:
SELECT *
FROM
TABLE_QUERY([project:dataset],
'REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")
AND RIGHT(table_id, 8) BETWEEN "20160101" AND "20160301"')
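The filter above combines two tests on each table name: a prefix match and a range check on the last eight characters. As a rough illustration only, the same predicate can be sketched in plain Python against a hypothetical list of table names (the `matches` helper and the sample names are assumptions made for this sketch, not part of BigQuery). It relies on the fact that fixed-width YYYYMMDD strings sort in date order, which is why a plain string BETWEEN is enough:

```python
import re

def matches(table_id, prefix="s_CUSTOMER_ID", start="20160101", end="20160301"):
    """Mimic the TABLE_QUERY predicate: prefix match AND date suffix within range."""
    # Equivalent of REGEXP_MATCH(table_id, r"^s_CUSTOMER_ID")
    has_prefix = re.match("^" + re.escape(prefix), table_id) is not None
    # Equivalent of RIGHT(table_id, 8) BETWEEN start AND end (both bounds inclusive)
    suffix = table_id[-8:]
    return has_prefix and start <= suffix <= end

# Hypothetical table names, for illustration only
table_ids = [
    "s_CUSTOMER_ID_20151231",
    "s_CUSTOMER_ID_20160101",
    "s_CUSTOMER_ID_20160215",
    "s_CUSTOMER_ID_20160302",
    "s_OTHER_ID_20160215",
]

selected = [t for t in table_ids if matches(t)]
print(selected)
```

Only the two names that carry the customer prefix and a date suffix inside the range are kept. The comparison stays correct only while every suffix is exactly eight digits long.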

Emulate subquery with no main table in access

I can do this in SQL Server:
SELECT 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
(select sum(empid) from HR.employees) STOCK
but in Access the same query shows me the following error:
Query input must contain at least one table or query
So what would be the best way to emulate this? Building the query on some unrelated table looks dirty to me.
EDIT 1: HR.employees may have no data, but I still want to show the constants ('HERRAMIENTA ELÉCTRICA' and 0) and a 0 in the third column, maybe using isnull. That is not the problem here.
Why not select directly:
select 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
IIF(ISNULL(sum(empid)), 0, sum(empid)) AS STOCK
from HR.employees
This simply doesn't work in Access. You need a FROM clause.
So you need to have a dummy table with one record, even if you don't use a single field from that table.
SELECT 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
(select sum(empid) from HR.employees) STOCK
FROM Dummy_Table
Using this as an example of an empty table:
with employ as
(select 2 as col from dual
minus
select 2 as col from dual)
The query is this one:
select 'HERRAM' as tipo,
0 as deprec,
coalesce(sum(col), 0) as STOCK
from employ;
coalesce(x, value) returns value when x is NULL, so the column shows 0 instead of NULL.
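To see why the coalesce is needed, here is a small sketch using Python's built-in sqlite3 module rather than Oracle or Access, so the exact dialect differs from the answers here. The empty `employ` table is recreated as an assumption for the demo. An aggregate query without GROUP BY still returns one row over an empty table, and SUM yields NULL in that row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the empty table used in the example above
conn.execute("CREATE TABLE employ (col INTEGER)")

# SUM over zero rows is NULL, which Python receives as None
raw = conn.execute("SELECT SUM(col) FROM employ").fetchone()

# COALESCE replaces that NULL with 0; the constants are returned as usual
row = conn.execute(
    "SELECT 'HERRAM' AS tipo, 0 AS deprec, COALESCE(SUM(col), 0) AS stock "
    "FROM employ"
).fetchone()

print(raw)  # (None,)
print(row)  # ('HERRAM', 0, 0)
conn.close()
```

The same idea carries over to the other dialects mentioned here, with Nz or IIF(ISNULL(...)) playing the role of COALESCE in Access.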
In Access, you can use a system table, and Val and Nz for the zero value:
SELECT TOP 1
'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
Val(Nz((select sum(empid) from HR.employees), 0)) AS STOCK
FROM
MSysObjects

SQL reverse LIKE

I have a table holding a list of countries. Say one of these countries is 'Macedonia'
What SQL query would return the 'Macedonia' record if a search is made for 'Republic of Macedonia'?
I believe that in LINQ it would be something like
var countryToSearch = "Republic of Macedonia";
var result = from c in Countries
where countryToSearch.Contains(c.cName)
select c;
Now what would the SQL equivalent for the query above be?
Had it been the other way round (i.e. the database has the long version of the country name stored) the below query should work:
Select * from country
where country.Name LIKE (*Macedonia*)
but I do not see how I can reverse it.
Side note: the country names in the table will always be the short version of country names
Just reverse the operator (and fix the syntax)
Select * from country
where 'Republic of Macedonia' LIKE CONCAT('%',country.Name,'%')
You can use CHARINDEX for this.
Select * from country
where CHARINDEX(country.Name, 'Republic of Macedonia') > 0
For SQLite, a substring stored in a table (column mySubstring of mytable) can be compared against an arbitrary larger test string:
select * from mytable where instr('Somewhere substring is here', mytable.mySubstring) > 0
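As a quick check of the "reversed" comparison, the sketch below runs both the LIKE form and the instr form in an in-memory SQLite database through Python's sqlite3 module. The `country` table contents are made up for the demo, and `||` is used for concatenation because that is SQLite's operator (other databases use CONCAT or `+`):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (Name TEXT)")
# Hypothetical sample rows holding the short country names
conn.executemany(
    "INSERT INTO country (Name) VALUES (?)",
    [("Macedonia",), ("Malta",), ("France",)],
)

search = "Republic of Macedonia"

# Reverse LIKE: the search string is on the left, the column builds the pattern
via_like = [r[0] for r in conn.execute(
    "SELECT Name FROM country WHERE ? LIKE '%' || Name || '%'", (search,)
)]

# Substring position: instr(haystack, needle) > 0 when the name occurs in the search string
via_instr = [r[0] for r in conn.execute(
    "SELECT Name FROM country WHERE instr(?, Name) > 0", (search,)
)]

print(via_like)   # ['Macedonia']
print(via_instr)  # ['Macedonia']
conn.close()
```

One difference worth keeping in mind: in the LIKE form, any % or _ stored in the column acts as a wildcard, while instr compares the text literally.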