How to enable standard SQL for BigQuery using the bq shell

bq query has a --use_legacy_sql flag that can be set to false to enable standard SQL.
How do I do the same when using bq shell?
I tried the variations below, and both failed with the error Unknown command line flag 'use_legacy_sql'.
bq --use_legacy_sql=false shell
bq shell --use_legacy_sql=false

It doesn't look like it's possible currently, so I filed a feature request. The alternative is to pass it to "query" each time, although that feels very verbose. For example:
$ bq shell
myproject> query --use_legacy_sql=false SELECT [1, 2, 3] AS arr
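If the goal is just to avoid retyping the flag, one workaround worth trying is to set a default for the query command in ~/.bigqueryrc, the CLI's flag-defaults file (I haven't verified whether the shell's query command picks these defaults up, so treat this as a sketch):
$ cat ~/.bigqueryrc
[query]
--use_legacy_sql=false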

Related

List all objects in a dataset using bq ls command in BigQuery

I am trying to list all the objects present in a dataset in BigQuery.
I tried using the bq ls projectID:dataset_name command in the Google SDK shell. However, this returned only the list of tables present in the dataset. I am interested in listing all the stored procedures present in the same dataset.
It is possible to get the list of routines, including stored procedures, with a query:
bq query --nouse_legacy_sql \
'SELECT
*
FROM
mydataset.INFORMATION_SCHEMA.ROUTINES'
It is possible to include all routines in the bq ls command by setting the flag --routines=true:
bq ls dataset_name --routines=true
The default value is false. Routines include persistent user-defined functions, table functions (Preview), and stored procedures. See GCP docs for more detail.
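Since the question asks about stored procedures specifically, you can filter the ROUTINES view on its routine_type column (a sketch; mydataset is a placeholder):
bq query --nouse_legacy_sql \
'SELECT routine_name
FROM mydataset.INFORMATION_SCHEMA.ROUTINES
WHERE routine_type = "PROCEDURE"'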

How do I specify query parameters for a scheduled query in BigQuery?

I am trying to set up some Scheduled Queries against my tables in BigQuery.
https://cloud.google.com/bigquery/docs/scheduling-queries
I would like these queries to use parameters, e.g.
bq query --use_legacy_sql=false \
--parameter=label::TEST_LABEL \
"SELECT #label AS label"
I cannot find any way to do this via the bq command-line tool or the API. Passing it as a --parameter flag, or as a field in the params JSON, does not work.
bq mk \
--transfer_config \
--project_id=my_project \
--location=US \
--display_name='Test Scheduled Query' \
--params='{"query":"INSERT my_project.my_data.test SELECT @label AS label;"}' \
--data_source=scheduled_query \
--service_account_name=my-svc-act@my_project.iam.gserviceaccount.com
You have got a concept wrong here; I am not sure what your use case is.
Scheduled Queries
A scheduled query is set up to run periodically in the background, and it triggers automatically. Since it is automatically scheduled, there is no way to provide custom parameters in the way you have described. This is because scheduled queries use features of the BigQuery Data Transfer Service.
A scheduled query has only a few built-in runtime parameters, such as @run_time; more on this here. You will find further information in this other link.
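For illustration, here is a sketch of creating a scheduled query whose SQL uses the built-in @run_time parameter; the display name, dataset, and table-name template are placeholders:
bq mk \
--transfer_config \
--data_source=scheduled_query \
--target_dataset=my_dataset \
--display_name='Runtime Param Example' \
--schedule='every 24 hours' \
--params='{"query":"SELECT @run_time AS run_time","destination_table_name_template":"my_table_{run_date}","write_disposition":"WRITE_APPEND"}'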
You probably want to execute a query on-demand, and not a scheduled query.
On-demand queries
Example:
bq query \
--use_legacy_sql=false \
--parameter=corpus::romeoandjuliet \
--parameter=min_word_count:INT64:250 \
'SELECT
word, word_count
FROM
`bigquery-public-data.samples.shakespeare`
WHERE
corpus = @corpus
AND
word_count >= @min_word_count
ORDER BY
word_count DESC'
More on running parameterized queries here.
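Note the --parameter format is name:type:value; when the type part is left empty, as in corpus::romeoandjuliet above, the parameter defaults to STRING.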

big query command line to execute multiple sql files

Does anyone here know how to execute multiple SQL files with the bq command line? For example, if I have two SQL files named test1.sql and test2.sql, how should I do it?
If I do this:
bq query --use_legacy_sql=false < test1.sql
this only executes the test1.sql.
What I want to do is execute both test1.sql and test2.sql.
There isn't a built-in way to do this with bq query.
Best option if you want one line is to use the && operator.
bq query --use_legacy_sql=false < test1.sql && bq query --use_legacy_sql=false < test2.sql
There is an alternative way, which is using a shell script in order to loop through all of the files:
#!/bin/bash
# Loop through all of the .sql files in the directory and run each one
for f in /path/to/sqls/*.sql
do
  bq query --use_legacy_sql=false < "$f"
done
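If you want the loop to stop at the first failing query rather than continue, add set -e near the top of the script; that's standard shell behavior, not something bq provides.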

BigQuery error when using wildcard in "select * from ..." only when executed on GCE VM

I get an error from BigQuery when running a basic query with a wildcard:
bq query --use_legacy_sql=false "SELECT * FROM mydata.states LIMIT 10"
The problem is with the * - here is the error I get from bq when running it on the VM in GCE:
Error in query string: Error processing job '...': Field 'workspace' not found in table 'mydata.states'.
The "workspace" is the name of the directory in my current working directory - it appears that bq is expanding that (similar to ls *).
The same command works just fine in the bq shell without expanding * to the first directory it finds. The same query works perfectly fine on my local ubuntu outside of GCE.
If I list columns explicitly it works fine. I can't figure out what makes bq to replace * with the directory name in my current path and how to disable that?
I have two very similar machines running bq command line version 2.0.24 and both are ubuntu 14.04. Other than this, the * works in bash just as expected, including set -f that stops expansion all together, but it has no effect on bq...
The funny thing is that * works as expected when used in a query like this:
bq query --use_legacy_sql=false "SELECT COUNT(*) FROM mydata.states LIMIT 10"
The other odd thing is that this also works fine:
echo "SELECT * FROM mydata.states LIMIT 10" | bq query
The BigQuery command line client does not expand the * itself; that's caused by Bash. The best long-term solution would be to put your query into a file, e.g. my_query.sql. Then you can do:
bq query --use_legacy_sql=false < my_query.sql
Now you don't need to worry about escaping any part of the query, since the query text is read from the file.
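Another option in the same spirit, assuming a POSIX shell, is a quoted here-document; the quoted EOF delimiter suppresses all shell expansion without needing a separate file:
bq query --use_legacy_sql=false <<'EOF'
SELECT * FROM mydata.states LIMIT 10
EOF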

Google Bigquery BQ command line execute query from a file

I use the bq command-line tool to run queries, e.g.:
bq query "select * from table"
What if I store the query in a file and run the query from that file? Is there a way to do that?
The other answers seem to be either outdated or needlessly brittle. As of 2019, bq query reads from stdin, so you can just redirect your file into it:
bq query < myfile.sql
Query parameters are passed like this:
bq query --parameter name:type:value < myfile.sql
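For example, assuming myfile.sql references a @min_count parameter (the names here are placeholders):
bq query --use_legacy_sql=false --parameter=min_count:INT64:100 < myfile.sql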
There is another way.
Try this:
bq query --flagfile=[your file with absolute path]
Ex:
bq query --flagfile=/home/user/abc.sql
You can run a query from a text file with a little bit of shell magic:
$ echo "SELECT 17" > qq.txt
$ bq query "$(cat qq.txt)"
Waiting on bqjob_r603d91b7e0435a0f_00000150c56689c6_1 ... (0s) Current status: DONE
+-----+
| f0_ |
+-----+
| 17 |
+-----+
Note this works on any Unix variant (including macOS). If you're using Windows, this should work under PowerShell but not the default cmd prompt.
If you are using standard SQL (not legacy SQL):
Steps:
1. Create a .sql file (you can use any extension).
2. Put your query in it. Make sure there is a semicolon (;) at the end of the query.
3. Go to the command line and execute the commands below.
4. If you want to add parameters, you have to specify them sequentially.
Example:
bq query --use_legacy_sql=false "$(cat /home/airflow/projects/bql/query/test.sql)"
For a parameter:
bq query --use_legacy_sql=false --parameter=country::USA "$(cat /home/airflow/projects/bql/query/test.sql)"
where test.sql contains:
select * from l1_gcb_trxn.account where country=@country;
This thread offers a good solution:
bq query `cat my_query.sql`
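One caveat with that form: unquoted backticks let the shell word-split the query and collapse its newlines, so quoting the substitution, as in bq query "$(cat my_query.sql)", is safer; the same applies to longer commands like the one below.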
bq query --replace --use_legacy_sql=false \
--destination_table=syw-analytics:store_ranking.SHC_ENGAGEMENT_RANKING_TEST \
"SELECT RED,
DEC,
REDEM
from \`syw.abc.xyz\`"