Command to get the SQL query of a BigQuery view

We have created a large number of views in BigQuery using standard SQL. Now we need to check that these views are correct.
Is there a bq command to get the SQL query with which these views were created in BigQuery?
Such a command would save the manual effort of checking each view for correctness.

Use the show command with the view flag.
e.g. bq show --view <project>:<dataset>.<view>

You can also use the --format=prettyjson flag (instead of --view) so you can easily get the query content when running a script, for example:
bq show --format=prettyjson <project>:<dataset>.<view>
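If you have many views to check, the same command can be wrapped in a loop. The following is only a sketch: dump_view_definitions is a hypothetical helper name, the view names are passed in by hand, and it assumes bq is on your PATH. In the JSON output the view's SQL is found under the view.query key.

```shell
# Sketch: save the JSON definition of each listed view to <out_dir>/<view>.json.
# dump_view_definitions is a hypothetical helper, not part of the bq tool.
dump_view_definitions() {
  dataset="$1"   # e.g. myproject:mydataset
  out_dir="$2"   # directory that receives one JSON file per view
  shift 2
  mkdir -p "$out_dir" || return 1
  for view in "$@"; do
    # Same command as above; the SQL text sits under "view" -> "query".
    bq show --format=prettyjson "${dataset}.${view}" > "${out_dir}/${view}.json" \
      || echo "failed to fetch ${dataset}.${view}" >&2
  done
}
```

Usage would look like: dump_view_definitions myproject:mydataset ./view_defs view_a view_b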

Related

Duplicate several tables in bigquery project at once

In our BQ export schema, we have one table for each day as per the screenshot below.
I want to copy the tables before a certain date (2021-feb-07). I know how to copy one day at a time via the UI, but is there a way to use the cloud console to write code that copies the selected date range all at once? Or maybe an SQL command run directly from a query window?
I think you should convert your sharded tables into a partitioned table. That way you can handle all of your data with a single query. As mentioned in the official documentation, partitioned tables also perform better.
To make the conversion, you can just execute the following commands in the console.
bq partition \
--time_partitioning_type=DAY \
--time_partitioning_expiration 259200 \
mydataset.sourcetable_ \
mydataset.mytable_partitioned
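Note that --time_partitioning_expiration is given in seconds (259200 is three days), so omit it if partitions should not expire. The command turns your sharded tables sourcetable_(xxx) into a single partitioned table, mytable_partitioned, which can be queried across the entire set of data entries with a single query: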
SELECT
*
FROM
`myprojectid.mydataset.mytable_partitioned`
WHERE
_PARTITIONTIME BETWEEN TIMESTAMP('2022-01-01') AND TIMESTAMP('2022-01-03')
For more details about the conversion commands, see the linked documentation. I also recommend checking the documentation on querying partitioned tables and on partitioned tables in general.
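If you would rather copy the daily tables as they are, a loop around bq cp is one option. This is a rough sketch under assumptions: copy_shards_before is a hypothetical helper, the shard names are assumed to end in a YYYYMMDD suffix, and the list of table names is supplied by you (for example from bq ls output).

```shell
# Sketch: copy every shard whose YYYYMMDD suffix is earlier than a cutoff.
# copy_shards_before and its argument names are hypothetical.
copy_shards_before() {
  src_dataset="$1"   # e.g. mydataset
  dst_dataset="$2"   # e.g. mydataset_backup
  prefix="$3"        # e.g. sourcetable_
  cutoff="$4"        # e.g. 20210207 (exclusive)
  shift 4
  for table in "$@"; do
    suffix="${table#"$prefix"}"
    # Skip names that do not look like <prefix><8 digits>.
    case "$suffix" in
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
      *) continue ;;
    esac
    if [ "$suffix" -lt "$cutoff" ]; then
      # --no_clobber leaves an existing destination table untouched.
      bq cp --no_clobber "${src_dataset}.${table}" "${dst_dataset}.${table}"
    fi
  done
}
```

For example: copy_shards_before mydataset mydataset_backup sourcetable_ 20210207 sourcetable_20210205 sourcetable_20210206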

List all objects in a dataset using bq ls command in BigQuery

I am trying to list all the objects present in a dataset in BigQuery.
I tried using bq ls projectID:dataset_name command in Google SDK shell. However this returned only the list of tables present in the dataset. I am interested in listing all the stored procedures present in the same dataset.
It is possible to get the list of functions with a query:
bq query --nouse_legacy_sql \
'SELECT
*
FROM
mydataset.INFORMATION_SCHEMA.ROUTINES'
It is possible to include all routines in the bq ls command by setting the flag --routines=true:
bq ls dataset_name --routines=true
The default value is false. Routines include persistent user-defined functions, table functions (Preview), and stored procedures. See GCP docs for more detail.
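To see only the stored procedures, the INFORMATION_SCHEMA query can be narrowed with the routine_type column. A minimal sketch, where list_procedures is a hypothetical wrapper name and the dataset name is passed in by you:

```shell
# Sketch: list only stored procedures in a dataset.
# list_procedures is a hypothetical helper name.
list_procedures() {
  dataset="$1"   # e.g. mydataset
  bq query --nouse_legacy_sql \
    "SELECT routine_name, routine_type
     FROM ${dataset}.INFORMATION_SCHEMA.ROUTINES
     WHERE routine_type = 'PROCEDURE'"
}
```

Call it as: list_procedures mydataset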

Google Big Query: Determine invalid views (e.g. dryRun & list)

We have several views in numerous projects & datasets in Google Big Query. Is there a way to list all invalid views? E.g. to "re-validate" all views and then to get a list?
While it might not cover all problems I think I could execute a view using the dryRun parameter to determine its state (https://cloud.google.com/bigquery/docs/dry-run-queries). But in this case I would like to determine all existing views (over all projects, or - as this may be a bad idea - at least within one project) and then to trigger the view with the dryRun parameter and to store the results somewhere/somehow.
Hints how to do that are appreciated.
Regards,
HerrB92
I am not aware of any built-in tools to do this, but it should be doable with some scripting.
The bq ls command returns the list of datasets. For each dataset you can then run bq ls <dataset> (or use SELECT * FROM dataset.INFORMATION_SCHEMA.TABLES WHERE TABLE_TYPE = 'VIEW') to find the views, and finally query each view with the --dry_run flag.
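The scripting idea above might look roughly like this for a single dataset. It is a sketch, not a tested tool: find_invalid_views is a hypothetical name, it assumes bq exits with a non-zero status when a dry run fails, and it only covers views that are valid standard SQL (legacy SQL views would be reported as invalid).

```shell
# Sketch: dry-run every view in one dataset and print the ones that fail.
# find_invalid_views is a hypothetical helper name.
find_invalid_views() {
  dataset="$1"   # e.g. mydataset
  # List view names as CSV; the first output line is the column header.
  bq --quiet --format=csv query --nouse_legacy_sql \
    "SELECT table_name FROM ${dataset}.INFORMATION_SCHEMA.TABLES WHERE table_type = 'VIEW'" |
    tail -n +2 |
    while IFS= read -r view; do
      [ -n "$view" ] || continue
      # A dry run validates the query without actually running it.
      # stdin is redirected so bq cannot consume the list of view names.
      if ! bq query --nouse_legacy_sql --dry_run \
          "SELECT * FROM \`${dataset}.${view}\`" </dev/null >/dev/null 2>&1; then
        echo "INVALID: ${dataset}.${view}"
      fi
    done
}
```

Running find_invalid_views mydataset > invalid_views.txt would store the result; looping over the output of bq ls extends it to every dataset in a project.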

Use Bigquery API or bq command line tool to create new table

I am trying to come up with a programmatic way to generate a new bigquery table from a pre-existing table. I know how to do this to create a new view using the bq command line tool
bq --project_id='myProject' mk --view="SELECT * FROM [MYDATASET.database]" myProject.newTable
But that creates a view and that doesn't help me, I need to create a table for a bunch of reasons.
I would be happy to create a view and then generate a new table from that view periodically, but I can't figure out how to do that without doing it manually through the BigQuery web interface.
I'd appreciate any help.
Brad
If it's a normal table (not a view), you can use the copy command:
bq cp <source> <destination>
If you're trying to materialize a view, or if you need to modify the contents of the table in the process (e.g., adding/removing/transforming fields), you can run a query with a destination table:
bq query \
--destination_table=<destination> \
--allow_large_results \
--noflatten_results \
'SELECT ... FROM <source>'
The query option is more powerful, but you'll get charged for running the query. The copy is free.
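For the "refresh a table from a view periodically" case, the query form can be wrapped in a small function and run from a scheduler such as cron. This is a sketch under assumptions: materialize_view is a hypothetical name, the view is assumed to be readable with standard SQL, and --replace overwrites the destination table on every run.

```shell
# Sketch: overwrite a table with the current contents of a view.
# materialize_view is a hypothetical helper name.
materialize_view() {
  view="$1"          # e.g. mydataset.myview
  destination="$2"   # e.g. mydataset.mytable
  bq query \
    --nouse_legacy_sql \
    --destination_table="$destination" \
    --replace \
    "SELECT * FROM \`${view}\`"
}
```

For example: materialize_view mydataset.myview mydataset.mytable. Each run is a normal query, so it is billed like one.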

BigQuery bq command - load only if table is empty or doesn't exist

I'm executing a load command with bq, e.g:
bq load ds.table gs://mybucket/data.csv dt:TIMESTAMP,f1:INTEGER
I would like to load the data only if the table is empty or doesn't exist.
Is it possible?
EDIT:
Basically I would like the WRITE_EMPTY API option via the bq command line tool:
https://cloud.google.com/bigquery/docs/reference/v2/jobs#configuration.load.writeDisposition
If the table already exists and contains data, a 'duplicate' error is returned in the job result.
If you check bq.py, which contains the source code for the BigQuery CLI, you'll find that the _Load() method doesn't implement the WRITE_EMPTY API option. It's either the default WRITE_APPEND or the optional WRITE_TRUNCATE.
As you indicate, the API does support WRITE_EMPTY - if you want to see this as an option on the CLI, you can submit a feature request at https://code.google.com/p/google-bigquery/issues/list?q=label:Feature-Request
You can use the BQ command-line tool.
Get Table Information
bq show <project_id>:<dataset_id>.<table_id>
List tables
bq ls [project_id:][dataset_id]
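Since the CLI has no WRITE_EMPTY option, one workaround is to check the table first and only then load. The following is a sketch with assumptions: load_if_empty is a hypothetical helper, the table is given as dataset.table, and bq show is assumed to exit non-zero when the table does not exist. Unlike WRITE_EMPTY it is not atomic, so another job could write to the table between the check and the load.

```shell
# Sketch: load only when the table is missing or has zero rows.
# load_if_empty is a hypothetical helper; the check is not atomic.
load_if_empty() {
  table="$1"     # e.g. ds.table
  src_uri="$2"   # e.g. gs://mybucket/data.csv
  schema="$3"    # e.g. dt:TIMESTAMP,f1:INTEGER
  if bq show "$table" >/dev/null 2>&1; then
    # Table exists: count its rows (the last CSV line holds the value).
    rows=$(bq --quiet --format=csv query --nouse_legacy_sql \
      "SELECT COUNT(*) FROM \`${table}\`" | tail -n 1)
    if [ "$rows" != "0" ]; then
      echo "skipping load: ${table} is not empty" >&2
      return 1
    fi
  fi
  bq load "$table" "$src_uri" "$schema"
}
```

With the command from the question this would be: load_if_empty ds.table gs://mybucket/data.csv dt:TIMESTAMP,f1:INTEGER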