Can I use the BigQuery EXPORT DATA statement and schedule the query? - google-bigquery

My question is similar to the one asked here: BigQuery - Export query results to local file/Google storage
I need to extract data from 2 BigQuery tables using joins and WHERE conditions. The extracted data has to be placed in a file on Cloud Storage, most likely a CSV file. I want to go with a simple solution. Can I use the BigQuery EXPORT DATA statement in standard SQL and schedule it? Does it have a 1 GB export limit? If yes, what is the best way to implement this? Creating another temp table to save the results of the query and using a Dataflow job to extract the data from the temp table? Please advise.
Google Cloud now supports the EXPORT DATA statement. Please see the code snippet in the Cloud documentation:
https://cloud.google.com/bigquery/docs/reference/standard-sql/other-statements#exporting_data_to_csv_format
I am wondering whether I can use that statement to export data into a file, where the SELECT query joins 2 tables and applies other conditions.
This query would be a scheduled query in BigQuery.
Any input, please?
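For illustration, here is a minimal Python sketch of the kind of statement described above: a SELECT with a join and a WHERE clause wrapped in EXPORT DATA. The table names, column names, and bucket path are hypothetical, and the helper only builds the SQL text; it does not call BigQuery. The single `*` wildcard in the URI is what the EXPORT DATA documentation expects, and as far as I know it lets BigQuery split a large result across several files.

```python
def build_export_statement(select_sql: str, uri: str) -> str:
    """Wrap a SELECT in a BigQuery EXPORT DATA statement that writes CSV."""
    # Assumption: the EXPORT DATA docs describe `uri` as a single-wildcard URI.
    if uri.count("*") != 1:
        raise ValueError("uri should contain exactly one '*' wildcard")
    return (
        "EXPORT DATA OPTIONS(\n"
        f"  uri='{uri}',\n"
        "  format='CSV',\n"
        "  overwrite=true,\n"
        "  header=true) AS\n"
        f"{select_sql.strip()}"
    )


# Hypothetical tables and columns, used only to show a join with a filter.
JOIN_QUERY = """
SELECT o.order_id, c.customer_name, o.amount
FROM mydataset.orders AS o
JOIN mydataset.customers AS c ON c.customer_id = o.customer_id
WHERE o.order_date >= '2021-01-01'
"""

statement = build_export_statement(JOIN_QUERY, "gs://bucket/folder/export-*.csv")
print(statement)
```

The printed text is what would be used as the body of the query.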

Related

From Big Query to Google cloud storage

I would like to export data from BigQuery to Google Cloud Storage using a script. The script should also loop over multiple tables, save each one in CSV format, and overwrite the existing file.
Also, how can we schedule this script?
If anybody has an answer, that would be a great help.
Thanks in advance
A common way to approach this problem is to use Airflow and write a DAG that meets your requirements.
But if you want to iterate over tables and dump them to GCS on a regular basis using only BigQuery, the following could be another option.
1. Export Data
You can export data to GCS with the EXPORT DATA statement in a BigQuery script.
EXPORT DATA OPTIONS(
  uri='gs://bucket/folder/*.csv',
  format='CSV',
  overwrite=true,
  header=true,
  field_delimiter=';') AS
SELECT field1, field2 FROM mydataset.table1 ORDER BY field1 LIMIT 10;
https://cloud.google.com/bigquery/docs/reference/standard-sql/other-statements
2. Loops and Dynamic SQL
If you have a list of tables you want to dump, you can loop over those tables with a BigQuery FOR loop.
https://cloud.google.com/bigquery/docs/reference/standard-sql/procedural-language#loops
You also need to generate the EXPORT DATA statement dynamically for each table. To do so, you can use EXECUTE IMMEDIATE (dynamic SQL).
https://cloud.google.com/bigquery/docs/reference/standard-sql/procedural-language#execute_immediate
3. Scheduling
BigQuery provides a feature to schedule a user query and you can use it for your purpose.
https://cloud.google.com/bigquery/docs/scheduling-queries#set_up_scheduled_queries
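The three steps above could be combined roughly as follows. This is only a sketch: the Python helper renders the text of a BigQuery script that uses FOR ... IN and EXECUTE IMMEDIATE around EXPORT DATA, and the dataset, table, and bucket names are placeholders. The per-table folder layout in the URI is my own assumption, not something the answer prescribes.

```python
def build_export_script(dataset: str, tables: list[str], bucket: str) -> str:
    """Render a BigQuery script that exports each listed table to CSV in GCS."""
    # Table names are inserted as-is; this sketch assumes they are trusted.
    table_list = ", ".join(f"'{t}'" for t in tables)
    # %s placeholders are filled in by BigQuery's FORMAT() at run time,
    # once for the GCS folder and once for the table name.
    return f'''FOR t IN (SELECT name FROM UNNEST([{table_list}]) AS name)
DO
  EXECUTE IMMEDIATE FORMAT("""
    EXPORT DATA OPTIONS(
      uri='gs://{bucket}/%s/*.csv',
      format='CSV',
      overwrite=true,
      header=true) AS
    SELECT * FROM `{dataset}.%s`
  """, t.name, t.name);
END FOR;'''


script = build_export_script("mydataset", ["table1", "table2"], "bucket")
print(script)
```

The rendered script is what you would save as the scheduled query.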

BigQuery to GCS and GCS to Mysql

I am creating an Airflow pipeline where I use the BigQueryOperator to query my BigQuery tables and the BigQueryToCloudStorageOperator to export the result table to GCS as CSV.
I need to move the CSV to a MySQL database, where it should be stored as a table.
Can I please get any advice or ideas on how to implement this? Thanks!
Since your use case is to query data in BigQuery and store it in your MySQL database, you can use BigQueryToMySqlOperator. Its documentation describes it as follows:
Fetches the data from a BigQuery table (alternatively fetch data for selected columns) and insert that data into a MySQL table.

What is the easiest way to query a CSV file in Oracle SQL Developer?

I have a fairly simple CSV file that I would like to use within a SQL query. I'm using Oracle SQL Developer but none of the solutions I have found on the web so far seem to have worked. I don't need to store the data (unless I can use temp tables?) just to query it and show results.
Thank You!
You need to create an EXTERNAL TABLE. This essentially maps a CSV (or indeed any flat file) to a table. You can then use that table in queries. You will not be able to perform DML on the external table.

Google Bigquery query execution using google cloud dataflow

Is it possible to execute a BigQuery query directly from Google Cloud Dataflow and fetch the results, rather than reading the data from a table and then applying conditions?
For example, PCollections res=p.apply(BigqueryIO.execute("Select col1,col2 from publicdata:samples.shakespeare where ...."))
That way, instead of reimplementing with an iterative method what BigQuery queries already do, we could use the query directly.
Thanks and Regards
Ajay K N
BigQueryIO currently only supports reading from a Table and not a Query or View (FAQ).
One way to work around this is to have your main program create a permanent BigQuery table by issuing a query before you run your Dataflow job. After your job runs, you could delete the table.

Querying (SQL) Oracle/Toad dump files without importing them

We're doing a monthly data dump of our databases using Toad for Oracle's Export function. We've got some SQL queries to create statistics about the data. I'd like to compare the results of the current state with the last few dumps.
I can open the files with the Export File Browser in Toad (v11) and sort/filter the data using the GUI, but that's not powerful enough. Is there a way to query the dump files with SQL without having to take extra steps like creating a new schema and importing it?
By far the best way would be to reimport the data.