Easiest way to bulk load 100 SQL CSV table dumps to BigQuery - google-bigquery

Hi, what would be the easiest and fastest way to bulk load 100 CSV SQL table dumps into new tables in BigQuery?
Br,
Noomster

The easiest way is to create a Cloud Function that watches a bucket: whenever you upload a file to the bucket, it gets loaded into BigQuery. See a code example here:
BigQuery: How to autoreload table with new storage JSON files?
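For reference, a minimal sketch of that pattern, assuming a GCS-triggered Python Cloud Function, a header row in each dump, and autodetected schemas. The bucket name, dataset name, and file-to-table naming rule below are made up:

```python
# main.py -- deployed e.g. with:
#   gcloud functions deploy load_csv --runtime python310 \
#     --trigger-resource MY_BUCKET --trigger-event google.storage.object.finalize
from google.cloud import bigquery

BQ_DATASET = "my_dataset"  # assumption: target dataset already exists


def load_csv(event, context):
    """Triggered when a file lands in the bucket; loads it into a table named after the file."""
    bucket = event["bucket"]
    name = event["name"]
    if not name.lower().endswith(".csv"):
        return  # ignore non-CSV objects

    client = bigquery.Client()
    # Derive the table name from the file name (sketch only; sanitize as needed).
    table_id = f"{client.project}.{BQ_DATASET}.{name.rsplit('/', 1)[-1][:-4]}"
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=1,   # assumes each dump has a header row
        autodetect=True,       # let BigQuery infer the schema
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
    )
    load_job = client.load_table_from_uri(f"gs://{bucket}/{name}", table_id, job_config=job_config)
    load_job.result()  # wait so load errors show up in the function logs
```

With that in place, loading all 100 dumps is just copying them into the bucket, e.g. gsutil -m cp *.csv gs://MY_BUCKET/.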

Related

Read a CSV file from S3 using Trino

Scenario:
I upload a CSV file into an S3 bucket, and now I would like to read this table using Trino.
Is it possible to just read the table without a CREATE TABLE statement? Maybe with a simple SELECT only? Or do I have to CREATE TABLE every time I want to read the CSV file?
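With the Hive connector, the usual approach is to register an external table over the S3 location once; after that, plain SELECTs work without repeating the CREATE TABLE. A rough sketch via the trino Python client, with made-up host, bucket, and column names, and noting that CSV tables in the Hive connector are limited to varchar columns:

```python
# Sketch: register an external Hive table over the S3 prefix once, then query it.
import trino

conn = trino.dbapi.connect(host="trino.example.com", port=8080, user="analyst",
                           catalog="hive", schema="default")
cur = conn.cursor()

# One-time registration; external_location points at the S3 prefix (a directory), not the file itself.
cur.execute("""
    CREATE TABLE IF NOT EXISTS hive.default.my_csv (
        col1 varchar,
        col2 varchar
    )
    WITH (
        external_location = 's3a://my-bucket/path/to/csv/',
        format = 'CSV'
    )
""")

# Afterwards it is just a plain SELECT.
cur.execute("SELECT col1, col2 FROM hive.default.my_csv LIMIT 10")
print(cur.fetchall())
```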

How to dynamically create a table in Snowflake, getting the schema from a Parquet file stored in AWS

Could you help me load a couple of Parquet files into Snowflake?
I've got about 250 Parquet files stored in an AWS stage.
250 files = 250 different tables.
I'd like to dynamically load them into Snowflake tables.
So, I need to:
Get the schema from each Parquet file (I've read that I could get the schema from a Parquet file using Apache parquet-tools).
Create a table using the schema from the Parquet file.
Load the data from the Parquet file into this table.
Could anyone show me how to do that? Is there an efficient way to do it (using the Snowflake GUI, for example)? I can't find one.
Thanks.
If the schema of the files is the same, you can put them in a single stage and use the INFER_SCHEMA function. This will give you the schema of the Parquet files.
https://docs.snowflake.com/en/sql-reference/functions/infer_schema.html
In case the files all have different schemas, then I'm afraid you have to infer the schema of each file separately.
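As a rough sketch of that flow, one table per file, using the Snowflake Python connector; the stage @my_stage, the file format my_parquet_fmt, and the file list are all hypothetical:

```python
# Sketch: one table per Parquet file, schema inferred with INFER_SCHEMA.
# Assumes a stage @my_stage and a Parquet file format my_parquet_fmt already exist.
import snowflake.connector

conn = snowflake.connector.connect(account="my_account", user="my_user",
                                    password="...", database="MY_DB",
                                    schema="PUBLIC", warehouse="MY_WH")
cur = conn.cursor()

files = ["customers.parquet", "orders.parquet"]  # in practice: list all 250 files, e.g. via LIST @my_stage
for f in files:
    table = f.rsplit(".", 1)[0].upper()
    # Create the table from the schema inferred for this one file.
    cur.execute(f"""
        CREATE TABLE IF NOT EXISTS {table}
        USING TEMPLATE (
            SELECT ARRAY_AGG(OBJECT_CONSTRUCT(*))
            FROM TABLE(INFER_SCHEMA(
                LOCATION => '@my_stage/{f}',
                FILE_FORMAT => 'my_parquet_fmt'))
        )
    """)
    # Load the file into the table that was just created from it.
    cur.execute(f"""
        COPY INTO {table}
        FROM '@my_stage/{f}'
        FILE_FORMAT = (FORMAT_NAME = 'my_parquet_fmt')
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    """)
```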

BigQuery to GCS and GCS to MySQL

I am creating an Airflow pipeline where I use the BigQueryOperator to query my BigQuery tables and the BigQueryToCloudStorageOperator to export the result table to GCS as CSV.
I then need to move the CSV into a MySQL database, where it should be stored as a table.
Can I please get any advice or ideas on how to implement this? Thanks!
Since your use case is to query data in BigQuery and store the result in your MySQL database, you can use the BigQueryToMySqlOperator. It fetches the data from a BigQuery table (alternatively, data for selected columns) and inserts that data into a MySQL table.
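A minimal DAG sketch with that operator, assuming the apache-airflow-providers-google package is installed, the target MySQL table already exists, and the usual connection IDs are configured; exact parameter names can differ between provider versions:

```python
# Sketch: copy a BigQuery table straight into MySQL with BigQueryToMySqlOperator.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.transfers.bigquery_to_mysql import BigQueryToMySqlOperator

with DAG(
    dag_id="bq_to_mysql_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    export_to_mysql = BigQueryToMySqlOperator(
        task_id="bq_to_mysql",
        dataset_table="my_dataset.my_result_table",  # BigQuery source as dataset.table
        mysql_table="my_mysql_table",                # target table must already exist in MySQL
        mysql_conn_id="mysql_default",
        gcp_conn_id="google_cloud_default",
        replace=False,                               # True -> REPLACE INTO instead of INSERT
        batch_size=1000,
    )
```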

How to clean data read from a CSV file before streaming insert into a BigQuery table?

I have a CSV file (delimited by | instead of ,) in a Cloud Storage bucket which has dates in a format such as 10022019, but I need to transform them into BigQuery's accepted DATE format of 2019-02-10. Can I achieve this with a Cloud Function that reads and transforms the data at the same time, before it stream-inserts the data into a BigQuery table?
Thanks.
Best regards,
Since you have your data in Cloud Storage, you may consider a Cloud Function to adjust data quality before streaming/loading it into BigQuery.
If you were to use the BigQuery load API, you could consider serverless rule-based data quality adjustment with StorageMirror, followed by rule-based data loading with BqTail.
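A sketch of that first option: a GCS-triggered Cloud Function that reformats the date field and then streams the rows into BigQuery. The table name, column layout, absence of a header row, and the DDMMYYYY reading of 10022019 are all assumptions here:

```python
# Sketch: clean a pipe-delimited CSV from GCS, then stream-insert the rows into BigQuery.
import csv
import io
from datetime import datetime

from google.cloud import bigquery, storage

TABLE_ID = "my-project.my_dataset.my_table"  # assumption: table with (order_date DATE, amount STRING) exists
DATE_COLUMN = 0                              # assumption: the DATE field is the first column


def clean_and_insert(event, context):
    blob = storage.Client().bucket(event["bucket"]).blob(event["name"])
    text = blob.download_as_text()

    rows = []
    for record in csv.reader(io.StringIO(text), delimiter="|"):  # assumes no header row
        # Reformat e.g. 10022019 (DDMMYYYY) -> 2019-02-10, as BigQuery's DATE type expects.
        record[DATE_COLUMN] = datetime.strptime(record[DATE_COLUMN], "%d%m%Y").strftime("%Y-%m-%d")
        rows.append({"order_date": record[0], "amount": record[1]})

    errors = bigquery.Client().insert_rows_json(TABLE_ID, rows)  # streaming insert
    if errors:
        raise RuntimeError(f"Streaming insert failed: {errors}")
```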

Not able to fetch more than 15k rows from BQ table

I have a standalone Java application which I am using to fetch data from a BQ table in the form of a CSV file. My BQ table has more than 50k rows. My Java application is not able to read more than 15k rows into the CSV file. Please suggest a solution.
You can't get more than 128 MB of data unless you're storing the result in a destination table. Perhaps you should save the result in a destination table and then call the Export class, which will produce a CSV file in a Google Cloud Storage bucket.
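That pattern, sketched with the Python client (the equivalent query-to-destination-table-then-extract flow exists in the Java client too); the project, dataset, table, and bucket names below are placeholders:

```python
# Sketch: materialize the query into a destination table, then export it to GCS as CSV shards.
from google.cloud import bigquery

client = bigquery.Client()
destination = "my-project.my_dataset.export_tmp"

# 1) Run the query with an explicit destination table, so the response-size limit no longer applies.
job_config = bigquery.QueryJobConfig(
    destination=destination,
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)
client.query("SELECT * FROM `my-project.my_dataset.big_table`", job_config=job_config).result()

# 2) Export the destination table to Cloud Storage; the wildcard lets BigQuery shard large outputs.
extract_job = client.extract_table(destination, "gs://my-bucket/export/rows-*.csv")
extract_job.result()
```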