Does Google Cloud Dataprep support importing Google Drive Sheets as data sources? - google-bigquery

I'm importing datasets into Google Cloud Dataprep (by Trifacta) to transform my data sources, but I can't see my Google Drive Sheets in the list after connecting them in the BigQuery console. I plan to use them as rules for my transformations.
I've already created another dataset and the problem persists.
Is it possible to import them, or is this not supported yet?
Thanks,

You are right. According to the documentation, Dataprep only supports native BigQuery tables and views as BigQuery sources.
You could try downloading your Drive sheets as CSV and creating a BigQuery table from the file, or you could copy your external table into a new native table by running a query that writes to a destination table, such as:
SELECT * FROM my_dataset.my_external_table
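As a minimal sketch of that second suggestion, the query can be wrapped in a `CREATE TABLE ... AS SELECT` statement so that its result lands in a native table. The helper name `build_materialize_sql` and the destination `my_dataset.my_native_table` are hypothetical; only `my_dataset.my_external_table` comes from the answer above.

```python
import re

# Conservative check for "dataset.table" or "project.dataset.table" names.
# This pattern is an assumption for the sketch; real BigQuery names allow more.
_TABLE_ID = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){1,2}$")


def build_materialize_sql(external_table: str, native_table: str) -> str:
    """Return a statement that copies an external table into a native one."""
    for name in (external_table, native_table):
        if not _TABLE_ID.match(name):
            raise ValueError(f"unexpected table name: {name!r}")
    return (
        f"CREATE OR REPLACE TABLE `{native_table}` AS\n"
        f"SELECT * FROM `{external_table}`"
    )


if __name__ == "__main__":
    # Hypothetical destination name; paste the output into the BigQuery console.
    print(build_materialize_sql("my_dataset.my_external_table",
                                "my_dataset.my_native_table"))
```

The native table is a snapshot, so the statement has to be run again whenever the sheet changes.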

Related

Export BigQuery output to Google Cloud Storage

Our organization has data in Google Bigtable, hosted by our vendor. We want to run jobs in BigQuery that query Bigtable and export the data to Cloud Storage as .csv files, without storing the data as a dataset in BigQuery.
We do not want to store the data in BigQuery datasets because we do not use BigQuery for analysis; all analysis is done with an on-premises analytics solution.
Is this possible?
You have a few options; the best solution would be to automate the process with Cloud Workflows.
The steps I see would be:
1. Export from Bigtable in Avro or Parquet format to Cloud Storage. There is a gcloud way and an API way to do this, described here.
2. Import the exported files into BigQuery. There is a way to do this with the bq CLI tool and with the API, described here.
3. Export from BigQuery to multiple CSV files, as documented here.
4. The export produces multiple CSV files, which you can then merge with the compose command (for example, gsutil compose).
All of the above can be done in Cloud Workflows. Each call can be implemented either via the API (preferred) or with the command-line tools, using Cloud Build triggers for example. For Workflow syntax, you can get guidance from this article and from the content linked in its footer section.
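The export step from BigQuery to CSV files can also be written as an `EXPORT DATA` statement. The sketch below only builds that statement; the helper name `build_export_sql`, the bucket `my-bucket`, and the table `my_dataset.my_table` are hypothetical placeholders, not names from the answer.

```python
def build_export_sql(select_sql: str, gcs_uri: str, header: bool = True) -> str:
    """Return an EXPORT DATA statement that writes query results as CSV shards."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("destination must be a gs:// URI")
    if gcs_uri.count("*") != 1:
        # BigQuery replaces the single wildcard with a shard number.
        raise ValueError("destination URI must contain exactly one '*'")
    if "'" in gcs_uri:
        raise ValueError("destination URI must not contain quotes")
    header_literal = "true" if header else "false"
    return (
        "EXPORT DATA OPTIONS (\n"
        f"  uri = '{gcs_uri}',\n"
        "  format = 'CSV',\n"
        "  overwrite = true,\n"
        f"  header = {header_literal}\n"
        ") AS\n"
        f"{select_sql.strip().rstrip(';')}"
    )


if __name__ == "__main__":
    # Hypothetical bucket and table names.
    print(build_export_sql("SELECT * FROM my_dataset.my_table",
                           "gs://my-bucket/exports/part-*.csv"))
```

If the shards are merged afterwards with a compose command, which concatenates objects byte for byte, passing `header=False` avoids a header row repeated in the middle of the merged file.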

Is it possible to automate the extract of Apple News / iCloud News Publisher analytics data?

I'm trying to set up a dashboard in Google Data Studio with Apple News analytics data as one of the sources.
I can see you can download this analytics data manually as a CSV - does anyone know a way of automating this extract? Automatically appending the data weekly to a BigQuery table would be ideal, or Google Sheets or directly into Data Studio if not.
Thanks.
You can load your CSV into BigQuery [1], or schedule a load job, and then use it in Data Studio through the BigQuery connector. Otherwise, if you do not need to append the data, you can simply import it with another connector such as "Custom JSON/CSV/XML" by Supermetrics.
[1] https://cloud.google.com/bigquery/docs/loading-data#supported_data_formats
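A minimal sketch of the load step only, assuming the weekly CSV has already been copied to a Cloud Storage bucket (downloading it from Apple News is not covered here). It builds a `bq load` command line; the helper name `build_bq_load_command`, the table `my_dataset.apple_news`, and the bucket path are hypothetical.

```python
import shlex
from typing import List


def build_bq_load_command(table: str, gcs_uri: str) -> List[str]:
    """Return the argument list for a bq load job that appends one CSV file."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("source must be a gs:// URI")
    return [
        "bq", "load",
        "--source_format=CSV",
        "--skip_leading_rows=1",  # assumes the file has a single header row
        "--autodetect",           # let BigQuery infer the schema
        table,
        gcs_uri,
    ]  # no --replace flag, so rows are appended to an existing table


if __name__ == "__main__":
    # Hypothetical table and object names.
    cmd = build_bq_load_command("my_dataset.apple_news",
                                "gs://my-bucket/apple-news/latest.csv")
    print(shlex.join(cmd))
```

The printed command could then be run on a weekly schedule, for example from cron.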

Can we import data from BigQuery into Google Sheets?

I can only find instructions for loading data into my BigQuery database, and I can't find any wiki or post that answers my question: is it possible to load data from a BigQuery database into an external destination such as Google Sheets (possibly going through Cloud Storage)?
Yes, you can export data from BigQuery. One way to bring BigQuery data into Google Sheets is the Google Sheets data connector, which you open from within the Sheets interface.
Once connected, you can then write an SQL query and pull this data directly into a Google Sheet.
Here's a link to some official documentation on this data connector: https://support.google.com/docs/answer/9077536

How to refresh google drive data source - Google Big Query

I have a question about refreshing a BigQuery table whose data source is Google Drive.
Imagine you have a CSV file on Google Drive that someone updates for you every day.
1. The filename does not change
2. The location URI stays the same
How can I refresh my BigQuery table from this Google Drive file?
Could you please guide me or send me related links?
Thanks
From the BigQuery docs:
Loading data into BigQuery from Google Drive is not currently
supported, but you can query data in Google Drive using an external
table.
The link above provides instructions on how to create an external table that references your data source stored in Drive. Since you want to query a Google Drive file that keeps being updated in Drive, this is the solution you are looking for (in contrast to downloading your CSV locally and then loading it into BigQuery, in which case you would have to update the data directly in BigQuery).
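As a rough sketch of such an external table, the helper below builds a `CREATE EXTERNAL TABLE` statement that points at a CSV file in Drive. The function names, the table `my_dataset.drive_csv`, and the file ID are hypothetical, and the sketch assumes the file has one header row.

```python
def drive_uri(file_id: str) -> str:
    """Return the Drive URI form used for external tables."""
    if not file_id or not file_id.replace("-", "").replace("_", "").isalnum():
        raise ValueError(f"unexpected Drive file ID: {file_id!r}")
    return f"https://drive.google.com/open?id={file_id}"


def build_external_table_sql(table: str, file_id: str, skip_rows: int = 1) -> str:
    """Return DDL for an external table backed by a CSV file in Drive."""
    return (
        f"CREATE OR REPLACE EXTERNAL TABLE `{table}`\n"
        "OPTIONS (\n"
        "  format = 'CSV',\n"
        f"  uris = ['{drive_uri(file_id)}'],\n"
        f"  skip_leading_rows = {int(skip_rows)}\n"
        ")"
    )


if __name__ == "__main__":
    # Hypothetical table name and file ID.
    print(build_external_table_sql("my_dataset.drive_csv", "FILE_ID"))
```

Because an external table reads the file at query time, each query sees the current contents of the Drive file, so no separate refresh step is needed.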

google-cloud-dataflow : How to read data from a Database and write to BigQuery

I need to set up a data pipeline that reads from source databases such as Oracle and MySQL and loads the data into BigQuery.
How can I use google-cloud-dataflow to read data from a database (over a JDBC connection) and write it to BigQuery tables using Python?
Also, I have some Hive tables in an on-premises Hadoop cluster. How do I transfer this data to BigQuery?
I couldn't find the right documentation or examples to achieve this.
Can you please point me in the right direction?
I applied a solution for this in my project. You need to follow these steps:
1. Export the data from Google Cloud SQL to Google Cloud Storage as CSV by following this link.
2. Load the CSV data from Google Cloud Storage directly into BigQuery by following this link.