I started to test Google AdWords transfers for BigQuery (https://cloud.google.com/bigquery/docs/adwords-transfer).
I have a few questions for which I cannot find answers anywhere.
Is it possible to edit which columns are downloaded from AdWords to BigQuery? E.g. the Keyword report has only the ad group ID column but not the ad group name.
Or is it possible to decide which tables (i.e. reports) are downloaded? The transfer creates around 60 tables and I need just 5.
DZ
According to the documentation here, the AdWords data transfer stores your AdWords data in a dataset. So the inputs are AdWords customer IDs (at minimum one customer ID) and the output is a collection of datasets.
I think you would need a custom pipeline, e.g. built on Pub/Sub, to store only specific columns or tables in BigQuery.
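As a workaround for the missing ad group name, you can join the keyword stats report to the AdGroup dimension table after the transfer has run. The sketch below assumes the standard transfer table naming (p_KeywordBasicStats_<customer_id> and p_AdGroup_<customer_id>) with placeholder project, dataset, and customer IDs, so verify the names and columns against your own dataset:
SELECT
  s._DATA_DATE AS day,
  s.AdGroupId,
  a.AdGroupName, -- the human-readable name missing from the Keyword report
  s.Impressions,
  s.Clicks
FROM `my_project.adwords.p_KeywordBasicStats_1234567890` AS s
JOIN `my_project.adwords.p_AdGroup_1234567890` AS a
  ON a.AdGroupId = s.AdGroupId
  AND a._DATA_DATE = s._DATA_DATE -- align the dimension snapshot with the stats day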
We are using the Google Ads transfer in BigQuery to ingest our Google Ads data. One thing I have noticed when querying the results is that all of the metrics are exactly 156x the values we would expect in the Google Ads UI (cost, clicks, etc.).
We have tested multiple transfers and each time we hit this same issue. The transfer process seems pretty straightforward, but am I missing something? Has anyone else noticed a similar issue, or have any ideas of what to adjust in the data transfer?
For which tables do you notice this behavior?
The dimension tables, such as Customer, Campaign, and AdGroup, are exported every day and are therefore partitioned by day.
Could this be causing your duplication? If you join a stats table to a dimension table without restricting the dimension to a single snapshot, every metric row is repeated once per snapshot day, so, for example, 156 days of snapshots would inflate your metrics by exactly 156x.
You only need the latest partition/day.
For example, this is how I get the latest account/customer data:
SELECT
  -- Casting the IDs to STRING so the BI reporting tool treats them as dimensions rather than metrics
  CAST(customer_id AS STRING) AS account_id, -- globally unique, see also: https://developers.google.com/google-ads/api/docs/concepts/api-structure
  customer_descriptive_name,
  customer_auto_tagging_enabled,
  customer_currency_code,
  customer_manager,
  customer_test_account,
  customer_time_zone,
  _DATA_DATE AS date, -- source table is partitioned on date
  _LATEST_DATE,
  CASE WHEN _DATA_DATE = _LATEST_DATE THEN TRUE ELSE FALSE END AS is_most_recent_record
FROM
  `YOURPROJECTID.google_ads.ads_Customer_YOURID`
WHERE
  _DATA_DATE = _LATEST_DATE
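Along the same lines, here is a minimal sketch of joining a stats table to its dimension table without duplicating metrics. The table and column names follow the same transfer naming pattern as above (with the same placeholders), so verify them against your own dataset:
-- Restrict the dimension table to its latest snapshot before joining,
-- otherwise every stats row is repeated once per snapshot day
SELECT
  c.campaign_name,
  SUM(s.metrics_clicks) AS clicks,
  SUM(s.metrics_cost_micros) / 1e6 AS cost
FROM `YOURPROJECTID.google_ads.ads_CampaignBasicStats_YOURID` AS s
JOIN `YOURPROJECTID.google_ads.ads_Campaign_YOURID` AS c
  ON c.campaign_id = s.campaign_id
  AND c._DATA_DATE = c._LATEST_DATE -- only the newest snapshot of the dimension
GROUP BY campaign_name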
I am aware that Google Analytics can be linked to BigQuery using the BigQuery Linking feature in GA.
But I experienced the drawback that the export is scheduled at a random time. This messed up my table with dependencies on the GA data, which I build at 9 AM using dbt: if the GA data is updated after 9 AM, my table won't have today's GA data.
My questions are:
Is there a way to schedule the GA data update at a constant time, like a cron job?
Or, if there is not, is there a way for dbt to run the job only after the GA data has been updated in BigQuery?
Unfortunately Google provides no SLA on the BigQuery export from Google Analytics 3. If you have the option, the best solution would be to migrate to Google Analytics 4, which has an almost real-time export to BigQuery and appears to be much more robust. Find out more on the official Google support page.
I currently get around this by using event-based triggers that look at the metadata of a table, or check for the existence of yesterday's sharded table, and then kick off the downstream jobs; I'm sure you could achieve something similar with dbt.
Here is some example SQL code which checks for the existence of yesterday's Google Analytics sharded table by returning the maximum timestamp:
SELECT
  MAX(CAST(PARSE_DATE('%Y%m%d', SUBSTR(table_id, 13)) AS TIMESTAMP)) AS max_date
FROM `my_ga_dataset.__TABLES__`
WHERE table_id LIKE '%ga_sessions_%'
  AND table_id NOT LIKE '%intraday%'
  AND PARSE_DATE('%Y%m%d', SUBSTR(table_id, 13)) >= CURRENT_DATE() - 9
This works for sharded tables. If you want to use table metadata to get the date/time of the last table update, you can use INFORMATION_SCHEMA:
https://cloud.google.com/bigquery/docs/information-schema-tables
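Alternatively, here is a minimal sketch that reads the last-modified time from the same __TABLES__ meta-table used in the query above, since it exposes last_modified_time directly (the dataset name is a placeholder):
SELECT
  table_id,
  TIMESTAMP_MILLIS(last_modified_time) AS last_modified -- epoch milliseconds to TIMESTAMP
FROM `my_ga_dataset.__TABLES__`
WHERE table_id LIKE 'ga_sessions_%'
ORDER BY last_modified DESC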
We are using Google Data Studio to create mobile analytics reports from Google Firebase data, linked with BigQuery. We have a data source in the report which uses a query to pull data from BigQuery - SELECT * FROM table.events_* - which returns 69.4 GB of data (verified in the validator). The problem is that when we create a report using this query, for each report we are charged for 'BigQuery Analysis' in tebibytes, which is way too much. But when we calculate the pricing for the query ourselves, it is not even $1 for the data that we use.
Not sure why the data is processed in tebibytes. Here are some details about the table in BigQuery:
Table size: 246.69 MB
Number of columns: 57
We tried the query with some filters as well, but it still processes tebibytes of data for a single report. Is reducing or filtering the number of columns the only way to restrict the data processed? What operations come under 'BigQuery Analysis' (other than query processing)?
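For illustration, the filtered queries we tried were roughly along these lines, restricting both the column list and the wildcard suffix (project, dataset, and column names below are placeholders):
SELECT
  event_date,
  event_name,
  COUNT(*) AS events
FROM `my_project.analytics_123456789.events_*`
WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230107' -- limits which daily shards are scanned
GROUP BY event_date, event_name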
Your help is greatly appreciated. Thanks in advance.
Our organization is in e-commerce, and users are looking to change a filter every day with a different list of items; none of the users will have their own license, just read-only access. The data is connected through Google BigQuery. Is there a way to have this bulk filter upload capability without the license owners having to touch the filter each time?
Example
Product ID is the filter
Monday: they have a list of 10,000 IDs they want to check sales for.
Tuesday: they have a new list of 4,000 different IDs they want to check sales for.
Without clicking each ID each time, is there a way to just upload a list (CSV, Google Sheet, etc.)?
We thought users could upload a list of product IDs to a Google Sheet which maps to a BigQuery table; we could then join it with the sales table and get the relevant data. However, this becomes unmanageable when we have more than one user, as users might step on each other's data.
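For context, the join we had in mind looks roughly like this; all project, dataset, table, and column names are placeholders, and the uploaded_by column is a hypothetical way to keep each user's list separate:
SELECT
  s.product_id,
  SUM(s.sales_amount) AS total_sales
FROM `my_project.ecommerce.sales` AS s
JOIN `my_project.ecommerce.uploaded_product_ids` AS f
  ON f.product_id = s.product_id
WHERE f.uploaded_by = 'user@example.com' -- hypothetical column so users do not clobber each other's lists
GROUP BY s.product_id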
Any suggestions/recommendations are welcome. Our team is pretty new to Tableau, so let me know if any additional details are needed.
Have you tried changing the filter type to "Multi Values (custom list)" and then having the report user paste their list into the filter?
I added all my AdWords accounts via the automatic transfer to BigQuery. All AdWords tables are now in BigQuery. But today I realized that the same tables from different accounts have a different column count.
Here is one example:
AdWordsAccount1.p_Customer_21530XXX --> 11 columns
AdWordsAccount2.p_Customer_23450XXX --> 12 columns
In this example, AdWordsAccount2 contains the column "AccountTimeZoneId" (description: "Deprecated by AdWords API v201702. Please use AccountTimeZone instead."). Why is this column missing in the other table?
This is only one example; other tables like Campaign, AdGroup, etc. also have different column counts.
I am looking forward to your help!
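For reference, one way to see exactly which columns differ is to compare the two schemas via INFORMATION_SCHEMA.COLUMNS. The project name below is a placeholder, and the dataset and table names match the example above:
-- Columns present in account 2's Customer table but missing from account 1's
SELECT column_name
FROM `my_project.AdWordsAccount2.INFORMATION_SCHEMA.COLUMNS`
WHERE table_name = 'p_Customer_23450XXX'
  AND column_name NOT IN (
    SELECT column_name
    FROM `my_project.AdWordsAccount1.INFORMATION_SCHEMA.COLUMNS`
    WHERE table_name = 'p_Customer_21530XXX'
  )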