BigQuery Data Location Setting - google-bigquery

Is there a way to determine the "BigQuery Data Location Setting", similar to the "Cloud Storage Data Location Setting" or "Datastore Data Location Setting"?
Apparently there are some legal and tax issues for companies operating outside of the US when using services hosted in the US. Our legal guys have asked me to configure the BigQuery location to be in the EU, but I couldn't find where to configure this.
Thanks

There isn't currently a way to locate your BigQuery data in the EU. Right now, all of it is located in the United States.
That said, one of the reasons why this hasn't been done yet is a lack of customer interest in EU datacenters. If you have a relationship with Google Cloud support and want this feature, please let them know. Alternatively, vote up the question and we'll take that into account when we prioritize new features.

This appears to have changed now, so you can actually select the EU datacenter:
http://techcrunch.com/2015/04/16/google-opens-cloud-dataflow-to-all-developers-launches-european-zone-for-bigquery/

Another issue arises when you want to copy datasets from one region to another, which is not currently possible (at least directly). Here is how you can check the location of your dataset. Open up a Google Cloud Shell and enter this command:
bq show --format=prettyjson {PROJECT_ID}:{DATASET_NAME} | grep location
However, note that you cannot edit the location. You will need to back up or export all your tables, delete the dataset, and recreate the dataset with the desired location.
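As a rough sketch of the same check in Python, assuming you have captured the JSON printed by `bq show --format=prettyjson`: the dataset resource carries a top-level `location` field, which is what the `grep` is matching. The function name `dataset_location` and the sample payload are made up for illustration.

```python
import json


def dataset_location(bq_show_json: str) -> str:
    """Return the location recorded in `bq show --format=prettyjson` output."""
    resource = json.loads(bq_show_json)
    # Dataset resources carry a top-level "location" field such as "US" or "EU".
    return resource["location"]


# Hypothetical, trimmed-down sample of what the command prints for a dataset.
sample = """
{
  "datasetReference": {"datasetId": "my_dataset", "projectId": "my-project"},
  "id": "my-project:my_dataset",
  "kind": "bigquery#dataset",
  "location": "EU"
}
"""

print(dataset_location(sample))
```

Parsing the JSON rather than grepping it avoids accidental matches on other fields whose names or values happen to contain the word "location".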

Related

BigQuery Error loading location is interrupting scheduled queries

A few days ago, I started receiving an error in my Scheduled Queries dashboard: "Error loading location europe-west8: BigQuery Data Transfer Service does not yet support location: europe-west8."
I'm in the US, so all 4 of my storage buckets are set to US or REGION, and I have confirmed their locations.
Datasets are all US.
Scheduled queries are all Region "us"
Since this error started, my BigQuery Scheduled Queries that append data to tables have stopped running.
Where can I change the setting that seems to be calling europe-west8?
You need to check the region of the dataset you are using. The destination table for your scheduled query must be in the same region as the data being queried.
You can see the locations where scheduled queries are supported here.
You specify a location for storing your BigQuery data when you create a dataset. After you create the dataset, the location cannot be changed, but you can copy the dataset to a different location, or manually move (recreate) the dataset in a different location.
You can see more information about how locations work in BigQuery here.
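The same-location rule can be expressed as a small pre-flight check. This is only a sketch: `same_location` and `check_scheduled_query` are hypothetical helpers, and comparing case-insensitively is an assumption based on the same location being shown as "US" in some places and "us" in others.

```python
def same_location(source_location: str, destination_location: str) -> bool:
    """True when both datasets report the same BigQuery location."""
    # Assumption: "US" and "us" refer to the same location, so normalize case.
    return source_location.strip().lower() == destination_location.strip().lower()


def check_scheduled_query(source_location: str, destination_location: str) -> None:
    """Raise early instead of letting the scheduled query fail at run time."""
    if not same_location(source_location, destination_location):
        raise ValueError(
            f"source is in {source_location!r} but destination is in "
            f"{destination_location!r}; they must match"
        )


check_scheduled_query("US", "us")  # passes silently
```

Note that a multi-region such as "US" and a single region such as "us-central1" are treated as different locations here, which matches how the mismatch is reported.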
EDIT
This is a known issue with the BigQuery UI. The engineering team is aware of it and is working towards a solution, although so far there isn't a specific ETA. Feel free to star the issue to raise further awareness of it.
There are two possible workarounds you can try to circumvent this:
Workaround #1
Use the old UI, which you can do by clicking on "Disable editor tabs".
Workaround #2
In the Scheduled Query Editor, click the SCHEDULE dropdown and choose "Enable scheduled queries".
An overlay shows up with the message box ("Enable scheduled queries").
Click anywhere on the screen to close the overlay.
Click the SCHEDULE dropdown again, and the create/update options are there.
If you are running scheduled queries, check that the processing location is set to the location of your data source and that the destination table is in the correct location too.
Check the docs about setting a query location:
https://cloud.google.com/bigquery/docs/scheduling-queries

dataset was not found in location EU (ga_sessions data)

I have the following problem.
I've set up the BigQuery export by adjusting the linking in the Google Analytics console.
As expected, a dataset was created in BigQuery storage, and data flows into it on a daily basis.
The timezone and country in the GA account are Germany, but the location of the resulting dataset in BQ is US (although I didn't specify it when I was linking the data). That causes some issues when combining the data from this property with the other data I have in storage.
My questions are:
Can someone please explain why it could have happened?
Is there any solution except copying the whole dataset to the new location?
Are there any other potential problems with having the dataset in a different location from the other datasets (apart from not being able to query them together)?
Really appreciate your help!
Thanks in advance
The data is located in US because it is the default location for the GA export data to BQ feature. It is documented here (Step 2.1).
Consider localizing your dataset to the E.U. at this step.
Data is geolocated in the U.S. by default. Localizing your data to the EU after the initial export can cause issues with querying across BigQuery regions. Resolving those issues may require a transfer of data, which has associated costs. We recommend creating the E.U.-localized dataset at this point in order to avoid any negative side effects.
Google Analytics BigQuery Export is incompatible with GCP policies that prevent dataset creation in the US. If you have such a policy on your GCP project, you will have to remove it to export your data to the EU.
The only way to have the GA data in the EU is to copy the whole dataset to a new one located in the EU with a different name, then delete the original one, re-create it in the EU, and copy the data back into it.
Your main blocker is the one you mentioned: you won't be able to JOIN GA data with any other data located in another region. In addition, there might be legal issues because of the GDPR.

Joining Ads Data in Ads Data Hub with GA360 Data in BigQuery

I need to find a way to SQL-join my GA360 tables in BigQuery (BQ) with data in Ads Data Hub (ADH).
I already know how to query tables from BQ within ADH:
SELECT *
FROM `projectname.table_name`
But I can't find any resources on what matching key to use in the JOIN statement:
SELECT
*
FROM
adh.*** AS adh_data
adh_data LEFT JOIN ???
ON ga360.??? = ???
I read through this https://developers.google.com/ads-data-hub/guides/join-your-data
But it's not really clear to me what to get/use from it and I couldn't find any information on this topic anywhere.
Thank you in advance!
AFAIK, ADH doesn't currently allow querying across Google Analytics datasets (which would already be in ADH's "clean room" if they wanted you to be able to make such queries).
Your best option might be to (A) make sure that you're capturing first-party IDs in your Google Analytics implementation and (B) ensure those IDs are also captured in your CRM platforms as they interact with your properties. The assumption is that your CRM can capture, along with that ID, any Google Analytics-related data you may find useful, though I don't think it will be log level.
From there, with "onboarding" of sorts, you may eventually be able to drop your CRM data into ADH-queryable tables, which can be joined (per the "join your data" link you shared). After that you're at Google's mercy for the most part, but I think that's the path you're looking for.
PS: Google may have some solutions with guides that include some useful example queries regarding join keys across CM/DV/GoogleAds tables, and they may be high quality, but they may not be EXACTLY what you're looking for... It's entirely possible they are not publicly available though...

Google Cloud Big Query Scheduled Queries weird error relating JURISDICTION

All my datasets, tables, and ALL items inside BQ are in the EU. When I try to create a view-to-table scheduled query that runs every 15 minutes, I get an error regarding my location, which is incorrect, because both source and destination are in the EU...
Does anyone know why?
There is a known transient issue matching your situation, and the GCP support team needs more time for troubleshooting. There may be a potential issue in the UI. I would ask you to try the following steps:
Firstly, try to make the same operation in Chrome's incognito mode.
Another possible workaround is trying to follow this official guide using a different approach than the UI (CLI for instance).
I hope it helps.

Is it possible to change the region of a Google Cloud Platform project?

If I go to the Google Developer Console then I can see all my Cloud Platform projects, but not their regions.
How do I see the region of each project? And is it possible to change the region once it has been set?
Thanks for any help.
There is no such thing as a region of a GCP project.
In other words, region/location is specific to resources, and a GCP project is not permanently tied to a single region/location.
For example, you can have a project with multiple BigQuery datasets in different regions.
That same project can have many Compute Engine instances running, each in a different location/region.
There is a default region set per GCP project, but it can always be overridden when creating resources, and it is mainly used to guess the default location when none is specified in API calls.
Regarding the BigQuery aspect of this question:
The data location is set on the dataset when it is created, its tables inherit it, and it is immutable once set.
In order to change the location, the easiest solution would be to export the data to Google Cloud Storage, delete the dataset, re-create the dataset in the correct region, then import the data.
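The export, delete, re-create, import sequence could be scripted roughly as follows. This sketch only builds the `bq` command lines and does not run them. Because location is a dataset-level property, it deletes and re-creates the dataset. The helper name `relocation_commands` and the project, dataset, table, and bucket names are placeholders, Avro is an arbitrary choice of export format, and moving the exported files between buckets in different locations (which may be needed) is left out.

```python
from typing import List


def relocation_commands(project: str, dataset: str, tables: List[str],
                        bucket: str, new_location: str) -> List[List[str]]:
    """Build (but do not run) the bq commands for moving a dataset."""
    def uri(table: str) -> str:
        # Wildcard so that large tables can be exported as several files.
        return f"gs://{bucket}/{dataset}/{table}/*.avro"

    commands = []
    # 1. Export every table to Cloud Storage.
    for table in tables:
        commands.append(["bq", "extract", "--destination_format=AVRO",
                         f"{project}:{dataset}.{table}", uri(table)])
    # 2. Delete the dataset together with its tables.
    commands.append(["bq", "rm", "-r", "-f", "-d", f"{project}:{dataset}"])
    # 3. Re-create the dataset in the desired location.
    commands.append(["bq", f"--location={new_location}", "mk", "--dataset",
                     f"{project}:{dataset}"])
    # 4. Load the exported files back in.
    for table in tables:
        commands.append(["bq", "load", "--source_format=AVRO",
                         f"{project}:{dataset}.{table}", uri(table)])
    return commands


for command in relocation_commands("my-project", "my_dataset", ["events"],
                                   "my-bucket", "EU"):
    print(" ".join(command))
```

Printing the commands first makes it possible to review them, and to confirm that the exports succeeded, before anything destructive is run.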
https://cloud.google.com/appengine/docs/python/console/#server-location
Setting the server location
When you create your project, you can specify the location from which it will be served. In the new project dialog, click on the link to Show Advanced Options, and select a location from the pulldown menu:
us-central
us-east1
europe-west
If you select us-east1 your project will be served from a single region in South Carolina. The us-central and europe-west locations contain multiple regions in the United States and western Europe, respectively. Projects deployed to either us-central or europe-west may be served from any one of the regions they contain. If you want to colocate your App Engine instances with other single-region services, such as Google Compute Engine, you should select us-east1.