BigQuery DR compliance and multi-region support - google-bigquery

I would like to know about BigQuery multi-region support; could you please help me with the following questions.
1. Is multi-region a solution for DR?
2. Is there any manual intervention from the user needed if a disaster occurs? Can we still access the data from the secondary region?
3. What is the time delay to switch the BigQuery service from one region to another?
4. Do we face any downtime or data inconsistency during this time?

Related

Teradata Current CPU utilization (Not User level and no History data)

I want to run heavy extraction, basically for migrating data from Teradata to a cloud warehouse, and would like to check the current overall Teradata CPU utilization (as a percentage) so I can scale the number of extraction processes accordingly.
I know this type of information is available in "dbc.resusagespma", but that looks like historical data rather than the current figures we can see in Viewpoint.
Can we get such run-time information with SQL in Teradata?
This info is returned by one of the PMPC API functions, syslib.MonitorPhysicalSummary; of course, you need EXECUTE FUNCTION rights:
SELECT * FROM TABLE (MonitorPhysicalSummary()) AS t
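As a loose illustration of the question's end goal (checking utilization programmatically before launching more extraction processes), and not part of the original answer, the same query could be run from Python with the teradatasql driver; the connection details below are made up:
import teradatasql
# Hypothetical connection details; replace with your own.
con = teradatasql.connect(host="tdhost", user="dbc", password="secret")
cur = con.cursor()
cur.execute("SELECT * FROM TABLE (MonitorPhysicalSummary()) AS t")
# Print column name / value pairs to see which column carries the CPU figure.
columns = [d[0] for d in cur.description]
for row in cur.fetchall():
    print(dict(zip(columns, row)))
con.close()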

Dynamic query using parameters in Tableau

I am trying to use Tableau to create a web dashboard that interacts with a Postgres database with a fair number of rows.
The key here is that the relevant data falls within latitude/longitude boundaries, so I'm using Tableau parameters in a custom SQL statement to get what I need, like so:
SELECT id, lat, lng... FROM my_table
WHERE lat >= <Parameters.MIN_LAT> AND lat <= <Parameters.MAX_LAT>
AND lng >= <Parameters.MIN_LNG> AND lng <= <Parameters.MAX_LNG>
LIMIT 10000
I'm setting these parameters using the Tableau JavaScript API based on the boundaries of a Google Maps widget. When the map is moved, I refresh the parameters and the data needs to update as well. This refresh is not done constantly, but frequently enough that long wait times are not acceptable.
Because the lat/lng boundaries are dynamic and the full unfiltered table is very large (~1 GB), I presumed it is impractical to create a data extract. Am I wrong?
Furthermore, when I change some of the in-Tableau filters I'm applying, there is a very long wait, as if it is re-executing the query every time, even if the MIN_LAT, MAX_LAT, ... parameters are unchanged.
What's the best way of resolving this? I'm new to Tableau so sorry if I'm missing something super obvious!
Thanks.
The best way of resolving this is to make a query that returns less information (1 GB is too much; an extract can help group data to present dimensions very quickly, but that's it, and if there is nothing to group it will be very large), which permits drilling down to present more information in subsequent steps or dashboard levels.
I am thinking of a field in the database that indicates the zoom level at which to present information.
If you are navigating in Google Maps, first you see the countries, then the capital cities, then the cities, then the small towns, then the local stores...
The key is the zoom level you are at each time.
You may want to look at Tableau's documentation on drill-downs.

BigQuery: "Error running query: Query exceeded resource limits for tier 1. Tier 29 or higher required." in Redash

I would like to know how to increase my billing tier for only one BigQuery query in Redash (not the whole project).
I am getting this error, while trying to refresh the query in Redash:
"Error running query: Query exceeded resource limits for tier 1. Tier 29 or higher required".
According to BigQuery's documentation, there are three ways to increase this limit (https://cloud.google.com/bigquery/pricing#high-compute). However, I am not sure which one is applicable to queries written directly in Redash's query editor. It would be great if you could provide an example.
Thanks for your help.
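For reference (this is not an answer from the thread), the per-query option on that pricing page corresponds to the maximumBillingTier field of a query job's configuration. A minimal sketch with the google-cloud-bigquery Python client, using made-up project and table names and assuming the query is run through the API rather than Redash's editor:
from google.cloud import bigquery
client = bigquery.Client(project="my-project")  # hypothetical project
job_config = bigquery.QueryJobConfig()
job_config.maximum_billing_tier = 29  # raise the per-query billing tier cap
sql = "SELECT user_id, COUNT(*) AS events FROM `my-project.my_dataset.events` GROUP BY user_id"
job = client.query(sql, job_config=job_config)
for row in job.result():
    print(row.user_id, row.events)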

Strange behavior with BigQuery dataset location

I noticed strange behavior on Google Cloud when using BigQuery from Compute Engine VM instances.
I have a Java process that streams data into BigQuery.
I expected better performance by choosing the same region for the BigQuery dataset and the VM instances, but my tests showed unexpected behavior.
CASE 1: VM in us-central1-a, dataset location US: average BigQuery insert response time 150 ms
CASE 2: VM in europe-west1-c, dataset location US: average BigQuery insert response time 700 ms
CASE 3: VM in us-central1-a, dataset location EU: average BigQuery insert response time 1200 ms
CASE 4: VM in europe-west1-c, dataset location EU: average BigQuery insert response time 1700 ms
I can understand the decrease in performance in CASE 2 and CASE 3, but what about CASE 4?
The test shows that if the BigQuery dataset location is "EU", performance decreases even if the VM region is europe-west1-c.
My conclusion is: never use BigQuery in the EU (except, of course, where there are requirements on the location of the data)!
Is there anything wrong with my reasoning?
Thanks for reporting.
Looks like the latency mentioned in the post includes both tables.get() + tabledata.insertAll(). The latency difference is mostly caused by tables.get().
We are aware that calling metadata-related APIs (e.g. tables.get) is slower from the EU than from the US. It is caused by some existing infrastructure limitations, and unfortunately there is no short-term fix for it. But we are actively working on some backend changes to minimize this latency difference in the long term.
A few things you might consider to mitigate this:
Pre-create your tables ahead of time, so there is no need to check table existence every time before an insertAll.
If it is a daily table, maybe try a partitioned table? Then you only need to create the table once. https://cloud.google.com/bigquery/docs/partitioned-tables https://cloud.google.com/bigquery/docs/querying-partitioned-tables
If newly created tables have the same schema as a base table, try streaming to a template table (a rough sketch follows after this list). https://cloud.google.com/bigquery/streaming-data-into-bigquery#template-tables
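The question's process is written in Java, but as a rough sketch of the template-table approach (using the Python client and hypothetical project, table, and field names): stream rows with a template suffix, and BigQuery creates the suffixed table from the pre-created base table's schema, so the client never needs to call tables.get or tables.insert before each insert.
from datetime import datetime, timezone
from google.cloud import bigquery
client = bigquery.Client(project="my-project")  # hypothetical project
rows = [{"user_id": "u1", "ts": datetime.now(timezone.utc).isoformat()}]
suffix = "_" + datetime.now(timezone.utc).strftime("%Y%m%d")  # e.g. one table per day
# Streams into events_template<suffix>; the table is created from the
# base table's schema if it does not exist yet.
errors = client.insert_rows_json("my-project.my_dataset.events_template", rows, template_suffix=suffix)
if errors:
    print("Insert errors:", errors)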

How to use BigQuery Slots

Hi there.
Recently, I wanted to run a query in the BigQuery web UI using GROUP BY over some tables (table names match the pattern xxx_mst_yyyymmdd). The row count will be over 10 million. Unfortunately, the query failed with this error:
Query Failed
Error: Resources exceeded during query execution.
I made some improvements to my query, so the error may not happen this time. But as my data grows, the error will appear again in the future. So I checked the latest BigQuery release notes; maybe there are two ways to solve this:
1. After 2016/01/01, BigQuery will change the query pricing tiers to include "High Compute Tiers", so that the "resourcesExceeded" error will not happen again.
2. BigQuery Slots.
I checked some Google documentation and didn't find a way to use BigQuery Slots. Is there any sample or use case for BigQuery Slots? Or do I have to contact the BigQuery team to enable the feature?
I hope someone can help me answer this question. Thanks very much!
A couple of points:
I'm surprised that a GROUP BY with a cardinality of 10M failed with resources exceeded. Can you provide a job ID of the failed query so we can investigate? You mention that you're concerned about hitting these errors more often as your data size increases; you should likely be able to increase your data size by a few more orders of magnitude without seeing this. Likely you've encountered a bug, or something was strange with your query or your data.
"High Compute Tiers" won't necessarily get rid of resourcesExceeded. For the most part, resourcesExceeded means that BigQuery ran into memory limitations; high compute tiers only address CPU usage. (And note that they haven't been enabled yet.)
BigQuery slots enable you to process data faster and with more reliable performance. For the most part, they also wouldn't help prevent resourcesExceeded errors.
There is currently (as of Nov 5) a bug where you may need to provide an EACH keyword with a GROUP BY. Recent changes should enable BigQuery to automatically select the execution strategy, so EACH shouldn't be needed, but there are a couple of cases where it doesn't pick the right one. When in doubt, add an EACH to your JOIN and GROUP BY operations (a rough sketch follows after these points).
To get your project eligible for using slots, you need to contact support.
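To make the EACH suggestion above concrete, here is a rough sketch (not from the original answer) of a legacy SQL query with the hint, submitted through the Python client; the table name is hypothetical, and in standard SQL the hint is not needed:
from google.cloud import bigquery
client = bigquery.Client(project="my-project")  # hypothetical project
legacy_sql = (
    "SELECT user_id, COUNT(*) AS cnt "
    "FROM [my-project:my_dataset.xxx_mst_20151105] "
    "GROUP EACH BY user_id"  # EACH hints a shuffled (large-cardinality) GROUP BY
)
job_config = bigquery.QueryJobConfig(use_legacy_sql=True)
job = client.query(legacy_sql, job_config=job_config)
for row in job.result():
    print(row.user_id, row.cnt)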