If I pull the data from BigQuery, will Google charge me or not for sending the data to Data Studio? - google-bigquery

If I pull the data from BigQuery, will Google charge me for sending the data to Data Studio?

That depends. BigQuery uses a consumption-based model unless you have purchased slots: every query you run uses resources and is charged at the on-demand rate of $5 per TB of data scanned.
There are a few caveats, however: the first TB of data scanned per month is free, and not every query will scan data, since results may be served from cache. If you are concerned about the associated cost, one option is the BigQuery sandbox; it will not charge you, but it comes with limited functionality.
https://cloud.google.com/bigquery/docs/quickstarts/quickstart-cloud-console
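A quick way to see whether a rerun was served from cache (and therefore billed nothing) is to inspect the finished job. Below is a minimal sketch assuming the google-cloud-bigquery Python client; the project, dataset, and table names are placeholders.

    # Inspect cache hits and bytes billed after running a query
    # (pip install google-cloud-bigquery). Table name is a placeholder.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = "SELECT COUNT(*) AS n FROM `my_project.my_dataset.my_table`"

    job = client.query(sql)   # runs the query, or serves it from cache
    job.result()              # wait for completion

    print("cache hit:", job.cache_hit)                    # True -> nothing billed
    print("bytes processed:", job.total_bytes_processed)
    print("bytes billed:", job.total_bytes_billed)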

BigQuery runs queries that you pay for.
Data Studio runs queries on BigQuery that you also pay for.
There is no transfer cost between the two systems.

Related

Use case for BigQuery as a database backend for a website: thoughts

Members,
Currently we synchronise sales data into BigQuery, which allows us to build fast, detailed, practically real-time reports of all kinds of stats that we otherwise would not have available. We want a website that can use these reports and present this information to website users.
Some specs:
Users consume the data read-only
We want to do the analysis on request: as soon as a user opens the page, we query BigQuery and the user sees their stats depending on the query
The stats can change due to external sources, but the result will often be identical; I assume BigQuery would cache the query
The average query processes about 100 MB of data and the whole backend takes >2 seconds to respond (user request, query, returned result set), so performance is what we care about
Why I have doubts:
BigQuery is generally not advised for this
Could it get out of hand?
The dataset will keep growing, but we will need to keep using all historical data in any case
It would be an option to load aggregated data into another database for the main calls, but that would not give me a 'real-time' experience.
I would love to hear your thoughts.
Given your requirements, you can consider BigQuery as an option: it is fully managed and supports analytics over petabyte-scale data, so it will be able to handle large amounts of data. BigQuery is designed for OLAP workloads, so analysis can be performed on request. BigQuery also uses cached query results, so repeated queries can be served from cache and return quickly.
If your dataset is very large and keeps growing, you can create partitioned tables to store, manage, and query your data efficiently. Since BigQuery is a fully managed service, it will handle the load automatically if your data grows out of hand. Historical data can be stored and accessed; if you do not need it forever, you can set an expiration time on the table or its partitions, and review the storage options that best fit your requirements.
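As a rough illustration, here is a sketch of creating a day-partitioned table and querying it with a partition filter, assuming the google-cloud-bigquery Python client; the project, dataset, and column names are placeholders, not part of the original answer.

    # Create a day-partitioned table; queries that filter on the partition
    # column only scan matching partitions, which limits on-demand cost.
    from google.cloud import bigquery

    client = bigquery.Client()

    table = bigquery.Table(
        "my_project.sales_dataset.sales_stats",
        schema=[
            bigquery.SchemaField("sale_date", "DATE"),
            bigquery.SchemaField("store_id", "STRING"),
            bigquery.SchemaField("revenue", "NUMERIC"),
        ],
    )
    table.time_partitioning = bigquery.TimePartitioning(
        type_=bigquery.TimePartitioningType.DAY,
        field="sale_date",
        # expiration_ms=... would auto-delete old partitions; omit it if
        # all historical data must be kept, as in this use case.
    )
    client.create_table(table)

    sql = """
    SELECT store_id, SUM(revenue) AS revenue
    FROM `my_project.sales_dataset.sales_stats`
    WHERE sale_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY store_id
    """
    for row in client.query(sql).result():
        print(row.store_id, row.revenue)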

Google BigQuery backfill takes very long

I am new to Stack Overflow. I use Google BigQuery to connect data from multiple sources together. I have made a connection to Google Ads (using a BigQuery data transfer) and this works well. But when I run a backfill of older data, it takes more than 3 days to get 180 days of data into BigQuery. Google advises 180 days as the maximum per request, but it takes so long. I want to do this for the past 2 years and for multiple clients (we are an agency), so I need to do it in chunks of 180 days.
Does anybody have a solution for this taking so long?
Thanks in advance.
According to the documentation, BigQuery Data Transfer Service supports a maximum of 180 days (as you said) per backfill request and simultaneous backfill requests are not supported [1].
BigQuery Data Transfer Service limits the maximum rate of incoming requests and enforces appropriate quotas on a per-project basis [2], and other BigQuery tasks in the project may be limiting the amount of resources available to the transfer. Load jobs created by transfers are included in BigQuery's quotas on load jobs, so it is important to consider how many transfers you enable in each project to prevent transfers and other load jobs from producing quotaExceeded errors.
If you need to increase the number of transfers, you can create additional projects.
If you want to speed up the transfers for all your clients, you could split them across several projects, since it sounds like a significant number of transfers that you are going to run.
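For splitting a long backfill into 180-day chunks programmatically, here is a hedged sketch using the google-cloud-bigquery-datatransfer Python client; the transfer config name and dates are placeholders, and since simultaneous backfills are not supported you would wait for each chunk's runs to finish before requesting the next.

    # Request a 2-year Google Ads backfill in 180-day chunks
    # (pip install google-cloud-bigquery-datatransfer). Placeholder names.
    import datetime
    from google.cloud import bigquery_datatransfer

    client = bigquery_datatransfer.DataTransferServiceClient()

    # "projects/<project>/locations/<location>/transferConfigs/<config-id>"
    transfer_config = "projects/my-project/locations/eu/transferConfigs/123456"

    end = datetime.datetime(2023, 1, 1, tzinfo=datetime.timezone.utc)
    start = end - datetime.timedelta(days=730)   # roughly 2 years back
    chunk = datetime.timedelta(days=180)         # documented per-request maximum

    cursor = start
    while cursor < end:
        chunk_end = min(cursor + chunk, end)
        response = client.start_manual_transfer_runs(
            request={
                "parent": transfer_config,
                "requested_time_range": {
                    "start_time": cursor,
                    "end_time": chunk_end,
                },
            }
        )
        print(f"requested {cursor:%Y-%m-%d}..{chunk_end:%Y-%m-%d}: "
              f"{len(response.runs)} run(s)")
        # In practice, poll these runs until they complete before moving on,
        # since simultaneous backfill requests are not supported.
        cursor = chunk_end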

BQ: How to check query cost every time

Is it possible to show an alert or message popup every time I run a query in the BQ GUI? I am afraid of running up too much query cost.
I hope BQMate has this function.
Sometimes the cost of a query can only be determined when the query has finished, e.g. with federated tables and the newly released clustered tables. If you're concerned about cost, the best option is to set the Maximum Bytes Billed option; then you can be sure you'll never be charged for more than that. You can set a default value for this option in your project, but right now you have to contact support to have it set.
A fast way to get a query cost estimate is to check the amount of data to be processed, shown by the query validator on the right side of the screen, or to perform a dry run (see the sketch after the two options below). You have two options to calculate the cost:
Manually: query pricing is documented per amount of data processed, so you can sum and multiply: 1 TB free per month, then $5 per extra TB. If you expect to query more than 1 TB of data per month, sum the data used by your queries to know when costs start to accrue.
Automatically: use the online pricing calculator, which is available for all Google Cloud Platform products.
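A minimal sketch of both ideas (dry-run estimation and a hard Maximum Bytes Billed cap), assuming the google-cloud-bigquery Python client; the table name is a placeholder.

    # Estimate a query's cost with a dry run, then cap what it may bill.
    from google.cloud import bigquery

    client = bigquery.Client()
    sql = "SELECT name FROM `my_project.my_dataset.my_table` LIMIT 1000"

    # 1) Dry run: nothing executes or is billed, but BigQuery reports
    #    how many bytes the query would process.
    dry_cfg = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    dry_job = client.query(sql, job_config=dry_cfg)
    tb = dry_job.total_bytes_processed / 1024 ** 4
    print(f"would scan {tb:.4f} TB, ~${tb * 5:.2f} at $5/TB on-demand")

    # 2) Hard cap: the job fails instead of billing more than this limit.
    run_cfg = bigquery.QueryJobConfig(maximum_bytes_billed=10 * 1024 ** 3)  # 10 GB
    rows = client.query(sql, job_config=run_cfg).result()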
If you want to set custom cost controls, have a look at the custom quotas documentation, since they are not enabled by default. Cost controls can be applied at the project level or the user level by restricting the number of bytes billed. Nowadays you have to submit a request from the Google Cloud Platform Console to have them set, in 10 TB increments. If usage exceeds a set quota, the error message is quite clear and differs depending on whether the project or the user quota was exceeded. For a project quota:
Custom quota exceeded: Your usage exceeded the custom quota for
QueryUsagePerDay, which is set by your administrator. For more information,
see https://cloud.google.com/bigquery/cost-controls
With no remaining quota, BigQuery stops working for everyone in that project.
If you want to continuously monitor billing data for BigQuery, there is a tutorial that explains how to create a billing dashboard using Data Studio.
I don't know about BQMate, since that is a tool from Vaint Inc.

Google BigQuery for real-time call record data

I am thinking of using Google BigQuery to store real-time call records, around 3 million rows inserted per day and never updated.
I have signed up for a trial account and run some tests.
I have a few concerns before I can go ahead with development:
When streaming data via PHP, it sometimes takes around 10-20 minutes for rows to appear in my tables; this is a show-stopper for us because network support engineers need this data in real time to troubleshoot quality issues.
Partitions: we can store data in daily partitions, but a single day's partition is about 2.5 GB, which pushes my query costs into the thousands per month. Is there any other way to bring down the cost here? Partitioning per hour would help, but there is no such support available.
If not BigQuery, what other solutions are out there on the market that can deliver similar performance and solve these problems?
You have the "streaming insert" option, which makes records searchable within a few seconds (it has its price).
See: streaming-data-into-bigquery
Check table-decorators for limiting the query scan.
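As an illustration of both points, here is a sketch assuming the google-cloud-bigquery Python client and an ingestion-time day-partitioned table (the table and field names are placeholders); the filter on _PARTITIONTIME is the standard-SQL way to limit the scan, similar in spirit to the legacy table decorators mentioned above.

    # Stream call records so they are queryable within seconds, then query
    # only today's partition to limit the bytes scanned.
    import datetime
    from google.cloud import bigquery

    client = bigquery.Client()
    table_id = "my_project.telecom.call_records"   # day-partitioned table

    rows = [
        {
            "call_id": "abc-123",
            "caller": "+15551234567",
            "callee": "+15557654321",
            "started_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
            "duration_s": 42,
        },
    ]
    errors = client.insert_rows_json(table_id, rows)   # streaming insert
    if errors:
        print("insert errors:", errors)

    # Query only today's partition instead of scanning the whole table.
    sql = f"""
    SELECT caller, callee, duration_s
    FROM `{table_id}`
    WHERE DATE(_PARTITIONTIME) = CURRENT_DATE()
    """
    for row in client.query(sql).result():
        print(row.caller, row.callee, row.duration_s)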

Google BigQuery basic questions

These may be a few basic questions.
When I load data into BQ tables, where exactly is the data stored (if billing is already enabled)? If it is a data center, what is the data center capacity? Does our data co-exist with other users' data?
When we fire queries, how are our queries processed? What is the default compute engine used for this?
How can we increase query processing capacity?
Thanks
CP
BigQuery datacenter capacity is practically unlimited. If you plan to upload petabytes in a very short time frame you might need to contact support first just to make sure, but for normal big loads everything should be fine.
BigQuery doesn't use Compute Engine, but a series of very large clusters where all queries run. That's the secret to a low cost per query, without ongoing per-hour costs like other alternatives.
BigQuery increases the number of CPUs involved in your query elastically as the query needs. You don't need to manage storage nor processing capacity.