Export Billing Data to BigQuery not Working - google-bigquery

To track BQ usage, we created a new dataset and configured it in Billing export. But even after waiting a day, the dataset is still empty: no new tables have been created.
Is there any other setup that needs to be done for this to work?
For reference, we followed this link:
https://cloud.google.com/billing/docs/how-to/export-data-bigquery
Thanks and regards,
Gour

You just need to follow the How to enable billing export to BigQuery steps to begin using this functionality. Keep in mind that you have to wait some time before you start seeing your data, as mentioned in the Export Billing Data to BigQuery documentation:
After you enable BigQuery export, it might take a few hours to start seeing your data. Billing data automatically exports your data to BigQuery in regular intervals, but the frequency of updates in BigQuery varies depending on the services you're using.
If you continue having this issue, I recommend taking a look at the Issue Tracker tool, which you can use to raise a BigQuery ticket and have the Google Technical Support Team verify this scenario. Since this is an automated process, you might need their help to review your project's internal configuration.
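Before raising a ticket, it can help to confirm whether any export table has appeared in the dataset. The sketch below is only an illustration: it assumes the export tables follow the naming pattern gcp_billing_export_v1_<BILLING_ACCOUNT_ID> (or gcp_billing_export_resource_v1_<BILLING_ACCOUNT_ID> for the detailed export), with the dashes in the billing account ID replaced by underscores. The function names are hypothetical, and the account ID in the usage note is a placeholder.

```python
import re

# Assumed naming convention for Cloud Billing export tables; verify it against
# the current documentation before relying on it.
BILLING_TABLE_PATTERN = re.compile(
    r"^gcp_billing_export_(resource_)?v1_[0-9A-Za-z_]+$"
)


def expected_table_id(billing_account_id):
    """Build the table ID assumed for the standard usage cost export."""
    return "gcp_billing_export_v1_" + billing_account_id.replace("-", "_")


def find_billing_export_tables(table_ids):
    """Return the table IDs that look like billing export tables."""
    return [t for t in table_ids if BILLING_TABLE_PATTERN.match(t)]
```

The list of table IDs could come from the BigQuery web UI or, if the google-cloud-bigquery client library is installed, from something like [t.table_id for t in client.list_tables("my-project.my_billing_dataset")], where the project and dataset names are placeholders. An empty result after a day or so suggests the export configuration itself is worth rechecking.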

Related

linking GA4 project to bigquery - streaming vs daily

I recently linked my GA4 property to BigQuery to get a better look at the analytics data. The export was initially set to daily, so every day the data was exported from Google Analytics to BigQuery. However, I decided that streaming is necessary, so I switched from daily to streaming in the BigQuery Linking section of GA4's admin tab. That streaming data is still not showing up after a few hours. I'm wondering if anyone has done this and run into similar problems. Do I need to recreate an entire BigQuery project?
If you look at your configuration options for GA to BigQuery, you will see a message under the streaming option.
This option will take effect after the next date boundary (tomorrow) for this property.
Here, "this option" refers to the "Data exported continuously" setting (streaming).
You will probably see your data tomorrow.

How to pre-process BigQuery data coming from Stackdriver

I am currently exporting logs from Stackdriver to BigQuery using sinks. But I am only interested in the jsonPayload. I would like to ignore pretty much everything else.
But since the table creation and data insertion happen automatically, I could not do this.
Is there a way to preprocess data coming from the sink so that only what matters is stored?
If the answer is no, is there a way to run a cron job each day to copy yesterday's data into a separate table and then remove the original? (The tables are named using timestamps, which makes it possible to query them by day.)
As far as I know, neither of the options mentioned is currently possible on GCP. On my end, I also tried to reproduce your request internally and found that there isn't a way to filter on the jsonPayload alone.
I would therefore suggest creating a feature request for this on the following public issue tracker link. Note that feature requests do not have an ETA for when they'll be processed or whether they'll be implemented.

How to be notified for high costs of queries in BigQuery?

I have a project in BigQuery where many people update/add Views.
Others access Views/Tables from third-party software such as Tableau.
I have no control over, for example, whether the analyst who wrote the query in Tableau used the table's partition or not.
Is it possible somehow to ask BigQuery to send an email for each query that passes a threshold, for example 20 GB? Then I can check that specific query and user to see if it's OK or not. (I'm not forcing partition use, as it's not always what we need.)
I know that it's possible to use the Stackdriver Logging export to download logs into BigQuery tables / storage but I don't see anything there that can tell me if query passed this specific criteria.
There are different solutions available, but the best one is to use Cloud Pub/Sub topics and a Cloud Function:
Enable programmatic notifications to receive Cloud Pub/Sub messages with the current status of your budget
Programmatic Budgets Notification Examples
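As a rough sketch of what the Cloud Function side of this could look like, the code below decodes a budget notification and checks whether spend has reached the budget. It assumes the Pub/Sub message carries base64-encoded JSON with costAmount and budgetAmount fields, and that the function receives it as event["data"]; the function names and the ratio parameter are hypothetical. Note that these notifications report the budget status as a whole rather than the cost of an individual query.

```python
import base64
import json


def parse_budget_notification(event):
    """Decode the base64-encoded JSON payload of a Pub/Sub budget message."""
    raw = base64.b64decode(event["data"]).decode("utf-8")
    return json.loads(raw)


def is_over_budget(event, ratio=1.0):
    """Return True when the reported cost has reached ratio * budget.

    Assumes the payload contains numeric 'costAmount' and 'budgetAmount'.
    """
    payload = parse_budget_notification(event)
    cost = float(payload["costAmount"])
    budget = float(payload["budgetAmount"])
    return cost >= ratio * budget
```

A Cloud Function triggered by the topic could call is_over_budget(event) and then send an email or chat message through whatever notification channel the team already uses.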

Removing BigQuery Public Dataset

Does anyone know of any way to remove the public datasets from a BigQuery project?
Though the risk is very low, I don't want my users to be able to run queries against them and rack up costs.
Thanks
It's an old question, but for those who just want to unpin "bigquery-public-data" to tidy up the resources list: click the name in the side panel, then on the far right of the info pane there is an "unpin project" button. Click that.
The whole point of public datasets is that everyone has access to them so they can test BigQuery. Even if a feature request will create the option to disable the listing in the panel of the BigQuery web UI, the users will still have access and could query the public datasets.
It will be more practical to use custom quotas.
So you would create a project with a number of users that share a quota that you consider enough for their activities. When the established quota is reached BigQuery stops and the users receive an error message when trying to run queries.
Another useful tool is budget alerts, with a threshold that you can set based on the previous month's spend. The alert will notify you when the project's bill has reached the amount you set and can save you from bad surprises.
In addition, enabling Audit Logs in your project will give a comprehensive overview of BigQuery operations. Check this example of an Audit Logs query that gives details on the queries performed. Of course, you will only find out about the use of a public dataset after it happens, but this will show which user ran the query, and you can then reinforce the administration policy of not querying public datasets. To get information on the query performed, including the dataset queried, use this field when querying the Audit Logs:
'protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobConfiguration.query.query'
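Once the query text from that field has been fetched, flagging the queries that reference the public project can be done with a simple scan. This is a minimal sketch, assuming each audit log row has already been loaded into a dict; the keys "user" and "query" and the function name are hypothetical and would need to match however the rows are actually retrieved.

```python
def queries_touching_public_data(rows, marker="bigquery-public-data"):
    """Return (user, query) pairs whose query text mentions the marker.

    Each row is assumed to be a dict with hypothetical keys 'user' and
    'query', where 'query' holds the text from the audit log field above.
    """
    hits = []
    for row in rows:
        query_text = row.get("query") or ""
        # Case-insensitive substring match covers both the legacy
        # [project:dataset.table] and the standard `project.dataset.table` forms.
        if marker in query_text.lower():
            hits.append((row.get("user"), query_text))
    return hits
```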
As a last resort, you can create a designated project for your users to query the public datasets and, to make sure it will not incur additional costs, remove its billing account. By doing so, however, you can only query 1 TB of data per month, which is the BigQuery always-free usage tier.
Also keep in mind these best practices for limiting query costs.
If you close the current tab, the public dataset will disappear from the Google BigQuery page.

Using BigQuery for logs analysis

I'm trying to do log analysis with BigQuery. Specifically, I have an App Engine app and a JavaScript client that will be sending log data to BigQuery. In BigQuery, I'll store the full log text in one column but also extract important fields into other columns. I then want to be able to run ad hoc queries over those columns.
Two questions:
1) Is BigQuery particularly good or particularly bad at this use case?
2) How do I set up revolving logs? I.e., I want to store only the last N logs or the last X GB of log data. I see that delete is not supported.
Just so you know, there is an excellent demo of moving App Engine Log data to BigQuery via App Engine MapReduce called log2bq (http://code.google.com/p/log2bq/)
Re: "use case" - Stack Overflow is not a good place for judgements about best or worst, but BigQuery is used internally at Google to analyse really really big log data.
I don't see the advantage of storing full log text in a single column. If you decide that you must set up revolving "logs," you could ingest daily log dumps by creating separate BigQuery tables, perhaps one per day, and then delete the tables when they become old. See https://developers.google.com/bigquery/docs/reference/v2/tables/delete for more information on the Table.delete method.
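The one-table-per-day approach could be driven by a small helper that decides which tables have aged out. This is only a sketch under assumptions not stated above: the tables are assumed to be named with a fixed prefix followed by a YYYYMMDD date, and the prefix and function name are hypothetical.

```python
from datetime import date, datetime, timedelta


def tables_to_delete(table_ids, prefix, keep_days, today):
    """Return the daily tables whose date suffix is older than keep_days.

    Tables are assumed to be named <prefix>YYYYMMDD; anything else is ignored.
    """
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for table_id in table_ids:
        if not table_id.startswith(prefix):
            continue
        suffix = table_id[len(prefix):]
        if len(suffix) != 8 or not suffix.isdigit():
            continue  # not a daily table
        try:
            table_date = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:
            continue  # eight digits, but not a valid calendar date
        if table_date < cutoff:
            expired.append(table_id)
    return expired
```

Each returned table ID could then be passed to the Table.delete method mentioned above, for example from a daily cron job.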
After implementing this - we decided to open source the framework we built for it. You can see the details of the framework here: http://blog.streak.com/2012/07/export-your-google-app-engine-logs-to.html
If you want your Google App Engine (Google Cloud) project's logs to be in BigQuery, Google has added this functionality, built in to the new Cloud Logging system. It is a beta feature known as "Logs Export".
https://cloud.google.com/logging/docs/install/logs_export
They summarize it as:
Export your Google Compute Engine logs and your Google App Engine logs to a Google Cloud Storage bucket, a Google BigQuery dataset, a Google Cloud Pub/Sub topic, or any combination of the three.
We use the "Stream App Engine Logs to BigQuery" feature in our Python GAE projects. This sends our app's logs directly to BigQuery as they are occurring to provide near real-time log records in a BigQuery dataset.
There is also a page describing how to use the exported logs.
https://cloud.google.com/logging/docs/export/using_exported_logs
When you want to query logs exported to BigQuery over multiple days (e.g. the last week), you can use a SQL query with a FROM clause like this:
FROM
(TABLE_DATE_RANGE(my_bq_dataset.myapplog_,
DATE_ADD(CURRENT_TIMESTAMP(), -7, 'DAY'), CURRENT_TIMESTAMP()))
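The clause above uses BigQuery legacy SQL. For projects that use standard SQL, a roughly equivalent clause can be written with a wildcard table and a _TABLE_SUFFIX filter. The helper below is a hedged sketch that only builds the query text; the function name is hypothetical, and it assumes the daily tables carry a YYYYMMDD suffix as in the example above.

```python
def last_n_days_from_clause(dataset, table_prefix, days):
    """Build a standard SQL FROM/WHERE fragment covering the last `days` days.

    Assumes daily tables named <table_prefix>YYYYMMDD inside `dataset`.
    """
    return (
        f"FROM `{dataset}.{table_prefix}*`\n"
        "WHERE _TABLE_SUFFIX BETWEEN\n"
        f"  FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL {int(days)} DAY))\n"
        "  AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())"
    )
```

Calling last_n_days_from_clause("my_bq_dataset", "myapplog_", 7) yields a fragment that can follow a SELECT list in a standard SQL query.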