I have a BigQuery account which is accessed by many internal and external users/service accounts. Recently, with the growth of bill, we started researching how to increase the visibility of how much cost is going to each user/service account.
I know there is a way to get this info somewhere through the BigQuery API but I was wondering if there is any other easy way to get this info. Has anybody had a similar problem?
Restating the question: how to track how much data each BigQuery user/service account has processed?
Use BigQuery's audit logs (via Stackdriver) to track access and cost details as described in the docs.
A good tip is to export the logs back to BigQuery for analysis.
https://cloud.google.com/bigquery/audit-logs
Related
How to find out BigQuery cost for a project programmatically. Is there an API to do that?
Also, is it possible to know the user level cost details for queries made?
To track individual costs for a BigQuery project, you can redirect all logs back to BigQuery - and then you can run queries over these logs.
https://cloud.google.com/bigquery/docs/reference/auditlogs
These logs include who ran the query, and how much data was scanned.
Another way is using the INFORMATION_SCHEMA table, check this post:
https://www.pascallandau.com/bigquery-snippets/monitor-query-costs/
You can use Cloud Billing API
For example
[GET] https://cloudbilling.googleapis.com/v1/projects/{projectsId}/billingInfo
I have a project in BigQuery where many people update/add Views.
Other access Views/Tables from 3rd party softwares like Tableau.
I have no control for example if the Analysit who wrote the query in Tableau used the Partition of the table or not.
Is it possible somehow to ask BigQuery to send email for each query that passes threshold? For example 20GB. Then I can check this specific query and user to see if it's OK or not (I'm not forcing partition as it's not always what we need)
I know that it's possible to use the Stackdriver Logging export to download logs into BigQuery tables / storage but I don't see anything there that can tell me if query passed this specific criteria.
There are different solutions available but the best is using Cloud Pub/Sub topics and piece of Cloud Function:
Enable programmatic notifications to receive Cloud Pub/Sub messages with the current status of your budget
Programmatic Budgets Notification Examples
To track BQ usage we created a new dataset and configured it in Billing export. But after waiting for a day also the dataset seems to be empty as no new tables is created.
Is there any other setup needs to be done for this to work.
Refer this link,
https://cloud.google.com/billing/docs/how-to/export-data-bigquery
Thanks and regards,
Gour
You just need to follow the How to enable billing export to BigQuery steps to begin using this functionality. Keep in mind that you have to wait certain time to start seeing your data, as mentioned in the Export Billing Data to BigQuery documentation.
After you enable BigQuery export, it might take a few hours to start seeing your data. Billing data automatically exports your data to BigQuery in regular intervals, but the frequency of updates in BigQuery varies depending on the services you're using.
In case you continue having this issue, I recommend you to take a look the Issue Tracker tool that you can use to raise a BigQuery ticket in order to verify this scenario with the Google Technical Support Team. Since this is an automated process, you might need some of their help to review your Project's internal configuration.
I would like to know if there is a method in the BigQuery API or any other way where i can list all the queries made and their processed bytes. Something like what is listed in the Activity Page but with the processedBytes field:
https://console.cloud.google.com/home/activity?project=coherent-server-125913
We are having a problem with billing. Suddenly our BigQuery Analysis Costs have increased a lot and we think we are being charged like 20 times more than expected (we check all the responses from BigQuery API and save the processedBytes field, taking into account that the minimum charge is of 10MB).
The only way we can solve this difference is listing all the requests and comparing to our numbers to see if we arenĀ“t measuring something or if we are doing something wrong. We have opened a billing support ticket and they have redirected me to Stackoverflow for asking the question as they think that is a technical issue.
Thanks in advance!
Instead of checking totalBytesProcessed - you should try checking totalBytesBilled and billingTier (see here)
You might jumped to high billing tiers - just guess
The best place to check would be the BigQuery logs.
This is going to tell you what queries were run, who ran them, what date/time they were run, the total bytes billed etc.
Logs can be a bit tedious to look through but BigQuery allows you to stream BigQuery logs into a BigQuery table and you can then query said table to identify expensive queries.
I've done this and it works really well to give you visibility on your BQ charges. The process of how to do this is outlined in more detail here: https://www.reportsimple.com.au/post/google-bigquery
Is it feasible in hybris to analysis and store what the user/end customer is upto in the page? For example: is it feasible to just collect a report of what the user has clicked in the page and what the user is viewing?
All i need is a report of user actions. Please help.
It's probably feasible but likely a bad idea. An Ecommerce platform should be real responsive to well... sales. All that extra user data in your database system is going to bring it to a crawl.
That said:
The reporting module could probably be extended to do this. I would siphon the data collected off to a separate reporting database.
Whats "better":
Using Google Analytics with the B2C Accelerator.
Whats "best":
Something like Adobe's Sitecatalyst. Generally if you can afford hybris, you can likely afford Sitecataylst.