How to check price of executed queries in BigQuery? - sql

Is there a way to check price for queries I executed on BigQuery?
I know I can see the estimate before running a query (e.g. "This query will process 5.2 GB when run."), and I know the rough pricing (1 TB ~ $5), but I would actually like to see how much I paid for the exact queries I have already run (the price per executed query).

You can check the billed bytes (as opposed to processed bytes) in the BigQuery UI: in the Classic UI under "Details", and in the new UI under "Job Information".

Related

How to interpret query process GB in Bigquery?

I am using a free trial of Google bigquery. This is the query that I am using.
select *
from `test`.events
where subject_id = 124
  and id = 256064
  and time >= '2166-01-15T14:00:00'
  and time <= '2166-01-15T14:15:00'
  and id_1 in (3655,223762,223761,678,211,220045,8368,8441,225310,8555,8440)
This query is expected to return at most 300 records and not more than that.
However, I see a message like the one below.
But the table this query operates on is really huge. Does the reported figure indicate the table size? I did, however, run this query multiple times a day.
Because of that, I got the error below:
Quota exceeded: Your project exceeded quota for free query bytes scanned. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors
How long do I have to wait for this error to go away? Is the daily limit 1 TB? If yes, then I didn't even use 400 GB.
How to view my daily usage?
If the quota can be edited, can you let me know which option I should be editing?
Can you help me with the above questions?
According to the official documentation
"BigQuery charges for queries by using one metric: the number of bytes processed (also referred to as bytes read)", regardless of how large the output is. This means that if you run a query that scans a full 1 TB table, you will supposedly be charged about $5, even though the final output may be very small.
Note that, due to storage optimizations BigQuery performs internally, the bytes processed might not equal the raw size of the table when you created it.
For the error you're seeing, navigate in the Google Cloud Console to "IAM & admin" and then "Quotas", where you can search for quotas specific to the BigQuery service.
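The bytes-processed pricing can be sanity-checked with a few lines. This is a minimal sketch, assuming the $5 per TiB on-demand rate quoted in this thread; the function name is just illustrative:

```python
# Estimated on-demand cost from bytes processed, at the $5/TiB rate quoted above.
def query_cost_usd(bytes_processed, usd_per_tib=5.0):
    """Return the estimated on-demand cost for a query, in USD."""
    return usd_per_tib * bytes_processed / 2**40

# A query that scans a full 1 TiB table costs ~$5 regardless of output size.
print(round(query_cost_usd(2**40), 2))        # 5.0
# The 5.2 GB estimate from the question above comes out to about 2.5 cents.
print(round(query_cost_usd(5.2 * 2**30), 4))  # 0.0254
```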
Hope this helps!
Flavien

Teradata Current CPU utilization (Not User level and no History data)

I want to run a heavy extraction, basically a migration of data from Teradata to a cloud warehouse, and would like to check the current overall CPU utilization of Teradata (as a percentage) so I can scale the number of extraction processes accordingly.
I know this type of information is available in "dbc.resusagespma", but that looks like history data, not the current figures we can see in Viewpoint.
Can we get such run-time information with the help of SQL in Teradata?
This info is returned by one of the PM/PC API functions, syslib.MonitorPhysicalSummary; of course, you need Execute Function rights:
SELECT * FROM TABLE (MonitorPhysicalSummary()) AS t

BigQuery Count Appears to be Processing Data

I noticed that running SELECT count(*) FROM myTable on my larger BQ tables yields long running times, upwards of 30-40 seconds, despite the validator claiming the query processes 0 bytes. This doesn't seem right when 500 GB queries run faster. Additionally, total row counts are listed under Details -> Table Info. Am I doing something wrong? Is there a way to get total row counts instantly?
When you run a count, BigQuery still needs to allocate resources (such as slot units, shards, etc.). You might be reaching some limit that causes a delay; for example, the default number of slots per project is 2,000.
The BigQuery execution plan provides very detailed information about the process, which can help you better understand the source of the delay.
One way to overcome this is to use an approximate method described in this link.
This slide deck by Google might also help you.
For more details, see this video about how to understand the execution plan.

Confirmation of how to calculate bigquery query costs

I want to double-check what I need to look at when assessing query costs for BigQuery. I've found the quoted price per TB here, which says $5 per TB, but $5 per precisely 1 TB of what? I have been assuming up until now (before it seemed to matter) that the relevant number is the one the BigQuery UI outputs above the results, so for this example query:
...in this case 2.34 GB. As a fraction of a terabyte, multiplied by $5, this would cost around 1.1 cents, assuming I'd used up my free allowance for the month.
Can anyone confirm that I'm correct? I'm checking this before I process something that could, for once, rack up non-negligible costs. I should say I've never been stung with a sizeable BigQuery bill before; it seems difficult to do.
Can anyone confirm that I'm correct?
Confirmed
Please note: the BigQuery UI in fact uses a dry run, which only estimates Total Bytes Processed. The final cost is based on Bytes Billed, which reflects some nuances, such as a minimum of 10 MB billed per table involved in the query. You can see more details here - https://cloud.google.com/bigquery/pricing#on_demand_pricing
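The bytes-billed nuance can be sketched in a few lines. This is an assumption-laden sketch: it assumes on-demand billing rounds each referenced table's scanned bytes up to the nearest 1 MB, with a 10 MB minimum per table, per the pricing page linked above; the function name and the use of decimal megabytes are my own choices:

```python
import math

MB = 10**6  # assuming decimal megabytes, as billing pages typically use

def billed_bytes(scanned_bytes_per_table):
    """Approximate bytes billed: each table rounded up to 1 MB, minimum 10 MB."""
    return sum(max(10 * MB, math.ceil(b / MB) * MB) for b in scanned_bytes_per_table)

# A query touching one tiny 3 KB table is billed 10 MB, not 3 KB.
print(billed_bytes([3_000]))       # 10000000
# A 25.4 MB scan of a single table is rounded up to 26 MB.
print(billed_bytes([25_400_000]))  # 26000000
```

This is why Bytes Billed can exceed the dry-run estimate, especially for queries over many small tables.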
I know I am late but this might help you.
If you are pushing your audit logs to another dataset, you can run the query below against that dataset.
WITH data as
(
SELECT
protopayload_auditlog.authenticationInfo.principalEmail as principalEmail,
protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent AS jobCompletedEvent
FROM
`administrative-audit-trail.gcp_audit_logs.cloudaudit_googleapis_com_data_access_20190227`
)
SELECT
principalEmail,
FORMAT('%9.2f',5.0 * (SUM(jobCompletedEvent.job.jobStatistics.totalBilledBytes)/POWER(2, 40))) AS Estimated_USD_Cost
FROM
data
WHERE
jobCompletedEvent.eventName = 'query_job_completed'
GROUP BY principalEmail
ORDER BY Estimated_USD_Cost DESC
Reference: https://cloud.google.com/bigquery/docs/reference/auditlogs/

How to get cost for a query in BQ

In BigQuery, how can we get the cost for a given query? We are doing a lot of high-compute queries -- https://cloud.google.com/bigquery/pricing#high-compute -- which often multiplies the data processed by 2 or more.
Is there a way to get the "Cost" of a query with the result set?
For the API or the CLI, you could use the flag --dry_run, which validates the query instead of running it, like so:
cat ../query.sql | bq query --use_legacy_sql=False --dry_run
Output:
Query successfully validated. Assuming the tables are not modified,
running this query will process 9614741466 bytes of data.
For costs, just divide the total bytes by 1024^4 (to get terabytes), multiply the result by 5, then multiply by the billing tier you are in, and you have the expected cost ($0.043 in this example).
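That arithmetic can be checked with a short sketch (the tier handling simply follows the multiply-by-tier rule stated above; the function name is illustrative):

```python
def dry_run_cost_usd(bytes_processed, billing_tier=1, usd_per_tib=5.0):
    """Turn a --dry_run byte count into an estimated on-demand cost in USD."""
    return (bytes_processed / 1024**4) * usd_per_tib * billing_tier

# The 9614741466 bytes from the dry-run output above, at billing tier 1:
print(round(dry_run_cost_usd(9614741466), 4))  # 0.0437, i.e. the ~$0.043 quoted
```

A higher-compute query at billing tier 2 would simply double that figure.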
If you already ran the query and want to know how much it processed, you can run:
bq show -j (job_id of your query)
And it will return Bytes Billed and Billing Tier (it looks like you still have to do the math for the cost computation yourself).
For WebUI, you can install BQMate and it already estimates costs for you (but you still have to adapt for your Billing Tier).
As a final recommendation: sometimes it's possible to greatly improve performance just by optimizing how the query processes data (at our company, several high-compute queries now process data at the normal tier simply by using features such as ARRAYs and STRUCTs, for instance).