how to raise Google Big Query daily query quota - google-bigquery

We are running a batch process, and hitting the daily query quota of 20,000.
Is there a way to raise the limit?
thanks.

The query-per-day limit (currently 40k / day) is one that we're generally happy to raise. In general, it is there to prevent abuse scenarios (people who use BigQuery as a calculator, as in SELECT 17 + 32). If you're running real queries over non-trivial sized data, we will almost certainly be willing to raise this ceiling.
If you've got a contact with Google Cloud Support, please let them know your project ID.
If you do not have a support contact, you can indicate your project ID here, or e-mail me (tigani#google.com) and I will route the request appropriately.

Related

Allowing many users to view stale BigQuery data query results concurrently

If I have a BigQuery dataset with data that I would like to make available to 1000 people (where each of these people would only be allowed to view their subset of the data, and is OK to view a 24hr stale version of their data), how can I do this without exceeding the 50 concurrent queries limit?
In the BigQuery documentation there's mention of 50 concurrent queries being permitted which give on-the-spot accurate data, which I would surpass if I needed them to all be able to view on-the-spot accurate data - which I don't.
In the documentation there is mention of Batch jobs being permitted and saving of results into destination tables which I'm hoping would somehow allow a reliable solution for my scenario, but am having difficulty finding information on how reliably or frequently those batch jobs can be expected to run, and whether or not someone querying results that exist in those destination tables is in itself counting towards the 50 concurrent users limit.
Any advice appreciated.
Without knowing the specifics of your situation and depending on how much data is in the output, I would suggest putting your own cache in front of BigQuery.
This sounds kind of like a dashboading/reporting solution, so I assume there is a large amount of data going in and a relatively small amount coming out (per-user).
Run one query per day with a batch script to generate your output (grouped by user) and then export it to GCS. You can then break it up into multiple flat files (or just read it into memory on your frontend). Each user hits your frontend, you determine which part of the output to serve up to them and respond.
This should be relatively cheap if you can work off the cached data and it is small enough that handling the BigQuery output isn't too much additional processing.
Google Cloud Functions might be an easy way to handle this, if you don't want the extra work of setting up a new VM to host your frontend.

List all the queries made to BigQuery with their processed bytes

I would like to know if there is a method in the BigQuery API or any other way where i can list all the queries made and their processed bytes. Something like what is listed in the Activity Page but with the processedBytes field:
https://console.cloud.google.com/home/activity?project=coherent-server-125913
We are having a problem with billing. Suddenly our BigQuery Analysis Costs have increased a lot and we think we are being charged like 20 times more than expected (we check all the responses from BigQuery API and save the processedBytes field, taking into account that the minimum charge is of 10MB).
The only way we can solve this difference is listing all the requests and comparing to our numbers to see if we arenĀ“t measuring something or if we are doing something wrong. We have opened a billing support ticket and they have redirected me to Stackoverflow for asking the question as they think that is a technical issue.
Thanks in advance!
Instead of checking totalBytesProcessed - you should try checking totalBytesBilled and billingTier (see here)
You might jumped to high billing tiers - just guess
The best place to check would be the BigQuery logs.
This is going to tell you what queries were run, who ran them, what date/time they were run, the total bytes billed etc.
Logs can be a bit tedious to look through but BigQuery allows you to stream BigQuery logs into a BigQuery table and you can then query said table to identify expensive queries.
I've done this and it works really well to give you visibility on your BQ charges. The process of how to do this is outlined in more detail here: https://www.reportsimple.com.au/post/google-bigquery

How do I find out how much of my 1TB free monthly allotment I've used in Google BigQuery?

I feel like there must be some way to figure out how much of free 1TB is left besides summing up the "bytes processed" amounts for every single query. But I haven't been able to find it anywhere in the console or elsewhere.
Unfortunately this is not super easy right now. The best way to get the answer is, as you said, to sum up the bytes processed for the queries you've run.
You can get the data via the BQ jobs.list API, or you can use BQ's audit logs if that's more convenient. The audit logs can even be queried in BQ, but of course that incurs additional usage. :-)
You can also see your unbilled usage for GCP services via the GCP console. However, this only shows BQ usage from the free 1 TB tier once you've incurred some actual BQ charges (i.e., once you've gone over the 1 TB), which makes it less useful for your particular use case.
To expand on Jeremy's answer, you might get that information also directly via the INFORMATION_SCHEMA.JOBS_BY_* views as I explained in my answer to How can I monitor incurred BigQuery billings costs (jobs completed) by table/dataset in real-time? based on How to monitor query costs in Google BigQuery
SELECT
(SUM(total_bytes_processed) * 5) / (1024 * 1024 *1024 *1024),
FROM
`region-us`.INFORMATION_SCHEMA.JOBS_BY_USER
Please consider the caveats mentioned in the linked answers (caching, region-specific costs per TB, ...)!

How to circumvent BigQuery's 20 concurrent queries limitation?

Wondering if anyone knows... or have ran into this.. there's a 20 concurrent queries limitation for BigQuery.
https://developers.google.com/bigquery/quota-policy#queries
Is there a way to disable the limit? Our MapReduce tasks needs many concurrent queries in order to complete within a reasonable amount of time.
We have a similar problem. There is no way how to change this from your side. Also "upgrading your plan" as #Dominik suggest won't help.
You have to contact Google directly, explain your problem (business case) and if it is valid they can increase your quota limits (for certain Google Cloud project)

Bloomberg API request limit

Is there anyway to determine how many requests or how much data you have in your remaining request limit amount for Bloomberg API?
from Bloomberg HelpDesk on April 2014 (this is valid for a basic desktop client):
We have 3 kind of limits..
You can have no more that 3500 real time fields open at the same time.
If you exceed this limit you will see "NA Limit" as error message and
you just need to delete some securities/ fields in order for the error
message to disappeared and to see the values.
We have also a daily limit. The Daily API limit is 500,000 hits/per
day. A "hit" is defined as one request for a single security/field
pairing. Therefore, if you request static data for 5 fields and 10
securities, that will translate into a total of 50 hits. so try to
refresh just the portion of the spreadsheet that really needs to be
refreshed and avoid refreshing it all or reopen it many times a day.
The last limit is a monthly limit. Our monthly limits comes from a
proprietary model. Only about 0.4% of our user database ever goes over
this limit. This limit is based on unique securities and depends on
the type of data being downloaded. For example some of the data on the
system such as intra-day is valued a little bit higher than historical
end of day for any given list of securities. We do not recommend more
than 5000 to 7000 unique identifiers per month and the limit upgrade
will only allow you to get data to complete your project. Once a
security is used once in a month then if you use it again it will not
count again towards the monthly limit.
We normally grant 2 resets per month in case you exceed your daily
limit and if you exceed your monthly limit we grant 1 extension per
month (10% more), if you breach the limit again you will then need to
wait for the midnight for the daily limit to be reset automatically or
the end of the month for the reset of the monthly.
Bloomberg do not state what the explicit limits are, and there is no programmatic way of finding out what the limits are or what proportion of your limits you have used.
The best information from Bloomberg that I have found is on the WAPI page (in the terminal). On the menus on the LHS, go to WAPI Home > API Resources > API Data Limits. There are two pages, 'Extended Rules and Usage Limits' and 'Managing Your API Data Limits' that shed some further light on the matter.
Broadly speaking... there is a daily limit of individual data requests (i.e. security/field pairs - but duplicates are counted for each request). However, your limit for subscriptions is based on the number of securities you are subscribing to concurrently - i.e. if you expect to be requesting the price of a security every 5 mins, you are much better off subscribing to that security's price. Then there is a monthly limit that is based on the number of unique securities that you are making requests for.
there is an upper limit on Bloomberg API, 500,000 hits per day.
-- information from Bloomberg Help Help
The daily limit is clearly stated - it is the monthly limit that is not to my knowledge disclosed in writing. I have been told the following in the context of discussions about Data Licence, which is one Bloomberg product for bulk data subscription. The monthly limit is expressed as a budget in $, and it is the equivalent price for your requests, priced under the Data Licence schema, which clearly is not secret if you enquire about that product. So why the secrecy about the budget? The reason it is commercially sensitive is that this budget is many times the monthly cost of the Terminal Licence, so clearly if you (a) know what it is and (b) either have access via API to the budget spent (nope) OR write software to 'count the cost' (not hard), then you could pony up a couple of terminals and vastly reduce your Data Licence spend. Bloomberg naturally frown on this sort of activity because it represents an arbitrage opportunity in their pricing model and it is not really 'playing nice'. They likewise do not like if you hit 'the wrong kind of data' too often or the monthly limit at all often, and these activities may prompt them to investigate your business model to be sure you are in compliance with all the T&C of the Data Addendum. Out of courtesy to Bloomberg I am not posting that budget number here, but you should be able to get it from your salesperson and confirm the validity of what I have said, because it may change at any time as it is not part of any contract.
I don't believe this is possible programmatically, however if you speak to the Bloomberg helpdesk they will be able to tell you whether you are near the limit, and reset it for you if necessary. Obviously they will only do that a certain number of times. I have not managed to get a definitive answer as to what the limit is, but it's designed to be large enough that you would not hit it just running spreadsheets, which have a limit of 3500 Bloomberg real-time formulas.
If you feel the download limit is not breached but you still get the error message, you can run the following steps to solve the issue:
Close Excel completely.
From the Windows "Start" menu, select All Programs > Bloomberg > Stop API Process. A command prompt window appears.
Press <Enter> to close the window.
From the Windows Start menu, select All Programs > Bloomberg > API Environment Diagnostics.
Click the Start button.
When the test is complete, if there are any red errors, click the "Repair" button.
Re-open Excel and test a formula.
500'000 data points is the approximate daily limit, however remember different types of data use up varying amounts. It is not 1 for 1. Typically requests for esoteric securities and fields will use up more data per request, than PX_LAST for AAPL US for example. Also there are different types of request, such as reference or historic, which will also consume your limit differently.
If you are requesting intra-day realtime data, these fields are typically not charged to your usage limit. Rather you have limits on how many times the realtime 'pipe' can be opened.
Bloomberg are typically very helpful at resetting your monthly data usage limit should you exceed it on an adhoc basis. This is not written company policy, but seems to be part of their customer care. If you are persistently breaching limits each month, they are likely to stop resetting your limits and try to move you to B-PIPE. But otherwise for my experience they are flexible