BigQuery job status logic - google-bigquery

I'm using the logic below to check whether a BigQuery job succeeded or failed, but sometimes I get a "successful" job ID even though the job ran the query without inserting any rows.
It happens mainly with queries that write to a table.
I used the code I found in the Google documentation and added a few log statements of my own.
It would be great if someone could tell me what I'm doing wrong.
def _wait_for_response(self, bq_api, insert_response, max_wait_time=3600):
    """get bigQuery job status. wait for DONE and check for errors.
    if errors exist - raise an exception"""
    start_time = time.time()
    logstr.info(current_module='bq_session',
                current_func='_wait_for_response')
    # sleep interval between retries
    # first, try 8 times every 1 second, then double sleep time until
    # 30 seconds (and stay on 30 until max_wait_time is reached)
    sleep = itertools.chain(itertools.repeat(1, 8), xrange(2, 30, 3),
                            itertools.repeat(30))
    while time.time() - start_time < max_wait_time:
        try:
            job = bq_api.jobs().get(
                projectId=insert_response['jobReference']['projectId'],
                jobId=insert_response['jobReference']['jobId']).execute()
            # on job end
            if job['status']['state'] == 'DONE':
                # if job failed raise error(s)
                if 'errors' in job['status'].keys() and\
                        job['status']['errors']:
                    raise Exception(','.join(
                        [err['message']
                         for err in job['status']['errors']]))
                else:
                    return job
        except apiclient.errors.HttpError, error:
            status = int(error.resp.get('status', 0))
            if status >= 500:
                pass
                # raise Exception(
                #     global_messages.BQ_SERVER_ERROR.format(err=error))
            elif status == 404:
                raise Exception(
                    global_messages.BQ_JOB_NOT_FOUND.format(
                        jobid=insert_response['jobReference']['jobId']))
            else:
                raise Exception(
                    global_messages.BQ_ERROR_GETTING_JOB_STATUS.format(
                        err=error))
        time.sleep(sleep.next())
    raise Exception(global_messages.BQ_TIMEOUT.format(
        time=max_wait_time,
        jobid=insert_response['jobReference']['jobId']))

These lines of your script cause control to fall through silently whenever the status is 500 or greater:
if status >= 500:
    pass
    # raise Exception(
    #     global_messages.BQ_SERVER_ERROR.format(err=error))
That could be preventing you from seeing the expected exception.
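A minimal sketch of one way to handle that branch, assuming you want transient server errors to be retried a few times and then surfaced rather than swallowed (the helper name, the retry limit, and the get_job callable are illustrative, not part of the original code):

import time

def get_job_with_retries(get_job, max_server_errors=5, sleep_seconds=1):
    """Call get_job() until it succeeds; tolerate up to max_server_errors
    transient 5xx responses, then re-raise instead of swallowing them.
    get_job is expected to raise an exception carrying a resp mapping with
    an HTTP 'status' entry, as apiclient.errors.HttpError does."""
    server_errors = 0
    while True:
        try:
            return get_job()
        except Exception as error:  # stand-in for apiclient.errors.HttpError
            status = int(getattr(error, 'resp', {}).get('status', 0))
            if status < 500:
                raise  # 404s and other client errors should surface immediately
            server_errors += 1
            if server_errors >= max_server_errors:
                raise  # stop hiding persistent server errors
        time.sleep(sleep_seconds)

With a bound like this, a persistent 5xx response eventually raises instead of looping silently until max_wait_time.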

Related

telethon: add an interval between each request

I have a piece of code like this
hanzi = "一丁七万丈三上下不与"
try:
    all_participants = await client.get_participants(my_channel, None, search=hanzi, aggressive=True)
except errors.FloodWaitError as e:
    print('Have to sleep', e.seconds, 'seconds')
    print("")
    exit
# except errors.common.MultiError:
#     print("")
#     exit
When hanzi contains too much text, the search raises a flood error. All I can think of is to split the string into an array and loop over the requests, but that doesn't solve the actual problem. Is there a way to make each request wait a few seconds before proceeding?
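One possible sketch, combining the split-into-chunks idea from the question with a fixed delay and a FloodWaitError back-off (the chunk size, the 2-second delay, and the helper name are illustrative assumptions, not Telethon requirements):

import asyncio
from telethon import errors

async def search_participants(client, channel, hanzi, chunk_size=3, delay=2):
    """Search in small chunks and pause between requests; back off on FloodWaitError."""
    results = []
    chunks = [hanzi[i:i + chunk_size] for i in range(0, len(hanzi), chunk_size)]
    for chunk in chunks:
        while True:
            try:
                participants = await client.get_participants(
                    channel, search=chunk, aggressive=True)
                results.extend(participants)
                break
            except errors.FloodWaitError as e:
                # Telegram reports how long to wait before retrying
                await asyncio.sleep(e.seconds)
        await asyncio.sleep(delay)  # voluntary delay between requests
    return results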

Will this timer code work for calling Python function every X seconds?

I have a Python thread, and in it I attempt to call a function, wait X seconds after it finishes, then repeat indefinitely (it runs as a Windows service).
I am using the code below. Since I am incrementing "count" every minute, is there theoretically a case where it could "overflow" if it runs for a long time (for example, if I lowered the interval to every 2 seconds or something very small)?
Is this method overkill if all I need to do is wait till the function finishes and then do it again after approximately (doesn't have to be precise) X seconds?
Code:
def g_tick():
    t = time.time()
    count = 0
    while True:
        count += 1
        logging.debug(f"Count is: {count}")
        yield max(t + count*60 - time.time(), 0)  # 1 Minute Intervals (300/60 secs)

g = g_tick()
while True:
    try:
        processor_work()  # this function does work and returns when complete
    except Exception as e:
        log_msg = f"An exception occurred while processing tickets due to: {e}"
        logging.exception(log_msg)
    time.sleep(next(g))
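On the overflow part: Python integers have arbitrary precision, so count itself cannot overflow, it only grows. If the interval does not need to be precise, a plainer loop does the same job; a minimal sketch (the function and interval names are placeholders):

import logging
import time

INTERVAL_SECONDS = 60  # approximate pause between runs

def run_forever(work, interval=INTERVAL_SECONDS):
    """Call work(), log any exception, then sleep roughly `interval` seconds."""
    while True:
        try:
            work()
        except Exception:
            logging.exception("An exception occurred while processing tickets")
        time.sleep(interval)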

Python BigQuery client - setting query result timeout

Consider the following script (adapted from the Google Cloud Python documentation: https://google-cloud-python.readthedocs.io/en/0.32.0/bigquery/usage.html#querying-data), which runs a BigQuery query with a timeout of 30 seconds:
import logging
from google.cloud import bigquery
# Set logging level to DEBUG in order to see the HTTP requests
# being made by urllib3
logging.basicConfig(level=logging.DEBUG)
PROJECT_ID = "project_id" # replace by actual project ID
client = bigquery.Client(project=PROJECT_ID)
QUERY = ('SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
         'WHERE state = "TX" '
         'LIMIT 100')
TIMEOUT = 30 # in seconds
query_job = client.query(QUERY) # API request - starts the query
assert query_job.state == 'RUNNING'
# Waits for the query to finish
iterator = query_job.result(timeout=TIMEOUT)
rows = list(iterator)
assert query_job.state == 'DONE'
assert len(rows) == 100
row = rows[0]
assert row[0] == row.name == row['name']
The linked documentation says:
Use of the timeout parameter is optional. The query will continue to
run in the background even if it takes longer than the timeout allowed.
When I run it with google-cloud-bigquery version 1.23.1, the logging output seems to indicate that "timeoutMs" is 10 seconds.
DEBUG:urllib3.connectionpool:https://bigquery.googleapis.com:443 "GET /bigquery/v2/projects/project_id/queries/5ceceaeb-e17c-4a86-8a27-574ad561b856?maxResults=0&timeoutMs=10000&location=US HTTP/1.1" 200 None
Notice the timeoutMs=10000 in the output above.
This seems to happen whenever I call result with a timeout value that is higher than 10. On the other hand, if I use a value lower than 10 as the timeout value, the timeoutMs value looks correct. For example, if I change TIMEOUT = 30 to TIMEOUT = 5 in the script above, the log shows:
DEBUG:urllib3.connectionpool:https://bigquery.googleapis.com:443 "GET /bigquery/v2/projects/project_id/queries/71a28435-cbcb-4d73-b932-22e58e20d994?maxResults=0&timeoutMs=4900&location=US HTTP/1.1" 200 None
Is this behavior expected?
Thank you in advance and best regards.
The timeout parameter works in a best-effort manner: the method tries to complete all of its API calls within the indicated timeframe. Internally, the result() method can perform more than one request, and the getQueryResults request in the log:
DEBUG:urllib3.connectionpool:https://bigquery.googleapis.com:443 "GET /bigquery/v2/projects/project_id/queries/5ceceaeb-e17c-4a86-8a27-574ad561b856?maxResults=0&timeoutMs=10000&location=US HTTP/1.1" 200 None
is executed inside the done() method. You can look at the source code to see how the timeout for each request is calculated, but basically it is the minimum of 10 seconds and the user timeout. If the operation has not completed, the request is retried until the overall timeout is reached.
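A small sketch of that per-request timeout logic as described above (this illustrates the explanation, not the library's exact source code):

import time

MAX_API_TIMEOUT_SECONDS = 10  # per-request cap described above

def per_request_timeout(user_timeout, start_time):
    """Timeout handed to a single getQueryResults call: the minimum of the
    10-second cap and whatever remains of the user's overall budget."""
    remaining = user_timeout - (time.time() - start_time)
    return max(min(MAX_API_TIMEOUT_SECONDS, remaining), 0)

# With TIMEOUT = 30 the first poll gets ~10 s (timeoutMs=10000 in the log);
# with TIMEOUT = 5 it gets just under 5 s (timeoutMs=4900 in the log).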

How to implement session time out in OpenERP

I want to automatically log out of the OpenERP session if the session time exceeds 30 minutes.
This can be done by editing the session_gc method in .../addons/web/http.py. The following code illustrates the change: remove or comment out the if condition (and un-indent the following lines accordingly):
def session_gc(session_store):
    #if random.random() < 0.001:
    # we keep session one week
    last_week = time.time() - x
    for fname in os.listdir(session_store.path):
        path = os.path.join(session_store.path, fname)
        try:
            if os.path.getmtime(path) < last_week:
                os.unlink(path)
        except OSError:
            pass
Here x is the timeout in seconds, set as per your need.
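For example, for the 30-minute timeout asked about here, x would be set to:

x = 30 * 60  # 1800 seconds, i.e. sessions expire after 30 minutes of inactivity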

Does Delayed::Worker.new.work_off work jobs with a future run_at time?

I have a controller action that creates 2 background jobs to be run at a future date.
I am trying to test that the background jobs get run
# spec/controllers/job_controller_spec.rb
setup
post :create, {:job => valid_attributes}
Delayed::Job.count.should == 2
Delayed::Worker.logger = Rails.logger
#Delayed::Worker.new.work_off.should == [2,0]
Delayed::Worker.new.work_off
Delayed::Job.count.should == 0 # this is where it fails
This is the error:
1) JobsController POST create with valid params queues up delayed job and fires
   Failure/Error: Delayed::Job.count.should == 0
     expected: 0
          got: 2 (using ==)
For some reason it seems like it is not firing.
You can try using
Delayed::Worker.new(quiet: false).work_off
to debug the result of your background jobs. This could help you find out whether the fact that they're supposed to run in the future is interfering with the assertion itself.
Don't forget to remove quiet: false when you're done; otherwise your tests will always output the results of the background jobs.