Intermittent authentication error when posting to a pubsub topic - google-oauth

We have a data pipeline built in Google Cloud Dataflow that consumes messages from a pubsub topic and streams them into BigQuery. To test that it works we have some tests that run in a CI pipeline; these tests post messages onto the pubsub topic and verify that the messages are written to BigQuery successfully.
This is the code that posts to the pubsub topic:
import json
import time

from google.cloud import pubsub_v1

def post_messages(project_id, topic_id, rows):
    futures = dict()
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)

    def get_callback(f, data):
        def callback(f):
            try:
                futures.pop(data)
            except:
                print("Please handle {} for {}.".format(f.exception(), data))
        return callback

    for row in rows:
        # When you publish a message, the client returns a future. Data must be a bytestring.
        # ...
        # construct a message in var json_data
        # ...
        message = json.dumps(json_data).encode("utf-8")
        future = publisher.publish(topic_path, message)
        futures_key = str(message)
        futures[futures_key] = future
        future.add_done_callback(get_callback(future, futures_key))

    # Wait for all the publish futures to resolve before exiting.
    while futures:
        time.sleep(1)
When we run this test in our CI pipeline it has started failing intermittently with this error:
21:38:55: AuthMetadataPluginCallback "<google.auth.transport.grpc.AuthMetadataPlugin object at 0x7f5247407220>" raised exception!
Traceback (most recent call last):
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/grpc/_plugin_wrapping.py", line 89, in __call__
    self._metadata_plugin(
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/google/auth/transport/grpc.py", line 101, in __call__
    callback(self._get_authorization_headers(context), None)
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/google/auth/transport/grpc.py", line 87, in _get_authorization_headers
    self._credentials.before_request(
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/google/auth/credentials.py", line 134, in before_request
    self.apply(headers)
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/google/auth/credentials.py", line 110, in apply
    _helpers.from_bytes(token or self.token)
  File "/opt/conda/envs/py3/lib/python3.8/site-packages/google/auth/_helpers.py", line 130, in from_bytes
    raise ValueError("{0!r} could not be converted to unicode".format(value))
ValueError: None could not be converted to unicode
Error: The operation was canceled.
Unfortunately this only fails in our CI pipeline, and even then only intermittently (a small percentage of all CI runs). If I run the same test locally it succeeds every time. When running in the CI pipeline the code authenticates as a service account, whereas when I run it locally it authenticates as myself.
I know from the error message that it is failing on this code:
if isinstance(result, six.text_type):
    return result
else:
    raise ValueError("{0!r} could not be converted to unicode".format(value))
https://github.com/googleapis/google-auth-library-python/blob/3c3fbf40b07e090f2be7fac5b304dbf438b5cd6c/google/auth/_helpers.py#L127-L130
which is in a python library from google that we install using pip.
Clearly the expression:
isinstance(result, six.text_type)
is evaluating to False. I put a breakpoint on that code when I ran it locally and discovered that under normal circumstances (i.e. when it works) the value of result is a long string that looks like some sort of auth token.
Given the error message:
ValueError: None could not be converted to unicode
it seems that whatever the google authentication library is doing, it is passing None through to the code shown above.
I am at the limits of my knowledge here. Because this only fails in the CI pipeline I don't have the opportunity to put a breakpoint in my code and debug it, but the call stack in the error message suggests this is something to do with authentication.
I'm hoping someone can advise on a course of action.
Can anyone explain a means by which I can discover why None is being passed through to the code that is raising an error?
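For reference, a minimal sketch of forcing the same credential refresh by hand so the resulting token can be inspected (this uses only standard google-auth calls; nothing here is specific to our pipeline):

import google.auth
from google.auth.transport.requests import Request

# Load whatever credentials the environment provides (the service account in
# CI, my own user credentials locally) and force a refresh.
credentials, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/pubsub"]
)
credentials.refresh(Request())

# If this prints None, it looks like the same failure mode as the CI error.
print("token:", credentials.token)
print("expiry:", credentials.expiry)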

We had the same error. Finally solved it by using a JSON Web Token for authentication per Google's Quickstart. Like so:
import json

from google.auth import jwt
from google.cloud import pubsub_v1

def post_messages(credentials_path, topic, list_of_message_dicts):
    # Build explicit JWT credentials from the service account key file.
    with open(credentials_path, 'r') as f:
        credentials_dict = json.load(f)
    audience = "https://pubsub.googleapis.com/google.pubsub.v1.Publisher"
    credentials_ob = jwt.Credentials.from_service_account_info(
        credentials_dict, audience=audience
    )
    publisher = pubsub_v1.PublisherClient(credentials=credentials_ob)
    # topic must be the full topic path: projects/<project>/topics/<topic>
    for message_dict in list_of_message_dicts:
        message = json.dumps(message_dict, default=str).encode("utf-8")
        future = publisher.publish(topic, message)
We also updated our environment but it didn't fix the ValueError until we changed to jwt. Here's the environment in any case:
google-api-core==2.4.0
google-api-python-client==2.36.0
google-auth==2.3.2
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.6
google-cloud-core==2.1.0
google-cloud-pubsub==2.9.0

Tried the jwt solution above and though it solved the issue, it drastically degraded my write throughput.
Offering another workaround that solved this issue for me.
My GOOGLE_APPLICATION_CREDENTIALS env var was set to the location of my key-file. Instead, unset that env variable and, at the start of your process, run
gcloud auth activate-service-account {account_name} --key-file {location_of_key_file}
This allows the google auth libraries to bypass the key file and use the default service-account setup (which is now the original, intended service account). Works with normal throughput and zero errors. :)

Related

urllib3.exceptions.ProxySchemeUnknown: Not supported proxy scheme None

Recently my application started getting an error related to proxies
in __init__
    raise ProxySchemeUnknown(proxy.scheme)
urllib3.exceptions.ProxySchemeUnknown: Not supported proxy scheme None
I did not make any changes to the code or perform any updates to Python 3.8, which is what I'm using.
Here is the function I'm using to fetch proxies from an API that pulls them from the DB:
def get_proxy(self):
    try:
        req = self.session.post(url=self.script_function_url, headers=self.script_function_header, json={"action": "proxy"}, verify=False, timeout=20).json()
        self.proxy = {"https": req['ipAddress'] + ":" + req['port']}
    except Exception as e:
        print(f'Proxy error: {e}')
        exit()
Any help would be greatly appreciated; I am completely new to Python.
I don't know which exact line is causing the error in your code, or whether you have a proxy yourself, but I know that you need to specify a scheme to make API calls behind a proxy.
So on Windows you would do:
set http_proxy=http://xxx.xxx.xxx.xxx:xxxx
set https_proxy=http://xxx.xxx.xxx.xxx:xxxx
Key point here is to add the http:// in front.
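Applied to the get_proxy method in the question, that means prefixing the scheme when building the proxies dict; a sketch, assuming the API returns a bare host and port:

def get_proxy(self):
    try:
        req = self.session.post(url=self.script_function_url, headers=self.script_function_header, json={"action": "proxy"}, verify=False, timeout=20).json()
        # Prefix the scheme so urllib3 knows how to connect to the proxy.
        proxy_url = "http://" + req['ipAddress'] + ":" + req['port']
        self.proxy = {"http": proxy_url, "https": proxy_url}
    except Exception as e:
        print(f'Proxy error: {e}')
        exit()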

Sagemaker tensorflow endpoint not calling the input_handler when being invoked for a prediction

I'm deploying a tensorflow.serving endpoint with a custom inference.py script via the entry_point parameter:
model = Model(role='xxx',
              framework_version='2.2.0',
              entry_point='inference.py',
              model_data='xxx')

predictor = model.deploy(instance_type='xxx',
                         initial_instance_count=1,
                         endpoint_name='xxx')
inference.py contains an input_handler and an output_handler function, but when I call predict with:
model = Predictor(endpoint_name='xxx')
url = 'xxx'
input = {
    'instances': [url]
}
predictions = model.predict(input)
I'm getting the following error:
botocore.errorfactory.ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (400) from model with message "{"error": "Failed to process element: 0 of 'instances' list. Error: Invalid argument: JSON Value: "xxx" Type: String is not of expected type: float" }"
It seems the endpoint is never calling the input_handler function in the inference.py script. Do you know why this might be happening?
I am adding another possible cause for this error message, since it took me some time to resolve this.
I was using different sagemaker api versions (1.x and 2.x).
The name of the handler has changed from input_fn() to input_handler() for newer sagemaker tf containers.
Therefore input_fn() was never called and the special input type was never handled.
For details see: https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/upgrade_from_legacy.html
Maybe this helps someone.
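For reference, the handler shape the newer TensorFlow Serving containers look for is roughly the following; this is only a sketch, and the body of input_handler (turning the URL into model-ready numbers) is an assumption about what your inference.py needs to do:

import json

def input_handler(data, context):
    # Pre-process the request before it is sent to TensorFlow Serving.
    if context.request_content_type == "application/json":
        payload = json.loads(data.read().decode("utf-8"))
        # ... convert the URL(s) in payload["instances"] into numeric tensors here ...
        return json.dumps(payload)
    raise ValueError("Unsupported content type: {}".format(context.request_content_type))

def output_handler(response, context):
    # Post-process the TensorFlow Serving response.
    if response.status_code != 200:
        raise ValueError(response.content.decode("utf-8"))
    return response.content, context.accept_header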
Found the problem thanks to AWS support:
I was creating an endpoint that already had an endpoint configuration with the same name and the new configuration wasn't being utilized.
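In case it helps someone else, a rough sketch of checking for and removing the stale configuration with boto3 (the 'xxx' names are placeholders, matching the question):

import boto3

sm = boto3.client("sagemaker")

# List configurations whose names match the endpoint name.
configs = sm.list_endpoint_configs(NameContains="xxx")["EndpointConfigs"]
print(configs)

# If one is left over from an earlier deployment, delete it before redeploying
# so the new configuration actually gets used.
# sm.delete_endpoint_config(EndpointConfigName="xxx")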

S3 Boto3 Stubber doesn't have mapping for download file?

Currently writing tests and trying to make use of the Stubber provided by botocore.
I'm trying:
client = boto3.client("s3")
response = {'Body': 'content'}
expected_params = {'Bucket': 'a_bucket_name', 'Key': 'a_path', 'Filename': 'a_target'}
with Stubber(client) as stubber:
stubber.add_response('download_file', response, expected_params)
download_file(client, "a_bucket_name", "a_path", "a_target")
That download_file is my own function that just wraps the client's download_file call. It works in practice.
However, the test fails on the stubber.add_response call due to an 'OperationNotFound' error. I stepped through using the debugger, and the issue appears here in the stub API:
if not hasattr(self.client, method):
    raise ValueError(
        "Client %s does not have method: %s"
        % (self.client.meta.service_model.service_name, method))
# Create a successful http response
http_response = AWSResponse(None, 200, {}, None)
operation_name = self.client.meta.method_to_api_mapping.get(method)  # <------- Error here
self._validate_response(operation_name, service_response)
There doesn't seem to be a mapping between the two in the dictionary. Is this a failure of the stub API, or am I missing something?
I've just found this issue, so looks like for once it really is the library and not me:
https://github.com/boto/botocore/issues/974
That's because download_file and upload_file are customizations which live in boto3. They call out to one or many requests under the hood. Right now there's not a great story for supporting customizations other than recording underlying commands they use and adding them to the stubber. There's an external library that can handle that for you, though we don't support it ourselves.
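If you only need to exercise your wrapper and not the HTTP layer, one workaround (plain unittest.mock rather than the Stubber) is to patch the client method directly; a sketch, assuming the wrapper forwards keyword arguments the way expected_params above suggests:

from unittest import mock

import boto3

def test_download_file_wrapper():
    client = boto3.client("s3", region_name="us-east-1")
    with mock.patch.object(client, "download_file") as mocked:
        download_file(client, "a_bucket_name", "a_path", "a_target")
        # Assumes the wrapper calls client.download_file with these kwargs.
        mocked.assert_called_once_with(
            Bucket="a_bucket_name", Key="a_path", Filename="a_target"
        )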

Flask + SQLAlchemy + pytest - not rolling back my session

There are several similar questions on stack overflow, and I apologize in advance if I'm breaking etiquette by asking another one, but I just cannot seem to come up with the proper set of incantations to make this work.
I'm trying to use Flask + Flask-SQLAlchemy and then use pytest to manage the session such that when the function-scoped pytest fixture is torn down, the current transaction is rolled back.
Some of the other questions seem to advocate using the db "drop all and create all" pytest fixture at the function scope, but I'm trying to use the joined session, and use rollbacks, since I have a LOT of tests. This would speed it up considerably.
http://alexmic.net/flask-sqlalchemy-pytest/ is where I found the original idea, and Isolating py.test DB sessions in Flask-SQLAlchemy is one of the questions recommending using function-level db re-creation.
I had also seen https://github.com/mitsuhiko/flask-sqlalchemy/pull/249 , but that appears to have been released with flask-sqlalchemy 2.1 (which I am using).
My current (very small, hopefully immediately understandable) repo is here:
https://github.com/hoopes/flask-pytest-example
There are two print statements - the first (in example/__init__.py) should have an Account object, and the second (in test/conftest.py) is where I expect the db to be cleared out after the transaction is rolled back.
If you pip install -r requirements.txt and run py.test -s from the test directory, you should see the two print statements.
I'm about at the end of my rope here - there must be something I'm missing, but for the life of me, I just can't seem to find it.
Help me, SO, you're my only hope!
You might want to give pytest-flask-sqlalchemy-transactions a try. It's a plugin that exposes a db_session fixture that accomplishes what you're looking for: it allows you to run database updates that will get rolled back when the test exits. The plugin is based on Alex Michael's blog post, with some additional support for nested transactions that covers a wider array of use cases. There are also some configuration options for mocking out connectables in your app so you can run arbitrary methods from your codebase, too.
For test_accounts.py, you could do something like this:
from example import db, Account

class TestAccounts(object):

    def test_update_view(self, db_session):
        test_acct = Account(username='abc')
        db_session.add(test_acct)
        db_session.commit()
        resp = self.client.post('/update',
                                data={'a': 1},
                                content_type='application/json')
        assert resp.status_code == 200
The plugin needs access to your database through a _db fixture, but since you already have a db fixture defined in conftest.py, you can set up database access easily:
@pytest.fixture(scope='session')
def _db(db):
    return db
You can find details on setup and installation in the docs. Hope this helps!
I'm also having issues with the rollback; my code can be found here.
After reading some documentation, it seems the begin() function should be called on the session.
So in your case I would update the session fixture to this:
@pytest.yield_fixture(scope='function', autouse=True)
def session(db, request):
    """Creates a new database session for a test."""
    db.session.begin()
    yield db.session
    db.session.rollback()
    db.session.remove()
I didn't test this against your repo, but when I try it on my own code I get the following error:
INTERNALERROR> Traceback (most recent call last):
INTERNALERROR> File "./venv/lib/python2.7/site-packages/_pytest/main.py", line 90, in wrap_session
INTERNALERROR> session.exitstatus = doit(config, session) or 0
...
INTERNALERROR> File "./venv/lib/python2.7/site-packages/_pytest/python.py", line 59, in filter_traceback
INTERNALERROR> return entry.path != cutdir1 and not entry.path.relto(cutdir2)
INTERNALERROR> AttributeError: 'str' object has no attribute 'relto'
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
from unittest import TestCase

# global application scope. create Session class, engine
Session = sessionmaker()

engine = create_engine('postgresql://...')

class SomeTest(TestCase):
    def setUp(self):
        # connect to the database
        self.connection = engine.connect()

        # begin a non-ORM transaction
        self.trans = self.connection.begin()

        # bind an individual Session to the connection
        self.session = Session(bind=self.connection)

    def test_something(self):
        # use the session in tests.
        self.session.add(Foo())
        self.session.commit()

    def tearDown(self):
        self.session.close()

        # rollback - everything that happened with the
        # Session above (including calls to commit())
        # is rolled back.
        self.trans.rollback()

        # return connection to the Engine
        self.connection.close()
The SQLAlchemy docs have a solution for this case.

Cherrypy web server hangs forever -- Matplotlib error

I'm creating a web-based interface for a number of different command line executables, and am using cherrypy behind apache (using mod_rewrite). I'm very new to this, and am having difficulty getting things configured properly. On my development machine, everything works reasonable well, but when I installed the code on a second machine I can't get anything to work properly.
The basic workflow for the applications is: 1. upload a dataset, 2. process the data (using python with some calls to executables using subprocess.call), 3. display the results on the web page.
After uploading and processing one dataset, every time I attempt to process a second dataset the system stops responding. I'm not seeing any output in the terminal from the cherrypy process, nor anything in the site log showing that any errors have occurred.
I'm starting cherrypy with the following conf file:
[global]
environment: 'production'
log.error_file: 'logs/site.log'
log.screen: True
tools.sessions.on: True
tools.session.storage_type: "file"
tools.session.storage_path: "sessions/"
tools.sessions.timeout: 60
tools.auth.on: True
tools.caching.on: False
server.socket_host: '0.0.0.0'
server.max_request_body_size: 0
server.socket_timeout: 60
server.thread_pool: 20
server.socket_queue_size: 10
engine.autoreload.on: True
My __init__.py file:
import cherrypy
import os
import string
from os.path import exists, join
from os import pathsep
from string import split
from mako.template import Template
from mako.lookup import TemplateLookup
from auth import AuthController, require, member_of, name_is
from twopoint import TwoPoint

current_dir = os.path.dirname(os.path.abspath(__file__))
lookup = TemplateLookup(directories=[current_dir + '/templates'])

def findInSubdirectory(filename, subdirectory=''):
    if subdirectory:
        path = subdirectory
    else:
        path = os.getcwd()
    for root, dirs, names in os.walk(path):
        if filename in names:
            return os.path.join(root, filename)
    return None

class Root:
    @cherrypy.expose
    @require()
    def index(self):
        tmpl = lookup.get_template("main.html")
        return tmpl.render(usr=WebUtils.getUserName(), source="")

if __name__ == '__main__':
    conf_path = os.path.dirname(os.path.abspath(__file__))
    conf_path = os.path.join(conf_path, "prod.conf")
    cherrypy.config.update(conf_path)
    cherrypy.config.update({'server.socket_host': '127.0.0.1',
                            'server.socket_port': 8080})

    def nocache():
        cherrypy.response.headers['Cache-Control'] = 'no-cache,no-store,must-revalidate'
        cherrypy.response.headers['Pragma'] = 'no-cache'
        cherrypy.response.headers['Expires'] = '0'

    cherrypy.tools.nocache = cherrypy.Tool('before_finalize', nocache)
    cherrypy.config.update({'tools.nocache.on': 'True'})

    cherrypy.tree.mount(Root(), '/')
    cherrypy.tree.mount(TwoPoint(), '/twopoint')
    cherrypy.engine.start()
    cherrypy.engine.block()
For one example where this occurs, I've got the following javascript function that calls my python code:
function compTwoPoint(dataset,orig){
// call python code to generate images
$.post("/twopoint/compTwoPoint/"+dataset,
function(result){
res=jQuery.parseJSON(result);
if(res.success==true){
showTwoPoint(res.path,orig);
}
else{
alert(res.exception);
$('#display_loading').html("");
}
});
}
This calls the python code:
# assumed imports: these plotting helpers come from matplotlib's pylab interface
from pylab import imread, figure, imshow, colorbar, savefig, close

def twopoint(in_matrix):
    """proprietary code, can't share"""

def twopoint_file(in_file_name, out_file_name):
    k = imread(in_file_name)
    figure()
    imshow(twopoint(k))
    colorbar()
    savefig(out_file_name, bbox_inches="tight")
    close()

class TwoPoint:
    @cherrypy.expose
    def compTwoPoint(self, dataset):
        try:
            fnames = WebUtils.dataFileNames(dataset)
            twopoint_file(fnames['filepath'], os.path.join(fnames['savebase'], "twopt.png"))
            return encoder.iterencode({"success": True})
        except Exception as e:
            # assumed error branch, matching the res.exception handling in the JavaScript above
            return encoder.iterencode({"success": False, "exception": str(e)})
These functions work together to give the expected result. The problem is that after processing one input file, I am unable to process a second file. I don't seem to get a response from the server.
On the machine where things are working, I'm running python 2.7.6 and cherrypy 3.2.3. On the second machine, I have python 2.7.7 and cherrypy 3.3.0. While this may explain the difference in behavior, I'd like to find a way to make my code portable enough to overcome the difference in version (going from older to newer).
I'm not sure what the problem is, or even what to search for. I would appreciate any guidance or help you can offer.
(edit: Digging a bit more, I discovered something is happening with matplotlib. If I put print statements before and after the figure() command in twopoint_file, only the first one prints. Calling this function directly from a python interpreter (removing cherrypy from the equation) I get the following error:
can't invoke "event" command: application has been destroyed while executing "event generate $w{{ThemeChanged}}"
procedure "ttk::ThemeChanged" line 6 invoked from within "ttk::ThemeChanged"
end edit)
I don't understand what this error means, and haven't had much luck searching.
Old question, but I got the same problem, which I fixed by changing the backend in Matplotlib:
import matplotlib
matplotlib.use("qt4agg")