What is the recommended architecture for using Amazon Neptune in a scalable way? - amazon-neptune

I am building an application backed by a Neptune database. Because I want the application to be scalable, I am using AWS Lambda + API Gateway to build a REST API to interact with the database. This seems reasonable, given that this use case is documented in the Neptune docs.
The Neptune docs recommend reusing the WebSocket connection to the database across the entire execution context of the function, which is what I am doing at the moment. The docs also recommend resetting the connection and retrying upon errors (see here), which I am also doing. However, I am seeing exceptions every now and then (perhaps every 20 requests on average). One of the exceptions I get is
ConnectionResetError: Cannot write to closing transport
which seems to be the same as this issue.
The other one is:
Traceback (most recent call last):
File "/var/task/chalice/app.py", line 1685, in _get_view_function_response
response = view_function(**function_args)
File "/var/task/app.py", line 57, in resource
return Resource(app.current_request, g).process()
File "/var/task/backoff/_sync.py", line 94, in retry
ret = target(*args, **kwargs)
File "/var/task/chalicelib/handlers/resource.py", line 106, in get
values = resources.valueMap().with_(WithOptions.tokens).toList()
File "/var/task/gremlin_python/process/traversal.py", line 57, in toList
return list(iter(self))
File "/var/task/gremlin_python/process/traversal.py", line 47, in __next__
self.traversal_strategies.apply_strategies(self)
File "/var/task/gremlin_python/process/traversal.py", line 548, in apply_strategies
traversal_strategy.apply(traversal)
File "/var/task/gremlin_python/driver/remote_connection.py", line 63, in apply
remote_traversal = self.remote_connection.submit(traversal.bytecode)
File "/var/task/gremlin_python/driver/driver_remote_connection.py", line 60, in submit
results = result_set.all().result()
File "/var/lang/lib/python3.7/concurrent/futures/_base.py", line 435, in result
return self.__get_result()
File "/var/lang/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/var/task/gremlin_python/driver/resultset.py", line 90, in cb
f.result()
File "/var/lang/lib/python3.7/concurrent/futures/_base.py", line 428, in result
return self.__get_result()
File "/var/lang/lib/python3.7/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/var/lang/lib/python3.7/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/var/task/gremlin_python/driver/connection.py", line 82, in _receive
data = self._transport.read()
File "/var/task/gremlin_python/driver/aiohttp/transport.py", line 104, in read
raise RuntimeError("Connection was already closed.")
RuntimeError: Connection was already closed.
In case it is relevant, I am using gremlinpython==3.5.1.
It seems to me that these issues are all ultimately a consequence of using AWS Lambda, namely the mismatch between the longevity of WebSocket connections and the ephemeral nature of Lambda execution contexts. The question then is: am I doing the wrong thing by trying to use AWS Lambda for my API? Would it be more appropriate to set up an EC2 instance and deal with scalability in some other way?
P.S. Previously I created and closed a connection in every function execution (as the Neptune docs previously recommended), which did work fine but was naturally slow. A minimal sketch of the reuse-and-reset pattern I am using now follows.
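For reference, this is roughly the pattern described above; the endpoint environment variable, the query, and the exact retry shape are assumptions for illustration, not the exact code from the docs:
import os
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

conn = None
g = None

def reset_connection():
    # Close the old WebSocket (if any) and open a fresh one.
    global conn, g
    if conn is not None:
        try:
            conn.close()
        except Exception:
            pass  # the transport may already be gone
    conn = DriverRemoteConnection(
        'wss://{}:8182/gremlin'.format(os.environ['NEPTUNE_ENDPOINT']), 'g')
    g = traversal().withRemote(conn)

reset_connection()  # runs once per cold start, i.e. once per execution context

def handler(event, context):
    try:
        return g.V().limit(1).toList()
    except Exception:
        # The connection may have been closed between invocations;
        # reset it and retry once, as the docs suggest.
        reset_connection()
        return g.V().limit(1).toList()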

The latest version of Neptune only supports Gremlin 3.4.11 (https://docs.aws.amazon.com/neptune/latest/userguide/engine-releases-1.0.5.1.html). I would start by using gremlinpython 3.4.11 and see if that resolves your issue. gremlinpython 3.5 replaced Tornado with aiohttp (ref) for WebSocket connections, and I suspect that change may be causing a slight change in behavior that a future release supporting Gremlin 3.5 will address.
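If it helps, the pin is a one-liner in the Lambda bundle's requirements file (assuming pip-based packaging):
# requirements.txt -- match the client to the Neptune engine's Gremlin version
gremlinpython==3.4.11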

I wonder whether the 'Connection was already closed' error message is not being treated as a retriable error by the retry logic?
What happens if you add this error message to the list of retriable_error_msgs in the Python example in the docs?
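Something along these lines; the list below is an assumed shape of the docs' retry helper (the exact entries in the Neptune example may differ), and backoff is the library already visible in your traceback:
import backoff

retriable_error_msgs = [
    'Connection refused',
    'Server disconnected',
    'Connection was already closed.',  # the message from the traceback above
]

def is_retriable(e):
    # Treat an error as retriable if its message matches a known entry.
    return any(msg in str(e) for msg in retriable_error_msgs)

@backoff.on_exception(backoff.constant,
                      Exception,
                      interval=1,
                      max_tries=5,
                      jitter=None,
                      giveup=lambda e: not is_retriable(e))
def run_query(g):
    return g.V().limit(1).toList()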

Related

How does "on-negotiation-needed" work when trying to stream using gstreamer webrtc?

How does the WebRTC pipeline get any information about its peers?
This is what I assume the on_negotiation_needed callback does?
def start_pipeline(self):
    self.pipe = Gst.parse_launch(PIPELINE_DESC)
    self.webrtc = self.pipe.get_by_name('sendrecv')
    self.webrtc.connect('on-negotiation-needed', self.on_negotiation_needed)
    self.webrtc.connect('on-ice-candidate', self.send_ice_candidate_message)
    self.webrtc.connect('pad-added', self.on_incoming_stream)
    self.pipe.set_state(Gst.State.PLAYING)
I see that it has the on_negotiation_needed callback, but it's unclear where the element variable comes from. I looked here: http://blog.nirbheek.in/2018/02/gstreamer-webrtc.html and here: https://github.com/centricular/gstwebrtc-demos and I am still confused as to how this negotiation works. From what I understand there are 2 (or more) peers, both of them must connect to the signaling server, and then one of them has to create an offer.
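For what it's worth, in the gstwebrtc-demos code the element variable is supplied by the signal itself: webrtcbin passes itself to the handler, which then asks it to create an offer and hands the resulting SDP to the signaling server. A rough sketch of that chain, simplified from the demo (send_sdp_offer stands in for your own signaling code):
def on_negotiation_needed(self, element):
    # 'element' is the webrtcbin that emitted the signal
    promise = Gst.Promise.new_with_change_func(self.on_offer_created, element, None)
    element.emit('create-offer', None, promise)

def on_offer_created(self, promise, element, _):
    promise.wait()
    reply = promise.get_reply()
    offer = reply.get_value('offer')
    promise = Gst.Promise.new()
    element.emit('set-local-description', offer, promise)
    promise.interrupt()
    self.send_sdp_offer(offer)  # serialize offer.sdp and push it to the signaling server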
I await the message from (I assume) the GStreamer webrtcbin on the signaling server:
print(websocket.remote_address)
# get message from client
message = await asyncio.wait_for(websocket.recv(), 3000)
and I get this error when the pipeline starts:
('192.168.11.138', 44120)
Error in connection handler
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 674, in transfer_data
message = yield from self.read_message()
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 742, in read_message
frame = yield from self.read_data_frame(max_size=self.max_size)
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 815, in read_data_frame
frame = yield from self.read_frame(max_size)
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 884, in read_frame
extensions=self.extensions,
File "/usr/local/lib/python3.6/dist-packages/websockets/framing.py", line 99, in read
data = yield from reader(2)
File "/usr/lib/python3.6/asyncio/streams.py", line 672, in readexactly
raise IncompleteReadError(incomplete, n)
asyncio.streams.IncompleteReadError: 0 bytes read on a total of 2 expected bytes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/websockets/server.py", line 169, in handler
yield from self.ws_handler(self, path)
File "signaling_server.py", line 34, in signaling
message = await asyncio.wait_for(websocket.recv(), 3000)
File "/usr/lib/python3.6/asyncio/tasks.py", line 358, in wait_for
return fut.result()
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 434, in recv
yield from self.ensure_open()
File "/usr/local/lib/python3.6/dist-packages/websockets/protocol.py", line 646, in ensure_open
) from self.transfer_data_exc
websockets.exceptions.ConnectionClosed: WebSocket connection is closed: code = 1006 (connection closed abnormally [internal]), no reason
I cannot say about Python (unfortunately, I cannot make the Python bindings for GStreamer work on Windows); however, the demo works from C# (I just checked).
First you should connect with your browser to https://webrtc.nirbheek.in/, and get the 'Our id' value.
Your Python Gstreamer should connect to wss://webrtc.nirbheek.in:8443, and use the Id value from the browser.
The browser will get the test image stream from GStreamer, and the GStreamer application will get the webcam image from the browser.
HTH, Tom
Here's a screenshot:

Using LineReceiver with Twisted process protocols

I'm trying to use Twisted to handle data generated by a binary (which indefinitely dumps lines onto stdout). Since my data is inherently line-delimited, I was trying to use the LineReceiver instead of parsing the data myself. The following is the relevant bit of the code, which seems to be causing trouble:
class ProtocolBareQDAL41xB(ProcessProtocol, LineReceiver):
    ...
    def outReceived(self, data):
        print "Got Data:" + repr(data)
        self.dataReceived(data)

    def lineReceived(self, line):
        print "Got Line: " + line
        self._process_line(line)
    ...
This 'works' for the first of the two lines in the output. I don't know yet whether it works for only one line, or for all but the last line. The resulting output looks something like:
$ python BareQDAL41xB.py
Made Connection
<Process pid=16486 status=-1>
Got Data:'No device found!\nMultiple devices found! Please connect only one.\n'
Got Line: No device found!
Got Serial Number : found!
Unhandled Error
Traceback (most recent call last):
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/python/log.py", line 101, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/python/log.py", line 84, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/python/context.py", line 118, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args, **kw)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/python/context.py", line 81, in callWithContext
return func(*args,**kw)
--- <exception caught here> ---
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/posixbase.py", line 597, in _doReadOrWrite
why = selectable.doRead()
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/process.py", line 274, in doRead
return fdesc.readFromFD(self.fd, self.dataReceived)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/fdesc.py", line 94, in readFromFD
callback(output)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/process.py", line 277, in dataReceived
self.proc.childDataReceived(self.name, data)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/process.py", line 931, in childDataReceived
self.proto.childDataReceived(name, data)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/internet/protocol.py", line 604, in childDataReceived
self.outReceived(data)
File "BareQDAL41xB.py", line 104, in outReceived
self.dataReceived(data)
File "/media/ldata/code/virtualenvs/tendril/local/lib/python2.7/site-packages/twisted/protocols/basic.py", line 573, in dataReceived
self.transport.disconnecting):
exceptions.AttributeError: 'Process' object has no attribute 'disconnecting'
processExited, status 0
processEnded, status 0
LineReceiver seems to be expecting the transport to implement disconnecting.
Is it possible to use Twisted's LineReceiver with Twisted's ProcessProtocol, or should I implement the line parser in my protocol instead?
LineReceiver is already a Protocol, which implements different interfaces than IProcessProtocol.
Luckily, recent versions of Twisted already contain an adapter that does what you want - which is to treat a subprocess as a stream of bytes. Rather than calling spawnProcess directly, use ProcessEndpoint, and you can pass a regular ProtocolFactory, no ProcessProtocol involved.
However, as a commenter has already pointed out, there's a bug here, where the disconnecting attribute is not formally part of ITransport, but LineReceiver (and LineOnlyReceiver) depend on it anyway, and since it's not part of the interface, ProcessEndpoint doesn't implement it. That should definitely be fixed, but in the meanwhile, we'll need to work around it.
As a happy accident, Twisted's built-in support for wrapping protocols, WrappingFactory, already has support for the disconnecting attribute, specifically because of this ugly disparity between the theory of the interface specifications and the reality of the most popular ITransport implementations. So even a do-nothing wrapper will work around the problem. You can implement this like so:
from zope.interface import implementer
from twisted.internet.interfaces import IStreamClientEndpoint
from twisted.protocols.policies import WrappingFactory

@implementer(IStreamClientEndpoint)
class DisconnectingWorkaroundEndpoint(object):
    def __init__(self, endpoint):
        self._endpoint = endpoint

    def connect(self, protocolFactory):
        return self._endpoint.connect(WrappingFactory(protocolFactory))
and then when you construct your ProcessEndpoint, do:
endpoint = DisconnectingWorkaroundEndpoint(ProcessEndpoint(...))
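To make that concrete, here is a hypothetical end-to-end usage; the binary path is a placeholder and LineHandler stands in for your protocol:
from twisted.internet import reactor
from twisted.internet.endpoints import ProcessEndpoint
from twisted.internet.protocol import Factory
from twisted.protocols.basic import LineReceiver

class LineHandler(LineReceiver):
    delimiter = '\n'  # the binary emits bare newlines

    def lineReceived(self, line):
        print "Got Line: " + line

endpoint = DisconnectingWorkaroundEndpoint(
    ProcessEndpoint(reactor, '/path/to/binary'))
endpoint.connect(Factory.forProtocol(LineHandler))
reactor.run()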
Sorry for the delay on answering; while you've probably worked out your own workaround, I hope this will be useful to others with the same question!

Issues with bitbake for building Angstrom

The issue I'm having is that I'm trying to build an Angstrom image from scratch using bitbake (since Angstrom is now Yocto compatible), but I've run into an error the moment I run bitbake systemd-image:
Traceback (most recent call last):
File "/usr/bin/bitbake", line 234, in <module>
ret = main()
File "/usr/bin/bitbake", line 197, in main
server = ProcessServer(server_channel, event_queue, configuration)
File "/usr/lib/pymodules/python2.7/bb/server/process.py", line 78, in __init__
self.cooker = BBCooker(configuration, self.register_idle_function)
File "/usr/lib/pymodules/python2.7/bb/cooker.py", line 76, in __init__
self.parseConfigurationFiles(self.configuration.file)
File "/usr/lib/pymodules/python2.7/bb/cooker.py", line 510, in parseConfigurationFiles
data = _parse(os.path.join("conf", "bitbake.conf"), data)
TypeError: getVar() takes exactly 3 arguments (2 given)
ERROR: Error evaluating '${TARGET_OS}:${TRANSLATED_TARGET_ARCH}:build-${BUILD_OS}:pn-${PN}:${MACHINEOVERRIDES}:${DISTROOVERRIDES}:${CLASSOVERRIDE}:forcevariable${#bb.utils.contains("TUNE_FEATURES", "thumb", ":thumb", "", d)}${#bb.utils.contains("TUNE_FEATURES", "no-thumb-interwork", ":thumb-interwork", "", d)}'
Traceback (most recent call last):
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 116, in expandWithRefs
s = __expand_var_regexp__.sub(varparse.var_sub, s)
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 60, in var_sub
var = self.d.getVar(key, 1)
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 260, in getVar
return self.expand(value, var)
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 132, in expand
return self.expandWithRefs(s, varname).value
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 117, in expandWithRefs
s = __expand_python_regexp__.sub(varparse.python_sub, s)
TypeError: getVar() takes exactly 3 arguments (2 given)
ERROR: Error evaluating '${#bb.parse.BBHandler.vars_from_file(d.getVar('FILE'),d)[0] or 'defaultpkgname'}'
Traceback (most recent call last):
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 117, in expandWithRefs
s = __expand_python_regexp__.sub(varparse.python_sub, s)
File "/usr/lib/pymodules/python2.7/bb/data_smart.py", line 76, in python_sub
value = utils.better_eval(codeobj, DataContext(self.d))
File "/usr/lib/pymodules/python2.7/bb/utils.py", line 387, in better_eval
return eval(source, _context, locals)
File "PN", line 1, in <module>
TypeError: getVar() takes exactly 3 arguments (2 given)
I've been at this for a while now, searching on different sites. Originally I tried following the guide in the developer section of the Angstrom site, but once I hit some errors (prior to the one I'm posting here), I found Derek Molloy's site http://derekmolloy.ie/building-angstrom-for-beaglebone-from-source/ which solved those errors and gave a little more detail on the process.
Eventually I stumbled onto another forum post which described my problem, but unfortunately the answers weren't really clear (for me anyway): http://comments.gmane.org/gmane.linux.distributions.angstrom.devel/7431. I'm at a loss as to what could be wrong, and I'm pretty much new to the Yocto Project, so I'm unsure if there are any steps missing or something implicit that I have overlooked; I would deeply appreciate anyone who could point me in the right direction on this.
As a side note, I've been thinking that it could be something to do with the environment-angstrom-... file that I have, since mine is environment-angstrom-v2013.12 and all the other examples use previous versions, so I'm wondering if there's a new step involved when working with this one.
Is there a reason why you are using a system-wide bitbake instead of the one that is compatible with that release of Angstrom?
Don't use a system-wide bitbake, as the bitbake API can and does change over time. Use the corresponding bitbake for that release of angstrom.
(This is breaking because your bitbake requires getVar to take three arguments but your angstrom layers are only passing two)
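A stripped-down illustration of that mismatch (the class and signatures are invented for the example, not the actual bitbake code):
class OldDataSmart(object):
    # old-style bitbake API: 'expand' is a required positional argument
    def getVar(self, var, expand):
        return 'expanded value' if expand else 'raw value'

d = OldDataSmart()
d.getVar('FILE', True)    # old-style call: works
try:
    d.getVar('FILE')      # new-style call, as made by the Angstrom layers
except TypeError as e:
    print(e)              # on Python 2: getVar() takes exactly 3 arguments (2 given)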

plone.app.blob RuntimeError while migrating ATFile

I'm doing a blob migration on a Plone 3.2.1 site and I'm getting "RuntimeError: maximum recursion depth exceeded while calling a Python object" on some files during @@blob-file-migration.
I found this http://svn.eionet.europa.eu/projects/Zope/ticket/4190 and it looks like they solved this problem for images by creating a custom migrator.
Any clue? Traceback below.
File "/home/simahawk/dev/plone/plone3/projx/src/plone.app.blob/src/plone/app/blob/content.py", line 113, in setFile
mutator = self.getField('file').getMutator(self)
File "/home/simahawk/dev/plone/plone3/buildout/eggs/Products.Archetypes-1.5.10-py2.4.egg/Products/Archetypes/BaseObject.py", line 241, in getField
return self.Schema().get(key)
File "/home/simahawk/dev/plone/plone3/buildout/eggs/Products.Archetypes-1.5.10-py2.4.egg/Products/Archetypes/BaseObject.py", line 828, in Schema
schema = ISchema(self)
File "/home/simahawk/dev/plone/plone3/projx/parts/zope2/lib/python/zope/app/component/hooks.py", line 96, in adapter_hook
return siteinfo.adapter_hook(interface, object, name, default)
File "/home/simahawk/dev/plone/plone3/buildout/eggs/archetypes.schemaextender-2.1.1-py2.4.egg/archetypes/schemaextender/extender.py", line 143, in cachingInstanceSchemaFactory
key = IUUID(context, str(id(context)))
File "/home/simahawk/dev/plone/plone3/projx/parts/zope2/lib/python/zope/app/component/hooks.py", line 96, in adapter_hook
return siteinfo.adapter_hook(interface, object, name, default)
RuntimeError: maximum recursion depth exceeded in cmp
2013-03-06 10:16:49 INFO ATCT.migration Rolling back to last safe point
When migrating from Plone 3.x to Plone 4.x using Products.contentmigration I got the same error. It seemed there was a bug in the plone.app.blob migration. We made this custom migration to bypass the recursion error: http://svn.eionet.europa.eu/projects/Zope/browser/trunk/Products.EEAPloneAdmin/trunk/Products/EEAPloneAdmin/Extensions/ImageFS2Image.py?rev=29656
The problem is the archetypes.schemaextender version (2.1.1). Down-pinning to 1.6.0 solved the issue. This also solved a random KeyError on a 3.3.5 site. I think this is related to #12051 and #11396. These seem to be common problems with newer versions of archetypes.schemaextender, but the package's README has no info for Plone 3.x.
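In buildout terms the pin is just the following (assuming your buildout reads a [versions] section):
[versions]
archetypes.schemaextender = 1.6.0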

POSKeyError during migration

Migration from Plone 3.3.2 to Plone 4.2.1 fails with a POSKeyError. I've tried the recipes from this article: http://plonechix.blogspot.com/2009/12/definitive-guide-to-poskeyerror.html.
I've run the error_finder snippet, but it didn't give me any exceptions. I've also tried to fetch the object in the debugger using app.mysite._p_jar[p64(oid)] - also no success, it fails with the same error.
How can I delete the broken object or at least get more info about object (e.g. its class name or location)?
Full traceback:
POSKeyError('\x00\x00\x00\x00\x00\x0ey=',)
(Also, the following error occurred while attempting to render the standard error message, please see the event log for full details:
An operation previously failed, with traceback:
File "/Users/makmak/Plone/buildout-cache/eggs/Zope2-2.13.16-py2.7.egg/ZServer/PubCore/ZServerPublisher.py", line 31, in __init__
response=b)
File "/Users/makmak/Plone/buildout-cache/eggs/Zope2-2.13.16-py2.7.egg/ZPublisher/Publish.py", line 443, in publish_module
environ, debug, request, response)
File "/Users/makmak/Plone/buildout-cache/eggs/Zope2-2.13.16-py2.7.egg/ZPublisher/Publish.py", line 237, in publish_module_standard
response = publish(request, module_name, after_list, debug=debug)
File "/Users/makmak/Plone/buildout-cache/eggs/Zope2-2.13.16-py2.7.egg/ZPublisher/Publish.py", line 134, in publish
transactions_manager.commit()
File "/Users/makmak/Plone/buildout-cache/eggs/Zope2-2.13.16-py2.7.egg/Zope2/App/startup.py", line 301, in commit
transaction.commit()
File "/Users/makmak/Plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_manager.py", line 89, in commit
return self.get().commit()
File "/Users/makmak/Plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 336, in commit
t, v, tb = self._saveAndGetCommitishError()
File "/Users/makmak/Plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 329, in commit
self._commitResources()
File "/Users/makmak/Plone/buildout-cache/eggs/transaction-1.1.1-py2.7.egg/transaction/_transaction.py", line 443, in _commitResources
rm.commit(self)
File "/Users/makmak/Plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-macosx-10.4-x86_64.egg/ZODB/Connection.py", line 572, in commit
oid, serial, transaction)
File "/Users/makmak/Plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-macosx-10.4-x86_64.egg/ZODB/BaseStorage.py", line 416, in checkCurrentSerialInTransaction
committed_tid = self.getTid(oid)
File "/Users/makmak/Plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-macosx-10.4-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 770, in getTid
with self._lock:
File "/Users/makmak/Plone/buildout-cache/eggs/ZODB3-3.10.5-py2.7-macosx-10.4-x86_64.egg/ZODB/FileStorage/FileStorage.py", line 403, in _lookup_pos
raise POSKeyError(oid)
POSKeyError: 0x0e793d
You can use fsrefs.py (a script shipped with ZODB) to find the bad object.
A very short article on using it is: http://nathanvangheem.com/news/fixing-broken-zodb-object-references
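A typical invocation looks something like this; the storage path is an assumption, and the script should be run with your Zope/ZODB environment on the Python path so object classes can be resolved:
$ python fsrefs.py var/filestorage/Data.fs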
I believe this is the same issue I just ran into, which happens if a savepoint is rolled back that included adding an object to the catalog. I think this is a bug in the ZODB, but you can work around it by addressing whatever is rolling back a savepoint - in this case, the migration of files and images to blobs. So if you fix whatever is keeping those files or images from successfully migrating to blobs (or just delete them), it should succeed.