Running into uvloop issues with database queries from Rasa-X? - amazon-neptune

I'm trying to make a simple query to my Amazon Neptune database from Rasa-X.
Here is the code from my actions.py:
class ActionQueryDietary(Action):
    def name(self) -> Text:
        return "action_query_dietary"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: Dict[Text, Any]) -> List[Dict[Text, Any]]:
        available = False
        restaurant = "XXXX"
        dietaryQuestion = tracker.get_slot('dietaryQuestion')
        g, remoteConn = kb.openConnection()
        dietary = kb.getDietary(g, restaurant, dietaryQuestion)
        if dietary == "Yes":
            available = True
        if available:
            dispatcher.utter_message(text="According to our knowledge base, {} is on the menu".format(dietaryQuestion))
        else:
            dispatcher.utter_message(text="Sorry, according to our knowledge base, we don't have this option on the menu")
        kb.closeConnection(remoteConn)
        return []
and here is the code from knowledgebase.py:
def getDietary(g, restaurant, dietary):
    properties = queryRestaurantProperties(g, restaurant)
    result = properties[dietary]
    print(result)
    return result
but any query to the knowledgebase results in this error:
2020-10-22T18:01:22.347345241Z File "/opt/venv/lib/python3.7/site-packages/sanic/app.py", line 973, in handle_request
2020-10-22T18:01:22.347351643Z response = await response
2020-10-22T18:01:22.347357446Z File "/app/rasa_sdk/endpoint.py", line 102, in webhook
2020-10-22T18:01:22.347363522Z result = await executor.run(action_call)
2020-10-22T18:01:22.347369398Z File "/app/rasa_sdk/executor.py", line 392, in run
2020-10-22T18:01:22.347375473Z events = action(dispatcher, tracker, domain)
2020-10-22T18:01:22.347381348Z File "/app/actions/actions.py", line 33, in run
2020-10-22T18:01:22.347387063Z dietary = kb.getDietary(g, restaurant, dietaryQuestion)
2020-10-22T18:01:22.347392835Z File "/app/actions/knowledgebase.py", line 117, in getDietary
2020-10-22T18:01:22.347399112Z properties = queryRestaurantProperties(g, restaurant)
2020-10-22T18:01:22.347405111Z File "/app/actions/knowledgebase.py", line 86, in queryRestaurantProperties
2020-10-22T18:01:22.347411869Z result = g.V().has('name', restaurant).valueMap().next()
2020-10-22T18:01:22.347418048Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/process/traversal.py", line 89, in next
2020-10-22T18:01:22.347424293Z return self.__next__()
2020-10-22T18:01:22.347430069Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/process/traversal.py", line 48, in __next__
2020-10-22T18:01:22.347436333Z self.traversal_strategies.apply_strategies(self)
2020-10-22T18:01:22.347441940Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/process/traversal.py", line 573, in apply_strategies
2020-10-22T18:01:22.347447667Z traversal_strategy.apply(traversal)
2020-10-22T18:01:22.347453031Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/remote_connection.py", line 149, in apply
2020-10-22T18:01:22.347459352Z remote_traversal = self.remote_connection.submit(traversal.bytecode)
2020-10-22T18:01:22.347465069Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/driver_remote_connection.py", line 55, in submit
2020-10-22T18:01:22.347486749Z result_set = self._client.submit(bytecode)
2020-10-22T18:01:22.347493788Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/client.py", line 127, in submit
2020-10-22T18:01:22.347499424Z return self.submitAsync(message, bindings=bindings).result()
2020-10-22T18:01:22.347505093Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/client.py", line 146, in submitAsync
2020-10-22T18:01:22.347511092Z return conn.write(message)
2020-10-22T18:01:22.347516610Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/connection.py", line 55, in write
2020-10-22T18:01:22.347522673Z self.connect()
2020-10-22T18:01:22.347529987Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/connection.py", line 45, in connect
2020-10-22T18:01:22.347536097Z self._transport.connect(self._url, self._headers)
2020-10-22T18:01:22.347542222Z File "/opt/venv/lib/python3.7/site-packages/gremlin_python/driver/tornado/transport.py", line 36, in connect
2020-10-22T18:01:22.347547822Z lambda: websocket.websocket_connect(url))
2020-10-22T18:01:22.347553477Z File "/opt/venv/lib/python3.7/site-packages/tornado/ioloop.py", line 571, in run_sync
2020-10-22T18:01:22.347559295Z self.start()
2020-10-22T18:01:22.347564864Z File "/opt/venv/lib/python3.7/site-packages/tornado/platform/asyncio.py", line 132, in start
2020-10-22T18:01:22.347570978Z self.asyncio_loop.run_forever()
2020-10-22T18:01:22.347576693Z File "uvloop/loop.pyx", line 1351, in uvloop.loop.Loop.run_forever
2020-10-22T18:01:22.347582342Z File "uvloop/loop.pyx", line 484, in uvloop.loop.Loop._run
2020-10-22T18:01:22.347588222Z RuntimeError: Cannot run the event loop while another loop is running
I tried using nest_asyncio.apply(), but that resulted in this error:
ValueError: Can't patch loop of type <class 'uvloop.Loop'>
which, according to the docs, is simply a rule: uvloop loops cannot be patched.
So I don't really know how to proceed.
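For reference, openConnection() in knowledgebase.py presumably follows the standard gremlin-python pattern, something like the sketch below (the endpoint URL is a placeholder). Note that gremlin-python's default transport is Tornado-based, which is where it collides with Sanic's uvloop:

from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

def openConnection():
    # Placeholder endpoint; the real Neptune cluster endpoint goes here.
    remoteConn = DriverRemoteConnection('wss://your-neptune-endpoint:8182/gremlin', 'g')
    g = Graph().traversal().withRemote(remoteConn)
    return g, remoteConn

def closeConnection(remoteConn):
    remoteConn.close()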

Adding my comment above as an answer. In some cases it is necessary to downgrade the version of Tornado being used. There are issues you can run into if the event loop is already running when another component tries to start one. For now, downgrading to Tornado 4.5.1 alongside Gremlin Python should resolve the issue.
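For instance, pinning the version in the action server's Python environment (assuming a pip-based install), then restarting the action server:

pip install "tornado==4.5.1"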

Related

TensorFlow Federated (TFF) TypeError in tff.templates.IterativeProcess.next() when clients_per_round exceeds 99

I implemented a custom federated learning GAN training loop with TFF similar to this code by Google Research.
The client data for a particular training round is found using the following code snippet:
def client_dataset_fn():
    # Sample clients and data
    sampled_clients = np.random.choice(train_data.client_ids, size=cfg.clients_per_round, replace=False)
    datasets = [(next(client_gen_inputs_iterator),
                 train_data.create_tf_dataset_for_client(client_id).take(cfg.n_critic))
                for client_id in sampled_clients]
    return datasets

client_noise_inputs, client_real_data = zip(*client_dataset_fn())
This works perfectly up until cfg.clients_per_round is set to 99. When it is set to 100 or a larger value (with the total number of clients being larger of course), I receive the following error:
Traceback (most recent call last):
File "main.py", line 109, in main
metrics = run_single_trial(train_data, test_data, cfg)
File "/mnt/workspace/tff/GAN/federated/fedgan_main.py", line 73, in run_single_trial
metrics = train_loop(iterative_process, server_dataset_fn, client_dataset_fn, model, eval_hook_fn, cfg)
File "/mnt/workspace/tff/GAN/federated/fedgan_main.py", line 124, in train_loop
client_real_data)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/computation/function_utils.py", line 525, in __call__
return context.invoke(self, arg)
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f
return Retrying(*dargs, **dkw).call(f, *args, **kw)
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call
return attempt.get(self._wrap_exception)
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get
six.reraise(self.value[0], self.value[1], self.value[2])
File "/usr/local/lib/python3.6/dist-packages/six.py", line 703, in reraise
raise value
File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call
attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/execution_context.py", line 226, in invoke
_ingest(executor, unwrapped_arg, arg.type_signature)))
File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
return future.result()
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/tracing.py", line 396, in _wrapped
return await coro
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/execution_context.py", line 111, in _ingest
ingested = await asyncio.gather(*ingested)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/execution_context.py", line 116, in _ingest
return await executor.create_value(val, type_spec)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/tracing.py", line 201, in async_trace
result = await fn(*fn_args, **fn_kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/reference_resolving_executor.py", line 294, in create_value
value, type_spec))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/tracing.py", line 201, in async_trace
result = await fn(*fn_args, **fn_kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/thread_delegating_executor.py", line 111, in create_value
self._target_executor.create_value(value, type_spec))
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/thread_delegating_executor.py", line 105, in _delegate
result_value = await _delegate_with_trace_ctx(coro, self._event_loop)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/tracing.py", line 396, in _wrapped
return await coro
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/tracing.py", line 201, in async_trace
result = await fn(*fn_args, **fn_kwargs)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/federating_executor.py", line 394, in create_value
return await self._strategy.compute_federated_value(value, type_spec)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/core/impl/executors/federated_composing_strategy.py", line 279, in compute_federated_value
py_typecheck.check_type(value, list)
File "/usr/local/lib/python3.6/dist-packages/tensorflow_federated/python/common_libs/py_typecheck.py", line 41, in check_type
type_string(type_spec), type_string(type(target))))
TypeError: Expected list, found tuple.
During debugging, I looked at the target variable in the final line of the traceback and found it to be the aforementioned client_real_data and client_noise_inputs. Their types are in fact tuples, not lists; however, this does not change with different values of cfg.clients_per_round. The only usage of cfg.clients_per_round is shown above in the random choice.
I really cannot explain why this is happening; maybe somebody out there has experienced something similar and can help me out.
My used package versions are as follows:
Python 3.6.9 or 3.8.10 (checked both)
tensorflow 2.5.1
tensorflow-federated 0.19.0
retrying 1.3.3
six 1.15.0
As a workaround I now manually change the data type of client_noise_inputs and client_real_data using list(tuple_var), but I am still curious as to why a list is required.
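Concretely, the workaround is just the coercion right after sampling:

client_noise_inputs, client_real_data = zip(*client_dataset_fn())
# zip() yields tuples; coercing to lists avoids the check_type failure.
client_noise_inputs = list(client_noise_inputs)
client_real_data = list(client_real_data)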
(Copying and pasting from original on GitHub)
This seems to me to be an implementation distinction between the federated_composing_strategy and the federated_resolving_strategy. IIRC, by default we don't inject a composing executor into your stack until you hit 100 clients--which would be the source of this exciting mystery.
In particular, the composing strategy is programmed against the assumption that the incoming clients-placed value is represented as a list, whereas the resolving strategy codes against a much more flexible set of containers.
It's not wild to coerce your clients-placed value to a list--we also could extend the permitted representation of clients-placed values in the composing executor to match that in the resolving one, possibly pulling the appropriate logic to a shared place like here. I think it's a contribution we'd be very happy to accept if you're up for it!

How to store crawled data from Scrapy to FTP as csv?

My Scrapy settings.py:
from datetime import datetime
file_name = datetime.today().strftime('%Y-%m-%d_%H%M_')
save_name = file_name + 'Mobile_Nshopping'
FEED_URI = 'ftp://myusername:mypassword@ftp.mymail.com/uploads/%(save_name)s.csv'
When I run my spider with scrapy crawl my_project_name, I get the error below.
Do I have to create a pipeline?
\scrapy\extensions\feedexport.py:247: ScrapyDeprecationWarning: The `FEED_URI` and `FEED_FORMAT` settings have been deprecated in favor of the `FEEDS` setting. Please see the `FEEDS` setting docs for more details
exporter = cls(crawler)
Traceback (most recent call last):
File "c:\users\viren\appdata\local\programs\python\python38\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\viren\appdata\local\programs\python\python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\viren\AppData\Local\Programs\Python\Python38\Scripts\scrapy.exe\__main__.py", line 7, in <module>
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\cmdline.py", line 145, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\cmdline.py", line 100, in _run_print_help
func(*a, **kw)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\cmdline.py", line 153, in _run_command
cmd.run(args, opts)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\commands\crawl.py", line 22, in run
crawl_defer = self.crawler_process.crawl(spname, **opts.spargs)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 191, in crawl
crawler = self.create_crawler(crawler_or_spidercls)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 224, in create_crawler
return self._create_crawler(crawler_or_spidercls)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 229, in _create_crawler
return Crawler(spidercls, self.settings)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\crawler.py", line 72, in __init__
self.extensions = ExtensionManager.from_crawler(self)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\middleware.py", line 53, in from_crawler
return cls.from_settings(crawler.settings, crawler)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\middleware.py", line 35, in from_settings
mw = create_instance(mwcls, settings, crawler)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\utils\misc.py", line 167, in create_instance
instance = objcls.from_crawler(crawler, *args, **kwargs)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 247, in from_crawler
exporter = cls(crawler)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 282, in __init__
if not self._storage_supported(uri, feed_options):
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 427, in _storage_supported
self._get_storage(uri, feed_options)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 458, in _get_storage
instance = build_instance(feedcls.from_crawler, crawler)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 455, in build_instance
return build_storage(builder, uri, feed_options=feed_options, preargs=preargs)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 46, in build_storage
return builder(*preargs, uri, *args, **kwargs)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 201, in from_crawler
return build_storage(
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 46, in build_storage
return builder(*preargs, uri, *args, **kwargs)
File "c:\users\viren\appdata\local\programs\python\python38\lib\site-packages\scrapy\extensions\feedexport.py", line 192, in __init__
self.port = int(u.port or '21')
File "c:\users\viren\appdata\local\programs\python\python38\lib\urllib\parse.py", line 174, in port
raise ValueError(message) from None
ValueError: Port could not be cast to integer value as 'Edh=)9sd'
I don't know how to store the CSV on the FTP server.
Is the error coming because my password is an int?
Is there anything I forgot to write?
Do I have to create a pipeline?
Yes, you probably should create a pipeline. As shown in the Scrapy Architecture Diagram, the basic concept is this: requests are sent, responses come back and are processed by the spider, and finally the pipeline does something with the items returned by the spider. In your case, you could create a pipeline that saves the data in a CSV file and uploads it to an FTP server. See Scrapy's Item Pipeline documentation for more information.
I don't know how to store the CSV on the FTP server. Is the error coming because my password is an int? Is there anything I forgot to write?
I believe this is due to the deprecation error below (and shown at the top of the errors you provided):
ScrapyDeprecationWarning: The FEED_URI and FEED_FORMAT settings have been deprecated in favor of the FEEDS setting. Please see the FEEDS setting docs for more details.
Try replacing FEED_URI with FEEDS; see the Scrapy documentation on FEEDS.
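For example, the equivalent of your configuration with the FEEDS setting would look roughly like this in settings.py (credentials and host are your placeholders, and note the separator before the host must be @). Building the URI string directly is simpler here, because a %(save_name)s placeholder in a feed URI is filled from a spider attribute of that name, not from a settings.py variable:

from datetime import datetime

file_name = datetime.today().strftime('%Y-%m-%d_%H%M_')
save_name = file_name + 'Mobile_Nshopping'

# FEEDS maps each output URI to its feed options (format, encoding, ...).
FEEDS = {
    'ftp://myusername:mypassword@ftp.mymail.com/uploads/' + save_name + '.csv': {
        'format': 'csv',
    },
}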
You need to specify the port as well; you can set it in the settings.
See also the FTPFilesStore class definition from the Scrapy source:
class FTPFilesStore:

    FTP_USERNAME = None
    FTP_PASSWORD = None
    USE_ACTIVE_MODE = None

    def __init__(self, uri):
        if not uri.startswith("ftp://"):
            raise ValueError(f"Incorrect URI scheme in {uri}, expected 'ftp'")
        u = urlparse(uri)
        self.port = u.port
        self.host = u.hostname
        self.port = int(u.port or 21)
        self.username = u.username or self.FTP_USERNAME
        self.password = u.password or self.FTP_PASSWORD
        self.basedir = u.path.rstrip('/')
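Separately, if the password contains characters that are special in URLs (a likely cause of the port error above, where urlparse read part of the credentials as the port), percent-encode the credentials before building the URI. A minimal sketch with a hypothetical password:

from urllib.parse import quote

username = quote('myusername', safe='')
password = quote('my:we!rd@pass', safe='')  # hypothetical password with URL-special characters
FEEDS = {
    'ftp://{}:{}@ftp.mymail.com/uploads/output.csv'.format(username, password): {
        'format': 'csv',
    },
}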

Writing to BigQuery from within a ParDo function

I would like to call a beam.io.Write(beam.io.BigQuerySink(..)) operation from within a ParDo function to generate a separate BigQuery table for each key in the PCollection (I'm using the Python SDK). Here are two similar threads, which unfortunately didn't help:
1) https://stackoverflow.com/questions/31156774/about-key-grouping-with-groupbykey
2) Dynamic table name when writing to BQ from dataflow pipelines
When I execute the following code, the rows for the first key get inserted into BigQuery and then the pipeline fails with the error below. I would really appreciate any suggestions on what I'm doing wrong or on how to fix it.
Pipeline code:
rows = p | 'read_bq_table' >> beam.io.Read(beam.io.BigQuerySource(query=query))

class par_upload(beam.DoFn):
    def process(self, context):
        key, value = context.element
        ### This block causes issues ###
        value | 'write_to_bq' >> beam.io.Write(
            beam.io.BigQuerySink(
                'PROJECT-NAME:analytics.first_table',  # will be replaced by a dynamic name based on key
                schema=schema,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
            )
        )
        ### End block ######
        return [value]

### Following part works fine ###
filtered = (rows | 'filter_rows' >> beam.Filter(lambda row: row['topic'] == 'analytics')
                 | 'apply_projection' >> beam.Map(apply_projection, projection_fields)
                 | 'group_by_key' >> beam.GroupByKey()
                 | 'par_upload_to_bigquery' >> beam.ParDo(par_upload())
                 | 'flat_map' >> beam.FlatMap(lambda l: l)  # this step is just for testing
           )

### This part works fine if I comment out the 'write_to_bq' block above
filtered | 'write_to_bq' >> beam.io.Write(
    beam.io.BigQuerySink(
        'PROJECT-NAME:analytics.another_table',
        schema=schema,
        write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)
)
Error message:
INFO:oauth2client.client:Attempting refresh to obtain initial access_token
INFO:oauth2client.client:Attempting refresh to obtain initial access_token
INFO:root:Writing 1 rows to PROJECT-NAME:analytics.first_table table.
INFO:root:Final: Debug counters: {'element_counts': Counter({'CreatePInput0': 1, 'write_to_bq/native_write': 1})}
ERROR:root:Error while visiting par_upload_to_bigquery
Traceback (most recent call last):
File "split_events.py", line 137, in <module>
run()
File "split_events.py", line 132, in run
p.run()
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 159, in run
return self.runner.run(self)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 102, in run
super(DirectPipelineRunner, self).run(pipeline)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 98, in run
pipeline.visit(RunVisitor(self))
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 182, in visit
self._root_transform().visit(visitor, self, visited)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 419, in visit
part.visit(visitor, pipeline, visited)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 422, in visit
visitor.visit_transform(self)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 93, in visit_transform
self.runner.run_transform(transform_node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in run_transform
return m(transform_node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 98, in func_wrapper
func(self, pvalue, *args, **kwargs)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 180, in run_ParDo
runner.process(v)
File "apache_beam/runners/common.py", line 133, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4483)
File "apache_beam/runners/common.py", line 139, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4311)
File "apache_beam/runners/common.py", line 150, in apache_beam.runners.common.DoFnRunner.reraise_augmented (apache_beam/runners/common.c:4677)
File "apache_beam/runners/common.py", line 137, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4245)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/typehints/typecheck.py", line 149, in process
return self.run(self.dofn.process, context, args, kwargs)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/typehints/typecheck.py", line 134, in run
result = method(context, *args, **kwargs)
File "split_events.py", line 73, in process
create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 724, in __ror__
return self.transform.__ror__(pvalueish, self.label)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 445, in __ror__
return _MaterializePValues(cache).visit(result)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 105, in visit
return self._pvalue_cache.get_unwindowed_pvalue(node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 262, in get_unwindowed_pvalue
return [v.value for v in self.get_pvalue(pvalue)]
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 244, in get_pvalue
value_with_refcount = self._cache[self.key(pvalue)]
KeyError: "(4384177040, None) [while running 'par_upload_to_bigquery']"
Edit (after the first answer):
I didn't realise my value needs to be a PCollection.
I've changed my code to this (which is probably very inefficient):
key_pipe = p | 'pipe_' + key >> beam.Create(value)
key_pipe | 'write_' + key >> beam.io.Write(beam.io.BigQuerySink(..))
This now works fine locally, but not with BlockingDataflowPipelineRunner :-(
The pipeline fails with the following error:
JOB_MESSAGE_ERROR: (979394c29490e588): Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 474, in do_work
work_executor.execute()
File "dataflow_worker/executor.py", line 901, in dataflow_worker.executor.MapTaskExecutor.execute (dataflow_worker/executor.c:24331)
op.start()
File "dataflow_worker/executor.py", line 465, in dataflow_worker.executor.DoOperation.start (dataflow_worker/executor.c:14193)
def start(self):
File "dataflow_worker/executor.py", line 469, in dataflow_worker.executor.DoOperation.start (dataflow_worker/executor.c:13499)
fn, args, kwargs, tags_and_types, window_fn = (
ValueError: too many values to unpack (expected 5)
In the similar threads, the only suggestion for doing BigQuery write operations in a ParDo was to use the BigQuery API directly, or to use a client library.
The code you wrote puts the Dataflow transform beam.io.Write(beam.io.BigQuerySink(...)) inside a DoFn. Transforms are meant to be applied to a PCollection, like filtered in the working code example; that is not the case in the non-functioning code, which applies the transform to the plain Python value inside the DoFn.
I think the easiest option would be to take a look at the gcloud-python BigQuery function insert_data() and put this inside your ParDo.
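A rough sketch of that approach: call the BigQuery client directly inside the DoFn instead of applying a Beam sink to a plain value. It is shown here with the current google-cloud-bigquery client and DoFn signature for readability (the answer's insert_data() is the older gcloud-python equivalent); the class name, table naming, and error handling are illustrative:

import apache_beam as beam
from google.cloud import bigquery

class PerKeyBigQueryWriteFn(beam.DoFn):  # hypothetical name
    def process(self, element):
        key, rows = element
        rows = list(rows)  # materialize; GroupByKey yields an iterable of dicts
        # Created per element for brevity; in real code create the client
        # once per worker (e.g. in start_bundle) and reuse it.
        client = bigquery.Client(project='PROJECT-NAME')
        table_id = 'PROJECT-NAME.analytics.{}'.format(key)  # one table per key
        errors = client.insert_rows_json(table_id, rows)  # streaming insert
        if errors:
            raise RuntimeError('BigQuery insert failed: {}'.format(errors))
        yield rows

In the pipeline above, this DoFn would take the place of par_upload() in the 'par_upload_to_bigquery' step.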

TurboGears with gevent-socketio: request KeyError

I'm trying to use gevent-socketio with my TurboGears 2 website.
In the ini file I use:
[server:main]
use = egg:gevent-socketio#paster
transports = xhr-multipart, xhr-polling, websocket
host = 0.0.0.0
port = 8080
When I try to access the controller in the web browser:
@expose('wago.templates.test')
def index(self):
    socketio_manage(request.environ, {"/stat": StatNamespace}, request=request)
    return dict()
I get the following error:
Traceback (most recent call last):
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/wsgiapp.py", line 105, in __call__
response = self.wrapped_dispatch(controller, environ, context)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/wsgiapp.py", line 278, in dispatch
return controller(environ, context)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/controllers/dispatcher.py", line 132, in __call__
response = self._perform_call(context)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/controllers/dispatcher.py", line 113, in _perform_call
r = self._call(func, params, remainder=remainder, context=context)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/controllers/decoratedcontroller.py", line 120, in _call
output = controller_caller(context_config, bound_controller_callable, remainder, params)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/decorators.py", line 42, in _decorated_controller_caller
return application_controller_caller(tg_config, controller, remainder, params)
File "/home/pi/tgenv/lib/python2.7/site-packages/tg/configuration/app_config.py", line 124, in call_controller
return controller(*remainder, **params)
File "/home/pi/tgenv/WAGO/wago/controllers/root.py", line 13, in index
socketio_manage(request.environ, {"/stat": StatNamespace}, request=request)
File "/home/pi/tgenv/lib/python2.7/site-packages/socketio/__init__.py", line 67, in socketio_manage
socket = environ['socketio']
KeyError: 'socketio'
I used several Pyramid tutorials to introduce myself to gevent-socketio.
I tried older versions of TurboGears 2, gevent, and gevent-socketio, and I also tried this module, but I always get the same error.
I'm pretty new to sockets, so maybe I'm just missing something obvious.
gevent-socketio recognizes socket requests only at a specific URL (socket.io/1/).
Because TurboGears maps Python function names to URLs, it is not possible to use "." or "1" in the regular way. A simple solution:
@expose()
def _default(self, *args):
    args = list(args)
    if "socketio" in request.environ:
        pass  # do socketio stuff...
    else:
        abort(404)
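Putting the question and this answer together, the controller would look roughly like this (a sketch; StatNamespace and the BaseController base class come from the question's own setup):

from tg import expose, request, abort
from socketio import socketio_manage

class RootController(BaseController):  # BaseController from the app's lib.base

    @expose()
    def _default(self, *args):
        # gevent-socketio speaks on URLs under /socket.io/1/, which TurboGears
        # cannot map to a method name, so those requests are caught in _default.
        if "socketio" in request.environ:
            socketio_manage(request.environ, {"/stat": StatNamespace}, request=request)
            return dict()
        abort(404)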

hg push error: "username not specified in .hg/hgrc. Keyring will not be used"

I did the following:
hg clone ...somelink.to.repo.in.hg... Giga
cd Giga
ls (shows that giga.txt exists in the Giga directory)
vi giga.txt (made some changes)
hg commit -m "byte"
hg out (got the following error)
** unknown exception encountered, details follow
** report bug details to http://mercurial.selenic.com/bts/
** or mercurial@selenic.com
** Mercurial Distributed SCM (version 1.5)
** Extensions loaded: acl, bugzilla, children, churn, color, convert, extdiff, fetch, gpg, graphlog, hgcia, hgk, highlight, interhg, keyword, mercurial_keyring, mq, notify, pager, patchbomb, progress, purge, rebase, record, relink, schemes, share, transplant, zeroconf
Traceback (most recent call last):
File "/usr/bin/hg", line 27, in <module>
mercurial.dispatch.run()
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 16, in run
sys.exit(dispatch(sys.argv[1:]))
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 30, in dispatch
return _runcatch(u, args)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 47, in _runcatch
return _dispatch(ui, args)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 466, in _dispatch
return runcommand(lui, repo, cmd, fullargs, ui, options, d)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 336, in runcommand
ret = _runcommand(ui, options, cmd, d)
File "/usr/lib/python2.6/site-packages/mercurial/extensions.py", line 128, in wrap
return wrapper(origfn, *args, **kwargs)
File "/usr/lib/python2.6/site-packages/hgext/pager.py", line 66, in pagecmd
return orig(ui, options, cmd, cmdfunc)
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 517, in _runcommand
return checkargs()
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 471, in checkargs
return cmdfunc()
File "/usr/lib/python2.6/site-packages/mercurial/dispatch.py", line 465, in <lambda>
d = lambda: util.checksignature(func)(ui, *args, **cmdoptions)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/extensions.py", line 116, in wrap
util.checksignature(origfn), *args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/hgext/color.py", line 352, in nocolor
return orig(*args, **opts)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/extensions.py", line 116, in wrap
util.checksignature(origfn), *args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/hgext/mq.py", line 2648, in mqcommand
return orig(ui, repo, *args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/extensions.py", line 116, in wrap
util.checksignature(origfn), *args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/hgext/graphlog.py", line 365, in graph
return orig(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/util.py", line 401, in check
return func(*args, **kwargs)
File "/usr/lib/python2.6/site-packages/mercurial/commands.py", line 2275, in outgoing
other = hg.repository(cmdutil.remoteui(repo, opts), dest)
File "/usr/lib/python2.6/site-packages/mercurial/hg.py", line 82, in repository
repo = _lookup(path).instance(ui, path, create)
File "/usr/lib/python2.6/site-packages/mercurial/httprepo.py", line 271, in instance
inst.between([(nullid, nullid)])
File "/usr/lib/python2.6/site-packages/mercurial/httprepo.py", line 190, in between
d = self.do_read("between", pairs=n)
File "/usr/lib/python2.6/site-packages/mercurial/httprepo.py", line 134, in do_read
fp = self.do_cmd(cmd, **args)
File "/usr/lib/python2.6/site-packages/mercurial/httprepo.py", line 85, in do_cmd
resp = self.urlopener.open(req)
File "/usr/lib/python2.6/urllib2.py", line 397, in open
response = meth(req, response)
File "/usr/lib/python2.6/urllib2.py", line 510, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.6/urllib2.py", line 429, in error
result = self._call_chain(*args)
File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.6/urllib2.py", line 855, in http_error_401
url, req, headers)
File "build/bdist.linux-i686/egg/mercurial_keyring.py", line 339, in basic_http_error_auth_reqed
File "/usr/lib/python2.6/urllib2.py", line 833, in http_error_auth_reqed
return self.retry_http_basic_auth(host, req, realm)
File "/usr/lib/python2.6/urllib2.py", line 836, in retry_http_basic_auth
user, pw = self.passwd.find_user_password(realm, host)
File "build/bdist.linux-i686/egg/mercurial_keyring.py", line 333, in find_user_password
File "build/bdist.linux-i686/egg/mercurial_keyring.py", line 184, in find_auth
File "build/bdist.linux-i686/egg/mercurial_keyring.py", line 67, in get_http_password
File "/usr/local/lib/python2.6/site-packages/keyring/core.py", line 37, in get_password
return _keyring_backend.get_password(service_name, username)
File "/usr/local/lib/python2.6/site-packages/keyring/backend.py", line 143, in get_password
items = gnomekeyring.find_network_password_sync(username, service)
gnomekeyring.IOError
My ~/.hgrc (OpenSUSE machine)
[ui]
username=c123456 <Arun.Sangal@MyCompany.com>
[extensions]
mercurial_keyring = /root/mercurial_keyring.py
#[trusted]
#users = *
#groups = *
[extensions]
acl =
bugzilla =
children =
churn =
color =
convert =
eol = !
extdiff =
factotum = !
fetch =
gpg =
graphlog =
hgcia =
hgcr-gui-qt = !
hgk =
highlight =
interhg =
keyword =
largefiles = !
mercurial_keyring =
mq =
notify =
pager =
patchbomb =
perfarce = !
progress =
projrc = !
purge =
rebase =
record =
relink =
schemes =
...etc.
My local repository's hgrc (in the cloned folder on OpenSUSE, at Giga/.hg/hgrc) is:
[paths]
default = http://the.hg.server.com/hg/TestHgRepo1/
myrepo = http://the.hg.server.com/hg/TestHgRepo1/
[auth]
myrepo.schemes = http https
myrepo.prefix = the.hg.server.com/hg
myrepo.username = c123456
I tried everything, but this keyring thing is not working. I get prompted for credentials every time I run:
hg out
hg push
and other remote hg operations, but not when I run:
hg commit
Can someone please tell me what I'm missing here? I tried the same exercise on Windows with TortoiseHg, using C:...\mercurial.ini (the Windows counterpart of the Unix ~/.hgrc file), and made sure the cloned folder's .hg/hgrc file contains the same three [auth] lines, but Mercurial on Linux (OpenSUSE) and on Windows with TortoiseHg is still not working with the keyring.
It prompts me for user credentials again and again :((
Can someone please tell me what I should do to resolve this?
If you are prompted for user credentials multiple times in Mercurial, set up mercurial_keyring; then comes the question nobody has explained in an easy way:
How do I make [auth] xx.prefix = servername/hg_or_something work for all repositories under the servername/hg location, whether I use the server name, its IP, or its FQDN?
Final answer:
OK, I put this in ~/.hgrc (the hidden .hgrc file in the home directory on Linux/Unix), or for Windows users in the %UserProfile%/mercurial.ini or %HOME%/mercurial.ini file:
[auth]
default1.schemes = http https
default1.prefix = hg_merc_server/hg
default1.username = c123456
default2.schemes = http https
default2.prefix = hg_merc_server.company.com/hg
default2.username = c123456
default3.schemes = http https
default3.prefix = 10.211.222.321/hg
default3.username = c123456
Now I can check out using either the server name, its IP, or its FQDN.