Python3 library pyxb throwing NamespaceError that category has no elementBiding - python-3.8

I've started getting pyxb error while parsing a response. It worked before. The classes that parse the XML are made with pyxb from a the schema.
The XML is:
<result xmlns="http://payment.website.com/Schema/V2/Res">
<success>true</success>
<referenceId>reference_id_hash</referenceId>
<purchaseId>purchase_id_hash</purchaseId>
<returnType>REDIRECT</returnType>
<redirectUrl>https://payment.website.com/redirect/8f9359e15709e87bd0e4/MTAzYWQ5YWE4YjRjOGRjMjJ=</redirectUrl>
</result>
When I execute the CreateFromDocument I get an error that I cannot figure it out.
transaction_result = result.CreateFromDocument(response_content)
File "/usr/lib/python3.8/xml/sax/expatreader.py", line 111, in parse
xmlreader.IncrementalParser.parse(self, source)
File "/usr/lib/python3.8/xml/sax/xmlreader.py", line 125, in parse
self.feed(buffer)
File "/usr/lib/python3.8/xml/sax/expatreader.py", line 217, in feed
self._parser.Parse(data, isFinal)
File "../Modules/pyexpat.c", line 407, in StartElement
File "/usr/lib/python3.8/xml/sax/expatreader.py", line 369, in start_element_ns
self._cont_handler.startElementNS(pair, None,
File "/opt/venv/lib/python3.8/site-packages/pyxb/binding/saxer.py", line 324, in startElementNS
element_binding = name_en.elementBinding()
File "/opt/venv//lib/python3.8/site-packages/pyxb/namespace/__init__.py", line 104, in __getattr__
category_value = ns.categoryMap(name).get(self.localName())
File "/opt/venv/lib/python3.8/site-packages/pyxb/namespace/__init__.py", line 309, in categoryMap
raise pyxb.NamespaceError(self, '%s has no category %s' % (self, category))
Exception
pyxb.exceptions_.NamespaceError: http://payment.website.com/Schema/V2/Res has no category elementBinding 6 0.006 0.364
How can I fix this?

Related

pandas 1.3.3 to_feather giving ArrowMemoryError

I have a dataset of size around 270MB and I use the following to write to feather file:
df.reset_index().to_feather(feather_path)
This gives me an error :
File "C:\apps\Python\lib\site-packages\pandas\util\_decorators.py", line 207, in wrapper
return func(*args, **kwargs)
File "C:\apps\Python\lib\site-packages\pandas\core\frame.py", line 2519, in to_feather
to_feather(self, path, **kwargs)
File "C:\apps\Python\lib\site-packages\pandas\io\feather_format.py", line 87, in to_feather
feather.write_feather(df, handles.handle, **kwargs)
File "C:\apps\Python\lib\site-packages\pyarrow\feather.py", line 152, in write_feather
table = Table.from_pandas(df, preserve_index=False)
File "pyarrow\table.pxi", line 1553, in pyarrow.lib.Table.from_pandas
File "C:\apps\Python\lib\site-packages\pyarrow\pandas_compat.py", line 607, in dataframe_to_arrays
arrays[i] = maybe_fut.result()
File "C:\apps\Python\lib\concurrent\futures\_base.py", line 438, in result
return self.__get_result()
File "C:\apps\Python\lib\concurrent\futures\_base.py", line 390, in __get_result
raise self._exception
File "C:\apps\Python\lib\concurrent\futures\thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
File "C:\apps\Python\lib\site-packages\pyarrow\pandas_compat.py", line 575, in convert_column
result = pa.array(col, type=type_, from_pandas=True, safe=safe)
File "pyarrow\array.pxi", line 302, in pyarrow.lib.array
File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 114, in pyarrow.lib.check_status
pyarrow.lib.ArrowMemoryError: realloc of size 3221225472 failed
Note : This works well in PyCharm. No issues writing the feather file.
But when the python program is called in a Windows batch file like:
call python "myprogram.py"
and when I schedule the batch file in a task using Task Scheduler it fails with above memory error.
PyArrow version is 5.0.0 if that helps.
Any ideas please?

Error while exciting the eval.py on TF object detection API

i'm trying to evaluate my model
using this command:
python eval.py --logtostderr --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config --checkpoint_dir=inference_graph --eval_dir=eval
and im getting this error
and I'm getting this error:
Traceback (most recent call last): File "eval.py", line 142, in tf.app.run() File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\tensorflow_core\python\platform\app.py", line 40, in run _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\absl\app.py", line 299, in run _run_main(main, args) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\absl\app.py", line 250, in _run_main sys.exit(main(argv)) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324, in new_func return func(*args, **kwargs) File "eval.py", line 138, in main graph_hook_fn=graph_rewriter_fn) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\legacy\evaluator.py", line 274, in evaluate evaluator_list = get_evaluators(eval_config, categories) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\legacy\evaluator.py", line 166, in get_evaluators EVAL_METRICS_CLASS_DICTeval_metric_fn_key) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\utils\object_detection_evaluation.py", line 470, in init use_weighted_mean_ap=False) File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\utils\object_detection_evaluation.py", line 194, in init self._build_metric_names() File "C:\Users\mosta\Anaconda3\envs\mat\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\utils\object_detection_evaluation.py", line 213, in _build_metric_names category_name = unicode(category_name, 'utf-8') NameError: name 'unicode' is not defined
Hi there!
Python 3 renamed the unicode type to str, the old str type has been replaced by bytes.
Knowing this it makes sense that we're getting errors as parts of the TF Object Detection API are deprecated (written using Python 2.x)
See here for more explanation on how to upgrade the code to be compatible with Python 3.
I hope this helps!

Apache BEAM pipeline fails when writing TF Records - AttributeError: 'str' object has no attribute 'iteritems'

The issue started appearing over the weekend. For some reason, it feels to be a DataFlow issue.
Previously, I was able to execute the script and write TF records just fine. However, now, I am unable to initialize the computation graph to process the data.
The traceback is:
Traceback (most recent call last):
File "my_script.py", line 1492, in <module>
MyBeamClass()
File "my_script.py", line 402, in __init__
self.run()
File "my_script.py", line 514, in run
transform_fn_io.WriteTransformFn(path=self.JOB_DIR + '/transform/'))
File "/anaconda3/envs/ml27/lib/python2.7/site-packages/apache_beam/pipeline.py", line 426, in __exit__
self.run().wait_until_finish()
File "/anaconda3/envs/ml27/lib/python2.7/site-packages/apache_beam/runners/dataflow/dataflow_runner.py", line 1238, in wait_until_finish
(self.state, getattr(self._runner, 'last_error_msg', None)), self)
apache_beam.runners.dataflow.dataflow_runner.DataflowRuntimeException: Dataflow pipeline failed. State: FAILED, Error:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 649, in do_work
work_executor.execute()
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/executor.py", line 176, in execute
op.start()
File "apache_beam/runners/worker/operations.py", line 531, in apache_beam.runners.worker.operations.DoOperation.start
def start(self):
File "apache_beam/runners/worker/operations.py", line 532, in apache_beam.runners.worker.operations.DoOperation.start
with self.scoped_start_state:
File "apache_beam/runners/worker/operations.py", line 533, in apache_beam.runners.worker.operations.DoOperation.start
super(DoOperation, self).start()
File "apache_beam/runners/worker/operations.py", line 202, in apache_beam.runners.worker.operations.Operation.start
def start(self):
File "apache_beam/runners/worker/operations.py", line 206, in apache_beam.runners.worker.operations.Operation.start
self.setup()
File "apache_beam/runners/worker/operations.py", line 480, in apache_beam.runners.worker.operations.DoOperation.setup
with self.scoped_start_state:
File "apache_beam/runners/worker/operations.py", line 485, in apache_beam.runners.worker.operations.DoOperation.setup
pickler.loads(self.spec.serialized_fn))
File "/usr/local/lib/python2.7/dist-packages/apache_beam/internal/pickler.py", line 247, in loads
return dill.loads(s)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 317, in loads
return load(file, ignore)
File "/usr/local/lib/python2.7/dist-packages/dill/_dill.py", line 305, in load
obj = pik.load()
File "/usr/lib/python2.7/pickle.py", line 864, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1232, in load_build
for k, v in state.iteritems():
AttributeError: 'str' object has no attribute 'iteritems'
I am using tensorflow==1.13.1 and tensorflow-transform==0.9.0 and apache_beam==2.7.0
with beam.Pipeline(options=self.pipe_opt) as p:
with beam_impl.Context(temp_dir=self.google_cloud_options.temp_location):
# rest of the script
_ = (
transform_fn
| 'WriteTransformFn' >>
transform_fn_io.WriteTransformFn(path=self.JOB_DIR + '/transform/'))
I was experiencing the same error.
It seems to be triggered by a mismatch in the tensorflow-transform versions of your local (or master) machine and the workers one (specified in the setup.py file).
In my case I was running tensorflow-transform==0.13 on my local machine whereas the workers were running 0.8.
Downgrading the local version to 0.8 fixed the issue.

ParseError 1:1 Using Tensorflow with Bazel

System information
Running Python 3.6.4 on Windows
Describe the problem
I'm trying to run Tensorflow's lm_1b on sample mode, by inputting:
$ bazel-bin/lm_1b/lm_1b_eval --mode sample --prefix "I love that I" --pbtxt data/vocab-2016-09-10.txt --vocab_file data/vocab-2016-09-10.txt --ckpt 'data/ckpt-*'
But I get the error:
google.protobuf.text_format.ParseError: 1:1 : Expected identifier or number, got <.
Any help would really be appreciated
Source code / logs
Recovering graph.
Traceback (most recent call last):
File "\\?\C:\Users\snmsa\AppData\Local\Temp\Bazel.runfiles_9sq54ngc\runfiles\__main__\lm_1b\lm_1b_eval.py", line 308, in <module>
tf.app.run()
File "C:\Users\snmsa\Anaconda3\lib\site-packages\tensorflow\python\platform\app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "\\?\C:\Users\snmsa\AppData\Local\Temp\Bazel.runfiles_9sq54ngc\runfiles\__main__\lm_1b\lm_1b_eval.py", line 298, in main
_SampleModel(FLAGS.prefix, vocab)
File "\\?\C:\Users\snmsa\AppData\Local\Temp\Bazel.runfiles_9sq54ngc\runfiles\__main__\lm_1b\lm_1b_eval.py", line 174, in _SampleModel
sess, t = _LoadModel(FLAGS.pbtxt, FLAGS.ckpt)
File "\\?\C:\Users\snmsa\AppData\Local\Temp\Bazel.runfiles_9sq54ngc\runfiles\__main__\lm_1b\lm_1b_eval.py", line 89, in _LoadModel
text_format.Merge(s, gd)
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 533, in Merge
descriptor_pool=descriptor_pool)
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 587, in MergeLines
return parser.MergeLines(lines, message)
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 620, in MergeLines
self._ParseOrMerge(lines, message)
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 635, in _ParseOrMerge
self._MergeField(tokenizer, message)
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 679, in _MergeField
name = tokenizer.ConsumeIdentifierOrNumber()
File "C:\Users\snmsa\Anaconda3\lib\site-packages\google\protobuf\text_format.py", line 1152, in ConsumeIdentifierOrNumber
raise self.ParseError('Expected identifier or number, got %s.' % result)
google.protobuf.text_format.ParseError: 1:1 : Expected identifier or number, got <.
Your command line is wrong. It should be:
bazel-bin/lm_1b/lm_1b_eval --mode sample \
--prefix "I love that I" \
--pbtxt data/graph-2016-09-10.pbtxt \
...
You are passing a vocabulary file --pbtxt data/vocab-2016-09-10.txt where a serialized GraphDef file is expected.

lxml/pyquery: parse in a less strict way

I am using PyQuery to process a large amount of documents from the Web. PyQuery uses lxml to parse the HTML documents.
As a matter of fact, a lot of the documents are not valid HTML. As a consequence, those invalid documents cannot be successfully parsed by lxml, which prevents me from getting the information further. And the the following exceptions are raised quite often:
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 1201, in mainLoop
self.runUntilCurrent()
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/base.py", line 824, in runUntilCurrent
call.func(*call.args, **call.kw)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 382, in callback
self._startRunCallbacks(result)
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 490, in _startRunCallbacks
self._runCallbacks()
--- <exception caught here> ---
File "/usr/local/lib/python2.7/dist-packages/twisted/internet/defer.py", line 577, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/home/hxiao/hiit/crawl/crawl/spiders/basic.py", line 40, in parse
doc = pq(response.body)
File "/usr/local/lib/python2.7/dist-packages/pyquery/pyquery.py", line 226, in __init__
elements = fromstring(context, self.parser)
File "/usr/local/lib/python2.7/dist-packages/pyquery/pyquery.py", line 70, in fromstring
result = getattr(lxml.html, meth)(context)
File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 706, in fromstring
doc = document_fromstring(html, parser=parser, base_url=base_url, **kw)
File "/usr/local/lib/python2.7/dist-packages/lxml/html/__init__.py", line 600, in document_fromstring
value = etree.fromstring(html, parser, **kw)
File "lxml.etree.pyx", line 3032, in lxml.etree.fromstring (src/lxml/lxml.etree.c:68121)
File "parser.pxi", line 1786, in lxml.etree._parseMemoryDocument (src/lxml/lxml.etree.c:102470)
File "parser.pxi", line 1674, in lxml.etree._parseDoc (src/lxml/lxml.etree.c:101299)
File "parser.pxi", line 1074, in lxml.etree._BaseParser._parseDoc (src/lxml/lxml.etree.c:96481)
File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:91290)
File "parser.pxi", line 683, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:92476)
File "parser.pxi", line 631, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:91904)
lxml.etree.XMLSyntaxError: line 649: htmlParseEntityRef: expecting ';'
What I am asking:
I would like a way to let lxml to parse in a less strict way so that this invalidity can be ignored.
This answer may not be very helpful, but I investigated similar problem.
Maybe you can have a look at this tip of pyquery ?
http://pythonhosted.org/pyquery/tips.html