Blank sessionid cookie causes error in request.user - django-nonrel

An end user somehow ended up with a blank sessionid cookie (as in "sessionid=;"). This causes the following error (raised below the call to request.user) when using Django in conjunction with GAE:
File "/src/django/utils/functional.py", line 204, in inner
self._setup()
File "/src/django/utils/functional.py", line 270, in _setup
self._wrapped = self._setupfunc()
File "/src/django/contrib/auth/middleware.py", line 18, in <lambda>
request.user = SimpleLazyObject(lambda: get_user(request))
File "/src/django/contrib/auth/middleware.py", line 10, in get_user
request._cached_user = auth.get_user(request)
File "/src/django/contrib/auth/__init__.py", line 136, in get_user
user_id = request.session[SESSION_KEY]
File "/src/django/contrib/sessions/backends/base.py", line 44, in __getitem__
return self._session[key]
File "/src/django/contrib/sessions/backends/base.py", line 167, in _get_session
self._session_cache = self.load()
File "/src/django/contrib/sessions/backends/cached_db.py", line 39, in load
expire_date__gt=timezone.now()
File "/src/django/db/models/manager.py", line 143, in get
return self.get_query_set().get(*args, **kwargs)
File "/src/django/db/models/query.py", line 398, in get
num = len(clone)
File "/src/django/db/models/query.py", line 106, in __len__
self._result_cache = list(self.iterator())
File "/src/django/db/models/query.py", line 317, in iterator
for row in compiler.results_iter():
File "/src/djangotoolbox/db/basecompiler.py", line 375, in results_iter
results = self.build_query(fields).fetch(
File "/src/djangotoolbox/db/basecompiler.py", line 481, in build_query
query.add_filters(self.query.where)
File "/src/djangotoolbox/db/basecompiler.py", line 174, in add_filters
self.add_filters(child)
File "/src/djangotoolbox/db/basecompiler.py", line 176, in add_filters
field, lookup_type, value = self._decode_child(child)
File "/src/djangotoolbox/db/basecompiler.py", line 216, in _decode_child
lookup_type, value, field, annotation)
File "/src/djangotoolbox/db/basecompiler.py", line 254, in _normalize_lookup_value
return self.ops.value_for_db(value, field, lookup_type)
File "/src/djangoappengine/db/base.py", line 128, in value_for_db
return super_value_for_db(value, field, lookup)
File "/src/djangotoolbox/db/base.py", line 245, in value_for_db
field_kind, db_type, lookup)
File "/src/djangoappengine/db/base.py", line 160, in _value_for_db
raise DatabaseError("Only strings and positive integers "
DatabaseError: Only strings and positive integers may be used as keys on GAE.
This error does not occur if sessionid is set to some invalid non-empty value (such as "sessionid=garbage;"). I think it is related to the following contrast in behavior in a Python shell:
>>> Session.objects.filter(session_key='abc').exists()
0
>>> Session.objects.filter(session_key='').exists()
Traceback (most recent call last):
File "<console>", line 1, in <module>
File "/src/django/db/models/query.py", line 610, in exists
return self.query.has_results(using=self.db)
File "/src/django/db/models/sql/query.py", line 445, in has_results
return compiler.has_results()
File "/src/dbindexer/compiler.py", line 32, in has_results
return super(SQLCompiler, self).has_results()
File "/src/djangotoolbox/db/basecompiler.py", line 384, in has_results
return self.get_count(check_exists=True)
File "/src/djangotoolbox/db/basecompiler.py", line 468, in get_count
return self.build_query().count(high_mark)
File "/src/djangotoolbox/db/basecompiler.py", line 481, in build_query
query.add_filters(self.query.where)
File "/src/djangotoolbox/db/basecompiler.py", line 174, in add_filters
self.add_filters(child)
File "/src/djangotoolbox/db/basecompiler.py", line 176, in add_filters
field, lookup_type, value = self._decode_child(child)
File "/src/djangotoolbox/db/basecompiler.py", line 216, in _decode_child
lookup_type, value, field, annotation)
File "/src/djangotoolbox/db/basecompiler.py", line 254, in _normalize_lookup_value
return self.ops.value_for_db(value, field, lookup_type)
File "/src/djangoappengine/db/base.py", line 128, in value_for_db
return super_value_for_db(value, field, lookup)
File "/src/djangotoolbox/db/base.py", line 245, in value_for_db
field_kind, db_type, lookup)
File "/src/djangoappengine/db/base.py", line 160, in _value_for_db
raise DatabaseError("Only strings and positive integers "
DatabaseError: Only strings and positive integers may be used as keys on GAE.
Is this a djangoappengine or djangotoolbox bug, or a Django bug? What's the proper way to prevent this error and treat the user as unauthenticated?

Ok, I think I may have to add a middleware class to handle this special case, and place it directly after SessionMiddleware:
import logging

class EmptySessionMiddleware(object):

    def process_request(self, request):
        session = request.session
        if session.session_key is not None and len(session.session_key) == 0:
            logging.info('[EmptySessionMiddleware] setting empty session key to None')
            session._session_key = None
It's a weird special case, but the root of the problem is that Django's session middleware only checks for a None session key before looking it up in the database (it does not check for an empty string), and an empty-string primary-key query raises an exception in djangoappengine. I'm not sure there's another way to handle this case.
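For reference, a minimal sketch of how the ordering might look in settings.py; the dotted path to EmptySessionMiddleware is a placeholder for wherever the class above actually lives:
# settings.py (Django 1.x style) -- ordering sketch only
MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    'myapp.middleware.EmptySessionMiddleware',   # directly after SessionMiddleware
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    # ...
)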

Related

pyshark "TypeError: sequence item 6: expected str instance, _io.TextIOWrapper found"

I am using pyshark for live packet capture. When I pass the parameter output_file=myFileObject to save captures to a file, I get the following error on the sniffing line. If the output_file parameter is removed, this works absolutely fine. Please suggest.
My sample code:
import pyshark

def capturePacket():
    outputF = open('capturepcap.pcap', 'w')
    cap = pyshark.LiveCapture(interface='Ethernet 8', output_file=outputF)
    cap.sniff(timeout=60)
    outputF.close()
Error:
Traceback (most recent call last):
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "c:\Users\wxyz\.vscode\extensions\ms-python.python-2022.6.2\pythonFiles\lib\python\debugpy\__main__.py", line 45, in <module>
cli.main()
File "c:\Users\wxyz\.vscode\extensions\ms-python.python-2022.6.2\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 444, in main
run()
File "c:\Users\wxyz\.vscode\extensions\ms-python.python-2022.6.2\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 285, in run_file
runpy.run_path(target_as_str, run_name=compat.force_str("__main__"))
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 269, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 96, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "c:\Users\wxyz\Documents\automation\practice_set_script\paket_capture\basic_packetCapture.py", line 29, in <module>
capturePacket()
File "c:\Users\wxyz\Documents\automation\practice_set_script\paket_capture\basic_packetCapture.py", line 22, in capturePacket
cap.sniff(timeout=60)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\site-packages\pyshark\capture\capture.py", line 137, in load_packets
self.apply_on_packets(keep_packet, timeout=timeout, packet_count=packet_count)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\site-packages\pyshark\capture\capture.py", line 274, in apply_on_packets
return self.eventloop.run_until_complete(coro)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\asyncio\base_events.py", line 641, in run_until_complete
return future.result()
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\asyncio\tasks.py", line 445, in wait_for
return fut.result()
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\site-packages\pyshark\capture\capture.py", line 283, in packets_from_tshark
tshark_process = await self._get_tshark_process(packet_count=packet_count)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\site-packages\pyshark\capture\live_capture.py", line 94, in _get_tshark_process
tshark = await super(LiveCapture, self)._get_tshark_process(packet_count=packet_count, stdin=read)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\site-packages\pyshark\capture\capture.py", line 399, in _get_tshark_process
self._log.debug("Creating TShark subprocess with parameters: " + " ".join(parameters))
TypeError: sequence item 6: expected str instance, _io.TextIOWrapper found
Error on reading from the event loop self pipe
loop: <ProactorEventLoop running=True closed=False debug=False>
Traceback (most recent call last):
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\asyncio\proactor_events.py", line 779, in _loop_self_reading
f = self._proactor.recv(self._ssock, 4096)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\asyncio\windows_events.py", line 450, in recv
self._register_with_iocp(conn)
File "C:\Users\wxyz\AppData\Local\Programs\Python\Python310\lib\asyncio\windows_events.py", line 723, in _register_with_iocp
_overlapped.CreateIoCompletionPort(obj.fileno(), self._iocp, 0, 0)
OSError: [WinError 87] The parameter is incorrect
PS C:\Users\wxyz\Documents\automation\practice_set_script\paket_capture>
The issue in your code is these lines:
outputF = open('capturepcap.pcap', 'w')
cap = pyshark.LiveCapture(interface='Ethernet 8', output_file=outputF)
The output_file parameter expects a string (a path to a file), not an io.TextIOWrapper object:
:param output_file: A string of a file to write every read packet into (useful when filtering).
So this works:
import pyshark

def capturePacket():
    cap = pyshark.LiveCapture(interface='en0', output_file='capturepcap.pcap')
    cap.sniff(timeout=60)

capturePacket()
Here is a reference that I put together on using PyShark
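For completeness, a small follow-up sketch (assuming tshark is installed and the capture above has already written the file) that reads the saved pcap back with pyshark; FileCapture also takes a path string, just like output_file:
import pyshark

cap = pyshark.FileCapture('capturepcap.pcap')
for packet in cap:
    print(packet.highest_layer)
cap.close()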

indeed scrapy can't retrieve data

I run the command scrapy crawl indeed --logfile=crawl.log, but no log file is generated and I receive the following error. I tried to debug the code but couldn't see anything wrong.
File "C:\Anaconda3\Scripts\scrapy-script.py", line 10, in <module>
sys.exit(execute())
File "C:\Anaconda3\lib\site-packages\scrapy\cmdline.py", line 110, in execute
settings = get_project_settings()
File "C:\Anaconda3\lib\site-packages\scrapy\utils\project.py", line 68, in get_project_settings
settings.setmodule(settings_module_path, priority='project')
File "C:\Anaconda3\lib\site-packages\scrapy\settings\__init__.py", line 295, in setmodule
self.set(key, getattr(module, key), priority)
File "C:\Anaconda3\lib\site-packages\scrapy\settings\__init__.py", line 270, in set
self.attributes[name].set(value, priority)
File "C:\Anaconda3\lib\site-packages\scrapy\settings\__init__.py", line 55, in set
value = BaseSettings(value, priority=priority)
File "C:\Anaconda3\lib\site-packages\scrapy\settings\__init__.py", line 91, in __init__
self.update(values, priority)
File "C:\Anaconda3\lib\site-packages\scrapy\settings\__init__.py", line 327, in update
for name, value in six.iteritems(values):
File "C:\Anaconda3\lib\site-packages\six.py", line 587, in iteritems
return iter(d.items(**kw))
AttributeError: 'list' object has no attribute 'items'
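No answer was posted here, but the traceback shows Scrapy wrapping a project setting in BaseSettings and then failing when it calls items() on it. That happens when one of the dict-type settings in settings.py (for example ITEM_PIPELINES or DOWNLOADER_MIDDLEWARES) is written as a list; this is an assumption about the project, not something shown in the question, and the pipeline path below is hypothetical. A minimal sketch of the fix:
# settings.py

# A list here makes Scrapy's BaseSettings.update() call items() on it and fail:
# ITEM_PIPELINES = ['indeed.pipelines.IndeedPipeline']

# Dict-type settings must map the component path to an order value:
ITEM_PIPELINES = {
    'indeed.pipelines.IndeedPipeline': 300,
}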

Writing to BigQuery from within a ParDo function

I would like to call a beam.io.Write(beam.io.BigQuerySink(..)) operation from within a ParDo function to generate a separate BigQuery table for each key in the PCollection (I'm using the Python SDK). Here are two similar threads, which unfortunately didn't help:
1) https://stackoverflow.com/questions/31156774/about-key-grouping-with-groupbykey
2) Dynamic table name when writing to BQ from dataflow pipelines
When I execute the following code, the rows for the first key get inserted into BigQuery and then the pipeline fails with the error below. I would really appreciate any suggestions on what I'm doing wrong or how to fix it.
Pipeline code:
rows = p | 'read_bq_table' >> beam.io.Read(beam.io.BigQuerySource(query=query))

class par_upload(beam.DoFn):

    def process(self, context):
        key, value = context.element
        ### This block causes issues ###
        value | 'write_to_bq' >> beam.io.Write(
            beam.io.BigQuerySink(
                'PROJECT-NAME:analytics.first_table',  # will be replaced by a dynamic name based on key
                schema=schema,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
            )
        )
        ### End block ######
        return [value]

### Following part works fine ###
filtered = (rows | 'filter_rows' >> beam.Filter(lambda row: row['topic'] == 'analytics')
                 | 'apply_projection' >> beam.Map(apply_projection, projection_fields)
                 | 'group_by_key' >> beam.GroupByKey()
                 | 'par_upload_to_bigquery' >> beam.ParDo(par_upload())
                 | 'flat_map' >> beam.FlatMap(lambda l: l)  # this step is just for testing
            )

### This part works fine if I comment out the 'write_to_bq' block above
filtered | 'write_to_bq' >> beam.io.Write(
    beam.io.BigQuerySink(
        'PROJECT-NAME:analytics.another_table',
        schema=schema,
        write_disposition=beam.io.BigQueryDisposition.WRITE_TRUNCATE,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED)
)
Error message:
INFO:oauth2client.client:Attempting refresh to obtain initial access_token
INFO:oauth2client.client:Attempting refresh to obtain initial access_token
INFO:root:Writing 1 rows to PROJECT-NAME:analytics.first_table table.
INFO:root:Final: Debug counters: {'element_counts': Counter({'CreatePInput0': 1, 'write_to_bq/native_write': 1})}
ERROR:root:Error while visiting par_upload_to_bigquery
Traceback (most recent call last):
File "split_events.py", line 137, in <module>
run()
File "split_events.py", line 132, in run
p.run()
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 159, in run
return self.runner.run(self)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 102, in run
super(DirectPipelineRunner, self).run(pipeline)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 98, in run
pipeline.visit(RunVisitor(self))
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 182, in visit
self._root_transform().visit(visitor, self, visited)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 419, in visit
part.visit(visitor, pipeline, visited)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/pipeline.py", line 422, in visit
visitor.visit_transform(self)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 93, in visit_transform
self.runner.run_transform(transform_node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 168, in run_transform
return m(transform_node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 98, in func_wrapper
func(self, pvalue, *args, **kwargs)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/direct_runner.py", line 180, in run_ParDo
runner.process(v)
File "apache_beam/runners/common.py", line 133, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4483)
File "apache_beam/runners/common.py", line 139, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4311)
File "apache_beam/runners/common.py", line 150, in apache_beam.runners.common.DoFnRunner.reraise_augmented (apache_beam/runners/common.c:4677)
File "apache_beam/runners/common.py", line 137, in apache_beam.runners.common.DoFnRunner.process (apache_beam/runners/common.c:4245)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/typehints/typecheck.py", line 149, in process
return self.run(self.dofn.process, context, args, kwargs)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/typehints/typecheck.py", line 134, in run
result = method(context, *args, **kwargs)
File "split_events.py", line 73, in process
create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 724, in __ror__
return self.transform.__ror__(pvalueish, self.label)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 445, in __ror__
return _MaterializePValues(cache).visit(result)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/transforms/ptransform.py", line 105, in visit
return self._pvalue_cache.get_unwindowed_pvalue(node)
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 262, in get_unwindowed_pvalue
return [v.value for v in self.get_pvalue(pvalue)]
File "/Users/dimitri/anaconda/lib/python2.7/site-packages/apache_beam/runners/runner.py", line 244, in get_pvalue
value_with_refcount = self._cache[self.key(pvalue)]
KeyError: "(4384177040, None) [while running 'par_upload_to_bigquery']"
Edit (after the first answer):
I didn't realise my value needs to be a PCollection.
I've changed my code to this now (which is probably very inefficient):
key_pipe = p | 'pipe_' + key >> beam.Create(value)
key_pipe | 'write_' + key >> beam.io.Write(beam.io.BigQuerySink(..))
This now works fine locally, but not with BlockingDataflowPipelineRunner :-(
The pipeline fails with the following error:
JOB_MESSAGE_ERROR: (979394c29490e588): Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/dataflow_worker/batchworker.py", line 474, in do_work
work_executor.execute()
File "dataflow_worker/executor.py", line 901, in dataflow_worker.executor.MapTaskExecutor.execute (dataflow_worker/executor.c:24331)
op.start()
File "dataflow_worker/executor.py", line 465, in dataflow_worker.executor.DoOperation.start (dataflow_worker/executor.c:14193)
def start(self):
File "dataflow_worker/executor.py", line 469, in dataflow_worker.executor.DoOperation.start (dataflow_worker/executor.c:13499)
fn, args, kwargs, tags_and_types, window_fn = (
ValueError: too many values to unpack (expected 5)
In the similar threads, the only suggestion for doing BigQuery write operations in a ParDo was to use the BigQuery API directly, or to use a client library.
The code you wrote is applying a Dataflow write transform, beam.io.Write(beam.io.BigQuerySink(...)), inside a DoFn. A transform like that expects to be applied to a PCollection, as with filtered in the working part of your example, which is not the case for the non-working code applied to value.
I think the easiest option would be to take a look at the gcloud-python BigQuery function insert_data() and call it from inside your ParDo.
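As a rough illustration only (not tested; it assumes the older gcloud-python API where Table.insert_data() exists, that a table named after the key already exists, and that each grouped value is in the row format insert_data() expects), the DoFn could look roughly like this:
import logging

import apache_beam as beam
from gcloud import bigquery

class par_upload(beam.DoFn):

    def process(self, context):
        key, rows = context.element
        # Creating one client per element is wasteful; kept simple for the sketch.
        client = bigquery.Client(project='PROJECT-NAME')
        table = client.dataset('analytics').table(str(key))  # one table per key
        table.reload()                     # load the existing table schema
        errors = table.insert_data(rows)   # streaming insert; rows must match the schema
        if errors:
            logging.warning('BigQuery insert errors for %s: %s', key, errors)
        return [rows]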

Exception during RML report generation in OpenERP 7

I am getting exceptions during PDF generation. This happens sometimes, not always; sometimes the first exception occurs and sometimes the second.
Here are the stack traces capturing the problems. The first is almost identical to the exception identified for Django at Troubleshoot reportlab heisenbug, but that doesn't seem to help in my case.
First exception:
File "/home/openerp/clean_oe7/server/openerp/report/report_sxw.py", line 533, in create_single_pdf
pdf = create_doc(etree.tostring(processed_rml),rml_parser.localcontext,logo,title.encode('utf8'))
File "/home/openerp/clean_oe7/server/openerp/report/interface.py", line 206, in create_pdf
obj.render()
File "/home/openerp/clean_oe7/server/openerp/report/render/render.py", line 59, in render
self._result = self._render()
File "/home/openerp/clean_oe7/server/openerp/report/render/rml.py", line 41, in _render
return rml2pdf.parseNode(self.rml, self.localcontext, images=self.bin_datas, path=self.path,title=self.title)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 1032, in parseNode
r.render(fp)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 328, in render
pt_obj.render(el)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 1003, in render
fis += r.render(node_story)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 897, in render
return process_story(node_story)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 890, in process_story
flow = self._flowable(node)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 813, in _flowable
return self._table(node)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 697, in _table
fl = self._flowable(n, extra_style=paraStyle)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 766, in _flowable
result.append(platypus.Paragraph(i, style, **(utils.attr_get(node, [], {'bulletText':'str'}))))
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paragraph.py", line 827, in __init__
self._setup(text, style, bulletText, frags, cleanBlockQuotedText)
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paragraph.py", line 842, in _setup
style, frags, bulletTextFrags = _parser.parse(text,style)
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paraparser.py", line 1058, in parse
return self._complete_parse()
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paraparser.py", line 1061, in _complete_parse
del self._seq
AttributeError: ParaParser instance has no attribute '_seq'
File "/home/openerp/clean_oe7/server/openerp/report/report_sxw.py", line 442, in create
pdf = create_doc(etree.tostring(processed_rml),rml_parser.localcontext,logo,title.encode('utf8'))
File "/home/openerp/clean_oe7/server/openerp/report/interface.py", line 206, in create_pdf
obj.render()
File "/home/openerp/clean_oe7/server/openerp/report/render/render.py", line 59, in render
self._result = self._render()
File "/home/openerp/clean_oe7/server/openerp/report/render/rml.py", line 41, in _render
return rml2pdf.parseNode(self.rml, self.localcontext, images=self.bin_datas, path=self.path,title=self.title)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 1032, in parseNode
r.render(fp)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 328, in render
pt_obj.render(el)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 1003, in render
fis += r.render(node_story)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 897, in render
return process_story(node_story)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 890, in process_story
flow = self._flowable(node)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 813, in _flowable
return self._table(node)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 697, in _table
fl = self._flowable(n, extra_style=paraStyle)
File "/home/openerp/clean_oe7/server/openerp/report/render/rml2pdf/trml2pdf.py", line 766, in _flowable
result.append(platypus.Paragraph(i, style, **(utils.attr_get(node, [], {'bulletText':'str'}))))
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paragraph.py", line 827, in __init__
self._setup(text, style, bulletText, frags, cleanBlockQuotedText)
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paragraph.py", line 842, in _setup
style, frags, bulletTextFrags = _parser.parse(text,style)
File "/usr/lib/python2.7/dist-packages/reportlab/platypus/paraparser.py", line 1057, in parse
self.close() # force parsing to complete
File "/usr/lib/python2.7/dist-packages/reportlab/lib/xmllib.py", line 521, in close
self.parser.close()
AttributeError: 'NoneType' object has no attribute 'close'

Scrapy calling spider other than the one specified on the command line

(P6Svenv)malikarumi@Tetuoan2:~/Projects/P6/P6Svenv/test2/test2/spiders$ scrapy crawl zomd
Traceback (most recent call last):
File "/usr/bin/scrapy", line 9, in <module>
load_entry_point('Scrapy==1.0.3.post6-g2d688cd', 'console_scripts', 'scrapy')()
File "/usr/lib/pymodules/python2.7/scrapy/cmdline.py", line 142, in execute
cmd.crawler_process = CrawlerProcess(settings)
File "/usr/lib/pymodules/python2.7/scrapy/crawler.py", line 209, in __init__
super(CrawlerProcess, self).__init__(settings)
File "/usr/lib/pymodules/python2.7/scrapy/crawler.py", line 115, in __init__
self.spider_loader = _get_spider_loader(settings)
File "/usr/lib/pymodules/python2.7/scrapy/crawler.py", line 296, in _get_spider_loader
return loader_cls.from_settings(settings.frozencopy())
File "/usr/lib/pymodules/python2.7/scrapy/spiderloader.py", line 30, in from_settings
return cls(settings)
File "/usr/lib/pymodules/python2.7/scrapy/spiderloader.py", line 21, in __init__
for module in walk_modules(name):
File "/usr/lib/pymodules/python2.7/scrapy/utils/misc.py", line 71, in walk_modules
submod = import_module(fullpath)
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/home/malikarumi/Projects/P6/P6Svenv/test2/test2/spiders/t350_crawl.py", line 36
def parse_item(self, response):
^
IndentationError: unindent does not match any outer indentation level
Do you see it? Scrapy isn't even calling the spider I specified on the command line!
I see that super in the traceback, but all my t350 spiders are derived from CrawlSpider, and zomd is subclassed from scrapy.Spider. Why is this happening, and what do I do about it?
A spider's name is not the same as its file name; it is defined inside the spider file, as in the second line below:
class CAPjobSpider(Spider):
    name = "spider_name"
The above spider's name is "spider_name", even if the file is named "New_York.py". Note also that Scrapy imports every module under your spiders package when building the spider loader (the walk_modules call in the traceback), so a syntax or indentation error in any spider file, here t350_crawl.py, aborts the crawl no matter which spider you asked for.