How to open a remote .FTS.gz file with astropy.io.fits.open()?

Summary of a problem:
I am writing some code that checks the contents of FTS file headers (data saved from a telescope) using astropy.io.fits. My problem arises when I try to open .FTS.gz files instead of .FTS files on a remote server. When I open() a .FTS.gz file I get errors; if I gunzip the file first, all is good. One of the errors suggests the header is missing its END card. Searching online, I found a suggestion to pass ignore_missing_end=True to fits.open(), but then I get a different error. This second error suggests my FITS file is empty or corrupt, which is not the case: I can open it with SAOImage DS9 without any problems, and the online tool fitsverify reports no errors in the file. If I download the offending .FTS.gz file and run similar code that calls fits.open() on it locally, I get no errors at all. An example of an offending file (used in the code below) is now uploaded here.
The Astropy documentation says:
"Working with compressed files
The open() function will seamlessly open FITS files that have been compressed with gzip, bzip2 or pkzip. Note that in this context we’re talking about a fits file that has been compressed with one of these utilities - e.g. a .fits.gz file."
How do I open a remote .FTS.gz file without downloading it? I have hundreds of thousands of files like this, so downloading is not an option and it is not just one file that gives a problem, it is all of them.
Thanks,
Aina.
Code and errors:
CODE TO OPEN A REMOTE .FTS.gz FILE:
from astropy.io import fits
import paramiko
client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.load_system_host_keys()
client.connect('myhostname', username='myusername', password='mypassword')
apath = '/path/to/folder/to/search'
apattern = '"RUN0001.FTS.gz"'
rawcommand = 'find {path} -name {pattern}'
command = rawcommand.format(path=apath, pattern=apattern)
stdin, stdout, stderr = client.exec_command(command)
filelist = stdout.read().splitlines()
for i in filelist:
    sftp_client = client.open_sftp()
    remote_file = sftp_client.open(i)
    hdulist = fits.open(remote_file)
client.close()
ERROR:
Traceback (most recent call last):
File "/Users/amusaeva/Documents/PyCharm/FITSHeaders/stackoverflow.py", line 17, in <module>
hdulist = fits.open(remote_file)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 166, in fitsopen
lazy_load_hdus, **kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 404, in fromfile
lazy_load_hdus=lazy_load_hdus, **kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1040, in _readfrom
read_one = hdulist._read_next_hdu()
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1135, in _read_next_hdu
hdu = _BaseHDU.readfrom(fileobj, **kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/base.py", line 329, in readfrom
**kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/base.py", line 394, in _readfrom_internal
header = Header.fromfile(data, endcard=not ignore_missing_end)
File "/Library/Python/2.7/site-packages/astropy/io/fits/header.py", line 450, in fromfile
padding)[1]
File "/Library/Python/2.7/site-packages/astropy/io/fits/header.py", line 519, in _from_blocks
raise IOError('Header missing END card.')
IOError: Header missing END card.
Process finished with exit code 1
CHANGING THE CODE ABOVE FOR ONE LINE ONLY:
hdulist = fits.open(remote_file, ignore_missing_end=True)
ERROR:
WARNING: VerifyWarning: Error validating header for HDU #0 (note: Astropy uses zero-based indexing).
Header size is not multiple of 2880: 7738429
There may be extra bytes after the last HDU or the file is corrupted. [astropy.io.fits.hdu.hdulist]
Traceback (most recent call last):
File "/Users/amusaeva/Documents/PyCharm/FITSHeaders/stackoverflow.py", line 17, in <module>
hdulist = fits.open(remote_file, ignore_missing_end=True)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 166, in fitsopen
lazy_load_hdus, **kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 404, in fromfile
lazy_load_hdus=lazy_load_hdus, **kwargs)
File "/Library/Python/2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1044, in _readfrom
raise IOError('Empty or corrupt FITS file')
IOError: Empty or corrupt FITS file
Process finished with exit code 1
CODE TO OPEN THE OFFENDING .FTS.gz FILE LOCALLY PRODUCES NO ERRORS:
import os
from astropy.io import fits
folderTosearch = "/path/to/folder/to/search/locally"
for root, dirs, files in os.walk(folderTosearch):
    for file in files:
        if file.endswith("RUN0001.FTS.gz"):
            hdulist = fits.open(os.path.join(root, file))

This happens because the sftp call passes some variant of a file-like object (which has a .read() method that fits.open() will use).
The file-like object, however, is still a gzip file. Astropy checks whether a file is zipped only for file names, that is, when the argument to fits.open() is a string (that happens to be a path). Astropy does not appear to test a file-like object for the magic bytes that identify a byte stream as a gzip file. Oddly enough, it does do this verification when path strings are passed. Arguably, this may be a slight shortcoming in the astropy.io.fits module, but perhaps there's a reason for it.
(Disclaimer: the above conclusion is from scanning quickly through the relevant source code; I may have missed something. Hopefully people will correct me if so.)
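To make the "magic bytes" idea concrete, here is a minimal sketch of how a file-like object could be checked before it is handed to fits.open(). The helper name looks_gzipped is made up for illustration and is not part of astropy or paramiko, and the sketch assumes the object supports tell() and seek(). An in-memory BytesIO stands in for the remote file.

```python
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(fileobj):
    """Return True if a seekable binary file-like object starts with the gzip magic.

    Hypothetical helper for illustration; not an astropy or paramiko function.
    """
    start = fileobj.tell()
    try:
        return fileobj.read(len(GZIP_MAGIC)) == GZIP_MAGIC
    finally:
        fileobj.seek(start)  # leave the stream where we found it


# Stand-ins for the remote file: the same bytes, gzip-compressed and plain.
compressed = io.BytesIO(gzip.compress(b"SIMPLE  =                    T"))
plain = io.BytesIO(b"SIMPLE  =                    T")

print(looks_gzipped(compressed))  # True
print(looks_gzipped(plain))       # False
```

A check like this would let the same loop handle both .FTS and .FTS.gz files, decompressing only when needed.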
One solution is to do the unzipping yourself. I've cobbled up the following:
from cStringIO import StringIO
import zlib
<...>
for i in filelist:
    sftp_client = client.open_sftp()
    remote_file = sftp_client.open(i)
    decompressed = StringIO(
        zlib.decompress(remote_file.read(), zlib.MAX_WBITS|32))
    hdulist = fits.open(decompressed)
client.close()
Above, we're reading the full contents of the remote file (remote_file.read()), then uncompressing the contents. That results in a string, so we wrap it in a StringIO instance to make it a file-like object again, which we can pass to fits.open(). (For the zlib.MAX_WBITS|32 argument: see this answer.)
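The snippet above is Python 2 (cStringIO). On Python 3 the same idea might look like the sketch below, where io.BytesIO replaces StringIO because the decompressed data is bytes. The two helper names are invented for illustration, and an in-memory BytesIO stands in for sftp_client.open(path). The second helper wraps the stream in gzip.GzipFile so that data is decompressed as it is read instead of all at once; whether fits.open() accepts that wrapper directly is an assumption that has not been verified here.

```python
import gzip
import io
import zlib


def decompress_to_memory(fileobj):
    """Read the whole gzip stream and return the decompressed bytes as a file-like object."""
    # MAX_WBITS | 32 lets zlib auto-detect a gzip or zlib header.
    return io.BytesIO(zlib.decompress(fileobj.read(), zlib.MAX_WBITS | 32))


def decompress_lazily(fileobj):
    """Wrap the stream so that bytes are decompressed in chunks as they are read."""
    return gzip.GzipFile(fileobj=fileobj, mode="rb")


# Stand-in for sftp_client.open(path): two 80-byte header cards, gzip-compressed.
payload = b"SIMPLE  =                    T".ljust(80) + b"END".ljust(80)
remote_file = io.BytesIO(gzip.compress(payload))

in_memory = decompress_to_memory(remote_file)

remote_file.seek(0)
lazy = decompress_lazily(remote_file)
first_card = lazy.read(80)  # reads compressed data in chunks, not the whole file

# With astropy installed, the in-memory object can be passed on as in the answer:
#     hdulist = fits.open(in_memory)
```

For header-only checks over many large files, the lazy wrapper may avoid transferring each file in full, but that depends on how the consumer reads from it.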
Alternatively, you can sftp the file to local disk, and then read the file (with the local filename) locally. The above just keeps everything in memory.

Related

Incorrect context when trying to reload text

in order to develop my script in an external IDE, I'm trying to automatically reload a text in blender when it is modified from outside of Blender. I think I've got the idea of how it works, but there must be one tiny detail that completely blocks me from achieving my goal.
My script is the following:
import bpy
import sys
ctx = bpy.context.copy()
for area in ctx['screen'].areas:
    if area.type == 'TEXT_EDITOR':
        textEditor = area
for text in bpy.data.texts:
    if text.name == 'main.py':
        textEditor.spaces[0].text = text
bpy.ops.text.reload()
Output:
>>> exec(bpy.data.texts['reload.py'].as_string())
Traceback (most recent call last):
File "<blender_console>", line 1, in <module>
File "<string>", line 13, in <module>
File "/Applications/Blender.app/Contents/Resources/3.3/scripts/modules/bpy/ops.py", line 113, in __call__
ret = _op_call(self.idname_py(), None, kw)
RuntimeError: Operator bpy.ops.text.reload.poll() failed, context is incorrect
I cannot work out which part of the context is incorrect. I found no solution on the Internet; even the docs recommend reading the source code to understand what might cause this error, but I couldn't find it anywhere in the source code.
Can someone tell me what I am missing here please?

How to have multi-page table row cells converted properly using rst2pdf Sphinx?

I have a table in which one cell has so much data that it could span multiple pages in the generated PDF file. rst2pdf fails ungracefully when I feed it my file, with the following output:
[ERROR] pdfbuilder.py:161 Failed to build doc
Traceback (most recent call last):
File "/home/pwng/.local/lib/python3.9/site-packages/rst2pdf/pdfbuilder.py", line 158, in write
docwriter.write(doctree, destination)
File "/usr/lib/python3/dist-packages/docutils/writers/__init__.py", line 78, in write
self.translate()
File "/home/pwng/.local/lib/python3.9/site-packages/rst2pdf/pdfbuilder.py", line 697, in translate
createpdf.RstToPdf(
File "/home/pwng/.local/lib/python3.9/site-packages/rst2pdf/createpdf.py", line 689, in createPdf
pdfdoc.multiBuild(elements)
File "/usr/lib/python3/dist-packages/reportlab/platypus/doctemplate.py", line 1167, in multiBuild
self.build(tempStory, **buildKwds)
File "/usr/lib/python3/dist-packages/reportlab/platypus/doctemplate.py", line 1080, in build
self.handle_flowable(flowables)
File "/home/pwng/.local/lib/python3.9/site-packages/rst2pdf/createpdf.py", line 859, in handle_flowable
self.handle_frameEnd()
File "/usr/lib/python3/dist-packages/reportlab/platypus/doctemplate.py", line 726, in handle_frameEnd
self.handle_pageEnd()
File "/usr/lib/python3/dist-packages/reportlab/platypus/doctemplate.py", line 668, in handle_pageEnd
raise LayoutError(ident)
reportlab.platypus.doctemplate.LayoutError: More than 10 pages generated without content - halting layout. Likely that a flowable is too large for any frame.
FAILED
build succeeded.
and make latexpdf produces the undesirable output depicted in the following screenshot.
Is there a way to remedy this problem using either latexpdf or rst2pdf? Ideally, I would like a solution that works for both spaced text (i.e. space-separated words) and consecutive, wrapped non-separated text.

This isn't the answer you want, but rst2pdf won't split the cell across pages, so if it doesn't fit onto the page, it won't be able to generate the document. The project (I'm a maintainer) is open to patches, in case you end up fixing it yourself. I'm not aware of a workaround, other than reformatting the content to be more printable.

GNU Radio OOT block: AttributeError: module 'twoTypes' has no attribute 'passthrough_cc'

What I'm trying to do:
I am trying to write an OOT block for GNU Radio that accepts complex or byte values and just passes them through. (My final target is obviously to do some processing of the incoming stream, but it had so many errors that I had to go back to the basics.)
The error:
I receive this error at runtime
traceback (most recent call last):
File "/home/maisun/Desktop/asdf.py", line 193, in <module>
main()
File "/home/maisun/Desktop/asdf.py", line 171, in main
tb = top_block_cls()
File "/home/maisun/Desktop/asdf.py", line 82, in __init__
self.twoTypes_passthrough_0 = twoTypes.passthrough_cc(0)
AttributeError: module 'twoTypes' has no attribute 'passthrough_cc'
what I've tried:
I have looked into the source of GR itself and tried to correct my yaml files, my header files. To the best of my knowledge, I have defined 'passthrough_cc' in the twoTypes.h header file as,
typedef passthrough<std::uint8_t> passthrough_bb;
typedef passthrough<gr_complex> passthrough_cc;
Obviously I'm still doing something wrong here.
my questions:
I have 2 questions.
First, how can I correct the C++ code and the Python module so that I can call passthrough_cc without errors?
Second, I am more comfortable in C, so GR code is sometimes very confusing to me. The GR wiki has some nice guides, but does anyone know of a guide or blog post that discusses the workflow of GR? For example, when I trace my own code, I start from main() and follow the flow; with GR I always get lost.
full code:
https://github.com/maisunmonowar/gr-twoTypes

Odoo 12: 'report.label.report_label' AttributeError

I am using a third-party module in Odoo to do mass label printing (https://www.odoo.com/apps/modules/12.0/label/) and despite the fact that the module claims to be compatible with version 12, I am getting server errors when trying to run the pdf rendering:
Odoo Server Error
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/odoo/addons/web/controllers/main.py", line 1677, in report_download
response = self.report_routes(reportname, converter=converter, **dict(data))
File "/usr/lib/python3/dist-packages/odoo/http.py", line 517, in response_wrap
response = f(*args, **kw)
File "/usr/lib/python3/dist-packages/odoo/addons/web/controllers/main.py", line 1614, in report_routes
pdf = report.with_context(context).render_qweb_pdf(docids, data=data)[0]
File "/usr/lib/python3/dist-packages/odoo/addons/base/models/ir_actions_report.py", line 677, in render_qweb_pdf
html = self.with_context(context).render_qweb_html(res_ids, data=data)[0]
File "/usr/lib/python3/dist-packages/odoo/addons/base/models/ir_actions_report.py", line 710, in render_qweb_html
data = self._get_rendering_context(docids, data)
File "/usr/lib/python3/dist-packages/odoo/addons/base/models/ir_actions_report.py", line 723, in _get_rendering_context
data.update(report_model._get_report_values(docids, data=data))
AttributeError: 'report.label.report_label' object has no attribute '_get_report_values'
Screenshot:
It may be an error related to a change between Odoo versions (or not, I don’t really know).
Does anyone know if this attribute exists? I haven’t been able to find this information in the Odoo documentation (it doesn’t seem very complete regarding this topic).
Here are some screenshots of the configurations I’m using:
Thank you for your help!

From the traceback you shared in the first screenshot, the problem seems to be with the report_model variable, which is a reference to the report.label.report_label object. That object is defined in the label/report/dunamic_model.py file, which contains the method get_report_values. In the Odoo 12 community code, however, ir.actions.report looks for _get_report_values. The mismatch between these two method names is what causes the problem.
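The mismatch can be illustrated without an Odoo installation. The sketch below uses plain Python classes as hypothetical stand-ins (they are not Odoo code, and the class and function names are invented): one model exposes only get_report_values, the other adds _get_report_values and delegates to the old method. In the real module the equivalent change would presumably be to rename the method, or add such a wrapper, on the report model.

```python
class LegacyLabelReport:
    """Hypothetical stand-in for the module's report model (not Odoo code)."""

    def get_report_values(self, docids, data=None):
        return {"doc_ids": docids, "data": data}


class PatchedLabelReport(LegacyLabelReport):
    """Same model, plus the method name that the traceback shows being called."""

    def _get_report_values(self, docids, data=None):
        # Delegate to the old implementation under the new name.
        return self.get_report_values(docids, data=data)


def get_rendering_context(report_model, docids, data=None):
    """Mimics the failing call in the traceback: report_model._get_report_values(...)."""
    return report_model._get_report_values(docids, data=data)


try:
    get_rendering_context(LegacyLabelReport(), [1, 2])
except AttributeError as exc:
    print("legacy model fails:", exc)

print(get_rendering_context(PatchedLabelReport(), [1, 2]))
```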

Using fileinput.input() to read gzip files

I'm using fileinput to read some large data:
import gzip
import fileinput
f=gzip.open('/scratch/try.fastq.gz','r')
for line in fileinput.input(f):
    print line
However I got errors like:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/share/lib/python2.6/fileinput.py", line 253, in next
line = self.readline()
File "/share/lib/python2.6/fileinput.py", line 345, in readline
self._file = open(self._filename, self._mode)
IOError: [Errno 2] No such file or directory: '#HWI-ST150_0129:2:1:13466:2247#0/1\n'
Can't fileinput take a file object as input? If not, how do I use fileinput to deal with a gzip file?
thx

Nope, the first argument to fileinput.input should be a list of filenames. What you want can be achieved with
for line in gzip.open('/scratch/try.fastq.gz'):
    print line
fileinput exists to support the idiom where a program reads from a list of files, probably supplied on the command line, or standard input if no files have been specified. If you still want to use it, even though it's useless in your example, you should do
for line in fileinput.input(['/scratch/try.fastq.gz'], openhook=gzip.open):
    print line

As other sources have said, the value for openhook must be a function, but that doesn't mean you can't call a function to return a function. For example, if you want to support multiple different types of incoming files you could write something like this:
import fileinput
import gzip
def get_open_handler(compressed):
    if compressed:
        # mode comes in as 'r' by default, but that means binary to `gzip`
        return lambda file_name, mode: gzip.open(file_name, mode='rt')
    else:
        # the default mode of 'r' means text for `open`
        return open
# get args here
for line in fileinput.input(args.files, openhook=get_open_handler(args.compressed)):
    print(line)
As you can see, we are calling a function from openhook, but that function returns another function. In this case, we are fixing the mode of gzip.open, but we can do anything we want, including using functools.partial to bind some values to a function so that when the default filename and mode get passed to the function assigned to openhook, the function will do what you want.
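The functools.partial idea can be sketched as follows, assuming Python 3. The hook function open_text, its encoding parameter, and the sample files are all made up for illustration. fileinput calls the hook with the file name and mode, and partial pre-binds the extra keyword argument.

```python
import fileinput
import functools
import gzip
import os
import tempfile


def open_text(file_name, mode, encoding="utf-8"):
    """openhook candidate: fileinput calls it as open_text(file_name, mode)."""
    if file_name.endswith(".gz"):
        # Ignore the incoming mode; "rt" makes gzip yield str lines.
        return gzip.open(file_name, "rt", encoding=encoding)
    return open(file_name, "r", encoding=encoding)


# Bind the extra argument up front; fileinput still supplies the first two.
latin1_hook = functools.partial(open_text, encoding="latin-1")

# Build two small sample files (hypothetical data) to read back.
tmpdir = tempfile.mkdtemp()
gz_path = os.path.join(tmpdir, "reads.fastq.gz")
txt_path = os.path.join(tmpdir, "reads.txt")
with gzip.open(gz_path, "wt", encoding="latin-1") as fh:
    fh.write("@read1\nACGT\n")
with open(txt_path, "w", encoding="latin-1") as fh:
    fh.write("@read2\nTTGA\n")

with fileinput.input([gz_path, txt_path], openhook=latin1_hook) as stream:
    lines = [line.rstrip("\n") for line in stream]

print(lines)
```

Note that partial cannot be used to override mode itself here, because fileinput passes mode positionally; that is why the hook ignores it for .gz files instead.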
