Is there any spreadsheet program that supports reading HDF5 files?
Have you already tried HDFView?
Its tabular view is quite similar to a spreadsheet application. You can also save the data to a text file and then open it with a more standard spreadsheet application if you prefer:
http://www.hdfgroup.org/hdf-java-html/hdfview/UsersGuide/ug05spreadsheet.html#ug05save
You can download HDFView here:
http://www.hdfgroup.org/hdf-java-html/hdfview/index.html
I need some help extracting and manipulating data from a PDF.
The PDF in question is here: https://www.england.nhs.uk/wp-content/uploads/2018/04/national-tables-5-mgml-v3.pdf
What I want is to create a list of lists from the items in columns 1 and 3, like this one: oxalirange = ([5.75, 6.24], [6.25, 6.74], [6.75, 7.24],...
I know how to extract the PDF as an Excel table via Camelot and pandas. So far I have been compiling the list manually, so what I'd like to know is how to automate that with Python and pandas (or any other Python library).
I am happy to be pointed to the most relevant website so I can find the info myself.
Thanks in advance.
You can use the xlrd library in Python to read an Excel file. Here is a link to its documentation. However, it is limited to .xls files only (the old Excel format):
https://xlrd.readthedocs.io/en/latest/
Here is a list of alternative Excel-related libraries:
https://www.python-excel.org/
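For the list-of-lists part of the question, here is a minimal sketch. It assumes the table has already been read into plain rows and that the lower and upper band limits sit in the first and third columns. The helper name `band_ranges` and the sample rows are made up for illustration, not taken from the PDF.

```python
import math


def band_ranges(rows, lower_col=0, upper_col=2):
    """Return [[lower, upper], ...] from table rows, skipping non-numeric rows."""
    ranges = []
    for row in rows:
        try:
            lower = float(row[lower_col])
            upper = float(row[upper_col])
        except (ValueError, TypeError, IndexError):
            # Header rows, empty cells, or short rows are skipped.
            continue
        if math.isnan(lower) or math.isnan(upper):
            # Missing cells often arrive as NaN when the rows come from pandas.
            continue
        ranges.append([lower, upper])
    return ranges


# Hypothetical rows shaped like the extracted table (values are illustrative).
sample_rows = [
    ["From (mg)", "Band dose (mg)", "To (mg)"],
    ["5.75", "6.0", "6.24"],
    ["6.25", "6.5", "6.74"],
    ["6.75", "7.0", "7.24"],
]
oxalirange = band_ranges(sample_rows)
print(oxalirange)
```

If the table is already in a pandas DataFrame, something like `band_ranges(df.values.tolist())` should give the same kind of result, assuming the column positions match.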
I am working on a deep learning project, and the data was provided to me in a file with the ".data" extension. I am able to read the data from the file using the pandas "read_csv" function. I searched the web for information about this file type, but I am still not clear about its properties, usage, etc. Here are the questions I have:
What is the ".data" file?
How are they created? (I mean, are they exported from some application or database?)
Is reading the ".data" file with the pd.read_csv method the correct approach? (I tried read_table as well.)
Is there any other way to read the ".data" file?
Recently I found a solution for .data files using pandas:
import pandas as pd
data = pd.read_fwf("example.data")
For more details check here.
I just ran into a .data file in the wild myself. I've been able to view it in any text editor (Notepad, Visual Studio Code, JupyterLab, etc.). This helped determine what the separator should be. Mine was not tab-delimited as mrinali mentioned, but that's not to say that there aren't any tab-delimited .data files. Mine was space-delimited, so I just specified this as "sep" in pandas' read_csv() method:
pd.read_csv('<your_path>', sep=' ')
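If you'd rather not eyeball the separator in a text editor, a rough sketch like the one below can guess it from the first few lines. The helper name `guess_delimiter` and the candidate list are my own assumptions, and the check is deliberately simple: it only accepts a separator that splits every sampled line into the same number of fields.

```python
def guess_delimiter(lines, candidates=("\t", ",", ";", "|", " ")):
    """Return the first candidate that splits every non-blank line into the
    same number of fields (more than one), or None if no candidate fits."""
    sample = [line.rstrip("\r\n") for line in lines if line.strip()]
    for delim in candidates:
        counts = {len(line.split(delim)) for line in sample}
        # Exactly one distinct field count, and that count is greater than 1.
        if len(counts) == 1 and counts.pop() > 1:
            return delim
    return None


# Illustrative lines from a space-delimited file.
first_lines = ["5.1 3.5 1.4\n", "4.9 3.0 1.4\n"]
print(repr(guess_delimiter(first_lines)))
```

The result can then be passed as `sep` to `pd.read_csv`. A space is tried last because spaces often occur inside fields of files that are really tab- or comma-delimited. For files padded with runs of spaces, `sep=r"\s+"` is probably the better choice.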
A DATA file is a data file used by Analysis Studio, a statistical analysis and data mining program. It contains mined data in a plain text, tab-delimited format, including an Analysis Studio file header. DATA files are commonly used to store data for offline data analysis when not connected to an Analysis Studio server, but may also be used in online mode.
Due to their tab-delimited format, DATA files may be imported with the pandas read_csv function once their header information is stripped.
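As a rough illustration of stripping the header first, the sketch below drops a fixed number of leading lines and parses the rest as tab-delimited text. The number of header lines is an assumption you would have to check against your own file, and the function name and sample text are hypothetical.

```python
import csv


def read_tab_delimited(text, header_lines=0):
    """Parse tab-delimited text after dropping the first `header_lines` lines."""
    lines = text.splitlines()[header_lines:]
    return list(csv.reader(lines, delimiter="\t"))


# Made-up content: two header lines followed by tab-delimited rows.
sample_text = (
    "Hypothetical header line 1\n"
    "Hypothetical header line 2\n"
    "id\tvalue\n"
    "1\t3.5\n"
)
rows = read_tab_delimited(sample_text, header_lines=2)
print(rows)
```

With pandas, the equivalent would be along the lines of `pd.read_csv(path, sep="\t", skiprows=header_lines)`.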
HOW TO OPEN A .DATA FILE?
Launch a .data file, or any other file on your PC, by double-clicking it. If your file associations are set up correctly, the application that's meant to open your .data file will open it. It's possible you may need to download or purchase the correct application. It's also possible that you have the correct application on your PC, but .data files aren't yet associated with it. In this case, when you try to open a .data file, you can tell Windows which application is the correct one for that file. From then on, opening a .data file will open the correct application.
I am using Google Custom Search along with the XML API. From the documentation linked below, I can see that the XML API supports searches for .xls files, but what about .xlsx files? Half of our files are now in the newer .xlsx format, and we need them to turn up in our search results.
How does one search for .xlsx files with the XML API? This is not covered anywhere in the XML API documentation and searching for .xls files does not return any results for .xlsx files, when it should.
https://developers.google.com/custom-search/docs/xml_results
I figured it out. You can use filetype:xlsx even though the documentation does not include xlsx as a supported file type to search for.
Also, you can search for multiple file types. Here is the documentation on that: https://developers.google.com/custom-search/docs/xml_results#wsSpecialQueryTerms
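Here is a small sketch of how the query term could be built for both formats. It assumes that several filetype: terms can be combined with OR, as I understand the linked section on special query terms. The helper name and the search words are made up, and only the q parameter is shown, not the rest of the request URL.

```python
from urllib.parse import urlencode


def filetype_query(terms, extensions):
    """Append one filetype: filter per extension, joined with OR."""
    filters = " OR ".join(f"filetype:{ext}" for ext in extensions)
    return f"{terms} {filters}".strip()


query = filetype_query("quarterly budget", ["xls", "xlsx"])
encoded = urlencode({"q": query})  # URL-encoded form for the request
print(query)
print(encoded)
```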
I'm trying to import fields from a fillable PDF into a SQL database.
I can't seem to find an answer online:
What's the best way to import/read data from pdf files?
Insert a PDF file into Core Data?
http://www.utteraccess.com/forum/Import-Fillable-Pfd-Data-t1971535.html
So I'm wondering: does anyone know how to extract data from a fillable PDF into a database (or into Excel, from which it can be imported into a database)?
Thanks
Data from fillable PDFs can be exported into an .FDF file, which is a text file. pdftk is a command-line utility that will allow you to extract the data programmatically. You will then need to write a custom parser to pull the data out of the .FDF file.
It won't be a lot of fun, but it should be do-able.
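To give an idea of what such a parser might look like, here is a minimal sketch. It assumes the FDF was produced with something like `pdftk form.pdf generate_fdf output form.fdf`, that the fields are flat (no nested child fields), and that the values are simple literal strings or names without escaped parentheses. The function name and the sample text are illustrative only.

```python
import re

# Innermost "<< ... >>" dictionaries, i.e. blocks with no nested "<<" inside.
DICT_BLOCK = re.compile(r"<<((?:(?!<<).)*?)>>", re.DOTALL)
FIELD_NAME = re.compile(r"/T\s*\((.*?)\)", re.DOTALL)
# A value is either a literal string "(text)" or a name such as "/Yes".
FIELD_VALUE = re.compile(r"/V\s*(?:\((.*?)\)|/(\w+))", re.DOTALL)


def parse_fdf_fields(fdf_text):
    """Return {field_name: value} for each flat field dictionary in the text."""
    fields = {}
    for block in DICT_BLOCK.findall(fdf_text):
        name = FIELD_NAME.search(block)
        if not name:
            continue  # not a field dictionary
        value = FIELD_VALUE.search(block)
        if value is None:
            fields[name.group(1)] = ""
        elif value.group(1) is not None:
            fields[name.group(1)] = value.group(1)
        else:
            fields[name.group(1)] = value.group(2)
    return fields


# Hand-written sample in the general shape of an FDF file (assumed layout).
sample_fdf = (
    "%FDF-1.2\n1 0 obj\n<<\n/FDF\n<<\n/Fields [\n"
    "<<\n/V (Ada)\n/T (first_name)\n>>\n"
    "<<\n/V /Yes\n/T (subscribed)\n>>\n"
    "]\n>>\n>>\nendobj\n"
)
print(parse_fdf_fields(sample_fdf))
```

The resulting dictionary can then be written to a database row or a CSV file. Real FDF files may contain non-text bytes near the top, so reading them with `encoding="latin-1"` is probably the safer choice.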
You can use pdftk. I used it and it works like a charm. It takes a lot of coding, though. You can get back to me if you need any help.
I'd like to offer the possibility for users of my app to export to Excel. I don't ever need to read Excel files.
The three ways I know of right now are to:
make a CSV file, which isn't too great as I'd like to have some custom formatting in the spreadsheet
make an XML file, which I don't think people would recognize as an Excel file
make a template xlsx file, unzip it in the app, do a lot of search-and-replace in the files and then zip it back up again
Are there other alternatives? I'm not sure how widely supported .xlsx files are, and that approach seems like a lot of work. Are there any frameworks out there I can lean on, perhaps ones that can even make old-school .xls files?
Cheers
Nik
Some options for you to consider:
1) You may be able to use OOXML http://en.wikipedia.org/wiki/Office_Open_XML_file_formats. You may need the "Office Compatibility Pack" on computers with Excel 2003 or lower http://go.microsoft.com/?linkid=5754865.
2) Excel 2000 uses the BIFF file format: http://www.google.com/url?sa=t&source=web&cd=1&ved=0CBcQFjAA&url=http%3A%2F%2Fsc.openoffice.org%2Fexcelfileformat.pdf&ei=iDx0TKOhBIqmnQfckKy7CQ&usg=AFQjCNE2w4xyFSoKmvKdsa7O9TMqynYpbA (pdf). You may be able to create simple documents from the spec or based on other info on the web.
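For what it's worth, the template idea from the question (unzip, search-and-replace, zip again) can be sketched fairly compactly. The sketch below is in Python purely for illustration and would need porting to the app's own language. It assumes the text cells of the template live in xl/sharedStrings.xml and that each placeholder, such as the hypothetical {{customer}}, appears as one unbroken string in that file.

```python
import io
import zipfile
from xml.sax.saxutils import escape


def fill_template(template_bytes, replacements, member="xl/sharedStrings.xml"):
    """Return a new .xlsx (as bytes) with placeholders replaced in one XML part."""
    source = zipfile.ZipFile(io.BytesIO(template_bytes))
    output = io.BytesIO()
    with source, zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as target:
        for info in source.infolist():
            data = source.read(info.filename)
            if info.filename == member:
                text = data.decode("utf-8")
                for placeholder, value in replacements.items():
                    # Escape &, < and > so the XML stays well-formed.
                    text = text.replace(placeholder, escape(str(value)))
                data = text.encode("utf-8")
            # Every other part of the package is copied through unchanged.
            target.writestr(info.filename, data)
    return output.getvalue()
```

Usage would be something like `fill_template(open("template.xlsx", "rb").read(), {"{{customer}}": "Nik"})`, with the returned bytes written out as the exported file. Values inserted this way stay text cells, so numbers and custom number formats would need extra handling.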