Matplotlib's LaTeX run directory - matplotlib

Where is the LaTeX run directory for matplotlib? Are the LaTeX log files kept at all?
I use the pdflatex system to generate a ".pgf" plot that I can insert into my LaTeX document. Unfortunately, the Python traceback shows only a small part of the log file, which is not enough to solve the issue. I would like to take a look at the full log file. The traceback tells me the following:
! Dimension too large.
<to be read again>
\relax
l.995 \Gm@process
! ==> Fatal error occurred, no output PDF file produced!
Transcript written on figure.log.
and the following error:
shell returned 1
Since "\Gm" appears nowhere in my Python code, I need to look at the .tex file and the log to figure out what is going on. I searched my system for the file "figure.log", but it does not exist.

Related

Segmentation Error: Local Machine Fails (16gb) but AWS EC2 works (1gb)

I understand this is a little vague, but I am not sure where else to go or what else to debug. My Python script was running fine yesterday. I made minor changes today and now it only runs successfully on my Amazon LightSail (EC2) machine. Everything I read about segmentation errors says the cause is not enough memory, yet my local machine has 16 GB of RAM while the cloud machine only has 1 GB. I am also not working with big files: the files being imported and manipulated are typically under 2 MB, and there are about 7-10 of them.
I feel it may be something related to my terminal/zsh rather than my code.
Below is the error that I cannot manage to get around.
I've done enough research to find the Python faulthandler module (import faulthandler; faulthandler.enable()), which gives the debugging output below:
Fatal Python error: Segmentation fault
Current thread 0x000000010c58edc0 (most recent call first):
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1795 in <genexpr>
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1797 in <listcomp>
File "/Users/garrett/opt/anaconda3/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 1797 in count
File "GmailDownloader.py", line 215 in <module>
zsh: segmentation fault python *.py
The code consistently breaks on line 215 while trying to compute a groupby in pandas, but that groupby is very similar to others earlier in the code that succeed.
I am on macOS Catalina using the default zsh for my terminal, but even when I switch to good ol' bash using chsh -s /bin/bash and then run the code, I still get a zsh segmentation error.
I tried out PyCharm today and it asked for permission to store something in a bin folder, to which I just said yes. I'm not sure whether that is related.
The full code repository: https://github.com/GarrettMarkScott/AutomotiveCRMPuller
Ongoing list of other things I have tried:
Trashing the Terminal preferences (~/Library/Preferences/com.apple.Terminal.plist)
I almost threw in the towel, but then tried reinstalling pandas since it was mentioned in the traceback, and what do you know, it worked after running pip install --upgrade pandas
It would have been impossible without faulthandler! Hopefully this helps someone out there!
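For anyone who has not used it, here is a minimal sketch of switching faulthandler on at the top of a script. The helper name enable_crash_tracebacks is made up for illustration; only faulthandler.enable() and faulthandler.is_enabled() are standard-library calls.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=sys.stderr):
    """Turn on faulthandler so a segfault prints the Python stack.

    enable_crash_tracebacks is a hypothetical helper name; it only wraps
    the standard-library calls mentioned in the question.
    """
    if not faulthandler.is_enabled():
        # all_threads=True dumps every thread, not just the crashing one.
        faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


if __name__ == "__main__":
    print("faulthandler enabled:", enable_crash_tracebacks())
```

The same effect should be available without touching the code by starting the interpreter with python -X faulthandler GmailDownloader.py.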

Running tesseract 4.1 with openjpeg2 - cannot produce pdf output

I have installed on my RedHat machine:
(py36_maw) [rvp@lib-archcoll box]$ tesseract -v
tesseract 4.1.0
leptonica-1.78.0
libjpeg 6b (libjpeg-turbo 1.2.90) : libpng 1.5.13 : libtiff 4.0.3 : zlib 1.2.7 : libopenjp2 2.3.1
Found SSE
I try to run, per what docs I can find, to produce pdf output:
(py36_maw) [rvp@lib-archcoll box]$ time tesseract test.jp2 out -l eng PDF
read_params_file: Can't open PDF
Tesseract Open Source OCR Engine v4.1.0 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 275
That takes 10 seconds and produces the file out.txt, with a fine OCR-to-text conversion evident.
However, it tries to read a file called PDF, and I cannot figure out how to get PDF output.
I have read various docs. The most promising ones advise editing the config file, but the only docs I can find by googling 'tesseract 4.1 config' list many config variable names for older versions of tesseract, and none of them seems to let me request PDF output, much less specifically for tesseract 4.1.
How can I invoke tesseract 4.1 (using libopenjp2 2.3.1) via CLI to produce pdf output from my jp2 input file? Bonus question: how can I get it to produce both txt and pdf output in one run?
Robert
After more surfing and digging, assuming the reader also has done some and knows what TESSDATA_PREFIX is used for by tesseract, here are the steps that worked for me:
Download the pdf.ttf file from: https://github.com/tesseract-ocr/tesseract/blob/master/tessdata/pdf.ttf
Copy pdf.ttf to your directory $TESSDATA_PREFIX and make sure that variable is exported to your shell.
TIP: Use command: tesseract --print-parameters # to discover defined variable names you can use in your own config file
Go to your dir with the test.jp2 file and create file config with these lines.
tessedit_create_pdf 1 Write .pdf output file
tessedit_create_txt 1 Write .txt output file
(Note: or you may be able to put the config file in the TESSDATA_PREFIX directory as well and let it always be the default. Not tested.)
Run in that dir:
$ tesseract test.jp2 outputbase -l eng config
Verify your success: it runs and produces files outputbase.txt and outputbase.pdf. The txt file looks good and the searchable pdf looks and works OK in a pdf viewer, that is, you can search and find text strings.
Hope this helps someone else!
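The steps above can be scripted. The following is only a sketch: the function names are invented for illustration, the file name config and the output base outputbase come from the answer, and it assumes tesseract is on the PATH and pdf.ttf is already in $TESSDATA_PREFIX.

```python
from pathlib import Path

# The two parameters from the answer, written as "name value" lines.
CONFIG_LINES = ("tessedit_create_pdf 1", "tessedit_create_txt 1")


def write_tesseract_config(directory, name="config"):
    """Write a config file asking for both .pdf and .txt output."""
    path = Path(directory) / name
    path.write_text("\n".join(CONFIG_LINES) + "\n")
    return path


def build_tesseract_command(image, output_base, config_path, lang="eng"):
    """Build the argument list for the tesseract call shown in the answer."""
    return ["tesseract", str(image), str(output_base), "-l", lang, str(config_path)]


if __name__ == "__main__":
    # Only prints the command; pass the list to subprocess.run() to execute it.
    print(" ".join(build_tesseract_command("test.jp2", "outputbase", "config")))
```

Calling write_tesseract_config(".") in the directory that holds test.jp2 and then running the printed command should reproduce the manual steps.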

error Converting PDF to PNG - Python 3.6 and GhostScript

I am having a lot of trouble getting code to convert a PDF file to PNG with Python 3.6 on Windows 10.
I know what you are going to say: google it!
But nearly everything I've found was for Python 2.7, and some packages haven't been updated.
From what I've seen so far, the best way to do it is with Wand, right? (I installed ImageMagick beforehand.)
from wand.image import Image
# Converting first page into JPG
with Image(filename='0.pdf') as img:
    img.save(filename="/temp.jpg")
# Resizing this image
Here was my second error :
wand.exceptions.DelegateError: PDFDelegateFailed
`The system cannot find the file specified.' @ error/pdf.c/ReadPDFImage/809
So I read that I need Ghostscript. I installed it, but the package is for Python 2.7 and it doesn't work. I found python3-ghostscript 0.5.0: https://pypi.python.org/pypi/python3-ghostscript/0.5.0
New error :
RuntimeError: Can not find Ghostscript DLL in registry
So here I needed to install Ghostscript 9 :
https://www.ghostscript.com/download/gsdnld.html
First of all, it's not a GPL license... It's not even a package but a program. I don't know how I can use it in my future Python code...
And there is still an error:
RuntimeError: Can not find Ghostscript DLL in registry
and I can't find anything about it.
Ghostscript is licensed under the AGPL; the licence can be found in /Program Files (x86)/gs/gs9.21/doc. If you want the sources, they are available from the Ghostscript Git repository. Note that I'm assuming you are running on Windows, since you refer to the Registry.
If you install the prebuilt binary, it will create an entry in the Windows Registry. I assume that's what your Python code is looking for, but I can't be sure. You should make sure you install the version with the word size (32 or 64 bit) required by Python, if it cares.
You can, of course, simply run Ghostscript to render a PDF file and produce PNG output.
gswin32c -sDEVICE=png16m -sOutputFile=out%d.png input.pdf
This will create one file per page of the input PDF file. Use gswin64c for the 64-bit version.
You can alter the resolution of the output with the -r switch, e.g. -r300.
I presume you can simply fork a process from Python. Otherwise you'll have to get someone to tell you what the Python script is looking for in the Registry. Perhaps it's looking for a specific version of Ghostscript, or the 32-bit version, or something.
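Forking Ghostscript from Python, as suggested above, might look like the following sketch. The function names are hypothetical, and it assumes the Ghostscript executable (gswin64c here) is on the PATH; otherwise pass its full path as executable. The -dBATCH and -dNOPAUSE switches are added so Ghostscript exits when done rather than waiting at its prompt.

```python
import subprocess


def build_gs_png_command(pdf_path, output_pattern="out%d.png",
                         resolution=300, executable="gswin64c"):
    """Build a Ghostscript command line that renders each PDF page to PNG."""
    return [
        executable,
        "-dBATCH",    # exit after the last file instead of going interactive
        "-dNOPAUSE",  # do not wait for a keypress between pages
        "-sDEVICE=png16m",
        f"-r{resolution}",
        f"-sOutputFile={output_pattern}",
        str(pdf_path),
    ]


def pdf_to_png(pdf_path, **options):
    """Run Ghostscript; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_gs_png_command(pdf_path, **options), check=True)


if __name__ == "__main__":
    # Only prints the command; call pdf_to_png("input.pdf") to run it.
    print(build_gs_png_command("input.pdf"))
```

This route avoids the Registry lookup entirely, since the Python side never loads the Ghostscript DLL.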

Unable to open a saved Gephi project file

Recently I worked on a project done in the network visualization and analysis software Gephi, and I saved it with the ".gephi" extension. However, when I try to reopen the file, it gives the following error message:-
"The project file couldn't be opened. Please check the file has .gephi extension.
XMLStreamException - ParseError at [row,col]:[1,1]
Message: Premature end of file."
I'm a beginner in Gephi and only an amateur programmer. I do not understand this error message, and thus have no idea how to resolve it. I tried updating Gephi to the latest version. I also tried to open the file from within Gephi. Neither of those steps has resolved the problem. Can anyone help me out with this, please?
The error message "premature end of file" means that the XML file was not complete. I suppose that either the whole file or just the XML part of it is empty, so maybe the file got corrupted while saving.
Can you try to open the file with Notepad or a hex editor to verify that it has some content?
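If no hex editor is at hand, a few lines of Python can do the same check. This is only a sketch with a made-up function name: it reports the file size and the first bytes. A size of 0 would be consistent with a parse error at [row,col]:[1,1].

```python
from pathlib import Path


def describe_file(path, preview_bytes=16):
    """Return (size in bytes, first few raw bytes) of a file."""
    path = Path(path)
    with path.open("rb") as handle:
        head = handle.read(preview_bytes)
    return path.stat().st_size, head


# Example call (the file name is a placeholder for your own project file):
#     print(describe_file("project.gephi"))
```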
There must be some bug in Gephi's file writing or reading process.
To identify the problem, it would help if you could post a Gephi log file from when the error happens.
You can find the log file in the Gephi user directory (check http://wiki.gephi.org/index.php/Troubleshooting)
For example, in Windows 7 the path is C:\Users\Your_User\AppData\Roaming\.gephi\dev\var\log\messages.log
Also, if you can share the files, it will be easier to fix.
This could be related to an open bug where Java6 is used to save the gephi file and then Java7 is used to load the file, say on a different machine.
The jdk used by Gephi can be specified in /etc/gephi.conf or alternatively it can be specified as a parameter --jdkhome when launching Gephi.
The problem is with java and javac:
If you created your Gephi file with java-6-openjdk (for example) and then switch your Java to java-7-openjdk, this problem arises.
I fixed my Gephi by switching back to the same java and javac executables on Linux:
(In terminal)
sudo update-alternatives --config java
and then
(In terminal)
sudo update-alternatives --config javac
Hope this can help!

python-django Ghostscript apache problem

When I run my app, which converts PDF to PNG, from the Django server, the conversion works fine. But when I run it from an Apache server, I get this error: GhostscriptError: Fatal. Reading from the stderr of Ghostscript, it says
Initialization file gs_init.ps does not begin with an integer.
It seems to be an initialization error, but I have no idea how to fix it.
I'm using Ubuntu, by the way. The gs folder is in the PATH, so I'm not sure if that is causing the problem.
Here's my code that generates the images
import os
import ghostscript

def PDF_to_png(input, output):
    # Render every page at 300 dpi to <output>/<name>_<page>.png
    args = [
        "-dSAFER",
        "-dBATCH", "-dNOPAUSE", "-sDEVICE=png16m",
        "-r300",
        "-sOutputFile=" + os.path.join(output, input.file_name_without_extension) + "_%d.png",
        input
    ]
    ghostscript.Ghostscript(*args)
The error is telling you that the file gs_init.ps which is normally found in gs/Resource/Init/ is not valid. From the header of the file:
------------------------------------------------------------------------
% Interpreter library version number
% NOTE: the interpreter code requires that the first non-comment token
% in this file be an integer, and that it match the compiled-in version!
902
------------------------------------------------------------------------
You can build GS with the resources built in or on disk. I don't know which build you get with Ubuntu, but it sounds like there is a gs_init.ps in the GS path which has been damaged. This probably means you are using a version with the resources on disk.
You should first try just starting up Ghostscript. If that works, then the problem is something in the environment that differs when you run the failing instance. Look for environment variables which begin with GS_ (especially GS_LIB). You should also try explicitly telling GS where to look on the command line by including something like:
-I/usr/src/gs/Resource
This includes the specified directory as a search path for Ghostscript (NB: GS does not use the PATH environment variable). GS will search here for initialisation files first before proceeding to its fallback mechanism.
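Both suggestions can be tried from the Python side. The sketch below uses invented helper names: one lists the GS_ variables visible to the current process (worth logging from inside the Apache worker and comparing with the Django server), and the other adds an -I switch to an argument list just before the input file. The directory /usr/src/gs/Resource is only the example path from above, not necessarily where Ubuntu installs the resources.

```python
import os


def find_gs_environment(environ=None):
    """Return the environment variables whose names start with GS_."""
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items()
            if name.startswith("GS_")}


def with_search_path(args, resource_dir):
    """Return a copy of args with -I<resource_dir> placed before the last item.

    The last item is assumed to be the input file, so the switch stays
    among the options.
    """
    args = list(args)
    return args[:-1] + [f"-I{resource_dir}"] + args[-1:]


if __name__ == "__main__":
    print(find_gs_environment())
    print(with_search_path(["-dBATCH", "-dNOPAUSE", "input.pdf"],
                           "/usr/src/gs/Resource"))
```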