When I run my app, which converts PDF to PNG, from the Django server, the conversion works fine. But when I run it from an Apache server, I get this error: GhostscriptError: Fatal. Reading from the stderr of Ghostscript, it says
Initialization file gs_init.ps does not begin with an integer.
It looks like an initialization error to me, but I have no idea how to fix it.
I'm using Ubuntu, by the way. The gs folder is in the PATH, so I'm not sure whether that is causing the problem.
Here's my code that generates the images:
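import os
import ghostscript

def PDF_to_png(input, output):
    args = [
        # First entry is the program name; python-ghostscript's own examples
        # note that its actual value doesn't matter, so options start after it.
        "gs",
        "-dSAFER",
        "-dBATCH", "-dNOPAUSE", "-sDEVICE=png16m",
        "-r300",
        "-sOutputFile=" + os.path.join(output, input.file_name_without_extension) + "_%d.png",
        input
    ]
    ghostscript.Ghostscript(*args)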
The error is telling you that the file gs_init.ps which is normally found in gs/Resource/Init/ is not valid. From the header of the file:
------------------------------------------------------------------------
% Interpreter library version number
% NOTE: the interpreter code requires that the first non-comment token
% in this file be an integer, and that it match the compiled-in version!
902
------------------------------------------------------------------------
You can build GS with the resources either built in or on disk. I don't know which build you get with Ubuntu, but it sounds like there is a gs_init.ps on the GS search path that has been damaged or does not match the compiled-in version. This probably means you are using a build with the resources on disk.
You should first try just starting up Ghostscript. If that works, then the cause is something in the environment that differs when you run the failing instance. Look for environment variables that begin with GS_ (especially GS_LIB). You should also try telling GS explicitly where to look by adding something like this to the command line:
-I/usr/src/gs/Resource
The -I switch includes the specified directory as a search path for Ghostscript (NB: GS does not use the PATH environment variable). GS will search there for initialisation files first, before proceeding to its fallback mechanism.
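The two checks above can be sketched in Python. This is only an illustration: the helper names gs_environment and with_search_path are made up for this example, and the resource directory is the placeholder path from the answer, not necessarily where Ubuntu installs it.

```python
import os

def gs_environment(environ=None):
    """Return only the GS_* variables (GS_LIB, GS_OPTIONS, ...) from an environment."""
    environ = os.environ if environ is None else environ
    return {name: value for name, value in environ.items() if name.startswith("GS_")}

def with_search_path(args, resource_dir):
    """Return a copy of args with -I<resource_dir> placed right after the program name."""
    return [args[0], "-I" + resource_dir] + list(args[1:])

# Hypothetical usage: log this under both Django and Apache and compare the output.
print(gs_environment())
print(with_search_path(["gs", "-dBATCH", "-dNOPAUSE"], "/usr/src/gs/Resource"))
```

Logging the result of gs_environment() from inside the Apache process is the quickest way to see whether the two servers really run with different GS_ settings.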
I have installed on my RedHat machine:
(py36_maw) [rvp#lib-archcoll box]$ tesseract -v
tesseract 4.1.0
leptonica-1.78.0
libjpeg 6b (libjpeg-turbo 1.2.90) : libpng 1.5.13 : libtiff 4.0.3 : zlib 1.2.7 : libopenjp2 2.3.1
Found SSE
Following what docs I could find, I try to run this to produce PDF output:
(py36_maw) [rvp#lib-archcoll box]$ time tesseract test.jp2 out -l eng PDF
read_params_file: Can't open PDF
Tesseract Open Source OCR Engine v4.1.0 with Leptonica
Warning: Invalid resolution 0 dpi. Using 70 instead.
Estimating resolution as 275
That takes 10 seconds and produces file out.txt with fine OCR to text conversion evident.
However, it tries to read a file called PDF, and I cannot figure out how to get PDF output.
I have read various docs. The most promising ones advise editing the config file, but the only relevant-looking docs I could find (by googling 'tesseract 4.1 config') list many config variable names for older versions of tesseract, and none of them seems to let me request PDF output, much less specifically for tesseract 4.1.
How can I invoke tesseract 4.1 (using libopenjp2 2.3.1) via CLI to produce pdf output from my jp2 input file? Bonus question: how can I get it to produce both txt and pdf output in one run?
Robert
After more surfing and digging, assuming the reader also has done some and knows what TESSDATA_PREFIX is used for by tesseract, here are the steps that worked for me:
Download the pdf.ttf file from: https://github.com/tesseract-ocr/tesseract/blob/master/tessdata/pdf.ttf
Copy pdf.ttf to your $TESSDATA_PREFIX directory and make sure that variable is exported in your shell.
TIP: Use command: tesseract --print-parameters # to discover defined variable names you can use in your own config file
Go to the directory containing test.jp2 and create a file named config with these lines:
tessedit_create_pdf 1 Write .pdf output file
tessedit_create_txt 1 Write .txt output file
(Note: or you may be able to put the config file in the TESSDATA_PREFIX directory as well and let it always be the default. Not tested.)
Run in that dir:
$ tesseract test.jp2 outputbase -l eng config
Verify your success: it runs and produces files outputbase.txt and outputbase.pdf. The txt file looks good and the searchable pdf looks and works OK in a pdf viewer, that is, you can search and find text strings.
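For anyone scripting these steps, here is a minimal Python sketch. It assumes tesseract is on the PATH and that pdf.ttf is already in $TESSDATA_PREFIX; the function names are invented for this example, and the config file carries only the variable name and value.

```python
import subprocess
from pathlib import Path

# The two variables from the steps above, one "name value" pair per line.
CONFIG_LINES = ["tessedit_create_pdf 1", "tessedit_create_txt 1"]

def write_config(directory, name="config"):
    """Create the config file next to the input image and return its path."""
    path = Path(directory) / name
    path.write_text("\n".join(CONFIG_LINES) + "\n")
    return path

def tesseract_command(image, outputbase, config="config", lang="eng"):
    """Build the same command line as shown above, as an argument list."""
    return ["tesseract", str(image), str(outputbase), "-l", lang, str(config)]

def run_ocr(directory, image="test.jp2", outputbase="outputbase"):
    """Write the config and run tesseract from inside the image directory."""
    write_config(directory)
    subprocess.run(tesseract_command(image, outputbase), cwd=directory, check=True)
```

Calling run_ocr("/path/to/images") would then be expected to leave outputbase.txt and outputbase.pdf in that directory, as in the manual run.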
Hope this helps someone else!
I am trying to run an application which uses pagemap in gem5 FS mode.
But I am not able to use pagemap in gem5. It throws below error -
"assert(pagemap>=0) failed"
The line of code is:
int pagemap = open("/proc/self/pagemap", O_RDONLY);
assert(pagemap >= 0);
Also, if I try to run my application on the gem5 terminal with sudo, it throws this error:
sudo command not found
How can I use sudo in gem5?
These problems are not gem5 specific, but rather image / Linux specific, and would likely happen on any simulator or real hardware. So I recommend that you remove gem5 from the equation completely and ask a Linux or image specific question next time, saying exactly what image you are using, which kernel configs, and providing a minimal C example that reproduces the problem: this will greatly improve the probability that you will get help.
I have just done open("/proc/self/pagemap", O_RDONLY) successfully with: this program and on this fs.py setup on aarch64, see also these comments.
If /proc/<pid>/pagemap is not present for any file, do the following:
ensure that procfs is mounted on /proc. This is normally done with an fstab entry of type:
proc /proc proc defaults 0 0
but your init script needs to use fstab as well.
Alternatively, you can mount proc manually with:
mount -t proc proc /proc
you will likely want to ensure that /sys and /dev are mounted as well.
grep the kernel to see if there is some config controlling the file creation.
These kinds of things are often easy to find without knowing anything about the kernel.
If I do:
git grep '"pagemap'
to find the pagemap string, which is likely the creation point, on v4.18 this leads me to fs/proc/base.c, which contains:
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("pagemap", S_IRUSR, proc_pagemap_operations),
#endif
so make sure CONFIG_PROC_PAGE_MONITOR is set.
sudo: most embedded / simulator images don't have it. You just log in as root directly and can do anything by default without it. This can be seen from the conventional # in the prompt instead of $.
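If the image has Python available (an assumption, many minimal images do not), the checks above can be combined into a small diagnostic. The function name check_pagemap and its messages are made up for this sketch.

```python
import os

def check_pagemap(proc_root="/proc"):
    """Report why opening <proc_root>/self/pagemap might fail."""
    self_dir = os.path.join(proc_root, "self")
    pagemap = os.path.join(self_dir, "pagemap")
    if not os.path.isdir(self_dir):
        # No per-process directory at all: procfs is probably not mounted.
        return "procfs not mounted"
    if not os.path.exists(pagemap):
        # procfs is there but the file was never created by the kernel.
        return "pagemap missing (check CONFIG_PROC_PAGE_MONITOR)"
    try:
        fd = os.open(pagemap, os.O_RDONLY)
    except OSError as exc:
        return "open failed: " + str(exc)
    os.close(fd)
    return "ok"

if __name__ == "__main__":
    print(check_pagemap())
```

The three outcomes map onto the three causes discussed above: an unmounted /proc, a kernel built without the pagemap config, or a permissions problem at open time.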
I am having a lot of trouble getting code to convert a PDF file to PNG working on Python 3.6, Windows 10.
I know what you are going to say: google it!
But almost everything I've found was for Python 2.7, and some packages haven't been updated.
From what I've seen so far, the best way to do it is using Wand, right? (I installed ImageMagick beforehand.)
from wand.image import Image
# Converting first page into JPG
with Image(filename='0.pdf') as img:
    img.save(filename="/temp.jpg")
# Resizing this image
Here was my second error :
wand.exceptions.DelegateError: PDFDelegateFailed
`The system cannot find the file specified.' # error/pdf.c/ReadPDFImage/809
So I read that I need Ghostscript. I installed it, but the package is for Python 2.7 and it doesn't work. I found python3-ghostscript 0.5.0: https://pypi.python.org/pypi/python3-ghostscript/0.5.0
New error :
RuntimeError: Can not find Ghostscript DLL in registry
So here I needed to install Ghostscript 9 :
https://www.ghostscript.com/download/gsdnld.html
First of all, it's not a GPL license... And it's not even a package but a program. I don't know how I can use it in my future Python code...
And there is still an error:
RuntimeError: Can not find Ghostscript DLL in registry
and I can't find anything about it.
Ghostscript is licensed under the AGPL; the licence can be found in /Program Files (x86)/gs/gs9.21/doc. If you want the sources, they are available from the Ghostscript Git repository. Note that I'm assuming you are running on Windows, since you refer to the Registry.
If you install the prebuilt binary, it will create an entry in the Windows Registry. I assume that's what your Python code is looking for, but I can't be sure. You should make sure you install the version with the correct word size (32 or 64 bit) required by Python, if it cares.
You can, of course, simply run Ghostscript to render a PDF file and produce PNG output.
gswin32c -sDEVICE=png16m -sOutputFile=out%d.png input.pdf
This will create one file per page of the input PDF file. Use gswin64c for the 64-bit version.
You can alter the resolution of the output with the -r switch, e.g. -r300.
I presume you can simply fork a process from Python. Otherwise you'll have to get someone to tell you what the Python script is looking for in the Registry. Perhaps it's looking for a specific version of Ghostscript, or the 32-bit version, or something.
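Forking a process from Python could look roughly like the sketch below. It assumes gswin64c (or gswin32c) is on the PATH; the function names are invented here, and -dBATCH and -dNOPAUSE are added so that Ghostscript exits after the last page instead of waiting at its prompt.

```python
import subprocess

def gs_png_command(pdf_path, output_pattern="out%d.png", resolution=None, exe="gswin64c"):
    """Build the Ghostscript command line as an argument list."""
    cmd = [exe, "-dBATCH", "-dNOPAUSE", "-sDEVICE=png16m"]
    if resolution is not None:
        cmd.append("-r%d" % resolution)  # e.g. -r300
    cmd.append("-sOutputFile=" + output_pattern)  # %d is replaced by the page number
    cmd.append(pdf_path)
    return cmd

def pdf_to_png(pdf_path, **options):
    """Run Ghostscript and raise CalledProcessError if it exits with an error."""
    subprocess.run(gs_png_command(pdf_path, **options), check=True)
```

Because this calls the executable directly, it does not depend on the Registry lookup that the Python binding performs.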
I have decided to take on MCP and have downloaded it, however, when running the decompile.bat, it returns an error.
(I'm running 32-bit Windows 10)
Here is what it returned:
'"C:\Program Files\Java\jdk1.8.0_65\bin\java" -jar runtime\bin\fernflower.jar -din=1 -rbr=1 -dgs=1 -asc=1 -rsy=1 -iec=1 -jvn=1 -log=WARN "-e=jars\libraries\net/java/jinput\jinput\2.0.5\jinput-2.0.5.jar" "-e=jars\libraries\org/lwjgl/lwjgl\lwjgl-platform\2.9.4-nightly-20150209\lwjgl-platform-2.9.4-nightly-20150209-natives-windows.jar" "-e=jars\libraries\com/ibm/icu\icu4j-core-mojang\51.2\icu4j-core-mojang-51.2.jar" "-e=jars\libraries\tv/twitch\twitch-external-platform\4.5\twitch-external-platform-4.5-natives-windows-32.jar" "-e=jars\libraries\org/apache/httpcomponents\httpcore\4.3.2\httpcore-4.3.2.jar" "-e=jars\libraries\org/apache/logging/log4j\log4j-api\2.0-beta9\log4j-api-2.0-beta9.jar" "-e=jars\libraries\org/apache/commons\commons-lang3\3.3.2\commons-lang3-3.3.2.jar" "-e=jars\libraries\net/java/jutils\jutils\1.0.0\jutils-1.0.0.jar" "-e=jars\libraries\net/java/dev/jna\jna\3.4.0\jna-3.4.0.jar" "-e=jars\libraries\com/paulscode\libraryjavasound\20101123\libraryjavasound-20101123.jar" "-e=jars\libraries\net/sf/jopt-simple\jopt-simple\4.6\jopt-simple-4.6.jar" "-e=jars\libraries\com/google/guava\guava\17.0\guava-17.0.jar" "-e=jars\libraries\oshi-project\oshi-core\1.1\oshi-core-1.1.jar" "-e=jars\libraries\commons-logging\commons-logging\1.1.3\commons-logging-1.1.3.jar" "-e=jars\libraries\org/apache/commons\commons-compress\1.8.1\commons-compress-1.8.1.jar" "-e=jars\libraries\net/java/dev/jna\platform\3.4.0\platform-3.4.0.jar" "-e=jars\libraries\com/paulscode\codecjorbis\20101023\codecjorbis-20101023.jar" "-e=jars\libraries\com/paulscode\soundsystem\20120107\soundsystem-20120107.jar" "-e=jars\libraries\com/paulscode\librarylwjglopenal\20100824\librarylwjglopenal-20100824.jar" "-e=jars\libraries\org/lwjgl/lwjgl\lwjgl_util\2.9.4-nightly-20150209\lwjgl_util-2.9.4-nightly-20150209.jar" "-e=jars\libraries\commons-codec\commons-codec\1.9\commons-codec-1.9.jar" "-e=jars\libraries\org/apache/httpcomponents\httpclient\4.3.3\httpclient-4.3.3.jar" 
"-e=jars\libraries\org/lwjgl/lwjgl\lwjgl\2.9.4-nightly-20150209\lwjgl-2.9.4-nightly-20150209.jar" "-e=jars\libraries\commons-io\commons-io\2.4\commons-io-2.4.jar" "-e=jars\libraries\com/mojang\realms\1.7.39\realms-1.7.39.jar" "-e=jars\libraries\com/mojang\authlib\1.5.21\authlib-1.5.21.jar" "-e=jars\libraries\com/google/code/gson\gson\2.2.4\gson-2.2.4.jar" "-e=jars\libraries\tv/twitch\twitch\6.5\twitch-6.5.jar" "-e=jars\libraries\com/paulscode\codecwav\20101023\codecwav-20101023.jar" "-e=jars\libraries\tv/twitch\twitch-platform\6.5\twitch-platform-6.5-natives-windows-32.jar" "-e=jars\libraries\net/java/jinput\jinput-platform\2.0.5\jinput-platform-2.0.5-natives-windows.jar" "-e=jars\libraries\org/apache/logging/log4j\log4j-core\2.0-beta9\log4j-core-2.0-beta9.jar" "-e=jars\libraries\io/netty\netty-all\4.0.23.Final\netty-all-4.0.23.Final.jar" temp/minecraft_ff_in.jar temp\src\minecraft' failed : 1
Decompile failed
This is caused by the decompilation system running out of RAM. I'm not entirely sure why it's happening, but it also was happening to me.
If you're using Minecraft Forge's ForgeGradle, you can either edit the Gradle options file (.gradle/gradle.properties in your user folder) and add org.gradle.jvmargs=-Xmx2G to it, or you can set the GRADLE_OPTS variable to -Xmx2G (in a command prompt, run set GRADLE_OPTS=-Xmx2G and then gradlew setupDecompWorkspace).
However, given that you referenced decompile.bat, you are probably using MCP without Forge. (Which is fine, but Forge does make mods easier and more compatible; you may want to consider it if you're making a more permanent mod rather than just messing about.) In this case, you can edit MCP's configuration to increase the available RAM.
In the MCP folder, open the conf folder and then open mcp.cfg with a text editor of your choice. Then find this line (near the bottom):
CmdFernflower = %s -jar %s -din=1 -rbr=0 -dgs=1 -asc=1 -log=WARN {indir} {outdir}
and replace it with this:
CmdFernflower = %s -Xmx2G -jar %s -din=1 -rbr=0 -dgs=1 -asc=1 -log=WARN {indir} {outdir}
(You may need to change other lines as well by adding -Xmx2G before -jar, but in my experience that doesn't seem to be needed.)
This will run the decompiler with additional RAM.
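If you would rather script the edit than do it by hand, here is a small Python sketch. The function name add_heap_flag is made up for this example, and it only touches a line that starts with CmdFernflower and has no -Xmx option yet.

```python
def add_heap_flag(cfg_text, heap="-Xmx2G"):
    """Insert the heap option before -jar on the CmdFernflower line."""
    patched = []
    for line in cfg_text.splitlines():
        if line.startswith("CmdFernflower") and "-Xmx" not in line:
            # Only the first " -jar " belongs to the java invocation itself.
            line = line.replace(" -jar ", " " + heap + " -jar ", 1)
        patched.append(line)
    return "\n".join(patched) + "\n"

# Hypothetical usage from the MCP folder (back up conf/mcp.cfg first):
#   with open("conf/mcp.cfg") as f:
#       text = f.read()
#   with open("conf/mcp.cfg", "w") as f:
#       f.write(add_heap_flag(text))
```

Running it twice is harmless, since a line that already carries an -Xmx option is left alone.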
Alternatively, if you don't want to mess around with the MCP configuration, MCP910 doesn't seem to have this issue. It works with 1.8.0 instead of 1.8.8, but should still do everything you want.
I know this answer comes very late, but you should install the 64-bit version of Java. With the 32-bit version, it doesn't work.
I don't know if you can install this on your 32-Bit System, but you can try it. On my 86-Bit System (Windows 8) it works!
I'm trying to get Ghostscript to convert PDFs to PCL-5 (or 5e) using a driver that can be configured (the built-in drivers produce surprisingly large output, and I need something I can tweak).
I have gutenprint compiled, and have placed the ijsgutenprint executable at /home/marcintustin/webapps/django/oneclickcosvirt/bin/ijsgutenprint.5.2. When I try to invoke it with ghostscript with
gs -dBATCH -dNOPAUSE -dNOCIE -dSAFER -sDEVICE=ijs \
-sIjsServer=/home/marcintustin/webapps/django/oneclickcosvirt/bin/ijsgutenprint.5.2 \
-sDeviceManufacturer=vendor -sDeviceModel=name -sOutputFile=- - < sztst.pdf > sztst.pcl
I get the error GPL Ghostscript 8.70: Can't start ijs server "/home/marcintustin/webapps/django/oneclickcosvirt/bin/ijsgutenprint.5.2". I am mystified because the file is at the given location, is set executable, and can be invoked without error from the commandline. Any ideas on what's wrong / another way to solve this?
(I'm doing this on a shared host, to which I am not root, so I can't configure system-wide printing, and I'd prefer not to install any printing-related daemons unless absolutely necessary).
The issue was that gutenprint needs more than the ijsgutenprint.5.2 binary: the directory where the binary is installed must also contain a directory called .libs holding further files. (The .objects directory that is also generated during the build is not required in the installation.)
Take note if performing a manual install!
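A quick way to check a manual install for this layout is sketched below in Python. The function name check_ijs_install is invented for this example, and it only verifies what is described above: the binary exists, is executable, and has a .libs directory beside it.

```python
import os

def check_ijs_install(binary_path):
    """Return a list of problems found with a manually installed IJS server."""
    problems = []
    if not os.path.isfile(binary_path):
        problems.append("binary not found: " + binary_path)
    elif not os.access(binary_path, os.X_OK):
        problems.append("binary not executable: " + binary_path)
    libs_dir = os.path.join(os.path.dirname(binary_path), ".libs")
    if not os.path.isdir(libs_dir):
        problems.append("missing directory: " + libs_dir)
    return problems
```

An empty list means the layout matches what worked here; it does not prove that the contents of .libs are complete.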