Train Tesseract to label icons - python-tesseract

I'm trying to create training data for Tesseract 4.0 to identify icons (like, comment, share, save) in screenshots. This is a sample screenshot:
I would like to fine-tune Tesseract to produce output like this:
Like 147
Comment 29
Saved 5
Actions
58
Actions
Profile Visits 24
Follows 2
I followed, step by step, the instructions in https://pretius.com/how-to-prepare-training-files-for-tesseract-ocr-and-improve-characters-recognition/
I modified the box file as follows:
- Heart : Like
- Speech bubble: Comment
- Bookmark: Saved
- Arrow: Share
However, the resulting training data failed to read the icons the way I wanted. One of the errors I got was 'Like is not in unicharset'. Do I have to do something different when creating the unicharset for icons?

I've figured it out. The box editor expects a single letter or number per box, not a full word, so I used a Unicode character to represent each icon. The steps are as follows:
Crop all the icons you want Tesseract to detect and save them in one file, named (in my case) own.std.exp0.png.
Create the box file with the command 'tesseract own.std.exp0.png own.std.exp0 makebox'.
Open jTessBoxEditor and enter a Unicode character in the char column. The list of supported characters can be found in the Character Map program (https://sites.psu.edu/symbolcodes/windows/charmap/). Example: for the heart symbol I used U+2665. Note that some characters are not supported and show up as a blank square, so keep trying until you find one that works. My final edited box file looks like this.
Create the final training file, own.traineddata (as shown here https://medium.com/apegroup-texts/training-tesseract-for-labels-receipts-and-such-690f452e8f79, or by training with jTessBoxEditor).
Copy own.traineddata to the Tesseract/tessdata directory and run Tesseract with lang='own+eng'. I used pytesseract and the output is as below:
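The box-file edit in step 3 can also be scripted rather than done by hand in jTessBoxEditor. The following is only a minimal sketch: it assumes the usual box-file layout of one box per line ("glyph left bottom right top page"), that makebox produced exactly one box per icon, and that the icons appear in a known order. The function name relabel_boxes, the sample coordinates, and the second symbol (U+2605) are hypothetical; only U+2665 for the heart comes from the steps above.

```python
def relabel_boxes(box_lines, symbols):
    """Replace the glyph (first field) of each box line with a chosen symbol.

    box_lines: lines in the form "glyph left bottom right top page".
    symbols:   one replacement character per line, in the same order.
    """
    if len(box_lines) != len(symbols):
        raise ValueError("expected exactly one symbol per box line")
    relabelled = []
    for line, symbol in zip(box_lines, symbols):
        # Split from the right so the five numeric fields are always isolated,
        # whatever makebox guessed for the glyph.
        _, left, bottom, right, top, page = line.rstrip("\r\n").rsplit(" ", 5)
        relabelled.append(" ".join([symbol, left, bottom, right, top, page]))
    return relabelled


# Hypothetical makebox output for two cropped icons.
boxes = ["v 10 12 42 44 0", "O 60 12 92 44 0"]
# U+2665 (heart) is from the answer; U+2605 is only a stand-in for a second icon.
result = relabel_boxes(boxes, ["\u2665", "\u2605"])
print("\n".join(result))
```

When writing the result back to own.std.exp0.box, save it as UTF-8 so the symbols survive.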


How to add footer to pdf with pdfjam or pdftk?

I am using a shell script to modify many pdfs and would like to create a script that adds the page number (1 of X format) to the bottom of PDFs in a directory along with the text of the filename.
I tried using pdfjam with this format:
pdfjam --pagenumbering true
but it fails, saying that pagenumbering is undefined.
Are there any other recommendations for how to do this? I am OK with installing other tools, but would like this all to be within a shell script.
Thank you
tl;dr: pdfjam --pagecommand '' input.pdf
By default, pdfjam adds the following LaTeX command to every page: \thispagestyle{empty}. By changing the command to an empty command, the default plain page style is used, which consists of a page number at the bottom. Of course you may want to play with other styles or layout options to position the page number differently.
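To apply this to every PDF in a directory, the call can be wrapped in a small script. This is a rough sketch in Python rather than shell; the function names and the directory names are made up for illustration, and it assumes pdfjam's --outfile option is used to choose the output path. It only reproduces the command above, so it adds the plain page number and does not produce the "1 of X" format or the filename text.

```python
from pathlib import Path
import shutil
import subprocess


def build_pdfjam_command(pdf_path, out_dir):
    """Build the pdfjam call from the answer for a single input file."""
    pdf_path = Path(pdf_path)
    out_path = Path(out_dir) / pdf_path.name
    return [
        "pdfjam",
        "--pagecommand", "",  # empty page command, so the default \thispagestyle{empty} is not applied
        "--outfile", str(out_path),
        str(pdf_path),
    ]


def number_directory(in_dir, out_dir):
    """Run pdfjam on every PDF in in_dir and write the results to out_dir."""
    if shutil.which("pdfjam") is None:
        raise RuntimeError("pdfjam was not found on PATH")
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(in_dir).glob("*.pdf")):
        subprocess.run(build_pdfjam_command(pdf_path, out_dir), check=True)


# Show the command that would be run for one (hypothetical) file.
command = build_pdfjam_command("report.pdf", "numbered")
print(command)
```

Calling number_directory("pdfs", "numbered") would then process a whole folder, leaving the originals untouched.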

Creating image retention test in Builder view

I just downloaded PsychoPy this morning and have spent the day trying to figure out how to work with Builder view. I watched the YouTube video "Build your first PsychoPy experiment (Stroop task)" by Jon Peirce, in which he explains how to make a conditions file in Excel for his experiment. I want to make a very similar test in which images appear and subjects give a yes or no answer to each one (the correct answer is predefined). His conditions file has the columns 'word', 'colour' and 'corrANS'. Can I have an 'image' column instead of the 'word' column? I would like to list all my images in that column the same way I would list words, and pair each one with a correct answer of 'yes' or 'no'. We tried this and added the images to the conditions file, but the test has not run successfully, and we were hoping somebody could help us.
Thank you in advance.
P.S. We are not familiar with Python, or code in general, so we were hoping to get this running using Builder view.
EDIT: Here is the error message we receive when running the program:
#### Running: C:\Users\mr00004\Desktop\New folder\1_lastrun.py
4.8397 ERROR Couldn't find image file 'C:/Users/mr00004/Desktop/New folder/PPT Retention 1/ Slide102.JPG'; check path?
Traceback (most recent call last):
File "C:\Users\mr00004\Desktop\New folder\1_lastrun.py", line 174, in <module>
image.setImage(images)
File "C:\Program Files (x86)\PsychoPy2\lib\site-packages\psychopy-1.80.03-py2.7.egg\psychopy\visual\image.py", line 271, in setImage
maskParams=self.maskParams, forcePOW2=False)
File "C:\Program Files (x86)\PsychoPy2\lib\site-packages\psychopy-1.80.03-py2.7.egg\psychopy\visual\basevisual.py", line 652, in createTexture
% (tex, os.path.abspath(tex))#ensure we quit
OSError: Couldn't find image file 'C:/Users/mr00004/Desktop/New folder/PPT Retention 1/ Slide102.JPG'; check path? (tried: C:\Users\mr00004\Desktop\New folder\PPT Retention 1\ Slide102.JPG)
Yes, certainly, that is exactly how PsychoPy is designed to work. Simply place the image names in a column in your conditions file. You can then use the name of that column in the Builder Image component's "Image" field. The appropriate image file for a given trial will be selected.
It is difficult to help you further, though, as you haven't specified what went wrong. "we haven't had any success" doesn't give us much to go on.
Common problems:
(1) Make sure you use full filenames, including extensions (.jpg, .png, etc.). Windows often hides extensions by default, but Python needs them.
(2) Have the images in the right place. If you just use a bare filename (e.g. image01.jpg), then PsychoPy will expect that the file is in the same directory as your Builder .psyexp file. If you want to tidy the images away, you could put them in a subfolder. If so, you need to specify a relative path along with the filename (e.g. images/image01.jpg).
(3) Avoid full paths (starting at the root level of your disk): they are prone to errors, and stop the experiment being portable to different locations or computers.
(4) Regardless of platform, use forward slashes (/) not backslashes (\) in your paths.
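These checks can be automated before running the experiment. The sketch below is not part of PsychoPy; it assumes the conditions file has been saved as CSV, has a column named 'image', and sits in the same folder as the .psyexp file, so that relative paths resolve from there. The function name and the demo filenames are hypothetical. It also flags stray spaces around a path segment, since the traceback in the question shows a space before "Slide102.JPG".

```python
import csv
import os
import tempfile


def find_bad_image_paths(conditions_csv, column="image"):
    """Return (row_number, value, reason) for each suspicious image entry."""
    base_dir = os.path.dirname(os.path.abspath(conditions_csv))
    problems = []
    with open(conditions_csv, newline="") as handle:
        # Row 1 is the header, so data rows start at 2.
        for row_number, row in enumerate(csv.DictReader(handle), start=2):
            value = row[column]
            if any(part != part.strip() for part in value.split("/")):
                problems.append((row_number, value, "space around a path segment"))
            elif "\\" in value:
                problems.append((row_number, value, "backslash in path"))
            elif os.path.isabs(value):
                problems.append((row_number, value, "absolute path"))
            elif not os.path.isfile(os.path.join(base_dir, value)):
                problems.append((row_number, value, "file not found"))
    return problems


# Small self-contained demo with a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "cat.jpg"), "wb").close()
    conditions = os.path.join(folder, "conditions.csv")
    with open(conditions, "w", newline="") as handle:
        csv.writer(handle).writerows([
            ["image", "corrAns"],
            ["cat.jpg", "yes"],
            ["images/ dog.jpg", "no"],
            ["missing.jpg", "no"],
        ])
    problems = find_bad_image_paths(conditions)

for problem in problems:
    print(problem)
```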
Make a new folder on the H: drive and fill in the image column in PsychoPy with a full path, e.g. 'H:\psych\cat.jpg'. That works for me.

Generated corrupt large ply file - how to find the error

I just wrote a Java class that generates meshes from a list of cylinders and stores them in a PLY file. I tested it with a hand-made list of 3 cylinders, and I can open the resulting file in both MeshLab and CloudCompare.
When I use the class in my real program, I have to write a mesh for more than 13000 cylinders.
CloudCompare gives me the following error: Reading error (no access right?)
MeshLab gives this one: error details, unexpected eof
I have already checked that my PLY file contains exactly the number of vertices and faces defined in the header. I also made sure it contains no NaN values (I searched for 'n', 'a', etc. in winedit).
I can reproduce the errors with my test file of 3 hand-made cylinders by deleting its last line. But as mentioned, I have already checked that the line counts are correct (though an empty line may have escaped me, since scrolling through half a million lines is impossible).
So are there any programs that can check a PLY file for errors? Open source tools would be appreciated. Or are the files just too large? 436302 lines, to be exact. I use the ASCII version of PLY.
I found a non-open-source tool called NuGraf, which reports the line numbers of the corrupted lines.
Java seems to print NaN as '?'. I had not checked for that character, so the problem seems to be solved and I can go back to debugging my Java software.
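A check like this is also easy to script. The following is a small sketch of such a checker, not a full PLY parser: it assumes an ASCII PLY with one vertex or face per line, reads the element counts from the header, and reports empty lines, tokens that are not numbers (such as '?'), non-finite values, and a mismatch between the declared and actual number of data lines. The function name and the sample data are made up for illustration.

```python
import math


def check_ascii_ply(lines):
    """Return a list of (line_number, message) problems in an ASCII PLY file."""
    expected = 0
    body_start = None
    for index, line in enumerate(lines):
        tokens = line.split()
        if tokens[:1] == ["element"]:
            expected += int(tokens[2])  # e.g. "element vertex 3"
        elif tokens == ["end_header"]:
            body_start = index + 1
            break
    if body_start is None:
        return [(len(lines), "no end_header line found")]

    problems = []
    body = lines[body_start:]
    for offset, line in enumerate(body):
        line_number = body_start + offset + 1  # 1-based, as shown in an editor
        tokens = line.split()
        if not tokens:
            problems.append((line_number, "empty line"))
            continue
        for token in tokens:
            try:
                value = float(token)
            except ValueError:
                problems.append((line_number, f"non-numeric token {token!r}"))
                break
            if not math.isfinite(value):
                problems.append((line_number, f"non-finite value {token!r}"))
                break
    if len(body) != expected:
        problems.append(
            (len(lines), f"expected {expected} data lines, found {len(body)}")
        )
    return problems


sample = [
    "ply",
    "format ascii 1.0",
    "element vertex 3",
    "property float x",
    "property float y",
    "property float z",
    "element face 1",
    "property list uchar int vertex_indices",
    "end_header",
    "0 0 0",
    "1 0 0",
    "0 ? 0",
    "3 0 1 2",
]
problems = check_ascii_ply(sample)
print(problems)
```

For a real file, pass open(path).read().splitlines() as the argument; half a million lines is not a problem for this kind of line-by-line scan.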

Ansys multiphysics: blank output file

I have a model of a heating process on Ansys Multiphysics, V11.
After running the simulation, I have a script to plot a temperature profile:
!---------------- POST PROCESSING -----------------------
/post1 ! database postprocessor
!---define profile temperature
path,s_temp1,2,,100 ! define a path
ppath,1,,dop/2,0,0 ! create a path point
ppath,2,,dop/2,1.5,0 ! create a path point
PDEF,surf_t1,TEMP, ,noav ! print a path
plpath,surf_t1 ! plot a path
What I need now is to save the resulting path in a text file. I looked online for a solution and found the following code, which I appended after the lines above:
/OUTPUT,filename,extension
PRPATH,surf_t1
/OUTPUT
Ansys generates the file filename.extension but it is empty. I tried to place the OUTPUT command in a few locations in the script, but without any success.
I suspect I need to define something else, but I have no idea where to look, as the Ansys documentation online is terribly chaotic, and the web pages I opened before writing this question are no better.
A final note: Ansys V11 is an old version of the software, but I don't want to upgrade it and fit the old model to the new software.
For the output of the simulation (which includes all calculation steps, and sub-steps description and node-by-node results) the output must be declared in the beginning of the code, and not in the postprocessing phase.
Declaring
/OUTPUT,filename,extension
in the preamble of the main script stores the output in the right location, with the desired extension. At the end of the script, you must then declare
/OUTPUT
to reset the output file location for ANSYS.
However, the output of the PATH call made in the postprocessing script is not printed to the file.
It is convenient to use
*CFOPEN,file,ext
*VWRITE,Vector(1,1),Vector(1,2)
(2F12.6)
*CFCLOSE
where Vector is a two-column array created by *DIM that holds the data you want to write to the file.
As this is a special command, run it from a file, e.g. macro_output.mac.

Display variables using CBC MPS input in NEOS

I am trying to use NEOS to solve a linear program using MPS input.
The MPS file is fine, but apparently you also need a "parameters file" to tell the solver what to do (min/max etc.). However, I can't find any information on this anywhere online.
So far I have got NEOS to solve a maximization problem and display the objective function, but I cannot get it to display the variables.
Does anyone know what I should add to the parameters file to tell NEOS/CBC to display the resulting variables?
The parameter file consists of a list of Cbc (standalone) commands in a file (one per line). The format of the commands is (quoting the documentation):
One command per line (and no -)
abcd? gives list of possibilities, if only one + explanation
abcd?? adds explanation, if only one fuller help(LATER)
abcd without value (where expected) gives current value
abcd value or abcd = value sets value
The commands are the following:
? dualT(olerance) primalT(olerance) inf(easibilityWeight)
integerT(olerance) inc(rement) allow(ableGap) ratio(Gap)
fix(OnDj) tighten(Factor) log(Level) slog(Level)
maxN(odes) strong(Branching) direction error(sAllowed)
gomory(Cuts) probing(Cuts) knapsack(Cuts) oddhole(Cuts)
clique(Cuts) round(ingHeuristic) cost(Strategy) keepN(ames)
scaling directory solver import
export save(Model) restore(Model) presolve
initialS(olve) branch(AndBound) sol(ution) max(imize)
min(imize) time(Limit) exit stop
quit - stdin unitTest
miplib ver(sion)
To see the solution values, you should include the line sol - after the min or max line of your parameter file.
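As a rough illustration, the relevant part of the parameter file can be generated like this. The sketch only emits the two lines discussed here (the direction, then sol -), one command per line with no leading dash; the function name is hypothetical, and any other commands your parameter file already contains are not covered.

```python
def build_cbc_parameter_file(maximize=True):
    """Return parameter-file text: one Cbc command per line, no leading '-'."""
    direction = "max" if maximize else "min"
    lines = [
        direction,  # optimisation direction, as already used in the question
        "sol -",    # the line this answer suggests adding after min/max
    ]
    return "\n".join(lines) + "\n"


parameter_text = build_cbc_parameter_file(maximize=True)
print(parameter_text, end="")
```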
If this doesn't work you can submit the problem to NEOS in AMPL format via this page. In addition to model and data files, it accepts a commands file where you can use statements to solve the problem and display the solution, for example:
solve;
display _varname, _var;
This post describes how to convert MPS to AMPL.