I am trying to find the numbers in images like the one attached. I found that it helps when I run tesseract with the following parameters:
numbers = pytesseract.image_to_string(maskArray,config ='--oem 3 --psm 11 -c tessedit_char_whitelist=01234567890')
since I am looking for just numbers and they could be anywhere on the page. I found that tesseract can find 51416, 51230, and 44133 but not 51523. At first I thought it might be a question of that dotted line behind the number, so I replaced that column with white. That did not seem to help. This image is already grayscale btw. One thing that seems to work is if I crop the image so that there is nothing left of that number. But in order to do that, I need to know where that number is. Any ideas what could be causing tesseract to randomly miss this number but not the other ones? Or how to increase the accuracy?
Other things I've tried:
rescaling up and down, inverting the image, blurring/unblurring, morphology/thresholding, bilateral filter, erode, psm 11, 12, and the default. Some combinations will increase the accuracy, but none seem to give me 100%, which I would expect given the simplicity of the problem (currently maxing out at around 87%). Also, is there a way that I can increase the number of total outputs (including false positives)?
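In case it helps to see what I mean, here is a minimal sketch of the kind of preprocessing I have been trying, assuming OpenCV is installed (the file name is just a placeholder); pytesseract's image_to_data call returns every detection with its confidence, which is one way to see more candidate outputs, including false positives:

import cv2
import pytesseract

# Placeholder file name; the page is already grayscale.
maskArray = cv2.imread('page.png', cv2.IMREAD_GRAYSCALE)

# Upscale and binarise before OCR (one of the combinations I have been trying).
scaled = cv2.resize(maskArray, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
_, binary = cv2.threshold(scaled, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# image_to_data keeps every detection with its bounding box and confidence,
# so low-confidence (possible false-positive) hits are not silently dropped.
data = pytesseract.image_to_data(
    binary,
    config='--oem 3 --psm 11 -c tessedit_char_whitelist=0123456789',
    output_type=pytesseract.Output.DICT,
)
for text, conf, x, y in zip(data['text'], data['conf'], data['left'], data['top']):
    if text.strip():
        print(text, conf, (x, y))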
Using pyiron, I want to calculate the mean square displacement of the ions in my system. How do I see the total displacement (i.e. not folded back by periodic boundary conditions) without dumping very frequently and checking when an atom passes over the boundary and gets wrapped?
Try to compare job['output/generic/unwrapped_positions'][-1] and job.structure.positions+job.output.total_displacements[-1]. If they deliver the same values, it's definitely fine both ways. If not, you can post the relevant lines in your notebook here.
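For reference, a minimal sketch of that comparison, assuming `job` is a finished pyiron MD job and numpy is available:

import numpy as np

# `job` is assumed to be a finished pyiron MD job, as in the question.
unwrapped = job['output/generic/unwrapped_positions'][-1]
reconstructed = job.structure.positions + job.output.total_displacements[-1]

# If the two agree to within numerical noise, either route is fine.
print(np.allclose(unwrapped, reconstructed, atol=1e-8))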
I'd like to add a few comments to Jan's answer:
While job['output/generic/unwrapped_positions'] returns the unwrapped positions parsed from the output files, job.output.total_displacements returns the displacement of atoms calculated from each pair of consecutive snapshots. So if an atom moves more than half the box length in any direction between two consecutive snapshots, job.output.total_displacements will give wrong coordinates. Therefore, job['output/generic/unwrapped_positions'] is generally more trustworthy, but it is not available for all codes (since some codes simply do not provide an output for unwrapped positions).
Moreover, if an interactive job is used, it is possible that job.structure.positions does not return the initial positions, i.e. job.structure.positions+job.output.total_displacements won't be initial positions + displacements.
So, in short, my answer to your question would be rather "Use job['output/generic/unwrapped_positions'] and if it's not available, use job.structure.positions+job.output.total_displacements but be aware of potential problems you might be running into."
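To tie this back to the original MSD question, here is a rough numpy sketch assuming the unwrapped positions are available; treat it as an illustration rather than official pyiron usage:

import numpy as np

# `job` is assumed to be a finished pyiron MD job, as above.
positions = np.asarray(job['output/generic/unwrapped_positions'])
displacements = positions - positions[0]
# If the code does not provide unwrapped positions, substitute
# job.output.total_displacements here, keeping the caveats above in mind.

# Mean square displacement per snapshot, averaged over all atoms.
msd = (displacements ** 2).sum(axis=-1).mean(axis=-1)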
The constructor for wx.TextCtrl takes a wx.Size argument, which is in units of pixels. Usually, I don't want to specify the size of a multiline TextCtrl in pixels, but rather in how many characters it can show without scrolling. I find that multiline TextCtrls are often the dominant component in my windows, so letting a sizer stretch them is not an option.
The wxPython Phoenix documentation contains a hint as to how to do this; however, this is aimed more at short text in a single-line control.
I have started using this utility method:
def _set_textctrl_size_by_chars(self, tc, w, h):
    sz = tc.GetTextExtent('X')
    sz = wx.Size(sz.x * w, sz.y * h)
    tc.SetInitialSize(tc.GetSizeFromTextSize(sz))
along with code like this:
tc = wx.TextCtrl(self, style=wx.TE_MULTILINE)
self._set_textctrl_size_by_chars(tc, 80, 20)
This works, but I consider it a hack. I have looked over the documentation but have not found any other way to do it.
I understand that fonts are not usually monospaced, and using 'X' as a representative character width is inexact, however it's plenty good enough for my usage. Still, it seems there should be some way to do this directly using the wx library.
Using something like text.GetSizeFromTextSize(text.GetTextExtent("99999").x) is indeed the best way to size the text control to fit exactly 5 digits (e.g. a ZIP code in some localities). Notice that this is slightly better than your code because the width of 80 "X"s is not necessarily quite the same as 80 times the width of a single "X". And I'd also recommend using "M" or "W", which can be noticeably wider than "X" in some fonts, but this is not going to change matters much.
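For instance, a rough sketch of that single-line digits case (the names are placeholders; this would live inside some window's __init__):

import wx

# Size a single-line control to fit exactly five digits, e.g. a ZIP code.
zip_ctrl = wx.TextCtrl(self)
best = zip_ctrl.GetSizeFromTextSize(zip_ctrl.GetTextExtent("99999").x)
zip_ctrl.SetInitialSize(best)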
We thought about adding a helper method doing this, and it might indeed be useful, but, again, this still won't make things as simple as you'd like, because you really need to specify the characters you want to use: "W" for letters, "9" for digits, and maybe something like "x" if you want the control to be wide enough to fit the given number of characters on average, instead of being wide enough to guarantee fitting the given number of the widest characters, because the difference may be noticeable.
The main place where we could make life simpler would be at the XRC level, and this would be worth doing ("just" a question of time...), but for the code I really don't think we can make things much simpler than they are now.
I have multiple sequences of images from my micropipette aspiration experiments that look somewhat comparable to this: https://www.youtube.com/watch?v=HpY_2_e7b6Y
Now I would like to track the length of the tissue within the pipette automatically for all the different images in the sequence.
Does anyone know how this can be done?
Thanks!
You could probably do it with a macro: first, you manually draw a line in the middle of the channel to define the direction. Then, for each image of the movie, you try to find the edge of the pipette and the interface. To do that, you can try to use the "Process>Find Edges" function. Depending on the quality of your images, you might need several steps. Finally, you just need to find the distance between these two points.
Here is a quick-and-dirty macro which kind of gives a not-too-wrong result with the Youtube video:
print("\\Clear");
run("Clear Results");
run("Duplicate...", "title=Stack-1 duplicate range=1-5");
selectWindow("Stack-1");
run("Find Edges", "stack");
waitForUser("Please draw a line and click OK.");
//makeLine(402, 238, 170, 221);
setLineWidth(20);
// for each frame of the movie
for (s = 1; s <= nSlices; s++) {
    setSlice(s);
    profile = getProfile();
    maxProfile = Array.findMaxima(profile, 15);
    diff = abs(maxProfile[1] - maxProfile[0]);
    print("max at: " + maxProfile[1] + " " + maxProfile[0]);
    print("Slice " + s + " the difference is: " + diff + " px");
    setResult("Time", nResults, s * 2);
    setResult("Position", nResults - 1, diff / 200);
}
I tested it on five frames from the Youtube video (available here for about 1 month, 6MB): it gives almost consistent results, but it's very approximate and could easily be improved. I think the main idea is correct.
I'm using Selenium to automate webpage functional testing. It's important for us to do a pixel-by-pixel comparison when we roll out new code, so we're using Selenium to take screenshots and comparing the base64 encoded strings to see if anything has changed.
We're finding that in practice, it's hard to get complete pixel consistency, especially with images. I would like minor blurriness / rendering artifacts to count as a "pass" instead of a "fail", so I'm wondering if there's a way of doing a fuzzy comparison to make our tests a bit less fragile.
I was thinking of maybe looking at the Levenshtein distance between the base64 strings as a starting point, but I don't really know if that's a good approach, or what the tolerances should be that distinguish "something moved on the page" from "rendering artifact". Any ideas / approaches?
So I ended up going with the ImageMagick command-line tool (because why re-invent image comparison). The "Peak Absolute Error" metric of the "compare" tool tells you how much you have to fuzz pixels before two images are identical. This seems to work well... for an image with slight graphical distortions, there might be a lot of pixels that don't match, but slight fuzzing is enough to make them match; but for two images that are actually different, even though most pixels might match, the ones that don't tend to be very different. Right now I'm checking for a PAE of less than 15% to see if the images should be counted as identical. Command line I'm using is:
compare -metric PAE original.png new.png comparison.png
Documentation on ImageMagick's compare tool is here: http://www.imagemagick.org/script/compare.php
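And in case anyone wants to drive this from Python instead of the shell, here is a rough sketch of how the same check could be scripted (the 15% threshold and file names are just my values; as far as I can tell, compare prints the metric to stderr):

import re
import subprocess

def images_match(original, new, diff_out="comparison.png", max_pae=0.15):
    """True if ImageMagick's peak absolute error (normalised 0..1) is within max_pae."""
    result = subprocess.run(
        ["compare", "-metric", "PAE", original, new, diff_out],
        capture_output=True, text=True,
    )
    # compare writes the metric to stderr, e.g. "13107 (0.2)"; the value in
    # parentheses is the error normalised to the 0..1 range.
    match = re.search(r"\(([\d.eE+-]+)\)", result.stderr)
    return match is not None and float(match.group(1)) <= max_pae

print(images_match("original.png", "new.png"))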
I've been using perceptualdiff, which uses a model of the human visual system to try to avoid reporting unnoticeable changes (the authors used it for renderer regression testing). Usage is quite simple:
perceptualdiff -output diff.ppm baseline.png test.png
(where diff.ppm is a PPM format image highlighting the areas of difference)
The needle regression testing framework has support for using pdiff to compare screenshots:
http://needle.readthedocs.org/en/latest/#engines
Use an image format that does not create compression artifacts (like BMP or PNG); then you can do a per-pixel comparison.
I think you can check each pixel with a common Euclidean distance.
To improve performance a little, do not calculate the square root; instead, compare the squares of the distances:
// Maximum color distance allowed to define pixel consistency.
const float maxDistanceAllowed = 5.0f;
// Square of the distance, used in calculations.
const float maxD = maxDistanceAllowed * maxDistanceAllowed;

// pixel1 and pixel2 are e.g. System.Drawing.Color values.
public bool isPixelConsistent(Color pixel1, Color pixel2)
{
    // Squared Euclidean distance in 3 dimensions (R, G, B).
    float distanceSquared = (pixel1.R - pixel2.R) * (pixel1.R - pixel2.R)
                          + (pixel1.G - pixel2.G) * (pixel1.G - pixel2.G)
                          + (pixel1.B - pixel2.B) * (pixel1.B - pixel2.B);
    // If the squared distance is no greater than the max allowed, the pixel is
    // consistent and the method returns true.
    return distanceSquared <= maxD;
}
I didn't test the C# code, but it should give you the idea. Give it a few tries and adjust maxDistanceAllowed to your needs.
If anyone else is looking for something similar there is Depicted-dpxdt. It is designed to be used as part of a CI/CD process.
It combines a perceptual diff with a server, a command-line tool, and a wrapper for PhantomJS.
Its demonstrated functionality includes crawling an entire site and comparing pages for differences.
I'm attempting to gauge the percentage difference between two images.
Having done a lot of reading, I seem to have a number of options, but I'm not sure which is the best method to follow for:
Ease of coding
Performance.
The methods I've seen are:
Non-language-specific (academic): Image comparison - fast algorithm
Mac-specific: direct pixel access, http://www.markj.net/iphone-uiimage-pixel-color/
Does anyone have any advice about what solutions make most sense for the above two cases and have code samples to show how to apply them?
I've had success calculating the difference between two images using the histogram technique mentioned here. redmoskito's answer in the SO question you linked to was actually my inspiration!
The following is an overview of the algorithm I used:
Convert the images to grayscale—compare one channel instead of three.
Divide each image into an n * n grid of "subimages". Then, for each subimage pair:
Calculate their colour composition histograms.
Calculate the absolute difference between the two histograms.
The maximum difference found between two subimages is a measure of the two images' difference. Other metrics could also be used (e.g. the average difference between subimages).
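Here is a rough numpy sketch of those steps (the grid size, bin count, and grayscale assumption are my own choices):

import numpy as np

def grid_histogram_difference(img_a, img_b, n=8, bins=32):
    """img_a and img_b are equal-sized 2-D grayscale arrays of 8-bit values."""
    h, w = img_a.shape
    max_diff = 0.0
    for i in range(n):
        for j in range(n):
            ys = slice(i * h // n, (i + 1) * h // n)
            xs = slice(j * w // n, (j + 1) * w // n)
            hist_a, _ = np.histogram(img_a[ys, xs], bins=bins, range=(0, 255))
            hist_b, _ = np.histogram(img_b[ys, xs], bins=bins, range=(0, 255))
            # Normalise so subimage size does not dominate the metric.
            hist_a = hist_a / hist_a.sum()
            hist_b = hist_b / hist_b.sum()
            diff = np.abs(hist_a - hist_b).sum()
            max_diff = max(max_diff, diff)
    return max_diff  # larger value = more different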
As tskuzzy noted in his answer, if your ultimate goal is a binary "yes, these two images are (roughly) the same" or "no, they're not", you need some meaningful threshold value. You could produce such a value by passing images into the algorithm and tweaking the threshold based on its output and how similar you think the images are. A form of machine learning, I suppose.
I recently wrote a blog post on this very topic, albeit as part of a larger goal. I also created a simple iPhone app to demonstrate the algorithm. You can find the source on GitHub; perhaps it will help?
It is really difficult to suggest something when you don't tell us more about the images or the variations. Are they shapes? Are they different objects, and you want to know what class of object they are? Are they the same object, and you want to distinguish the object instance? Are they faces? Are they fingerprints? Are the objects in the same pose? Under the same illumination?
When you say performance, what exactly do you mean? How large are the images? All in all, it really depends. Given what you've said, if it is only about ease of coding and performance, I would suggest just taking the absolute value of the difference of the pixels. That is super easy to code and about as fast as it gets, but it is really unlikely to work for anything other than the most synthetic examples.
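For reference, a minimal numpy sketch of that absolute-difference baseline (the 0-100% scaling is my own choice):

import numpy as np

def percent_difference(img_a, img_b):
    """Mean absolute pixel difference between two equal-sized 8-bit images,
    scaled to a 0-100% figure. Works for grayscale or RGB arrays."""
    a = img_a.astype(np.float64)
    b = img_b.astype(np.float64)
    return float(np.abs(a - b).mean() / 255.0 * 100.0)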
That being said, I would like to point you to DHOG, GLOH, SURF and SIFT.
You can use the fairly basic subtraction technique that the lads above suggested. #carlosdc has hit the nail on the head with regard to the type of image this basic technique can be used for. I have attached an example so you can see the results for yourself.
The first shows an image from a simulation at some time t. A second image, taken some (simulation) time later at t + dt, was subtracted from the first. The subtracted image (in black and white for clarity) then shows how the simulation has changed in that time. This was done as described above and is very powerful and easy to code.
Hope this aids you in some way
This is some old, nasty FORTRAN, but it should give you the basic approach. It is not that difficult at all. Because I am doing it on a two-colour palette, you would do this operation for R, G and B. That is, compute the intensities or values in each cell/pixel and store them in an array. Do the same for the other image, and subtract one array from the other; this will leave you with a colourful subtraction image. My advice would be to do as the lads suggest above: compute the magnitude of the sum of the R, G and B components so you just get one value. Write that to an array, do the same for the other image, then subtract. Then create a new range for either R, G or B and map the resulting subtracted array to it; this will give a much clearer picture as a result.
* =============================================================
      SUBROUTINE SUBTRACT(FNAME1,FNAME2,IOS)
* This routine subtracts one image array from another
* =============================================================
* Common :
      INCLUDE 'CONST.CMN'
      INCLUDE 'IO.CMN'
      INCLUDE 'SYNCH.CMN'
      INCLUDE 'PGP.CMN'
* Input :
      CHARACTER fname1*(sznam),fname2*(sznam)
* Output :
      integer IOS
* Variables:
      logical glue
      character fullname*(szlin)
      character dir*(szlin),ftype*(3)
      integer i,j,nxy1,nxy2
      real si1(2*maxc,2*maxc),si2(2*maxc,2*maxc)
* =================================================================
      IOS = 1
      nomap=.true.
      ftype='map'
      dir='./pictures'
! reading first image
      if(.not.glue(dir,fname2,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
      read(unit2,err=11)nxy2
      read(unit2,err=11)rad,dxy
      do i=1,nxy2
        do j=1,nxy2
          read(unit2,err=11)si2(i,j)
        enddo
      enddo
      CLOSE(unit2)
! reading second image
      if(.not.glue(dir,fname1,ftype,fullname))then
        write(*,31) fullname
        return
      endif
      OPEN(unit2,status='old',name=fullname,form='unformatted',err=10,iostat=ios)
      read(unit2,err=11)nxy1
      read(unit2,err=11)rad,dxy
      do i=1,nxy1
        do j=1,nxy1
          read(unit2,err=11)si1(i,j)
        enddo
      enddo
      CLOSE(unit2)
! subtracting images
      if(nxy1.eq.nxy2)then
        nxy=nxy1
        do i=1,nxy1
          do j=1,nxy1
            si(i,j)=si2(i,j)-si1(i,j)
          enddo
        enddo
      else
        print *,'SUBTRACT: Different sizes of image arrays'
        IOS=0
        return
      endif
* normal finishing
      IOS=0
      nomap=.false.
      return
* exceptional finishing
 10   write (*,30) fullname
      return
 11   write (*,32) fullname
      return
 30   format('Cannot open file ',72A)
 31   format('Improper filename ',72A)
 32   format('Error reading from file ',72A)
      end
! =============================================================
Hope this is of some use. All the best.
Out of the methods described in your first link, the histogram comparison method is by far the simplest to code and the fastest. However, key-point matching will provide far more accurate results, since you want to know a precise number describing the difference between two images.
To implement the histogram method, I would do the following:
Compute the red, green, and blue histograms of each image
Add up the differences between corresponding buckets
If the difference is above a certain threshold, then the percentage is 0%
Otherwise, the colors found in the images are similar, so do a pixel-by-pixel comparison and convert the difference into a percentage.
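A compact sketch of those steps, assuming equal-sized 8-bit RGB arrays, with the histogram threshold left as a knob to tune:

import numpy as np

def similarity_percent(img_a, img_b, hist_threshold=0.25, bins=64):
    """img_a and img_b are equal-sized (H, W, 3) arrays of 8-bit RGB values."""
    # Steps 1-2: per-channel histograms and the summed bucket differences.
    hist_diff = 0.0
    for channel in range(3):
        ha, _ = np.histogram(img_a[..., channel], bins=bins, range=(0, 255))
        hb, _ = np.histogram(img_b[..., channel], bins=bins, range=(0, 255))
        hist_diff += np.abs(ha / ha.sum() - hb / hb.sum()).sum()
    # Step 3: colour content is too different, so report 0% similarity.
    if hist_diff > hist_threshold:
        return 0.0
    # Step 4: otherwise fall back to a pixel-by-pixel comparison.
    matching_fraction = np.all(img_a == img_b, axis=-1).mean()
    return float(matching_fraction * 100.0)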
I don't know of any precise algorithms for finding the key points of an image. However, once you find them for each image, you can do a pixel-by-pixel comparison for each of the key points.