How can I retrieve text and block images from a scanned image whose data is laid out in a grid (VB.NET)?

I have a source image (a scan with its data in a grid) that I want to convert into Excel.
Currently I am slicing the image, but that takes too much time and is not accurate.

Related

Can YOLO pictures have a bounding box that covers the whole picture?

I wonder why YOLO pictures need to have a bounding box.
Assume that we are using Darknet. Each image needs a corresponding .txt file with the same name as the image file. This is the same for all YOLO frameworks that use bounding boxes for labeling. Each line inside the .txt file needs to look like this:
<object-class> <x> <y> <width> <height>
Where x, y, width, and height are relative to the image's width and height.
For example, if we go to this page, press the YOLO Darknet TXT button, download the .zip file and then open the train folder, we can see these files:
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.jpg
IMG_0074_jpg.rf.64efe06bcd723dc66b0d071bfb47948a.txt
Where the .txt file looks like this
0 0.7055288461538461 0.6538461538461539 0.11658653846153846 0.4110576923076923
1 0.5913461538461539 0.3545673076923077 0.17307692307692307 0.6538461538461539
Every image has the size 416x416. This image looks like this:
My idea is that every image should have one class, and only one class, and that the image should be taken with a camera like this.
The camera snapshot should be produced as follows:
Take camera snap
Cut the camera snap into desired size
Upscale it to square 416x416
Like this:
And then the .txt file that corresponds to each image should look like this:
<object-class> 0 0 1 1
Question
Is this possible with e.g. Darknet or another framework that uses bounding boxes to label the classes?
Instead of letting the software (e.g. Darknet) scale the bounding boxes to 416x416 for every class object, I would do it myself and change the .txt file to x = 0, y = 0, width = 1, height = 1 for every image that has only one class object.
Is it possible for me to create a training set in that way and train with it?

A small disclaimer: I am not an expert on this. I am part of a project that uses Darknet, so I have had some time to experiment with it.
If I understand correctly, you want to train with cropped single-class images whose bounding boxes cover the full image.
That is possible, and I am doing something similar, but it is most likely not what you want.
Let me describe the problems and unexpected behaviour this method creates.
When you train with images that have full-image bounding boxes, YOLO cannot make proper detections, because during training it also learns the backgrounds and empty spaces of your dataset. More specifically, the objects in your training dataset have to appear in the same context as in your real-life usage. If you train it with images of dogs in the jungle, it won't do a good job of predicting dogs in a house.
If you are only going to use it for classification, you can still train it like this and it will classify fine, but the images you predict on should also look like your training dataset. Looking at your example, if you train on images like the cropped dog picture, your model won't be able to classify the dog in the first image.
As a better example, in my case detection wasn't required. I am working with food images and I only predict the meal on the plate, so I trained with full-image bboxes, since every food has one class. It classifies the food perfectly, but the bboxes are always predicted as the full image.
My understanding of the theory is this: if you feed the network only full-image bboxes, it learns that making the box as big as possible results in a lower error, so it optimizes that way. This is kind of wasting half of the algorithm, but it works for me.
Also, your images don't need to be 416x416. Darknet resizes whatever size you give it to that, and you can change the size in the cfg file.
I have some code that creates full-size bboxes for all images in a directory, if you want to try it quickly. (It overwrites existing annotations, so be careful.)
Finally, for the boxes to be centered and full size they should look like the line below. x and y are the center of the bbox, so they should be at half of the image width and height.
<object-class> 0.5 0.5 1 1
from imagepreprocessing.darknet_functions import create_training_data_yolo, auto_annotation_by_random_points
import os

main_dir = "datasets/my_dataset"

# auto annotating all images by their center points (x,y,w,h)
# each sub-folder of main_dir is treated as one class; its index is the class id
folders = sorted(os.listdir(main_dir))
for index, folder in enumerate(folders):
    auto_annotation_by_random_points(os.path.join(main_dir, folder), index, annotation_points=((0.5,0.5), (0.5,0.5), (1.0,1.0), (1.0,1.0)))

# creating required files
create_training_data_yolo(main_dir)
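To see where the values 0.5 0.5 1 1 come from, here is a minimal sketch that converts a pixel-space box into a Darknet-style label line. The function name to_yolo_line and the corner-based input are assumptions made for this illustration and are not part of Darknet or of the library used above.

```python
def to_yolo_line(class_id, left, top, right, bottom, img_w, img_h):
    """Convert a pixel box (corner coordinates) to a Darknet-style label line.

    x and y are the box CENTER, and all four values are divided by the
    image size so that they fall in the range 0..1.
    """
    x_center = (left + right) / 2 / img_w
    y_center = (top + bottom) / 2 / img_h
    width = (right - left) / img_w
    height = (bottom - top) / img_h
    return f"{class_id} {x_center:g} {y_center:g} {width:g} {height:g}"


# A box covering a whole 416x416 image
print(to_yolo_line(0, 0, 0, 416, 416, 416, 416))  # prints "0 0.5 0.5 1 1"
```

A box that covers the whole image therefore has its center at (0.5, 0.5), not at (0, 0).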

Difference in the output seen in radiant viewer after saving/writing same dicom image using sitk.Write

I'm reading a DICOM image and accessing its pixel array. After that I save the array again in DICOM format using sitk.WriteImage, but there is a difference between the original image and the same image after it has been written. How can I get the same image display? I'm using RadiAnt Viewer to visualize the DICOM images. I want the output to look the same as the input. The code, along with the input and output images, is given below:
import pydicom
import SimpleITK as sitk

# Reading a DICOM image
Image = pydicom.dcmread('Input.dcm')
output = Image.pixel_array
# Saving the image into another folder
img = sitk.GetImageFromArray(output)
sitk.WriteImage(img, 'output.dcm')
The DICOM files are too large to attach, so here are screenshots of the input (1) and output (2) images.

I'm going to answer my own question here. The Photometric Interpretation of my original image was MONOCHROME1. After converting the image into a pixel array and saving it again in .dcm format, some of its details were changed, one of which was the Photometric Interpretation, which changed from MONOCHROME1 to MONOCHROME2. I changed this tag in the saved image and then saved it again as follows.
import pydicom
# Re-open the file written by SimpleITK (file name taken from the question)
image = pydicom.dcmread('output.dcm')
elem = image[0x0028, 0x0004]  # Photometric Interpretation
elem.value = 'MONOCHROME1'
image.save_as('P1_L_CC.dcm', write_like_original=False)
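For context on why the display changed: in MONOCHROME1 the lowest sample value is shown as white, while in MONOCHROME2 it is shown as black, so the same pixel values look inverted once the tag flips. As an alternative to restoring the tag, the pixel values themselves could be inverted. The following is only a sketch on plain Python lists; the helper name invert_monochrome is hypothetical, it assumes unsigned samples, and any window settings stored in the file may also need adjusting.

```python
def invert_monochrome(pixels, bits_stored):
    """Invert unsigned grayscale samples so that MONOCHROME1 data keeps its
    appearance when labeled MONOCHROME2 (or vice versa)."""
    max_value = (1 << bits_stored) - 1  # e.g. 4095 for 12-bit data
    return [[max_value - value for value in row] for row in pixels]


rows = [[0, 100], [4000, 4095]]
print(invert_monochrome(rows, bits_stored=12))  # prints [[4095, 3995], [95, 0]]
```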

Image replacement using PDFBox does not change size of pdf according to image

I am using PDFBox 2.0.8 to replace an image in my application. I am able to extract the image and replace it with another image of the same dimensions. However, the size of the PDF does not decrease when the size of the image decreases. For example, refer to the documents/images in the links below. The original size of the PDF is 93 KB. The extracted image is 91 KB. The replacement image is 54 KB. The PDF size after image replacement is still 92 KB.
Original Document = http://35.200.192.44/download?fileName=/outbox/pdf/10_cert.pdf
Extracted Image = http://35.200.192.44/download?fileName=/outbox/pdf/image0.jpg
Replacement image = http://35.200.192.44/download?fileName=/outbox/pdf/image1.jpg
PDF after replacement = http://35.200.192.44/download?fileName=/outbox/pdf/10_cert1.pdf.
The change in size of PDF after replacement is not in the same proportion... Code snippet used for image replacement is
BufferedImage buffered_replacement_image_file = ImageIO.read(new File(replacement_image_file));
PDImageXObject replacement_img = JPEGFactory.createFromImage(doc, buffered_replacement_image_file);
resources.put(xObjectName, replacement_img);

The images in your two example PDFs are identical. This most likely is due to the way you load the image data, first creating a BufferedImage from the file and then creating a PDImageXObject from that BufferedImage. This causes the input image data to be expanded to a plain bitmap and then re-compressed to JPEG identically by JPEGFactory.createFromImage.
To use the JPEG data as they initially are, try this approach instead:
PDImageXObject replacement_img = JPEGFactory.createFromStream(doc, new FileInputStream(replacement_image_file));
resources.put(xObjectName, replacement_img);
or, if the replacement_image_file is not necessarily a JPEG file, like this
PDImageXObject replacement_img = PDImageXObject.createFromFileByExtension(new File(replacement_image_file), doc);
resources.put(xObjectName, replacement_img);
If this doesn't help, you most likely have other issues in your code, too, and need to show more of it.

I need to detect the approximate location of a QR code in a scanned image (PDF converted to PNG)

I have many scanned documents in PDF.
I use ImageMagick with Ghostscript to convert the PDF to PNG at a high density: convert -density 288 2.pdf 2.png. After that I read the pixels with PHP, find where the QR code is, and decode it. Because the image is very big (~2500px), this needs a lot of RAM. Before reading the pixels with PHP, I want to crop the image with ImageMagick and keep only the part with the QR code.
Can I detect the approximate location of the QR code with ImageMagick, then crop the image and keep only that part?
Sample PDF
Converted PNG

Further Update
I see your discussion with Kurt about better extraction of the image from the PDF in the first place, and his recommendation was to use pdfimages. I just wanted to add that you won't find that if you do brew search pdfimages, but you actually need to use
brew install poppler
and then you get the pdfimages executable.
Updated Answer
If you change the tile size to 100x100 on the crop command and run this for the second PDF you supplied:
convert -density 288 pdf2.pdf -crop 100x100 tile%04d.png
and then use the same entropy analysis command
convert -format "%[entropy]:%X%Y:%f\n" tile*.png info: | sort -n
...
...
0.84432:+600+3100:tile0750.png
0.846019:+600+2800:tile0678.png
0.980938:+700+400:tile0103.png
0.984906:+700+500:tile0127.png
0.988808:+600+400:tile0102.png
0.998365:+600+500:tile0126.png
The last 4 listed tiles are
Likewise for the other PDF file you supplied, you get
0.863498:+1900+500:tile0139.png
0.954581:+2000+500:tile0140.png
0.974077:+1900+600:tile0163.png
0.97671:+2000+600:tile0164.png
which means these tiles
I would think that should help you pretty much approximately locate the QR code.
Original Answer
This is not all that scientific, but it may help you get started. The key, I think, is the entropy of the various areas of the image. The QR code has a lot of information encoded in a small area so it should have high entropy. So, I use ImageMagick to split the image into square 400x400 tiles like this:
convert image.png -crop 400x400 tile%03d.png
which gives me 54 tiles. Then I calculate the entropy of each of the tiles and sort them by increasing entropy, also outputting their offsets from the top left of the frame, and their name, like this:
convert -format "%[entropy]:%X%Y:%f\n" tile*.png info: | sort -n
0.00408949:+1200+2800:tile045.png
0.00473755:+1600+2800:tile046.png
0.00944815:+800+2800:tile044.png
0.0142171:+1200+3200:tile051.png
0.0143607:+1600+3200:tile052.png
0.0341039:+400+2800:tile043.png
0.0349564:+800+3200:tile050.png
0.0359226:+800+0:tile002.png
0.0549334:+800+400:tile008.png
0.0556793:+400+3200:tile049.png
0.0589632:+400+0:tile001.png
0.0649078:+1200+0:tile003.png
0.10811:+1200+400:tile009.png
0.116287:+2000+3200:tile053.png
0.120092:+800+800:tile014.png
0.12454:+0+2800:tile042.png
0.125963:+1600+0:tile004.png
0.128795:+800+1200:tile020.png
0.133506:+0+400:tile006.png
0.139894:+1600+400:tile010.png
0.143205:+2000+2800:tile047.png
0.144552:+400+2400:tile037.png
0.153143:+0+0:tile000.png
0.154167:+400+400:tile007.png
0.173786:+0+2400:tile036.png
0.17545:+400+1600:tile025.png
0.193964:+2000+400:tile011.png
0.209993:+0+3200:tile048.png
0.211954:+1200+800:tile015.png
0.215337:+400+2000:tile031.png
0.218159:+800+1600:tile026.png
0.230095:+2000+1200:tile023.png
0.237791:+2000+0:tile005.png
0.239336:+2000+1600:tile029.png
0.24275:+800+2400:tile038.png
0.244751:+0+2000:tile030.png
0.254958:+800+2000:tile032.png
0.271722:+2000+2000:tile035.png
0.275329:+0+1600:tile024.png
0.278992:+2000+800:tile017.png
0.282241:+400+1200:tile019.png
0.285228:+1200+1200:tile021.png
0.290524:+400+800:tile013.png
0.320734:+0+800:tile012.png
0.330168:+1600+2000:tile034.png
0.360795:+1200+2000:tile033.png
0.391519:+0+1200:tile018.png
0.421396:+1200+1600:tile027.png
0.421421:+2000+2400:tile041.png
0.421696:+1600+2400:tile040.png
0.486866:+1600+1600:tile028.png
0.489479:+1600+800:tile016.png
0.611449:+1600+1200:tile022.png
0.674079:+1200+2400:tile039.png
and, hey presto, the last one listed (i.e. the one with the highest entropy) tile039.png is this one.
I have drawn a rectangle around its location using this command
convert image.png -stroke red -fill none -strokewidth 3 -draw "rectangle 1200,2400 1600,2800" a.jpg
I concede there may be luck involved, but I only have one image to test my mad theories. You may need to tile twice, the second time with an x-offset and y-offset of half a tile width, so that you don't cut the QR code and split it across 2 tiles. You may need different size tiles for different size barcodes. You may need to consider the last 3-5 tiles located for your next algorithm. But I think it could form the basis of a method.
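The tile-and-rank idea can also be sketched outside ImageMagick. The following is a minimal illustration on a tiny synthetic image held as a list of rows. The function names shannon_entropy and rank_tiles are made up for this example, and the value is a plain Shannon entropy of the pixel histogram, which is not necessarily scaled the same way as ImageMagick's %[entropy].

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of pixel values."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def rank_tiles(image, tile):
    """Split a 2D grayscale image (list of rows) into square tiles and return
    (entropy, x_offset, y_offset) tuples sorted by increasing entropy."""
    height, width = len(image), len(image[0])
    results = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            pixels = [v for row in image[y:y + tile] for v in row[x:x + tile]]
            results.append((shannon_entropy(pixels), x, y))
    return sorted(results)


# Tiny synthetic "page": blank everywhere except a busy 4x4 patch at (4, 4).
page = [[255] * 8 for _ in range(8)]
for y in range(4, 8):
    for x in range(4, 8):
        page[y][x] = 0 if (x + y) % 2 else 255

best = rank_tiles(page, tile=4)[-1]
print(best)  # prints (1.0, 4, 4)
```

As with the commands above, the last entry is the tile with the highest entropy, and its offsets give the approximate crop position.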

convert grayIplImage to color iplimage in opencv [duplicate]

I want to composite a number of images into a single window in OpenCV. I had found I could create a ROI in one image and copy another colour image into this area without any problems.
Switching the source image to an image I had carried out some processing on didn't work, though.
Eventually I found out that I had converted the src image to greyscale, and that the copyTo method then didn't copy anything over.
I've answered this question with my basic solution, which only caters for greyscale to colour. If you use other Mat image types then you'd have to carry out additional tests and conversions.

I realised my problem was that I was trying to copy a greyscale image into a colour image. As such I had to convert it into the appropriate type first.
#include <opencv2/opencv.hpp>
using namespace cv;

// Scale src to width x height and copy it into dst at (x, y).
void drawIntoArea(Mat &src, Mat &dst, int x, int y, int width, int height)
{
    Mat scaledSrc;
    // Destination image for the converted src image.
    Mat convertedSrc(src.rows, src.cols, CV_8UC3, Scalar(0, 0, 255));
    // Convert the src image into the correct destination image type.
    // Could also use mixChannels here.
    // Expand to support a range of image source types.
    if (src.type() != dst.type())
    {
        // CV_GRAY2RGB is the OpenCV 2/3 name; OpenCV 4 uses COLOR_GRAY2RGB.
        cvtColor(src, convertedSrc, CV_GRAY2RGB);
    }
    else
    {
        src.copyTo(convertedSrc);
    }
    // Resize the converted source image to the desired target size.
    resize(convertedSrc, scaledSrc, Size(width, height), 1, 1, INTER_AREA);
    // Create a region of interest in the destination image to copy the
    // newly sized and converted source image into.
    Mat ROI = dst(Rect(x, y, scaledSrc.cols, scaledSrc.rows));
    scaledSrc.copyTo(ROI);
}
It took me a while to realise the image source types were different; I'd forgotten I'd converted the images to greyscale for some other processing steps.
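The underlying idea, replicating the single grey channel into three channels before copying into a colour region, can be shown without OpenCV. This is only a conceptual sketch on nested Python lists; the helper names gray_to_color and paste are made up for the illustration and are not OpenCV functions.

```python
def gray_to_color(gray):
    """Replicate each gray value into a 3-channel (B, G, R) pixel."""
    return [[(v, v, v) for v in row] for row in gray]


def paste(dst, src, x, y):
    """Copy src into dst with its top-left corner at (x, y), in place."""
    for dy, row in enumerate(src):
        dst[y + dy][x:x + len(row)] = row


canvas = [[(0, 0, 255)] * 4 for _ in range(3)]  # 4x3 colour image
patch = [[10, 20], [30, 40]]                    # 2x2 greyscale image
paste(canvas, gray_to_color(patch), x=1, y=1)
print(canvas[1])  # prints [(0, 0, 255), (10, 10, 10), (20, 20, 20), (0, 0, 255)]
```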
