Can we apply multiple psm modes to a single image in tesseract ocr - python-tesseract

pytesseract.image_to_string(imageopen, lang='eng',config='--psm 12')
Can we apply one more psm mode in this code, for example:
pytesseract.image_to_string(imageopen, lang='eng',config='--psm 12',config='--psm 6')
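As a side note, Python itself rejects a repeated config= keyword (SyntaxError: keyword argument repeated), so a single call cannot carry two --psm values. A minimal sketch of a common workaround, assuming a hypothetical input file sample.png, is to run the OCR once per page segmentation mode and keep each result:

import pytesseract
from PIL import Image

imageopen = Image.open('sample.png')  # hypothetical input image
results = {}
for psm in (12, 6):
    # One image_to_string call per page segmentation mode
    results[psm] = pytesseract.image_to_string(imageopen, lang='eng', config=f'--psm {psm}')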

Correct annotation to train spaCy's NER

I'm having some trouble finding the right way to annotate my data. I'm dealing with laboratory-test-related texts, and I am using the following labels:
1) Test specification (e.g. voltage, length,...)
2) Test object (e.g. battery, steel beam,...)
3) Test value (e.g. 5 V; 5 m...)
Let's take this example sentence:
The battery voltage should be 5 V.
I would annotate this sentence like this:
The
battery voltage (test specification)
should
be
5 V (Test value)
.
However, if the sentence looks like this:
The voltage of the battery should be 5 V.
I would use the following annotation:
The
voltage (Test specification)
of
the
battery (Test object)
should
be
5 V (Test value)
.
Is anyone experienced in annotating data who can explain whether this is the right way? Or should I use the Test object label for battery in the first example as well? Or should I combine the labels in the second example and tag voltage of the battery as Test specification?
I am annotating the data to perform information extraction.
Thanks for any help!
All of your examples are unusual annotation formats. The typical way to tag NER data (in text) is to use an IOB/BILOU format, where each token is on one line, the file is a TSV, and one of the columns is a label. So for your data it would look like:
The
voltage U-SPEC
of
the
battery U-OBJ
should
be
5 B-VALUE
V L-VALUE
.
Pretend that is TSV, and I have omitted O tags, which are used for "other" items.
You can find documentation of these schema in the spaCy docs.
If you already have data in the format you provided, or you find it easier to make it that way, it should be easy to convert at least. For training NER, spaCy requires the data to be provided in a particular format; see the docs for details, but basically you need the input text, character spans, and the labels of those spans. Here's example data:
TRAIN_DATA = [
    ("Who is Shaka Khan?", {"entities": [(7, 17, "PERSON")]}),
    ("I like London and Berlin.", {"entities": [(7, 13, "LOC"), (18, 24, "LOC")]}),
]
This format is trickier to produce manually than the above TSV type format, so generally you would produce the TSV-like format, possibly using a tool, and then convert it.
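For illustration, here is a rough sketch of how BILOU-tagged tokens could be turned into the character spans shown above (this is not spaCy's own converter; spaCy also ships a spacy convert command for CoNLL-style files). The naive whitespace join is an assumption and would need to match your real tokenization:

def bilou_to_spans(tokens, tags):
    """tokens: list of strings; tags: BILOU tags such as 'U-SPEC', 'B-VALUE' or 'O'."""
    text = ""
    spans = []
    start = label = None
    for token, tag in zip(tokens, tags):
        if text:
            text += " "                      # naive whitespace join (assumption)
        token_start = len(text)
        text += token
        token_end = len(text)
        if tag.startswith("U-"):             # single-token entity
            spans.append((token_start, token_end, tag[2:]))
        elif tag.startswith("B-"):           # entity begins
            start, label = token_start, tag[2:]
        elif tag.startswith("L-") and start is not None:   # entity ends
            spans.append((start, token_end, label))
            start = label = None
    return text, {"entities": spans}

tokens = ["The", "voltage", "of", "the", "battery", "should", "be", "5", "V", "."]
tags = ["O", "U-SPEC", "O", "O", "U-OBJ", "O", "O", "B-VALUE", "L-VALUE", "O"]
print(bilou_to_spans(tokens, tags))
# ('The voltage of the battery should be 5 V .',
#  {'entities': [(4, 11, 'SPEC'), (19, 26, 'OBJ'), (37, 40, 'VALUE')]})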
The main rule for annotating entities correctly is to be consistent (i.e. you always apply the same rules when deciding which entity is what). I can see you already have some rules in terms of when battery voltage should be considered a test object or a test specification.
Apply those rules consistently and you'll be ok.
Have a look at the spacy-annotator.
It is a library that helps you annotate data in the way you want.
Example:
import pandas as pd
import re
from spacy_annotator.pandas_annotations import annotate as pd_annotate
# Data
df = pd.DataFrame.from_dict({'full_text' : ['The battery voltage should be 5 V.', 'The voltage of the battery should be 5 V.']})
# Annotations
pd_dd = pd_annotate(df,
                    col_text = 'full_text',      # Column in pandas dataframe containing text to be labelled
                    labels = ['test_specification', 'test object', 'test_value'],  # List of labels
                    sample_size=1,               # Size of the sample to be labelled
                    delimiter=',',               # Delimiter to separate entities in GUI
                    model = None,                # spaCy model for noisy pre-labelling
                    regex_flags=re.IGNORECASE    # One (or more) regex flags to be applied when searching for entities in text
                    )
# Example output
pd_dd['annotations'][0]
The code will show you a user interface you can use to annotate each relevant entity.

Trying to implement cv2.findContours for person detection

I'm new to OpenCV and I'm trying to detect people with cv2.findContours combined with a morphological transformation of the video. Here is the code snippet:
import numpy as np
import imutils
import cv2 as cv
import time

cap = cv.VideoCapture(0)
avg = None  # running-average background buffer used by accumulateWeighted below

while(cap.isOpened()):
    ret, frame = cap.read()
    #frame = imutils.resize(frame, width=700,height=100)
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    gray = cv.GaussianBlur(gray, (21, 21), 0)
    if avg is None:
        avg = gray.copy().astype("float")
        continue
    cv.accumulateWeighted(gray, avg, 0.5)
    mask2 = cv.absdiff(gray, cv.convertScaleAbs(avg))
    mask = cv.absdiff(gray, cv.convertScaleAbs(avg))
    contours0, hierarchy = cv.findContours(mask2, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    for cnt in contours0:
        .
        .
        .
The rest of the code has the logic of a contour passing a line and incrementing the count.
The problem I'm encountering is that cv.findContours detects every movement/change in the frame (including the person). What I want is for cv.findContours to detect only the person and not any other movement. I know that person detection can be achieved with Haar cascades, but is there any way I can implement detection using cv2.findContours?
If not, is there a way I can still do a morphological transformation and detect people? The project I'm working on requires filtering out noise and much of the background to detect a person and increment their count when passing the line.
I will show you two options to do this.
The first is the method I mentioned in the comments, which you can use together with YOLO to detect humans (a sketch of the background-subtraction and cropping steps follows the list):
1) Use saliency to detect the standout parts of the video.
2) Apply K-Means clustering to group the objects into individual clusters.
3) Apply background subtraction and erosion or dilation (or both; it depends on the video, so try them all and see which one does the best job).
4) Crop the objects.
5) Send the cropped objects to YOLO.
6) If the class name is a pedestrian or human, draw the bounding boxes on them.
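A minimal sketch of steps 3 and 4, with OpenCV's MOG2 background subtractor chosen here as one possible background-subtraction method and the YOLO call itself omitted:

import cv2 as cv

cap = cv.VideoCapture(0)
backsub = cv.createBackgroundSubtractorMOG2(detectShadows=False)
kernel = cv.getStructuringElement(cv.MORPH_ELLIPSE, (5, 5))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    fgmask = backsub.apply(frame)
    # Erode to remove small noise, then dilate to close gaps in the silhouettes
    fgmask = cv.erode(fgmask, kernel, iterations=1)
    fgmask = cv.dilate(fgmask, kernel, iterations=2)
    contours, _ = cv.findContours(fgmask, cv.RETR_EXTERNAL, cv.CHAIN_APPROX_SIMPLE)
    crops = []
    for cnt in contours:
        if cv.contourArea(cnt) < 500:          # skip tiny blobs
            continue
        x, y, w, h = cv.boundingRect(cnt)
        crops.append(frame[y:y + h, x:x + w])  # candidate region to send to the detector
    cv.imshow("foreground mask", fgmask)
    if cv.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv.destroyAllWindows()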
The second is using OpenCV's built-in pedestrian detection, which is much easier:
1) Convert the frames to grayscale.
2) Use pedestrian_cascade.detectMultiScale() on the grey frames.
3) Draw a bounding box over each pedestrian.
The second method is much simpler, but it depends on what is expected of you for this project.
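A minimal sketch of the second option, assuming pedestrian_cascade is OpenCV's bundled haarcascade_fullbody.xml (the scaleFactor/minNeighbors values are starting points to tune for your video):

import cv2 as cv

pedestrian_cascade = cv.CascadeClassifier(
    cv.data.haarcascades + "haarcascade_fullbody.xml")

cap = cv.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    # Detect pedestrians in the grey frame
    pedestrians = pedestrian_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=4)
    for (x, y, w, h) in pedestrians:
        cv.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv.imshow("pedestrians", frame)
    if cv.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv.destroyAllWindows()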

Apply two materials to two segments of a single mesh

I have one mesh made of 100 pieces (a gun), and in Blender the materials are separated into two segments: a main material and a steel material. I generated all the maps (normal, roughness, etc.) in Substance Painter for each segment. When I exported the mesh from Blender I got a million pieces instead of a single mesh. That's OK, I can join them, but how can I apply the two materials to the joined mesh in UE? Do I need to export an FBX for each segment (after joining them in Blender) and then put them together in UE, or is there a different, native way to do this?
Thank you
OK, I exported as .obj instead of .fbx and it works well. Just assign the two materials and you're done.

What is the unit for raw data in Kinect V2?

I am trying to figure out what the raw data from the Kinect V2 is. I know we can convert this raw data to meters and to gray values to represent the depth, but what is the unit of this raw data?
And why are all the images captured by the Kinect mirrored?
The raw values stored in the depth image are in millimeters. You can get the X and Y values using the pixel position along with the depth camera intrinsic parameters. If you want I could share a Matlab code that converts the depth image into X,Y,Z values.
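As an illustration of that conversion, here is a Python sketch of the standard pinhole back-projection (not the Matlab code mentioned above); fx, fy, cx, cy are placeholder depth-camera intrinsics that you would take from the SDK (e.g. libfreenect2's IrCameraParams) or from your own calibration:

import numpy as np

fx, fy, cx, cy = 365.0, 365.0, 256.0, 212.0   # placeholder intrinsics, replace with real values

def depth_to_xyz(depth_mm):
    """depth_mm: HxW array of raw Kinect V2 depth values in millimeters."""
    h, w = depth_mm.shape
    z = depth_mm.astype(np.float32) / 1000.0  # millimeters -> meters
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.dstack((x, y, z))               # HxWx3 point cloud in meters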
Yes, the images are mirrored both in the Windows SDK and in libfreenect2, which is an open-source version of the SDK. I couldn't get a solid answer as to why that is, but you could look at the discussion available in the link.
There are different kinds of frames that can be captured with the Kinect V2, and each kind of raw data has a different unit. For the depth frame it is millimeters; for the color frame it is RGB (0-255, 0-255, 0-255); for body frames it is 0 or 1 (with the same resolution as the depth frame, but able to identify up to a maximum number of human bodies at a time); and so on.
Ref: https://developer.microsoft.com/en-us/windows/kinect

Output exact depth map in Blender

How can I output the depth information of each frame in blender to a .txt file?
I can generate a depth map as a gray-scale image with values in [0, 1], but I want the values in the units I use in Blender (meters in my case).
The solution turned out to be very easy: just link the Z buffer in the Node Editor to the output. Also make sure to turn the Z buffer on.
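For reference, a scripted sketch of the same node setup using Blender's Python API (Blender 2.8+ socket names assumed; the Depth output was called Z in older versions, and reading the Viewer Node pixels generally requires running inside the Blender UI). It dumps the depth of the current frame, in scene units (meters), to a hypothetical /tmp/depth_frame.txt:

import bpy
import numpy as np

scene = bpy.context.scene
scene.use_nodes = True
bpy.context.view_layer.use_pass_z = True                # turn the Z buffer on

tree = scene.node_tree
rl = tree.nodes.new("CompositorNodeRLayers")
viewer = tree.nodes.new("CompositorNodeViewer")
viewer.use_alpha = False
tree.links.new(rl.outputs["Depth"], viewer.inputs[0])   # link the Z buffer to the output

bpy.ops.render.render()                                 # render the current frame

# The Viewer node's result is exposed as the "Viewer Node" image (flattened RGBA floats)
img = bpy.data.images["Viewer Node"]
w, h = img.size
depth = np.array(img.pixels[:]).reshape(h, w, 4)[:, :, 0]  # depth values are in the R channel
np.savetxt("/tmp/depth_frame.txt", depth)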