Google Document AI Labeling Task - data-science

Here everyone, I am fairly new google cloud console. I am trying to customize a google document ai model that will learn to extract different sections of document to various data. As you can on the image that it fails to train the model and I have been running the Labeling Task for several days now I have not seen progress. Can you please assist in telling what is the right way to customize google document ia modelenter image description here
I have tried to manually label the different sections of the document, it took me a while so I did around 20 test and training dataset which I think the model to not train then I decided to do the Labeling Task as an alternative to manually labeling the dataset.

Here is the information about Labeling Tasks for Document AI
https://cloud.google.com/document-ai/docs/workbench/label-documents#labeling-tasks
Labeling Tasks use Human-in-the-Loop to have human labelers label documents for training data or production review. You can either set up your own labeling team or apply for access to the Google-Managed Workforce.
However, it doesn't seem like this is the correct course of action for what you are trying to do, since the labeling has already been completed.
Could you provide more clarification on what you are trying to accomplish with the Custom Document Extractor?
Note: Some of the error messages that are output from Document AI Workbench are not very descriptive (e.g. Internal Error Occurred) but the product development team is working to surface more helpful errors when possible.

Related

Extract inconsistence format PDF data into excel

I am need to build an AI model to extract data in the PDF into excel. I can use Power automate AI builder, but the issue is all the Pdfs are in different formats. There are more than 5k Pdfs
I have tried power automate Ai Builder but failed to extract the required data.
Please could you suggest the best way I could build something automate for long term usage.
Many Thanks!
Did you try to create a document processing custom model with unstructured documents and see if you have a better result ?
You can also post our message to the Power Automate Community
https://powerusers.microsoft.com/t5/AI-Builder/bd-p/AIBuilder
Hope it helps!

What is needed for a recommendation engine based on word/text input

I'm new to the Machine-Learning (AI) technology. I'm developing a messenger app for Android/IOs where I would like to recommend the users based on the texts/word/conversation a product from a relative small product portfolio.
Example 1:
In case the user of the messenger writes a sentence including the words "vine", "dinner", "date" the AI should recommend a bottle of vine to the user.
Example 2:
In case the user of the app writes that he has drunk a good coffee this morning, the AI should recommend a mug to the user.
Example 3:
In case the user writes something about a cute boy she met last day, the AI should recommend a "teddy bear" to the user.
I'm a Software Developer since almost 20 year with experience in the development of C/C++/Java based application (Android and IOs apps) as well as some experience in Google Cloud Platform. The ML/AI technology is completely new to me. Okay, I know the basics (input data is needed to train the ML/AI system etc.), but I wonder If there is already a framework which could help me to develop such a system which solves the above described uses-case.
I would appreciate it, if you could give me some hints where and how to start.
Thank you and regards
It is definitely possible to implement such an application, in case you want to do it in Google Cloud you will need some understanding of Tensorflow.
First of all, I recommend to you to do the Machine Learning Crash Course, for a good introduction to Machine Learning and to start to familiarize yourself with TensorFlow. Afterwards I recommend to take a look into Tensorflow tutorials which will give you a more practical introduction to Tensorflow, and include various examples on building/training/testing models.
Once you are famirialized with Tensorflow, you can jump into learning how to run jobs in the Machine Learning engine, you can start by following the quickstart. The documentation includes detailed guides on how to use the ml-engine, plus multiple samples and tutorials.
Since I believe that your application would fall into the Recommender System type, here you can see an example model, in Google Cloud ML Engine, on how to recommend items to users based on his previous searches. In your case, you would have to build a model in order to recommend items to users based on his previous words in the sentence.
The second option, in case you don't want to go through the hassle of building a new model from scratch, would be to use the Google Cloud Natural Language API, which you can understand as pre-trained models using Google (incredibly big) data. In your case, I believe that the Content Classifying API would help you achieve what your application intends to do, however, the outputs (which you can see here) are limited to what the model was trained to do, and might not be specific enough for your application, however it is an easy solution and you can still profit of this API in order to extract labels/information and send it as input to another model.
I hope that these links provide you with some foundations on what is possible to do with Tensorflow in the ML Engine, and are useful to you.

training images? Considerations for selection

I'm relatively new and am still learning the basics. I've used NVIDIA DIGITS in the past, and am now looking at Tensorflow. While I've been able to fumble my way around creating some models for a few projects I'm working on, I really want to start diving deeper into what I'm doing, how I'm doing it, and ultimately a better understanding of why.
One area that I would like to start with is the Images that I'm using for training and testing. Can anyone point me to a blog, an article, a paper, or give me some insight in what I need to consider when selecting images to train a new model on. Up until recently, I've been using datasets that have already been selected and that are available for download. Lets say I'm going to start working on a project that involves object detection of ships from a variety of distances and angles.
So my thoughts would be
1) I need a large quantity of images.
2) The images need to contain ships of the different types I would like to detect. (lets just say one class, ships, don't care what type of ships)
3) I also need to have images that have a great variety of distance perspective for the different types of ships.
Ultimately, my thoughts are that the images need to reflect the distance, perspective, and types of ships I would ideally want to identify from the video. Seems simple enough.
However, there are a number of questions
Does the images need to be the same/similar resolution as the camera I'll be using, for best results?
Does the images all need to be the same resolution?
Can I use a single image and just digitally zoom out on the image to give the illusion of different distances?
I'm sure there are a number of other questions that I'm not asking, or should be asking. Are there any guide lines available for creating a solid collection of images to use when creating the collection of images for training and validation?
I recommend thinking through end to end, like would you need to classify ship models as a next step? I recommend going through well known public datasets and actually work with the structure, how to store data, labels, how to handle preprocessing etc.
More importantly, what are you trying to achieve? Talking to experts in the topic does help greatly while preparing your own dataset.
Use open source images if you can, e.g. flickr, google, imagenet.
No, you don't need them to be the same resolution.
It is not ideal to zoom in/out images to use in different categories. Preprocessing images and data augmentation already does this to create more distant representations of the same class. This is why I would recommend hands on approach with an existing dataset first.
Yes, what you need is many, different representations of classes, and a roughly balanced dataset of classes. If you define your data structure well in the beginning, it will save you a ton of time as you won't have to make changes often.

What methods to recognize sentence handwriting?

I mean posts per sentence, not per letter. Such a doctor's prescription handwriting which hard to read. Not just a normal handwriting.
In example :
I use a data mining or machine learning for doing a training from
paper handwrited.
User scanning a paper with hard to read writing.
The application doing an image processing.
And the output is some sentence from paper.
And what device to use? (Scanner or webcam)
I am newbie. If could i need some example in vb.net with emguCV/openCV and researches journals.
Any help would be appreciated.
Welcome to stack overflow! The answer to your question is twofold:
a. If you want to recognize handwriting that has already happened i.e. it is presented to you as an image you are in trouble. Computer Vision is still not good enough to provide you with reasonable accuracy.
b. If you have a chance to recognize handwriting “as it's happening” - you are in luck. Download, for example, a Gesture Search app from Android play store and you are in business.
The difference between the two scenarios is subtle but significant. In the second case you have an extra piece of information that makes handwriting recognition possible. This piece is timing of each stroke. In other words, instead of an image with handwriting you have a bunch of strokes that are all labeled with their time stamps. You can think about it as a sequence of lines and curves or as image segmentation - in any way this provides a big hint for the system. Additional help comes from the dictionary on your phone but this is typically used by any handwriting system.
Android of course has an open source library for stroke recognition (find more on your own). If you still want to go for recognizing images though, you have to first detect text (e.g. as a bounding box) and second use any of the existing engines to process detected regions. For text detection I can recommend MSER. But be careful trying to implement even text detection on your own - you are entering a world of pain here ;). Here is an article that can help.
As for learning how to recognize text from images on the Internet - this can be your plan B or C or Z when you master above mentioned stages. Don’t try to abuse learning methods and make them do hard work for you - you will hit a wall if you don’t understand what’s going on under the hood.

Is there any kind of iOS API that exists for detecting shapes through the camera?

I've been working with augmented reality API's lately but haven't been able to achieve irregular shape detection, namely the hand. I want to be able to detect hand shapes through the video/camera and execute code based on hand signs. Does anything like this already exist?
Did you have a look at OpenCV?
These are some of the links I found just using Google: Face Detection using OpenCV, Vision For Robots