When will the V2 barcode scanning model be available in the unbundled version for barcode scanning - google-mlkit

Looking at the documentation found here
https://developers.google.com/ml-kit/vision/barcode-scanning/android
It says that the unbundled version uses V2 of the barcode scanning model which has performance and accuracy improvements. When will these improvements be included in the Google Play services version( unbundled version)?

We don't have a committed timeline for providing the unbundled barcode v2 in Google Play Services yet, but we are working on it and hope to make it available in a couple of months. In the meantime, you can use the unbundled barcode v2 in Google Play Services to see if that fits your need.

Related

How to achieve Image recognition using phone camera

I'm trying to build an app to make image recognition using the phone camera... I saw a lot of videos where using the camara the app identify where is the person or which feelings they have or things like that in real time.
I need to do a built an app like this, I know it's not an easy task, but I need to know which technologies can be use in order to achieve this in a mobile app?
Is it tensor flow?
Are there some libraries that helps to achieve this?
Or do I need to build a full Machine Learning with IA app?
Sorry to make such a general question but I need some insights.
Redgards
If you are trying to do this for the iOS platform, you could use a starter kit here: https://developer.ibm.com/patterns/build-an-ios-game-powered-by-core-ml-and-watson-visual-recognition/ for step-by-step instructions.
https://github.com/IBM/rainbow is a repo which it references.
You train your vision model on the IBM Cloud using Watson Visual Recognition, which just needs example images to learn from. Then you download the model into your iOS app and deploy with XCode. It will "scan" the live camera feed for the classes defined in your model.
I see you tagged TF (which is not part of this starter kit) but if you're open to other technologies, I think it would be very helpful.

What is the difference between Google Cloud Vision API and Mobile Vision?

I have been playing with cloud vision API. I did some label and facial detection. During this Google I/O, there is a session where they talked about mobile vision. I understand both the APIs are related to machine learning in Google Cloud.
Can anyone explain (use-cases) when to use one over another?
what sort of applications that we can build by using both ?
There can be many various sets of application requirements and use cases befitting one of the APIs or the other, so the best decision is to be taken on an individual basis.
It is worthwhile noting that the Mobile Vision offers facial recognition, whereas Cloud Vision API does not. Mobile Vision is geared towards the likeliest use-cases to be encountered in a mobile device environment, and encompasses the Face, Barcode Scanner and Text Recognition APIs.
A use case for both the Mobile Vision set and the Cloud Vision API would require face recognition as well as one of the features specific to the Cloud Vision API, such as the detection of inappropriate content in an image.
Google is planning to "wind down" Mobile Vision as I understood.
The Mobile Vision API is now a part of ML Kit. We strongly encourage you to try it out, as it comes with new capabilities like on-device image labeling! Also, note that we ultimately plan to wind down the Mobile Vision API, with all new on-device ML capabilities released via ML Kit. Feel free to reach out to Firebase support for help.
https://developers.google.com/vision/introduction?hl=en

Opening Kinect datasets and/or SDK Samples

I am very new to Kinect programming and am tasked to understand several methods for 3D point cloud stitching using Kinect and OpenCV. While waiting for the Kinect sensor to be shipped over, I am trying to run the SDK samples on some data sets.
I am really clueless as to where to start now, so I downloaded some datasets here, and do not understand how I am supposed to view/parse these datasets. I tried running the Kinect SDK Samples (DepthBasic-D2D) in Visual Studio but the only thing that appears is a white screen with a screenshot button.
There seems to be very little documentation with regards to how all these things work, so I would appreciate if anyone can point me to the right resources on how to obtain and parse depth maps, or how to get the SDK Samples work.
The Point Cloud Library (or PCL) it is a good starting point to handle point cloud data obtained using Kinect and OpenNI driver.
OpenNI is, among other things, an open-source software that provides an API to communicate with vision and audio sensor devices (such as the Kinect). Using OpenNI you can access to the raw data acquired with your Kinect and use it as a input for your PCL software that can process the data. In other words, OpenNI is an alternative to the official KinectSDK, compatible with many more devices, and with great support and tutorials!
There are plenty of tutorials out there like this, this and these.
Also, this question is highly related.

recognizing facial expressions using Kinect SDK

I am trying to do some work using Kinect and the Kinect SDK.
I was wondering whether it is possible to detect facial expressions (e.g. wink, smile etc) using the Kinect SDK Or, getting raw data that can help in recognizing these.
Can anyone kindly suggest any links for this ? Thanks.
I am also working on this and i considered 2 options:
Face.com API:
there is a C# client library and there are a lot of examples in their documentation
EmguCV
This guy Talks about the basic face detection using EmguCV and Kinect SDK and you can use this to recognize faces
Presently i stopped developing this but if you complete this please post a link to your code.
This is currently not featured within the Kinect for Windows SDK due to the limitations of Kinect in producing high-resolution images. That being said, libraries such as OpenCV and AForge.NET have been sucessfuly used to detected finger and facial recognition from both the raw images that are returned from Kinect, and also RGB video streams from web cams. I would use this computer vision libraries are a starting point.
Just a note, MS is releasing the "Kinect for PC" along with a new SDK version in february. This has a new "Near Mode" which will offer better resolution for close-up images. Face and finger recognition might be possible with this. You can read a MS press release here, for example:
T3.com
The new Kinect SDK1.5 is released and contains the facial detection and recognition
you can download the latest SDK here
and check this website for more details about kinect face tracking

Official Kinect SDK vs. Open-source alternatives

Where do they differ?
What are the advantages of choosing libfreenect or OpenNI+SensorKinect, for example, over the Official SDK, and vice-versa?
What are the disadvantages?
Please note that the below answer is per date and some facts may very well be outdated in the near future. Current state of the Official Kinect SDK is beta 1.00.12.
The first obvious difference is that the official SDK is maintained by the Microsoft Research team while OpenKinect is an open source SDK maintained by the open source community. Both has its cons and pros.
The Official SDK is developed by Microsoft which also develops the hardware and therefore should know internal information about the device that the open source society must reverse engineer. Obviously this is to Microsoft's advantage.
Microsoft is pouring a lot of money into this device and I am sure that they will do what they feel is necessary to keep their SDK up to par. Having economy behind it gives many advantages.
On the other hand, never underestimate the force of the open source society: "The OpenKinect community consists of over 2000 members contributing their time and code to the Project. Our members have joined this Project with the mission of creating the best possible suite of applications for the Kinect. OpenKinect is a true "open source" community!" - http://openkinect.org/wiki/Main_Page.
OpenKinect was released long before the official SDK as the kinect device was hacked on the first or second day of its release. Kudos to OpenKinect!
Programming languages supported:
Official SDK: C++, C#, or Visual Basic by using Microsoft Visual Studio 2010.
OpenKinect: Python, C, C++, C#, Java, Lisp and more! Obviously not requiring Visual Studio.
Operating systems support:
Official SDK: only installs on Windows 7.
OpenKinect: runs on Linux, OS X and Windows
Clearly advantage OpenKinect.
License:
The Official SDK is in its current beta state only for testing. The SDK has been developed specifically to encourage wide exploration and experimentation by academic, research and enthusiast communities. commercial applications are not permitted. Note however that this will probably change in future releases of the SDK. Visit the FAQ for more information
OpenKinect appers to be open for commercial usage, but online sources state that it may not be that simple. I would take a good look at the terms before releasing any commercial apps with it. Read Kinect – Licensing implications of open hardware projects for more info.
Documentation and support:
Official SDK: well documented and provides a support forum
OpenKinect: appears to have a mailing list, twitter and irc. but no official forum/QA? Documentation on website is not as rich as I would like it to be.
Device calibration:
Different Kinect devices may differ slightly depending on the batch that they were produced in. Thus device calibration is sometimes required. But:
the Official SDK does not provide any calibration settings but I have so far not had to calibrate the device I am working on. According to something I read online (link lost) at production time the calibration parameters are written to the kinect device, so with the Official SDK calibration is not needed.
OpenKinect features device calibration: http://openkinect.org/wiki/Calibration. Thus I believe that you should calibrate your device if you go with OpenKinect.
If its true that calibration is only needed for OpenKinect that is a big advantage for the official SDK as it is easier to distribute and install applications without such need.
Personally, after a failed try with the OpenKinect SDK I went with the official SDK, which
came with drivers that installed out of the box
came with examples and code for easy getting into business
All-in-all: I could start my own development within 15 minutes or so.
Now, after working with the Kinect for a few months, I have to say that I am quite satisfied with the API provided. I cannot however compare it to the OpenKinect SDK as I in fact never got it working (but perhaps it didn't give it a fair try).
UPDATE: As of February 1st 2012 there is a commercial license for the official SDK:
"The commercial license for this release authorizes development and distribution of commercial applications. The prior SDK was a beta, and as a result was appropriate only for research, testing and experimentation, and was not suitable for use with a final, commercial product. The new license will enable developers to create and sell their Kinect for Windows applications to end user customers using Kinect for Windows hardware on Windows platforms."
Developer Frequently Asked Questions
As explained by Avada Kedavra in his/her answer, these are some interesting differences:
supported operating systems: you can only use Microsoft SDK on Windows, while open source solutions are usually able to work on other operating systems;
programming languages: you have a wider choice with open source solutions, while Microsoft only supports C++ and C# (Visual Basic is no more supported with SDK 2.0);
documentation and support: Microsoft offer a good forum and a well done documentation (with a lot of samples); but there are several open source solution well documented;
license: Microsoft is less or more proprietary, open source is less or more free. Consider also that open source ideas have sometimes been bought by big companies, and transformed in something that is no more open. Probably yours will not be the case, but keep in mind this additional eventuality.
In my personal opinion, the most significant difference between open source solutions and Microsoft SDKs is strictly related to the skeletal tracking algorithm.
While depth and RGB data can be effectively provided by both open/free APIs and Microsoft SDKs, implementing skeletal tracking capabilities is not only a matter of reverse engineering.
To implement such an algorithm, developers must have strong competences in pattern recognition and machine learning areas, and I am quite sure that such kind of knowledge is available among the open source community. But the implementation of skeletal tracking is based on a "trained" algorithm, that requires a lot of experiments to collect very large amount of data. These data are then used to "train" the algorithm, that can recognize the skeletal joints.
Getting enough data, but also adjusting and properly using them, requires a lot of time and money. Microsoft researchers and developers are in the best conditions to work on this kind of stuff, simply because it is their job.
In my previous experiences, I noticed that open source solutions provide good skeletal tracking capabilities, but they are not at the same level of what Microsoft offers with its SDK.
Remember also that Microsoft SDK provide a lot of additional capabilities, like facial recognition or joint orientation, and several widgets very useful if you want to fastly build a gestural GUI.
So what I suggest is: if you are working on a project in which you simply need depth and/or RGB data, or if you have the necessity to use a programming language that is not supported by Microsoft SDK, then you should opt for open source solution. Otherwise, Microsoft SDK would be my best choice.
I would strongly recommend the Cinder framework. (libcinder.org)
It supports both OpenNI and Kinect develoment, if you're using C++. It now supports Kinect SDK 1.7 and OpenNI 2, via these Cinderblocks:
MS Kinect SDK 1.7 (stable)
https://github.com/BanTheRewind/Cinder-MsKinect
OpenNI 2 / NITE 2.2 (alpha)
https://github.com/wieden-kennedy/Cinder-OpenNI
Both can do skeletal tracking out of the boz, OpenNI being capable of tracking up to six skeletons simultaneously. OpenNI 2 is gaining rapidly on the Kinect, although the new Kinect will probably change that when it comes out next month. However the basic underlying principles are unlikely to change.
The main drawback with the initial release of OpenNI was that it required a full body activation pose to recognise a user, which was a deal breaker for a lot of applications - however this seems to have been solved in the newer versions and OpenNI 2 also supports robust hand tracking at close range, although it still requires a focus gesture initially. If you work on Mac or Linux, it's pretty much your only choice.