Earlier this week I received the Intel RealSense D435 camera and I am now exploring its capabilities. After a few hours of research, I discovered that the previous version of the SDK included a 3D model scan example application. Since SDK 2.0, this example is no longer included, which makes it harder to create 3D models with the camera.
I have managed to create several point cloud (.ply) files with the camera, and I am now trying to use CloudCompare to generate a 3D model from them, so far without any success. Since my knowledge of computer vision is rather basic, I am reaching out to the community to ask how a 3D model scan can be accomplished using only point clouds. The model can be rough, but preferably most of the noisy data should be removed.
Try RecFusion 1.7.3 for scanning; it costs 99 euros.
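If you would rather script the filtering and meshing the question describes, PCL can do both. Below is a minimal, untested sketch, assuming a single scan.ply exported from the camera: statistical outlier removal drops most of the noise, then normals are estimated and Poisson reconstruction produces a rough mesh. File names and all parameter values are placeholders that need tuning per scan.

// Rough sketch: load a .ply, drop statistical outliers, estimate normals,
// and run Poisson reconstruction to obtain a coarse mesh.
#include <pcl/io/ply_io.h>
#include <pcl/common/io.h>
#include <pcl/filters/statistical_outlier_removal.h>
#include <pcl/features/normal_3d.h>
#include <pcl/search/kdtree.h>
#include <pcl/surface/poisson.h>

int main()
{
    pcl::PointCloud<pcl::PointXYZ>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::io::loadPLYFile("scan.ply", *cloud);

    // Remove points whose mean distance to their neighbours deviates too much.
    pcl::PointCloud<pcl::PointXYZ>::Ptr filtered(new pcl::PointCloud<pcl::PointXYZ>);
    pcl::StatisticalOutlierRemoval<pcl::PointXYZ> sor;
    sor.setInputCloud(cloud);
    sor.setMeanK(50);
    sor.setStddevMulThresh(1.0);
    sor.filter(*filtered);

    // Poisson reconstruction needs oriented normals.
    pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
    pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
    ne.setInputCloud(filtered);
    ne.setSearchMethod(tree);
    ne.setKSearch(30);
    ne.compute(*normals);

    pcl::PointCloud<pcl::PointNormal>::Ptr with_normals(new pcl::PointCloud<pcl::PointNormal>);
    pcl::concatenateFields(*filtered, *normals, *with_normals);

    // Coarse watertight mesh; 'depth' controls the octree resolution.
    pcl::Poisson<pcl::PointNormal> poisson;
    poisson.setDepth(8);
    poisson.setInputCloud(with_normals);
    pcl::PolygonMesh mesh;
    poisson.reconstruct(mesh);

    pcl::io::savePLYFile("mesh.ply", mesh);
    return 0;
}

The resulting mesh can then be opened in CloudCompare or MeshLab for cropping and further cleanup.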
Related
I am exporting simple 3D models as .obj, .fbx or .glb using Blender, and I can successfully display them in the 3D Viewer app on a HoloLens 2.
As soon as the models are more complex (for example, created with MakeHuman), the exports cannot be displayed in the HoloLens 2 3D Viewer app. The error message says that the models are not optimised for Windows Mixed Reality.
I found some documentation on the limitations of .glb files for HoloLens 1, but I cannot find the specification for HoloLens 2 and the three file formats.
In addition: should I reduce the complexity in the Blender models, or during the export, or are there tools to post-process 3D models for HoloLens 2 / Windows Mixed Reality?
You can use the following link as a guide for optimizing your models - Optimize your 3D models
For the asset requirements of the pre-installed 3D Viewer app on HoloLens 2, please see the Asset requirements overview; here is a quote of the main points:
Exporting - Assets must be delivered in the .glb file format (binary glTF)
Modeling - Assets must be less than 10k triangles, have no more than 64 nodes and 32 submeshes per LOD
Materials - Textures can't be larger than 4096 x 4096 and the smallest mip map should be no larger than 4 on either dimension
Animation - Animations can't be longer than 20 minutes at 30 FPS (36,000 keyframes) and must contain <= 8192 morph target vertices
Optimizing - Assets should be optimized using the WindowsMRAssetConverter. Required on Windows OS versions <= 1709 and recommended on Windows OS versions >= 1803
As for other tools to post-process 3D models: you can easily optimize any glTF 2.0 model using the Windows Mixed Reality Asset Converter available on GitHub. It includes a command-line tool that performs the required conversion steps in sequence to prepare a glTF 2.0 core asset for use in the Windows Mixed Reality home.
In my experience, only the simplest models will successfully open in 3D Viewer, whether on HoloLens 1 or 2. A primary reason is that even a model that "looks simple" may well consist of far more than 10,000 polygons. For instance, a simple model of a screw, originally modeled in a CAD application, can by itself reach 10,000 polygons. So imagine how many polygons the whole product model would be!
Being a novice, I need advice on how to solve the following problem.
Say I have obtained a point cloud of part of my room with photogrammetry. I then upload this point cloud to an Android phone and want the phone to track its camera pose relative to this point cloud in real time.
As far as I know, differences in camera intrinsics (a simple camera or another phone's camera vs. my phone's camera) can affect the precision of localisation, right?
It's supposed to be an AR app, so I've tried the existing SDKs - Vuforia, Wikitude, Placenote (I haven't tried ARCore yet because my device most likely won't support it). The problem is that they all rely on their own cloud services, and I don't want to depend on them. Ideally, my own PC is where I perform the 3D reconstruction and from where my phone downloads the point cloud.
I need SLAM (with IMU fusion) or VIO on my phone, don't I? Are there any ready-to-use implementations in libraries like ARToolKit or perhaps PCL? Will an existing SLAM system pick up a map reconstructed with other algorithms, or should I use one and the same SLAM system for both mapping and localization?
So the main question is how to do everything ARCore and Vuforia do without using third-party servers. (I suspect the answer is to devise the same underlying layer that Vuforia and the other SDKs use to exploit all available hardware.)
The short question: I am wondering whether the Kinect SDK / NiTE can be exploited to get "depth image in, skeleton out" software.
The long question: I am trying to dump the depth, RGB, and skeleton data streams captured from a Kinect v2 into rosbags. However, to the best of my knowledge, capturing the skeleton stream on Linux with ROS and a Kinect v2 isn't possible yet. Therefore, I was wondering whether I could dump rosbags containing the RGB and depth streams and then post-process them to obtain the skeleton stream.
I can capture all three streams on Windows using the Microsoft Kinect v2 SDK, but dumping them to rosbags with all the metadata (camera_info, sync info, etc.) would be painful (correct me if I am wrong).
It's been quite some time since I worked with NiTE (and I only used the Kinect v1), so maybe someone else can give a more up-to-date answer, but from what I remember this should easily be possible.
As long as all the relevant data is published via ROS topics, it is quite easy to record it with rosbag and play it back afterwards. Every node that can handle live data from the sensor will also be able to do the same work on recorded data coming from a bag file.
One issue you may encounter is that when you record Kinect data, the bag files quickly become very large (several gigabytes). This can be problematic if you want to edit the file afterwards on a machine with little RAM. If you only want to play the file back, or if you have enough RAM, this should not really be a problem, though.
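For the offline post-processing step, the standard rosbag C++ API can iterate over the recorded images; here is a rough sketch (the topic names are only examples and depend on the driver that wrote the bag):

// Sketch: iterate over depth/RGB images recorded in a bag file (ROS 1 C++ API).
#include <rosbag/bag.h>
#include <rosbag/view.h>
#include <sensor_msgs/Image.h>
#include <string>
#include <vector>

int main()
{
    rosbag::Bag bag;
    bag.open("kinect.bag", rosbag::bagmode::Read);

    // Example topic names; use whatever your recording node actually published.
    std::vector<std::string> topics = {"/kinect2/sd/image_depth", "/kinect2/hd/image_color"};
    rosbag::View view(bag, rosbag::TopicQuery(topics));

    for (const rosbag::MessageInstance& m : view)
    {
        sensor_msgs::Image::ConstPtr img = m.instantiate<sensor_msgs::Image>();
        if (img)
        {
            // Hand the frame to the offline processing step,
            // e.g. whatever component runs skeleton tracking on the depth image.
        }
    }

    bag.close();
    return 0;
}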
Indeed, it is possible to perform NiTE2 skeleton tracking on any depth image stream.
Refer to:
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/How-to-use
and
https://github.com/VIML/VirtualDeviceForOpenNI2/wiki/About-PrimeSense-NiTE
With this extension you can add a virtual device that lets you manipulate each pixel of the depth stream. This device can then be used to create a UserTracker object. As long as the right device name is provided, skeleton tracking can be done:
\OpenNI2\VirtualDevice\Kinect
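For illustration, opening that virtual device and running a UserTracker on it could look roughly like this. This is only a sketch based on the wiki pages above; error handling and the code that feeds your own depth frames into the virtual stream are omitted.

#include <OpenNI.h>
#include <NiTE.h>
#include <cstdio>

int main()
{
    openni::OpenNI::initialize();
    nite::NiTE::initialize();

    // Open the virtual device by its URI instead of a physical sensor.
    openni::Device device;
    if (device.open("\\OpenNI2\\VirtualDevice\\Kinect") != openni::STATUS_OK)
        return 1;

    // A UserTracker created on this device consumes whatever depth frames
    // the application pushes into the virtual depth stream.
    nite::UserTracker tracker;
    if (tracker.create(&device) != nite::STATUS_OK)
        return 1;

    nite::UserTrackerFrameRef frame;
    while (true)  // sketch: loop until the application is stopped
    {
        if (tracker.readFrame(&frame) != nite::STATUS_OK)
            continue;

        const nite::Array<nite::UserData>& users = frame.getUsers();
        for (int i = 0; i < users.getSize(); ++i)
        {
            const nite::UserData& user = users[i];
            if (user.isNew())
                tracker.startSkeletonTracking(user.getId());
            else if (user.getSkeleton().getState() == nite::SKELETON_TRACKED)
            {
                const nite::SkeletonJoint& head = user.getSkeleton().getJoint(nite::JOINT_HEAD);
                printf("user %d head at (%.0f, %.0f, %.0f)\n", (int)user.getId(),
                       head.getPosition().x, head.getPosition().y, head.getPosition().z);
            }
        }
    }
}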
But consider the usage limits: NiTE is only allowed to be used with "Authorized Hardware".
I am very new to Kinect programming and have been tasked with understanding several methods for 3D point cloud stitching using the Kinect and OpenCV. While waiting for the Kinect sensor to be shipped over, I am trying to run the SDK samples on some datasets.
I am really clueless about where to start, so I downloaded some datasets here, but I do not understand how I am supposed to view/parse them. I tried running the Kinect SDK samples (DepthBasic-D2D) in Visual Studio, but the only thing that appears is a white screen with a screenshot button.
There seems to be very little documentation on how all these things work, so I would appreciate it if anyone could point me to the right resources on how to obtain and parse depth maps, or how to get the SDK samples to work.
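For reference, a dataset depth frame is typically just a 16-bit image with depth in millimetres, and back-projecting it into 3D points only needs the camera intrinsics. A rough sketch with OpenCV (the intrinsics below are commonly quoted Kinect v1 placeholder values, not the ones from any particular dataset):

#include <opencv2/opencv.hpp>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Point3 { float x, y, z; };

int main()
{
    // 16-bit depth image, one value per pixel, in millimetres (0 = no measurement).
    cv::Mat depth = cv::imread("depth_0001.png", cv::IMREAD_UNCHANGED);
    if (depth.empty() || depth.type() != CV_16U)
        return 1;

    const float fx = 525.0f, fy = 525.0f;   // focal lengths in pixels (placeholder)
    const float cx = 319.5f, cy = 239.5f;   // principal point (placeholder)

    std::vector<Point3> cloud;
    for (int v = 0; v < depth.rows; ++v)
        for (int u = 0; u < depth.cols; ++u)
        {
            uint16_t d = depth.at<uint16_t>(v, u);
            if (d == 0) continue;
            float z = d * 0.001f;  // millimetres to metres
            cloud.push_back({ (u - cx) * z / fx, (v - cy) * z / fy, z });
        }

    printf("back-projected %zu points\n", cloud.size());
    return 0;
}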
The Point Cloud Library (PCL) is a good starting point for handling point cloud data obtained with the Kinect and the OpenNI driver.
OpenNI is, among other things, open-source software that provides an API to communicate with vision and audio sensor devices (such as the Kinect). Using OpenNI, you can access the raw data acquired with your Kinect and use it as input for your PCL software, which can then process the data. In other words, OpenNI is an alternative to the official Kinect SDK that is compatible with many more devices and has great support and tutorials!
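As an illustration, the grabber loop from the PCL OpenNI tutorial looks roughly like this (assuming PCL was built with OpenNI support):

#include <pcl/io/openni_grabber.h>
#include <pcl/visualization/cloud_viewer.h>

class SimpleOpenNIViewer
{
public:
    SimpleOpenNIViewer() : viewer("PCL OpenNI Viewer") {}

    // Called by the grabber for every frame; 'cloud' is one organized
    // point cloud built from the Kinect depth (and RGB) data.
    void cloud_cb_(const pcl::PointCloud<pcl::PointXYZRGBA>::ConstPtr& cloud)
    {
        if (!viewer.wasStopped())
            viewer.showCloud(cloud);
    }

    void run()
    {
        pcl::Grabber* interface = new pcl::OpenNIGrabber();

        boost::function<void (const pcl::PointCloud<pcl::PointXYZRGBA>::ConstPtr&)> f =
            boost::bind(&SimpleOpenNIViewer::cloud_cb_, this, _1);

        interface->registerCallback(f);
        interface->start();

        while (!viewer.wasStopped())
            boost::this_thread::sleep(boost::posix_time::seconds(1));

        interface->stop();
    }

    pcl::visualization::CloudViewer viewer;
};

int main()
{
    SimpleOpenNIViewer v;
    v.run();
    return 0;
}

Each callback delivers one cloud, which you could also save to disk (for example with pcl::io::savePCDFileBinary) and later feed into your stitching experiments.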
There are plenty of tutorials out there like this, this and these.
Also, this question is highly related.
As briefly as I can: are there any frameworks available that I can drop into an iPad app I'm working on, along with a 3D model, that would allow me to add a view presenting the model in an interactive format?
The model needs to be rotatable, and ideally I would like to be able to add interactive points onto the model that pop up modal views when tapped.
I have never worked with 3D before in any respect, so I'm coming at that part as a complete novice. The 3D model is being supplied to me and will be available in "various formats". The rest of the app is pure Objective-C, in which I'm proficient enough.
I have Googled and Googled and have come up with nothing so far.
Failing any drop-in frameworks, how much of a challenge is it likely to be to get up to speed with what I would need to know? Are there any good starting points for expanding my knowledge here?
3D is a complex matter; if you don't see yourself dealing with it on a regular basis, I wouldn't recommend writing your own solution for it.
The closest thing you can find to a drag-and-drop framework would be the SDK from the manufacturer of the iPhone/iPad GPU. It's pretty easy to use.
PowerVR SDK Download
After a free registration on their website, you can download the SDK, which contains lots of samples with source code. Their framework displays 3D models in their own POD format, which is of course heavily optimized for iOS devices. Ask your 3D model provider to give you the models in POD format (you can also find POD converters/exporters for Maya etc. on PowerVR's website).