3D models (.obj, .fbx or .glb) do not load in HoloLens 2 3D Viewer - Blender

I am exporting simple 3D models as .obj, .fbx or .glb using Blender, and can successfully display them in the 3D Viewer app of a HoloLens 2.
As soon as the models are more complex (for example created by MakeHuman), the exports cannot be displayed in the HoloLens 2 3D Viewer app. The error message says that the models are not optimised for Windows Mixed Reality.
I found some documentation on the limitations of HoloLens 1 .glb files, but I cannot find the specification for HoloLens 2 and the three file formats.
In addition: should I reduce the complexity in the Blender models, or during the export, or are there even tools to post-process 3D models for HoloLens 2 / Windows Mixed Reality?

You can use the following link as a guide for optimizing your models - Optimize your 3D models

For the asset requirements of the pre-installed 3D Viewer app on HoloLens 2, please see the Asset requirements overview for more details. Here is a quote of the main points:
Exporting - Assets must be delivered in the .glb file format (binary glTF)
Modeling - Assets must be less than 10k triangles, have no more than 64 nodes and 32 submeshes per LOD
Materials - Textures can't be larger than 4096 x 4096 and the smallest mip map should be no larger than 4 on either dimension
Animation - Animations can't be longer than 20 minutes at 30 FPS (36,000 keyframes) and must contain <= 8192 morph target vertices
Optimizing - Assets should be optimized using the WindowsMRAssetConverter. Required on Windows OS versions <= 1709 and recommended on Windows OS versions >= 1803
For the question about other tools to post-process 3D models: you can easily optimize any glTF 2.0 model using the Windows Mixed Reality Asset Converter available on GitHub. It provides a command-line tool that applies the required steps in sequence to convert a glTF 2.0 core asset for use in the Windows Mixed Reality home.
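If you prefer to reduce the complexity in Blender itself before exporting, here is a minimal sketch using Blender's Python API; the 10% decimation ratio and the output file name are illustrative, not values prescribed by the documentation:

import bpy

# Decimate every mesh object to push the triangle count towards the 10k budget
for obj in bpy.data.objects:
    if obj.type == 'MESH':
        mod = obj.modifiers.new(name="Decimate", type='DECIMATE')
        mod.ratio = 0.1                                  # keep ~10% of the faces; tune per model
        bpy.context.view_layer.objects.active = obj
        bpy.ops.object.modifier_apply(modifier=mod.name)

# Export the whole scene as binary glTF (.glb), the format 3D Viewer expects
bpy.ops.export_scene.gltf(filepath="model.glb", export_format='GLB')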

From my experience, only the simplest models will successfully open in 3D Viewer, whether on HoloLens 1 or 2. A primary reason is that even a model that "looks simple" can easily consist of well over 10,000 polygons. For instance, a simple model of a screw, originally modeled in a CAD application, could be 10,000 polygons on its own. So imagine how many polygons the whole product model would be!

Related

Which file types are supported by the Microsoft Mesh app?

The Microsoft Mesh app on HoloLens 2 allows loading files from the local device. I want to load 3D models into my workspace, but don't know which types of 3D models are supported. I have tried .fbx files but had no success.
I have found the answer:
"For 3D content, only .glb files are supported at this time. We currently have a file size limit of 75MB, or maximum 300,000 vertex count for 3D models. If these limits are exceeded, you will fail to load your content, and get a warning: "This model is too complex"."
https://learn.microsoft.com/en-us/mesh/mesh-app/use-mesh/import-content#import-user-content
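To check a model against those limits before importing it, a short script can help; the sketch below uses the third-party trimesh library (not part of the Mesh documentation) and a hypothetical file name:

import os
import trimesh

path = "model.glb"                                       # hypothetical file
size_mb = os.path.getsize(path) / (1024 * 1024)
scene = trimesh.load(path, force='scene')                # .glb loads as a scene of meshes
vertex_count = sum(len(g.vertices) for g in scene.geometry.values())

print("size: %.1f MB, vertices: %d" % (size_mb, vertex_count))
if size_mb > 75 or vertex_count > 300_000:
    print("Exceeds the documented Mesh limits; decimate or split the model first.")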

Complete open source software stack which can be used for building digital twins?

For a PoC project, we would like to build a digital twin of a physical device, e.g. a coffee machine. We would like to know which open source software components can be used for this. Some software components, based on the information available, are mentioned below:
Eclipse Hono as the IoT platform / IoT gateway
Eclipse Vorto for describing information models
Eclipse Ditto for the digital twin representation. It provides an abstract representation of the device's last state in the form of HTTP or WebSocket APIs
Blender / Unreal Engine for 3D models
Protégé as the ontology editor
I have the following questions:
Are we missing any software components to create a digital twin of a physical asset?
Assuming we have 3D models available and sensor data is also available, how can we feed live sensor data into the 3D models, e.g. changing the color of a water tank based on the real sensor reading of the water tank level? I am not able to understand how real-time sensor data will be connected to the 3D models.
How will ontology be helpful in creating 3D models?
So you have a 3D model and sensor information, and you want to change some properties of the 3D model to reflect the sensor information? You shouldn't need to use 5 different tools for something like that. I would suggest looking into video game development tools like Unity3D or Unreal Engine 4.
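To illustrate the data flow behind question 2, here is a minimal sketch assuming the sensor readings arrive over MQTT; the broker address, topic name, payload shape and colour mapping are all hypothetical, and the resulting colour still has to be pushed into whatever engine renders the 3D model (Unity, Unreal, a web viewer, ...):

import json
import paho.mqtt.client as mqtt                          # assumes an MQTT broker carries the sensor data

def level_to_color(level_percent):
    # Map a 0-100 % tank level to a red-to-green RGB colour
    t = max(0.0, min(1.0, level_percent / 100.0))
    return (1.0 - t, t, 0.0)

def on_message(client, userdata, msg):
    reading = json.loads(msg.payload)                    # e.g. {"water_level": 42}
    color = level_to_color(reading["water_level"])
    # Hand the colour to the rendering side here (game-engine API, WebSocket, ...)
    print("new water tank colour:", color)

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)                        # broker address is a placeholder
client.subscribe("digitaltwin/coffeemachine/watertank")  # hypothetical topic
client.loop_forever()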

Making 3D scan model using Intel RealSense D435 Point clouds

Earlier this week I received the Intel RealSense D435 camera and now I am discovering its capabilities. After doing a few hours of research, I discovered the previous version of the SDK had a 3D model scan example application. Since SDK 2.0, this example application is no longer present, making it harder to create 3D models with the camera.
I have managed to create various point cloud (.ply) files with the camera, and I am now trying to use CloudCompare to generate a 3D model from them, but without any success. Since my knowledge about computer vision is rather basic, I am reaching out to the community to ask how a 3D model scan can be accomplished using only point clouds. The model can be rough, but preferably most of the noisy data needs to be removed.
Try RecFusion 1.7.3 for scanning. It costs 99 euros.
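If you want to stay with the point-cloud route described in the question, a rough open-source pipeline can also be sketched with the Open3D library (the parameters are illustrative and will need tuning per scan):

import open3d as o3d

pcd = o3d.io.read_point_cloud("scan.ply")                        # .ply exported from the D435
pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20,         # drop noisy outlier points
                                        std_ratio=2.0)
pcd.estimate_normals(
    search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.05, max_nn=30))
mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
o3d.io.write_triangle_mesh("scan_mesh.ply", mesh)                # rough reconstructed surface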

Why is the prediction rate so low, 25-40 [sec/image], using Faster RCNN for custom object detection on a GPU?

I have trained a faster_rcnn_inception_resnet_v2_atrous_coco model (available here) for custom object Detection.
For prediction, I used the object detection demo Jupyter notebook on my images. I also checked the time consumed in each step and found that sess.run was taking all the time.
But it takes around 25-40 [sec] to predict one image of (3000 x 2000) pixel size (around 1-2 [MB]) on the GPU.
Can anyone figure out the problem here?
I have performed profiling; link to download the profiling file
Link to the full profiling
System information:
Training and prediction on a virtual machine created in the Azure portal with Standard_NV6 (details here), which uses an NVIDIA Tesla M60 GPU
OS Platform and Distribution - Windows 10
TensorFlow installed from - using pip: pip3 install --upgrade tensorflow-gpu
TensorFlow version - 1.8.0
Python version - 3.6.5
GPU/CPU - GPU
CUDA/cuDNN version - CUDA 9/cuDNN 7
Sorry for being brutally open and straightforward about where the root cause of the observed performance problem is:
One could not find a worse VM setup in the Azure portfolio for such a compute-intensive (performance- and throughput-motivated) task. There simply is no less-equipped option on the menu.
Azure NV6 is explicitly marketed for the benefit of virtual desktop users, where the NVIDIA GRID(R) driver delivers a software layer of services for "sharing" parts of an also-virtualised frame buffer for image/video (desktop graphics pixels, max. SP codecs) among teams of users, irrespective of their terminal device (yet 15 users at most per each of the two on-board GPUs, for which it was specifically and explicitly advertised and promoted on Azure as its key selling point. NVIDIA goes even a step further, promoting this device explicitly for (cit.) Office Users).
The M60 (obviously, having been designed for a very different market segment) lacks any smart AI / ML / DL / tensor-processing features, and has ~20x lower DP performance than GPU devices specialised for AI / ML / DL / tensor-processing computing.
If I may cite,
... "GRID" is the software component that lays over a given set of Tesla ( Currently M10, M6, M60 ) (and previously Quadro (K1 / K2)) GPUs. In its most basic form (if you can call it that), the GRID software is currently for creating FrameBuffer profiles when using the GPUs in "Graphics" mode, which allows users to share a portion of the GPUs FrameBuffer whilst accessing the same physical GPU.
and
No, the M10, M6 and M60 are not specifically suited for AI. However, they will work, just not as efficiently as other GPUs. NVIDIA creates specific GPUs for specific workloads and industry (technological) areas of use, as each area has different requirements. (credits go to BJones)
Next,
if you are indeed willing to spend effort on this a priori known worst option à la carte:
make sure that both GPUs are in "Compute" mode, NOT "Graphics", if you're playing with AI. You can do that using the Linux Boot Utility you'll get with the correct M60 driver package after you've registered for the evaluation. (credits go again to BJones)
which obviously does not seem to be available for non-Linux / Azure-operated virtualised-access devices.
Résumé:
If striving for increased performance and throughput, best choose another, AI / ML / DL / tensor-processing-equipped GPU device, where problem-specific computing hardware resources are present and where there are no software layers (no GRID, or at least with an easily available disable option) that would in any sense block achieving such advanced levels of GPU-processing performance.
As the website says, the image size should be 600x600 and the reported results were obtained on an NVIDIA GeForce GTX TITAN X card. But first, please make sure your code is actually running on the GPU and not on the CPU. I suggest running your code, opening another window, and watching the GPU utilization using the command below to see if anything changes.
watch nvidia-smi
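You can also check from inside Python that TensorFlow (1.x, matching the version above) sees the GPU at all:

from tensorflow.python.client import device_lib      # TensorFlow 1.x utility
print(device_lib.list_local_devices())               # should list a /device:GPU:0 entry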
TensorFlow takes a long time for the initial setup. (Don't worry, it is just a one-time process.)
Loading the graph is a heavy process. I executed this code on my CPU, and it took almost 40 seconds to complete the program.
The time taken for the initial setup, like loading the graph, was 37 seconds.
The actual time taken for performing object detection was 3 seconds, i.e. 1.5 seconds per image.
If I had given it 100 images, the total time taken would be 37 + 1.5 * 100 seconds, because the graph does not have to be loaded 100 times.
So in your case, if that took 25 [s], then the initial setup would have taken ~ 23-24 [s]. The actual time should be ~ 1-2 [s].
You can verify this in the code. You may use the time module in Python:
import time                                  # used to obtain time stamps

for image_path in TEST_IMAGE_PATHS:          # iteration of images for detection
    # ------------------------------------ # processing one image begins here
    start = time.time()                      # save the current timestamp
    ...
    ...
    ...
    plt.imshow( image_np )
    # ------------------------------------ # processing one image ends here
    print( 'Time taken',
           time.time() - start               # calculating the time it has taken
           )
It is natural that big images take more time. TensorFlow object detection performs well even at lower resolutions like 400x400.
Take a copy of the original image and resize it to a lower resolution to perform object detection.
You will get bounding box coordinates. Now calculate the corresponding bounding box coordinates for your original, higher-resolution image and draw the bounding box on the original image.
i.e.
Imagine you have an image of 3000x2000.
Make a copy of it and resize it to 300x200.
Performing object detection on the resized image detects an object with bounding box (50, 100, 150, 150), i.e. (ymin, xmin, ymax, xmax).
The corresponding box coordinates for the larger original image will then be (500, 1000, 1500, 1500). Draw the rectangle on it.
Perform detection on small image then draw bounding box on original image.
Performance will be improved tremendously.
Note: TensorFlow supports normalized coordinates.
i.e. if you have an image with height 100 and ymin = 50, then the normalized ymin is 0.5.
You can map normalized coordinates to an image of any dimension just by multiplying by the height or width for the y and x coordinates respectively.
I suggest using OpenCV (cv2) for all image processing.
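A minimal sketch of the resize-then-rescale idea with cv2; the detection call itself is left out, and the normalized box values are just an example:

import cv2

original = cv2.imread("image.jpg")                    # e.g. the 3000 x 2000 input
h, w = original.shape[:2]

small = cv2.resize(original, (400, 400))              # run detection on this copy
# suppose detection returned one normalized box (ymin, xmin, ymax, xmax)
ymin, xmin, ymax, xmax = 0.25, 0.33, 0.75, 0.75

# map the normalized coordinates straight onto the full-resolution image
top, left = int(ymin * h), int(xmin * w)
bottom, right = int(ymax * h), int(xmax * w)
cv2.rectangle(original, (left, top), (right, bottom), (0, 255, 0), 3)
cv2.imwrite("annotated.jpg", original)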

Need to add an interactive 3D model to my otherwise non-3D app

As briefly as I can: are there any frameworks available that I can drop into an iPad app I'm working on, along with a 3D model, that will allow me to add a view presenting the model in an interactive format?
The model needs to be rotatable, and ideally I would like to be able to add interactive points to the model that pop up modal views when tapped.
I have never worked with 3D before in any respect, so I'm coming at that part as a complete novice. The 3D model is being supplied to me and will be available in "various formats". The rest of the app is pure Objective-C, in which I'm proficient enough.
I have Googled and Googled and have come up with nothing so far.
Failing there being any drop-in frameworks, how much of a challenge is it likely to be to get myself up to speed with what I would need to know? Are there any good starting points to expand my knowledge here?
3D is a complex matter; if you don't see yourself dealing with it on a regular basis, I wouldn't recommend writing your own solution for it.
The closest thing you can find to a drag-and-drop framework would be the SDK of the iPhone / iPad GPU's manufacturer. It's pretty easy to use.
PowerVR SDK Download
After a free registration on their website, you can download the SDK, which contains lots of samples with source code. Their framework displays 3D models in their own POD format, which is of course heavily optimized for iOS devices. Ask your 3D model provider to give you the models in POD format (you can find POD converters / exporters for Maya etc. on PowerVR's website as well).