iOS - 2d image turn into a 3d - objective-c

I was checking out this cool app called Morfo. According to their product description -
Use Morfo to quickly turn a photo of your friend's face into a
talking, dancing, crazy 3D character! Once captured, you can make your
friend say anything you want in a silly voice, rock out, wear makeup,
sport a pair of huge green cat eyes, suddenly gain 300lbs, and more.
So if you take a normal 2D image of steve jobs & feed it to this app it converts it into a 3D model of that image & the user can interact with it.
My questions are as following -
How are they doing this?
How is this possible in iPad?
Isn't it computationally intensive to render and convert 2D image into 3D?
Any pointers, links to websites or libraries in objectiveC which do this is very much appreciated.
UPDATE: this demo of this product here shows how morfo, uses a template mechanism to do the conversion. i.e. after a 2D image is fed, one needs to set the boundaries of the face, where the eyes are located, size & length of lips. then it goes off to convert it into a 3D model. How is this part done? What frameworks or libraries they might be using?

This is a broad question but i can point you in the right direction of how 3D Rendering works, trust me this is a huge subject with decades of work behind it and to much to put here. Not sure how up to speed you are on 3D Rendering techniques so i will give you a basic idea of texturing and point you to a good set of tutorials.
How are they doing this?
The idea is that in 3D Rendering, 3D models can be textured with a 2d image known as a texture map. You use a 2D image and wrap it around a 3d model, be that a simple primitive like a sphere of a cube or more advanced such as the classic teapot or the model of a human head e.t.c. A texture can be taken from anywhere, I have used the camera feed in the past to texture meshes with the video from the camera stream, I have used photos from the camera which s how there doing it. So this is how the face is rendered to the 3D Model.
Is this efficient?
On iOS and most mobile devices 3D rendering uses hardware acceleration utilizing OpenGLES. In regards to your question this is really fast depending on how you implement your render code.
The way it uses the mapping (scale rotate template in the video) as mentioned by anticyclope allows you to make the texture fit a model and also place the eyes which are part of there render code.
So if you want to pick this up i recommend reading Jeff Lamarche Tutorial "from the ground up" as a primer:
http://iphonedevelopment.blogspot.com/2009/05/opengl-es-from-ground-up-table-of.html
Second to that i have read about 4 books on OpenGLES, for general design and for platforms specifics. I recommend this book:
http://www.amazon.co.uk/iPhone-Programming-Developing-Graphical-Applications/dp/0596804822/ref=sr_1_1?ie=UTF8&qid=1331114559&sr=8-1

In my opinion, there is how they doing it. Just my thoughts, haven't saw the application in real-life.
They have a 3D model of human's head. When you click on certain points on 2D image, they are adjusting corresponding points in 3D model, so it is represents a specific face's features like distance between eyes, lips width and so on. Next, texture from 2D image is applied to 3D model using that control points, so we have a textured 3D model of human's head. Given the fact, that our perception is able to reconstruct a 3D shape from 2D images (say, we looking at 2D photo and still imagining a 3D person), there's no need to reconstruct 3D shape accurately, texture will do the work.

There is an issue in the rendering of 3D images, called UV mapping, takes the 3D model and defines a set of edges, and this creates an image that is used to generate different textures to the model.
Now if you notice in Morfo, you define the edge of the head, eyes, mouth and nose. with this information the Morfo knows how to place it texture to the model that has defined.
the process of loading a texture on a model is not very complex and this can be done on any device that has support of some technology such as OpenGL

Isn't it computationally intensive to render and convert 2D image into 3D?
Apple is sinking billions of dollars into developing custom chipsets, and recent models have impressive performance, considering the battery life and low operating temperature (no fans).

Related

Capture a live video of handwriting using pen and paper and replace the hand in video with some object or cursor

I want to process the captured video. I will try to capture the video of handwriting on paper / drawing on paper. But I do not want to show the hand or pen on the paper while live streaming via p5.js.
Can this be done using machine learning?
Any idea how to implement this?
If I understand you right you want to detect where in the image the hand is a draw an overlay on this position right?
If so You can use YOLO more information to detect where the hand is.
There are some trained networks that you can download maybe they are good enough, maybe you have to train your own just for handy.
There are also some libery for yolo and JS https://github.com/ModelDepot/tfjs-yolo-tiny
You may not need to go the full ML object segmentation route.
If the paper's position and illumination are constant (or at least knowable) you could try some simple heuristic comparing the pixels in the current frame with a short history and using the most constant pixel values. There might be some lag as new parts of your drawing 'become constant' so maybe you could try some modification to the accumulation, such as if the pixel was white and is going black.

2D image to 3D world coordinate in perspective view

I have been trying to locate detected objects from 2D image in the 3D space for a single fixed camera installed at the height.
I went trough the similar questions, but the perspective view is not mentioned.
What I have:
The height of the camera
Calibration parameters
The exact location of one fixed object in view
I've wrote a set of solutions to this kind of problem. 3d points reconstruction from 2d coordinates (yes, "in perspective"), are obtained by means of the extrinsics matrix. See https://github.com/rodolfoap/screen2world-k. Other methods are linked from there.

How can a 3D game render an object without having a sprite for every single angle?

When learning to program simple 2D games, each object would have a sprite sheet with little pictures of how a player would look in every frame/animation. 3D models don't seem to work this way or we would need one image for every possible view of the object!
For example, a rotating cube would need a lot images depicting how it would look on every single side. So my question is, how are 3D model "images" represented and rendered by the engine when viewed from arbitrary perspectives?
Multiple methods
There is a number of methods for rendering and storing 3D graphics and models. There are even different methods for rendering 2D graphics! In addition to 2D bitmaps, you also have SVG. SVG uses numbers to define points in an image. These points make shapes. The points can also define curves. This allows you to make images without the need for pixels. The result can be smaller file sizes, in addition to the ability to transform the image (scale and rotate) without causing distortion. Most 3D graphics use a similar technique, except in 3D. What these methods have in common, however, is that they all ultimately render the data to a 2D grid of pixels.
Projection
The most common method for rendering 3D models is projection. All of the shapes to be rendered are broken down into triangles before rendering. Why triangles? Because triangles are guaranteed to be coplanar. That saves a lot of work for the renderer since it doesn't have to worry about "coloring outside of the lines". One drawback to this is that most 3D graphics projection technologies don't support perfect spheres or other round surfaces. You have to use approximations and other tricks to make round surfaces (although there are some renderers which support round surfaces). The next step is to convert or project all of the 3D points into 2D points on the screen (as seen below).
From there, you essentially "color in" the triangles to make everything look solid. While this is pretty fast, another downside is that you can't really have things like reflections and refractions. Anytime you see a refractive or reflective surface in a game, they are only using trickery to make it look like a reflective or refractive material. The same goes for lighting and shading.
Here is an example of special coloring being used to make a sphere approximation look smooth. Notice that you can still see straight lines around the smoothed version:
Ray tracing
You also can render polygons using ray tracing. With this method, you basically trace the paths that the light takes to reach the camera. This allows you to make realistic reflections and refractions. However, I won't go into detail since it is too slow to realistically use in games currently. It is mainly used for 3D animations (like what Pixar makes). Simple scenes with low quality settings can be ray traced pretty quickly. But with complicated, realistic scenes, rendering can take several hours for a single frame (as is the case with Pixar movies). However, it does produce ultra realistic images:
Ray casting
Ray casting is not to be confused with the above-mentioned ray tracing. Ray casting does not trace the light paths. That means that you only have flat surfaces; not reflective. It also does not produce realistic light. However, this can be done relatively quickly, since in most cases you don't even need to cast a ray for every pixel. This is the method that was used for early games such as Doom and Wolfenstein 3D. In early games, ray casting was used for the maps, and the characters and other items were rendered using 2D sprites that were always facing the camera. The sprites were drawn from a few different angles to make them look 3D. Here is an image of Wolfenstein 3D:
Castle Wolfenstein with JavaScript and HTML5 Canvas: Image by Martin Kliehm
Storing the data
3D data can be stored using multiple methods. It is not necessarily dependent on the rendering method that is used. The stored data doesn't mean anything by itself, so you have to render it using one of the methods that have already been mentioned.
Polygons
This is similar to SVG. It is also the most common method for storing model data. You define the geometry using 3D points. These points can have other properties, such as texture data (in the form of UV mapping), color data, and whatever else you might want.
The data can be stored using a number of file formats. A common file format that is used is COLLADA, which is an XML file that stores the 3D data. There are a lot of other formats though. Fundamentally, however, all file formats are still storing the 3D data.
Here is an example of a polygon model:
Voxels
This method is pretty simple. You can think of voxel models like bitmaps, except they are a bunch of bitmaps layered together to make 3D bitmaps. So you have a 3D grid of pixels. One way of rendering voxels is converting the voxel points to 3D cubes. Note that voxels do not have to be rendered as cubes, however. Like pixels, they are only points that may have color data which can be interpreted in different ways. I won't go into much detail since this isn't too common and you generally render the voxels with polygon methods (like when you render them as cubes. Here is an example of a voxel model:
Image by Wikipedia user Vossman
In the 2D world with sprite sheets, you are drawing one of the sprites depending on the state of the actor (visual representation of your object). In the 3D world you are rendering a model for your actor that is a series of polygons with a texture mapped to it. There are standardized model files (I am mostly familiar with Autodesk 3DS Max), in which the model and the assigned textures can be packaged together (a .3DS or .MAX file), providing everything your graphics library needs to render the object and its textures.
In a nutshell, you don't use images for each view of a 3D object, you have a model with a texture rendered on it, creating a dynamic view as it is rendered by the graphics library.

3D Transformations on a Quartz2D path — Drawing Application

I'm in the planning stage of writing a Cocoa drawing application (for Mac, not iOS), and I'm trying to discern whether one of my features is technically possible via any of the drawing frameworks. Any help or relevant information would be greatly appreciated.
The idea is to apply a 3D transformation to an object drawn with Quartz2D. I've considered capturing the relevant portion of the canvas View (where objects are drawn) as an image and sending it to Core Animation, but that doesn't seem like the best option. Since this is a drawing application, it's less about 3D animation than it is about the transformed shape. This solution is also less than ideal because I assume that if the 2D object were a vector path rather a bitmap image, I would have to rasterize it to apply such a transformation. The ideal implementation would enable the user to dynamically rotate a flat object in 3 dimensions until she found a suitable orientation, lock in this transformation, and still be able to manually adjust the path's vector points.
Is this feasible? Would it require working directly with OpenGL? Help of any kind is most welcome.
Thank you!
Seems to me that anything you'd do with a 3D transform, you should be able to do with multiple affine transforms. See UIBezierPath's -applyTransform method.

Is it possible to animate markers in ArcMap?

I'm completely new to ArcGIS and ArcMap, but someone suggested this program to me for a project I'm working on.
I would like to animate individual entities on a map, and was wondering if it is possible to do so in ArcMap. I asked this earlier here and a member directed me to a tutorial on animating in ArcGIS. The animation in the guide was over a map spread (ie. each pixel on the map displays, say, a different color to indicate population data in the area). However I realized that if I zoom in a lot, eventually the image will degenerate into pixels, which is why I need an actual object to mark a certain point. I checked some online tutorials and it seems like we can place markers on the map. Can someone tell me if it is possible to animate these markers (for example via a for-loop)? And if so, could you point me in a direction where to start?
Thanks in advance!
You can animate layers in ArcMap is the short answer. Its not as simple as using the timeline feature in Google Earth for example though. But then ArcMap is much more than just a visualization tool.
This help page on the ESRI web help looks like a good place to start.
I'm not 100% sure what you mean by the image degenerates into pixels. Are you saying that the markers were single points in the layer. Unlike Google Earth you are not confined to simply plotting points on the map. You can draw completely arbitrary shapes in ArcMap, which can be defined to cover actual areas of the map, so when you zoom-in the shape gets larger.
The way you need to load data into ArcMap to produce an animation isn't too simple. There might be other ways to do this, but the way I know of is to generate a NetCDF file. This file contains a 3D matrix of layer data, where each layer is separated through time. Because you generate a matrix, you are effectively placing a raster image over the map. Thus if you want to cover a large area, each matrix becomes large, and you multiply that by the number of time slices you wish to animate over.
Once you have a NetCDF file with your data in however, getting ArcMap to animate it and produce say a .avi file is pretty simple.
You could try just loading some of the example NetCDF datasets into ArcMap to see how/if they will work to get you started.
Hope that helps.
The upcoming v10 will have better time-aware capabilities, which will allow for animation.