Graphics processor inventor NVIDIA has been taking a much closer look at 3D printing recently, especially as GPU-based AI systems become increasingly relevant to understanding and managing the enormous amount of data that goes into digitizing the physical world. NVIDIA has been working with Dyndrite on managing additive manufacturing (AM) data. Now its own researchers have developed the DIB-R AI framework to rapidly create 3D models from flat images.
Machine learning models need to be able to see objects in three dimensions so that they can accurately understand image data. NVIDIA researchers have now made this possible by creating the rendering framework called DIB-R — a differentiable interpolation-based renderer — that produces 3D objects from 2D images.
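To give a rough sense of what "interpolation-based" can mean in a renderer, here is a minimal sketch of barycentric interpolation, the standard way to blend per-vertex attributes across a triangle. It is a generic graphics technique written for illustration, not NVIDIA's code, and the function names are hypothetical. The relevant property is that the pixel value is a smooth function of the vertex positions and colors, which is what lets gradients flow back through a renderer of this kind.

```python
def barycentric_weights(p, a, b, c):
    """Return weights (wa, wb, wc) of 2D point p relative to triangle a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    # Twice the signed area of the triangle; zero means it is degenerate.
    denom = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    if denom == 0:
        raise ValueError("degenerate triangle")
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / denom
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / denom
    return wa, wb, 1.0 - wa - wb


def interpolate_color(p, triangle, colors):
    """Blend three per-vertex RGB colors at pixel p using barycentric weights."""
    weights = barycentric_weights(p, *triangle)
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, colors))
        for channel in range(3)
    )


if __name__ == "__main__":
    triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    colors = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    print(interpolate_color((0.25, 0.25), triangle, colors))
```

A pixel sitting exactly on a vertex takes that vertex's color, and pixels in between receive a weighted mix of all three.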
The researchers presented their model at the annual Conference on Neural Information Processing Systems (NeurIPS), in Vancouver. Their research was also published in a paper titled “Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer”.
In traditional computer graphics, a pipeline renders a 3D model to a 2D screen. But there’s information to be gained from doing the opposite — a model that could infer a 3D object from a 2D image would be able to perform better object tracking, for example.
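The forward direction of that pipeline can be sketched with a simple pinhole camera projection. This is a textbook simplification for illustration only: the function names and the `focal_length` parameter are assumptions, and a real pipeline also handles camera transforms, clipping and rasterization.

```python
def project_vertex(vertex, focal_length=1.0):
    """Project a camera-space 3D point onto the 2D image plane (pinhole model)."""
    x, y, z = vertex
    if z <= 0:
        raise ValueError("vertex must be in front of the camera (z > 0)")
    # Dividing by depth makes distant points land closer to the image center.
    return (focal_length * x / z, focal_length * y / z)


def project_mesh(vertices, focal_length=1.0):
    """Project every vertex of a mesh; the depth value is discarded."""
    return [project_vertex(v, focal_length) for v in vertices]


if __name__ == "__main__":
    # Two points at different depths that land on the same pixel.
    print(project_mesh([(1.0, 2.0, 1.0), (2.0, 4.0, 2.0)]))
```

Because depth is divided out, many different 3D points map to the same 2D location, which is why going from the image back to the 3D object is the harder direction.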
NVIDIA researchers wanted to build an architecture that could do this while integrating seamlessly with machine learning techniques. The result, DIB-R, produces high-fidelity rendering by using an encoder-decoder architecture, a type of neural network that transforms input into a feature map or vector that is used to predict specific information such as shape, color, texture and lighting of an image.
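The encoder-decoder idea can be sketched in a few lines. The toy below is written in plain Python rather than PyTorch to stay dependency-free, uses random untrained weights, and invents its own layer sizes and output names (`vertex_offsets`, `vertex_colors`, `light_direction`). It only shows the data flow from image to feature vector to separate predictions, and is not the architecture from the paper.

```python
import random


def linear(inputs, weights, biases):
    """Dense layer: one output value per row of the weight matrix."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def make_layer(rng, n_in, n_out):
    """Create small random weights and zero biases (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def group_triples(flat):
    """Reshape a flat list into (x, y, z) or (r, g, b) triples."""
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


class TinyEncoderDecoder:
    def __init__(self, n_pixels, n_latent, n_vertices, seed=0):
        rng = random.Random(seed)
        self.encoder = make_layer(rng, n_pixels, n_latent)
        # One decoder head per predicted property.
        self.shape_head = make_layer(rng, n_latent, n_vertices * 3)
        self.color_head = make_layer(rng, n_latent, n_vertices * 3)
        self.light_head = make_layer(rng, n_latent, 3)

    def encode(self, pixels):
        """Compress the flattened image into a feature vector (ReLU activation)."""
        return [max(0.0, v) for v in linear(pixels, *self.encoder)]

    def decode(self, latent):
        """Predict shape, color and lighting from the feature vector."""
        return {
            "vertex_offsets": group_triples(linear(latent, *self.shape_head)),
            "vertex_colors": group_triples(linear(latent, *self.color_head)),
            "light_direction": tuple(linear(latent, *self.light_head)),
        }

    def __call__(self, pixels):
        return self.decode(self.encode(pixels))


if __name__ == "__main__":
    model = TinyEncoderDecoder(n_pixels=16, n_latent=8, n_vertices=4)
    prediction = model([0.5] * 16)  # a flattened 4x4 grayscale "image"
    print(len(prediction["vertex_offsets"]), prediction["light_direction"])
```

In a trained system the weights would be learned by rendering the predicted 3D object back to 2D and comparing the result with the input image.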
DIB-R is especially useful in fields like robotics. For an autonomous robot to interact safely and efficiently with its environment, it must be able to sense and understand its surroundings. DIB-R could potentially improve those depth perception capabilities.
Training the model takes two days on a single NVIDIA V100 GPU, whereas it would take several weeks without NVIDIA GPUs. Once trained, DIB-R can produce a 3D object from a 2D image in less than 100 milliseconds. It does so by altering a polygon sphere, the traditional template that represents a 3D shape, until it matches the real object shape portrayed in the 2D image.
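The sphere-deformation step can be pictured as adding a predicted displacement to every vertex of a template mesh. The sketch below builds a coarse latitude/longitude sphere and applies hand-written offsets; in a real system the offsets would come from the trained network. The function names and the sphere resolution are assumptions, and the faces connecting the vertices are omitted for brevity.

```python
import math


def uv_sphere(n_lat=4, n_lon=8, radius=1.0):
    """Return the vertices of a simple latitude/longitude sphere template."""
    vertices = [(0.0, 0.0, radius)]  # north pole
    for i in range(1, n_lat):
        theta = math.pi * i / n_lat  # polar angle from the north pole
        for j in range(n_lon):
            phi = 2.0 * math.pi * j / n_lon  # angle around the vertical axis
            vertices.append((
                radius * math.sin(theta) * math.cos(phi),
                radius * math.sin(theta) * math.sin(phi),
                radius * math.cos(theta),
            ))
    vertices.append((0.0, 0.0, -radius))  # south pole
    return vertices


def deform(vertices, offsets):
    """Move each template vertex by its own (dx, dy, dz) offset."""
    if len(vertices) != len(offsets):
        raise ValueError("need exactly one offset per vertex")
    return [tuple(v + d for v, d in zip(vertex, offset))
            for vertex, offset in zip(vertices, offsets)]


if __name__ == "__main__":
    template = uv_sphere()
    # Stand-in for network output: stretch the sphere along the x axis.
    offsets = [(0.5 * x, 0.0, 0.0) for x, _, _ in template]
    stretched = deform(template, offsets)
    print(len(stretched), max(x for x, _, _ in stretched))
```

Only the vertex positions change here; the number of vertices stays fixed, so the deformed mesh keeps the template's structure.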
The team tested DIB-R on four 2D images of birds (far left). The first experiment used a picture of a yellow warbler (top left) and produced a 3D object (top two rows).
NVIDIA researchers trained their model on several datasets, including a collection of bird images. After training, DIB-R could take an image of a bird and produce a 3D portrayal with the proper shape and texture of a 3D bird.
The 3D yellow warbler, as rendered by DIB-R.
“This is essentially the first time ever that you can take just about any 2D image and predict relevant 3D properties,” says Jun Gao, one of a team of researchers who collaborated on DIB-R.
DIB-R can transform 2D images of long-extinct animals, like a Tyrannosaurus rex or a chubby dodo, into lifelike 3D images in under a second.
Built on PyTorch, a machine learning framework, DIB-R is included in Kaolin, NVIDIA’s newest PyTorch library for accelerating 3D deep learning research.
The entire NVIDIA research paper, “Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer,” can be found here. The NVIDIA Research team consists of more than 200 scientists around the globe, focusing on areas including AI, computer vision, self-driving cars, robotics and graphics.