Eric Yu


Synthetic Data Generation (Robotics Institute Research)

Is synthetic data better than real data when training algorithms to recognize objects?

A mock-up image of synthetic data to be used for training machine learning models, complete with bounding boxes.


NOTE: These projects are still ongoing, so some data has been omitted so as not to reveal sensitive information.

February 1st, 2018

I am a Senior Animation Designer at the Robotics Institute of Carnegie Mellon University, where I support research in Computer Vision and Machine Learning. My main project investigates the benefits of synthetic data for training object-recognition machine learning models. As artists, we were given the unique chance to apply our skills to generating as many images as we could, not for the eyes of an audience, but for the eyes of a machine.


Challenges

  • How can we quickly generate and render millions of images?

  • How can we use this synthetic data to improve the training algorithms?

Objective

  • To train algorithms to identify objects, we need to generate enough synthetic scenes to reach the amount of data required.


Takeaways

  • Although high quality images look nice, silhouettes are often good enough for object detection.

  • Synthetic data can be used to create situations that are not well represented in real-life data.

  • Coding (especially in Blender) allowed us to procedurally generate our scenes, offering us the most flexibility.

Outcomes

  • Used Python scripts to quickly populate 13 customizable scenes in Maya, allowing us to generate millions of images.

    • We later switched to Blender, whose real-time rendering let us cut render times that had numbered in the days with our old method!

  • Populated scenes with classes underrepresented in previously used real-life data to help augment the training algorithms.

  • Created Python scripts to automate scene creation to help the other artists, reducing scene creation time by 67%.


Skills

  • Technical Art / Programming
  • 3D Modeling
  • Motion Capture
  • Rendering

Tools

  • Maya
  • Arnold
  • Python
  • Blender
  • Unreal
  • Unity
  • Substance Painter
  • Substance Designer
  • MotionBuilder

Duration

  • Research (6 members)
  • Ongoing Project
  • Since 2018 (2 years)

The Team

Jessica Hodgins, a professor at Carnegie Mellon and President of SIGGRAPH, put together a small team of artists and programmers to help the researchers in the Robotics Institute create synthetic data.

  • Eric Yu (Technical Artist, Modeler/Texturer): I am in charge of creating and texturing assets. I also make scripts, primarily in Python, to help with the scene creation process. In addition, I helped with recording motion capture, rigging, and animating characters.

  • Melanie Danver (3D Artist, Scene Layout)

  • Katie Tender (Modeler/Texturer)

  • Stanislav Panev (Project Scientist)

  • Sushma Akoju (Programmer)


The Solution

With synthetic data we are able to generate millions of “fake” images for training machine learning models, at a fraction of the time and cost of acquiring real-life images. Labeling the images is also handled by the program, so we no longer need to do it by hand. Plus, if there are any gaps in the data set (such as not enough dogs or not enough women in the data), we can add the underrepresented objects or actions and generate millions of examples to fill in the gaps. We can also export any additional information that is required, such as depth maps or bounding boxes.
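As an illustration of how the labeling can come for free, a generation pipeline can write its annotations alongside its images. The snippet below is only a generic sketch of that idea; the COCO-style field names are a common convention, not necessarily the format our researchers used, and the sample record is made up.

```python
import json

def write_annotations(image_records, out_path="annotations.json"):
    """image_records: [(filename, width, height, [(class_name, x, y, w, h), ...]), ...]"""
    categories, images, annotations = {}, [], []
    for image_id, (filename, width, height, boxes) in enumerate(image_records):
        images.append({"id": image_id, "file_name": filename,
                       "width": width, "height": height})
        for class_name, x, y, w, h in boxes:
            # Assign each class a numeric id the first time we see it.
            cat_id = categories.setdefault(class_name, len(categories))
            annotations.append({"id": len(annotations), "image_id": image_id,
                                "category_id": cat_id, "bbox": [x, y, w, h]})
    cats = [{"id": i, "name": n} for n, i in categories.items()]
    with open(out_path, "w") as f:
        json.dump({"images": images, "annotations": annotations,
                   "categories": cats}, f, indent=2)

# The renderer already knows where every object is, so the boxes cost nothing extra.
write_annotations([("car_0001.png", 1920, 1080, [("car", 640, 400, 220, 130)])])
```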

An example of domain randomization, programmed in Unity by Kevin Carlos. Notice the random textures, vehicle layout, and distractor shapes.


Of course, this is an ever-evolving field, so new information is bound to come up. We originally tried to make realistic images to match real-life data, so we used Autodesk Maya and the Arnold Renderer to make our scenes as realistic as possible, with textures done in Substance Painter.

However, according to a paper from researchers at NVIDIA, domain randomization, in which scenes are deliberately made random and unrealistic, can offer even better results. Instead of building realistic scenes, you randomly change the textures or throw distractor shapes into the frame. These unrealistic scenes have actually been shown to improve performance, since the networks are forced to focus only on the essential characteristics of the objects. For this approach we used game engines such as Unreal and Unity, since it is easier to build quick randomization algorithms there, and the images are ready to use without the need to render offline.
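Our domain randomization was built in Unity and Unreal, but the same idea can be sketched on the Python side of our pipeline. Below is a minimal Blender Python illustration, not our production code, that assigns a random flat color to every mesh and scatters distractor primitives around the scene:

```python
import random
import bpy

def randomize_scene(num_distractors=10):
    """Toy domain randomization: random colors plus random distractor shapes."""
    # Give every mesh a flat, randomly colored material.
    for obj in bpy.data.objects:
        if obj.type != 'MESH':
            continue
        mat = bpy.data.materials.new(name="rand_mat")
        mat.use_nodes = True
        bsdf = mat.node_tree.nodes["Principled BSDF"]
        bsdf.inputs["Base Color"].default_value = (
            random.random(), random.random(), random.random(), 1.0)
        obj.data.materials.clear()
        obj.data.materials.append(mat)

    # Scatter distractor primitives so the network cannot rely on context alone.
    for _ in range(num_distractors):
        primitive = random.choice([
            bpy.ops.mesh.primitive_cube_add,
            bpy.ops.mesh.primitive_uv_sphere_add,
            bpy.ops.mesh.primitive_cone_add,
        ])
        primitive(location=(random.uniform(-10, 10),
                            random.uniform(-10, 10),
                            random.uniform(0, 3)))

randomize_scene()
```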




My Role

Making the Environment

Different sun angles allow us to multiply our number of unique images by 13.


I was one of the artists tasked with creating realistic assets, using Maya primarily for modeling and Arnold for rendering. I also learned Substance Painter and Substance Designer to create realistic textures, supplemented with high-quality scans from Quixel, so that the areas we built resembled real urban centers. For foliage, we primarily used Maya's MASH network to instance trees and bushes, keeping file sizes down. To create a large number of images, for each 5-second sequence we varied the sun angles and cameras we used, turning 150 individual frames into 58,500 images. Render times were kept below 20 seconds per frame. I was able to bring my modeling and texturing skills into game engines as well, particularly Unreal Engine.
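As a rough sketch of how the numbers multiply: 150 frames per sequence, 13 sun angles, and 30 cameras (the camera count is my back-of-the-envelope assumption that makes the arithmetic work out) give 150 × 13 × 30 = 58,500 images. A simplified Maya Python render loop could look like the following, with the light name and angle spacing as placeholders:

```python
import maya.cmds as cmds

SUN_ANGLES = [i * 15.0 for i in range(13)]   # 13 sun elevations (placeholder spacing)
CAMERAS = cmds.ls(type="camera")             # assumes ~30 renderable cameras in the scene
FRAMES = range(1, 151)                       # 150 frames = 5 seconds at 30 fps

for angle in SUN_ANGLES:
    # "sun_light" is a placeholder name for the scene's directional light.
    cmds.setAttr("sun_light.rotateX", -angle)
    for cam in CAMERAS:
        for frame in FRAMES:
            cmds.currentTime(frame)
            # cmds.render() renders the current frame through the given camera;
            # the real pipeline batch-rendered with Arnold instead.
            cmds.render(cam)
```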

I also helped our Mocap Artist, Kevin, with recording motion capture shoots. With that data, I used MotionBuilder to bake the animation onto the skeletons of our characters before combining separate clips into a cohesive motion to be imported into the scene.

Garage model with sand brick material created in Substance Designer.


Turnaround of a small apartment, made in Maya and textured in Substance Painter.


Recording bicycle motion capture at the Carnegie Mellon Motion Capture Lab.


Scripting

With my background in programming, I was able to create scripts that helped my team generate scenes more efficiently, cutting down scene creation time to one third of what it was previously. Mostly done in Python, the scripts I made ranged from helping with rigging to producing depth maps for the researchers to use.

Vehicle Rigging: Many of the vehicles we find online are not rigged, so previously we had to rig them by hand. Now this script produces controllers for the hood/trunk, the body, and the doors, rigs them to the corresponding vehicle parts, and links wheel rotation to forward movement. A movable vehicle at the click of a button!
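The production script is specific to our pipeline, but a minimal Maya Python sketch of the core ideas might look like this; the object names, the 0.65 m wheel diameter, and the assumption that the vehicle drives along Z are all placeholders.

```python
import math
import maya.cmds as cmds

def rig_vehicle(body="car_body", door="car_door_L", wheel="wheel_FL",
                wheel_diameter=0.65):
    """Hypothetical, simplified version of the vehicle rigging script."""
    # NURBS circle controllers for the body and one door.
    body_ctrl = cmds.circle(name="body_CTRL", normal=(0, 1, 0), radius=3)[0]
    door_ctrl = cmds.circle(name="door_CTRL", normal=(1, 0, 0), radius=1)[0]

    # Drive the geometry with the controllers, and nest the door under the body.
    cmds.parentConstraint(body_ctrl, body, maintainOffset=True)
    cmds.parentConstraint(door_ctrl, door, maintainOffset=True)
    cmds.parent(door_ctrl, body_ctrl)

    # Link wheel spin to forward (Z) travel: degrees = distance / circumference * 360.
    circumference = math.pi * wheel_diameter
    cmds.expression(
        name="wheel_spin_expr",
        string="{w}.rotateX = {c}.translateZ / {circ} * 360;".format(
            w=wheel, c=body_ctrl, circ=circumference))

rig_vehicle()
```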


Vehicle Spawning: Before, when we made a scene, we had to place vehicles by hand. Now we can randomly generate vehicles at designated spawn points, with a pop-up window to set the number of vehicles, the types of vehicles, and whether the vehicles move or not.
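A stripped-down sketch of the idea, leaving out the pop-up window; the `spawn_*` locator naming and the `vehicles_GRP` source group are placeholder conventions, not our actual scene structure.

```python
import random
import maya.cmds as cmds

def spawn_vehicles(count=5, moving_chance=0.5):
    """Hypothetical, simplified version of the vehicle spawning script."""
    spawn_points = cmds.ls("spawn_*", type="transform")
    vehicle_sources = cmds.listRelatives("vehicles_GRP", children=True) or []

    for point in random.sample(spawn_points, min(count, len(spawn_points))):
        source = random.choice(vehicle_sources)
        new_vehicle = cmds.duplicate(source)[0]

        # Drop the copy onto the spawn point, matching its orientation.
        pos = cmds.xform(point, query=True, worldSpace=True, translation=True)
        rot = cmds.xform(point, query=True, worldSpace=True, rotation=True)
        cmds.xform(new_vehicle, worldSpace=True, translation=pos, rotation=rot)

        # Flag some vehicles as moving so a later step can animate them.
        cmds.addAttr(new_vehicle, longName="isMoving", attributeType="bool")
        cmds.setAttr(new_vehicle + ".isMoving", random.random() < moving_chance)

spawn_vehicles(count=8)
```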


Depth Map Creation: Made with help from researcher Stanislav Panev, this script generates a depth map of the scene, with white being closest and black being furthest away, and does so automatically for each frame of every camera.
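Our pipeline ultimately relied on Legacy Render Layers or Arnold AOVs; purely as an illustration of the idea, one classic Maya trick is a samplerInfo-driven surface shader, where camera-space depth is remapped into a 0-1 value. The near/far distances below are placeholders.

```python
import maya.cmds as cmds

def make_depth_shader(near=1.0, far=100.0):
    """Sketch of a depth shader: white = close to the camera, black = far away."""
    sampler = cmds.shadingNode("samplerInfo", asUtility=True)
    ranger = cmds.shadingNode("setRange", asUtility=True)
    shader = cmds.shadingNode("surfaceShader", asShader=True)

    # pointCameraZ is negative in front of a Maya camera, so -far maps to 0 (black)
    # and -near maps to 1 (white).
    cmds.connectAttr(sampler + ".pointCameraZ", ranger + ".valueX")
    cmds.setAttr(ranger + ".oldMinX", -far)
    cmds.setAttr(ranger + ".oldMaxX", -near)
    cmds.setAttr(ranger + ".minX", 0.0)
    cmds.setAttr(ranger + ".maxX", 1.0)

    for channel in ("R", "G", "B"):
        cmds.connectAttr(ranger + ".outValueX", shader + ".outColor" + channel)
    return shader

# Assigning this shader to everything in a dedicated render layer and rendering
# it per camera yields a depth map for each frame of every camera.
depth_shader = make_depth_shader(near=1.0, far=200.0)
```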


Blinking: Lets us add blinking to a person model without having to animate it manually. You designate a time period, and the script inserts blinks at random intervals by manipulating the blend shape deformations.
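A minimal sketch of that approach, assuming the character already has a blendShape target named `blink` (a placeholder name) and a 30 fps timeline:

```python
import random
import maya.cmds as cmds

def add_random_blinks(blend_attr="blendShape1.blink",
                      start=1, end=1500, fps=30.0,
                      min_gap=2.0, max_gap=6.0, blink_frames=4):
    """Keyframe random blinks on a blend shape weight between start and end."""
    frame = start + random.uniform(min_gap, max_gap) * fps
    while frame + blink_frames < end:
        # Eye open before the blink, closed at the midpoint, open again after.
        cmds.setKeyframe(blend_attr, time=frame, value=0.0)
        cmds.setKeyframe(blend_attr, time=frame + blink_frames / 2.0, value=1.0)
        cmds.setKeyframe(blend_attr, time=frame + blink_frames, value=0.0)
        frame += random.uniform(min_gap, max_gap) * fps

add_random_blinks(start=1, end=1500)
```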


What I Learned

Though this project is still ongoing, I have learned so much during my time here. Having never studied Computer Vision or Machine Learning before, I had to learn a lot to keep up with the researchers and to aid them better. This cross-functional environment has taught me how to communicate the intricacies of the 3D pipeline to the researchers of the Robotics Institute. Through frequent meetings we were able to determine what they needed and communicate solutions (not everyone knows about UVs or joints in a skeletal system), which helped them scope their projects. Collaboration is key in our workplace.

Plus, since the fields we are working in are ever-evolving, we needed to come up with creative solutions to new problems, domain randomization and depth map creation to name a few. We had to think on our feet, drawing on what we knew and learning new methods or software to keep up with the demands of the field. When domain randomization came to our attention, we needed to learn Unreal and Unity. When a depth map was required to train the model on depth, we needed to find an efficient way to produce one in Maya (ultimately with Legacy Render Layers or Arnold AOVs). We had to become adept problem solvers, but we were always up for the challenge.

The future of Synthetic Data is looking bright, with autonomous cars and AI driven solutions on the rise. Who knows what the future has in store? Whatever it is… we’re ready!