NVIDIA Research Achieves AI Training Breakthrough Using Limited Datasets

Data augmentation technique enables AI model to emulate artwork using a small dataset from the Metropolitan Museum of Art — and opens up new potential applications in fields like healthcare.
by Isha Salian

NVIDIA Research’s latest AI model is a prodigy among generative adversarial networks. Using a fraction of the study material needed by a typical GAN, it can learn skills as complex as emulating renowned painters and recreating images of cancer tissue.

By applying a breakthrough neural network training technique to the popular NVIDIA StyleGAN2 model, NVIDIA researchers reimagined artwork based on fewer than 1,500 images from the Metropolitan Museum of Art. Using NVIDIA DGX systems to accelerate training, they generated new AI art inspired by the historical portraits.

The technique — called adaptive discriminator augmentation, or ADA — reduces the number of training images required by 10-20x while still producing high-quality results. The same method could someday have a significant impact in healthcare, for example by creating cancer histology images to help train other AI models.

“These results mean people can use GANs to tackle problems where vast quantities of data are too time-consuming or difficult to obtain,” said David Luebke, vice president of graphics research at NVIDIA. “I can’t wait to see what artists, medical experts and researchers use it for.”

The research paper behind this project is being presented this week at the annual Conference on Neural Information Processing Systems, known as NeurIPS. It’s one of a record 28 NVIDIA Research papers accepted to the prestigious conference.

This new method is the latest in a legacy of GAN innovation by NVIDIA researchers, who’ve developed groundbreaking GAN-based models for the AI painting app GauGAN, the game engine mimicker GameGAN, and the pet photo transformer GANimal. All are available on the NVIDIA AI Playground.

The Training Data Dilemma

Like most neural networks, GANs have long followed a basic principle: the more training data, the better the model. That’s because each GAN consists of two competing networks — a generator, which creates synthetic images, and a discriminator, which learns what realistic images should look like based on training data.

The discriminator coaches the generator, giving pixel-by-pixel feedback to help it improve the realism of its synthetic images. But with limited training data to learn from, a discriminator won’t be able to help the generator reach its full potential — like a rookie coach who’s experienced far fewer games than a seasoned expert.
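To make that two-network dynamic concrete, here is a minimal, illustrative PyTorch sketch of a single GAN training step. The tiny fully connected networks, learning rates and loss function are placeholders for illustration only, not StyleGAN2's actual design:

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the two competing networks; the real
# StyleGAN2 architectures are far more elaborate convolutional models.
generator = nn.Sequential(nn.Linear(128, 784), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(784, 1))

g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(real_images):  # real_images: (batch, 784) float tensor
    batch = real_images.size(0)
    noise = torch.randn(batch, 128)

    # Discriminator step: learn to score real images high, fakes low.
    fake_images = generator(noise).detach()
    d_loss = (loss_fn(discriminator(real_images), torch.ones(batch, 1)) +
              loss_fn(discriminator(fake_images), torch.zeros(batch, 1)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator step: try to fool the discriminator. The gradients that
    # flow back through the discriminator are the "coaching" feedback.
    g_loss = loss_fn(discriminator(generator(noise)), torch.ones(batch, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```

In practice these two updates alternate over millions of images, and the quality of the discriminator's feedback depends directly on how much real data it has seen.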

It typically takes 50,000 to 100,000 training images to train a high-quality GAN. But in many cases, researchers simply don’t have tens or hundreds of thousands of sample images at their disposal.

With just a couple thousand images for training, many GANs would falter at producing realistic results. This problem, called overfitting, occurs when the discriminator simply memorizes the training images and fails to provide useful feedback to the generator.

In image classification tasks, researchers get around overfitting with data augmentation, a technique that expands smaller datasets using copies of existing images that are randomly distorted by processes like rotating, cropping or flipping — forcing the model to generalize better.
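As a point of reference, a classification-style augmentation pipeline might look like the following torchvision sketch; the specific transforms and parameters are illustrative choices, not ones taken from this research:

```python
from torchvision import transforms

# Illustrative augmentation pipeline of the kind used in image
# classification: each pass over the dataset sees a freshly distorted
# copy of every image, which effectively enlarges the training set.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
```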

But previous attempts to apply augmentation to GAN training images resulted in a generator that learned to mimic those distortions, rather than creating believable synthetic images.

A GAN on a Mission

NVIDIA Research’s ADA method applies data augmentations adaptively, meaning the amount of data augmentation is adjusted at different points in the training process to avoid overfitting. This enables models like StyleGAN2 to achieve equally amazing results using an order of magnitude fewer training images.
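In simplified terms, ADA monitors how confidently the discriminator separates real images from fakes and nudges the augmentation probability p up or down in response. The sketch below paraphrases that feedback loop in Python; the overfitting heuristic (the fraction of real images the discriminator rates as real, held near a target of roughly 0.6) follows the research paper, but the function name, step size and surrounding scaffolding are illustrative assumptions:

```python
import torch

def update_augment_p(p, d_real_logits, target=0.6, step=0.005):
    """Adaptively tune the augmentation probability p.

    A simplified sketch of ADA's feedback loop: r_t estimates how much
    the discriminator overfits, measured as the fraction of real images
    it confidently classifies as real. If r_t exceeds the target, the
    discriminator is getting too sure of itself, so augmentation
    strength goes up; otherwise it comes back down.
    """
    r_t = torch.sign(d_real_logits).mean().item()
    p += step if r_t > target else -step
    return min(max(p, 0.0), 1.0)

# During training, every few minibatches one might call:
#   p = update_augment_p(p, discriminator(augment(real_images, p)))
# with the same augmentations (applied with probability p) shown to the
# discriminator for both real and generated images, so the generator
# never learns to reproduce the distortions themselves.
```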

As a result, researchers can apply GANs to previously impractical applications where examples are too scarce, too hard to obtain or too time-consuming to gather into a large dataset.

Earlier versions of StyleGAN have been used by artists to create stunning exhibits and to produce a new manga based on the style of legendary illustrator Osamu Tezuka. The model has even been adopted by Adobe to power Photoshop’s new AI tool, Neural Filters.

With less training data required to get started, StyleGAN2 with ADA could be applied to rare art, such as the work by Paris-based AI art collective Obvious on African Kota masks.

Another promising application lies in healthcare, where medical images of rare diseases can be few and far between because most tests come back normal. Amassing a useful dataset of abnormal pathology slides would require many hours of painstaking labeling by medical experts.

Synthetic images created with a GAN using ADA could fill that gap, generating training data for another AI model that helps pathologists or radiologists spot rare conditions on pathology images or MRI studies. An added bonus: because the images are AI-generated, they involve no real patient data, easing privacy concerns and making it easier for healthcare institutions to share datasets.

NVIDIA Research at NeurIPS

The NVIDIA Research team consists of more than 200 scientists around the globe, focusing on areas including AI, computer vision, self-driving cars, robotics and graphics. Over two dozen papers authored by NVIDIA researchers will be highlighted at NeurIPS, the year’s largest AI research conference, taking place virtually from Dec. 6-12.

Check out the full lineup of NVIDIA Research papers at NeurIPS.

Main images generated by StyleGAN2 with ADA, trained on a dataset of fewer than 1,500 images from the Metropolitan Museum of Art Collection API.