Kernel-Predicting Convolutional Networks
for Denoising Monte Carlo Renderings

S. Bako*, T. Vogels*, B. McWilliams, M. Meyer, J. Novák, A. Harvill, P. Sen, T. DeRose, and F. Rousselle,
"Kernel-Predicting Convolutional Networks for Denoising Monte Carlo Renderings,"
ACM Transactions on Graphics, Vol. 36, No. 4, Article 97, July 2017, (Proceedings of ACM SIGGRAPH 2017).

Evaluation on production data

This section shows the performance of our models on test frames from Cars 3 and Coco, which differ significantly in style from our training set, Finding Dory. We compare our results to those of previous methods: RDFC [Rousselle et al. 2013], APR [Moon et al. 2016], NFOR [Bitterli et al. 2016], LBF [Kalantari et al. 2015], and the Hyperion denoiser used in the production of the aforementioned films. For LBF [Kalantari et al. 2015], we had to train on different data than their system expects, which resulted in suboptimal results with excessive residual noise (see the paper for details). We trained separate networks for 32 spp and 128 spp data. A large subset of the frames also includes 16 spp inputs denoised with our network trained on 32 spp, demonstrating its ability to extrapolate to other sampling rates.
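For reference, the reconstruction step that gives kernel-predicting networks their name can be pictured as follows: the network outputs one filtering kernel per pixel, which is normalized and applied to the noisy radiance. This is an illustrative NumPy sketch, not the paper's implementation; the softmax normalization and the kernel size used below are assumptions.

```python
import numpy as np

def apply_predicted_kernels(noisy, kernels):
    """Apply per-pixel normalized kernels to a noisy image.

    noisy:   (H, W, 3) noisy radiance image
    kernels: (H, W, k*k) raw per-pixel network outputs (one k x k kernel each)
    Returns the reconstructed (H, W, 3) image.
    """
    H, W, k2 = kernels.shape
    k = int(np.sqrt(k2))
    r = k // 2
    # Softmax over each pixel's kernel: weights become positive and sum to 1,
    # so the filter preserves constant regions.
    w = np.exp(kernels - kernels.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    padded = np.pad(noisy, ((r, r), (r, r), (0, 0)), mode="edge")
    out = np.zeros_like(noisy)
    for dy in range(k):
        for dx in range(k):
            # Accumulate each kernel tap as a shifted, weighted copy of the input.
            out += padded[dy:dy + H, dx:dx + W] * w[..., dy * k + dx, None]
    return out
```

Because the weights at every pixel sum to one, a constant input image is reproduced exactly regardless of what the network predicts.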

Hyperparameters

For this dataset, we used a learning rate of 10⁻⁵ and a batch size of 5 patches with ADAM optimization [Kingma and Ba 2014]. When fine-tuning (Sec. 4.3), we use a learning rate of 10⁻⁶.
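As a reminder of the update rule behind these settings, here is a minimal NumPy sketch of a single ADAM step [Kingma and Ba 2014] using the learning rate quoted above; the β and ε defaults are those of the original paper, and the parameter shapes are illustrative.

```python
import numpy as np

def adam_step(theta, grad, state, lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update. `state` holds the step count and moment estimates."""
    state["t"] += 1
    # Exponential moving averages of the gradient and its elementwise square.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    # Bias correction for the zero-initialized moments.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)
```

On the first step with a unit gradient, the bias-corrected update moves each parameter by approximately the learning rate.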

Test set


frame_00, frame_01, frame_02, frame_03, frame_04, frame_05, frame_06, frame_07, frame_08, frame_09, frame_10, frame_11, frame_12, frame_13, frame_14, frame_15, frame_16, frame_17, frame_18, frame_19, frame_20, frame_21, frame_22, frame_23, frame_24

Evaluation on an open-source dataset

The focus of this paper is the application of deep-learning-based denoising to production-quality Monte Carlo renderings, so most of the training and evaluation has been done on proprietary data. To facilitate comparisons with future methods, we also provide results for our models trained and tested on a publicly available dataset.

Training data

We trained our models on a training set consisting of perturbations of scenes available at https://benedikt-bitterli.me/resources/. To ensure the training data covers a sufficient range of lighting situations, color patterns, and geometry, we randomly perturbed the scenes by varying camera parameters, materials, and lighting. In this way, we generated 1484 (noisy image, high-quality image) pairs from the following 8 base scenes. We extracted patches from these images at four sample counts per pixel: 128 spp, 256 spp, 512 spp, and 1024 spp.
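The patch-based training setup above can be sketched as follows. This is a hypothetical helper, not the paper's pipeline: the patch size, patch count, and uniform random sampling are illustrative stand-ins for whatever selection scheme is actually used.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(noisy, reference, patch_size=65, n_patches=4):
    """Sample spatially aligned (noisy, reference) training patches.

    noisy, reference: (H, W, 3) image pair rendered at low and high spp.
    Returns a list of (noisy_patch, reference_patch) tuples.
    """
    H, W, _ = noisy.shape
    patches = []
    for _ in range(n_patches):
        # Uniform top-left corner; both crops use the same window so the
        # noisy input and its clean target stay pixel-aligned.
        y = rng.integers(0, H - patch_size + 1)
        x = rng.integers(0, W - patch_size + 1)
        patches.append((noisy[y:y + patch_size, x:x + patch_size],
                        reference[y:y + patch_size, x:x + patch_size]))
    return patches
```

Running this over every perturbed scene at each of the four sample counts yields the pool of training pairs described above.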


Contemporary Bathroom
Mareck, Blendswap.com

Pontiac GTO 67
MrChimp2313, Blendswap.com

Bedroom
SlykDrako, Blendswap.com

4060.b Spaceship
thecali, Blendswap.com

Victorian Style House
MrChimp2313, Blendswap.com

The Breakfast Room
Wig42, Blendswap.com

The Wooden Staircase
Wig42, Blendswap.com

Japanese Classroom
NovaZeeke, Blendswap.com

Hyperparameters

We found that a learning rate of 10⁻⁴ and a batch size of 100 patches worked well for this dataset. The larger batch size helps cope with the noise level in these scenes, which is much higher than in our production frames. For the fine-tuning stage, we use a learning rate of 10⁻⁶.

Evaluation

To assess the quality of our models, we evaluate them on different scenes from the same website.


Salle de Bain

The Grey & White Room

The Modern Living Room

The White Room

* joint first authors