Avatar-Net

Example stylized results from the proposed Avatar-Net, which faithfully renders the Lena image in arbitrary styles.

Overview

Zero-shot artistic style transfer is an important image synthesis problem that aims to transfer arbitrary styles onto content images. However, the trade-off between generalization and efficiency in existing methods impedes high-quality zero-shot style transfer in real time. In this repository, we resolve this dilemma and propose Avatar-Net, an efficient yet effective network that enables visually plausible multi-scale transfer for arbitrary styles.

The key ingredient of our method is a style decorator that reconstructs the content features from semantically aligned style features of an arbitrary style image. It not only matches the two feature distributions holistically but also preserves detailed style patterns in the decorated features.
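The decorator described above can be sketched in NumPy. This is a simplified illustration, not the repository's implementation. It assumes the decoration proceeds in three steps: whiten both feature sets, replace each content feature with its best-matching style feature, and re-color the result with the style statistics. It also assumes 1x1 patches (single feature vectors) instead of larger overlapping patches, and the names `whiten` and `style_decorator` are hypothetical.

```python
import numpy as np

def whiten(feat, eps=1e-5):
    """Whiten a (C, N) feature matrix and return what is needed to undo it."""
    mean = feat.mean(axis=1, keepdims=True)
    centered = feat - mean
    cov = centered @ centered.T / (feat.shape[1] - 1) + eps * np.eye(feat.shape[0])
    eigval, eigvec = np.linalg.eigh(cov)
    whitening = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T
    coloring = eigvec @ np.diag(eigval ** 0.5) @ eigvec.T
    return whitening @ centered, mean, coloring

def style_decorator(content, style):
    """Decorate content features (C, Nc) with style features (C, Ns).

    Assumption: 1x1 patches, so each spatial position is matched on its own.
    """
    # Step 1: project both feature sets into a normalized (whitened) domain.
    c_white, _, _ = whiten(content)
    s_white, s_mean, s_color = whiten(style)

    # Step 2: for every content position, pick the style position with the
    # highest normalized correlation, then reassemble the features.
    s_unit = s_white / (np.linalg.norm(s_white, axis=0, keepdims=True) + 1e-8)
    scores = s_unit.T @ c_white            # shape (Ns, Nc)
    nearest = scores.argmax(axis=0)        # best style index per content position
    swapped = s_white[:, nearest]          # shape (C, Nc)

    # Step 3: map the reassembled features back to the style feature domain.
    return s_color @ swapped + s_mean

rng = np.random.default_rng(0)
content = rng.normal(size=(4, 50))
style = rng.normal(size=(4, 30)) * 2.0 + 1.0
decorated = style_decorator(content, style)
print(decorated.shape)
```

In this 1x1 simplification every output column is one of the original style feature vectors, rearranged to follow the content layout, which is why both the style statistics and the local style details carry over.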

Comparison of feature distribution transformation by different feature transfer modules. (a) Adaptive Instance Normalization, (b) Whitening and Coloring Transform, (c) Style-Swap, and (d) the proposed style decorator.
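To make the difference between the first two modules in the comparison concrete, here is a minimal NumPy sketch of AdaIN and WCT acting on flattened feature maps of shape (channels, positions). The function names and the `eps` parameter are illustrative assumptions, not taken from the repository. AdaIN aligns only the per-channel mean and standard deviation, whereas WCT aligns the full channel covariance.

```python
import numpy as np

def adain(content, style, eps=1e-5):
    """Match per-channel mean and standard deviation of the style features."""
    c_mean = content.mean(axis=1, keepdims=True)
    c_std = content.std(axis=1, keepdims=True)
    s_mean = style.mean(axis=1, keepdims=True)
    s_std = style.std(axis=1, keepdims=True)
    return (content - c_mean) / (c_std + eps) * s_std + s_mean

def wct(content, style, eps=1e-5):
    """Whiten the content features, then color them with the style covariance."""
    def decompose(feat):
        mean = feat.mean(axis=1, keepdims=True)
        centered = feat - mean
        cov = centered @ centered.T / (feat.shape[1] - 1)
        cov += eps * np.eye(feat.shape[0])
        eigval, eigvec = np.linalg.eigh(cov)
        return mean, centered, eigval, eigvec

    _, c_centered, c_val, c_vec = decompose(content)
    s_mean, _, s_val, s_vec = decompose(style)
    whitened = c_vec @ np.diag(c_val ** -0.5) @ c_vec.T @ c_centered
    return s_vec @ np.diag(s_val ** 0.5) @ s_vec.T @ whitened + s_mean

rng = np.random.default_rng(0)
content = rng.normal(size=(3, 500))
# Style features with strongly correlated channels.
mix = np.array([[1.0, 0.9, 0.0], [0.0, 1.0, 0.9], [0.0, 0.0, 1.0]])
style = mix @ rng.normal(size=(3, 2000)) + 2.0

adain_out = adain(content, style)
wct_out = wct(content, style)
print(np.round(np.cov(style), 2))
print(np.round(np.cov(adain_out), 2))
print(np.round(np.cov(wct_out), 2))
```

Comparing the printed covariance matrices shows that the WCT output reproduces the off-diagonal (cross-channel) terms of the style covariance, while the AdaIN output only reproduces the diagonal.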

By embedding this module into a reconstruction network that fuses multi-scale style abstractions, Avatar-Net renders multi-scale stylization for any style image in a single feed-forward pass.

(a) Stylization comparison between an autoencoder and the style-augmented hourglass network. (b) The network architecture of the proposed method.

Results

Example stylized results produced by the proposed Avatar-Net.

We demonstrate the state-of-the-art effectiveness and efficiency of the proposed method in generating high-quality stylized images, along with a series of successful applications, including multi-style integration and video stylization.

Comparison with Prior Art

The result of Avatar-Net exhibits concrete multi-scale style patterns (e.g., the color distribution, brush strokes, and circular patterns of the style image).
WCT distorts the brush strokes and circular patterns, AdaIN fails to preserve even the color distribution, and Style-Swap fails on this example.

Execution Efficiency

Method	Gatys et al.	AdaIN	WCT	Style-Swap	Avatar-Net
256x256 (sec)	12.18	0.053	0.62	0.064	0.071
512x512 (sec)	43.25	0.11	0.93	0.23	0.28

Avatar-Net has an execution time comparable to that of AdaIN and GPU-accelerated Style-Swap, and is much faster than WCT and the optimization-based style transfer of Gatys et al.
The reference methods and the proposed Avatar-Net are implemented on the same TensorFlow platform with the same VGG network as the backbone.
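The relative costs implied by the table can be worked out directly. The short script below uses only the timings reported above and expresses each method's run time as a multiple of the Avatar-Net run time at the same resolution; the variable and function names are illustrative.

```python
# Timings in seconds, copied from the table above.
timings = {
    "256x256": {"Gatys et al.": 12.18, "AdaIN": 0.053, "WCT": 0.62,
                "Style-Swap": 0.064, "Avatar-Net": 0.071},
    "512x512": {"Gatys et al.": 43.25, "AdaIN": 0.11, "WCT": 0.93,
                "Style-Swap": 0.23, "Avatar-Net": 0.28},
}

def relative_to_avatar_net(row):
    """Express every timing in a row as a multiple of the Avatar-Net timing."""
    base = row["Avatar-Net"]
    return {method: seconds / base for method, seconds in row.items()}

ratios = {size: relative_to_avatar_net(row) for size, row in timings.items()}

for size, row in ratios.items():
    for method, ratio in row.items():
        print(f"{size} {method}: {ratio:.2f}x the Avatar-Net time")
```

At 256x256 this gives roughly 172x for Gatys et al. and 8.7x for WCT, while AdaIN and Style-Swap take about 0.75x and 0.90x of the Avatar-Net time. At 512x512 the corresponding figures are roughly 154x, 3.3x, 0.39x and 0.82x.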

Applications

Multi-style Interpolation

Content and Style Trade-off
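In feed-forward methods such as AdaIN and WCT, the content and style trade-off is commonly exposed as a linear blend between the content features and the stylized features before decoding. The sketch below illustrates that general idea only; whether Avatar-Net blends at exactly this point is an assumption, and `blend_features` and `alpha` are hypothetical names.

```python
import numpy as np

def blend_features(content_feat, stylized_feat, alpha):
    """Linearly interpolate between content and stylized features.

    alpha = 0 keeps the content features, alpha = 1 keeps the stylized ones.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * stylized_feat + (1.0 - alpha) * content_feat

content_feat = np.zeros((4, 6))
stylized_feat = np.ones((4, 6))
half = blend_features(content_feat, stylized_feat, 0.5)
print(half[0, 0])
```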

Video Stylization (YouTube link)

Code

Please refer to the GitHub repository for more details.

Publication

Lu Sheng, Ziyi Lin, Jing Shao and Xiaogang Wang, “Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration”, in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018. [arXiv]

@inproceedings{sheng2018avatar,
    title = {Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration},
    author = {Sheng, Lu and Lin, Ziyi and Shao, Jing and Wang, Xiaogang},
    booktitle = {Computer Vision and Pattern Recognition (CVPR), 2018 IEEE Conference on},
    pages = {1--9},
    year = {2018}
}