
Neural Style Transfer

CSE 455 Final Project: Abhinav Bandari, Arnav Thareja, Karthikeya Vemuri, Alan Wu

Introduction

We built a neural style transfer model that takes a content image and a style image and outputs the content image rendered in the style of the style image.

Algorithm

We use the VGGNet architecture with weights pre-trained on ImageNet. We initialize the combined image to the content image and pass the content image, style image, and combined image through the network. Optimization then adjusts the pixel values of the combined image, rather than the network's weights, until the combined image is a good blend of the content image and the style image.

The loss function is a weighted sum of a style loss and a content loss. The content loss is the L1 distance between the outputs of intermediate layers of the network for the content image and for the combined image. The style loss is the L1 distance between Gram matrices computed on the intermediate-layer outputs for the style image and for the combined image. The Gram matrix of a layer measures the correlations between the feature channels output by that layer.
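One way to write these definitions in symbols is sketched below. The notation is ours rather than taken from the project code: $x$, $c$, and $s$ denote the combined, content, and style images; $F^l(\cdot) \in \mathbb{R}^{C_l \times P_l}$ is the output of layer $l$ with $C_l$ channels flattened over $P_l$ spatial positions; and $\alpha$, $\beta$ are the content and style weights. Summing over layers, using the same layers for both losses, and omitting any normalization constant are assumptions made for brevity.

```latex
\begin{aligned}
G^l(x) &= F^l(x)\,F^l(x)^{\top} \in \mathbb{R}^{C_l \times C_l}, \\
\mathcal{L}_{\text{content}} &= \sum_{l} \bigl\lVert F^l(x) - F^l(c) \bigr\rVert_1, \\
\mathcal{L}_{\text{style}} &= \sum_{l} \bigl\lVert G^l(x) - G^l(s) \bigr\rVert_1, \\
\mathcal{L} &= \alpha\,\mathcal{L}_{\text{content}} + \beta\,\mathcal{L}_{\text{style}}.
\end{aligned}
```

Entry $(i, j)$ of $G^l$ is the inner product of channels $i$ and $j$ across all spatial positions, so the Gram matrix records which features occur together while discarding where in the image they occur.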

We experimented with a range of weights for the style and content losses and found that the generated images looked best when the style loss weight dominated the content loss weight. We also found that an L1 loss worked better in practice than an L2 loss, producing combined images with more apparent influence from the style image.
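The pixel optimization described above can be illustrated with a small, self-contained toy. This is a sketch under loud assumptions, not the project's implementation: a fixed random 1x1 convolution (`W`) stands in for the pre-trained VGG layers, the "images" are tiny random arrays, the Gram matrix is averaged over positions, the update rule is plain subgradient descent with a fixed step, and the weights `1.0` and `100.0` are placeholders chosen only so that the style term dominates. All function and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: a fixed random 1x1 convolution stands in for the pre-trained
# VGG layers, mapping 3 colour channels to 8 feature channels.
W = rng.normal(size=(8, 3))


def features(img):
    """Map a (3, P) array of pixels to an (8, P) feature map."""
    return W @ img


def gram(feat):
    """Channel-by-channel correlations, averaged over positions (assumed)."""
    return feat @ feat.T / feat.shape[1]


def loss_and_grad(img, content_feat, style_gram, content_w, style_w):
    """Weighted L1 content + style loss and its subgradient w.r.t. the pixels."""
    feat = features(img)
    d_content = feat - content_feat
    d_style = gram(feat) - style_gram
    loss = content_w * np.abs(d_content).mean() + style_w * np.abs(d_style).mean()
    # Subgradient w.r.t. the feature map: the content term directly,
    # the style term through the Gram matrix.
    g_feat = content_w * np.sign(d_content) / d_content.size
    s = style_w * np.sign(d_style) / d_style.size
    g_feat = g_feat + (s + s.T) @ feat / feat.shape[1]
    # Chain rule back through the fixed linear "network".
    return loss, W.T @ g_feat


content = rng.uniform(size=(3, 64))  # toy "images": 3 channels, 64 pixels
style = rng.uniform(size=(3, 64))
content_feat = features(content)
style_gram = gram(features(style))

combined = content.copy()  # initialise the combined image to the content image
initial_loss, _ = loss_and_grad(combined, content_feat, style_gram, 1.0, 100.0)
best_loss = initial_loss
for _ in range(200):
    loss, grad = loss_and_grad(combined, content_feat, style_gram, 1.0, 100.0)
    best_loss = min(best_loss, loss)
    # Only the pixels change; W (the "network") is never updated.
    combined = np.clip(combined - 0.01 * grad, 0.0, 1.0)
```

Because the L1 loss is not smooth, a fixed-step subgradient method need not decrease the loss at every iteration, which is why the sketch tracks `best_loss` separately. A real implementation would more likely rely on a deep learning framework's automatic differentiation and a standard optimizer.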

Gallery

Style Image

Style images are taken from a dataset of works by famous artists or were found online by us.

Content Image

Content images are taken from a dataset of images from Google Images, taken from 500px, created by us, or found online by us.

Combined Image

Combined images may appear distorted or differ in size from the original content image. The model expects square images, so non-square content and style images must be resized.
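The distortion can be seen in a minimal resizing sketch. The helper below is hypothetical and uses nearest-neighbour sampling in NumPy purely for illustration; the interpolation method and target size the project actually used are not stated here, so both are assumptions.

```python
import numpy as np


def resize_to_square(img, side):
    """Nearest-neighbour resize of an (H, W, C) array to (side, side, C)."""
    h, w = img.shape[:2]
    rows = np.arange(side) * h // side  # source row for each output row
    cols = np.arange(side) * w // side  # source column for each output column
    return img[rows[:, None], cols]


landscape = np.zeros((300, 400, 3), dtype=np.uint8)  # a 4:3 placeholder image
square = resize_to_square(landscape, 256)
print(square.shape)  # (256, 256, 3)
```

In this example the width is scaled by 256/400 = 0.64 while the height is scaled by 256/300 (about 0.85), so the two axes shrink by different amounts and objects in the image appear horizontally squeezed.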

Other