Arbitrary Style Transfer with Style-Attentional Networks

Author: WxZz呀, posted in Culture. Date: 2018-12-12

Link: https://arxiv.org/pdf/1812.02342v2.pdf

Abstract: Arbitrary style transfer aims to synthesize a content image with style of an image that has never been seen before. Recent arbitrary style transfer algorithms have trade-off between the content structure and the style patterns, or maintaining the global and local style patterns at the same time is difficult due to the patch-based mechanism. In this paper, we introduce a novel style-attentional network (SANet), which efficiently and flexibly decorates the local style patterns according to the semantic spatial distribution of the content image. A new identity loss function and a multilevel features embedding also make our SANet and decoder preserve the content structure as much as possible while enriching the style patterns. Experimental results demonstrate that our algorithm synthesizes higher-quality stylized images in real-time than the state-of-the-art algorithms.

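The paper proposes a novel style-attentional network that efficiently and flexibly decorates local style patterns according to the semantic spatial distribution of the content image. Its other contributions are an identity loss function and a multilevel feature embedding.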

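One important reference for it is Non-local neural networks.

There are two more important references: 1. Attention is all you need. 2. Self-attention generative adversarial networks. Both are key papers on introducing the attention mechanism, and both have explanatory write-ups on Zhihu.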

Figure from reference 1

Figure from reference 2

As these figures show, the style-attentional network in this paper has a structure similar to the networks in references 1 and 2.
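To make that structural similarity concrete, here is a minimal NumPy sketch of attention in which the queries come from content features and the keys and values come from style features. It is an illustration under assumptions, not the paper's exact module: the function name `style_attention`, the random projection matrices `w_q`, `w_k`, `w_v` (stand-ins for learned 1x1 convolutions), and all shapes are hypothetical, and any feature normalization or residual connection is left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def style_attention(content, style, w_q, w_k, w_v):
    """Hypothetical attention sketch.

    content: (C, Nc) content features, one column per spatial position.
    style:   (C, Ns) style features, one column per spatial position.
    w_q, w_k, w_v: (D, C) projections (assumed stand-ins for 1x1 convs).
    """
    q = w_q @ content                 # (D, Nc) queries from the content image
    k = w_k @ style                   # (D, Ns) keys from the style image
    v = w_v @ style                   # (D, Ns) values from the style image
    attn = softmax(q.T @ k, axis=-1)  # (Nc, Ns): each content position attends over style positions
    out = v @ attn.T                  # (D, Nc): style features rearranged onto the content layout
    return out, attn

rng = np.random.default_rng(0)
content = rng.standard_normal((16, 12))   # C=16 channels, 12 content positions
style = rng.standard_normal((16, 20))     # C=16 channels, 20 style positions
w_q, w_k, w_v = (rng.standard_normal((8, 16)) * 0.1 for _ in range(3))
out, attn = style_attention(content, style, w_q, w_k, w_v)
print(out.shape, attn.shape)  # (8, 12) (12, 20)
```

Each row of `attn` sums to 1, so every content position receives a weighted mixture of style features, which is one way to read "decorating local style patterns according to the spatial distribution of the content image".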

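A Zhihu user's write-up on Non-local neural networks also makes the following point:

Connection to the Gram matrix [3]

The insight in this part comes from 毛豆 (an expert who likes cats).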

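The Gram matrix was first applied to style transfer in [3] and later became a standard component of the style loss.

The Gram matrix in the style loss treats each channel as a point (its coordinates are the whole filter response, a vector of length H*W, the size of one feature map), whereas Equation (1) in that write-up treats each spatial position as a point (its coordinates are the values of all filters at that position, a vector whose length equals the number of channels). Both compute the inner product between every pair of points. These are essentially the only two ways to take such inner products.

In other words, they differ in the direction along which the inner product is taken over the filter responses.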
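The two directions of the inner product can be compared side by side in NumPy. This is a small sketch under assumptions: the function names `gram_matrix` and `position_affinity` and the feature shape are made up for illustration, and the second function computes only the raw pairwise inner products, without the learned embeddings or normalization a non-local layer would add.

```python
import numpy as np

def gram_matrix(features):
    """Treat each channel as a point of length H*W; returns a (C, C) matrix."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)   # rows are channels
    return flat @ flat.T

def position_affinity(features):
    """Treat each spatial position as a point of length C; returns an (H*W, H*W) matrix."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)   # columns are spatial positions
    return flat.T @ flat

rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 5))   # C=8, H=4, W=5
G = gram_matrix(feat)
A = position_affinity(feat)
print(G.shape, A.shape)  # (8, 8) (20, 20)
```

Both matrices are built from the same flattened feature map; one is `flat @ flat.T` and the other `flat.T @ flat`, so they share the same trace and the same nonzero eigenvalues even though their sizes differ.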

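A Gram-matrix-based style loss can capture texture information, and from the previous section we know that the non-local layer acts as attention. From [4] we also know that matching Gram matrices is equivalent to minimizing the MMD distance between feature maps under a second-order polynomial kernel. What about non-local? That is not known yet.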
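The MMD statement can be made concrete with a short derivation. It is a sketch under assumed notation, none of which appears in the original: $F, S \in \mathbb{R}^{C \times M}$ are two feature maps with columns $f_k$ and $s_k$ (one per spatial position), $k(x, y) = (x^\top y)^2$ is the second-order polynomial kernel, and $\widehat{\mathrm{MMD}}^2$ is the biased empirical estimate with the positions as samples.

```latex
\begin{aligned}
\lVert F F^\top - S S^\top \rVert_F^2
&= \lVert F^\top F \rVert_F^2 + \lVert S^\top S \rVert_F^2
   - 2\,\lVert F^\top S \rVert_F^2 \\
&= \sum_{k_1, k_2 = 1}^{M} \Big[ (f_{k_1}^\top f_{k_2})^2
   + (s_{k_1}^\top s_{k_2})^2 - 2\,(f_{k_1}^\top s_{k_2})^2 \Big] \\
&= M^2 \, \widehat{\mathrm{MMD}}^2(F, S)
\end{aligned}
```

The first step expands the squared Frobenius norm and uses the cyclic property of the trace. Note that the samples being compared are the per-position feature vectors, the same points whose pairwise inner products the non-local operation uses.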

This paper applies this kind of network to style transfer, which may also lend support to the idea that it is connected to the Gram matrix.
