Visual Translator

This technique was first introduced in our paper published in Findings of EMNLP 2020, titled "Using Visual Feature Space as a Pivot Across Languages." Our goal is to translate visual features into text, or text back into visual features. Because we train our model with text in several languages, our inference procedure can go from text in English to visual features, and then back to any of the other languages used during training. To perform inference we use a technique introduced in one of our earlier works called "feedback propagation." This demo showcases how our model works for English, German (Deutsch), and Japanese (日本語). The demo is enabled by the Multi30k Dataset and the STAIR Dataset, so it is most likely to work for the type of language used in these resources: mostly simple image descriptions of everyday objects and situations.
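
To make the pipeline concrete, below is a minimal, runnable sketch of the pivot idea. All class names, layer choices, and dimensions are illustrative assumptions, and it does not implement our feedback propagation inference procedure; it only illustrates the text → visual features → text path that the model pivots through.

```python
# Minimal sketch of pivoting through a shared visual feature space.
# All names, sizes, and the toy GRU components are illustrative assumptions;
# this is not the released model or its feedback-propagation inference.
import torch
import torch.nn as nn

FEAT_DIM = 512          # assumed size of the visual feature space
VOCAB, EMB = 1000, 256  # toy vocabulary / embedding sizes

class TextToVisual(nn.Module):
    """Maps a tokenized sentence into the visual feature space."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, FEAT_DIM, batch_first=True)

    def forward(self, tokens):             # tokens: (batch, seq_len)
        _, h = self.rnn(self.embed(tokens))
        return h[-1]                        # (batch, FEAT_DIM)

class VisualToText(nn.Module):
    """Decodes a visual feature vector into a sentence in one language."""
    def __init__(self):
        super().__init__()
        self.cell = nn.GRUCell(FEAT_DIM, FEAT_DIM)
        self.out = nn.Linear(FEAT_DIM, VOCAB)

    @torch.no_grad()
    def greedy_decode(self, feats, max_len=10):
        h, tokens = feats, []
        for _ in range(max_len):
            h = self.cell(feats, h)
            tokens.append(self.out(h).argmax(dim=-1))
        return torch.stack(tokens, dim=1)   # (batch, max_len) token ids

# English text -> visual features -> German / Japanese text.
encoder = TextToVisual()
decoders = nn.ModuleDict({"de": VisualToText(), "ja": VisualToText()})
english = torch.randint(0, VOCAB, (1, 7))   # a toy tokenized English sentence
pivot = encoder(english)                    # step 1: text -> visual features
for lang, dec in decoders.items():          # step 2: visual features -> text
    print(lang, dec.greedy_decode(pivot))
```

In the actual demo the same visual feature vector conditions the decoders for both remaining languages, which is what lets the visual space act as the pivot between them.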

Enter your sentence and see its attempted visually-enabled translations into two other languages:
Demo by Ziyan and Vicente