Deep Inside Visual-Semantic Embeddings

Published in PhD Thesis, 2020

Abstract

In this thesis, we aim to further advance image representation and understanding. Centered on Visual Semantic Embedding (VSE) approaches, we explore several directions: First, we present relevant background in Chapter 2, covering image and text representations and existing multimodal approaches. Then, in Chapter 3, we propose novel architectures that further improve the retrieval capability of VSE models. In Chapter 4 we extend VSE models to novel applications and leverage embedding models to visually ground semantic concepts. Finally, in Chapter 5 we delve into the learning process, and in particular the loss function.

Paper

Citation

If you find this work useful, please cite the associated thesis:

M. Engilberge, "Deep Inside Visual-Semantic Embeddings," 2020

BibTeX:

@phdthesis{engilbergeThesis2020,
  title = {Deep Inside Visual-Semantic Embeddings},
  author = {Engilberge, Martin},
  year = {2020}
}