Unsupervised grounding of textual descriptions of object features and actions in video

Alomari, Muhannad, Chinellato, Eris ORCID: https://orcid.org/0000-0003-1920-2238, Gatsoulis, Yiannis, Hogg, David C. and Cohn, Anthony G. (2016) Unsupervised grounding of textual descriptions of object features and actions in video. Proceedings, Fifteenth International Conference on Principles of Knowledge Representation and Reasoning (KR-16). In: 15th International Conference Principles of Knowledge Representation and Reasoning (KR 2016), 25-29 April 2016, Cape Town, South Africa. ISBN 9781577357551. [Conference or Workshop Item]

PDF - Final accepted version (with author's formatting)


We propose a novel method for learning visual concepts and their correspondence to the words of a natural language. The concepts and correspondences are jointly inferred from video clips depicting simple actions involving multiple objects, together with corresponding natural language commands that would elicit these actions. Individual objects are first detected, together with quantitative measurements of their colour, shape, location and motion. Visual concepts emerge from the co-occurrence of regions within a measurement space and words of the language. The method is evaluated on a set of videos generated automatically using computer graphics from a database of initial and goal configurations of objects. Each video is annotated with multiple natural language commands obtained from human annotators through crowdsourcing.
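
To make the co-occurrence idea in the abstract concrete, the following is a minimal toy sketch, not the authors' algorithm. It assumes a single colour measurement (hue) per clip, three hand-picked hue regions, and a simple association score, P(region | word) minus P(region); the paper's actual measurement spaces, region discovery and scoring are not described on this page. All names (`colour_bin`, `ground`, the toy clips) are hypothetical.

```python
from collections import Counter, defaultdict

def colour_bin(hue_degrees):
    """Map a hue angle to a coarse, hand-picked region (an assumption of this sketch)."""
    h = hue_degrees % 360
    if h < 60 or h >= 300:
        return "hue:red-ish"
    if h < 180:
        return "hue:green-ish"
    return "hue:blue-ish"

def ground(clips):
    """Pair each word with the region it is most associated with.

    clips: list of (hues, command) pairs, where hues are the measured
    hue angles of detected objects and command is a text annotation.
    Returns {word: (best_region, score)} with
    score = P(region | word) - P(region), estimated over clips.
    """
    pair_counts = defaultdict(Counter)  # word -> region -> number of clips
    word_counts = Counter()             # word -> number of clips
    region_counts = Counter()           # region -> number of clips
    for hues, command in clips:
        regions = {colour_bin(h) for h in hues}
        words = set(command.lower().split())
        region_counts.update(regions)
        for w in words:
            word_counts[w] += 1
            pair_counts[w].update(regions)

    n_clips = len(clips)
    grounding = {}
    for w in word_counts:
        scores = {
            r: pair_counts[w][r] / word_counts[w] - region_counts[r] / n_clips
            for r in region_counts
        }
        best = max(scores, key=scores.get)
        grounding[w] = (best, scores[best])
    return grounding

# Toy data invented for illustration only.
toy_clips = [
    ([10], "pick up the red block"),
    ([120], "move the green block"),
    ([350], "move the red ball"),
    ([240], "pick up the blue ball"),
]
toy_grounding = ground(toy_clips)
```

In this toy setting the colour words receive a positive score for their own hue region, while a function word such as "the", which appears in every clip, scores zero for every region and so is not tied to any visual concept.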

Item Type: Conference or Workshop Item (Paper)
Research Areas: A. > School of Science and Technology > Design Engineering and Mathematics
Item ID: 19684
Notes on copyright: This is the author's accepted manuscript included in this repository with permission, granted on 16/02/17 by the publisher AAAI. The final published paper appears as: "Alomari, Muhannad, Chinellato, Eris, Gatsoulis, Yiannis, Hogg, David, AND Cohn, Anthony. "Unsupervised Grounding of Textual Descriptions of Object Features and Actions in Video" Knowledge Representation and Reasoning Conference 2016". Published by the Association for the Advancement of Artificial Intelligence (AAAI), available at: http://www.aaai.org/ocs/index.php/KR/KR16/paper/view/12827
Depositing User: Eris Chinellato
Date Deposited: 05 May 2016 11:02
Last Modified: 29 Nov 2022 21:59
URI: https://eprints.mdx.ac.uk/id/eprint/19684
