Fine-grained action recognition by motion saliency and mid-level patches

Liu, Fang ORCID: https://orcid.org/0000-0002-6593-3878, Zhao, Liang ORCID: https://orcid.org/0000-0001-5829-6850, Cheng, Xiaochun ORCID: https://orcid.org/0000-0003-0371-9646, Dai, Qin, Shi, Xiangbin and Qiao, Jianzhong (2020) Fine-grained action recognition by motion saliency and mid-level patches. Applied Sciences, 10 (8), e2811. ISSN 2076-3417 [Article] (doi:10.3390/app10082811)

PDF - Published version (with publisher's formatting)
Available under License Creative Commons Attribution.

Abstract

Effective extraction of the human body parts and operated objects that participate in an action is the key issue in fine-grained action recognition. However, most existing methods require intensive manual annotation to train detectors for these interaction components. In this paper, we represent videos by mid-level patches to avoid manual annotation, where each patch corresponds to an action-related interaction component. To capture mid-level patches more accurately and quickly, candidate motion regions are extracted by motion saliency. First, the motion regions containing interaction components are segmented using a threshold computed adaptively from the saliency histogram of the motion saliency map. Second, we introduce a mid-level patch mining algorithm for interaction component detection, consisting of object proposal generation and mid-level patch detection. The object proposal generation algorithm, inspired by the Huffman algorithm, produces multi-granularity object proposals. Based on these object proposals, the mid-level patch detectors are trained by K-means clustering and SVM. Finally, we build a fine-grained action recognition model that uses a graph structure to describe the relationships between mid-level patches. To recognize actions, the proposed model computes the appearance and motion features of the mid-level patches and the binary motion cooperation relationships between adjacent patches in the graph. Extensive experiments on the MPII cooking database demonstrate that the proposed method achieves better results on fine-grained action recognition.
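
The abstract does not state how the adaptive threshold is derived from the saliency histogram. As a rough illustration only, the sketch below substitutes Otsu's method, a standard histogram-based threshold, for the paper's own rule. The function names `otsu_threshold` and `segment_motion_regions`, and the assumption that saliency values are integers in 0..255, are hypothetical and not taken from the article.

```python
def otsu_threshold(values, levels=256):
    """Return the threshold t that maximizes between-class variance.

    Stand-in for the paper's adaptive threshold (assumption): Otsu's method
    applied to the histogram of integer saliency values in [0, levels).
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t
    sum_bg = 0  # sum of those pixel values
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue          # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break             # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def segment_motion_regions(saliency_map):
    """Binarize a 2D saliency map: 1 marks candidate motion pixels."""
    flat = [v for row in saliency_map for v in row]
    t = otsu_threshold(flat)
    return [[1 if v > t else 0 for v in row] for row in saliency_map]


# Toy 3x3 saliency map (hypothetical values): low background, high motion.
toy_map = [
    [10, 10, 200],
    [10, 200, 200],
    [10, 10, 10],
]
toy_mask = segment_motion_regions(toy_map)
```

In this toy case the histogram has two well-separated modes, so any histogram-based rule would place the threshold between them. The paper's own calculation may behave differently on real saliency maps.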

Item Type: Article
Additional Information: This article belongs to the Special Issue Intelligent Processing on Image and Optical Information - Ⅱ.
Keywords (uncontrolled): fine-grained action recognition, motion saliency, mid-level patch, object proposal
Research Areas: A. > School of Science and Technology > Computer Science
Item ID: 29705
Notes on copyright: © 2020 by the authors.
Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Depositing User: Jisc Publications Router
Date Deposited: 20 Apr 2020 15:33
Last Modified: 20 Apr 2020 15:33
URI: https://eprints.mdx.ac.uk/id/eprint/29705
