Motion Complementary Network for Efficient Action Recognition

Conferences
ICPR 2020 MAIN CONFERENCE PS T2.2: Biometrics and Human Analysis (2021)
Available as
Online
Summary

Two-stream ConvNets and 3D ConvNets are both widely used in action recognition. However, neither method is efficient for deployment: calculating optical flow is very slow, while 3D convolution is computationally expensive. Our key insight is that the motion information from optical flow maps is complementary to the motion information from a 3D ConvNet. Instead of simply combining these two methods, we propose two novel techniques that improve performance at lower computational cost: fixed-motion-accumulation and balanced-motion-policy. With these two techniques, we propose a novel framework called Efficient Motion Complementary Network (EMC-Net) that combines high efficiency with high performance. We conduct extensive experiments on the Kinetics, UCF101, and Jester datasets. We achieve notably higher performance while consuming 4.7 times less computation than I3D, 11.6 times less computation than ECO, and 17.8 times less computation than R(2+1)D. On the Kinetics dataset, we achieve 2.6% better performance than the recently proposed TSM, with 1.4 times fewer FLOPs and 10 ms faster inference on a K80 GPU.
