Smoking Action Recognition Based on Spatial-Temporal Convolutional Neural Networks

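Many countries and regions around the world have banned smoking entirely in indoor public places and workplaces, and Taiwan is no exception. Yet people are still frequently seen smoking at hospital entrances and in the corners of school campuses. Even a non-smoker standing next to a smoker still inhales smoke, which is known as secondhand smoke. Secondhand smoke harms the human body in many ways: besides raising the risk of diseases such as cancer, heart disease, stroke, and respiratory disease, it may also impair brain function. We hope to use deep learning techniques and methods to identify smokers who violate the ban.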

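This study, "Smoking Action Recognition Based on Spatial-Temporal Convolutional Neural Networks," proposes a system for smoking action recognition. It uses data balancing and data augmentation to improve performance, and combines the GoogLeNet convolutional neural network from deep learning with the video-segmentation architecture of Temporal segment networks to form a spatial convolutional neural network with temporal structure (the spatial-temporal network of the title), yielding a system that effectively recognizes smoking videos. On the original Hmdb51 smoking videos, recognition accuracy reaches 100%. With the added Activitynet smoking videos of everyday smoking (Hmdb51 + Activitynet smoking), it reaches 99.16%. On the selected AVA data movie smoking clips, it still reaches 91.667%, so the system can effectively distinguish smoking videos.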

Keywords: smoking action recognition, video classification, convolutional neural networks, deep learning.

Smoking Action Recognition Based on Spatial-Temporal Convolutional Neural Networks

Cigarette smoking increases the risk of death from all causes in both men and women. A person standing next to a smoker still inhales the smoke, which is known as passive smoking. Consequently, smoking is prohibited in many enclosed public areas such as government buildings, educational facilities, hospitals, enclosed sports facilities, and buses. However, people still often smoke even in places where it is strictly prohibited, such as hospitals and elementary school campuses. The objective of this work is to develop a smoking action recognition system based on deep learning that allows smoking behavior to be detected quickly.

In this work, we propose a system that recognizes smoking actions. It applies data balancing and data augmentation and builds on GoogLeNet and the Temporal segment networks (TSN) architecture to achieve effective smoking action recognition. In our experiments, the spatial CNN is more powerful than the temporal CNN for smoking actions. The experimental results show that recognition accuracy reaches 100% on the Hmdb51 test dataset. With the additional ActivityNet smoking videos, accuracy reaches 99.16%. On additional, unrelated movie smoking clips, accuracy is still as high as 91.67%.
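The TSN-style segment structure mentioned above can be illustrated with a minimal sketch. It assumes three segments and an averaging consensus, as in the original TSN formulation; the segment count, the function names `sample_segment_indices` and `average_consensus`, and the two-class score layout are illustrative assumptions, not details taken from this work. The per-frame scores would come from a spatial CNN such as GoogLeNet, which is not implemented here.

```python
import random
from typing import List, Optional, Sequence


def sample_segment_indices(
    num_frames: int,
    num_segments: int = 3,  # assumed value; the segment count is not stated in this work
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Split the video into equal-length segments and pick one random frame per segment."""
    if num_frames < num_segments:
        raise ValueError("video has fewer frames than segments")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments        # first frame of segment k
        end = (k + 1) * num_frames // num_segments    # one past the last frame of segment k
        indices.append(rng.randrange(start, end))
    return indices


def average_consensus(segment_scores: Sequence[Sequence[float]]) -> List[float]:
    """Fuse per-segment class scores into one video-level score by averaging."""
    num_classes = len(segment_scores[0])
    return [
        sum(scores[c] for scores in segment_scores) / len(segment_scores)
        for c in range(num_classes)
    ]


if __name__ == "__main__":
    frame_ids = sample_segment_indices(num_frames=90, rng=random.Random(0))
    # Hypothetical per-segment scores for the classes [non-smoking, smoking].
    scores = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
    print(frame_ids, average_consensus(scores))
```

Sampling one frame from each segment lets a purely spatial network see the beginning, middle, and end of a clip, which is how a per-frame classifier gains a coarse temporal structure.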

Keywords: smoking action recognition, video classification, convolutional neural networks, deep learning.