The description and analysis of animal behavior over long periods of time is one of the most important challenges in ecology. However, most such studies are limited by the time and cost that human observers require. Collecting data via video recordings allows observation periods to be extended, but evaluating the recordings manually is itself very time‐consuming. Automated evaluation using suitable deep learning methods is a promising way to analyze even large amounts of video data within a reasonable time frame.
In this study, we present a multistep convolutional neural network system that detects three typical stances of African ungulates in zoo enclosures with high accuracy. An important aspect of our approach is the introduction of model averaging and postprocessing rules, which make the system robust to outliers.
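To illustrate the general idea of model averaging, the following minimal Python sketch averages the per-class probabilities that several classifiers assign to a single frame and picks the class with the highest mean. It is only a sketch under stated assumptions: the function name, the number of models, and the probability values are hypothetical and are not taken from the published system.

```python
from typing import List, Sequence, Tuple


def average_predictions(
    model_outputs: Sequence[Sequence[float]],
) -> Tuple[int, List[float]]:
    """Average class probabilities over several models for one frame.

    model_outputs[m][c] is the probability that (hypothetical) model m
    assigns to class c. Returns the index of the class with the highest
    mean probability together with the averaged probabilities.
    """
    n_models = len(model_outputs)
    if n_models == 0:
        raise ValueError("at least one model output is required")
    n_classes = len(model_outputs[0])
    if any(len(out) != n_classes for out in model_outputs):
        raise ValueError("all models must score the same number of classes")

    # Mean probability per class across all models.
    averaged = [
        sum(out[c] for out in model_outputs) / n_models
        for c in range(n_classes)
    ]
    # Class index with the largest averaged probability.
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


# Made-up scores from three models for three stance classes.
example_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
best_class, mean_probs = average_predictions(example_outputs)
```

In this made-up example, two of the three models individually prefer class 0, yet the averaged probabilities favor class 1, because averaging takes each model's confidence into account instead of counting votes.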
Our trained system achieves an in‐domain classification accuracy of >0.92, which a postprocessing step improves to >0.96. In addition, the whole system performs well even in an out‐of‐domain classification task with two unknown types, achieving an average accuracy of 0.93. We provide our system at https://github.com/Klimroth/Video-Action-Classifier-for-African-Ungulates-in-Zoos/tree/main/mrcnn_based so that interested users can train their own models to classify images and conduct behavioral studies of wildlife.
The use of a multistep convolutional neural network for fast and accurate classification of wildlife behavior facilitates the evaluation of large amounts of image data in ecological studies and greatly reduces the effort of manual image analysis. Our system also shows that postprocessing rules are a suitable way to make species‐specific adjustments and to substantially increase the accuracy with which single behavioral phases (number, duration) are described. The results of the out‐of‐domain classification strongly suggest that our system is robust and achieves a high degree of accuracy even for new species, so that other settings (e.g., field studies) can be considered.
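One simple kind of postprocessing rule can be sketched as follows: phases shorter than a minimum length are treated as outliers and absorbed into the preceding phase, after which the number and duration of the remaining phases can be read off directly. This is a hedged illustration only; the rule, the threshold `min_len`, and the single-letter labels are assumptions made for the example and are not the rules used in the published system.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def run_lengths(labels: Sequence[str]) -> List[Tuple[str, int]]:
    """Collapse a per-frame label sequence into (label, duration) phases."""
    return [(label, sum(1 for _ in group)) for label, group in groupby(labels)]


def suppress_short_phases(labels: Sequence[str], min_len: int) -> List[str]:
    """Relabel every phase shorter than min_len with the preceding label.

    A short phase at the very start has no predecessor and is kept as is.
    """
    smoothed: List[str] = []
    for label, length in run_lengths(labels):
        if length < min_len and smoothed:
            label = smoothed[-1]  # absorb the outlier into the previous phase
        smoothed.extend([label] * length)
    return smoothed


# Hypothetical per-frame predictions: "S" and "L" are placeholder labels.
raw_labels = list("SSSSLSSSLLLLL")
smoothed_labels = suppress_short_phases(raw_labels, min_len=2)

# Number and duration (in frames) of the phases after smoothing.
phases = run_lengths(smoothed_labels)
```

Here the isolated single "L" frame is absorbed into the surrounding "S" phase, so the sequence is described as two phases instead of four. A species‐specific adjustment would, under this assumption, amount to choosing a different `min_len` or a different relabeling rule.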
We design and implement a video action classification system that automatically detects three behavioral states of African ungulates. We overcome the challenges posed by poor‐quality video material (e.g., heavy truncation, low frame rates, and night vision), which might be called the standard case in behavioral studies.