Automatic recognition of aggressive pig behaviour requires consideration of spatial and temporal features due to its complexity. Researchers from China recently proposed a method that utilises the so-called temporal shift module.
For their study, the researchers selected 2 pens of mixed pigs (Duroc × Landrace × Yorkshire breed) with 8 pigs in each pen for 1 week of video recording. They installed a high-definition 2D camera on top of the pen to record vertically from overhead.
The team annotated the videos manually to establish a dataset of aggressive and non-aggressive behaviour. The dataset consisted of 5,220 videos of aggressive and 5,220 videos of non-aggressive behaviour.
The temporal shift module is a cross-frame processing module that enhances the feature extraction ability of 2D convolutional neural networks for video data, thereby improving the accuracy of video classification.
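The article does not spell out how the shift works internally. In the form in which the technique is commonly described, a fraction of the feature channels is moved one frame backward or forward in time, so that a 2D network processing a single frame also sees information from its neighbours. The following is a minimal sketch in plain Python under that assumption; the function name `temporal_shift`, the `fold_div` parameter, and the list-based clip layout are illustrative choices, not details taken from the study.

```python
def temporal_shift(clip, fold_div=8):
    """Shift part of the channels along the time axis (illustrative sketch).

    clip: list of T frames, each frame a list of C channel values.
    fold_div: assumed hyperparameter; C // fold_div channels are shifted
              in each direction, the remaining channels stay in place.
    """
    num_frames = len(clip)
    num_channels = len(clip[0])
    fold = num_channels // fold_div

    # Start from zeros so that positions with no neighbouring frame are zero-padded.
    shifted = [[0.0] * num_channels for _ in range(num_frames)]

    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                source = t + 1  # first group: take the value from the next frame
            elif c < 2 * fold:
                source = t - 1  # second group: take the value from the previous frame
            else:
                source = t      # remaining channels: unchanged
            if 0 <= source < num_frames:
                shifted[t][c] = clip[source][c]
    return shifted


# Toy clip: 3 frames with 4 channels each.
toy_clip = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
print(temporal_shift(toy_clip, fold_div=4))
# prints [[5, 0.0, 3, 4], [9, 2, 7, 8], [0.0, 6, 11, 12]]
```

The operation only moves existing values and has no learnable weights, which is why a module of this kind can be inserted into a 2D network without adding model parameters.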
The team selected 4 high-performing models (ResNet50, ResNeXt50, DenseNet201, and ConvNext-tiny) and inserted the temporal shift module into each of them for the recognition of aggressive pig behaviour. ResNet50 is a deep convolutional neural network model composed of ResNet blocks, each of which adds its input to its output to form a skip connection. This alleviates the problems of vanishing gradients and network degradation.
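The skip connection can be illustrated with a toy example. This is a sketch only: real ResNet blocks use convolutions, normalisation, and activations, whereas the hypothetical `residual_block` below accepts any function as the block's transformation.

```python
def residual_block(x, transform):
    """Return transform(x) + x, element by element (toy skip connection).

    x: list of numbers standing in for a feature vector.
    transform: any function mapping a list to a list of the same length;
               a placeholder for the block's learned layers.
    """
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


# If the transformation contributes nothing, the input passes through unchanged.
print(residual_block([1.0, 2.0], lambda v: [0.0 for _ in v]))
# prints [1.0, 2.0]
```

Because the input is always carried forward by the addition, a block can fall back to passing its input through, which is the intuition behind the reduced degradation in very deep networks.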
ConvNext-tiny is an improved model based on ResNet50. DenseNet201 is a densely connected convolutional neural network model in which each layer is connected to all previous layers, thereby enhancing feature propagation and reuse while reducing the number of parameters and computational complexity. Accuracy, recall, precision, F1-score, speed, and number of model parameters were used as metrics to evaluate the performance of the models.
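For readers unfamiliar with the classification metrics, the sketch below shows how accuracy, precision, recall, and F1-score are conventionally computed for a two-class problem. The function name `classification_metrics`, the label convention (1 for aggressive, 0 for non-aggressive), and the sample labels are assumptions made for illustration, not data from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1-score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for 8 clips: 1 = aggressive, 0 = non-aggressive.
truth = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
print(classification_metrics(truth, predicted))
```

Precision penalises false alarms and recall penalises missed aggressive events; the F1-score is their harmonic mean, so it is only high when both are high.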
All 4 models exhibited a relatively smooth optimisation trend on the training set. ResNeXt50 showed slightly better convergence speed than the other models. The highest accuracy rates achieved by ResNet50, ResNeXt50, ConvNext-tiny, and DenseNet201 on the training set were 99.05%, 99.04%, 97.91%, and 98.57%, respectively. On the validation set, the highest accuracy rates achieved by the same models were 96.73%, 97.01%, 96.06%, and 96.49%, respectively.
Inclusion of the temporal shift module significantly improved the performance of the 4 convolutional neural network models in recognising aggressive pig behaviour. ResNeXt50 achieved the highest accuracy, precision, and F1-score. This is consistent with its outstanding performance on the training and validation sets.
In addition, ResNeXt50 had fewer parameters than ResNet50 and ConvNext-tiny, and only slightly more parameters than DenseNet201. This indicates that ResNeXt50 can still maintain high accuracy while reducing the number of model parameters.
ResNeXt50 can extract the temporal features of aggressive behaviour effectively
ResNeXt50 judged the presence of aggressive behaviour based on significant displacement caused by interactions among 2 or more pigs. If there was no interaction, attention was directed towards the area where most pigs gathered. The temporal features of aggressive behaviour are therefore a critical factor in differentiating it from non-aggressive behaviour, and ResNeXt50 can extract these features effectively.
The authors concluded that the temporal shift module improves the recognition accuracy of aggressive behaviour. It also provides a new, highly efficient approach for pig behaviour recognition without adding any model parameters.