Abstract:
Recent research has demonstrated that deep learning-based classification models produce outstanding results when applied to the task of human activity detection. Consequently, the automatic detection of violent content in videos has become an urgent need in order to stop the spread of potentially harmful content across digital platforms. Despite the impressive effectiveness of neural networks, the adversarial attacks developed against them by numerous researchers expose security weaknesses. This has brought to light the necessity of analyzing the resilience of state-of-the-art violence detection classifiers. Although such adversaries pose a risk to model credibility, they can also serve as defense mechanisms against AI inference attacks. For that reason, we propose a transferable logit approach for binary misclassification of video data that evades the system through our spatially perturbed, synthesized adversarial samples. We validate a non-sparse white-box attack setting by using an adversarial falsification-based threat model. Our regularizer-based attack utilizes the L2 norm for targeted misclassification. This setting generates cross-domain adversarial video samples by perturbing only spatial features without affecting the temporal features. We conduct in-depth experiments on the validation sets of two well-known violence detection datasets, namely the Hockey Fight Dataset and the Movie Dataset, and show that our proposed algorithm achieves an attack success rate of 86.84% on the Hockey dataset and 87.54% on the Movie dataset when applied against the state-of-the-art violence detection classifier. Out of a total of 64 samples in the Movie dataset, 56 led to an inaccurate label prediction by the model, and 264 out of 304 video samples taken from the Hockey dataset induced the model to make an incorrect classification, indicating that the evasion attack is successful. Furthermore, in addressing attack transferability, we observed that a higher degree of transferability generally requires a higher attack success rate, which is acquired with a higher attack budget; however, we attained the same level of attack success rate with an attack budget of ϵ = 50. We expect that the results of our work will contribute to making violence detection models more resistant to adversarial examples.
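The sketch below illustrates, in general terms, the kind of L2-regularized, white-box spatial attack the abstract describes: a per-frame perturbation is optimized toward a target label while an L2 penalty and an L2-ball projection keep the perturbation within an attack budget, leaving the temporal ordering of frames untouched. It is a minimal illustration, not the authors' implementation; the names `model`, `video` (a tensor of shape [1, T, C, H, W] with pixel values in [0, 1]), `target_label`, and the hyperparameter values are assumptions for demonstration only.

```python
# Illustrative sketch (not the paper's released code): an L2-regularized,
# white-box spatial attack on a video classifier. All names and values
# below are assumptions made for this example.
import torch
import torch.nn.functional as F

def l2_spatial_attack(model, video, target_label, epsilon=50.0,
                      steps=100, lr=0.01, l2_weight=0.1):
    """Optimize a per-frame (spatial) perturbation `delta` so that the
    classifier predicts `target_label`, while an L2 regularizer and a
    projection onto an L2 ball of radius `epsilon` keep the perturbation
    small. Frame order (temporal structure) is left unchanged."""
    delta = torch.zeros_like(video, requires_grad=True)
    optimizer = torch.optim.Adam([delta], lr=lr)
    target = torch.tensor([target_label])

    for _ in range(steps):
        adv = torch.clamp(video + delta, 0.0, 1.0)   # keep valid pixel range
        logits = model(adv)
        # targeted misclassification loss plus L2 penalty on the perturbation
        loss = F.cross_entropy(logits, target) + l2_weight * delta.norm(p=2)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # project back onto the attack budget (L2 ball of radius epsilon)
        with torch.no_grad():
            norm = delta.norm(p=2)
            if norm > epsilon:
                delta.mul_(epsilon / norm)

    return torch.clamp(video + delta, 0.0, 1.0).detach()
```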