Detecting Toxic Comments in Russian

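Today, social networks have become one of the main communication platforms, both online and in real life. The freedom to express different points of view, including toxic, aggressive, and offensive comments, can have a long-term negative effect on people's opinions and on social cohesion. One of the most important tasks for modern society is therefore to develop means of automatically detecting toxic content on the Internet in order to reduce these negative effects.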

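This article describes how we approached this problem for the Russian language. As the data source, we used a dataset published anonymously on Kaggle and additionally checked the quality of its annotation. To build a classification model, we fine-tuned two versions of the Multilingual Universal Sentence Encoder (M-USE), Bidirectional Encoder Representations from Transformers (BERT), and ruBERT. The fine-tuned ruBERT model achieved F₁ = 92.20%, the best classification result. We have made the trained models and code samples publicly available.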

1. Introduction

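Today, the problem of identifying toxic comments is handled well by advanced deep learning techniques [1], [35]. Although some works directly address the detection of insults, toxicity, and hate speech in Russian [2], [8], [17], there is only one publicly available dataset of toxic Russian-language comments [5]. It was published on Kaggle without any explanation of the annotation process, so it may be unreliable for academic and practical purposes unless it is examined more closely.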

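This article is devoted to the automatic detection of toxic comments in Russian. For this task, we validated the annotation of the Russian Language Toxic Comments dataset [5]. We then built classification models by fine-tuning pre-trained multilingual versions of the Universal Sentence Encoder (M-USE) [48] and Bidirectional Encoder Representations from Transformers (M-BERT) [13], as well as ruBERT [22]. The most accurate model, ruBERT-Toxic, achieved F₁ = 92.20% in binary classification of toxic comments. The resulting M-BERT and M-USE models can be downloaded from GitHub.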

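The article is structured as follows. Section 2 briefly reviews related work and the available Russian-language datasets. Section 3 gives an overview of the Russian Language Toxic Comments dataset and describes how we validated its annotation. Section 4 describes how we adapted the language models to the text classification task. Section 5 describes the classification experiments. Finally, we discuss the performance of the system and directions for future research.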

2. Related work

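Extensive work has been done on detecting toxic comments in various data sources. For example, Prabowo and colleagues used Naive Bayes (NB), Support Vector Machine (SVM), and Random Forest Decision Tree (RFDT) classifiers to detect hate speech and abusive language on Indonesian Twitter [34]. Their experiments showed an accuracy of 68.43% for the hierarchical approach with word unigram features and an SVM model. The team led by Founta [15] proposed a deep neural network based on GRU and pre-trained GloVe embeddings for classifying toxic texts. The model showed high accuracy on five datasets, with AUC ranging from 92% to 98%.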

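A growing number of workshops and competitions are devoted to detecting toxic, hateful, and offensive comments: for example, HatEval and OffensEval at SemEval-2019; HASOC at FIRE-2019; the shared task on the identification of offensive language at GermEval-2019 and GermEval-2018; and TRAC at COLING-2018. The models used in these tasks range from traditional machine learning (for example, SVM and logistic regression) to deep learning (RNN, LSTM, GRU, CNN, CapsNet), including attention mechanisms [45], [49], as well as advanced models such as ELMo [31], BERT [13], and USE [9], [48]. A significant number of the teams that achieved good results [18], [24], [27], [28], [30], [36], [38] used embeddings from the listed pre-trained language models. Because representations from pre-trained models showed strong classification results, they have been widely used in subsequent research. For example, researchers from the University of Lorraine performed multiclass and binary classification of Twitter messages using two approaches: training a DNN classifier on pre-trained word embeddings, and fine-tuning a pre-trained BERT model [14]. The second approach showed significantly better results than CNN and bidirectional LSTM networks based on FastText embeddings.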

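Although a large number of studies [7], [33], [41] examine toxic and aggressive behaviour in Russian-language social networks, little attention has been paid to its automatic classification. To detect aggression in English and Russian texts, Gordeev used convolutional neural networks and a random forest classifier (RFC) [17]. The set of messages annotated for aggression contained about 1,000 messages in Russian and about the same number in English, but it has not been made publicly available. The trained CNN model reached an accuracy of 66.68% in binary classification of Russian texts. Based on these results, the author concluded that convolutional neural networks and deep learning approaches are more promising for identifying aggressive texts. Andrusyak et al. proposed an unsupervised probabilistic approach with a seed dictionary for classifying offensive YouTube comments written in Ukrainian and Russian [2]. The authors published a manually labelled dataset of 2,000 comments, but it contains both Russian and Ukrainian texts, so it cannot be used directly for research on Russian-language texts.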

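Some recent studies have focused on automatically identifying attitudes towards migrants and ethnic groups in Russian-language social networks, including the identification of identity-based attacks. Bodrunova and co-authors studied 363,000 Russian-language LiveJournal posts on attitudes towards migrants from post-Soviet republics compared with other nations [8]. It turned out that migrants neither provoke significant discussion in Russian-language blogs nor receive the worst treatment. At the same time, representatives of North Caucasian and Central Asian nationalities are treated in quite different ways. A group of researchers led by Bessudnov found that Russians are traditionally more hostile towards people from the Caucasus and Central Asia, while Ukrainians and Moldovans are generally accepted as potential neighbours [6]. According to the findings of the team led by Koltsova, attitudes towards representatives of Central Asian nationalities and Ukrainians are the most negative [19].

Although several academic studies have focused on identifying toxic, offensive, and hateful speech, none of their authors has made a Russian-language dataset publicly available. To the best of our knowledge, the Russian Language Toxic Comments dataset [5] is the only collection of toxic Russian-language comments in the public domain. However, it was published on Kaggle without a description of how it was created and annotated, so using it in academic or practical projects is not recommended without a detailed examination.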

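Because few studies are devoted to detecting toxic Russian-language comments, we decided to evaluate deep learning models on the Russian Language Toxic Comments dataset [5]. We are not aware of any classification studies based on this data source. Multilingual BERT and Multilingual USE are among the most widespread and successful models in recent research projects, and they are the only ones among these that officially support Russian. We chose fine-tuning as the transfer learning approach because it has given the best classification results in recent studies [13], [22], [43], [48].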

3. Toxic comments dataset

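The Russian Language Toxic Comments dataset [5] is a collection of annotated comments from the Dvach and Pikabu websites. It was published on Kaggle in 2019 and contains 14,412 comments, of which 4,826 are labelled toxic and 9,586 non-toxic. The average comment length is 175 characters, the minimum is 21, and the maximum is 7,403.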

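To check the quality of the annotation, we manually annotated a subset of the comments and compared our labels with the original ones using inter-annotator agreement. We decided to treat the existing annotation as correct if the inter-annotator agreement reached a substantial level or higher.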

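First, we manually labelled 3,000 comments and compared the resulting class labels with the original ones. The labelling was done by Russian-speaking users of the Yandex.Toloka crowdsourcing platform, which has been used in several academic studies of Russian-language texts [10], [29], [32], [44]. As labelling guidelines, we used the instructions for identifying toxicity, together with additional attributes, from the Jigsaw Toxic Comment Classification Challenge. Annotators were asked to identify toxicity in the texts, and its level had to be indicated for each comment. To improve labelling accuracy and limit the possibility of cheating, we used the following techniques: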

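We assigned each annotator a skill level based on their answers to control tasks and banned those who gave incorrect answers.
We restricted access to the tasks for annotators who responded too quickly.
We restricted access to the tasks for annotators who failed to enter the correct captcha several times in a row.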

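Each comment was annotated by 3 to 8 annotators using the dynamic overlap technique. The results were aggregated with the Dawid-Skene method [12], as recommended by Yandex.Toloka. The annotators showed a high level of inter-annotator agreement, with a Krippendorff's alpha of 0.81. Cohen's kappa between the original labels and our aggregated labels was 0.68, which corresponds to a substantial level of inter-annotator agreement [11]. We therefore decided to treat the annotation of the dataset as correct, especially given possible differences in the annotation instructions.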
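The comparison between the original labels and the aggregated crowd labels rests on Cohen's kappa, which corrects the raw agreement rate p_o for the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The following is a minimal sketch of that statistic in plain Python, not the code used in the study; the function name `cohen_kappa` and the toy label lists are illustrative assumptions.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of items on which both label sets coincide.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, computed from each side's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sides used one identical label everywhere
    return (observed - expected) / (1.0 - expected)


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic.
original = [1, 1, 0, 0, 0, 1, 0, 0, 1, 0]
relabelled = [1, 0, 0, 0, 0, 1, 0, 1, 1, 0]
kappa = cohen_kappa(original, relabelled)
print(round(kappa, 3))
```

On this toy data the two label sets agree on 8 of 10 items, the chance agreement is 0.52, and kappa comes out at about 0.58.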

4. Machine learning models

4.1. Baseline approaches

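For the baselines, we took one basic machine learning approach and one modern neural network approach. In both cases we applied the same preprocessing: we replaced URLs and usernames with keywords, removed punctuation, and converted uppercase letters to lowercase.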
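The preprocessing steps listed above can be sketched with the standard library alone. This is an illustration under assumptions rather than the original pipeline: the placeholder keywords `url` and `user`, the `@name` pattern for usernames, and the function name `preprocess` are not given in the source.

```python
import re
import string

# Assumed patterns: links start with http(s):// or www., usernames with "@".
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)
USER_RE = re.compile(r"@\w+")
# ASCII punctuation plus a few typographic marks common in Russian text.
PUNCT_TABLE = str.maketrans(
    {ch: " " for ch in string.punctuation + "«»\u2014\u2026"}
)


def preprocess(text):
    """Replace URLs and usernames with keywords, drop punctuation, lowercase."""
    # URLs go first, otherwise punctuation removal would break them apart.
    text = URL_RE.sub(" url ", text)
    text = USER_RE.sub(" user ", text)
    text = text.translate(PUNCT_TABLE)
    # Lowercase and collapse the whitespace left behind by the replacements.
    return " ".join(text.lower().split())


sample = "Привет, @ivan! Смотри: https://example.com/page?x=1"
print(preprocess(sample))
```

For the sample string this yields `привет user смотри url`.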

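First, we applied a Multinomial Naive Bayes (MNB) model, which performs well on text classification problems [16], [40]. To build the model, we used bag-of-words and TF-IDF vectorization. The second model was a Bidirectional Long Short-Term Memory (BiLSTM) neural network. For the embedding layer, we pre-trained Word2Vec embeddings (dim = 300) [25] on the collection of Russian-language Twitter messages from RuTweetCorp [37]. On top of the Word2Vec embeddings we added two bidirectional LSTM layers, followed by a hidden fully connected layer and a sigmoid output layer. To reduce overfitting, we added regularization layers with Gaussian noise and dropout to the network. We used the Adam optimizer with an initial learning rate of 0.001 and binary cross-entropy as the loss function. The model was trained with frozen embeddings for 10 epochs. We also tried unfreezing the embeddings at different epochs while lowering the learning rate, but the results were worse. The likely reason is the size of the training set [4].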
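To make the first baseline concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over bag-of-words counts with additive smoothing. It is written from scratch for illustration and covers only the count-based variant, not TF-IDF weighting; the class name, the smoothing value `alpha=1.0`, and the English toy sentences are assumptions, not details from the source.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Minimal multinomial Naive Bayes over whitespace-tokenized text."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing strength

    def fit(self, docs, labels):
        self.vocab = {tok for doc in docs for tok in doc.split()}
        self.class_docs = Counter(labels)         # documents per class
        self.token_counts = defaultdict(Counter)  # token counts per class
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc.split())
        self.n_docs = len(docs)
        return self

    def _log_score(self, doc, label):
        counts = self.token_counts[label]
        total = sum(counts.values()) + self.alpha * len(self.vocab)
        # Log prior plus the sum of smoothed log likelihoods of the tokens.
        score = math.log(self.class_docs[label] / self.n_docs)
        for tok in doc.split():
            if tok in self.vocab:  # tokens unseen in training are ignored
                score += math.log((counts[tok] + self.alpha) / total)
        return score

    def predict(self, doc):
        return max(self.class_docs, key=lambda c: self._log_score(doc, c))


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic.
train_docs = ["you are an idiot", "what an idiot move",
              "thanks for the help", "great post thanks"]
train_labels = [1, 1, 0, 0]
model = MultinomialNB().fit(train_docs, train_labels)
print(model.predict("idiot"), model.predict("thanks for the post"))
```

Working in log space avoids numeric underflow on long comments, and the smoothing term keeps a token seen in only one class from driving the other class's probability to zero.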

4.2. BERT model

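Two versions of the multilingual BERT_BASE model are officially available, but only the Cased version is officially recommended. BERT_BASE takes a sequence of no more than 512 tokens and returns its representation. Tokenization is performed with WordPiece [46], preceded by text normalization and punctuation splitting. Researchers from MIPT trained BERT_BASE Cased and published ruBERT, a model for the Russian language [22]. We used both models, multilingual BERT_BASE Cased and ruBERT; each contains 12 sequential Transformer blocks, a hidden size of 768, 12 self-attention heads, and 110 million parameters. Fine-tuning was performed with the parameters recommended in [43] and in the official repository: three training epochs, a warmup proportion of 10%, a maximum sequence length of 128, a batch size of 32, and a learning rate of 5e-5.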
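The fine-tuning parameters above imply a learning-rate schedule, which the following sketch makes explicit. It assumes one common form, linear warmup over the first 10% of steps followed by linear decay to zero; the source states only the warmup proportion, so the decay shape is an assumption. The training-set size also assumes that everything outside the 20% test split is used for training.

```python
import math


def learning_rate(step, total_steps, base_lr=5e-5, warmup_fraction=0.1):
    """Linear warmup to base_lr, then linear decay to zero (assumed shape)."""
    warmup_steps = int(total_steps * warmup_fraction)
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    decay_steps = max(total_steps - warmup_steps, 1)
    return base_lr * max(total_steps - step, 0) / decay_steps


# Assumption: 80% of the 14,412 comments are used for training.
train_size = 14412 - round(14412 * 0.2)
steps_per_epoch = math.ceil(train_size / 32)  # batch size 32
total_steps = steps_per_epoch * 3             # three epochs

for step in (0, 54, 108, 600, total_steps):
    print(step, f"{learning_rate(step, total_steps):.2e}")
```

Under these assumptions there are 361 steps per epoch and 1,083 steps in total, so the rate climbs for the first 108 steps and then falls for the remaining 975.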

4.3. M-USE model

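As input, multilingual USE_Trans takes a sequence of no more than 100 tokens, while multilingual USE_CNN takes a sequence of no more than 256 tokens. SentencePiece [20] tokenization is used for all supported languages. We used the pre-trained multilingual USE_Trans, which supports 16 languages including Russian and contains a Transformer encoder with 6 Transformer layers, 8 attention heads, a filter size of 2048, and a hidden size of 512. We also used the pre-trained multilingual USE_CNN, which supports 16 languages including Russian and contains a CNN encoder with two CNN layers and filter widths of (1, 2, 3, 5). For both models, we used the parameters recommended on the TensorFlow Hub pages: 100 training epochs, a batch size of 32, and a learning rate of 3e-4.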

5. Experiments

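We compared the baseline and transfer learning approaches:

Multinomial Naive Bayes classifier (MNB);
Bidirectional Long Short-Term Memory neural network (BiLSTM);
the multilingual version of Bidirectional Encoder Representations from Transformers (M-BERT);
ruBERT;
two versions of the Multilingual Universal Sentence Encoder (M-USE).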

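The table below shows the classification quality of the trained models on the test set (20% of the data). All fine-tuned language models exceeded the baselines in precision, recall, and F₁. ruBERT achieved F₁ = 92.20%, the best result.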

Binary classification of toxic Russian-language comments:

System	P	R	F₁
MNB	87.01%	81.22%	83.21%
BiLSTM	86.56%	86.65%	86.59%
M-BERT_BASE-Toxic	91.19%	91.10%	91.15%
ruBERT-Toxic	91.91%	92.51%	92.20%
M-USE_CNN-Toxic	89.69%	90.14%	89.91%
M-USE_Trans-Toxic	90.85%	91.92%	91.35%
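The P, R, and F₁ columns are precision, recall, and their harmonic mean. The sketch below computes them per class and as a macro-average over both classes. The source does not state which averaging was used; in the MNB row the reported F₁ differs from the harmonic mean of the reported P and R (about 84.0%), which would be consistent with some form of per-class averaging, but that is an inference. The function names and toy labels are illustrative.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 with `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Unweighted mean of the per-class precision, recall and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [precision_recall_f1(y_true, y_pred, c) for c in classes]
    return tuple(sum(m[i] for m in per_class) / len(classes) for i in range(3))


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
toxic = precision_recall_f1(y_true, y_pred, positive=1)
macro = macro_average(y_true, y_pred)
print(toxic, macro)
```

Reporting both views is useful on a dataset like this one, where roughly a third of the comments are toxic: the toxic-class scores show how well the minority class is caught, while the macro-average weights both classes equally.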

6. Conclusion

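In this article, we fine-tuned two versions of the Multilingual Universal Sentence Encoder [48], multilingual Bidirectional Encoder Representations from Transformers [13], and ruBERT [22] to identify toxic Russian-language comments. The fine-tuned ruBERT-Toxic model achieved F₁ = 92.20%, the best classification result.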

The resulting M-BERT and M-USE models are available on GitHub.

References


Aken, B. van et al.: Challenges for toxic comment classification: An in-depth error analysis. In: Proceedings of the 2nd workshop on abusive language online (ALW2). pp. 33–42. Association for Computational Linguistics, Brussels, Belgium (2018).
Andrusyak, B. et al.: Detection of abusive speech for mixed sociolects of russian and ukrainian languages. In: The 12th workshop on recent advances in slavonic natural languages processing, RASLAN 2018, karlova studanka, czech republic, december 7–9, 2018. pp. 77–84 (2018).
Basile, V. et al.: SemEval-2019 task 5: Multilingual detection of hate speech against immigrants and women in twitter. In: Proceedings of the 13th international workshop on semantic evaluation. pp. 54–63. Association for Computational Linguistics, Minneapolis, Minnesota, USA (2019).
Baziotis, C. et al.: DataStories at SemEval-2017 task 4: Deep LSTM with attention for message-level and topic-based sentiment analysis. In: Proceedings of the 11th international workshop on semantic evaluation (SemEval-2017). pp. 747–754. Association for Computational Linguistics, Vancouver, Canada (2017).
Belchikov, A.: Russian language toxic comments, https://www.kaggle.com/blackmoon/russian-language-toxic-comments.
Bessudnov, A., Shcherbak, A.: Ethnic discrimination in multi-ethnic societies: Evidence from russia. European Sociological Review. (2019).
Biryukova, E. V. et al.: Reader’s comment in on-line magazine as a genre of internet discourse (by the material of the german and russian languages). Philological Sciences. Issues of Theory and Practice. 12, 1, 79–82 (2018).
Bodrunova, S. S. et al.: Who’s bad? Attitudes toward resettlers from the post-soviet south versus other nations in the russian blogosphere. International Journal of Communication. 11, 23 (2017).
Cer, D. M. et al.: Universal sentence encoder. ArXiv. abs/1803.11175, (2018).
Chernyak, E. et al.: Char-rnn for word stress detection in east slavic languages. CoRR. abs/1906.04082, (2019).
Cohen, J.: A coefficient of agreement for nominal scales. Educational and psychological measurement. 20, 1, 37–46 (1960).
Dawid, A. P., Skene, A. M.: Maximum likelihood estimation of observer error-rates using the em algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics). 28, 1, 20–28 (1979).
Devlin, J. et al.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers). pp. 4171–4186. Association for Computational Linguistics, Minneapolis, Minnesota (2019).
d’Sa, A. G. et al.: BERT and fastText embeddings for automatic detection of toxic speech. In: SIIE 2020-information systems and economic intelligence. (2020).
Founta, A. M. et al.: A unified deep learning architecture for abuse detection. In: Proceedings of the 10th acm conference on web science. pp. 105–114. Association for Computing Machinery, New York, NY, USA (2019).
Frank, E., Bouckaert, R.: Naive bayes for text classification with unbalanced classes. In: Fürnkranz, J. et al. (eds.) Knowledge discovery in databases: PKDD 2006. pp. 503–510. Springer Berlin Heidelberg, Berlin, Heidelberg (2006).
Gordeev, D.: Detecting state of aggression in sentences using cnn. In: International conference on speech and computer. pp. 240–245. Springer (2016).
Indurthi, V. et al.: FERMI at SemEval-2019 task 5: Using sentence embeddings to identify hate speech against immigrants and women in twitter. In: Proceedings of the 13th international workshop on semantic evaluation. pp. 70–74. Association for Computational Linguistics, Minneapolis, Minnesota, USA (2019).
Koltsova, O. et al.: Finding and analyzing judgements on ethnicity in the russian-language social media. AoIR Selected Papers of Internet Research. (2017).
Kudo, T., Richardson, J.: SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing. In: Proceedings of the 2018 conference on empirical methods in natural language processing: System demonstrations. pp. 66–71. Association for Computational Linguistics, Brussels, Belgium (2018).
Kumar, R. et al. eds: Proceedings of the first workshop on trolling, aggression and cyberbullying (TRAC-2018). Association for Computational Linguistics, Santa Fe, New Mexico, USA (2018).
Kuratov, Y., Arkhipov, M.: Adaptation of deep bidirectional multilingual transformers for Russian language. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference «Dialogue». pp. 333–340. RSUH, Moscow, Russia (2019).
Lenhart, A. et al.: Online harassment, digital abuse, and cyberstalking in america. Data & Society Research Institute (2016).
Liu, P. et al.: NULI at SemEval-2019 task 6: Transfer learning for offensive language detection using bidirectional transformers. In: Proceedings of the 13th international workshop on semantic evaluation. pp. 87–91. Association for Computational Linguistics, Minneapolis, Minnesota, USA (2019).
Mikolov, T. et al.: Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems—volume 2. pp. 3111–3119. Curran Associates Inc., Red Hook, NY, USA (2013).
Mishra, P. et al.: Abusive language detection with graph convolutional networks. In: Proceedings of the 2019 conference of the north american chapter of the association for computational linguistics: Human language technologies, volume 1 (long and short papers). pp. 2145–2150 (2019).
Mishra, S., Mishra, S.: 3Idiots at HASOC 2019: Fine-tuning transformer neural networks for hate speech identification in indo-european languages. In: Working notes of FIRE 2019—forum for information retrieval evaluation, kolkata, india, december 12–15, 2019. pp. 208–213 (2019).
Nikolov, A., Radivchev, V.: Nikolov-radivchev at SemEval-2019 task 6: Offensive tweet classification with BERT and ensembles. In: Proceedings of the 13th international workshop on semantic evaluation. pp. 691–695. Association for Computational Linguistics, Minneapolis, Minnesota, USA (2019).
Panchenko, A. et al.: RUSSE’2018: A Shared Task on Word Sense Induction for the Russian Language. In: Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference «Dialogue». pp. 547–564. RSUH, Moscow, Russia (2018).
Paraschiv, A., Cercel, D.-C.: UPB at germeval-2019 task 2: BERT-based offensive language classification of german tweets. In: Preliminary proceedings of the 15th conference on natural language processing (konvens 2019). Erlangen, germany: German society for computational linguistics & language technology. pp. 396–402 (2019).
Peters, M. et al.: Deep contextualized word representations. In: Proceedings of the 2018 conference of the north American chapter of the association for computational linguistics: Human language technologies, volume 1 (long papers). pp. 2227–2237. Association for Computational Linguistics, New Orleans, Louisiana (2018).
Ponomareva, M. et al.: Automated word stress detection in Russian. In: Proceedings of the first workshop on subword and character level models in NLP. pp. 31–35. Association for Computational Linguistics, Copenhagen, Denmark (2017).
Potapova, R., Komalova, L.: Lexico-semantical indices of «deprivation–aggression» modality correlation in social network discourse. In: International conference on speech and computer. pp. 493–502. Springer (2017).
Prabowo, F. A. et al.: Hierarchical multi-label classification to identify hate speech and abusive language on indonesian twitter. In: 2019 6th international conference on information technology, computer and electrical engineering (icitacee). pp. 1–5 (2019).
Risch, J., Krestel, R.: Toxic comment detection in online discussions. In: Deep learning-based approaches for sentiment analysis. pp. 85–109. Springer (2020).
Risch, J. et al.: HpiDEDIS at germeval 2019: Offensive language identification using a german bert model. In: Preliminary proceedings of the 15th conference on natural language processing (konvens 2019). Erlangen, germany: German society for computational linguistics & language technology. pp. 403–408 (2019).
Rubtsova, Y.: A method for development and analysis of short text corpus for the review classification task. In: Proceedings of conferences Digital Libraries: Advanced Methods and Technologies, Digital Collections (RCDL’2013). pp. 269–275 (2013).
Ruiter, D. et al.: LSV-uds at HASOC 2019: The problem of defining hate. In: Working notes of FIRE 2019—forum for information retrieval evaluation, kolkata, india, december 12–15, 2019. pp. 263–270 (2019).
Sambasivan, N. et al.: «They don’t leave us alone anywhere we go»: Gender and digital abuse in south asia. In: Proceedings of the 2019 chi conference on human factors in computing systems. Association for Computing Machinery, New York, NY, USA (2019).
Sang-Bum Kim et al.: Some effective techniques for naive bayes text classification. IEEE Transactions on Knowledge and Data Engineering. 18, 11, 1457–1466 (2006).
Shkapenko, T., Vertelova, I.: Hate speech markers in internet comments to translated articles from polish media. Political Linguistics. 70, 4, Pages 104–111 (2018).
Strus, J. M. et al.: Overview of germeval task 2, 2019 shared task on the identification of offensive language. Presented at the (2019).
Sun, C. et al.: How to fine-tune bert for text classification? In: Sun, M. et al. (eds.) Chinese computational linguistics. pp. 194–206. Springer International Publishing, Cham (2019).
Ustalov, D., Igushkin, S.: Sense inventory alignment using lexical substitutions and crowdsourcing. In: 2016 international fruct conference on intelligence, social media and web (ismw fruct). (2016).
Vaswani, A. et al.: Attention is all you need. In: Proceedings of the 31st international conference on neural information processing systems. pp. 6000–6010. Curran Associates Inc., Red Hook, NY, USA (2017).
Wu, Y. et al.: Google’s neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144. (2016).
Yang, F. et al.: Exploring deep multimodal fusion of text and photo for hate speech classification. In: Proceedings of the third workshop on abusive language online. pp. 11–18. Association for Computational Linguistics, Florence, Italy (2019).
Yang, Y. et al.: Multilingual universal sentence encoder for semantic retrieval. CoRR. abs/1907.04307, (2019).
Yang, Z. et al.: Hierarchical attention networks for document classification. In: Proceedings of the 2016 conference of the north American chapter of the association for computational linguistics: Human language technologies. pp. 1480–1489. Association for Computational Linguistics, San Diego, California (2016).
