Analysing developers' sentiments of code reviews: an empirical study


dc.contributor.advisor	Iqbal, Dr. Anindya
dc.contributor.author	Toufique Ahmed
dc.date.accessioned	2017-07-11T09:40:44Z
dc.date.available	2017-07-11T09:40:44Z
dc.date.issued	2016-12
dc.identifier.uri	http://lib.buet.ac.bd:8080/xmlui/handle/123456789/4529
dc.description.abstract	The sentiment (i.e., a general positive or negative attitude) towards another person, entity, or event significantly influences a person’s decision-making process. It is one of the most important factors influencing interactions among stakeholders across many application domains. Hence, various approaches have been proposed to detect sentiment accurately. However, it has not been formally analyzed whether sentiments expressed during software development activities, such as peer code review, can affect the outcomes of those activities. The objective of this study is to identify the factors influencing review comments and the impact of sentiments on the outcomes of the associated review requests. To this end, we manually rated 1000 review comments to build a training dataset and used that dataset to evaluate eight sentiment analysis techniques. We found that a model based on Gradient Tree Boosting (GTB), a supervised learning algorithm, provides the best accuracy in distinguishing among positive, negative, and neutral review comments. To the best of our knowledge, this is the first approach to apply supervised learning methods in the context of code review. We achieved accuracy as high as 74% in sentiment detection, which is significantly higher than that of existing lexicon-based analyzers (50% accuracy). We have also validated the model with human raters. Using our GTB-based model, we classified 10.7 million review comments from 10 popular open source projects. The results suggest that larger code reviews (e.g., measured in terms of number of files or code churn) are more likely to receive negative review comments, and that those negative review comments may not only increase the review interval (i.e., the time to complete a code review) but also decrease the code acceptance rate. Based on these findings, we recommend that developers avoid submitting large code review requests and avoid authoring negative review comments.
The results also suggest that reviewers who author a higher number of negative review comments are likely to suffer from higher review intervals and lower acceptance rates. We also found that core developers are likely to author more negative review comments than peripheral developers. However, with respect to receiving negative review comments, we did not find any difference between core developers and peripheral developers.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering (CSE)	en_US
dc.subject	Project management-Computer programs	en_US
dc.title	Analysing developers' sentiments of code reviews: an empirical study	en_US
dc.type	Thesis-MSc	en_US
dc.contributor.id	1014052015 P	en_US
dc.identifier.accessionNumber	115101
dc.contributor.callno	005.1/TOU/2016	en_US