Proposed Model for Opinion Mining in Arabic Social Media Networks

Hamza, Taher; M. Al-Zoghby, Aya; Salama, Reem

doi:10.21608/mjcis.2020.321073

Document Type: Original Research Article

Authors

Computer Science Department, Faculty of Computer & Information systems, Mansoura University, Egypt

Abstract

Many studies classify opinions as positive or negative and ignore the neutral class. Ignoring the neutral class in opinion mining has been shown to be an inaccurate practice. Therefore, some previous studies recommended including this third class in future work to obtain better performance and higher accuracy. This work investigates social opinion mining on Arabic Twitter. It has three aims: to use neutral training examples in learning, because they enable a better separation between positive and negative examples; to improve the pre-processing stage for Arabic, because the Arabic language is itself a major challenge; and to develop a model for opinion mining in Arabic social media networks. The proposed model uses a number of classifiers to classify tweets. It applies machine learning to data collected from Twitter and classifies tweets on two levels. In the first level, the tweets are classified as positive or negative. In the second, neutral samples are used in the classification process to help distinguish between positive and negative samples. Text pre-processing is a key factor in sentiment analysis and classification, especially for morphologically rich languages such as Arabic. When the tweets were pre-processed with different approaches, the results showed different levels of accuracy, and they also showed the importance of using neutral training examples to facilitate learning. Several experiments were conducted using 2,000 labeled tweets (1,000 positive and 1,000 negative) on different subjects. According to the outcomes of these experiments, the proposed model improves classification results compared with some previous works.
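
To illustrate the kind of pre-processing the abstract refers to, the following is a minimal Python sketch of Arabic tweet normalization. The abstract does not list the steps used in the proposed model, so every rule here (dropping URLs and mentions, removing diacritics and tatweel, unifying alef variants, collapsing elongated letters) is an assumption based on common practice, and the function name `normalize_tweet` is hypothetical.

```python
import re

# Assumed rules for illustration only; not the paper's actual pipeline.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
DIACRITICS = re.compile(r"[\u064B-\u0652]")           # fathatan through sukun
TATWEEL = "\u0640"                                    # elongation mark
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")   # alef with madda/hamza
ELONGATION = re.compile(r"([\u0621-\u064A])\1{2,}")   # 3+ repeats of a letter


def normalize_tweet(text: str) -> str:
    """Return a normalized copy of an Arabic tweet (hypothetical helper)."""
    text = URL_OR_MENTION.sub(" ", text)              # drop links and @mentions
    text = text.replace("#", " ").replace("_", " ")   # keep hashtag words only
    text = DIACRITICS.sub("", text)                   # strip short-vowel marks
    text = text.replace(TATWEEL, "")                  # strip tatweel
    text = ALEF_VARIANTS.sub("\u0627", text)          # map variants to bare alef
    text = ELONGATION.sub(r"\1", text)                # collapse repeated letters
    return " ".join(text.split())                     # squeeze whitespace


if __name__ == "__main__":
    print(normalize_tweet("@user \u062C\u0645\u064A\u064A\u064A\u064A\u0644 http://t.co/x"))
```

Each rule is a separate step so that one can be switched off and the classifier retrained, which is one way to observe how different pre-processing choices change accuracy.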

Keywords