
2024 | Original Paper | Book Chapter

Baheta: Balanced and Unbalanced Dataset in Arabic Clickbait Detection Using a Deep Learning Model (LSTM)

Written by: Batool Alharbi, Razan Alhanaya, Deem Alqarawi, Ruwaidah Alnejaidi

Published in: Advances in Emerging Information and Communication Technology

Publisher: Springer Nature Switzerland


Abstract

The term “clickbait” refers to content written with the express intention of grabbing the reader’s attention. It has become an annoyance for social media users because clickbait titles are deceptive. Many studies detect clickbait using deep learning (DL) and machine learning (ML) models. However, only a few studies have addressed clickbait detection in Arabic titles, and all of them used ML techniques. This gap motivated our research. In our proposed work, Baheta (an Arabic word synonymous with lying), we use a deep learning model, long short-term memory (LSTM), to identify clickbait in Arabic headlines. To extract features from the text, we used Word2vec. We train the model on two Arabic datasets: the first is unbalanced, and the second is a balanced dataset created by merging the unbalanced dataset with a fake news dataset. With Word2vec features, the model achieved its best result, a Macro-F value of 0.79, on the unbalanced (raw) dataset. The LSTM model performed better on the unbalanced dataset, where its Macro-F value was 0.02 higher than on the balanced dataset.
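The abstract reports Macro-F without defining it. The sketch below assumes the usual meaning, the unweighted mean of per-class F1 scores, and shows why that metric is informative when classes are unbalanced. The function names `macro_f1` and `accuracy` and the toy labels are illustrative assumptions. They are not taken from the chapter, and they do not reproduce its data or results.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed meaning of Macro-F)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up toy labels: 8 non-clickbait headlines (0) and 2 clickbait headlines (1).
y_true = [0] * 8 + [1] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.2f}")
```

On this toy input the majority-class predictor scores high on accuracy but low on macro-F1, because the minority class contributes an F1 of zero to the average. That is the usual reason for preferring a macro-averaged score when comparing models trained on unbalanced and balanced data.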


Metadata
Title
Baheta: Balanced and Unbalanced Dataset in Arabic Clickbait Detection Using a Deep Learning Model (LSTM)
Written by
Batool Alharbi
Razan Alhanaya
Deem Alqarawi
Ruwaidah Alnejaidi
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-53237-5_1
