2024 | Original Paper | Book Chapter

Baheta: Balanced and Unbalanced Dataset in Arabic Clickbait Detection Using a Deep Learning Model (LSTM)

Authors: Batool Alharbi, Razan Alhanaya, Deem Alqarawi, Ruwaidah Alnejaidi

Published in: Advances in Emerging Information and Communication Technology

Publisher: Springer Nature Switzerland

Abstract

The term "clickbait" refers to content written with the express intention of grabbing the reader's attention. It has become an annoyance for social media users because clickbait titles are deceptive. Many studies detect clickbait using deep learning (DL) and machine learning (ML) models. However, only a few studies have addressed clickbait detection in Arabic titles, and all of them used ML techniques. This gap motivated our research. In our proposed work, Baheta (an Arabic word synonymous with lying), we suggest using a deep learning model, long short-term memory (LSTM), to identify clickbait in Arabic headlines. We used Word2vec to extract features from the text. We train the model on two Arabic datasets: the first is an unbalanced dataset, and the second is a balanced dataset created by merging the unbalanced dataset with a fake news dataset. Word2vec provided the model with the best results, with a Macro-F value of 0.79 on the unbalanced (raw) dataset. The LSTM model performed better on the unbalanced dataset, where its Macro-F value was 0.02 higher than that obtained by the LSTM on the balanced dataset.
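The abstract reports its results as Macro-F values. As a minimal sketch, assuming Macro-F here means the unweighted mean of the per-class F1 scores (the usual definition, though the abstract does not spell it out), the metric can be computed as below. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the chapter's data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed definition of Macro-F)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        # Count true positives, false positives and false negatives for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical unbalanced toy set: 1 = clickbait, 0 = not clickbait.
y_true = [0] * 8 + [1] * 2
y_pred_model = [0] * 8 + [0, 1]   # one clickbait headline missed
y_pred_majority = [0] * 10        # always predicts the majority class

print(macro_f1(y_true, y_pred_model))
print(macro_f1(y_true, y_pred_majority))
```

In this toy example the majority-class predictor still reaches 80% accuracy, yet its Macro-F is far lower than the model's because the minority class contributes an F1 of zero. That sensitivity to the minority class is one reason a class-averaged score is a common choice when comparing balanced and unbalanced datasets.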

Title: Baheta: Balanced and Unbalanced Dataset in Arabic Clickbait Detection Using a Deep Learning Model (LSTM)
Authors: Batool Alharbi, Razan Alhanaya, Deem Alqarawi, Ruwaidah Alnejaidi
Publisher: Springer Nature Switzerland
Book: Advances in Emerging Information and Communication Technology
Print ISBN: 978-3-031-53236-8
Electronic ISBN: 978-3-031-53237-5
Copyright year: 2024
DOI: https://doi.org/10.1007/978-3-031-53237-5_1
