Using Python to detect fake news about covid-19: challenges and possibilities
DOI:
https://doi.org/10.29397/reciis.v16i2.3253
Keywords:
Fake news, Artificial Intelligence, Covid-19, Python, Misinformation.
Abstract
This work reports strategies for collecting a dataset in Portuguese for training Artificial Intelligence models, written in Python, to automatically identify fake news about covid-19 disseminated during the pandemic. We analyze a fake news detection method based on a Recurrent Neural Network and supervised learning. As the training and validation dataset, we selected a corpus of 7,200 texts collected from websites and news agencies by Monteiro et al. (2018), each previously labeled as true or false. The trained model was then used to detect fake news about covid-19 in a set of news items collected and classified by the authors of this work. The hit rate was 70%.
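As a minimal illustration of how a hit rate like the one reported above can be computed, the sketch below compares a classifier's predicted labels against manually assigned labels. The function name `hit_rate` and the ten example labels are hypothetical and chosen only for illustration; they are not the authors' code or data.

```python
from typing import Sequence


def hit_rate(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Return the fraction of items whose predicted label matches the actual label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labeled item is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical labels for ten news items (illustrative only, not the study's data).
actual = ["fake", "true", "fake", "fake", "true", "true", "fake", "true", "fake", "true"]
predicted = ["fake", "true", "true", "fake", "true", "fake", "fake", "true", "true", "true"]

print(f"Hit rate: {hit_rate(predicted, actual):.0%}")
```

In this toy example seven of the ten predictions match the manual labels. In practice the same comparison would be run over the full set of collected and classified covid-19 news items.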
References
ALVES, Marco Antônio Sousa; MACIEL, Emanuella Ribeiro Halfeld. O fenômeno das fake news: definição, combate e contexto. Internet & Sociedade, Rio de Janeiro, v. 1, n. 1, p. 144-171, 2020. Disponível em: https://revista.internetlab.org.br/o-fenomeno-das-fake-news-definicao-combate-e-contexto/. Acesso em: 15 nov. 2021.
AMPER ENERGIA HUMANA. We Are Social e HootSuite: Digital 2021 [resumo e relatório completo]. In: AMPER ENERGIA HUMANA. Amper: marketing e comunicação. São Paulo, 03 maio 2022. Disponível em: https://www.amper.ag/post/we-are-social-e-hootsuite-digital-2021-resumo-e-relatório-completo. Acesso em: 16 maio 2021.
AVAAZ. IBOPE: 1 em cada 4 brasileiros pode não se vacinar contra a covid-19. [S. l.]: Avaaz, 07 set. 2020. Disponível em: https://secure.avaaz.org/campaign/po/brasileiros_nao_vacinar_covid/. Acesso em: 14 nov. 2021.
AVAAZ. O Brasil está sofrendo uma infodemia de covid-19. [S. l.]: Avaaz, 04 maio 2020. Disponível em: https://avaazimages.avaaz.org/brasil_infodemia_coronavirus.pdf. Acesso em: 25 maio 2022.
BOSELLI, Marco Aurélio. Reciis. Uberlândia, 25 maio 2022. Código Python. Disponível em: https://github.com/maboselli/reciis. Acesso em: 14 jun. 2022.
BRUCK, Mozahir Salomão. O jornalismo diante de novos cenários sociais: a imprensa e o surgimento da aids e do crack. São Paulo: Intermeios, 2015.
CHENG, Raymond. Text preprocessing with NLTK. In: TDS Editors; HUBERMAN, Ben; Kindig Caitlin (ed.). Towards Data Science, [s. l.], 29 jun. 2020. Disponível em: https://towardsdatascience.com/nlp-preprocessingwith-nltk-3c04ee00edc0. Acesso em: 28 nov. 2021.
CHOLLET, François. Deep learning with Python. New York: Manning Publications, 2018.
COLLINS COBUILD. Fake news. In: COLLINS. Collins Dictionary. Nova York: Harper Collins, c2022. Disponível em: https://www.collinsdictionary.com/dictionary/english/fake-news. Acesso em: 01 fev. 2022.
FERREIRA, Fernanda Vasques. O papel do factual nos processos de agendamento e de enquadramento no telejornalismo. 2018. 438 f., il. Tese (Doutorado em Comunicação) – Universidade de Brasília, Brasília, 2018. Disponível em: https://repositorio.unb.br/handle/10482/33073. Acesso em: 16 maio 2021.
GOMES, Wilson. O que são fake news?. Brasília, DF: INCT, 2020. 1 vídeo (38 min). Publicado pelo canal INCT em Democracia Digital. Disponível em: https://www.youtube.com/watch?v=8tvJ4cMtYXY. Acesso em: 16 maio 2021.
GOODFELLOW, Ian; BENGIO, Yoshua; COURVILLE, Aaron. Deep learning. Cambridge: The MIT Press, 2016.
HASSON, Erez. Bad bot report 2021: the pandemic of the internet. In: IMPERVA, 13 abr. 2021. Disponível em: https://www.imperva.com/blog/bad-bot-report-2021-the-pandemic-of-the-internet/. Acesso em: 23 jan. 2022.
KERTYSOVA, Katarina. Artificial Intelligence and Disinformation: how AI changes the way disinformation is produced, disseminated, and can be countered. Security and Human Rights, Leiden, v. 29, p. 55-81, 2018. DOI: http://dx.doi.org/10.1163/18750230-02901005. Disponível em: https://brill.com/view/journals/shrs/29/1-4/article-p55_55.xml. Acesso em: 14 jun. 2022.
KHAN, Junaed Younus et al. A benchmark study on Machine Learning Models for online fake news detection. arXiv, Ithaca, 12 maio 2019. Disponível em: https://arxiv.org/abs/1905.04749. Acesso em: 06 jun. 2022.
MAYERS, Gabriel. Detecting fake news using Machine Learning. In: TEAM AV (ed.). Analytics Vidhya, [s. l.], 19 jun. 2020. Disponível em: https://medium.com/analytics-vidhya/detecting-fake-news-using-machinelearning-95efefab08e4. Acesso em: 07 mar. 2022.
MELLO, Patrícia Campos. A máquina do ódio: notas de uma repórter sobre fake news e violência digital. São Paulo: Companhia das Letras, 2020.
MONTEIRO, Rafael A. et al. Contributions to the study of fake news in Portuguese: new corpus and automatic detection results. In: COMPUTATIONAL PROCESSING OF THE PORTUGUESE LANGUAGE, 13., 24-26 set. 2018, Canela. Proceedings […]. Cham: Springer, 2018. p. 324-334. (Lecture Notes in Computer Science, v. 11122). DOI: http://dx.doi.org/10.1007/978-3-319-99722-3_33. Disponível em: https://github.com/roneysco/Fake.br-Corpus. Acesso em: 27 fev. 2022.
NIELSEN, Michael A. Neural networks and deep learning. [S. l.], Determination Press, 2015. Disponível em: http://neuralnetworksanddeeplearning.com/index.html. Acesso em: 27 fev. 2022.
PENNAFORT, Roberta. É #FAKE que foto mostre caixão enterrado vazio para inflar dados de mortos por coronavírus em Manaus. G1, [s. l.], 30 abr. 2020. Fato ou Fake. Disponível em: https://g1.globo.com/fatoou-fake/coronavirus/noticia/2020/04/30/e-fake-que-foto-mostre-caixao-enterrado-vazio-para-inflar-dados-demortos-por-coronavirus-em-manaus.ghtml. Acesso em: 29 maio 2022.
PENNYCOOK, Gordon; RAND, David G. Lazy, not biased: susceptibility to partisan fake news is better explained by lack of reasoning than by motivated reasoning. Cognition, [s. l.], v. 188, p. 39-50, 2019. DOI: https://doi.org/10.1016/j.cognition.2018.06.011. Disponível em: https://www.sciencedirect.com/science/article/abs/pii/S001002771830163X. Acesso em: 16 maio 2021.
PEREIRA, Denilson Alves. A survey of sentiment analysis in the portuguese language. Artificial Intelligence Review, [s. l.], v. 54, n. 2, p. 1087-1115, 2020. DOI: https://doi.org/10.1007/s10462-020-09870-1. Disponível em: https://link.springer.com/article/10.1007/s10462-020-09870-1. Acesso em: 25 maio 2022.
RECUERO, Raquel et al. Desinformação, mídia social e covid-19 no Brasil: relatório, resultados e estratégias de combate. Pelotas: MIDIARS – Grupo de Pesquisa em Mídia Discurso e Análise de Redes Sociais, 2021. Relatório de pesquisa. Disponível em: https://wp.ufpel.edu.br/midiars/files/2021/05/Desinformac%CC%A7a%CC%83o-covid-midiars-2021-1.pdf. Acesso em: 14 jun. 2022.
RECUERO, Raquel; SOARES, Felipe Bonow; GRUZD, Anatoliy. Hyperpartisanship, Disinformation and Political Conversations on Twitter: The Brazilian Presidential Election of 2018. In: INTERNATIONAL AAAI CONFERENCE ON WEB AND SOCIAL MEDIA, 14., 2020, Atlanta. Proceedings […]. Atlanta: AAAI Digital Library, 2020. Disponível em: https://ojs.aaai.org//index.php/ICWSM/article/view/7324. Acesso em: 14 jun. 2022.
RICHARDSON, Leonard. Beautiful Soup Documentation. [S. l.]: Leonard Richardson, c2020. Disponível em: https://www.crummy.com/software/BeautifulSoup/bs4/doc/. Acesso em: 06 jun. 2022.
SHARMA, Sagar. Activation functions in neural networks: Sigmoid, tanh, Softmax, ReLU, Leaky ReLU explained!!! In: TDS Editors; HUBERMAN, Ben; Kindig Caitlin (ed.). Towards Data Science, [s. l.], 6 set. 2017. Disponível em: https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6. Acesso em: 06 jun. 2022.
SILVA, Igor Fediczko; ARAÚJO, Rafael de Paula Aguiar. Campanhas políticas em tempos de hiperpolítica: um ensaio sobre Peter Sloterdijk e a campanha de 2018. Ponto e Vírgula, São Paulo, n. 26, p. 138-145, 2019. Disponível em: https://revistas.pucsp.br/index.php/pontoevirgula/article/view/51519/34074. Acesso em: 28 set. 2021.
SLOTERDIJK, Peter. No mesmo barco: ensaio sobre a hiperpolítica. São Paulo: Estação Liberdade, 1999.
SOUSA JUNIOR, João Henriques de; PETROLL, Martin de La Martinière; ROCHA, Rudimar Antunes da. Fake news e o comportamento on-line dos eleitores nas redes sociais durante a campanha presidencial brasileira de 2018. In: SEMINÁRIOS EM ADMINISTRAÇÃO, 22., 6-8 nov, São Paulo. Anais [...], São Paulo: SemeAd, 2019. Disponível em: https://login.semead.com.br/22semead/anais/resumo.php?cod_trabalho=501. Acesso em: 20 dez. 2021.
SOUZA, Frederico Dias; SOUZA FILHO, João Batista de Oliveira. Sentiment analysis on Brazilian Portuguese user reviews. arXiv, Ithaca, 10 dez. 2021. Disponível em: https://arxiv.org/pdf/2112.05459.pdf. Acesso em: 20 dez. 2021.
VARÃO, Rafiza. Há alguma novidade na ideia de fake news? In: SOS IMPRENSA. Blog SOS Imprensa. Brasília, DF, 18 out. 2017. Disponível em: https://sosimprensa.wordpress.com/2017/10/18/ha-algumanovidade-na-ideia-de-fake-news/. Acesso em: 03 jan. 2022.
VIRAHONDA, Sergio. An easy tutorial about sentiment analysis with deep learning and Keras. In: TDS Editors; HUBERMAN, Ben; Kindig Caitlin (ed.). Towards Data Science, [s. l.], 08 out. 2020. Disponível em: https://towardsdatascience.com/an-easy-tutorial-about-sentiment-analysis-with-deep-learning-and-keras-2bf52b9cba91. Acesso em: 03 jan. 2022.
VOLKOFF, Vladimir. Pequena história da desinformação: do cavalo de Troia à internet. Lisboa: Editorial Notícias, 2000.
WARDLE, Claire. Fake News. It’s complicated. In: FIRST DRAFT. First Draft Footnotes, [s. l.], 16 fev. 2017. Disponível em: https://medium.com/1st-draft/fake-news-its-complicated-d0f773766c79. Acesso em: 14 maio 2021.
WÓJCIK, Rafał. Unsupervised sentiment analysis. In: TDS Editors; HUBERMAN, Ben; Kindig Caitlin (ed.). Towards Data Science, [s. l.], 26 nov. 2019. Disponível em: https://towardsdatascience.com/unsupervised-sentiment-analysis-a38bf1906483. Acesso em: 25 maio 2022.
License
Author’s rights: The author retains unrestricted rights over his work.
Rights to reuse: Reciis adopts the Creative Commons License, CC BY-NC non-commercial attribution according to the Policy on Open Access to Knowledge by Oswaldo Cruz Foundation. With this license, access, download, copy, print, share, reuse, and distribution of articles is allowed, provided that it is for non-commercial use and with source citation, granting proper authorship credits and reference to Reciis. In such cases, no permission is required from the authors or editors.
Rights of authors' deposit / self-archiving: The authors are encouraged to deposit the published version, along with the link to their article in Reciis, in institutional repositories.