A Inteligência Artificial e os desafios da Ciência Forense Digital no século XXI

Authors

DOI:

https://doi.org/10.1590/s0103-4014.2021.35101.009

Keywords:

Digital forensics, Artificial Intelligence, Machine learning, Social media, Fake news

Abstract

Digital Forensics emerged from the need to perform forensic tasks in the digital age. Its most recent challenges stem from the popularization of social media and have been intensified by advances in Artificial Intelligence. The massive volume of data generated on social media has made forensic analyses more complex, mainly because of improvements in computational models that can artificially create highly realistic content. In response, Artificial Intelligence techniques have been studied and used to process this volume of information. This paper discusses the challenges and opportunities associated with such methods, presents real case examples, examines the problems that arise when these approaches are used in sensitive contexts, and describes how the scientific community has addressed these topics. Finally, it outlines paths for future research.

Author Biographies

  • Rafael Padilha, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Contributed equally to the development of the article. Email: rafael.padilha@ic.unicamp.br. ORCID: https://orcid.org/0000-0003-1944-5475.

  • Antônio Theóphilo, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Contributed equally to the development of the article. Email: antonio.theophilo@ic.unicamp.br. ORCID: https://orcid.org/0000-0003-1408-0745.

  • Fernanda A. Andaló, Universidade Estadual de Campinas. Instituto da Computação

    Collaborating researcher at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: feandalo@ic.unicamp.br. ORCID: https://orcid.org/0000-0002-5243-0921.

  • Didier A. Vega-Oliveros, Universidade Estadual de Campinas. Instituto da Computação

    Postdoctoral researcher at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: davo@unicamp.br. ORCID: https://orcid.org/0000-0001-9569-3775.

  • João P. Cardenuto, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: phillipe.cardenuto@ic.unicamp.br. ORCID: https://orcid.org/0000-0002-8370-6329.

  • Gabriel Bertocco, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: gabriel.bertocco@ic.unicamp.br. ORCID: https://orcid.org/0000-0002-7701-7420.

  • José Nascimento, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: jose.nascimento@ic.unicamp.br. ORCID: https://orcid.org/0000-0003-3450-6029.

  • Jing Yang, Universidade Estadual de Campinas. Instituto da Computação

    Ph.D. student at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: jing.yang@ic.unicamp.br. ORCID: https://orcid.org/0000-0002-0035-3960.

  • Anderson Rocha, Universidade Estadual de Campinas. Instituto da Computação

    Associate professor at the Instituto da Computação, Universidade Estadual de Campinas (Unicamp). Email: anderson.rocha@ic.unicamp.br. ORCID: https://orcid.org/0000-0002-4236-8212.

Published

2021-04-30

Issue

Section

Artificial Intelligence

How to Cite

Padilha, R., Theóphilo, A., Andaló, F. A., Vega-Oliveros, D. A., Cardenuto, J. P., Bertocco, G., Nascimento, J., Yang, J., & Rocha, A. (2021). A Inteligência Artificial e os desafios da Ciência Forense Digital no século XXI. Estudos Avançados, 35(101), 113-138. https://doi.org/10.1590/s0103-4014.2021.35101.009