“Applications of Adversarial robustness”
Thursday March 9, 2023 | Time 16:30 CET
When we deploy models trained by standard training (ST), they work well on natural test data. However, those models cannot handle adversarial test data (also known as adversarial examples) that are algorithmically generated by adversarial attacks. An adversarial attack is an algorithm which applies specially designed tiny perturbations on natural data to transform them into adversarial data, in order to mislead a trained model and let it give wrong predictions. Adversarial robustness is aimed at improving the robust accuracy of trained models against adversarial attacks, which can be achieved by adversarial training (AT). What is AT? Given the knowledge that the test data may be adversarial, AT carefully simulates some adversarial attacks during training. Thus, the model has already seen many adversarial training data in the past, and hopefully it can generalize to adversarial test data in the future. AT has two purposes: (1) correctly classify the data (same as ST) and (2) make the decision boundary thick so that no data lie nearby the decision boundary. In this talk, I will introduce how to leverage adversarial attacks/training for evaluating/enhancing reliabilities of AI-powered tools.
Jingfeng Zhang is a researcher in RIKEN-AIP at “Imperfect Information Learning Team’’ supervised by Prof. Masashi Sugiyama. Prior to RIKEN-AIP, Jingfeng obtained his Ph.D. degree (in 2020) under Prof. Mohan Kankanhalli at School of Computing in the National University of Singapore, and his Bachelor’s Degree (in 2016) at Taishan College in Shandong University, China. Jingfeng is the receiver of Strategic Basic Research Programs ACT-X 2021-2023 funding, JSPS Grants-in-Aid for Scientific Research (KAKENHI), Early-Career Scientists 2022-2023, and the RIKEN Ohbu Award 2022. Jingfeng serves as a reviewer for prestigious ML conferences such as ICLR, ICML, NeurIPS, etc. Jingfeng’s long-term research interest is making artificial intelligence safe for human beings.