Multimodal Hateful Content Moderation

Internet memes have become ubiquitous in modern culture and communication, yet they possess a dangerous dual nature that transcends mere humor. While memes can be a light-hearted form of communication that strengthens social connections, a darker side emerges in the proliferation of hateful memes. These hateful digital expressions heighten social tensions, reinforce stereotypes, and spread misinformation.

Ming Shan’s research thesis focuses on developing data-driven vision-language approaches that identify and explain these hateful memes, aiding in both curbing and comprehending such content. Pairing data-driven approaches with feature importance analysis enhances transparency and explainability, elucidating the model’s decision-making process and helping users understand its outputs. This level of explainability is crucial for supporting human-in-the-loop systems and fostering human trust, both pivotal for model adoption in real-world content moderation systems. Equally important is ensuring that the moderation system itself remains impartial.
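To make the vision-language setup concrete, below is a minimal late-fusion sketch of a hateful-meme classifier. It assumes a CLIP backbone from HuggingFace transformers; the thesis’s actual architecture, checkpoint, and training data are not given in the text, so MemeClassifier, its fusion head, and the blank placeholder image are purely illustrative.

```python
# A minimal late-fusion sketch of a meme classifier, assuming a CLIP
# backbone; the thesis's actual architecture is not specified here.
import torch
import torch.nn as nn
from PIL import Image
from transformers import CLIPModel, CLIPProcessor


class MemeClassifier(nn.Module):
    """Scores an (image, caption) pair for hatefulness via late fusion."""

    def __init__(self, clip_name: str = "openai/clip-vit-base-patch32"):
        super().__init__()
        self.clip = CLIPModel.from_pretrained(clip_name)
        dim = self.clip.config.projection_dim  # 512 for this checkpoint
        # Concatenate image and text embeddings, then score hatefulness.
        self.head = nn.Sequential(
            nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 1)
        )

    def forward(self, pixel_values, input_ids, attention_mask):
        img = self.clip.get_image_features(pixel_values=pixel_values)
        txt = self.clip.get_text_features(
            input_ids=input_ids, attention_mask=attention_mask
        )
        return self.head(torch.cat([img, txt], dim=-1))  # hate logit


processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model = MemeClassifier().eval()

# Placeholder inputs: a blank image standing in for a real meme.
inputs = processor(
    text=["example meme caption"],
    images=Image.new("RGB", (224, 224)),
    return_tensors="pt",
    padding=True,
)
with torch.no_grad():
    logit = model(
        inputs["pixel_values"], inputs["input_ids"], inputs["attention_mask"]
    )
prob = torch.sigmoid(logit)  # probability the meme is hateful
```

Late fusion keeps the two encoders separable, which is convenient for the per-modality analysis described next.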

Therefore, he uses feature importance and gradient-backpropagation analysis techniques to identify biases and to understand how the visual and textual modalities interact within models, supporting the development of a robust moderation system (see the sketch below). His work highlights the significance of data-driven vision-language methodologies in addressing the multifaceted impact of internet memes on modern society, thereby contributing to online safety and trust in digital communication.
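The sketch below shows one common way to realise gradient-backpropagation attribution: backpropagate the hate logit to each modality’s features and compare gradient-times-input saliency norms to estimate how much the image versus the text drove a prediction. The fusion network and feature tensors are stand-ins (in practice the features would come from the encoders of a classifier like the one above); the thesis’s exact attribution method is not detailed in the text.

```python
# A hedged sketch of gradient-backpropagation attribution: compare the
# gradient magnitude flowing into each modality's features to estimate
# how much image vs. text drove a prediction. The model and feature
# sizes are illustrative placeholders, not the thesis's exact setup.
import torch
import torch.nn as nn

torch.manual_seed(0)

dim = 512
fusion = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, 1))

# Stand-ins for encoder outputs; in practice these come from the
# vision and text encoders of the meme classifier.
img_feat = torch.randn(1, dim, requires_grad=True)
txt_feat = torch.randn(1, dim, requires_grad=True)

logit = fusion(torch.cat([img_feat, txt_feat], dim=-1))
logit.backward()  # backpropagate the hate logit to both modalities

# Gradient x input is a simple saliency score; its per-modality norm
# indicates which modality the decision leaned on.
img_score = (img_feat.grad * img_feat).norm().item()
txt_score = (txt_feat.grad * txt_feat).norm().item()
total = img_score + txt_score
print(f"image contribution: {img_score / total:.2%}")
print(f"text contribution:  {txt_score / total:.2%}")
```

A systematic skew in these contributions across a dataset slice (e.g., the model ignoring text whenever certain imagery appears) is the kind of signal such an analysis can surface when probing for bias.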