Exploring the bounds of Interpretability via Topic Models

Topic Models are inherently interpretable: topics and their representations embody identifiable concepts. In the quest for responsible artificial intelligence, there is potential to leverage Topic Modelling methodologies to build model-agnostic explainability techniques. However, measuring interpretability is difficult, as it is often subjective. This thesis examines the interpretability problem through the lenses of both humans and machines.
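
To make the interpretability claim concrete, consider what a fitted topic model actually exposes: each topic is a distribution over words, and its top-ranked words are what a human reads to name the underlying concept. The sketch below is a minimal illustration only, assuming scikit-learn and a toy corpus that is not drawn from the thesis.

    # Minimal sketch: topics as human-readable word distributions.
    # Corpus, hyperparameters, and library choice are illustrative assumptions.
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    corpus = [
        "the cat sat on the mat with another cat",
        "dogs and cats are common household pets",
        "stocks fell as markets reacted to interest rates",
        "investors sold shares after the interest rate rise",
    ]

    # Bag-of-words counts are the only input the model sees.
    vectorizer = CountVectorizer(stop_words="english")
    counts = vectorizer.fit_transform(corpus)

    # Fit a two-topic LDA model; each topic is a distribution over the vocabulary.
    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    lda.fit(counts)

    # The top words of each topic are what a human inspects to label the concept.
    vocab = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top_words = [vocab[i] for i in weights.argsort()[::-1][:5]]
        print(f"Topic {k}: {', '.join(top_words)}")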

When evaluating interpretability, the ground truth consists of human assessments, which can vary between individuals. This thesis introduces a new user study design and evaluation metrics derived from corpus statistics to challenge the robustness of human evaluation of interpretability.
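
One widely used family of corpus-statistics measures is topic coherence, for example NPMI estimated from document co-occurrence counts. The sketch below is an illustrative implementation under that assumption; the function name and toy documents are hypothetical and the thesis's own metrics are not reproduced here. A coherent word set such as {cat, dog, pet} scores higher than a mixed one such as {cat, interest, rate}.

    # Illustrative corpus-statistics metric: NPMI topic coherence.
    from itertools import combinations
    from math import log

    def npmi_coherence(top_words, documents, eps=1e-12):
        """Average NPMI over word pairs, estimated from document co-occurrence."""
        n_docs = len(documents)
        doc_sets = [set(doc) for doc in documents]
        # Document frequency of each candidate word.
        df = {w: sum(w in d for d in doc_sets) for w in top_words}
        scores = []
        for w1, w2 in combinations(top_words, 2):
            co_df = sum(w1 in d and w2 in d for d in doc_sets)
            p1, p2, p12 = df[w1] / n_docs, df[w2] / n_docs, co_df / n_docs
            if p12 == 0 or p1 == 0 or p2 == 0:
                scores.append(-1.0)  # words never co-occur: minimum NPMI
                continue
            pmi = log(p12 / (p1 * p2))
            scores.append(pmi / (-log(p12) + eps))  # normalise PMI into [-1, 1]
        return sum(scores) / len(scores)

    # Toy documents as token lists; real corpora would be far larger.
    docs = [doc.split() for doc in [
        "cat mat cat pet",
        "dog cat pet household",
        "stock market interest rate",
        "investor share interest rate",
    ]]
    print(npmi_coherence(["cat", "dog", "pet"], docs))        # ~0.67 (coherent)
    print(npmi_coherence(["cat", "interest", "rate"], docs))  # ~-0.33 (incoherent)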

On the interpretability of machines, there is ongoing debate about whether the endeavor is futile. This thesis seeks to surmount this challenge by introducing new methods of analysis and evaluation built on a data-first approach. Traditionally, data is the subject of analysis; with a better understanding of the data, we can instead employ it as a tool to analyze models.

Finally, the subjects of these interpretability methods should be aligned: the design of human user studies should reflect the complexities captured by models, and, likewise, methods for interpreting models should be optimized for human consumption. Exploring this middle ground through a data-first approach advances explainable artificial intelligence.