"Through my dissertation, I introduce a causally grounded, extensible approach for rating AI models for robustness by detecting their sensitivity to input perturbations and protected attributes, quantifying this behavior, and translating it into user-understandable ordinal ratings (trust certificates)."
While AI models have become increasingly accessible through chatbots and diverse applications, their "black-box" nature and sensitivity to input perturbations create significant challenges for interpretability and trust. Existing correlation-based robustness metrics often fail to adequately explain model errors or isolate causal effects. To address this limitation, I propose a causally-grounded framework for rating AI models based on their robustness. This method evaluates robustness by quantifying statistical and confounding biases, as well as measuring the impact of perturbations across diverse AI tasks.
The framework produces raw scores that quantify model robustness and translates them into ratings, empowering developers and users to make informed decisions when selecting models. These ratings complement traditional explanation methods, providing a holistic view of model behavior. Additionally, the method aids in assessing and constructing robust composite models by guiding the selection and combination of primitive models. User studies confirm that these ratings reduce cognitive load for users comparing AI models on time-series forecasting tasks and foster trust in the construction of efficient, low-cost composite AI models.
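To make the score-to-rating step concrete, here is a minimal, hypothetical sketch of turning a raw robustness score into an ordinal rating relative to baseline models. The function name, rating levels, and the assumption that higher scores mean greater robustness are all illustrative; the dissertation's actual scoring, which quantifies statistical and confounding biases causally, is more involved.

```python
# Hypothetical sketch: raw robustness score -> ordinal rating.
# Assumes higher score = more robust; levels and thresholds are illustrative.

def ordinal_rating(score, baseline_scores,
                   levels=("poor", "fair", "good", "excellent")):
    """Rate `score` by its quantile among baseline models' scores."""
    # Fraction of baseline models this model matches or outperforms
    rank = sum(s <= score for s in baseline_scores) / len(baseline_scores)
    # Map the quantile onto one of the ordinal levels
    idx = min(int(rank * len(levels)), len(levels) - 1)
    return levels[idx]

# Example: five baseline robustness scores, one candidate model
baselines = [0.42, 0.55, 0.61, 0.70, 0.78]
print(ordinal_rating(0.66, baselines))  # beats 3/5 baselines -> "good"
```

A quantile-based binning like this makes the rating relative to the chosen baselines rather than to fixed absolute thresholds, which mirrors the idea of rating a model "with respect to baselines."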
From Predictions to Ratings
How can one detect instability (lack of robustness) in AI models in a general manner?
Can we have a principled, extensible method to measure the robustness of AI models?
How can we create extensible rating methods?
[Rating Method] Can we build a method to issue relative ratings to a model with respect to baselines, in a general manner?
[Method Evaluation / Usability] Is the method effective in helping users understand model behavior for selecting a model?
[General Tool for Rating] Can a general tool be built to rate and compare AI models across different tasks and domains?
What is the need for AI ratings if there are already explanations for the AI model? Conversely, what is the need for explanation, if there are ratings?
How can one calculate the ratings of composite AI based on the ratings of individual constituent models?
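On the last question above, one simple and conservative aggregation rule is a weakest-link rule: a composite pipeline is rated no higher than its least robust constituent. This is a hypothetical sketch of that rule only; the dissertation may use a different aggregation.

```python
# Hypothetical sketch: rating a composite model from its constituents'
# ordinal ratings via a weakest-link (minimum) rule. Levels are illustrative.

LEVELS = ["poor", "fair", "good", "excellent"]

def composite_rating(constituent_ratings):
    """Conservative aggregate: the composite is only as robust as its
    least robust constituent (weakest-link assumption)."""
    return min(constituent_ratings, key=LEVELS.index)

# A pipeline of three primitive models with individual ratings
print(composite_rating(["excellent", "good", "fair"]))  # -> "fair"
```

The minimum rule is pessimistic by design; an alternative would weight constituents by how much each contributes to the composite's output.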
Presentation Deck
Major Professor
Department of Computer Science
Committee Chair
Department of Computer Science
Committee Member
Department of Integrated Information Technology
Committee Member
Department of Computer Science
Committee Member
AI Research Lead, J.P. Morgan AI Research
Photos from Dissertation Proposal Defense
© 2026 Kausik Lakkaraju. All rights reserved.