Ph.D. Dissertation Defense

Rating AI Models for Robustness through a Causal Lens

Kausik Lakkaraju

University of South Carolina, Columbia
Spring 2026

Thesis Statement

"Through my dissertation, I introduce a causally grounded, extensible approach for rating AI models for robustness by detecting their sensitivity to input perturbations and protected attributes, quantifying this behavior, and translating it into user-understandable ordinal ratings (trust certificates)."

Abstract

While AI models have become increasingly accessible through chatbots and diverse applications, their "black-box" nature and sensitivity to input perturbations create significant challenges for interpretability and trust. Existing correlation-based robustness metrics often fail to adequately explain model errors or isolate causal effects. To address this limitation, I propose a causally-grounded framework for rating AI models based on their robustness. This method evaluates robustness by quantifying statistical and confounding biases, as well as measuring the impact of perturbations across diverse AI tasks.

The framework produces raw scores that quantify model robustness and translates them into ratings, empowering developers and users to make informed decisions about model robustness. These ratings complement traditional explanation methods to provide a holistic view of model behavior. The method also aids in assessing and constructing robust composite models by guiding the selection and combination of primitive models. User studies confirm that these ratings reduce users' cognitive load when comparing AI models on time-series forecasting tasks, and foster trust in the construction of efficient, low-cost composite AI models.
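To make the score-to-rating translation concrete, here is a minimal sketch, not the dissertation's actual method: the scoring function, bin thresholds, and accuracy figures below are all illustrative assumptions about how a raw perturbation-sensitivity score might be mapped to an ordinal rating relative to a baseline model.

```python
# Hypothetical sketch of translating raw robustness scores into ordinal
# ratings. The score definition, thresholds, and numbers are illustrative
# assumptions, not the dissertation's actual values or method.

def sensitivity_score(clean_acc: float, perturbed_acc: float) -> float:
    """Raw score: relative accuracy drop under input perturbation (lower is better)."""
    return (clean_acc - perturbed_acc) / clean_acc

def ordinal_rating(score: float, baseline_score: float) -> int:
    """Map a model's score, relative to a baseline, onto a 1-5 ordinal scale
    (5 = much more robust than the baseline, 3 = comparable, 1 = much less)."""
    delta = baseline_score - score  # positive => more robust than baseline
    if delta > 0.10:
        return 5
    if delta > 0.02:
        return 4
    if delta >= -0.02:
        return 3
    if delta >= -0.10:
        return 2
    return 1

# Example: a candidate forecaster compared against a baseline.
baseline = sensitivity_score(clean_acc=0.90, perturbed_acc=0.75)   # ~0.167
candidate = sensitivity_score(clean_acc=0.88, perturbed_acc=0.80)  # ~0.091
print(ordinal_rating(candidate, baseline))  # -> 4 (more robust than baseline)
```

The ordinal bins are what makes the output user-understandable: a reader compares small integers instead of raw score deltas.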

Causal Framework Diagram

Rating workflow: from predictions to ratings.

Research Questions

1

Robustness Detection

How can one detect instability (lack of robustness) of AI models in a general manner?

2

Robustness Measurement

Can we have a principled, extensible method to measure the robustness of AI models?

3

Rating Measurement

How can extensible rating methods be created?

3a

[Rating Method] Can we build a method to issue relative ratings to a model with respect to baselines, in a general manner?

3b

[Method Evaluation / Usability] Is the method effective in helping users understand model behavior for selecting a model?

3c

[General Tool for Rating] Can a general tool be built to rate and compare AI models across different tasks and domains?

4

Rating in the Context of Explainability

What is the need for AI ratings if explanations for the AI model already exist? Conversely, what is the need for explanations if there are ratings?

5

Rating Composition

How can one calculate the rating of a composite AI model from the ratings of its individual constituent models?
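As a purely illustrative sketch of the composition question above (not the dissertation's actual composition method), one plausible rule, assuming all constituents are rated on the same ordinal scale, is to bound a sequential pipeline's rating by its weakest constituent:

```python
# Illustrative assumption: a sequential composite is no more robust than its
# weakest constituent, so its rating is the minimum of the constituent
# ratings. This is one possible rule, not the dissertation's actual method.

def composite_rating(constituent_ratings: list[int]) -> int:
    """Rating of a composite model as the minimum of its parts' ratings."""
    return min(constituent_ratings)

print(composite_rating([4, 3, 5]))  # -> 3
```

Other aggregation choices (weighted averages, task-dependent rules) are equally conceivable; the point is only that constituent ratings can propagate to a composite score.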

Dissertation Proposal


Dissertation Committee

Dr. Biplav Srivastava, Major Professor, Department of Computer Science

Dr. Marco Valtorta, Committee Chair, Department of Computer Science

Dr. Dezhi Wu, Committee Member, Department of Integrated Information Technology

Dr. Vignesh Narayanan, Committee Member, Department of Computer Science

Dr. Sunandita Patra, Committee Member, AI Research Lead, J.P. Morgan AI Research

Get in Touch

Questions about the research or interested in collaboration?

© 2026 Kausik Lakkaraju. All rights reserved.