SafeChat: A Framework for Building Trustworthy Collaborative Assistants and a Case Study of its Usefulness
Authors: Biplav Srivastava, Kausik Lakkaraju, Nitin Gupta, Vansh Nagpal, Bharath C Muppasani, Sara E Jones
Summary:
Modern chatbots powered by large language models (LLMs) are widely accessible but face issues like lack of transparency,
safety concerns, and complex development. These limitations make them unsuitable for sensitive areas like elections or healthcare.
To address this, we introduce SafeChat, a flexible and trustworthy chatbot framework built on Rasa.
It supports source-traceable answers, deflects unsafe queries, summarizes responses, and enables rapid development through a CSV-driven workflow.
We used it to build ElectionBot-SC and other safe assistants.
Project link: https://github.com/ai4society/trustworthy-chatbot
Publication Type: Unpublished Manuscript
Paper | Bibtex
On Creating a Causally Grounded Usable Rating Method for Assessing the Robustness of Foundation Models Supporting Time Series
Authors: Kausik Lakkaraju, Rachneet Kaur, Parisa Zehtabi, Sunandita Patra, Siva Likitha Valluru, Zhen Zeng, Biplav Srivastava, Marco Valtorta
Summary:
Foundation Models have improved time-series forecasting but remain sensitive to input noise.
We propose a causal rating framework to evaluate their robustness using stock prediction as a
case study. Our findings show that multi-modal and task-specific foundation models are both more accurate and more robust.
Our user study confirmed that our ratings help users better compare model reliability.
Publication Type: Unpublished Manuscript
Paper | Bibtex
ElectionBot-SC: A Tool to Understand and Compare Chatbot Behavior for Safe Election Information in South Carolina
Authors: Bharath Muppasani, Kausik Lakkaraju, Nitin Gupta, Vansh Nagpal, Sara Jones, Biplav Srivastava
Summary:
With the 2024 elections underway, getting trustworthy election information is critical. We present ElectionBot-SC, a chatbot that provides verified election guidance from official and nonprofit sources. It supports multiple engines (rule-based, search-based, and LLM) to answer queries while showing where the information comes from. It is currently being tested at a South Carolina university to help students and staff, including first-time voters. Demo video: https://shorturl.at/1A7cc
A Novel Approach to Balance Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes and its Implementation in BEACON
Authors: Vansh Nagpal, Siva Likitha Valluru, Kausik Lakkaraju, Nitin Gupta, Zach Abdulrahman, Andrew Davison, Biplav Srivastava
Summary:
Meal choices often involve trade-offs between health and convenience. We propose a data-driven system
that recommends customizable meals over time, balancing user preferences with nutrition and
preparation details. Key contributions include new meal quality measures, recipe conversion
to a rich multimodal format (R3), learning methods using contextual bandits, and a working
prototype called BEACON.
Publication Type: Unpublished Manuscript
Paper | Bibtex
BEACON: Balancing Convenience and Nutrition in Meals With Long-Term Group Recommendations and Reasoning on Multimodal Recipes
Authors: Vansh Nagpal, Siva Likitha Valluru, Kausik Lakkaraju, Biplav Srivastava
Summary:
Choosing what to eat often means balancing health and convenience.
In this work, we tackle the meal recommendation problem by designing a system
that considers both nutrition and practicality, while also understanding ingredients
and cooking steps. We introduce a new way to rate meals, convert recipes into a rich format,
and use learning methods that adapt to user context, all showing early promise.
Publication Type: Unpublished Manuscript
Paper | Bibtex
Rating Multi-Modal Time-Series Forecasting Models (MM-TSFM) for Robustness Through a Causal Lens
Authors: Kausik Lakkaraju, Rachneet Kaur, Zhen Zeng, Parisa Zehtabi, Sunandita Patra, Biplav Srivastava, Marco Valtorta
Summary:
AI forecasting models can behave unpredictably when inputs change slightly, which is risky in finance.
We study models that use both numbers and images (multi-modal) and propose a causal rating method to test how robust they are.
Across a large experiment, we find that multi-modal models (ViT-num-spec models) are not just more accurate, but also more reliable, making them a
better fit for decision-making under uncertainty.
Publication Type: Unpublished Manuscript
Paper | Bibtex
Rating Sentiment Analysis Systems for Bias Through a Causal Lens
Authors: Kausik Lakkaraju, Biplav Srivastava, Marco Valtorta
Summary:
Sentiment Analysis Systems (SASs) assess the sentiment of text, but their ratings can shift with minor input variations,
showing potential bias towards attributes like gender or race.
We propose a method to evaluate and rate SASs on their sensitivity to these
attributes, aiming to help users choose fairer and more reliable systems and reduce bias-induced hate speech online.
Publication Type: Journal
Venue: IEEE Transactions on Technology and Society
Paper | Bibtex
Trust and ethical considerations in a multi-modal, explainable AI-driven chatbot tutoring system: The case of collaboratively solving Rubik's Cube
Authors: Kausik Lakkaraju, Vedant Khandelwal, Biplav Srivastava, Forest Agostinelli, Hengtao Tang, Prathamjeet Singh, Dezhi Wu, Matt Irvin, Ashish Kundu
Summary:
AI can revolutionize education by analyzing vast data on student learning but faces unresolved ethical concerns,
such as data privacy and fairness, especially in high school settings. This paper introduces the ALLURE chatbot,
a platform designed to address these ethical issues, allowing students to collaboratively solve the Rubik's cube with AI.
Key features include prioritizing informed consent for data use and ensuring safe interaction and language use to protect students.
It also focuses on preventing information leakage between user groups as the system learns and improves.
Publication Type: Workshop
Venue: ICML Workshop on What’s left to TEACH (Trustworthy, Enhanced, Adaptable, Capable and Human-centric) chatbots?
Paper | Bibtex
Advances in Automatically Rating the Trustworthiness of Text Processing Services
Authors: Biplav Srivastava, Kausik Lakkaraju, Mariana Bernagozzi, Marco Valtorta
Summary:
In this symposium paper, we review previous approaches to rating the trustworthiness of AI systems and
outline the challenges and vision for principled, causality-based, and multi-modal rating methodologies.
Publication Type: Journal, Symposium
Venue: AI and Ethics Journal; AAAI Spring Symposium
Paper | Bibtex
LLMs for Financial Advisement: A Fairness and Efficacy Study in Personal Decision Making
Authors: Kausik Lakkaraju, Sara E Jones, Sai Krishna Revanth Vuruma, Vishal Pallagani, Bharath C Muppasani, Biplav Srivastava
Summary:
We compared ChatGPT and Bard, LLM-based chatbots, with SafeFinance, a rule-based chatbot,
in the personal finance domain. Our findings reveal that ChatGPT and Bard often provide inconsistent and
unreliable financial advice, while SafeFinance, though simpler,
offers dependable and accurate information. This study highlights the current limitations of
LLM-based chatbots in handling financial advisement tasks effectively.
Publication Type: Conference
Venue: Proceedings of the Fourth ACM International Conference on AI in Finance
Paper | Bibtex
The Effect of Human v/s Synthetic Test Data and Round-tripping on Assessment of Sentiment Analysis Systems for Bias
Authors: Kausik Lakkaraju, Aniket Gupta, Biplav Srivastava, Marco Valtorta, Dezhi Wu
Summary:
Sentiment Analysis Systems (SASs), AI tools that analyze text sentiment, can show unstable and biased behavior, raising trust issues.
A new method rates these systems for bias using synthetic data. We enhanced this by using real chatbot conversations and a technique
that translates data through another language and back.
This revealed more bias in real data than in synthetic data, but translating through Spanish or Danish reduced bias significantly in real data.
Publication Type: Conference
Venue: The Fifth IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications
Paper | Bibtex
Evaluating Chatbots to Promote Users' Trust -- Practices and Open Problems
Authors: Biplav Srivastava, Kausik Lakkaraju, Tarmo Koppel, Vignesh Narayanan, Ashish Kundu, Sachindra Joshi
Summary:
Chatbots have gained widespread attention, especially with the advent of LLM-based systems like ChatGPT and Bard.
As they become integral in business for engaging with customers, suppliers, and employees, ensuring their reliability
through thorough testing is crucial. This paper examines how chatbots are currently tested,
highlights the challenges in building user trust, and proposes directions for future research and development.
Publication Type: Unpublished Manuscript
Paper | Bibtex
Can LLMs be Good Financial Advisors?: An Initial Study in Personal Decision Making for Optimized Outcomes
Authors: Kausik Lakkaraju, Sai Krishna Revanth Vuruma, Vishal Pallagani, Bharath Muppasani, Biplav Srivastava
Summary:
We tested advanced chatbots like ChatGPT and Bard on personal finance advice, using 13 questions in different
languages and dialects. Although the chatbots' answers sounded good,
we found they often lacked accuracy and reliability in providing financial information.
Publication Type: Workshop
Venue: ICAPS Workshop on Planning for Financial Services
Paper | Bibtex
On Safe and Usable Chatbots for Promoting Voter Participation.
Authors: Bharath Muppasani, Vishal Pallagani, Kausik Lakkaraju, Shuge Lei, Biplav Srivastava, Brett Robertson, Andrea Hickerson, Vignesh Narayanan
Summary:
We created chatbots to help increase voting among seniors
and first-time voters by giving them easy access to trusted election information tailored
to their needs. Our system, built on the Rasa platform, ensures the information is reliable
and allows for quick chatbot setup for any region. We've tested these chatbots in two
US states where voting has been difficult, focusing on groups of senior citizens.
This project aims to support voters and democracy by making accurate election information more accessible.
Publication Type: Workshop
Venue: AAAI Workshop on AI for Credible Elections
Paper | Bibtex
Why is my System Biased?: Rating of AI Systems through a Causal Lens
Authors: Kausik Lakkaraju
Summary:
This is a student paper that formulates my PhD dissertation problem and gives an overview of the solution.
The idea is to evaluate and rate AI systems for bias using causal analysis.
Publication Type: Doctoral Consortium
Venue: AIES
Paper | Bibtex
ALLURE: A Multi-Modal Guided Environment for Helping Children Learn to Solve a Rubik’s Cube with Automatic Solving and Interactive Explanations
Authors: Kausik Lakkaraju, Thahimum Hassan, Vedant Khandelwal, Prathamjeet Singh, Cassidy Bradley, Ronak Shah, Forest Agostinelli, Biplav Srivastava, Dezhi Wu
Summary:
ALLURE is a Deep Reinforcement Learning-based, multi-modal, explainable chatbot that teaches
children how to solve a Rubik’s Cube and lets them interact with it while they work on the Cube.
Publication Type: Demonstration
Venue: AAAI
Paper | Bibtex | Video
Data-Based Insights for the Masses: Scaling Natural Language Querying to Middleware Data
Authors: Kausik Lakkaraju, Vinamra Palaiya, Sai Teja Paladi, Chinmayi Appajigowda, Biplav Srivastava, Lokesh Johri
Summary:
This demonstration paper presents a Rasa-based chatbot that lets users control
their network usage and bandwidth through smart routers in a household or office setting. The same paper demonstrates a second chatbot that helps users monitor power usage in a house, office, or university setting using
smart sensors. Both were deployed on an Alexa device and on the Web for demonstration.
Publication Type: Demonstration
Venue: DASFAA
Paper | Bibtex | Video
A Rich Recipe Representation as Plan to Support Expressive Multi-Modal Queries on Recipe Content and Preparation Process.
Authors: Vishal Pallagani, Priyadharsini Ramamurthy, Vedant Khandelwal, Revathy Venkataramanan, Kausik Lakkaraju, Sathyanarayanan N Aakur, Biplav Srivastava
Summary:
In this paper, we discuss the construction of a machine-understandable
rich recipe representation (R3), in the form of plans, from recipes available in natural language. R3 is infused with additional knowledge such as allergens and possible failures at each cooking step.
Publication Type: Workshop
Venue: ICAPS Workshop on Knowledge Engineering for Planning and Scheduling
Paper | Bibtex | Video
Explainable Pathfinding for Inscrutable Planners with Inductive Logic Programming
Authors: Rojina Panta, Forest Agostinelli, Vedant Khandelwal, Biplav Srivastava, Bharath Chandra Muppasani, Kausik Lakkaraju, Dezhi Wu
Summary:
By combining inductive logic programming (ILP) with a given inscrutable planner, we constructed an explainable graph representing solutions to all states in the state space.
This graph can then be summarized using a variety of methods such as hierarchical representations or simple if/else rules.
We tested our approach on Towers of Hanoi.
Publication Type: Workshop
Venue: ICAPS Workshop on Explainable AI Planning
Paper | Bibtex | Video
ROSE: Tool and Data ResOurces to Explore the Instability of SEntiment Analysis Systems
Authors: Gaurav Mundada, Kausik Lakkaraju, Biplav Srivastava
Summary:
ROSE is a tool that helps examine gender bias in Sentiment Analysis Systems (SASs), which score text for sentiment and emotion.
It offers a dataset of text inputs with their sentiment scores and a visualization tool for analyzing SAS behavior towards gender.
Developed with d3.js, ROSE is freely accessible for public use.
Publication Type: Unpublished Manuscript
Paper | BibTex | Tool
Multimodal retrieval and execution monitoring using rich recipe representation
Inventors: Biplav Srivastava, Vishal Pallagani, Revathy Chandrasekaran Venka, Vedant Khandelwal, Kausik Lakkaraju
Summary:
This work introduces a Rich Recipe Representation (R3) to improve how machines
interpret and retrieve complex workflows like recipes. By enriching each step with
contextual knowledge, such as allergen risks, failure modes,
and solutions, R3 enables more accurate reasoning and retrieval.
It powers a web-based system that supports multi-modal queries and real-time
agent monitoring during task execution.
Publication Type: Patent
Patent Link | BibTex
Robust useful and general task-oriented virtual assistants
Inventors: Biplav Srivastava, Kausik Lakkaraju, Revathy Venkataramanan, Vishal Pallagani, Vedant Khandelwal, Hong Yung Yip
Summary:
This work presents a framework for task-oriented virtual assistants that combine
open-world knowledge discovery, user personalization, and domain-specific adaptation.
Designed for procedural tasks like cooking or DIY, the system also supports fault-tolerant
content curation for task completion despite common errors.
Publication Type: Patent
Patent Link | BibTex
Assigning trust rating to ai services using causal impact analysis
Inventors: Biplav Srivastava, Kausik Lakkaraju, Marco Valtorta
Summary:
This work proposes a causal rating framework to assess the trustability of Sentiment Analysis
Systems (SASs) based on the influence of inputs like gender, race, and emotion-laden words.
The method assigns both fine-grained and overall ratings using a causal lens, and
includes an implementation across five SASs (deep learning, lexicon-based,
and custom models) to help users interpret model behavior in practical settings.
Publication Type: Patent
Patent Link | BibTex