
DEEPTHI SUDHARSAN

Self-motivated technology professional with strong business acumen and a passion for AI/ML. Experienced in leading projects, collaborating within teams, and mentoring peers, I am eager to apply my entrepreneurial mindset to building innovative AI/ML products that deliver impactful outcomes.

Education

B.Tech, Artificial Intelligence 2019 — 2023
Amrita Vishwa Vidyapeetham, Coimbatore
CGPA: 9.13

Experience

Data Scientist Dec 2025 — Jan 2025
InvestCloud
  • Worked on the Data, AI, and Analytics team, contributing to the Digital Wealth product by evaluating and building a multi-agent intelligent assistant for wealth management advisors using LangGraph and GPT-based models.
Independent Researcher Jul 2025 — Dec 2025
In collaboration with Microsoft Research
  • Advanced projects initiated during the MSR fellowship through to paper submission at top-tier conferences and workshops (e.g., AACL, AAAI, MMFood).
  • Continuing research on agentic, culturally nuanced story generation to support financial knowledge acquisition among rural adults, while investigating community-centric evaluation frameworks for assessing cultural representation in generative AI systems.
  • Collaborating with academic and industry researchers on dataset curation, benchmarking, and culturally nuanced storytelling methods for underrepresented contexts in Generative AI.
Research Fellow Jul 2023 — Jul 2025
Microsoft Research India
  • Kahani — Led project Kahani, focused on building an AI-based multi-agent system for generating culturally adaptive visual stories using GPT-4o, FLUX.1, and agentic frameworks such as PydanticAI (Advisors: Dr. Kalika Bali and Dr. Mohit Jain).
  • Inclusive Representation in T2I — Developed evaluation metrics for long-tail concepts and under-represented communities in models such as Stable Diffusion 3 and DALL-E 3 (Advisor: Dr. Cecily Morrison).
  • Endangered Tribal Languages Documentation — Led collection and modeling of PoCs for various use cases, including fine-tuned RAG systems for traditional recipes in endangered tribal languages.
  • Updesh — Helped build a large-scale synthetic dataset designed to advance post-training of LLMs for Indic languages using the MediaWiki API, Bing Search API, and GPT-4o (Advisors: Dr. Kalika Bali and Dr. Sunayana Sitaram).
  • HyWay — Enhanced hybrid engagement by building privacy-preserving agents for summarization/recap tasks, deployed at internal events attended by Satya Nadella and Peter Lee.
  • Azure Maps Creator Tool — Collaborated with Azure Maps team to build an AI-powered automation pipeline using GPT-4V with geospatial data and CAD files.
Technology Consulting Intern — Emerging Technologies Jan 2023 — Jul 2023
PwC India
  • Delivered proofs of concept (from model development to model serving) for top clients, including automated weld defect detection using deep neural networks and ANPR-based vehicle monitoring.
Machine Learning Intern Oct 2021 — Dec 2021
The Machine Learning Company
  • Built end-to-end Machine Learning solutions to analyze Near-Infrared Spectroscopy measurements for ensuring consistent content uniformity of Active Pharmaceutical Ingredients using Linear Regression, SVM, Random Forests, and LightGBM.

Technical Skills

Languages

Python, Java (Proficient); C++, Scala, Julia, R (Beginner)

Frameworks

Scikit-learn, TensorFlow, Keras, cvxpy, PuLP, OpenCV, PyCaret, IceVision, LangChain, LlamaIndex

Generative AI

Gemma Models, OpenAI, Llama Family, SDXL, SD 3.5, FLUX, AutoGen, PydanticAI, CrewAI

Serving & Cloud

Flask, Docker, Kubernetes, FastAPI, Azure, Google AutoML, Terraform, AWS SageMaker

Publications

AAAI EGSAI 2026

Agentic Framework for Culturally Grounded Visual Story Generation

Hamna, Aasim Baig, Aayush Jansari, Deepthi Sudharsan, Advait Bhat, Vivek Seshadri, et al.
Proposed an agentic, human-in-the-loop framework for generating culturally grounded visual stories using coordinated LLM agents and text-to-image models.
AACL-IJCNLP 2025

ELR-1000: A Community-Generated Dataset for Endangered Indic Indigenous Languages

Neha Joshi, Pamir Gogoi, Aasim Mirza, Aayush Jansari, Deepthi Sudharsan, Vivek Seshadri, et al.
Proposes a benchmark dataset of 1,060 traditional recipes in 10 endangered languages collected via a mobile tool for low-digital-literacy contributors.
MMFood 2025

What’s Not on the Plate? Rethinking Food Computing through Indigenous Indian Datasets

Pamir Gogoi, Neha Joshi, Ayushi Pandey, Deepthi Sudharsan, Kalika Bali, Vivek Seshadri, et al.
A multimodal dataset promoting culturally inclusive, community-authored data for AI in ethical food computing.
COMPASS 2025

KAHANI: Culturally-Nuanced Visual Storytelling Pipeline for Non-Western Cultures

Hamna, Deepthi Sudharsan, Agrima Seth, Ritvik Budhiraja, Kalika Bali, et al.
Developed Kahani, a pipeline utilizing GPT-4 Turbo and SDXL with CoT prompting to outperform ChatGPT-4 in cultural relevance.
Preprint

The Role of Synthetic Data in Multilingual, Multicultural AI Systems: Lessons from Indic languages

Pranjal A Chitale, Varun Gumma, Sanchit Ahuja, Deepthi Sudharsan, Sunayana Sitaram, et al.
Presents Updesh, a synthetic instruction-tuning dataset for Indian languages built via a culturally grounded, bottom-up approach.
Under Review

Towards a Community-Centric Approach to Measuring Cultural Representation in AI Image Generation

Nari Johnson, Deepthi Sudharsan, Hamna, et al.
Argues that AI measurement often ignores impacted communities, using case studies to show how lived experience shapes more meaningful metrics.
Preprint

To Make Text-to-Image Models that Work for Marginalized Communities, We Need New Measurement Practices for the Long Tail

Nari Johnson, Hamna, Deepthi Sudharsan, et al.
Highlights how current T2I metrics fail the long tail of low-resource concepts and calls for community-centered methods.
DA&CI 2022

Coconut Tree Detection using Deep Learning Models

Deepthi Sudharsan, K. Harish, U. Asmitha, et al.
Optimized object detection models using the IceVision framework for coconut tree detection and segmentation.
ISCMM 2021

Brain Disorder Detection Using MRI Data

Deepthi Sudharsan, S Isha Indhu, Kavya S Kumar, et al.
Developed ML/DL models for the diagnosis of schizophrenia and Alzheimer's disease using MRI features.
FIRE 2021

Rhetorical Role Labelling of Legal Case Documents

Deepthi Sudharsan, U Asmitha, B Premjith, KP Soman
Implemented DistilRoBERTa-based sentence embeddings for rhetorical role assignment in legal documents.

Awards & Honors

Certifications & Achievements
