About Me

Greetings! My name is Anuja Uppuluri. I was born and raised in Austin, Texas and currently live in Pittsburgh. I am a computer scientist and AI & machine learning researcher among other things.

I'm an undergrad at Carnegie Mellon University studying Computer Information Systems, Artificial Intelligence, and Discrete Math. I graduate in May 2025 🎓📜.

I am passionate about LLM post-training research, interpretability, and AI safety. I founded and lead the Carnegie Mellon AI Safety & Alignment Initiative (300 members strong!). I believe AGI will benefit all of humanity; my goal is to help get there and make sure that path doesn't introduce catastrophic risks.

I am in an a cappella group called Counterpoint; I love singing (in a group). I also enjoy painting, meditating, listening to & making music, writing, playing chess & Clash Royale, films & shows (Severance right now!), reading, and most other enjoyable hobbies people partake in.

I would love to connect with you! Click your preferred form of communication below to add me.

⊹ ࣪ ﹏𓊝﹏🫧⋆。˚﹏⊹ ࣪ ˖

Research

AidanBench | Co-First Author

Accepted to NeurIPS Language Gamification 2024

AidanBench is a novel benchmark for evaluating sustained, open-ended generation in large language models (LLMs).

It uses open-ended question prompts to assess a model's coherence, creativity, contextual attention, and instruction following through embedding-based dissimilarity metrics.

We performed comparative analyses across SOTA models, demonstrating that AidanBench scores correlate strongly with model size and moderately with LMSYS rankings.

The benchmark is non-saturating (there is no score ceiling), and it aligns better with real-world open-ended use cases.
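
As a rough illustration of the scoring method (a minimal sketch, not the benchmark's actual code), the core loop keeps asking the model the same open-ended question, rejects answers that are too similar in embedding space to earlier ones or that a judge scores as incoherent, and counts how many novel, coherent answers the model produces. The threshold values and the generate / embed / coherence interfaces below are illustrative assumptions.

```python
import numpy as np
from typing import Callable

def cosine_sim(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def score_question(
    question: str,
    generate: Callable[[str, list[str]], str],  # model under test (placeholder interface)
    embed: Callable[[str], np.ndarray],         # embedding model (placeholder interface)
    coherence: Callable[[str, str], float],     # judge model returning a coherence score
    novelty_threshold: float = 0.85,            # illustrative thresholds, not the paper's
    coherence_threshold: float = 0.5,
    max_answers: int = 100,
) -> int:
    """Count how many novel, coherent answers a model gives to one
    open-ended question before it repeats itself or loses coherence."""
    answers: list[str] = []
    embeddings: list[np.ndarray] = []
    for _ in range(max_answers):
        answer = generate(question, answers)
        emb = embed(answer)
        too_similar = any(cosine_sim(emb, e) > novelty_threshold for e in embeddings)
        if too_similar or coherence(question, answer) < coherence_threshold:
            break  # stop once novelty or coherence drops below threshold
        answers.append(answer)
        embeddings.append(emb)
    return len(answers)  # per-question score: count of valid answers
```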

Creating a Cooperative AI Policymaking Platform | Co-First Author

with Humanity Unleashed

Leading research on frameworks that systematically identify and quantify human values across diverse populations, using Bayesian modeling to inform AI-driven policy development.

Developing methodologies to capture stakeholder preferences across demographic groups, so that AI recommendations reflect a comprehensive range of societal perspectives in each policy domain.

Translating elicited values into actionable policy proposals to enable transparent governance, with human oversight of AI decision processes.

The platform I'm building is part of the larger mission of leveraging AI to enhance human cooperation and alignment, working toward responsible governance before more advanced AI systems emerge.
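
To make the value-elicitation idea concrete, here is a deliberately simplified toy sketch (not the platform's actual model): treat each demographic group's support for a policy statement as binary survey responses and compute a Beta-Binomial posterior over that group's support rate, which downstream policy generation could then condition on. The group names and counts are hypothetical.

```python
import numpy as np

def posterior_support(yes: int, total: int, prior_a: float = 1.0, prior_b: float = 1.0):
    """Beta-Binomial posterior over a group's support rate for one policy statement."""
    a, b = prior_a + yes, prior_b + (total - yes)
    samples = np.random.beta(a, b, size=20_000)   # Monte Carlo draws from the posterior
    lo, hi = np.percentile(samples, [3, 97])      # 94% credible interval
    return a / (a + b), (lo, hi)

# Hypothetical survey counts per demographic group: (yes responses, total responses).
responses = {"group_a": (320, 500), "group_b": (95, 210), "group_c": (40, 60)}

for group, (yes, total) in responses.items():
    mean, (lo, hi) = posterior_support(yes, total)
    print(f"{group}: posterior support {mean:.2f} (94% CI {lo:.2f}-{hi:.2f})")
```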

Projects

Multi-Agent RL Trading System

Reinforcement learning framework that trains agents to trade in simulated markets using Proximal Policy Optimization (PPO). Implements portfolio management with price-impact modeling in a multi-agent environment; a minimal environment sketch appears below.

Key Achievements:

  • Agents learn stable trading strategies, with returns stabilizing after initial exploration
  • Correlation between trades and prices reflects a learned understanding of market impact
  • Behavior adapts in real time to market volatility
Reinforcement Learning · PPO Algorithm · Multi-Agent Systems · OpenAI Gym
View Project
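
The project code isn't reproduced here, but a minimal sketch of the kind of environment it builds on might look like the following: a toy market using the classic OpenAI Gym API, where the agent's own trades move the price (linear price impact) and the reward is the change in portfolio value. The dynamics, impact coefficient, and observation layout are assumptions for illustration, not the project's actual implementation.

```python
import numpy as np
import gym
from gym import spaces

class ToyMarketEnv(gym.Env):
    """Toy single-agent market with linear price impact (illustrative only)."""

    def __init__(self, impact: float = 0.05, horizon: int = 200):
        super().__init__()
        self.impact, self.horizon = impact, horizon
        # Action: trade size in [-1, 1] (negative = sell, positive = buy).
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)
        # Observation: [price, position, cash].
        self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(3,), dtype=np.float32)

    def reset(self):
        self.t, self.price, self.position, self.cash = 0, 100.0, 0.0, 1_000.0
        return self._obs()

    def step(self, action):
        trade = float(np.clip(action[0], -1.0, 1.0))
        prev_value = self.cash + self.position * self.price
        # The agent's own trade moves the price (plus a small random walk).
        self.price *= 1.0 + self.impact * trade + 0.01 * np.random.randn()
        self.position += trade
        self.cash -= trade * self.price
        self.t += 1
        reward = (self.cash + self.position * self.price) - prev_value  # change in portfolio value
        done = self.t >= self.horizon
        return self._obs(), reward, done, {}

    def _obs(self):
        return np.array([self.price, self.position, self.cash], dtype=np.float32)
```

An environment like this could be trained against a PPO implementation (the project implements its own; an off-the-shelf one such as stable-baselines3's PPO would work for quick experiments) and extended with several interacting agents to recover the multi-agent setting described above.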

Interactive LLM Explainability Dashboard

Interactive dashboard for exploring and visualizing language model internals. Makes complex neural network behaviors interpretable through intuitive visualizations of attention mechanisms and text embeddings; a minimal sketch of the underlying visualization code appears below.

Key Features:

  • Text Embeddings Visualization using SentenceTransformer with t-SNE dimensionality reduction
  • BERT Attention Heatmap showing how each token attends to others in the sentence
  • Intuitive interactive interface for exploring model internals
Interpretability · Attention Visualization · BERT · t-SNE
View Project
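
As a rough sketch of the building blocks behind those features (not the dashboard's own code), the embedding map and attention heatmap can be produced with sentence-transformers, scikit-learn's t-SNE, and Hugging Face Transformers; the model names, example sentences, and plot details below are illustrative defaults.

```python
import matplotlib.pyplot as plt
import torch
from sentence_transformers import SentenceTransformer
from sklearn.manifold import TSNE
from transformers import BertModel, BertTokenizer

sentences = ["The cat sat on the mat.", "A dog chased the ball.",
             "Stocks fell sharply today.", "Markets rallied after the news.",
             "She painted the fence blue.", "He sang in the choir."]

# 1) Sentence embeddings projected to 2-D with t-SNE.
embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(sentences)
points = TSNE(n_components=2, perplexity=3, random_state=0).fit_transform(embeddings)
plt.scatter(points[:, 0], points[:, 1])
for (x, y), s in zip(points, sentences):
    plt.annotate(s[:20], (x, y), fontsize=8)
plt.title("Sentence embeddings (t-SNE)")
plt.show()

# 2) BERT attention heatmap: how each token attends to the others.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_attentions=True)
inputs = tokenizer("The cat sat on the mat.", return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions       # tuple: one tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
attn = attentions[-1][0, 0].numpy()               # last layer, head 0
plt.imshow(attn, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90)
plt.yticks(range(len(tokens)), tokens)
plt.title("BERT attention (last layer, head 0)")
plt.colorbar()
plt.show()
```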

Experience

Amazon

Software Development Engineer Intern

May 2024 – August 2024

Built an optimized full-stack testing system from scratch for the ML models that configure prices for all Amazon Basics items. The system now serves as the primary testing framework used by the Base Pricing team for ML model evaluations. Deployed the system to production, reducing manual validation time from 12 minutes to 2.

Continuum Space

Software Engineering Intern

August 2023 – December 2023

Fixed critical memory leaks in satellite controller code written in Julia, increasing data transmission efficiency by 24%, reducing overall system downtime, and improving satellite communication stability.

CHAI (Center for Human-Compatible AI)

AI Safety and Risk Researcher

August 2023 – December 2023

Wrote algorithms to enhance confidence scoring mechanisms in large-scale language models (15% reliability increase). Implemented rigorous evaluations of the confidence scoring code and built a benchmark prompt dataset for model testing.