Mentors
Vincent Conitzer & Emanuel Tewolde
Akash Kundu
Bio
I am Akash Kundu, a final-year Computer Science undergraduate with several years of experience in technical AI Safety, particularly in evaluating and stress-testing large language models. My work focuses on uncovering behavioural failures in LLMs: dark patterns, sycophancy, harmful reasoning, multilingual vulnerabilities, and other inconsistencies. I’ve co-authored research presented at venues such as ICLR, AAAI, and NeurIPS, and have worked with Apart Research, FAR AI, and Humane Intelligence on evaluation pipelines, adversarial prompting, and cross-cultural red-teaming.
Research Interests
Recently, I’ve been moving toward questions about how model behaviour shifts in multi-agent settings. I’m especially interested in cooperation, signalling, sanctioning, conflict, and emergent group-level failures as models become more agentic and interactive. I plan to pursue a PhD in AI Safety, and this fellowship is a way for me to deepen my understanding of cooperative dynamics, multi-agent safety, and mechanisms that can support aligned behaviour in groups of AI systems.
Project Proposal
Similarity-Based Cooperation and Communication - This project investigates how perceived similarity between LLM agents affects their communication strategies and the emergence of cooperation. It combines similarity-based cooperative equilibrium concepts with cheap-talk communication, and is scoped to be feasible within three months.