I am a research scientist at the Princeton Center for Information Technology Policy, where I work with Arvind Narayanan and Zeynep Tufekci on projects related to AI agent evaluation and the societal impacts of AI. This spring, I am also working with Tim Hua as part of the SPAR program on a project studying mechanisms for value reflection and systematization in frontier AI models.
Current Projects
- Credible AI evaluation: I am leading a collaboration with other major AI evaluation groups that defines a set of threats to the credible evaluation of AI agents and proposes log analysis as a method for addressing them. I hope to publish this work as a position paper at NeurIPS 2026.
- AI agent reliability: Alongside Stephan Rabanser and the rest of the HAL team at Princeton, I am working to develop an index of AI agent reliability. Learn more at the project website.
- Anthropomorphization and attachment: With the AI and Society Lab at Princeton, I am working to document the prevalence of anthropomorphization of, attachment to, and engagement with AI chatbots, using a combination of synthetic benchmarks, big data analytics, and population surveys.
- LLM value reflection: With Tim Hua, I’m working on a project to explore how different LLMs iteratively refine their own constitutions, model specs, and system prompts.
Recent Highlights
- Sycophancy and escalation in ChatGPT: I led a research project documenting sycophancy, delusion reinforcement, and escalation in ChatGPT that was accepted at The Second Annual Conference of the International Association for Safe and Ethical AI (IASEAI 2026). Read an accessible summary at the project website.
- AI agent failure modes: I led a mixed-methods analysis of over 2,000 logs from early 2025 AI agents tested on nine benchmarks, documenting the failure modes and capabilities of these systems as part of a paper accepted at the 2026 International Conference on Learning Representations (ICLR 2026). See the paper and a presentation of the work I gave at NeurIPS 2025.
- Moral foundations of LLMs: I wrote a paper analyzing the ethical judgments of 21 frontier language models using Jonathan Haidt's moral foundations theory, which I presented at Princeton's 2025 PICSciE/CSML Joint Colloquium. Read a profile of the project.
- AI consciousness and moral patienthood: At the 2025 Eleos AI Consciousness Conference, I gave an original presentation entitled “Is Consciousness Prerequisite for Moral Patienthood? The case for stretching our moral intuitions and avoiding the hard problem.” See the slides.
Background
I am a recent MPA graduate from Princeton, where I also completed a graduate certificate in statistics and machine learning. My broad research interests are in artificial intelligence, moral philosophy, intergenerational mobility, and social safety net implementation. I have also worked in civic technology, most recently at the US Census Bureau as a Coding it Forward Data Science Fellow, and prior to that, as a data analyst with the Massachusetts Digital Service. I received my BA from Williams College in 2020, where I studied political economy, philosophy, and cognitive science. Outside of work, I am an avid runner and cyclist and love spending long days in the mountains.