Xuhui Zhou

Making Socially Intelligent AI

Publications

Research on social AI, safety, and language understanding.

Showing 38 publications

2025

SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions

Xianzhe Fan, Xuhui Zhou, Chenxu Jin, Kathryn Nottingham, Hao Zhu, Maarten Sap

NeurIPS 2025 Datasets and Benchmarks

Social AI, Agents
NeurIPS

TOM-SWE: User Mental Modeling For Software Engineering Agents

Xuhui Zhou, Valerie Chen, Zora Zhiruo Wang, Graham Neubig, Maarten Sap, Xingyao Wang

arXiv preprint

Agents, Social AI
arXiv

Social World Models

Xuhui Zhou, Jiarui Liu, Akhila Yerukola, Hyunwoo Kim, Maarten Sap

arXiv preprint

Social AI, Multi-Agent Systems
arXiv

OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety

Sanidhya Vijayvargiya, Aditya Bharat Soni, Xuhui Zhou, Zora Zhiruo Wang, Nouha Dziri, Graham Neubig, Maarten Sap

arXiv preprint

AI Safety, Agents
arXiv

1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning

Wenkai Li, Liwen Sun, Zhenxiang Guan, Xuhui Zhou, Maarten Sap

arXiv preprint

AI Safety, Multi-Agent Systems
arXiv

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

Jiaxu Zhou, Jen-tse Huang, Xuhui Zhou, Man Ho Lam, Xintao Wang, Hao Zhu, Wenxuan Wang, Maarten Sap

arXiv preprint

Multi-Agent Systems, Social AI
arXiv

How can we assess human-agent interactions? Case studies in software agent design

Valerie Chen, Rohit Malhotra, Xingyao Wang, Juan Michelini, Xuhui Zhou, Aditya Bharat Soni, Hoang H. Tran, Calvin Smith, Ameet Talwalkar, Graham Neubig

arXiv preprint

Agents, Social AI
arXiv

Rethinking Theory of Mind Benchmarks for LLMs: Towards A User-Centered Perspective

Qiaosi Wang, Xuhui Zhou, Maarten Sap, Jodi Forlizzi, Hong Shen

arXiv preprint

Social AI
arXiv

Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication

Jocelyn Shen, Akhila Yerukola, Xuhui Zhou, Cynthia Breazeal, Maarten Sap, Hae Won Park

EMNLP 2025

EMNLP

Interactive Agents to Overcome Ambiguity in Software Engineering

Sanidhya Vijayvargiya, Xuhui Zhou, Akhila Yerukola, Maarten Sap, Graham Neubig

arXiv preprint

Agents, Social AI
arXiv

AutoPresent: Designing Structured Visuals from Scratch

Jun Ge, Zhengzhong Wang, Xuhui Zhou, Yuhang Peng, Siddharth Subramanian, Qian Tan, Maarten Sap, Alane Suhr

CVPR 2025

CVPR

Bridging the Data Provenance Gap Across Text, Speech, and Video

Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi LI, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester James Validad Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara

ICLR 2025

ICLR

AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents

Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap

NAACL 2025

AI Safety, Agents
NAACL

User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions

Xianzhe Fan, Qing Xiao, Xuhui Zhou, Jiaxin Pei, Maarten Sap, Zhicong Lu, Hong Shen

CHI 2025

CHI

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

Frank F Xu, Yiwei Song, Bowen Li, Yujia Tang, Khushi Jain, Mingyu Bao, Zhengzhong Wang, Xuhui Zhou, Zhiyi Guo

NeurIPS 2025 Datasets and Benchmarks

Agents, Evaluation
NeurIPS

BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data

Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona Diab, Maarten Sap

ACL 2025

ACL

HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions

Xuhui Zhou, Hyunwoo Kim, Faeze Brahman, Liwei Jiang, Hao Zhu, Ximing Lu, Frank Xu, Bill Yuchen Lin, Yejin Choi, Niloofar Mireshghallah, Ronan Le Bras, Maarten Sap

COLM 2025

AI Safety, Social AI, Multi-Agent Systems
COLM

On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents

Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Maarten Sap, Michael R. Lyu

ICML 2025

Multi-Agent Systems, AI Safety
ICML

2024

Consent in Crisis: The Rapid Decline of the AI Data Commons

Shayne Longpre, Robert Mahari, Ariel Lee, Chris Lund, Hakeem Oderinwale, Will Brannon, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland

NeurIPS 2024 Datasets and Benchmarks

NeurIPS

Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications

Xianzhe Fan, Qing Xiao, Xuhui Zhou, Yuran Su, Zhicong Lu, Maarten Sap, Hong Shen

arXiv preprint

arXiv

PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models

Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap

COLM 2024

COLM

Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs

Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap

EMNLP 2024

Social AI
EMNLP

Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi

ICLR 2024 🏆 Spotlight (top 5%)

AI Safety
ICLR

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Zhengyang Qi, Haofei Yu, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap

ICLR 2024 🏆 Spotlight (top 5%)

Social AI, Multi-Agent Systems
ICLR

WebArena: A Realistic Web Environment for Building Autonomous Agents

Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

ICLR 2024

Agents, Grounding
ICLR

Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models

Natalie Shapira, Mosh Levy, Hossein Seyed Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz

EACL 2024

Social AI
EACL

2023

COBRA 🐍 Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap

Findings of ACL 2023

AI Safety
ACL

"Don't Take This Out of Context!" On the Need for Contextual Models and Evaluations for Stylistic Rewriting

Akhila Yerukola, Xuhui Zhou, Maarten Sap

EMNLP 2023

EMNLP

FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap

EMNLP 2023

EMNLP

Learning to translate by learning to communicate

C. Downey, Xuhui Zhou, L. Liu, Shane Steinert-Threlkeld

EMNLP MRL 2023

EMNLP

2022

Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith

NAACL 2022

NAACL

Extracting and Inferring Personal Attributes from Dialogue

Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia

ACL ConvAI 2022

ACL ConvAI

Emergent Communication Fine-tuning (EC-FT) for Pretrained Language Models

Shane Steinert-Threlkeld, Xuhui Zhou, Zeyu Liu, C.M. Downey

ICLR EmeCom Workshop 2022 🏆 Runner-up Best Paper

ICLR EmeCom

2021

Challenges in Automated Debiasing for Toxic Language Detection

Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Yejin Choi, Noah A. Smith

EACL 2021

AI Safety
EACL

2020

Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets

Chuanrong Li, Lin Shengshuo, Zeyu Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld

BlackboxNLP Workshop 2020

BlackboxNLP

Multilevel Text Alignment with Cross-Document Attention

Xuhui Zhou, Nikolaos Pappas, Noah A. Smith

EMNLP 2020

EMNLP

RPD: A Distance Function Between Word Embeddings

Xuhui Zhou, Shujian Huang, Zaixiang Zheng

ACL Student Research Workshop 2020

ACL SRW

Evaluating Commonsense in Pre-trained Language Models

Xuhui Zhou, Y. Zhang, Leyang Cui, Dandan Huang

AAAI 2020

AAAI