Xuhui Zhou

Making Socially Intelligent AI

Last update: November 9, 2025


Contact

Address: 4902 Forbes Ave, Gates Hillman Complex, Pittsburgh, PA

Homepage: https://xuhuiz.com/

Email: xuhuiz@cs.cmu.edu

Tel: 206-306-5850


Research

Socially intelligent AI agents. Specifically, I am interested in facilitating pro-social agents that interact cooperatively and safely, align with human values, and contribute positively to individual and societal well-being.


Education

Carnegie Mellon University, Pittsburgh, PA (Aug 2022 - Present)

PhD in Computer Science (Language Technologies)

Advisor: Maarten Sap

University of Washington, Seattle, WA (Sep 2019 - Jun 2021)

M.Sc in Computational Linguistics

Advisor: Noah Smith

Nanjing University, Nanjing, China (Sep 2015 - Jun 2019)

B.Sc in Statistics, Department of Mathematics

Advisor: Shujian Huang

University of California, Berkeley, Berkeley, CA (visiting student) (Aug 2017 - May 2018)


Industry Experience

All Hands AI, Research Intern (May 2025 - Present)

Allen Institute for Artificial Intelligence, Research Intern (May 2024 - Aug 2024)

Machine Intelligence @ Apple, Research Intern (Mar 2021 - Sep 2021)


Publications

(*Equal contribution)

2025

40. The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents
Xingyao Wang, Simon Rosenberg, Juan Michelini, Calvin Smith, Hoang Tran, Engel Nyst, Rohit Malhotra, Xuhui Zhou, Valerie Chen, Robert Brennan, Graham Neubig
arXiv preprint

39. Training Proactive and Personalized LLM Agents
Weiwei Sun, Xuhui Zhou, Weihua Du, Xingyao Wang, Sean Welleck, Graham Neubig, Maarten Sap, Yiming Yang
arXiv preprint

38. SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions
Xianzhe Fan, Xuhui Zhou, Chenxu Jin, Kathryn Nottingham, Hao Zhu, Maarten Sap
NeurIPS 2025 Datasets and Benchmarks

37. TOM-SWE: User Mental Modeling For Software Engineering Agents
Xuhui Zhou, Valerie Chen, Zora Zhiruo Wang, Graham Neubig, Maarten Sap, Xingyao Wang
arXiv preprint

36. Social World Models
Xuhui Zhou, Jiarui Liu, Akhila Yerukola, Hyunwoo Kim, Maarten Sap
arXiv preprint

35. OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety
Sanidhya Vijayvargiya, Aditya Bharat Soni, Xuhui Zhou, Zora Zhiruo Wang, Nouha Dziri, Graham Neubig, Maarten Sap
arXiv preprint

34. 1-2-3 Check: Enhancing Contextual Privacy in LLM via Multi-Agent Reasoning
Wenkai Li, Liwen Sun, Zhenxiang Guan, Xuhui Zhou, Maarten Sap
arXiv preprint

33. The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies
Jiaxu Zhou, Jen-tse Huang, Xuhui Zhou, Man Ho Lam, Xintao Wang, Hao Zhu, Wenxuan Wang, Maarten Sap
arXiv preprint

32. How can we assess human-agent interactions? Case studies in software agent design
Valerie Chen, Rohit Malhotra, Xingyao Wang, Juan Michelini, Xuhui Zhou, Aditya Bharat Soni, Hoang H. Tran, Calvin Smith, Ameet Talwalkar, Graham Neubig
arXiv preprint

31. Rethinking Theory of Mind Benchmarks for LLMs: Towards A User-Centered Perspective
Qiaosi Wang, Xuhui Zhou, Maarten Sap, Jodi Forlizzi, Hong Shen
HEAL@CHI 2025 Workshop

30. Words Like Knives: Backstory-Personalized Modeling and Detection of Violent Communication
Jocelyn Shen, Akhila Yerukola, Xuhui Zhou, Cynthia Breazeal, Maarten Sap, Hae Won Park
EMNLP 2025

29. Interactive Agents to Overcome Ambiguity in Software Engineering
Sanidhya Vijayvargiya, Xuhui Zhou, Akhila Yerukola, Maarten Sap, Graham Neubig
arXiv preprint

28. AutoPresent: Designing Structured Visuals from Scratch
Jun Ge, Zhengzhong Wang, Xuhui Zhou, Yuhang Peng, Siddharth Subramanian, Qian Tan, Maarten Sap, Alane Suhr
CVPR 2025

27. Bridging the Data Provenance Gap Across Text, Speech, and Video
Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi LI, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, Jianguo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester James Validad Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara
ICLR 2025

26. AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap
NAACL 2025

25. User-Driven Value Alignment: Understanding Users' Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Jiaxin Pei, Maarten Sap, Zhicong Lu, Hong Shen
CHI 2025

24. TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks
Frank F Xu, Yiwei Song, Bowen Li, Yujia Tang, Khushi Jain, Mingyu Bao, Zhengzhong Wang, Xuhui Zhou, Zhiyi Guo
NeurIPS 2025 Datasets and Benchmarks

23. BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data
Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona Diab, Maarten Sap
ACL 2025

22. HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
Xuhui Zhou, Hyunwoo Kim, Faeze Brahman, Liwei Jiang, Hao Zhu, Ximing Lu, Frank Xu, Bill Yuchen Lin, Yejin Choi, Niloofar Mireshghallah, Ronan Le Bras, Maarten Sap
COLM 2025

21. On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents
Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Maarten Sap, Michael R. Lyu
ICML 2025

2024

20. Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre, Robert Mahari, Ariel Lee, Chris Lund, Hakeem Oderinwale, Will Brannon, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland
NeurIPS 2024 Datasets and Benchmarks

19. Minion: A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Yuran Su, Zhicong Lu, Maarten Sap, Hong Shen
arXiv preprint

18. PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models
Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap
COLM 2024

17. Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap
EMNLP 2024

16. Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi
ICLR 2024, Spotlight

15. SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Zhengyang Qi, Haofei Yu, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap
ICLR 2024, Spotlight

14. WebArena: A Realistic Web Environment for Building Autonomous Agents
Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
ICLR 2024

13. Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira, Mosh Levy, Hossein Seyed Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz
EACL 2024

2023

12. COBRA 🐍 Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap
Findings of ACL 2023

11. "Don't Take This Out of Context!" On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola, Xuhui Zhou, Maarten Sap
EMNLP 2023

10. FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap
EMNLP 2023

9. Learning to translate by learning to communicate
C. Downey, Xuhui Zhou, L. Liu, Shane Steinert-Threlkeld
EMNLP MRL 2023

2022

8. Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith
NAACL 2022

7. Extracting and Inferring Personal Attributes from Dialogue
Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia
ACL ConvAI 2022

6. Emergent Communication Fine-tuning (EC-FT) for Pretrained Language Models
Shane Steinert-Threlkeld, Xuhui Zhou, Zeyu Liu, C.M. Downey
ICLR EmeCom Workshop 2022, Runner-up Best Paper

2021

5. Challenges in Automated Debiasing for Toxic Language Detection
Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Yejin Choi, Noah A. Smith
EACL 2021

2020

4. Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
Chuanrong Li, Lin Shengshuo, Zeyu Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld
BlackboxNLP Workshop 2020

3. Multilevel Text Alignment with Cross-Document Attention
Xuhui Zhou, Nikolaos Pappas, Noah A. Smith
EMNLP 2020

2. RPD: A Distance Function Between Word Embeddings
Xuhui Zhou, Shujian Huang, Zaixiang Zheng
ACL Student Research Workshop 2020

1. Evaluating Commonsense in Pre-trained Language Models
Xuhui Zhou, Y. Zhang, Leyang Cui, Dandan Huang
AAAI 2020


Invited Talks

Ethics and Safety in LLMs

Towards Socially Aware and Safe AI Agents

Towards Socially Aware and Interactional NLP Systems


Awards & Media Coverage


Service

Organizing:

  • Theory-of-Mind Workshop at ICML 2023
  • LTI Student Research Symposium 2023

Program Committee & Reviewing:

  • Journals & Conferences:
    • TMLR 2023
    • ACL ARR 2021-2024
    • NeurIPS 2023, 2024
    • ICLR 2023, 2024
  • Workshops:
    • Workshop on Multimodal Content Moderation (MMCM) at CVPR 2023
    • Positive NLP at ACL 2022