Last update: Oct 06, 2024
Address: 4902 Forbes Ave, Gates Hillman Complex, Pittsburgh, PA
Homepage: https://xuhuiz.com/
Email: xuhuiz@cs.cmu.edu
Tel: 206-306-5850
Socially intelligent AI agents. Specifically, I am interested in facilitating pro-social agents that interact cooperatively, align with human values, and contribute positively to individual and societal well-being.
Carnegie Mellon University, Pittsburgh, PA (Aug 2022)
PhD in Computer Science (Language Technologies)
Advisor: Maarten Sap
University of Washington, Seattle, WA (Sep 2019 - Jun 2021)
M.Sc in Computational Linguistics
Advisor: Noah Smith
Nanjing University, Nanjing, China (Sep 2015 - Jun 2019)
B.Sc in Statistics, Department of Mathematics
Advisor: Shujian Huang
University of California Berkeley, Berkeley, CA (visiting student) (Aug 2017 - May 2018)
Allen Institute for Artificial Intelligence
Research Intern (May 2024 - Aug 2024)
Machine Intelligence @ Apple
Research Intern (Mar 2021 - Sep 2021)
(*Equal contribution)
HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
Xuhui Zhou, Hyunwoo Kim*, Faeze Brahman*, Liwei Jiang, Hao Zhu, Ximing Lu, Frank Xu, Bill Yuchen Lin, Yejin Choi, Niloofar Mireshghallah, Ronan Le Bras, Maarten Sap
Website
AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
Zhe Su, Xuhui Zhou, Sanketh Rangreji, Anubha Kabra, Julia Mendelsohn, Faeze Brahman, Maarten Sap
User-Driven Value Alignment: Understanding Users’ Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
Xianzhe Fan, Qing Xiao, Xuhui Zhou, Jiaxin Pei, Maarten Sap, Zhicong Lu, Hong Shen
On the Resilience of Multi-Agent Systems with Malicious Agents
Jen-tse Huang, Jiaxu Zhou, Tailin Jin, Xuhui Zhou, Zixi Chen, Wenxuan Wang, Youliang Yuan, Maarten Sap, Michael R. Lyu
Consent in Crisis: The Rapid Decline of the AI Data Commons
Shayne Longpre, Robert Mahari, Ariel Lee, …, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, Hanlin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland
NeurIPS Datasets and Benchmarks 2024
PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models
Devansh Jain, Priyanshu Kumar, Samuel Gehman, Xuhui Zhou, Thomas Hartvigsen, Maarten Sap
COLM 2024
Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap
EMNLP 2024, Website
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Zhengyang Qi, Haofei Yu, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap
ICLR 2024, Spotlight, Website
WebArena: A Realistic Web Environment for Building Autonomous Agents
Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
ICLR 2024
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi
ICLR 2024, Spotlight
FANTOM: A Benchmark for Analyzing Theory of Mind in Conversations
Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap
EMNLP 2023
Don’t Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting
Akhila Yerukola, Xuhui Zhou, Maarten Sap
EMNLP 2023
Learning to translate by learning to communicate
C.M. Downey, Xuhui Zhou, Leo Z. Liu, Shane Steinert-Threlkeld
EMNLP MRL 2023
Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
Natalie Shapira, Mosh Levy, Hossein Seyed Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz
EACL 2023
COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap
Findings of ACL 2023
Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith
NAACL 2022
Emergent Communication Fine-tuning (EC-FT) for Pretrained Language Models
Shane Steinert-Threlkeld, Xuhui Zhou, Zeyu Liu, C. M. Downey
ICLR EmeCom 2022, Runner-up Best Paper
Extracting and Inferring Personal Attributes from Dialogue
Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia
ACL ConvAI, 2022
Challenges in Automated Debiasing for Toxic Language Detection
Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
EACL, 2021
Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
Chuanrong Li, Lin Shengshuo, Zeyu Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld
EMNLP BlackboxNLP, 2020
Multilevel Text Alignment with Cross-Document Attention
Xuhui Zhou, Nikolaos Pappas, Noah A. Smith
EMNLP, 2020
Evaluating Commonsense in Pre-trained Language Models
Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
AAAI, 2020
RPD: A Distance Function Between Word Embeddings
Xuhui Zhou, Zaixiang Zheng, Shujian Huang
ACL Student Research Workshop, 2020