publications

publications by category in reverse chronological order. generated by jekyll-scholar.

2024

  1. arXiv
    HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions
    Xuhui Zhou, Hyunwoo Kim, Faeze Brahman, and 9 more authors
    2024
  2. arXiv
    AI-LieDar: Examine the Trade-off Between Utility and Truthfulness in LLM Agents
    Zhe Su, Xuhui Zhou, Sanketh Rangreji, and 4 more authors
    2024
  3. arXiv
    User-Driven Value Alignment: Understanding Users’ Perceptions and Strategies for Addressing Biased and Discriminatory Statements in AI Companions
    Xianzhe Fan, Qing Xiao, Xuhui Zhou, and 4 more authors
    2024
  4. arXiv
    On the Resilience of Multi-Agent Systems with Malicious Agents
    Jen-tse Huang, Jiaxu Zhou, Tailin Jin, and 6 more authors
    2024
  5. NeurIPS
    Consent in Crisis: The Rapid Decline of the AI Data Commons
    In NeurIPS, 2024
  6. COLM
    PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models
    Devansh Jain, Priyanshu Kumar, Samuel Gehman, and 3 more authors
    In COLM, 2024
  7. EMNLP
    Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs
    Xuhui Zhou, Zhe Su, Tiwalayo Eisape, and 2 more authors
    In EMNLP, 2024
  8. ICLR
    Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
    Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, and 4 more authors
    In ICLR, 2024
  9. ICLR
    SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
    Xuhui Zhou*, Hao Zhu*, Leena Mathur, and 8 more authors
    In ICLR, 2024
  10. ICLR
    WebArena: A Realistic Web Environment for Building Autonomous Agents
    Shuyan Zhou, Frank F. Xu, Hao Zhu, and 8 more authors
    In ICLR, 2024
  11. EACL
    Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models
    Natalie Shapira, Mosh Levy, Hossein Seyed Alavi, and 5 more authors
    In EACL, 2024

2023

  1. ACL
    COBRA 🐍 Frames: Contextual Reasoning about Effects and Harms of Offensive Statements
    Xuhui Zhou, Hao Zhu, Akhila Yerukola, and 4 more authors
    In Findings of ACL, 2023
  2. EMNLP
    “Don’t Take This Out of Context!” On the Need for Contextual Models and Evaluations for Stylistic Rewriting
    Akhila Yerukola, Xuhui Zhou, and Maarten Sap
    In EMNLP, 2023
  3. EMNLP
    FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions
    Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, and 4 more authors
    In EMNLP, 2023
  4. EMNLP
    Learning to translate by learning to communicate
    C. Downey*, Xuhui Zhou*, L. Liu, and 1 more author
    In EMNLP MRL, 2023

2022

  1. NAACL
    Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
    Maarten Sap, Swabha Swayamdipta, Laura Vianna, and 3 more authors
    In NAACL, 2022
  2. ACL ConvAI
    Extracting and Inferring Personal Attributes from Dialogue
    Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, and 2 more authors
    In ACL ConvAI, 2022
  3. ICLR EmeCom
    Emergent Communication Fine-tuning (EC-FT) for Pretrained Language Models
    Shane Steinert-Threlkeld, Xuhui Zhou, Zeyu Liu, and 1 more author
    In Emergent Communication Workshop at ICLR 2022, 2022

2021

  1. EACL
    Challenges in Automated Debiasing for Toxic Language Detection
    Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, and 2 more authors
    In EACL, 2021

2020

  1. BlackboxNLP
    Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets
    Chuanrong Li, Lin Shengshuo, Zeyu Liu, and 3 more authors
    In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, Nov 2020
  2. EMNLP
    Multilevel Text Alignment with Cross-Document Attention
    Xuhui Zhou, Nikolaos Pappas, and Noah A. Smith
    In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Nov 2020
  3. ACL SRW
    RPD: A Distance Function Between Word Embeddings
    Xuhui Zhou, Shujian Huang, and Zaixiang Zheng
    In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Jul 2020
  4. AAAI
    Evaluating Commonsense in Pre-trained Language Models
    Xuhui Zhou, Y. Zhang, Leyang Cui, and 1 more author
    In Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 2020