Alane Suhr

Publications

*, ** indicate equal contribution.
2024
Fine-tuning large vision-language models as decision-making agents via reinforcement learning. Yuexiang Zhai, Hao Bai*, Zipeng Lin*, Jiayi Pan*, Shengbang Tong*, Yifei Zhou*, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, and Sergey Levine. To appear in NeurIPS. web
code
DigiRL: Training in-the-wild device-control agents with autonomous reinforcement learning. Hao Bai*, Yifei Zhou*, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, and Aviral Kumar. To appear in NeurIPS. web
code
Using language models to disambiguate lexical choices in translation. Josh Barua, Sanjay Subramanian, Kayo Yin, and Alane Suhr. To appear in EMNLP.
Grounding language in multi-perspective referential communication. Zineng Tang, Lingjun Mao, and Alane Suhr. To appear in EMNLP.
Autonomous evaluation and refinement of digital agents. Jiayi Pan, Yichi Zhang, Nicholas Tomlin, Yifei Zhou, Sergey Levine, and Alane Suhr. To appear in COLM.

Also appeared at the MAR Workshop at CVPR 2024 (won best paper)!
web
code
UNcommonsense reasoning: Abductive reasoning about uncommon situations. Wenting Zhao, Justin T. Chiu, Jena D. Hwang, Faeze Brahman, Jack Hessel, Sanjiban Choudhury, Yejin Choi, Xiang Lorraine Li*, and Alane Suhr*. In NAACL.
Quantifying language models' sensitivity to spurious features in prompt design or: How I learned to start worrying about prompt formatting. Melanie Sclar, Yejin Choi, Yulia Tsvetkov, and Alane Suhr. In ICLR. code
[Image: Logo for WIMBD: a blue magnifying glass with WIMBD written on the handle, over three lines of rectangles to look like text. Most of the rectangles are blue but some are different colors.]
What's in my big data? Yanai Elazar, Akshita Bhagia, Ian Magnusson, Abhilasha Ravichander, Dustin Schwenk, Alane Suhr, Pete Walsh, Dirk Groeneveld, Luca Soldaini, Sameer Singh, Hannaneh Hajishirzi, Noah A. Smith, and Jesse Dodge. In ICLR.
Spotlight
web
2023
[Image: A demonstration of ambiguity in the sentence 'The cat was lost after leaving the house' and its relationship with the hypothesis 'The cat could not find its way'. If 'lost' is interpreted as 'unable to find its own way', there is an entailment relationship, and this is accompanied with an illustration of a confused cat. If the interpretation of 'lost' is 'unable to be found', there is a neutral relationship between premise and hypothesis; this is accompanied with an illustration of a poster for a lost cat.]
We're afraid language models aren't modeling ambiguity. Alisa Liu, Zhaofeng Wu, Julian Michael, Alane Suhr, Peter West, Alexander Koller, Swabha Swayamdipta, Noah A. Smith, and Yejin Choi. In EMNLP.
Continual learning for instruction following from realtime feedback. Alane Suhr and Yoav Artzi. In NeurIPS.
Spotlight
Fine-grained human feedback gives better rewards for language model training. Zeqiu Wu*, Yushi Hu*, Weijia Shi, Nouha Dziri, Alane Suhr, Prithviraj Ammanabrolu, Noah A. Smith, Mari Ostendorf, and Hannaneh Hajishirzi. In NeurIPS.
Spotlight
web
Do embodied agents dream of pixelated sheep?: Embodied decision making using language guided world modelling. Kolby Nottingham, Prithviraj Ammanabrolu, Alane Suhr, Yejin Choi, Hannaneh Hajishirzi, Sameer Singh, and Roy Fox. In ICML.

Also appeared at the Reincarnating RL workshop held at ICLR 2023.
web
[Image: Visualization of ToM reasoning, where character Bob maintains a belief about the state of the world (the apple is in the box, which is in the room where the basket also is), as well as a belief state over what another character Anne thinks (that the apple is actually in the basket).]
Minding language models' (lack of) theory of mind: A plug-and-play multi-character belief tracker. Melanie Sclar, Sachin Kumar, Peter West, Alane Suhr, Yejin Choi, and Yulia Tsvetkov. In ACL.
Outstanding Paper Award

Also appeared at the Theory of Mind Workshop held at ICML 2023.
2022
[Image: A Tangram puzzle showing an abstract figure.]
Abstract visual reasoning with tangram shapes. Anya Ji, Noriyuki Kojima*, Noah J. Rush*, Alane Suhr*, Wai Keen Vong, Robert Hawkins, and Yoav Artzi. In EMNLP.
Best Long Paper Award
data
2021
Analysis of language change in collaborative instruction following. Anna Effenberger, Rhia Singh*, Eva Yan*, Alane Suhr, and Yoav Artzi. In Findings of EMNLP.

Also appeared at SCiL 2022.
code
Crowdsourcing beyond annotation: Case studies in benchmark data collection. Alane Suhr, Clara Vania, Nikita Nangia, Maarten Sap, Mark Yatskar, Samuel R. Bowman, and Yoav Artzi. Tutorial presented at EMNLP.
Continual learning for grounded instruction generation by observing human following behavior. Noriyuki Kojima, Alane Suhr, and Yoav Artzi. In TACL. code
2020
Exploring underexplored generalization challenges for cross-database semantic parsing. Alane Suhr, Ming-Wei Chang, Peter Shaw, and Kenton Lee. In ACL. code
talk
2019
Executing instructions in situated collaborative interactions. Alane Suhr, Claudia Yan, Charlotte Schluger*, Stanley Yu*, Hadi Khader**, Marwa Mouallem**, Iris Zhang, and Yoav Artzi. In EMNLP. data
web
[Image: Two images of dogs, plus an NLVR2 caption beneath. The left image contains two dogs standing in sand; the right image contains a single dog standing on grass.]
A corpus for reasoning about natural language grounded in photographs. Alane Suhr*, Stephanie Zhou*, Ally Zhang, Iris Zhang, Huajuan Bai, and Yoav Artzi. In ACL.

Also appeared at the 2017 AAAI Fall Symposium on Natural Communication for Human-Robot Collaboration.
data
web
Touchdown: Natural language navigation and spatial reasoning in visual street environments. Howard Chen, Alane Suhr*, Dipendra Misra, Noah Snavely, and Yoav Artzi. In CVPR. code
2018
Situated mapping of sequential instructions to actions with single-step reward observation. Alane Suhr and Yoav Artzi. In ACL. code
talk
Neural semantic parsing. Matt Gardner, Pradeep Dasigi, Srinivasan Iyer, Alane Suhr, and Luke Zettlemoyer. Tutorial presented at ACL.
Learning to map context-dependent sentences to executable formal queries. Alane Suhr, Srinivasan Iyer, and Yoav Artzi. In NAACL.
Outstanding Paper Award
code
talk
2017

A corpus of natural language for visual reasoning. Alane Suhr, Mike Lewis, James Yeh, and Yoav Artzi. In ACL.
Best Resource Paper Award

Featured in AI Magazine and NLP Highlights.
data
web
talk