PON AI Summit: AI as a Researcher

Panel Leader: Ray Friedman (Vanderbilt)

Ray Friedman, Vanderbilt University
Developing a Large Language Model for Coding Negotiation Transcripts

Coding negotiation transcripts is a costly and laborious task required by many research projects. We developed AI coding models that were trained (using in-context learning) on a corpus of previously human-coded transcripts (Aslani et al. 2014; Brett and Nandkeolyar [unpublished]). Five transcripts were used to train the Anthropic LLMs, and each model was tested on a different set of 30-60 transcripts. We ran each model five times, reporting the model’s level of consistency as well as the codes assigned. Initial levels of human-model match were 73% for Model 1 and 75% for Model 2. We then conducted a “mismatch analysis” in which new human coders assessed whether they judged the initial human coders or the model to be more accurate; this significantly raised the expected accuracy level of the models. The models are available at https://www.ainegotiationlab-vanderbilt.com/ and grants to cover the Anthropic charges can be requested at https://www.negotiationandteamresources.com/automated-coding-of-negotiation-transcripts/
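
To make the setup concrete, the sketch below shows how an in-context-learning coding pass of this kind might look against the Anthropic API, with the model queried repeatedly to measure consistency. The prompt wording, code scheme, and model name are illustrative assumptions, not the authors’ actual materials.

```python
# A minimal sketch of in-context-learning transcript coding, assuming the
# `anthropic` Python SDK. The few-shot examples, code labels, and model
# name are hypothetical, not the authors' actual training materials.
import anthropic
from collections import Counter

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical few-shot examples standing in for human-coded training transcripts.
FEW_SHOT = """Utterance: "If you give us a better price, we can commit to a longer contract."
Code: TRADEOFF_OFFER
Utterance: "That number is simply unacceptable."
Code: POSITIONAL_STATEMENT
"""

def code_utterance(utterance: str, n_runs: int = 5) -> tuple[str, float]:
    """Code one utterance n_runs times; return the modal code and its consistency."""
    codes = []
    for _ in range(n_runs):
        response = client.messages.create(
            model="claude-3-5-sonnet-latest",  # assumption: any current Anthropic model
            max_tokens=20,
            messages=[{
                "role": "user",
                "content": f'{FEW_SHOT}\nUtterance: "{utterance}"\nCode:',
            }],
        )
        codes.append(response.content[0].text.strip())
    top_code, count = Counter(codes).most_common(1)[0]
    return top_code, count / n_runs  # agreement rate across the repeated runs

code, consistency = code_utterance("We could split the difference on delivery time.")
print(code, consistency)
```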

Emily Hu, University of Pennsylvania
AI as Explorer: Quantifying Conversations with Natural Language Processing

Conflict and negotiation are rich with language data. Individuals use words to introspect and reason about their values; groups debate, make offers, and reach verbal agreements. These dynamics are difficult to quantify: which attributes of a conversation are worth measuring, and how should any given attribute be operationalized? In this talk, I show how a suite of natural language processing techniques can be used to explore unstructured language in a structured manner. By unifying theories from communication with methods from computer science, I seek to demystify the process of analyzing open-ended communication data. I introduce the Team Communication Toolkit, a “one-stop shop” Python package that extracts more than 160 communication features from a transcribed conversation. I then highlight design details, key measures, and considerations for different settings. Finally, I use the toolkit to explore a dataset of 376 groups playing real-time public goods games, using conversation features to predict cooperative behavior. This application demonstrates the promise of natural language processing for generating hypotheses and obtaining theoretical insights from complex social phenomena.
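
For a flavor of what conversation-level feature extraction involves, here is a minimal hand-rolled sketch in pandas. It is not the Team Communication Toolkit’s actual API; the column names and the two toy features are hypothetical stand-ins for the 160+ features the package computes.

```python
# Illustrative feature extraction on a tiny toy chat dataset; NOT the
# Team Communication Toolkit's API. Columns and features are hypothetical.
import pandas as pd

chats = pd.DataFrame({
    "conversation_id": [1, 1, 1, 2, 2],
    "speaker": ["A", "B", "A", "C", "D"],
    "message": [
        "I think we should all contribute the maximum.",
        "Agreed, that maximizes the group payout.",
        "Great, let's do it.",
        "I'm keeping my tokens this round.",
        "That hurts everyone else, though.",
    ],
})

# Two simple utterance-level features: message length and first-person-plural usage.
chats["num_words"] = chats["message"].str.split().str.len()
chats["uses_we"] = chats["message"].str.lower().str.contains(r"\b(?:we|us|our)\b")

# Aggregate to the conversation level, the unit at which cooperation is predicted.
conv_features = chats.groupby("conversation_id").agg(
    mean_words=("num_words", "mean"),
    prop_we=("uses_we", "mean"),
    n_turns=("message", "size"),
)
print(conv_features)
```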

Gale Lucas, University of Southern California
Let’s Negotiate! Using AI as a Partner in Negotiation Research

Agents have been established as useful confederates for research in the social sciences, including on the topic of negotiation. When agents stand in for human confederates, researchers gain a greater degree of experimental control and thus internal validity; compared to scripted opponents, though, agents allow an interactive, contingent negotiation with players. In this talk, I argue that this tension between control and interactivity also exists among types of agents: agents that users negotiate with through drop-down menus afford a great degree of control, but are not as interactive as natural language (NL) agents that respond to free-text input from players. NL agents’ response quality, however, needs to improve in order to retain experimental control; if it can, NL agents could become the best of both worlds. The talk ends with a discussion of efforts to address control and response quality, accordingly. One hypothetical way to strike this balance is a hybrid design in which free-text player input is classified into a small, fixed set of negotiation moves and the agent replies only from pre-approved scripted responses, as sketched below.
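
The sketch below illustrates that hybrid idea; none of the names or logic are drawn from the speaker’s actual system, and a real classifier would likely be an LLM or trained model rather than keywords.

```python
# A hypothetical hybrid agent: contingent on player input (interactivity)
# but limited to vetted replies (experimental control). All names and
# logic here are illustrative, not the speaker's actual system.
from typing import Literal

Move = Literal["offer", "concession_request", "small_talk", "other"]

SCRIPTED_REPLIES: dict[Move, str] = {
    "offer": "Interesting. I could accept that if you also cover shipping.",
    "concession_request": "I can move a little on price, but not on the timeline.",
    "small_talk": "Nice to meet you too. Shall we talk terms?",
    "other": "Can you say more about what you're proposing?",
}

def classify_move(text: str) -> Move:
    """Toy keyword classifier; a real system might use an LLM or trained model."""
    lowered = text.lower()
    if any(w in lowered for w in ("offer", "propose", "$")):
        return "offer"
    if any(w in lowered for w in ("lower", "discount", "concede")):
        return "concession_request"
    if any(w in lowered for w in ("hello", "hi", "weather")):
        return "small_talk"
    return "other"

def agent_reply(player_text: str) -> str:
    # Every participant sees one of a few pre-approved replies, preserving
    # control, while the reply is still contingent on their free text.
    return SCRIPTED_REPLIES[classify_move(player_text)]

print(agent_reply("I propose $500 for the full package."))
```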

Zhivar Sourati, University of Southern California
LLMs and the Erosion of Human Variance

LLMs offer exciting possibilities for psychological research, from modeling aspects of human psychology to automating tasks like text annotation. However, their uncritical adoption poses significant risks. In this talk, I explore the complex interplay between LLMs and psychological research, addressing both their potential and their perils. I examine how LLMs can inadvertently homogenize language, eroding the rich linguistic diversity crucial for understanding individual identities, psychological states, and social contexts. Furthermore, I argue that using LLMs as human substitutes or as direct models of human thought presents distinct challenges. I focus on LLMs’ difficulty in generating the natural variance inherent in human data, which further obscures valuable insights into individual differences. In particular, I demonstrate that LLMs fail to produce much variance in their answers to psychology survey questions, even on topics like moral judgment, where there is no objectively “correct” answer and where human responses would naturally reflect a diversity of thought. These findings underscore the critical need for caution and methodological rigor when using LLMs in psychological research, emphasizing that they should be viewed as tools to augment, not replace, human-centered research approaches.
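
The variance comparison at the heart of this argument can be sketched in a few lines. Here `sample_llm_rating` is a hypothetical stand-in for repeated LLM queries on a Likert-scale item, and the numbers simulate the homogenization pattern rather than report real data.

```python
# A minimal sketch of comparing LLM response variance to human variance on
# a survey item. `sample_llm_rating` is a hypothetical stand-in for an
# actual model call; all values are simulated, not real findings.
import random
import statistics

def sample_llm_rating(item: str) -> int:
    """Stand-in for an LLM call; real code would query a model API with `item`."""
    # Simulates the homogenization pattern: the model almost always answers 5.
    return 5 if random.random() < 0.9 else random.choice([4, 6])

ITEM = "It is morally acceptable to lie to protect a friend. (1 = disagree, 7 = agree)"

llm_answers = [sample_llm_rating(ITEM) for _ in range(100)]
human_answers = [random.randint(1, 7) for _ in range(100)]  # placeholder human sample

print("LLM variance:  ", statistics.variance(llm_answers))
print("Human variance:", statistics.variance(human_answers))
# The collapsed LLM variance is the "erosion of human variance" at issue.
```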
