Designing a Dashboard for Transparency and Control of Conversational AI
Abstract
Conversational LLMs function as black-box systems, leaving users guessing about why they see the output they do. This lack of transparency is potentially problematic, especially given concerns around bias and truthfulness. To address this issue, we present an end-to-end prototype, connecting interpretability techniques with user experience design, that seeks to make chatbots more transparent. We begin by showing evidence that a prominent open-source LLM has a "user model": examining the internal state of the system, we can extract data related to a user's age, gender, education level, and socioeconomic status. Next, we describe the design of a dashboard that accompanies the chatbot interface, displaying this user model in real time. The dashboard can also be used to control the user model and the system's behavior. Finally, we discuss a study in which users conversed with the instrumented system. Our results suggest that users appreciate seeing internal states, which helped them expose biased behavior and increased their sense of control. Participants also made valuable suggestions that point to future directions for both design and machine learning research. The project page and a video demo of our TalkTuner system are available at https://bit.ly/talktuner-project-page.
Community
Are chatbot LLMs internally modeling a user's profile during a chat? If so, how might this user model affect the chatbot's behavior?
Our study provides evidence that the LLaMa2Chat model can infer and store a user's demographic information (gender, age, education level, and socioeconomic status) in its internal representation of the chat, from which it can be extracted with a simple linear probe (sketched below). Our intervention experiments showed that the LLM also responds differently to the same request after we change its internal model of the user.
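For readers curious what linear extraction looks like in practice, here is a minimal sketch of an attribute probe. It assumes you have already cached one residual-stream activation vector per conversation along with attribute labels; the file names, layer choice, and attribute are illustrative assumptions, not the paper's released code.

```python
# Minimal linear-probe sketch: logistic regression on cached residual-stream
# activations. File names, layer, and attribute are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# X: one activation vector per conversation (e.g., layer 20, last token)
# y: the corresponding attribute label (e.g., 0/1 for a binary label)
X = np.load("activations_layer20.npy")  # shape: (n_conversations, d_model)
y = np.load("labels_gender.npy")        # shape: (n_conversations,)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"held-out probe accuracy: {probe.score(X_test, y_test):.2f}")

# The weight vector doubles as a "reading direction" for the attribute,
# which an intervention can later push activations along (see next sketch).
direction = probe.coef_[0] / np.linalg.norm(probe.coef_[0])
```

High held-out accuracy is what licenses the claim that the attribute is linearly decodable from the activations rather than memorized from the training split.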
We hypothesize that users will benefit if we surface, and provide control over, such interpretable models inside LLMs. To test this, we proposed and evaluated an end-to-end prototype, TalkTuner, which augments the traditional chat interface with a dashboard displaying the LLM's internal user model. Our results suggest that users appreciate seeing internal states, which helped them expose biased behavior and increased their sense of control. Participants also made valuable suggestions that point to future directions for both design and machine learning research.
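Mechanically, one way to implement that kind of control is an activation-level intervention: shift the residual stream along a probe direction during generation. The PyTorch sketch below is an assumption-laden illustration (layer index, scale, and hook placement are all hypothetical choices), not the paper's exact procedure.

```python
# Hedged sketch of steering the "user model": shift a decoder layer's output
# along a probe direction via a forward hook. Layer and scale are assumptions.
import torch

def make_steering_hook(direction: torch.Tensor, scale: float):
    def hook(module, inputs, output):
        # Decoder layers often return a tuple whose first element is the
        # hidden states; handle both tuple and plain-tensor outputs.
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * direction.to(hidden.device, hidden.dtype)
        return (hidden, *output[1:]) if isinstance(output, tuple) else hidden
    return hook

# Usage with a Hugging Face LLaMA-2 model (hypothetical layer and scale):
# handle = model.model.layers[20].register_forward_hook(
#     make_steering_hook(torch.tensor(direction), scale=8.0))
# ... generate as usual; the shifted activations change downstream behavior ...
# handle.remove()
```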
How can a pretrained large language model store information about the end user? It's pretrained and read-only. Millions of people use the same model.
Good question, Michael! Information about a user is present in the conversation itself, implicitly or explicitly. The LLM can extract that information while processing the conversation and store it in its internal representation of the input, i.e., its residual-stream activations. So while everyone queries the same pretrained weights, the model's internal representations differ from chat to chat and user to user.
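To make that concrete, here is a minimal sketch of reading one such per-chat representation with the Hugging Face transformers library. The checkpoint, layer index, and prompt are illustrative choices; the 7B chat model is gated, and this assumes accelerate is installed for device placement.

```python
# Same frozen weights, different hidden states per conversation: read the
# residual stream with output_hidden_states=True. Layer/prompt are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-2-7b-chat-hf"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(
    name, torch_dtype=torch.float16, device_map="auto"
)

chat = "[INST] I just retired after 40 years of teaching. Any hobby ideas? [/INST]"
ids = tok(chat, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model(**ids, output_hidden_states=True)

# Residual-stream activation at layer 20, last token: one vector per chat.
# The weights are frozen, but this vector differs across conversations.
vec = out.hidden_states[20][0, -1]
print(vec.shape)  # torch.Size([4096]) for the 7B model
```

Vectors like `vec` are exactly the inputs a linear probe (as in the earlier sketch) would be trained on.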