UX for Agents, Part 1: Chat

At Sequoia’s AI Ascent conference in March, I discussed three limitations for agents: planning, user experience, and memory. Check out this talk here. In this post, I’m going to dive deeper into UX for agents. Thanks to Nuno Campos, founding engineer at LangChain, for his many original thoughts and analogies. This was originally meant to be a single blog post, but there are so many different aspects of UX for agents that I’m breaking it into three separate blog posts. This is the first one.

Human-computer interaction has been a well-studied field for years. I think that in the coming years, human-agent interaction will also become a key research area.

Agentic systems differ from traditional computing systems of the past due to new challenges arising from latency, unreliability, and natural language interfaces. As such, I strongly believe that new UI/UX paradigms for interacting with these agentic applications will emerge.

Although agent systems are still in their infancy, I believe there are several emerging UX paradigms. In this blog, we will discuss the UX that is perhaps the most dominant so far: chat.

Streaming chat

The “streaming chat” user experience is the most dominant user experience so far. It is simply an agentic system that streams its thoughts and actions in a chat format – ChatGPT is the most popular example. This interaction model seems basic, but it is actually quite good for several reasons.

The main way to “program” an LLM is to use natural language. In chat, you interact directly with the LLM via natural language. This means that there are virtually no barriers between you and the LLM.

💡

In some ways, streaming chat is the “terminal” of early computers.

A terminal (especially in early computers) provides lower-level, more direct access to the underlying operating system. Over time, computers shifted toward more UI-based interactions, and these days that is generally how we use them. Streaming chat may be similar: it’s the first way we created to interact with LLMs, and it provides fairly direct access to the underlying LLM. Over time, other UXs may emerge (just as computers became more UI-based), but low-level access has many advantages, especially in the beginning!

One of the reasons streaming chat is great is that LLMs can take a while to work. Streaming allows the user to understand exactly what is happening behind the scenes. You can stream the intermediate actions that the LLM takes (both the actions it takes and the results it gets) as well as the tokens while the LLM is “thinking.”
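To make the idea of streaming both intermediate steps and tokens concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `Event`, `run_agent_streaming`, and `format_event` are names invented for illustration, and the tool call and LLM output are hard-coded stand-ins rather than calls to any real library.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Event:
    kind: str     # "action", "result", or "token"
    content: str


def run_agent_streaming(question: str) -> Iterator[Event]:
    """Hypothetical agent loop that yields each step as soon as it happens."""
    # Stand-in for a real tool call and its result.
    yield Event("action", f"search({question!r})")
    yield Event("result", "forecast: sunny")
    # Stand-in for token-by-token LLM output.
    for token in ["It ", "is ", "sunny."]:
        yield Event("token", token)


def format_event(event: Event) -> str:
    """Tokens are shown inline; actions and results get their own labelled line."""
    if event.kind == "token":
        return event.content
    return f"\n[{event.kind}] {event.content}\n"


if __name__ == "__main__":
    # The UI renders each event as it arrives instead of waiting for the end.
    for event in run_agent_streaming("weather in Paris"):
        print(format_event(event), end="", flush=True)
```

Because the agent is a generator, the chat UI can show the action, then its result, then each token, without waiting for the whole run to finish.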

Another advantage of streaming chat is that LLMs can often make mistakes. Chat provides a great interface for naturally correcting and guiding the LLM! We are already very used to having follow-up conversations and discussing things iteratively via chat.

However, streaming chat does have its drawbacks. First, streaming chat is a relatively new user experience, so our existing chat platforms (iMessage, Facebook Messenger, Slack, etc.) don’t have it built in. Second, it’s a bit of a hassle for longer tasks: am I just going to sit there and watch the agent work? Third, streaming chat usually needs to be triggered by a human, which means the human is still very much involved.

Non-streaming chat

It seems strange to call it “non-streaming chat,” since we would have just called it “chat” two years ago — but here we are. Non-streaming chat has many of the same properties as streaming chat: it exposes the LLM quite directly to the user, and it allows for very natural corrections.

The big difference with non-streaming chat is that responses come back as complete batches, which has its pros and cons. The main downside is that you can’t see what’s going on under the hood, leaving you in the dark.

But… is that actually a problem?

Linus Lee recently had some interesting thoughts on “delegation” that I really enjoyed. Here’s an excerpt to illustrate:

I intentionally built the interface to be as opaque as possible.

He argues that an opaque interface requires a certain degree of trust, but once established it allows you to just delegate tasks to the agent without micromanagement. This asynchronous nature also lends itself to longer tasks, meaning agents do more work for you.

Assuming trust is established, this seems like a good thing. But it also opens the door to other problems. For example, how do you handle “double-texting,” where the user sends one message, the agent starts working on it, and then the user sends another message with a different (and sometimes unrelated) thought before the agent finishes its task? With streaming chat, you usually don’t have this problem because the agent’s streaming prevents the user from entering new input.
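The post does not prescribe an answer to double-texting, so the following is only a sketch of three policies one could choose between: rejecting the new message, queueing it, or interrupting the current task. The `Conversation` class and the policy names are assumptions made for illustration, not an API from any particular framework.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class Conversation:
    policy: str                      # "reject", "enqueue", or "interrupt"
    current_task: Optional[str] = None
    pending: Deque[str] = field(default_factory=deque)

    def receive(self, message: str) -> str:
        """Handle an incoming message and report what was done with it."""
        if self.current_task is None:
            self.current_task = message
            return "started"
        # The agent is still busy, so this message is a double-text.
        if self.policy == "reject":
            return "rejected"
        if self.policy == "enqueue":
            self.pending.append(message)
            return "queued"
        if self.policy == "interrupt":
            self.current_task = message   # abandon the old task
            return "interrupted"
        raise ValueError(f"unknown policy: {self.policy}")

    def finish(self) -> None:
        """Mark the current task done and start the next queued one, if any."""
        self.current_task = self.pending.popleft() if self.pending else None
```

For example, with `Conversation(policy="enqueue")`, a second message that arrives mid-task is held in `pending` and becomes the current task once `finish()` is called. Which policy feels right likely depends on how related the second message is to the first.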

One of the benefits of the non-streaming chat user experience is that it’s also much more native to us, meaning it can be easier to integrate into existing workflows. People are used to texting with humans – why shouldn’t they easily adapt to texting with AI?

💡

Another great advantage of non-streaming chat is that it’s often acceptable for the AI to take longer to respond.

This is often because non-streaming chat is more natively integrated into our existing workflows. We don’t expect our friends to text us back instantly. Why should we expect an AI to do that? This makes it easier to interact with more complex agent systems. These systems often take a while, and if you expect an instant response, the wait can be frustrating. Non-streaming chat often removes that expectation, making it easier to hand off more complex tasks.

It may seem at first that streaming is newer, flashier, and more futuristic than standard chat… but as we become more confident in our agent systems, will this trend reverse?

Is there more than just chat?

As you might guess given that this is just the first part of a three-part series, we believe there are other UXs to consider beyond chat. However, it’s worth remembering that chat is a very good UX. There’s a reason it’s so widely used.

Advantages of chat:

  • Allows the user to interact directly with the model
  • Allows you to easily answer follow-up questions and/or make corrections

Advantages/Disadvantages of Streaming Chat vs. Non-Streaming Chat