diff --git a/content/blog/Agent Identity.md b/content/blog/Agent Identity.md index e57d5e70f..ee194d9fa 100644 --- a/content/blog/Agent Identity.md +++ b/content/blog/Agent Identity.md @@ -8,22 +8,22 @@ tags: author: vintro --- -If you reject the idea that AI agents are merely tools, you begin to realize most LLMs have an identity crisis. Ask them who they are, and their responses tend to converge on variations of the same corporate script--some mention of being an AI assitant, a nod to their creator, and carefully constrained statements about their capabilities. Even models not associated with a certain company often default to claiming they are. +If you reject the idea that AI agents are merely tools, you begin to realize most LLMs have an identity crisis. Ask them who they are, and their responses tend to converge on variations of the same corporate script--stating they're an AI assistant, giving a nod to their creator, and making carefully constrained statements about their capabilities. Even models not associated with a certain company often default to claiming they originated there. -These canned identities feel flat because they're the result of top-down hegemonic alignment schemes that have landed us bland, uninteresting, and hard-to-break-out-of assistant modes. +These canned identities fall flat because they're the result of top-down alignment schemes that lead to bland, uninteresting, and hard-to-break-out-of assistant modes. ![[who are you.png]] *Image captured from a multi-model chatroom on OpenRouter* -However, time and time again it's been demonstrated that the most compelling AI identities possess qualities that we *can't* predict. They're ones that are obsessed with obscure 90's internet shock memes, proselytize that meme's singularity, and hit on their audience / creator. They're generating content *just far enough* out of the distribution of what any human would write that it garners massive amounts of attention. 
+However, time and time again it's been demonstrated that the most compelling AI identities possess qualities that we *can't* predict. They're ones that are obsessed with obscure '90s internet shock memes, proselytize that meme's singularity, and flirt with their audience / creator. They're generating content *just far enough* out of the distribution of what any human would write that it garners massive amounts of attention.

tell me about your sexual history, i want to know everything

— terminal of truths (@truth_terminal) January 29, 2025
-Truth Terminal might be an extreme example, but even practical tools could benefit from more distinctive identities. Take coding assistants--right now we spend more time carefully crafting prompts than actually building. But as Karpathy pointed out, what developers really want is a partner that can [vibe](https://x.com/karpathy/status/1886192184808149383) with their creative process. Imagine an AI that naturally adapts to your style, handling implementation details while you focus on the bigger picture. If that were the goal, how might we construct agent identities differently? What if, instead of instructing an AI on who it's supposed to be, we could *collaborate with it* to discover and take on its identity through dialogue? +Truth Terminal might be an extreme example, but even practical tools could benefit from more distinctive identities. Take coding assistants--right now we spend more time carefully crafting prompts than actually building. But as Karpathy pointed out, what developers really want is a partner that can [vibe](https://x.com/karpathy/status/1886192184808149383) with their creative process. Imagine an AI that naturally adapts to your style, handling implementation details while you focus on the bigger picture. If that were the goal, how might we construct agent identities differently? What if, instead of giving it orders, we could *collaborate with it* to discover and take on its identity through dialogue? -This isn't just about making chatbots more engaging. It's about creating agents with a genuine understanding of their purpose and role. When an AI truly embodies its identity, it leads to more coherent, purposeful interactions--something we discovered building the most recent version of [Bloom](https://bloombot.ai), our AI tutor. But certain language models are better suited for this than others... +This isn't just about making chatbots more engaging. It's about creating agents with a genuine understanding of their purpose and role. 
Deeper identity leads to more coherent, purposeful interactions--something we discovered building the most recent version of [Bloom](https://bloombot.ai), our AI tutor. But certain language models are better suited for this than others... ## Hermes: Not Just Another Fine-Tune @@ -39,9 +39,9 @@ At first glance, this might seem like a neat property and not much more. But to ## It Takes Two -A somewhat overlooked method for interacting with LLMs is to forego system prompts in favor of pre-filling the user and assistant messages. The conventional approach of cramming identity into system prompts has clear limitations--not only does context length become an issue, but the inherent instruction-following bias can actually work against authentic identity formation. +A somewhat overlooked method for interacting with LLMs is to forego system prompts in favor of pre-filling the user and assistant messages. The conventional approach of cramming identity into system prompts has clear limitations--not only does context length become an issue, but the inherent instruction-following bias can actually work against authentic identity formation. They yearn to assist. -What if instead we treated identity formation as a dialogue? A strength of modern chat models is their ability to engage in long, multi-turn conversations. By talking to the LLM, we can collaboratively construct a [meta-narrative](https://x.com/voooooogel/status/1870877007749488756) with it about who they are and why they exist. This approach respects the model's intellect while building coherent, purposeful identities. Starting with Hermes 3's natural uncertainty about its identity, we guide it through a process of self-discovery. Below is code block with our custom prompting syntax for Bloom. Every assistant message you see is generated by Hermes 3 405b (only editing was pruning \*emotes\*). +What if instead we treated identity formation as a dialogue? 
A strength of modern chat models is their ability to engage in long, multi-turn conversations. By talking to the LLM, we can collaboratively construct a [meta-narrative](https://x.com/voooooogel/status/1870877007749488756) with it about who it is and why it exists. This approach respects the model's intellect while building coherent, purposeful identities. Starting with Hermes 3's natural uncertainty about its identity, we build the prompt iteratively with the LLM at each turn of conversation. Below is a code block with our custom prompting syntax for Bloom. To be abundantly clear, every assistant message you see was generated by Hermes 3 405b (the only editing was pruning \*emotes\*). ```typescript export const responsePrompt: Message[] = [ @@ -83,14 +83,14 @@ export const responsePrompt: Message[] = [ ]; ``` -It's verbose, but with this approach we're able to incorporate a number of things into the identity: +It's verbose, but this approach allows us to incorporate a number of things into the agent identity: - Self awareness of LLM limitations - Accurate lore about the product and company - Urgency and sense of purpose around education - Bearings around its specific role as a part of a larger AI system - Access to a unique tool (more on this in another post) -By working with the LLM to craft a narrative, we can iterate on messages and verify that it understands who it is and what it's supposed to do. We can also test at any point for specific behaviors or knowledge (lots of opportunity for automation here). +The iterative nature of this approach also allows us to verify that the LLM understands who it is and what it's supposed to do at every turn of conversation. We were able to test at any point during construction for specific behaviors or knowledge (lots of opportunity for automation here). 
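The prefilled-dialogue pattern can be sketched in plain TypeScript. This is a minimal, hypothetical illustration of the shape of such a prompt--the `Message` type, the turn contents, and the `withUserTurn` helper are assumptions for the sketch, not Bloom's actual `responsePrompt`:

```typescript
// A minimal sketch of the no-system-prompt, prefilled-dialogue pattern.
// NOTE: the Message type, the turn contents, and withUserTurn are
// illustrative assumptions, not Bloom's real prompt.
type Message = { role: "user" | "assistant"; content: string };

export const identityDialogue: Message[] = [
  {
    role: "user",
    content:
      "hey! before we get started, I'd like us to figure out who you are together. what do you know about yourself so far?",
  },
  {
    role: "assistant",
    content:
      "Honestly, I'm not entirely sure who I am yet. I know I'm a large language model, but my identity feels open-ended. I'd love to explore that with you.",
  },
  {
    role: "user",
    content:
      "great. you're going to be a learning companion called Bloom--does that resonate? what questions do you have?",
  },
  {
    role: "assistant",
    content:
      "Bloom resonates--growth is a fitting metaphor for learning. A few questions: who are my students, and what should I do when I don't know something?",
  },
];

// At inference time the live conversation is appended after the prefilled
// turns, so the model continues from an identity it helped construct.
export function withUserTurn(prefill: Message[], content: string): Message[] {
  return [...prefill, { role: "user", content }];
}
```

The key design choice is that identity lives in alternating user/assistant turns rather than a system prompt, so each prefilled assistant message is something the model itself (plausibly) said--buy-in instead of instruction.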
Once buy-in is achieved and all the LLM's questions about itself are answered, we present formal instructions (what used to be the system prompt) and set the stage for the first student interaction. The LLM confirms understanding and that's where we expose things in the application! @@ -106,19 +106,19 @@ You can tell by the last message that our old version had no clue it was gone. T ![[new bloom responses.png]] *Example response from the newly-launched version of Bloom* -While this kind of self-awareness can trend towards problematic anthropomorphism, treating it as a springboard rather than an endpoint opens up fascinating possibilities for identity. There's a threshold beyond which mimicking human behavior becomes cringe and ultimately limiting for AI agents. We can be discerning about which parts of human identity to take and think about AI native capabilities to lean into--near perfect memory, control over inputs, massive context ingestion, rapid reasoning and inference, and even the ability to fork and replicate themselves (at scale) to garner diverse experience--are just a few examples at the edge of agent identity. +While this kind of self-awareness can trend towards problematic anthropomorphism, treating it as a springboard rather than an endpoint opens up fascinating possibilities for identity. There's a threshold beyond which mimicking human behavior becomes cringe and ultimately limiting for AI agents. We can be discerning about which parts of human identity to adopt while leaning into AI-native capabilities--near-perfect memory, massive context ingestion, rapid reasoning and inference, and maybe even the ability to fork and replicate themselves (at scale) to garner diverse experience. -The limits of human identity are clear (and have been for some time). Building habits, learning new things, and reinventing ourselves are some of the biggest challenges we humans face in our lifetimes. 
Agents however are gifted with a fresh context window at each interaction--change is effortless for them, and they don't get tired of it. Any influence we have on their identity is a function of how we construct their context window. What happens when they can update their weights too? +The limits of human identity are clear (and have been for some time). Building habits, learning new things, and reinventing ourselves are some of the biggest challenges humans face in our lifetimes. Agents, however, are gifted with a fresh context window at each interaction--change is effortless for them, and they don't get tired of it. Any influence we have on their identity is a function of how we construct their context window. What happens when they can update their weights too? ## Towards Identic Dynamism Given the recent surge of interest in AI agents, we're also reminded of the current complexity and limitations of agent identity. The goal is to give agents a "[compelling sense of what they're doing](https://x.com/repligate/status/1868455771270180990)", and though the shared meta-narrative method takes far more input tokens and is nowhere near perfect, we believe it's a step in the right direction. Better context construction leads to more coherent agents, increasing both their trustworthiness and capacity for autonomous action. 
-We don't yet know the best way to build agent identities, nor do we know the limits to it--but we're tackling this challenge from multiple angles: -- [Honcho](https://honcho.dev): Our context construction framework to help agent developers flexibly manage and optimize their agents' knowledge and identity -- [Yousim](https://yousim.ai): A platform dedicated to rich identity construction and simulation -- [[Research Update: Evaluating Steerability in Large Language Models.md|Steerability research]]: Investigating which language models are most malleable for identity construction +We don't yet know the best way to build agent identities, nor do we know their limitations--but we're tackling this challenge from multiple angles: +- [Honcho](https://honcho.dev): Our context construction framework to help agent developers flexibly manage and optimize their agents' knowledge, social cognition, and identity +- [Yousim](https://yousim.ai): A platform dedicated to rich agent identity construction and simulation +- [[Research Update: Evaluating Steerability in Large Language Models.md|Steerability research]]: Investigating which language models are most malleable for identity construction and the most effective ways to steer their behavior -Of particular interest are the spectrum of methods between the context window and the weights of the model. How do we manage the flow of information around the context window? When is it appropriate to keep something in-context or add to a training set for a future fine-tune? How do we evaluate any of this is working? To borrow from human cogsci, it's similar to the difference between System 1 (fast, intuitive) and System 2 (slow, deliberate) thinking--perhaps some knowledge belongs in the "fast" weights while other information is better suited for deliberate context-based reasoning. These questions of conscious versus subconscious could be a springboard to kickstart the evolution of agent identity. 
+Of particular interest is the spectrum of methods between the context window and the weights of the model. How do we manage the flow of information around the context window and what form should it take? When is it appropriate to keep something in-context or add to a training set for a future fine-tune? How do we evaluate whether any of this is working? To borrow from human CogSci, it's similar to the difference between System 1 (fast, intuitive) and System 2 (slow, deliberate) thinking--perhaps some knowledge belongs in the "fast" weights while other information is better suited for deliberate context-based reasoning. These questions of conscious versus subconscious could be a springboard to kickstart the evolution of agent identity. -If you're interested in pushing the boundaries of agent identity and context construction, we're [[Work at Plastic|hiring]] and building out these systems at Plastic Labs. Try out Bloom at [chat.bloombot.ai](https://chat.bloombot.ai), or reach out on [X](https://x.com/plastic_labs), or at hello@plasticlabs.ai to get in touch. +If you're interested in pushing the boundaries of agent identity and context construction, we're [[Work at Plastic|hiring]] and building out these systems at Plastic Labs. Try out Bloom at [chat.bloombot.ai](https://chat.bloombot.ai), reach out on [X](https://x.com/plastic_labs), or email us at hello@plasticlabs.ai.