chore: Delete old job postings and fix dates on notes (#81)

Vineeth Voruganti 2024-11-20 12:15:29 -05:00 committed by GitHub
parent f170b4c031
commit 80a902ee2b
11 changed files with 95 additions and 89 deletions


@@ -1,24 +1,27 @@
---
title: (FILLED) Full-Stack Engineer
date: 08.23.24
tags:
- positions
- dev
- announcements
---
> [!custom] This position has been filled. However, we're always looking to meet great candidates. If you like what's listed here, please reach out regardless -- we are growing fast and might have similar positions in the future.
> Check out our other [[Work at Plastic|open positions here]].
(NYC / Remote, Full-Time)
## About the Role
We're searching for a full-stack engineer to help build [Honcho](https://honcho.dev), our user representation infrastructure, and accelerate the development of a new paradigm of AI personalization.
You should be comfortable working across the stack, from crafting user interfaces to designing APIs and managing deployments. It requires strong foundations in both frontend and backend.
This role is a fast-paced opportunity to work alongside a seasoned interdisciplinary team. If you're eager to learn, can tackle diverse challenges quickly, and love building at the edge, get in touch.
## About You
- 1-2 years' experience or equivalent (new grads OK)
- High cultural alignment with Plastic Labs' ethos
- Primary location +/- 3 hrs of EST
@@ -36,7 +39,9 @@ This role is a fast-paced opportunity to work alongside a seasoned interdiscipli
- Complementary interest in cognitive sciences (cs, linguistics, neuroscience, philosophy, & psychology) or other adjacent interdisciplinary fields a plus
## How to Apply
Please send the following to dev@plasticlabs.ai:
- **Resume/CV** in whatever form it exists (PDF, LinkedIn, website, etc)
- **Portfolio** of notable work (GitHub, pubs, ArXiv, blog, X, etc)
- **Statement** of alignment specific to Plastic Labs--how do you identify with our mission, how can you contribute, etc? (points for brief, substantive, heterodox)
@@ -45,5 +50,5 @@ Applications without these 3 items won't be considered, but be sure to optimize
And it can't hurt to [join Discord](https://discord.gg/plasticlabs) and introduce yourself or engage with [our GitHub](https://github.com/plastic-labs).
(Back to [[Work at Plastic]])


@@ -1,43 +0,0 @@
---
title: Design Contractor
date: 08.23.24
tags:
- positions
- design
- announcements
---
(Remote, Contract)
## About the Role
We're searching for a design contractor to update our digital franchises. This includes but isn't limited to our websites ([here](https://plasticlabs.ai), [here](https://honcho.dev), & [here](https://blog.plasticlabs.ai)) and assets for social sites.
A good fit is someone who is full-stack; has a clearly defined process; is willing to work with existing brand kit primitives; has proven ability to cohere quickly to team aesthetic, product, and mission; and can deliver fresh, subversive ideas rapidly.
If you've explored Plastic Labs and are excited by what we're up to, let's chat.
## About You
- 2-5 years' experience as a professional full-time designer
- Primary location +/- 3 hrs of EST
- Full-stack development ability
- Ability to sync
- Ability to design & build end-to-end
- Ability to grok product, mission, & aesthetics quickly
- Refined, organized process for aligning to company needs
- Capacity to scope work & execute in a predetermined timeframe
- Robust portfolio of recent work
- Recent customer references
- High empathy & agency
## How to Apply
Please send the following to ops@plasticlabs.ai:
- **Resume/CV** in whatever form it exists (PDF, LinkedIn, website, etc)
- **Portfolio** of notable & relevant work (links, docs, etc)
- **Statement** of process--how do you work, why is this a fit, how can you help?
- **References** you've worked with recently (2-3)
Applications without these 4 items won't be considered, but be sure to optimize for speed over perfection.
And it can't hurt to [join Discord](https://discord.gg/plasticlabs) or DM us [on X](https://x.com/plastic_labs) and introduce yourself.
(Back to [[Work at Plastic]])


@@ -5,6 +5,7 @@ tags:
- positions
- announcements
---
Plastic Labs is a research-driven company solving identity for the agentic world.
We're building [Honcho](https://honcho.dev), the social cognition layer for AI-powered applications. Honcho synthesizes high-fidelity user representations to instantly personalize the UX of entire app ecosystems--individually aligning each user's agent stack.
@@ -20,17 +21,19 @@ We're freshly capitalized and moving fast.
Join us. LFG.
## Open Positions
- [[(FILLED) ML Research Engineer|(FILLED) ML Research Engineer]]
- [[Founding Engineer|Founding Engineer]]
- [[(FILLED) Full-Stack Engineer|(FILLED) Full-Stack Engineer]]
- [[Design Contractor|Design Contractor]]
- [[Research Fellow(s)|Research Fellow(s)]]
- [[Plastic Intern(s)]]
- [[Research Grants]]
## Full-Time Benefits
- Salaries from $100-200k + equity
- Significant compute & data resources
- Full health coverage
- No strings lifestyle stipend
- NYC relocation assistance (optional)


@@ -1,6 +1,11 @@
---
title: Honcho name lore
date: 01.26.24
---
Earlier this year [Courtland](https://x.com/courtlandleer) was reading _Rainbows End_, [Vernor Vinge's](https://en.wikipedia.org/wiki/Vernor_Vinge) [seminal augmented reality novel](<https://en.wikipedia.org/wiki/Rainbows_End_(novel)>), when he came across the term "Local Honcho[^1]":
> We simply put our own agent nearby, in a well-planned position with essentially zero latencies. What the Americans call a Local Honcho.
The near future Vinge constructs is one of outrageous data abundance, where every experience is riddled with information and overlayed realities, and each person must maintain multiple identities against this data and relative to those contexts.
@@ -8,7 +13,7 @@ It's such an intense landscape, that the entire educational system has undergone
Highlighting this, a major narrative arc in the novel involves intelligence agencies running operations of pretty unfathomable global sophistication. Since (in the world of the novel) artificial intelligence has more or less failed as a research direction, this requires ultra-competent human operators able to parse and leverage high velocity information. For field operations, it requires a "Local Honcho" on the ground to act as an adaptable central nervous system for the mission and its agents:
> Altogether it was not as secure as Vaz's milnet, but it would suffice for most regions of the contingency tree. Alfred tweaked the box, and now he was getting Parker's video direct. At last, he was truly a Local Honcho.
For months before, Plastic had been deep into the weeds around harvesting, retrieving, & leveraging user context with LLMs. First to enhance the UX of our AI tutor (Bloom), then in thinking about how to solve this horizontally for all vertical-specific AI applications. It struck us that we faced similar challenges to the characters in _Rainbows End_ and were converging on a similar solution.
@@ -16,4 +21,5 @@ As you interface with the entire constellation of AI applications, you shouldn't
But as we've discovered, LLMs are remarkable at theory of mind tasks, and thus at reasoning about user need. So unlike in the book, this administration can be offloaded to an AI. And your [[Honcho; User Context Management for LLM Apps|Honcho]] can orchestrate the relevant context and identities on your behalf, whatever the operation.
[^1]: "American English, from [Japanese](https://en.wikipedia.org/wiki/Japanese_language)_[班長](https://en.wiktionary.org/wiki/%E7%8F%AD%E9%95%B7#Japanese)_ (hanchō, “squad leader”)...probably entered English during World War II: many apocryphal stories describe American soldiers hearing Japanese prisoners-of-war refer to their lieutenants as _[hanchō](https://en.wiktionary.org/wiki/hanch%C5%8D#Japanese)_" ([Wiktionary](https://en.wiktionary.org/wiki/honcho))


@@ -1,6 +1,11 @@
---
title: Human-AI chat paradigm hamstrings the space of possibility
date: 02.21.24
---
The human-AI chat paradigm assumes only two participants in a given interaction. While this is sufficient for conversations directly with un-augmented foundation models, it creates many obstacles when designing more sophisticated cognitive architectures. When you train/fine-tune a language model, you begin to reinforce token distributions that are appropriate to come in between the special tokens denoting human vs AI messages.
Here's a limited list of things _besides_ a direct response we routinely want to generate:
- A 'thought' about how to respond to the user
- A [[Loose theory of mind imputations are superior to verbatim response predictions|theory of mind prediction]] about the user's internal mental state
@@ -12,6 +17,7 @@ Here's a limited list of things *besides* a direct response we routinely want to
In contrast, the current state of inference is akin to immediately blurting out the first thing that comes into your mind--something that humans with practiced aptitude in social cognition rarely do. But this is very hard given the fact that those types of responses don't ever come after the special AI message token. Not very flexible.
We're already anecdotally seeing well-trained completion models follow instructions impressively, likely because of incorporation into pretraining. Is chat the next thing to be subsumed by general completion models? Because if so, flexibility in the types of inferences you can make would be very beneficial.
Metacognition then becomes something you can do at any step in a conversation. Same with instruction following & chat. Maybe this helps push LLMs in a much more general direction.
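To make that concrete, here's a minimal sketch, assuming a generic `complete()` helper and made-up role markers (neither is drawn from any particular model's chat template):
```python
# Illustrative only: `complete()` stands in for any raw completion-model call, and the
# role markers are made up, not any specific model's chat template.

def complete(prompt: str) -> str:
    """Placeholder for a completion-model call; returns a canned string so the sketch runs."""
    return "..."

# Two-participant chat template: everything generated after the assistant marker is
# pushed toward being a direct reply to the user.
chat_prompt = "<user>\nI keep rereading the same paragraph and nothing sticks.\n<assistant>\n"

# Completion-style framing: leave room for a third "participant" -- a thought or theory
# of mind prediction -- before any reply is produced.
augmented_prompt = (
    "USER: I keep rereading the same paragraph and nothing sticks.\n"
    "THOUGHT (prediction about the user's internal state):"
)

thought = complete(augmented_prompt)
reply = complete(augmented_prompt + " " + thought + "\nASSISTANT:")
```
Nothing forces the text generated after the user turn to be a reply at all--the framing just has to leave room for something else.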


@@ -1,3 +1,8 @@
---
title: Humans like personalization
date: 03.26.24
---
To us: it's obvious. But we get asked this a lot:
> Why do I need to personalize my AI application?
@@ -6,11 +11,11 @@ Fair question; not everyone has gone down this conceptual rabbithole to the exte
Short answer: people like it.
In the tech bubble, it can be easy to forget about what _most_ humans like. Isn't building stuff people love our job though?
In web2, it's taken for granted. Recommender algorithms make UX really sticky, which retains users sufficiently long to monetize them. To make products people love and scale them, they had to consider whether _billions_--in aggregate--tend to prefer personalized products/experiences or not.
In physical reality too, most of us prefer white glove professional services, bespoke products, and friends and family who know us _deeply_. We place a premium in terms of time and economic value on those goods and experiences.
The more we're missing that, the more we're typically in a principal-agent problem, which creates overhead, interest misalignment, dissatisfaction, mistrust, and information asymmetry:
@@ -31,4 +36,5 @@ It's also why everyone is obsessed with evals and benchmarks that have scant pra
Two answers:
1. Every interaction has context. Like it or not, people have preferences, and the more an app/agent can align with those, the more it can enhance time to value for the user. It can be stickier, more delightful, "just work," and entail less overhead. (We're building more than calculators here, though this applies even to those!)
2. If an app doesn't do this, it'll get out-competed by one that does...or by the ever improving set of generally capable foundation models.


@@ -1,7 +1,13 @@
---
title: LLM Metacognition is inference about inference
date: 03.26.24
---
For wetware, metacognition is typically defined as thinking about thinking, or used as a catch-all for any higher-level cognition.
(In some more specific domains, it's an introspective process, focused on thinking about exclusively _your own_ thinking or a suite of personal learning strategies...all valid within their purview, but too constrained for our purposes.)
In large language models, the synthetic corollary of cognition is inference. So we can reasonably define a metacognitive process in an LLM architecture as any that runs inference on the output of prior inference. That is, inference itself is used as context--_inference about inference_.
It might be instantly injected into the next prompt, stored for later use, or leveraged by another model. This kind of architecture is critical when dealing with user context, since LLMs can run inference about user behavior, then use that synthetic context in the future. Experiments here will be critical to overcome [[Machine learning is fixated on task performance|the machine learning community's fixation on task completion]]. For us at Plastic, one of the most interesting species of metacognition is [[Loose theory of mind imputations are superior to verbatim response predictions|theory of mind and mimicking that in LLMs]] to form high-fidelity representations of users.
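A minimal sketch of the pattern, assuming only a generic `llm()` placeholder rather than any particular API:
```python
# Minimal sketch: one inference produces a prediction about the user, and a second
# inference consumes that output as context. `llm()` is a placeholder, not a real API.

def llm(prompt: str) -> str:
    """Stand-in for any chat/completion model call; returns a canned string so the sketch runs."""
    return "..."

user_message = "Just give me the answer, I don't have time for the derivation."

# First pass: reason about the user (a theory of mind imputation).
imputation = llm(f"Describe what this user is likely thinking and feeling:\n{user_message}")

# Second pass: inference about inference -- the prior output becomes context.
# It could just as well be stored and injected into a later session or another model.
reply = llm(
    f"User message: {user_message}\n"
    f"Prediction about the user's internal state: {imputation}\n"
    "Write a reply that respects this prediction."
)
```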


@@ -1,4 +1,9 @@
---
title: LLMs excel at theory of mind because they read
date: 02.20.24
---
Large language models are [simulators](https://generative.ink/posts/simulators/). In predicting the next likely token, they are simulating how an abstracted “_any person_” might continue the generation. The basis for this simulation is the aggregate compression of a massive corpus of human-generated natural language from the internet. So, predicting humans is _literally_ their core function.
In that corpus is our literature, our philosophy, our social media, our hard and social science--the knowledge graph of humanity, both in terms of discrete facts and messy human interaction. That last bit is important. The latent space of an LLM's pretraining is in large part a _narrative_ space. Narration chock full of humans reasoning about other humans--predicting what they will do next, what they might be thinking, how they might be feeling.
@@ -6,10 +11,11 @@ That's no surprise; we're a social species with robust social cognition. It's al
We know that in humans, we can strongly [correlate reading with improved theory of mind abilities](https://journal.psych.ac.cn/xlkxjz/EN/10.3724/SP.J.1042.2022.00065). When your neural network is consistently exposed to content about how other people think, feel, desire, believe, prefer, those mental tasks are reinforced. The more experience you have with a set of ideas or states, the more adept you become.
The experience of such natural language narration _is itself a simulation_ where you practice and hone your theory of mind abilities. Even if, say, your English or Psychology teacher was foisting the text on you with other training intentions. Or even if you ran the simulation without coercion to escape at the beach.
It's not such a stretch to imagine that in optimizing for other tasks LLMs acquire emergent abilities not intentionally trained.[^3] It may even be that in order to learn natural language prediction, these systems need theory of mind abilities, or that learning language specifically involves them--that's certainly the case with human wetware systems, and theory of mind skills do seem to improve with model size and language generation efficacy.
[^1]: Kosinski includes a compelling treatment of much of this in ["Evaluating Large Language Models in Theory of Mind Tasks"](https://arxiv.org/abs/2302.02083)
[^2]: It also leads to other wacky phenomena like the [Waluigi effect](https://www.lesswrong.com/posts/D7PumeYTDPfBTp3i7/the-waluigi-effect-mega-post#The_Waluigi_Effect)
[^3]: Here's Chalmers [making a very similar point](https://youtube.com/clip/UgkxliSZFnnZHvYf2WHM4o1DN_v4kW6LsiOU?feature=shared)


@@ -1,35 +1,40 @@
---
title: Loose theory of mind imputations are superior to verbatim response predictions
date: 02.20.24
---
When we [[Theory of Mind Is All You Need|first started experimenting]] with user context, we naturally wanted to test whether our LLM apps were learning useful things about users. And also naturally, we did so by making predictions about them.
Since we were operating in a conversational chat paradigm, our first instinct was to try and predict what the user would say next. Two things were immediately apparent: (1) this was really hard, & (2) response predictions weren't very useful.
We saw some remarkable exceptions, but _reliable_ verbatim prediction requires a level of context about the user that simply isn't available right now. We're not sure if it will require context-gathering wearables, BMIs, or the network of context-sharing apps we're building with [[Honcho; User Context Management for LLM Apps|Honcho]], but we're not there yet.
Being good at what any person in general might plausibly say is literally what LLMs do. But being perfect at what one individual will say in a singular specific setting is a whole different story. Even lifelong human partners might only experience this a few times a week.
Plus, even when you get it right, what exactly are you supposed to do with it? The fact that it's such a narrow reasoning product limits the utility you're able to get out of a single inference.
So what are models good at predicting that's useful with limited context and local to a single turn of conversation? Well, it turns out they're really good at [imputing internal mental states](https://arxiv.org/abs/2302.02083). That is, they're good at theory of mind predictions--thinking about what you're thinking. A distinctly _[[LLM Metacognition is inference about inference|metacognitive]]_ task.
(Why are they good at this? [[LLMs excel at theory of mind because they read|We're glad you asked]].)
Besides just being better at it, letting the model leverage what it knows to make open-ended theory of mind imputations has several distinct advantages over verbatim response prediction:
1. **Fault tolerance**
    - Theory of mind predictions are often replete with assessments of emotion, desire, belief, value, aesthetic, preference, knowledge, etc. That means they seek to capture a range within a distribution. A slice of user identity.
    - This is much richer than trying (& likely failing) to generate a single point estimate (like in verbatim prediction) and includes more variance. Therefore there's a higher probability you identify something useful by trusting the model to flex its emergent strengths.
2. **Learning** ^555815
    - That high variance means there's more to be wrong (& right) about. More content = more claims, which means more opportunity to learn.
    - Being wrong here is a feature, not a bug; comparing those prediction errors with reality is how you know what you need to understand about the user in the future to get to ground truth (a rough sketch follows this list).
3. **Interpretability**
    - Knowing what you're right and wrong about exposes more surface area against which to test and understand the efficacy of the model--i.e. how well it knows the user.
    - As we're grounded in the user and theory of mind, we're better able to assess this than if we're simply asking for likely human responses in the massive space of language encountered in training.
4. **Actionability**
    - The richness of theory of mind predictions gives us more to work with _right now_. We can funnel these insights into further inference steps to create UX in better alignment and coherence with user state.
    - Humans make thousands of tiny, subconscious interventions responsive to as many sensory cues & theory of mind predictions, all to optimize single social interactions. It pays to know about the internal state of others.
    - Though our lifelong partners from above can't perfectly predict each other's sentences, they can impute each other's state with extremely high fidelity. The rich context they have on one another translates to a desire to spend most of their time together (good UX).


@@ -1,3 +1,8 @@
---
title: Machine learning is fixated on task performance
date: 12.12.23
---
The machine learning industry has traditionally adopted an academic approach, focusing primarily on performance across a range of tasks. LLMs like GPT-4 are a testament to this, having been scaled up to demonstrate impressive & diverse task capability. This scaling has also led to [[Theory of Mind Is All You Need|emergent abilities]], debates about the true nature of which rage on.
However, general capability doesn't necessarily translate to completing tasks as an individual user would prefer. This is a failure mode that anyone building agents will inevitably encounter. The focus, therefore, needs to shift from how language models perform tasks in a general sense to how they perform tasks on a user-specific basis.


@@ -5,15 +5,16 @@ tags:
- legal
date: 11.11.24
---
Plastic Labs is the creator of [YouSim.ai](https://yousim.ai), an AI product demo that has inspired the anonymous creation of the \$YOUSIM token using Pump.fun on the Solana blockchain, among many other tokens. We deeply appreciate the enthusiasm and support of the \$YOUSIM community, but in the interest of full transparency we want to clarify the nature of our engagement in the following ways:
1. Plastic Labs did not issue, does not control, and does not provide financial advice related to the \$YOUSIM memecoin. The memecoin project is led by an independent community and has undergone a community takeover (CTO).
2. Plastic Labs' acceptance of \$YOUSIM tokens for research grants does not constitute an endorsement of the memecoin as an investment. These grants support our broader mission of advancing AI research and innovation, especially within the open source community.
3. YouSim.ai and any other Plastic Labs products remain separate from the \$YOUSIM memecoin. Any future integration of token utility into our products would be carefully considered and subject to regulatory compliance.
4. The \$YOUSIM memecoin carries inherent risks, including price volatility, potential ecosystem scams, and regulatory uncertainties. Plastic Labs is not responsible for any financial losses or damages incurred through engagement with the memecoin.
5. Plastic Labs will never direct message any member of the \$YOUSIM community soliciting tokens, private keys, seed phrases, or any other private information, collectors' items, or financial instruments.
6. YouSim.ai and the products it powers are simulated environments, and their imaginary outputs do not reflect the viewpoints, positions, voice, or agenda of Plastic Labs.
7. Communications from Plastic Labs regarding the \$YOUSIM memecoin are for informational purposes only and do not constitute financial, legal, or tax advice. Users should conduct their own research and consult with professional advisors before making any decisions.
8. Plastic Labs reserves the right to adapt our engagement with the \$YOUSIM community as regulatory landscapes evolve and to prioritize the integrity of our products and compliance with applicable laws.
We appreciate the \$YOUSIM community's support and passion for YouSim.ai and the broader potential of AI technologies. However, it's crucial for us to maintain transparency about the boundaries of our engagement. We encourage responsible participation and ongoing open dialogue as we collectively navigate this exciting and rapidly evolving space.