metacog copy + assorted notes

This commit is contained in:
Courtland Leer 2024-01-09 15:22:40 -05:00
parent 156aad32e5
commit 07c3440348
11 changed files with 104 additions and 28 deletions

View File

@ -0,0 +1,2 @@
[[Honcho name lore]]
[[Honcho is a Vingean Agent]]

View File

@ -0,0 +1 @@
test

View File

@ -1,26 +0,0 @@
---
title: hold
date: Dec 19, 2023
---
(meme)
## TL;DR
## Defining Terms
## Background and Related Work
(Def wanna give this a more creative name)
- historically, machine learning research has consisted of researchers intelligently building datasets of hard problems to evaluate models' ability to predict the right answers, whatever form those take
- someone comes along, builds a model that generalizes well on the benchmarks, and the cycle repeats itself, with a new, harder dataset being built and released
- this brings us to today, where datasets like [MMLU](https://arxiv.org/abs/2009.03300), [HumanEval](https://arxiv.org/abs/2107.03374v2), and the hilariously named [HellaSwag](https://arxiv.org/abs/1905.07830) are the benchmarks of choice
- what they all have in common is that they try to explore a problem space as exhaustively as possible, providing a large number of diverse examples to evaluate against (MMLU: language understanding, HumanEval: coding, HellaSwag: reasoning)
- high performance on these datasets demonstrates incredible *general* abilities
- in fact, their performance across these diverse datasets suggests their capabilities are probably much more vast than we think they are
- but in current user-facing systems, they're not given the opportunity to query these diverse capabilities
## How We've Explored It
## Selective Metacog Taxonomy
## The Future/Potential/Importance

View File

@ -0,0 +1,66 @@
---
title: hold
date: Dec 19, 2023
---
### Scratch
meme ideas:
- something about statelessness (maybe guy in the corner at a party: "they don't know llms are stateless")
- oppenheimer meme templates
excalidraw ideas:
## TL;DR
## Toward AI-Native Metacognition
At Plastic, we've been thinking hard for nearly a year about [cognitive architectures](https://blog.langchain.dev/openais-bet-on-a-cognitive-architecture/) for large language models. Much of that time was focused on developing [[Theory-of-Mind Is All You Need|a production-grade AI-tutor]], which we hosted experimentally as a free and permissionless learning companion.
The rest has been spent deep down the research rabbit hole on a particularly potent, synthetic subset of LLM inference--[[Metacognition in LLMs is inference about inference|metacognition]]:
> For wetware, metacognition is typically defined as "thinking about thinking" or often a catch-all for any "higher-level" cognition.
>...
>In large language models, the synthetic corollary of cognition is inference. So we can reasonably define a metacognitive process in an LLM as any that runs inference on the output of prior inference. That is, inference itself is used as context--*inference about inference*. It might be instantly injected into the next prompt, stored for later use, or leveraged by another model.
Put less verbosely, it boils down to this: if metacognition in humans is *thinking about thinking*, then **metacognition in LLMs is *inference about inference***.
We believe this definition helps frame an exciting design space for several reasons:
- Unlocks regions of the latent space unapproachable by humans
- Leverages rather than suppresses the nondeterministic nature of LLMs
- Allows models to generate their own context
- Moves the focus of research and development beyond tasks and toward identity
- Affords LLMs the requisite intellectual respect to realize their potential
- Enables any agent builder to quickly escape the gravity well of foundation model behavior
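To make *inference about inference* concrete, here's a minimal sketch of a single metacognitive turn. The `complete` function is a hypothetical stand-in for whatever chat-completion call you're using; nothing below is a real API.
```python
# Minimal sketch of "inference about inference": a second inference pass runs over
# the output of the first, and its result becomes context for the next turn.
from typing import Callable


def metacognitive_turn(
    complete: Callable[[str], str],
    user_message: str,
    prior_reflections: list[str],
) -> tuple[str, str]:
    # First-order inference: answer the user, conditioned on earlier reflections.
    context = "\n".join(prior_reflections)
    reply = complete(
        f"Context from earlier reflections:\n{context}\n\n"
        f"User: {user_message}\nAssistant:"
    )

    # Second-order inference: run inference *on* that output, asking the model
    # what its own response suggests and what is worth carrying forward.
    reflection = complete(
        "Reflect on the assistant response below. What does it suggest about "
        "the user, and what should be remembered for next time?\n\n" + reply
    )
    return reply, reflection
```
The returned reflection is itself inference, and can be injected into the next prompt, stored for a later session, or handed to another model.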
## Research Foundations
(Def wanna give this a more creative name)
(@vintro, should we reference some of the papers that explicitly call out "metacognition"? or maybe we get into some of that below)
- historically, machine learning research has consisted of researchers intelligently building datasets of hard problems to evaluate models' ability to predict the right answers, whatever form those take
- someone comes along, builds a model that generalizes well on the benchmarks, and the cycle repeats itself, with a new, harder dataset being built and released
- this brings us to today, where datasets like [MMLU](https://arxiv.org/abs/2009.03300), [HumanEval](https://arxiv.org/abs/2107.03374v2), and the hilariously named [HellaSwag](https://arxiv.org/abs/1905.07830) are the benchmarks of choice
- what they all have in common is that they try to explore a problem space as exhaustively as possible, providing a large number of diverse examples to evaluate against (MMLU: language understanding, HumanEval: coding, HellaSwag: reasoning)
- high performance on these datasets demonstrates incredible *general* abilities
- in fact, their performance across these diverse datasets suggests their capabilities are probably much more vast than we think they are
- but in current user-facing systems, they're not given the opportunity to query these diverse capabilities
## Designing Metacognition
how to architect it (rough sketch below):
- inference: multiple
- storage: of prior inference
- between inferences, between sessions, between agents
- examples from our research
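A minimal sketch of what the storage piece could look like, assuming metacognitive outputs ("reflections") are plain strings keyed by user; the class and method names are illustrative, not a real API:
```python
# Rough sketch of the storage leg: reflections produced by prior inference,
# persisted between sessions and readable by any agent serving the same user.
# Illustrative only; a real system might back this with a database or vector store.
from collections import defaultdict


class ReflectionStore:
    """Per-user store of metacognitive outputs (plain strings here)."""

    def __init__(self) -> None:
        self._reflections: dict[str, list[str]] = defaultdict(list)

    def add(self, user_id: str, reflection: str) -> None:
        self._reflections[user_id].append(reflection)

    def recall(self, user_id: str, limit: int = 5) -> list[str]:
        # Most recent reflections first, ready to inject into the next prompt.
        return list(reversed(self._reflections[user_id][-limit:]))


store = ReflectionStore()
store.add("user-123", "Prefers concrete examples over abstract explanations.")
print(store.recall("user-123"))
```
Between-session and between-agent sharing then reduces to reading from the same store before the next inference call.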
## Selective Metacog Taxonomy
a wealth of theory on how cognition occurs in humans
but no reason to limit ourselves to biological plausibility
### Metamemory
### Theory of Mind
### Imaginative Metacognition
## The Future/Potential/Importance
- intellectual respect
- potential features

View File

@ -0,0 +1,12 @@
*the post to accompany the basic user context management release*
mtg notes
- us building Bloom
- issues we discovered
- LLMs don't remember
- associated complications: ballooning context management
- so we made this
- OpenAI Assistants API as a general version of this
- machine learning's over-fixation on LLM performance
- toys vs. tools
- research vs. product

View File

@ -0,0 +1,19 @@
Earlier this year I was reading *Rainbows End*, [Vernor Vinge's](https://en.wikipedia.org/wiki/Vernor_Vinge) [seminal augmented reality novel](https://en.wikipedia.org/wiki/Rainbows_End_(novel)), when I came across the term "Local Honcho[^1]":
>We simply put our own agent nearby, in a well-planned position with essentially zero latencies. What the Americans call a Local Honcho.
The near future Vinge constructs is one of outrageous data abundance, where every experience is riddled with information and overlaid realities, and each person must maintain multiple identities against this data and relative to those contexts.
It's such an intense landscape that the entire educational system has undergone wholesale renovation to address the new normal, and older people must routinely return to school to learn the latest skills. It also complicates economic life, resulting in intricate networks of nested agents that can be hard for any one individual to tease apart.
Highlighting this, a major narrative arc in the novel involves intelligence agencies running operations of pretty unfathomable global sophistication. Since in this world artificial intelligence has more or less failed as a research direction, this requires ultra-competent human operators able to parse and leverage high velocity information. For field operations, it requires a "Local Honcho" on the ground to act as an adaptable central nervous system for the mission and its agents:
>Altogether it was not as secure as Vaz's milnet, but it would suffice for most regions of the contingency tree. Alfred tweaked the box, and now he was getting Parker's video direct. At last, he was truly a Local Honcho.
Plastic had been deep in the weeds for months on how to harvest, retrieve, and leverage user context with large language models. First to enhance the UX of our AI tutor (Bloom), then to solve this horizontally for all vertical-specific AI applications. It struck me that we faced similar challenges to the characters in _Rainbows End_ and were converging on a similar solution.
As you interface with the entire constellation of AI applications, you shouldn't have to redundantly provide context and oversight for every interaction. You need a single source of truth that can do this for you. You need a Local Honcho.
But as we've discovered, LLMs are remarkable at theory of mind tasks, and thus at reasoning about user need. So unlike in the book, this administration can be offloaded to an AI. And your Honcho can orchestrate the relevant context and identities on your behalf, whatever the operation.
[^1]: American English, from [Japanese](https://en.wikipedia.org/wiki/Japanese_language) _[班長](https://en.wiktionary.org/wiki/%E7%8F%AD%E9%95%B7#Japanese)_ (hanchō, “squad leader”), from 19th c. [Mandarin](https://en.wikipedia.org/wiki/Mandarin_Chinese) [班長](https://en.wiktionary.org/wiki/%E7%8F%AD%E9%95%B7#Chinese) (_bānzhǎng_, “team leader”). Probably entered English during World War II: many apocryphal stories describe American soldiers hearing Japanese prisoners-of-war refer to their lieutenants as _[hanchō](https://en.wiktionary.org/wiki/hanch%C5%8D#Japanese)_. ([Wiktionary](https://en.wiktionary.org/wiki/honcho))

View File

@ -1,3 +1,5 @@
For wetware, metacognition is typically defined as "thinking about thinking" or often a catch-all for any "higher-order" cognition. In some more specific domains, it's an introspective process, focused on thinking about your own thinking.
For wetware, metacognition is typically defined as "thinking about thinking" or often a catch-all for any "higher-level" cognition.
In large language models, the synthetic corollary of cognition is inference. So we can reasonably call a metacognitive process in an LLM as any that runs inference on the result of prior inference. That is, inference itself is used as context. It might be instantly funneled into the next prompt, stored for later use, or leveraged by another model. Experiments here will be critical to overcome the machine learning community's fixation of task completion (see [[The machine learning industry is too focused on general task performance]]).
(In some more specific domains, it's an introspective process, focused on thinking about exclusively *your own* thinking or a suite of personal learning strategies...all valid within their purview, but too constrained for our purposes.)
In large language models, the synthetic corollary of cognition is inference. So we can reasonably define a metacognitive process in an LLM as any that runs inference on the output of prior inference. That is, inference itself is used as context--*inference about inference*. It might be instantly injected into the next prompt, stored for later use, or leveraged by another model. Experiments here will be critical to overcome [[The machine learning industry is too focused on general task performance|the machine learning community's fixation on task completion]].