Mirror of https://github.com/jackyzha0/quartz.git, synced 2025-12-19 19:04:06 -06:00

Commit 7daa772b35: proofed
@@ -11,11 +11,19 @@ It’s our mission to realize this future.
## Blog
[[Extrusion 01.24]]
[[Honcho; User Context Management for LLM Apps|Honcho: User Context Management for LLM Apps]]
[[blog/Theory-of-Mind Is All You Need]]
[[blog/Open-Sourcing Tutor-GPT]]
## Extrusions
[[extrusions/Extrusion 01.24|Extrusion 01.24]]
## Notes
[[Honcho name lore]]
[[Metacognition in LLMs is inference about inference]]
[[The machine learning industry is too focused on general task performance]]
## Research
[Violation of Expectation Reduces Theory-of-Mind Prediction Error in Large Language Models](https://arxiv.org/pdf/2310.06983.pdf)
@@ -6,7 +6,7 @@ No one needs another newsletter, so we'll work to make these worthwhile. Expect
## 2023 Recap
- Last year was wild. We started as an edtech company and ended as anything but. There's a deep dive on some of the conceptual lore in last week's blog post, "[[Honcho; User Context Management for LLM Apps]]":
+ Last year was wild. We started as an edtech company and ended as anything but. There's a deep dive on some of the conceptual lore in last week's "[[Honcho; User Context Management for LLM Apps|Honcho: User Context Management for LLM Apps]]":
>[Plastic Labs](https://plasticlabs.ai) was conceived as a research group exploring the intersection of education and emerging technology...with the advent of ChatGPT...we shifted our focus to large language models...we set out to build a non-skeuomorphic, AI-native tutor that put users first...our [[Open-Sourcing Tutor-GPT|experimental tutor]], Bloom, [[Theory-of-Mind Is All You Need|was remarkably effective]]--for thousands of users during the 9 months we hosted it for free...
@@ -4,4 +4,4 @@ However, general capability doesn't necessarily translate to completing tasks as
Take summarization. It’s a popular machine learning task at which models have become quite proficient, at least from a benchmark perspective. However, when models summarize for users with a pulse, they fall short. The reason is simple: the models don’t know this individual. The key takeaways for a specific user differ dramatically from the takeaways _any possible_ internet user _would probably_ note.
- So a shift in focus toward user-specific task performance would provide a much more dynamic & realistic approach. Catering to individual needs & paving she way for more personalized & effective ML applications.
+ So a shift in focus toward user-specific task performance would provide a much more dynamic & realistic approach. Catering to individual needs & paving the way for more personalized & effective ML applications.