--- title: Machine learning is fixated on task performance date: 12.12.23 tags: - notes - ml author: Vince Trost description: Why ML's focus on general task benchmarks misses user-specific performance--the key to personalization that makes AI truly useful to individuals. --- The machine learning industry has traditionally adopted an academic approach, focusing primarily on performance across a range of tasks. LLMs like GPT-4 are a testament to this, having been scaled up to demonstrate impressive & diverse task capability. This scaling has also led to [[ARCHIVED; Theory of Mind Is All You Need|emergent abilities]], debates about the true nature of which rage on. However, general capability doesn't necessarily translate to completing tasks as an individual user would prefer. This is a failure mode that anyone building agents will inevitably encounter. The focus, therefore, needs to shift from how language models perform tasks in a general sense to how they perform tasks on a user-specific basis. Take summarization. It’s a popular machine learning task at which models have become quite proficient...at least from a benchmark perspective. However, when models summarize for users with a pulse, they fall short. The reason is simple: the models don’t know this individual. The key takeaways for a specific user differ dramatically from the takeaways _any possible_ internet user _would probably_ note. ^0005ac So a shift in focus toward user-specific task performance would provide a much more dynamic & realistic approach. Catering to individual needs & paving the way for more personalized & effective ML applications.