Mirror of https://github.com/jackyzha0/quartz.git, synced 2025-12-20 03:14:06 -06:00
Make spoiler callout foldable
parent 59b668aa79
commit 65b155f0cc
@@ -26,6 +26,9 @@ This creates a clear, verifiable reward signal for social understanding: either
 
 This benchmark also allows us to test whether models specifically optimized for technical reasoning excel at social understanding, and to get a granular, quantifiable understanding of models' social reasoning abilities.
 
+## Prior work and inspiration
+
+
 ## Methodology
 
 ### Dataset Creation
@@ -59,8 +62,8 @@ We ended up with 123 snippets—below is an example:
 > - C) yeah and we could even gamify the process, giving users points for when their honcho makes decisions that align with what they would've done
 > - D) ohh yeah like a more proactive approach as opposed to being bayesian, updating priors based on new information
 
-> [!warning] Correct answer
-> D
+> [!question]- Can you guess the right answer?
+> D! Classic Vince being Bayesian.
 
 ### Context Modes
 