
chore(eval): add Opus 4.6 results, update README with all 3 models

3cbe925
Merged

feat(eval): expand dataset to 37 tasks with JSON scenarios #185
