@bkal01 bkal01 commented Feb 7, 2026

samples in DatBench from mme_realworld have prompt_format as follows:

{
"prefix": "",
"suffix": "\nA. Two of the construction vehicles are parked, and one is moving.\nB. One of the construction vehicles is moving, and one is parked.\nC. Three construction vehicles are parked.\nD. Many construction vehicles are parked.\nE. The image does not feature the object.\nSelect the correct option.\nProvide only \\boxed{<LETTER>} (uppercase A/B/C/D/E) on the answer line.\nAnswer:"
}

so we expect the model to output something like:

The correct answer is D.

\boxed{D}
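
for reference, the doubled backslash in the suffix above is only JSON escaping. a minimal sketch of how it decodes, assuming the prompt_format is loaded with a standard JSON parser (the `raw` string is a shortened, illustrative copy of the suffix with the options left out):

```python
import json

# Shortened, illustrative copy of the prompt_format above (options omitted).
raw = r'{"prefix": "", "suffix": "\nSelect the correct option.\nProvide only \\boxed{<LETTER>} (uppercase A/B/C/D/E) on the answer line.\nAnswer:"}'

prompt_format = json.loads(raw)
suffix = prompt_format["suffix"]

# JSON's "\\" decodes to ONE backslash, so the prompt asks for \boxed{<LETTER>}.
instruction_line = suffix.splitlines()[2]
print(instruction_line)
```

so the text the model sees, and the text it is expected to emit, contains a single backslash before `boxed`.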

the previous regex pattern doesn't parse this correctly because it is over-escaped:
[Screenshot, 2026-02-07 2:16:11 PM]

the updated pattern (copied from here) does:
[Screenshot, 2026-02-07 2:17:26 PM]
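
the exact patterns are only visible in the screenshots, so here is a rough sketch of what over-escaping does in general. both `OVER_ESCAPED` and `FIXED` are hypothetical patterns written for illustration, not the ones from the repo:

```python
import re

# Model output with a single backslash before "boxed".
output = "The correct answer is D.\n\n\\boxed{D}"

# Hypothetical over-escaped pattern: four backslashes in a raw string
# produce a regex that demands TWO literal backslashes before "boxed".
OVER_ESCAPED = re.compile(r"\\\\boxed\{([A-E])\}")

# Hypothetical corrected pattern: two backslashes in a raw string
# match ONE literal backslash, which is what the model emits.
FIXED = re.compile(r"\\boxed\{([A-E])\}")

over_escaped_match = OVER_ESCAPED.search(output)
fixed_match = FIXED.search(output)

print(over_escaped_match)    # None
print(fixed_match.group(1))  # D
```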

there's a fallback in mme_realworld.py that looks for the first standalone letter when the \boxed{} match fails. this is problematic when the model output contains additional text before the answer:

This is a question about construction vehicles. The correct answer is D.

\boxed{D}

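with the old regex, the \boxed{D} match fails, so the fallback runs and parses "A" as the first standalone letter (presumably the "a" in "This is a question"), despite the model outputting "D". with the updated regex, \boxed{D} matches directly, which fixes this.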
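
a small sketch of that failure mode. `parse_answer`, `STANDALONE`, and both boxed patterns are hypothetical stand-ins, and the case-insensitive fallback is an assumption about how mme_realworld.py behaves, not a copy of it:

```python
import re

# Hypothetical stand-ins for the old (over-escaped) and updated patterns.
OLD_BOXED = re.compile(r"\\\\boxed\{([A-E])\}")
NEW_BOXED = re.compile(r"\\boxed\{([A-E])\}")

# Assumed fallback: first standalone letter A-E, ignoring case.
STANDALONE = re.compile(r"\b([A-E])\b", re.IGNORECASE)


def parse_answer(text, boxed_pattern):
    """Try the boxed pattern first, then fall back to a standalone letter."""
    match = boxed_pattern.search(text)
    if match:
        return match.group(1)
    match = STANDALONE.search(text)
    return match.group(1).upper() if match else None


output = (
    "This is a question about construction vehicles. "
    "The correct answer is D.\n\n\\boxed{D}"
)

old_result = parse_answer(output, OLD_BOXED)
new_result = parse_answer(output, NEW_BOXED)

print(old_result)  # A (fallback grabs the article "a")
print(new_result)  # D
```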

@bkal01 bkal01 marked this pull request as ready for review February 7, 2026 20:28