fix boxed parsing for mme_realworld #3

bkal01 · 2026-02-07T20:10:51Z

samples in DatBench from mme_realworld have prompt_format as follows:

{
"prefix": "",
"suffix": "\nA. Two of the construction vehicles are parked, and one is moving.\nB. One of the construction vehicles is moving, and one is parked.\nC. Three construction vehicles are parked.\nD. Many construction vehicles are parked.\nE. The image does not feature the object.\nSelect the correct option.\nProvide only \\boxed{<LETTER>} (uppercase A/B/C/D/E) on the answer line.\nAnswer:"
}

so we expect the model to output something like:

The correct answer is D.

\boxed{D}

the previous regex pattern doesn't parse this correctly because it is over-escaped:

the updated pattern (copied from here) does:
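the actual old and new patterns are not shown above, so the following is only a minimal sketch of what over-escaping can look like. both `OVER_ESCAPED` and `CORRECT` are hypothetical patterns written for illustration, not the ones changed in this PR. in a raw string, `\\\\` asks the regex engine for two literal backslashes, while the model output contains only one before `boxed`.

```python
import re

# Hypothetical patterns for illustration only; not the patterns from this PR.
OVER_ESCAPED = re.compile(r"\\\\boxed\{([A-E])\}")  # requires two literal backslashes
CORRECT = re.compile(r"\\boxed\{([A-E])\}")         # requires one literal backslash

# Model output as described above: a sentence, then \boxed{D} (single backslash).
output = "The correct answer is D.\n\n\\boxed{D}"

over_match = OVER_ESCAPED.search(output)
good_match = CORRECT.search(output)

print(over_match)           # no match: the text has only one backslash
print(good_match.group(1))  # the captured option letter
```
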

there's a fallback in mme_realworld.py that looks for the first standalone letter. this is problematic when the model output contains additional text before the answer:

This is a question about construction vehicles. The correct answer is D.

\boxed{D}

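because the old regex never matched \boxed{D}, the fallback ran and returned "A" as the first standalone letter (presumably the article "a" in "This is a question"), even though the model answered "D". the updated regex fixes this.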
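a rough sketch of this failure mode, assuming the fallback is a case-insensitive search for a single letter in A-E surrounded by word boundaries. `BOXED`, `STANDALONE`, `fallback_letter`, and `parse_answer` are hypothetical names for illustration; the real logic in mme_realworld.py may differ.

```python
import re
from typing import Optional

# Hypothetical patterns and helpers; the real code in mme_realworld.py may differ.
BOXED = re.compile(r"\\boxed\{([A-E])\}")
STANDALONE = re.compile(r"\b([A-Ea-e])\b")


def fallback_letter(text: str) -> Optional[str]:
    """Return the first standalone option letter, uppercased, or None."""
    match = STANDALONE.search(text)
    return match.group(1).upper() if match else None


def parse_answer(text: str) -> Optional[str]:
    """Prefer the last \\boxed{X}; use the fallback only if none is found."""
    boxed = BOXED.findall(text)
    if boxed:
        return boxed[-1]
    return fallback_letter(text)


output = (
    "This is a question about construction vehicles. "
    "The correct answer is D.\n\n\\boxed{D}"
)

# When the boxed pattern fails to match, only the fallback runs,
# and it picks up the article "a" in "This is a question".
fallback_only = fallback_letter(output)

# When the boxed pattern matches, the fallback is never reached.
parsed = parse_answer(output)
```

under these assumptions the fallback alone yields "A" for this output, while trying the boxed pattern first yields "D".
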

fix boxed parsing for mme_realworld

c9b9de4

bkal01 marked this pull request as ready for review February 7, 2026 20:28
