
Conversation

@akihikokuroda (Member) commented Feb 1, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and GitHub automation pass (a maintainer will kick off the GitHub automation once the rest of the PR is populated)

@github-actions bot (Contributor) commented Feb 1, 2026

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@mergify bot commented Feb 1, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert|release)(?:\(.+\))?:

@akihikokuroda akihikokuroda marked this pull request as draft February 1, 2026 23:58
@akihikokuroda akihikokuroda changed the title feat: add stream support [WIP] feat: add stream support Feb 1, 2026
@akihikokuroda akihikokuroda changed the title [WIP] feat: add stream support feat: add stream support Feb 2, 2026
@akihikokuroda akihikokuroda marked this pull request as ready for review February 2, 2026 01:13
@HendrikStrobelt (Contributor) commented:
Thank you. That looks great. We might still have to create a new issue on how to handle streaming with requirement-checking and sampling strategies. @jakelorocco and @nrfulton

@nrfulton nrfulton changed the title feat: add stream support feat: add stream support to mfuncs and session Feb 2, 2026
@nrfulton (Member) commented Feb 2, 2026

> Thank you. That looks great. We might still have to create a new issue on how to handle streaming with requirement-checking and sampling strategies.

Agreed. While doing this we should think about how much we can stream sampling.

@jakelorocco (Contributor) left a comment:

Will follow up / answer the questions I indicated tomorrow morning. Thank you! This looks good.

Comment on lines 464 to 465
await_result: bool = True,
) -> tuple[ModelOutputThunk[S], Context] | SamplingResult:
Contributor:

Let's set this to False by default. I think that's likely a better long-term default and the expected behavior. I will confirm with the team tomorrow, though. I know this is an API-breaking change too, so we will just need to flag it in the release notes.

Contributor:

Others agreed that we should set this to False by default.

Member Author:

This change requires making sure all tests and examples work without failure. It may take some time, but it will be done.

Comment on lines 464 to 465
await_result: bool = True,
) -> tuple[ModelOutputThunk[S], Context] | SamplingResult:
Contributor:

We should be able to add type hinting so that if await_result == True or strategy != None, we return a computed model output thunk.
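A minimal sketch of what that type hinting could look like, using `typing.overload` with `Literal` so a type checker narrows the return type based on `await_result`. The stand-in classes and the `generate` function are hypothetical simplifications, not the real API:

```python
from __future__ import annotations

from typing import Literal, overload


class ModelOutputThunk:
    """Stand-in for the real ModelOutputThunk; value may be unset until awaited."""

    def __init__(self, value: str | None = None):
        self.value = value
        self._computed = value is not None


class ComputedModelOutputThunk(ModelOutputThunk):
    """Stand-in for a thunk whose value is guaranteed to be present."""

    def __init__(self, value: str):
        super().__init__(value)


# The overloads let type checkers infer the computed subtype when awaiting.
@overload
def generate(prompt: str, *, await_result: Literal[True]) -> ComputedModelOutputThunk: ...
@overload
def generate(prompt: str, *, await_result: Literal[False] = ...) -> ModelOutputThunk: ...


def generate(prompt: str, *, await_result: bool = False) -> ModelOutputThunk:
    # Simplified runtime behavior: "awaiting" just fills in the value eagerly.
    if await_result:
        return ComputedModelOutputThunk(f"echo: {prompt}")
    return ModelOutputThunk()
```

With this, mypy or pyright would infer `ComputedModelOutputThunk` for `generate("hi", await_result=True)` and plain `ModelOutputThunk` otherwise; handling `strategy != None` would just add another overload.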

Comment on lines 519 to 522
# ._generate_log should never be None after generation.
assert result._generate_log is not None
result._generate_log.is_final_result = True
generate_logs.append(result._generate_log)
Contributor:

Let me think about this generate_log change. I have to review where they are being used / if they are still being used anywhere.

Contributor:

I believe we can actually get rid of the generate_logs.append(result._generate_log) lines in this function.

Comment on lines 406 to 426
class ComputedModelOutputThunk(ModelOutputThunk[S]):
"""A `ComputedModelOutputThunk` is a `ModelOutputThunk` that is guaranteed to be computed.

This subclass provides a clear type distinction between thunks that may need awaiting
and those that are already computed. It should be returned from synchronous functions
and sampling strategies to indicate that no awaiting is needed.

Key differences from ModelOutputThunk:
- Always initialized with a value (cannot be None)
- _computed is always True
- Cannot be used for streaming (generation fields are not set)
- Provides type safety to indicate "already computed"
"""

def __init__(
self,
value: str,
meta: dict[str, Any] | None = None,
parsed_repr: S | None = None,
tool_calls: dict[str, ModelToolCall] | None = None,
):
Contributor:

I feel like this is a bit clunky for creating a computed model output thunk. I don't think we will ever have to create one from scratch. Is it possible to have the constructor take the regular ModelOutputThunk as its only argument, check that everything is computed, set computed to True, and then defer all of its getters to the base ModelOutputThunk? Or maybe there's some easier way to do this where we don't have to copy over all of the fields.
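One way the wrapping constructor could be sketched, with simplified stand-in classes and hypothetical field names: validate the incoming thunk, keep a reference to it, and let `__getattr__` defer everything else so no fields are copied:

```python
from __future__ import annotations

from typing import Any


class ModelOutputThunk:
    """Stand-in for the real ModelOutputThunk."""

    def __init__(self, value: str | None = None, meta: dict[str, Any] | None = None):
        self.value = value
        self.meta = meta or {}
        self._computed = value is not None


class ComputedModelOutputThunk(ModelOutputThunk):
    """Wraps an already-computed thunk instead of copying every field."""

    def __init__(self, thunk: ModelOutputThunk):
        if not thunk._computed or thunk.value is None:
            raise ValueError("ComputedModelOutputThunk requires a computed thunk")
        # Set _inner first: __getattr__ below relies on it being present.
        self._inner = thunk
        self._computed = True

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails; defer to the wrapped thunk.
        return getattr(self._inner, name)
```

The trade-off is that `__getattr__` delegation is invisible to static type checkers for any attribute not declared on the base class, so explicit properties for the hot fields might still be worthwhile.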

Comment on lines +76 to +92

Returns:
A (ModelOutputThunk, Context) if `return_sampling_results` is `False`, else returns a `SamplingResult`.
Always returns ComputedModelOutputThunk since sync functions must await completion.
Contributor:

Where possible, with the sync functions, can we change the return type hinting to be the computed model output thunk?
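A sketch of why that annotation is sound for sync functions, using hypothetical names: the sync entry point runs the async generation to completion before returning, so the computed subtype is an honest return type:

```python
from __future__ import annotations

import asyncio


class ModelOutputThunk:
    """Stand-in for the real ModelOutputThunk."""

    def __init__(self, value: str | None = None):
        self.value = value
        self._computed = value is not None


class ComputedModelOutputThunk(ModelOutputThunk):
    """Stand-in for a thunk guaranteed to be computed."""

    def __init__(self, value: str):
        super().__init__(value)


async def _generate_async(prompt: str) -> str:
    # Placeholder for the real async generation call.
    await asyncio.sleep(0)
    return f"result for: {prompt}"


def instruct_sync(prompt: str) -> ComputedModelOutputThunk:
    """Sync entry point: generation always runs to completion here, so the
    computed subtype can be the declared return type."""
    value = asyncio.run(_generate_async(prompt))
    return ComputedModelOutputThunk(value)
```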

Contributor:
This should be removed.

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>


Development

Successfully merging this pull request may close these issues.

fix: make asynchronous functions / session functions return uncomputed model output thunks

4 participants