Contributor

@andystaples andystaples commented Feb 9, 2026

Addresses the following issues:

  1. EntityInstanceId was not treating entity names as case-insensitive, so "counter" and "Counter" were treated as two separate entity types. Other SDKs and backends normalize both to "counter".
    a. NOTE: The fix for this (normalizing entity names to lowercase) is a breaking change under the following scenarios:
    • Calling register_entity(counter) and register_entity(Counter) on the same worker will now fail
    • Direct checks against EntityInstanceId.name (e.g. instance_id.name == 'Counter') may break
  2. EntityInstanceId was not rejecting some invalid instance ID formats
  3. Returning a non-encodable object as the result of an orchestration would cause the orchestration to never complete (see #108, "Bug: Returning an exception object as the result of an orchestration causes it to never complete"). This now returns a string with an error message that includes the stringified version of the object that failed to encode, but I'm also open to failing the orchestration outright.
  4. Entity failures were not handled correctly, leading to hung orchestrations.
    a. Now, entity failures are handled like any other task failure by the orchestrator, and returned to user code as an EntityOperationFailedException wrapped in a TaskFailedError.
  5. Previously, we saved a state dictionary of all registered entities. This was essentially a memory leak for long-lived workers that processed many different entity instances, and it provided no tangible benefit, so this functionality has been removed. Entities are now re-created for each invocation (matching the .NET behavior).
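
For reference, items 1 and 2 can be illustrated with a minimal sketch. `EntityIdSketch` is a hypothetical stand-in, not the SDK's `EntityInstanceId`, and the `@<name>@<key>` string format is an assumption made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityIdSketch:
    """Hypothetical stand-in for EntityInstanceId; not the SDK class."""

    entity: str
    key: str

    def __post_init__(self) -> None:
        if not self.entity:
            raise ValueError("A non-empty entity name is required.")
        if "@" in self.key:
            raise ValueError("Entity key cannot contain '@' symbol.")
        # Normalize so "Counter" and "counter" name the same entity type.
        object.__setattr__(self, "entity", self.entity.lower())

    def __str__(self) -> str:
        return f"@{self.entity}@{self.key}"

    @staticmethod
    def parse(value: str) -> "EntityIdSketch":
        # Assumed string format: "@<name>@<key>".
        if not value.startswith("@"):
            raise ValueError(f"Invalid entity ID: {value!r}")
        name, separator, key = value[1:].partition("@")
        if not separator:
            raise ValueError(f"Invalid entity ID: {value!r}")
        return EntityIdSketch(name, key)
```

With this shape, `EntityIdSketch("Counter", "a") == EntityIdSketch("counter", "a")` holds, which is also why a direct comparison such as `instance_id.name == 'Counter'` stops matching after normalization.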

Resolves #108

Comment on lines 906 to 909
try:
    result_json = result if is_result_encoded else shared.to_json(result)
except TypeError:
    result_json = shared.to_json(str(JsonEncodeOutputException(result)))
Contributor Author

Reviewers - this is the "safe" approach, where a return value that we cannot encode to JSON is instead stringified along with an error message. I believe the other SDKs would just fail the orchestration outright - is this preferable here too?
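
The two options under discussion can be sketched side by side. This is a rough illustration only: `json.dumps` stands in for `shared.to_json`, and `ResultEncodingError`, both function names, and the fallback message text are hypothetical.

```python
import json
from typing import Any


class ResultEncodingError(Exception):
    """Hypothetical error a strict implementation could raise."""


def encode_result_lenient(result: Any) -> str:
    """'Safe' option: fall back to a stringified error message."""
    try:
        return json.dumps(result)
    except TypeError:
        return json.dumps(f"Result could not be encoded: {result}")


def encode_result_strict(result: Any) -> str:
    """Alternative: surface the failure so the caller can fail the run."""
    try:
        return json.dumps(result)
    except TypeError as exc:
        raise ResultEncodingError(str(exc)) from exc
```

The lenient variant always completes but hides the failure inside a successful result, while the strict variant makes the failure visible at the cost of a failed orchestration.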

Contributor

Copilot AI left a comment

Pull request overview

This PR fixes several Durable Entities correctness issues in the Python SDK, aligning behavior with other Durable Task SDKs/backends (notably around entity ID normalization, entity failure propagation, and orchestration completion semantics).

Changes:

  • Normalize EntityInstanceId.entity and registered entity names to lowercase and tighten entity ID parsing validation.
  • Propagate entityOperationFailed events as task failures surfaced to user code (via TaskFailedError wrapping an EntityOperationFailedException).
  • Prevent orchestrations from hanging when user code returns a non-JSON-encodable result, and add extensive DTS (sidecar/emulator) E2E coverage for entities.
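
As a rough illustration of the failure propagation described above, the sketch below uses hypothetical stand-in classes (`TaskFailedSketch`, `EntityOperationFailedSketch`) and plain exception chaining. The SDK's real `TaskFailedError` and `EntityOperationFailedException` may carry the failure details differently.

```python
from typing import Any, Callable


class EntityOperationFailedSketch(Exception):
    """Hypothetical stand-in for EntityOperationFailedException."""


class TaskFailedSketch(Exception):
    """Hypothetical stand-in for TaskFailedError."""


def call_entity_sketch(operation: Callable[[], Any]) -> Any:
    """Run an entity operation and surface failures as task failures."""
    try:
        return operation()
    except Exception as exc:
        inner = EntityOperationFailedSketch(f"{type(exc).__name__}: {exc}")
        # Chain the entity failure so user code can inspect it.
        raise TaskFailedSketch("Entity operation failed") from inner


def failing_operation() -> None:
    raise RuntimeError("boom")


def orchestrator_body() -> str:
    # User code can now catch the failure instead of waiting forever.
    try:
        call_entity_sketch(failing_operation)
    except TaskFailedSketch as err:
        return f"recovered: {err.__cause__}"
    return "completed"
```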

Reviewed changes

Copilot reviewed 7 out of 10 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| tests/durabletask/entities/test_entity_id_parsing.py | Adds unit coverage for entity ID parsing and case-insensitivity. |
| tests/durabletask-azuremanaged/entities/test_dts_function_based_entities_e2e.py | Adds DTS E2E coverage for function-based entity scenarios (signal/call/locks/unlocks). |
| tests/durabletask-azuremanaged/entities/test_dts_entity_failure_handling.py | Adds DTS E2E coverage for entity operation failures and lock recovery behavior. |
| tests/durabletask-azuremanaged/entities/test_dts_class_based_entities_e2e.py | Adds DTS E2E coverage for class-based entities and custom names. |
| tests/durabletask-azuremanaged/entities/__init__.py | Introduces entities test package module. |
| durabletask/worker.py | Implements lowercase entity registration, removes cached entity instances, adds entity failure handling, and guards orchestration completion encoding. |
| durabletask/internal/json_encode_output_exception.py | Adds a custom exception type used to describe JSON encoding failures for orchestration outputs. |
| durabletask/entities/entity_operation_failed_exception.py | Defines a typed exception for failed entity operations (used for user-visible failures). |
| durabletask/entities/entity_instance_id.py | Lowercases entity names and hardens parsing against invalid formats. |
| durabletask/entities/__init__.py | Exposes EntityOperationFailedException as part of the public entities API. |

@andystaples andystaples requested a review from halspang February 9, 2026 21:19
Member

@halspang halspang left a comment

Seems fine overall, just a few questions.

Comment on lines +5 to +6
if "@" in key:
    raise ValueError("Entity key cannot contain '@' symbol.")
Member

This can also be a breaking change unless we had other ways of filtering this that were already present.

        self.problem_object = problem_object

    def __str__(self) -> str:
        return f"The orchestration result could not be encoded. Object details: {self.problem_object}"
Member

For my own edification, how does this encode into JSON? From the name of the class, I would also assume that its output is JSON, not just the raw string, but if this is the Python standard, that's fine.
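
On the encoding question, a small sketch may help. `JsonEncodeOutputExceptionSketch` is a hypothetical mirror of the class in the diff, and `json.dumps` stands in for `shared.to_json`, which is assumed to behave the same way for a plain `str`. Encoding a Python string produces a JSON string literal (quoted and escaped), not a JSON object.

```python
import json
from typing import Any


class JsonEncodeOutputExceptionSketch(Exception):
    """Hypothetical mirror of the exception class shown in the diff."""

    def __init__(self, problem_object: Any) -> None:
        super().__init__()
        self.problem_object = problem_object

    def __str__(self) -> str:
        return (
            "The orchestration result could not be encoded. "
            f"Object details: {self.problem_object}"
        )


# str(...) yields plain text; json.dumps wraps it as a JSON string literal.
message = str(JsonEncodeOutputExceptionSketch(ValueError("bad value")))
encoded = json.dumps(message)
decoded = json.loads(encoded)  # round-trips back to the same Python str
```

So the exception itself is never serialized; only its message is, and a consumer decoding the result receives a plain string.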

@@ -201,6 +201,7 @@ def add_entity(self, fn: task.Entity, name: Optional[str] = None) -> str:
def add_named_entity(self, name: str, fn: task.Entity) -> None:
Member

Should we have the @ check here as well?

orchestrators: dict[str, task.Orchestrator]
activities: dict[str, task.Activity]
entities: dict[str, task.Entity]
entity_instances: dict[str, DurableEntity]
Member

I know this is the intended fix for the leak issues, but will this have a significant impact on performance? Seems like an idle purge is a better solution, but if the creation is basically a no-op, I think this could be fine too. Just want to make sure we're considering both sides.
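
The trade-off raised here can be sketched with two hypothetical registries. None of these classes are from the SDK; they only contrast retaining one object per instance ID with creating a fresh object per invocation.

```python
class CounterEntity:
    """Hypothetical entity object; not an SDK class."""

    def __init__(self) -> None:
        self.value = 0


class CachingRegistry:
    """Keeps every instance it has ever created (the removed behavior)."""

    def __init__(self) -> None:
        self.entity_instances: dict[str, CounterEntity] = {}

    def get(self, instance_id: str) -> CounterEntity:
        if instance_id not in self.entity_instances:
            self.entity_instances[instance_id] = CounterEntity()
        return self.entity_instances[instance_id]


class PerInvocationRegistry:
    """Creates a fresh object per invocation and retains nothing."""

    def get(self, instance_id: str) -> CounterEntity:
        return CounterEntity()


caching = CachingRegistry()
fresh = PerInvocationRegistry()
for i in range(1000):
    caching.get(f"@counter@user{i}")
    fresh.get(f"@counter@user{i}")

retained = len(caching.entity_instances)  # one retained object per distinct ID
```

Whether per-invocation construction is effectively free depends on what a given entity's `__init__` does, so an idle purge remains a reasonable alternative if construction turns out to be expensive.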

def add_named_entity(self, name: str, fn: task.Entity) -> None:
    if not name:
        raise ValueError("A non-empty entity name is required.")
    name = name.lower()
Member

It won't let me comment in the test file, but do we have a test for registering an entity with different casing working?
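
A test along these lines might look like the sketch below. `RegistrySketch` is a hypothetical stand-in for the worker's registry, and the duplicate-name error message is an assumption; only the non-empty check and the lowercasing come from the diff.

```python
from typing import Any, Callable


class RegistrySketch:
    """Hypothetical stand-in for the worker's entity registry."""

    def __init__(self) -> None:
        self.entities: dict[str, Callable[..., Any]] = {}

    def add_named_entity(self, name: str, fn: Callable[..., Any]) -> None:
        if not name:
            raise ValueError("A non-empty entity name is required.")
        name = name.lower()
        if name in self.entities:
            # Assumed message; the SDK's wording may differ.
            raise ValueError(f"A '{name}' entity already exists.")
        self.entities[name] = fn


def test_registering_same_name_with_different_casing_fails() -> None:
    registry = RegistrySketch()
    registry.add_named_entity("counter", lambda ctx, _: None)
    try:
        registry.add_named_entity("Counter", lambda ctx, _: None)
    except ValueError:
        pass
    else:
        raise AssertionError("expected a duplicate-name ValueError")
    # Only the normalized name is stored.
    assert list(registry.entities) == ["counter"]
```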

Development

Successfully merging this pull request may close these issues.

Bug: Returning an exception object as the result of an orchestration causes it to never complete

2 participants