negvet (Collaborator) commented Jan 23, 2026

Description

This PR introduces semantic quantizer roles, e.g. linear:input and layernorm_linear:grad_output.
Roles are emitted by the module or op and passed through RecipeState.create(..., roles=...), so that the right quantizers can be constructed without relying on a position in a list.

Roles are currently used only by CustomRecipe, but can be extended to all recipes.
They can also be extended to arbitrary operations, e.g. dpa:qkv and dpa:s (scores) for attention.
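
As a rough illustration of the idea (not the code from this PR), a module could emit one role string per quantizer slot. The class ToyLinear and the backward tensor name grad_input are assumptions made for this sketch; only the get_quantizer_roles name, its fwd and num_quantizers arguments, and the linear:input, linear:weight, linear:output and grad_output names appear in this conversation.

```python
from typing import List


class ToyLinear:
    """Hypothetical stand-in for a module; not TransformerEngine code."""

    scope = "linear"

    def get_quantizer_roles(self, *, fwd: bool, num_quantizers: int) -> List[str]:
        # Forward names follow the roles quoted in this PR; the backward
        # list is an assumption made for illustration.
        tensors = ["input", "weight", "output"] if fwd else ["grad_output", "grad_input"]
        if num_quantizers != len(tensors):
            raise ValueError(f"expected {len(tensors)} quantizers, got {num_quantizers}")
        return [f"{self.scope}:{tensor}" for tensor in tensors]


roles = ToyLinear().get_quantizer_roles(fwd=True, num_quantizers=3)
# A consumer can now find a slot by role instead of by a hardcoded index.
slot_by_role = {role: index for index, role in enumerate(roles)}
```

Looking up slot_by_role["linear:weight"] keeps working even if the order of the slots changes later, which is the main benefit over index-based access.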

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Changes

Please list the changes introduced in this PR:

  • Change A
  • Change B

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

negvet and others added 4 commits January 23, 2026 15:14
…ipe state

Signed-off-by: Evgeny <etsykunov@nvidia.com>
Signed-off-by: Evgeny <etsykunov@nvidia.com>
Signed-off-by: Evgeny <etsykunov@nvidia.com>
@negvet negvet requested review from cyanguwa and timmoon10 January 23, 2026 15:32
greptile-apps bot (Contributor) commented Jan 23, 2026

Greptile Overview

Greptile Summary

This PR introduces semantic quantizer roles to TransformerEngine, replacing index-based quantizer identification with explicit role strings in the format <scope>:<tensor> (e.g., linear:input, layernorm_linear:grad_output).

Key changes:

  • Added roles parameter to RecipeState.create() factory method that passes semantic role strings to recipe states
  • Modified CustomRecipeState to require and validate roles, removing hardcoded role generation logic
  • Implemented get_quantizer_roles() method in base classes (TransformerEngineBaseModule, BasicOperation) and concrete implementations (Linear, LayerNormLinear, LayerNormMLP, GroupedLinear, BasicLinear)
  • Updated all quantizer factories to parse the new role format by splitting on : to extract scope and tensor type
  • Updated documentation in CustomRecipe to reflect the new role naming convention
  • All tests updated to validate the new format and parse roles correctly
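
A quantizer factory that consumes the <scope>:<tensor> format might look roughly like the sketch below. The names Role, parse_role and toy_qfactory are hypothetical, the factory returns a format label where real code would construct a quantizer object, and the choice of formats per tensor is an assumed policy for illustration only.

```python
from typing import NamedTuple


class Role(NamedTuple):
    scope: str
    tensor: str


def parse_role(role: str) -> Role:
    """Split '<scope>:<tensor>' and reject malformed strings."""
    scope, sep, tensor = role.partition(":")
    if not sep or not scope or not tensor or ":" in tensor:
        raise ValueError(f"role must look like '<scope>:<tensor>', got {role!r}")
    return Role(scope, tensor)


def toy_qfactory(role: str) -> str:
    """Hypothetical factory: returns a label instead of a real quantizer."""
    parsed = parse_role(role)
    # Assumed policy: gradient tensors get a wider-range format.
    if parsed.tensor.startswith("grad_"):
        return "e5m2"
    return "e4m3"
```

Validating the string inside the factory means an old-style role such as linear_input fails loudly instead of silently selecting a default.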

The implementation is clean, well-structured, and includes proper validation at multiple levels. The role format change is breaking for CustomRecipe users, but this is appropriate given the experimental status of the feature.

Confidence Score: 5/5

  • This PR is safe to merge with no identified issues
  • The implementation is well-designed with proper validation at multiple layers, comprehensive test coverage updates, and clear documentation. The role format change is breaking but justified for the experimental CustomRecipe API. All quantizer factories consistently validate the new format, and the abstraction through base classes ensures extensibility.
  • No files require special attention

Important Files Changed

Filename | Overview
transformer_engine/pytorch/quantization.py | Added roles parameter to RecipeState.create() factory method and enforces roles validation in CustomRecipeState.make_quantizers() - replaces hardcoded role generation with explicit role strings
transformer_engine/pytorch/module/base.py | Added get_quantizer_roles() abstract method to base module class and integrated role passing into set_meta_tensor() method with validation
transformer_engine/pytorch/module/linear.py | Implemented get_quantizer_roles() to return semantic role strings using new format: linear:input, linear:weight, linear:output for forward pass
transformer_engine/pytorch/ops/op.py | Added get_quantizer_roles() method to BasicOperation and integrated role passing into recipe state creation with validation
transformer_engine/common/recipe/__init__.py | Updated CustomRecipe documentation to reflect new role format: linear:input instead of linear_input
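
Because the role format changes from linear_input to linear:input, existing CustomRecipe factories that match on the old strings need updating. One possible way to bridge the two formats during a migration is sketched below. The helper to_colon_role and the KNOWN_SCOPES tuple are hypothetical; the scope names other than linear and layernorm_linear are guesses based on the module names listed above and are not confirmed by this PR.

```python
from typing import Sequence

# Assumed scope names; only "linear" and "layernorm_linear" appear in this PR.
KNOWN_SCOPES = ("layernorm_linear", "layernorm_mlp", "grouped_linear", "linear")


def to_colon_role(old_role: str, scopes: Sequence[str] = KNOWN_SCOPES) -> str:
    """Convert an underscore-style role to '<scope>:<tensor>'."""
    if ":" in old_role:
        return old_role  # already in the new format
    # Try longer scopes first so the most specific prefix wins.
    for scope in sorted(scopes, key=len, reverse=True):
        prefix = scope + "_"
        if old_role.startswith(prefix) and len(old_role) > len(prefix):
            return f"{scope}:{old_role[len(prefix):]}"
    raise ValueError(f"cannot infer scope for role {old_role!r}")
```

A list of known scopes is needed because tensor names such as grad_output contain underscores themselves, so the string cannot be split on the last underscore.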

Sequence Diagram

sequenceDiagram
    participant User
    participant Module as TransformerEngineModule
    participant RecipeState
    participant CustomRecipeState
    participant QFactory as User Quantizer Factory
    
    User->>Module: forward() with CustomRecipe
    activate Module
    Module->>Module: set_meta_tensor(fwd=True, recipe)
    Module->>Module: get_quantizer_roles(fwd=True, num_quantizers)
    Note over Module: Returns ["linear:input", "linear:weight", "linear:output"]
    Module->>RecipeState: create(recipe, mode="forward", num_quantizers=3, roles)
    activate RecipeState
    RecipeState->>CustomRecipeState: __init__(recipe, mode, num_quantizers)
    RecipeState->>CustomRecipeState: state.roles = roles
    RecipeState-->>Module: recipe_state
    deactivate RecipeState
    Module->>CustomRecipeState: make_quantizers()
    activate CustomRecipeState
    CustomRecipeState->>CustomRecipeState: Validate roles exist and match num_quantizers
    loop For each role
        CustomRecipeState->>QFactory: qfactory(role="linear:input")
        QFactory->>QFactory: Parse role (split on ":")
        QFactory-->>CustomRecipeState: Quantizer instance
    end
    CustomRecipeState-->>Module: List of quantizers
    deactivate CustomRecipeState
    Module->>Module: Store quantizers in self.quantizers
    Module->>Module: Execute forward pass with quantizers
    deactivate Module
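
The flow in the diagram can be approximated in plain Python as follows. This is a sketch under stated assumptions, not the TransformerEngine implementation: ToyCustomRecipeState and create_state are hypothetical stand-ins, and the factory returns a tuple where real code would return a quantizer instance.

```python
from typing import Callable, List, Optional, Sequence


class ToyCustomRecipeState:
    """Hypothetical stand-in mirroring the steps shown in the diagram."""

    def __init__(self, qfactory: Callable[[str], object], num_quantizers: int):
        self.qfactory = qfactory
        self.num_quantizers = num_quantizers
        self.roles: Optional[Sequence[str]] = None  # assigned by the creator

    def make_quantizers(self) -> List[object]:
        # Validate that roles exist and match the number of quantizers.
        if self.roles is None:
            raise ValueError("roles must be provided")
        if len(self.roles) != self.num_quantizers:
            raise ValueError(
                f"got {len(self.roles)} roles for {self.num_quantizers} quantizers"
            )
        # One factory call per role, in slot order.
        return [self.qfactory(role) for role in self.roles]


def create_state(qfactory, num_quantizers, roles):
    """Plays the part of the create step: build the state, then attach roles."""
    state = ToyCustomRecipeState(qfactory, num_quantizers)
    state.roles = roles
    return state


state = create_state(
    lambda role: ("quantizer-for", role),
    3,
    ["linear:input", "linear:weight", "linear:output"],
)
quantizers = state.make_quantizers()
```

Checking the role count before calling the factory keeps a mismatch between the module and the recipe state from producing a silently shortened quantizer list.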

greptile-apps bot (Contributor) commented Jan 23, 2026

Greptile's behavior is changing!

From now on, if a review finishes with no comments, we will not post an additional "statistics" comment to confirm that our review found nothing to comment on. However, you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

Signed-off-by: Evgeny <etsykunov@nvidia.com>
Signed-off-by: Evgeny <etsykunov@nvidia.com>
@greptile-apps greptile-apps bot left a comment

No files reviewed, no comments
