WIP: MDEV-38728 Improve join size estimation for ref access #4608

Olernov · 2026-02-02T14:56:34Z

When estimating the number of rows produced by a join after `ref` access, the optimizer assumes that every value from the driving table finds a match in the inner table. This causes overestimation when the driving table has more distinct values than the inner table's key.

Fix: use the number of distinct values (NDV) of the columns in the join predicate to calculate a match probability:
match_prob = min(1.0, NDV(inner) / NDV(driving))
The expected number of records after `ref` access is then multiplied by the match probability to give a more accurate estimate.
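
To make the formula concrete, here is a minimal Python sketch of the calculation. It is not the patch itself: the function names, the `rows_after_ref` parameter and the fallback for a non-positive driving NDV are assumptions made for illustration only.

```python
def match_probability(ndv_inner: float, ndv_driving: float) -> float:
    """Probability that a driving-table value finds a match in the inner table."""
    if ndv_driving <= 0:
        # Assumption: without a usable driving NDV, keep the old behaviour
        # (every driving value is assumed to match).
        return 1.0
    return min(1.0, ndv_inner / ndv_driving)


def adjusted_ref_rows(rows_after_ref: float, ndv_inner: float, ndv_driving: float) -> float:
    """Scale the row estimate after ref access by the match probability."""
    return rows_after_ref * match_probability(ndv_inner, ndv_driving)


# Driving column has 1000 distinct values, inner key has only 250:
# at most a quarter of the driving values can find a match.
print(match_probability(250.0, 1000.0))          # 0.25
print(adjusted_ref_rows(4000.0, 250.0, 1000.0))  # 1000.0
```

When the inner table has at least as many distinct values as the driving table, the probability is capped at 1.0 and the estimate is unchanged.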

Limitations:

- EITS must be available for both columns in the join predicate
- both columns must be real table fields
- only single-column `ref` access is supported
- only the first key part of the inner table's index is used
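
The limitations above can be read as a precondition check that runs before the NDV-based adjustment is applied. The sketch below is one possible reading, not code from the patch: `JoinColumn`, `ndv_estimate_applies` and all of their fields are hypothetical names, and treating "first key part" as "the inner column is key part 0" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class JoinColumn:
    is_real_field: bool  # a real table column, not an expression
    has_eits: bool       # EITS statistics are available for this column


def ndv_estimate_applies(driving: JoinColumn, inner: JoinColumn,
                         ref_key_parts: int, inner_key_part_no: int) -> bool:
    """Return True only if none of the listed limitations is violated."""
    if ref_key_parts != 1:
        return False  # only single-column ref access is supported
    if inner_key_part_no != 0:
        return False  # assumption: the inner column must be the first key part
    if not (driving.is_real_field and inner.is_real_field):
        return False  # both columns must be real table fields
    return driving.has_eits and inner.has_eits  # EITS needed on both sides


col = JoinColumn(is_real_field=True, has_eits=True)
no_stats = JoinColumn(is_real_field=True, has_eits=False)
print(ndv_estimate_applies(col, col, ref_key_parts=1, inner_key_part_no=0))       # True
print(ndv_estimate_applies(col, no_stats, ref_key_parts=1, inner_key_part_no=0))  # False
```

When the check fails, the optimizer would presumably keep its previous estimate, i.e. a match probability of 1.0.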

TODO:

- A WHERE filter on the driving table may reduce its NDV and thus affect the estimate. Currently this is handled only in a basic way (driving_ndv must be <= the number of records in the current partial join)
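
A rough illustration of why this bound matters, assuming it is applied as a cap on the driving NDV before the division (the description does not spell out the mechanism). The function name and parameters are hypothetical.

```python
def clamped_match_probability(ndv_inner: float, ndv_driving: float,
                              partial_join_rows: float) -> float:
    """Match probability with driving NDV capped by the partial join size."""
    # A partial join of N rows cannot carry more than N distinct values,
    # so the statistics-based NDV is capped at N (assumed interpretation).
    effective_ndv = min(ndv_driving, partial_join_rows)
    if effective_ndv <= 0:
        return 1.0  # assumption: fall back to "every value matches"
    return min(1.0, ndv_inner / effective_ndv)


# Statistics say 1000 distinct driving values, but a WHERE filter leaves
# only 50 rows in the partial join; the inner key has 100 distinct values.
print(clamped_match_probability(100.0, 1000.0, 50.0))    # 1.0
# With 400 rows surviving, the cap still lowers the driving NDV.
print(clamped_match_probability(100.0, 1000.0, 400.0))   # 0.25
# With 5000 rows, the cap has no effect and the raw ratio is used.
print(clamped_match_probability(100.0, 1000.0, 5000.0))  # 0.1
```

Without the cap, the first case would yield 0.1 and underestimate the join size tenfold.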

This commit overwrites only those test results that have been verified, i.e. those where the join size estimate improved. Other failing tests have not been verified yet.


Olernov added the MariaDB Corporation label Feb 2, 2026
