feat: v1.6.0 search quality improvements #26
base: master
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 08d243d956
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
08d243d to 91b6633
91b6633 to bc836d1
@codex review
💡 Codex Review
Reviewed commit: bc836d15be
bc836d1 to 26aa3f7
@codex review
💡 Codex Review
Reviewed commit: 26aa3f7f2a
26aa3f7 to f89e7a3
…nker
- Reciprocal Rank Fusion (k=60, rank-based scoring)
- Intent-aware query classification (5 types)
- Hard test-file filtering (0% spec contamination)
- Import-graph proximity reranking
- File-level deduplication
- Chunk optimization (100→50 lines, 0 overlap)
- Embedding upgrade (@huggingface/transformers v3)

BREAKING CHANGE: Re-indexing required due to model and chunking changes
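For readers unfamiliar with the Reciprocal Rank Fusion item in the commit message above, here is a minimal sketch of standard RRF with k=60. It is not taken from this PR: the function name `reciprocalRankFusion`, the input shape (best-first lists of file IDs), and the sample file names are all assumptions made for illustration.

```typescript
// Hypothetical helper (not from the PR): standard Reciprocal Rank Fusion.
// Each input list is ordered best-first; a document's fused score is the
// sum over lists of 1 / (k + rank), where rank is 1-based.
function reciprocalRankFusion(rankedLists: string[][], k = 60): Map<string, number> {
  const fused = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      fused.set(id, (fused.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return fused;
}

// Illustrative rankings from two hypothetical retrievers.
const semanticRanking = ["auth.service.ts", "user.model.ts", "login.component.ts"];
const keywordRanking = ["login.component.ts", "auth.service.ts"];

const fusedScores = reciprocalRankFusion([semanticRanking, keywordRanking]);

// Sort by fused score, highest first.
const ordered = Array.from(fusedScores.entries())
  .sort((a, b) => b[1] - a[1])
  .map(([id]) => id);
```

Because RRF uses only ranks, the raw scores of the individual retrievers never need to be on a common scale, which is presumably why the commit message calls it rank-based scoring.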
f89e7a3 to 5346766
@codex review
💡 Codex Review
Reviewed commit: 534676663a
const reranked = scored.map(({ result, crossScore }) => ({
  ...result,
  // Normalize cross-encoder score to 0-1 range for the top slot
  score: crossScore
Normalize reranker logits before overwriting result scores
This writes raw cross-encoder logits into SearchResult.score, but downstream quality logic (assessSearchQuality) interprets scores as normalized confidence values (for example, low-score and separation thresholds). When reranking is triggered on ambiguous queries (enabled by default), negative or unbounded logits can incorrectly downgrade otherwise good results to low_confidence and distort confidence signals returned to users. Convert reranker outputs to a bounded confidence scale (or store them separately) before replacing score.
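One way to follow this suggestion is to pass each logit through a logistic sigmoid, which maps any real number into (0, 1), and to keep the raw logit in a separate field. The sketch below is only an illustration under assumptions: the `ScoredResult` and `RerankedResult` shapes, the `rerankLogit` field, and the `applyRerankScores` function are hypothetical and are not the PR's actual types or fix.

```typescript
// Hypothetical shapes; the real SearchResult type in the PR may differ.
interface ScoredResult {
  filePath: string;
  score: number;
}

interface RerankedResult extends ScoredResult {
  rerankLogit: number; // raw cross-encoder output, kept for debugging
}

// Logistic sigmoid: maps any real-valued logit into the open interval (0, 1).
function sigmoid(logit: number): number {
  return 1 / (1 + Math.exp(-logit));
}

// Replace score with a bounded confidence and sort best-first.
// Sigmoid is monotonic, so ordering by score matches ordering by logit.
function applyRerankScores(
  scored: { result: ScoredResult; crossScore: number }[]
): RerankedResult[] {
  const reranked: RerankedResult[] = scored.map(({ result, crossScore }) => ({
    filePath: result.filePath,
    score: sigmoid(crossScore),
    rerankLogit: crossScore,
  }));
  return reranked.sort((a, b) => b.score - a.score);
}

// Illustrative input with a negative and a large positive logit.
const rerankedSample = applyRerankScores([
  { result: { filePath: "a.ts", score: 0.4 }, crossScore: -3.2 },
  { result: { filePath: "b.ts", score: 0.3 }, crossScore: 5.1 },
]);
```

Whether fixed thresholds in `assessSearchQuality` remain meaningful on sigmoid-scaled values is a separate calibration question that this sketch does not address.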
if (this.isTemplateOrStyleFile(chunk.filePath)) {
  combinedScore *= 0.75;
Restrict template/style demotion to behavioral queries
The template/style penalty currently applies to every query, even though the comment says it is for behavioral queries. This means style- or template-focused searches (for example, layout/theme/CSS questions) always get a 25% penalty on the most relevant .html/.scss files, which can push unrelated implementation files above the correct targets and reduce retrieval quality.
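A gated version of the penalty might look like the sketch below. It assumes the classifier's intent is available where the score is combined. The `QueryIntent` labels, the standalone `isTemplateOrStyleFile` stand-in (an extension check), and `applyTemplateStylePenalty` are hypothetical; the PR mentions five intent types but this page does not name them.

```typescript
// Hypothetical intent labels; the PR's five intent types are not named here.
type QueryIntent = "behavioral" | "other";

// Stand-in for the PR's isTemplateOrStyleFile method, assumed to be an extension check.
function isTemplateOrStyleFile(filePath: string): boolean {
  return /\.(html|scss|css)$/i.test(filePath);
}

// Apply the 25% demotion only when the query is behavioral.
function applyTemplateStylePenalty(
  combinedScore: number,
  filePath: string,
  intent: QueryIntent
): number {
  if (intent === "behavioral" && isTemplateOrStyleFile(filePath)) {
    return combinedScore * 0.75;
  }
  return combinedScore;
}

// A style-focused query keeps the full score for a .scss file;
// a behavioral query demotes the same file.
const styleQueryScore = applyTemplateStylePenalty(0.8, "theme.component.scss", "other");
const behavioralQueryScore = applyTemplateStylePenalty(0.8, "theme.component.scss", "behavioral");
```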
Summary
@xenova/transformers v2 → @huggingface/transformers v3 (granite model, 8192 context)
BREAKING CHANGE
Re-indexing required after upgrade. Existing .codebase-context/ indices from v1.5.x are incompatible.
Test plan
pnpm build succeeds
pnpm test: 135/135 passing