Note: this PR should not be merged. This is a notebook that explores the P95 and P99 values of a few review metrics.
The purpose of putting it in a PR is to let folks explore the data
themselves and hopefully drive discussion on how we can improve.
How to use the notebook
cd cuda_core
pixi run -e rev-stats jupyter lab rev-stats.ipynb
then run all the cells.
Metrics
Time to First Review: Duration between PR creation and first review
Time to Merge: Duration between PR creation and merge
Time to Close: Duration between PR creation and either closing or merging it
Time from Final Review to Close: Duration between the final review comment and merge
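For reference, here is a minimal sketch of how these durations and their percentiles might be computed. The DataFrame and its column names (created_at, first_review_at, merged_at, closed_at) are hypothetical stand-ins for whatever the notebook actually pulls from the GitHub data; the real code may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical PR records; the notebook builds the real equivalent from GitHub data.
prs = pd.DataFrame({
    "created_at": pd.to_datetime(["2024-01-01", "2024-01-05", "2024-02-10"]),
    "first_review_at": pd.to_datetime(["2024-01-02", "2024-01-20", "2024-02-11"]),
    "merged_at": pd.to_datetime(["2024-01-03", "2024-02-01", pd.NaT]),
    "closed_at": pd.to_datetime(["2024-01-03", "2024-02-01", "2024-03-01"]),
})

def p95_p99(durations: pd.Series) -> tuple[float, float]:
    """Return the P95 and P99 of a series of timedeltas, in days."""
    days = durations.dropna().dt.total_seconds() / 86400
    return np.percentile(days, 95), np.percentile(days, 99)

time_to_first_review = prs["first_review_at"] - prs["created_at"]
time_to_merge = prs["merged_at"] - prs["created_at"]
time_to_close = prs["closed_at"] - prs["created_at"]

print("Time to First Review (P95, P99 days):", p95_p99(time_to_first_review))
print("Time to Merge        (P95, P99 days):", p95_p99(time_to_merge))
print("Time to Close        (P95, P99 days):", p95_p99(time_to_close))
```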
Places where we are doing well
Time from Final Review to Close: This is in a solid place; we're not
waiting too long to click the merge button after approval. P95: 2 days, P99: 14 days (the P99 isn't ideal, but it isn't concerning).
Places we can improve
Time to First Review: P95 here is 9.7 days, meaning 95% of PRs get their
first review within that time. P99 is 45 days, which is something we should
address.
Time to Close: This metric includes merged PRs along with PRs that were closed but not merged. P95 is 28 days, P99 is 76 days.
While these distributions have long tails, I think we can greatly improve the P95 here.
Time to Merge: This is a subset of Time to Close, and it's a bit
better (though not by much). A PR that ends up being merged tends to
be merged faster than one that is closed without merging, but of course we don't know a priori which PRs will be merged.
Would love to see what others think!
Perhaps there are other interesting metrics to calculate that would help us
determine how to improve our PR turnaround times.
Just brainstorming:
In an ideal world, reviews would be spread evenly across all potential reviewers. In practice, most PRs are probably bottlenecked on the most experienced reviewers. It might be interesting to track this over time and make sure we are "growing more experts" (see the sketch after this list).
I don't think you could glean this from the data set, but it would be interesting to know why reviews are delayed -- is it because the reviewer doesn't feel like they have enough context or experience in the code base? Maybe we could encourage a culture of self-reporting "I don't feel like I can give this a good review", with bonus points for "I will spend the time reading / poking / experimenting enough so that I feel confident". That last part is expensive, of course.
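As a rough illustration of the first idea, here is a minimal sketch of how review-load concentration across reviewers could be measured. The reviewer names and counts are made up, and the share-of-reviews calculation is just one possible way to quantify it; in practice the pairs would come from the same data set the notebook uses.

```python
from collections import Counter

# Hypothetical (reviewer, PR number) pairs.
reviews = [
    ("alice", 101), ("alice", 102), ("alice", 103), ("alice", 104),
    ("bob", 105), ("bob", 106),
    ("carol", 107),
]

counts = Counter(reviewer for reviewer, _ in reviews)
total = sum(counts.values())

# Share of all reviews handled by each reviewer; tracking how concentrated
# this is over time would show whether we're "growing more experts".
for reviewer, n in counts.most_common():
    print(f"{reviewer}: {n} reviews ({n / total:.0%} of the total)")

# A single summary number: the share handled by the busiest reviewer.
top_share = counts.most_common(1)[0][1] / total
print(f"Top reviewer handles {top_share:.0%} of reviews")
```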