first submission check #81
Conversation
|
Please don't merge yet; this is a test submission to get evaluation results and check format correctness. |
|
Eval run succeeded! Link to run: link. Here are the results of the submission(s): test_run (release date: 2026-01-31). I've committed detailed results of this detector's performance on the test set to this PR. On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 99.64 and a TPR of 98.66% at FPR=5% and 95.33% at FPR=1%. If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID! |
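For readers unfamiliar with the metrics quoted above, here is a minimal sketch of how AUROC and TPR at a fixed FPR can be computed from detector scores. It is not RAID's evaluation code: the function names `auroc` and `tpr_at_fpr`, the "higher score means more likely machine-generated" convention, and the toy scores are all assumptions made for illustration. The values come out as fractions in [0, 1], whereas the report above uses a 0 to 100 scale.

```python
from typing import Sequence


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """Probability that a random machine-written text outscores a random
    human-written text; ties count as half a win."""
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


def tpr_at_fpr(
    human_scores: Sequence[float],
    machine_scores: Sequence[float],
    target_fpr: float,
) -> float:
    """Find the lowest threshold whose false positive rate on human text is
    at most target_fpr (assumed < 1), then report the share of machine text
    flagged at that threshold. A text is flagged when score > threshold."""
    for threshold in sorted(set(human_scores)):
        fpr = sum(h > threshold for h in human_scores) / len(human_scores)
        if fpr <= target_fpr:
            flagged = sum(m > threshold for m in machine_scores)
            return flagged / len(machine_scores)
    raise ValueError("human_scores must not be empty")


if __name__ == "__main__":
    # Hypothetical toy scores, not taken from any real detector.
    human = [0.1, 0.2, 0.3, 0.4]
    machine = [0.35, 0.8, 0.9, 0.95]
    print(auroc(human, machine))             # 0.9375
    print(tpr_at_fpr(human, machine, 0.25))  # 1.0
    print(tpr_at_fpr(human, machine, 0.0))   # 0.75
```

The pairwise AUROC loop is quadratic in the number of texts, which is fine for a toy example; a rank-based computation would be the usual choice on a dataset of RAID's size.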
|
Fixed a bug; waiting for testing approval, please. |
|
Eval run succeeded! Link to run: link. Here are the results of the submission(s): test_run (release date: 2026-02-02). I've committed detailed results of this detector's performance on the test set to this PR. On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 99.87 and a TPR of 99.47% at FPR=5% and 98.21% at FPR=1%. If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID! |
|
@liamdugan can you please approve for testing? 🙏 |
|
This should be the final version. If we receive the same score, it's good to merge:
|
|
Eval run succeeded! Link to run: link. Here are the results of the submission(s): Grammarly (release date: 2026-02-09). I've committed detailed results of this detector's performance on the test set to this PR. On the RAID dataset as a whole (aggregated across all generation models, domains, decoding strategies, repetition penalties, and adversarial attacks), it achieved an AUROC of 99.87 and a TPR of 99.47% at FPR=5% and 98.21% at FPR=1%. If all looks well, a maintainer will come by soon to merge this PR and your entry/entries will appear on the leaderboard. If you need to make any changes, feel free to push new commits to this PR. Thanks for submitting to RAID! |
|
@liamdugan Looks good! Let's merge please 🙏 |
|
Yep will do! Congrats on the strong performance |
|
Thanks for the prompt answer, @liamdugan! Your benchmark is such a cool idea. |