
A Rust library integrated with ONNX Runtime, providing a collection of Computer Vision and Vision-Language models such as YOLO, FastVLM, and more.


usls



usls is a cross-platform Rust library powered by ONNX Runtime for efficient inference of SOTA vision and vision-language models (typically under 1B parameters).

(Hero image generated by Seedream 4.5)

🌟 Highlights

  • ⚡ High Performance: Multi-threading, SIMD, and CUDA-accelerated processing
  • ✨ Cross-Platform: Linux, macOS, Windows with ONNX Runtime execution providers (CUDA, TensorRT, CoreML, OpenVINO, DirectML, etc.)
  • 🎯 Precision Support: FP32, FP16, INT8, UINT8, Q4, Q4F16, BNB4, and more
  • 🛠️ Full-Stack Suite: DataLoader, Annotator, and Viewer for complete workflows
  • 🏗️ Unified API: A single `Model` trait exposing `run()` / `forward()` / `encode_images()` / `encode_texts()`, all returning a unified `Y` output
  • 📥 Auto-Management: Automatic model download (HuggingFace/GitHub), caching and path resolution
  • 📦 Multiple Inputs: Image, directory, video, webcam, stream and combinations
  • 🌱 Model Ecosystem: 50+ SOTA vision and VLM models
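
The "single trait, unified output" idea above can be sketched in a few lines of self-contained Rust. This is a hypothetical illustration of the pattern, not the actual usls API: the names `Model`, `run`, `Y`, and `Detector` mirror the description in this README, but the real signatures in the crate may differ.

```rust
/// Unified output type: every model returns the same enum, so downstream
/// code (annotation, serialization) stays model-agnostic.
#[derive(Debug, PartialEq)]
enum Y {
    Bboxes(Vec<(f32, f32, f32, f32)>), // detection boxes (x, y, w, h)
    Probs(Vec<f32>),                   // classification scores
}

/// One trait for every model: callers only ever see `run`.
trait Model {
    fn run(&mut self, xs: &[Vec<u8>]) -> Vec<Y>;
}

struct Detector;

impl Model for Detector {
    fn run(&mut self, xs: &[Vec<u8>]) -> Vec<Y> {
        // A real implementation would run an ONNX Runtime session here;
        // this stub just returns one fixed box per input image.
        xs.iter()
            .map(|_| Y::Bboxes(vec![(0.0, 0.0, 10.0, 10.0)]))
            .collect()
    }
}

fn main() {
    // Any model behind the trait object is driven the same way.
    let mut model: Box<dyn Model> = Box::new(Detector);
    let ys = model.run(&[vec![0u8; 4]]);
    assert_eq!(ys.len(), 1);
    println!("{:?}", ys[0]);
}
```

The payoff of this design is that swapping a detector for a classifier changes only the construction line, never the inference loop.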

🚀 Quick Start

Run the YOLO-series demo to explore models across different tasks, precisions, and execution providers:

  • Tasks: detect, segment, pose, classify, obb
  • Versions: v5, v6, v7, v8, v9, v10, 11, 12, v13, 26
  • Scales: n, s, m, l, x
  • Precision: fp32, fp16, q8, int8, q4, q4f16, bnb4, and more
  • Execution Providers: CPU, CUDA, TensorRT, TensorRT-RTX, CoreML, OpenVINO, and more
**CPU**

```shell
cargo run -r --example yolo -- --task detect --ver 26 --scale n --dtype fp16
```

**NVIDIA CUDA + CUDA image processor**

```shell
cargo run -r -F cuda --example yolo -- --task segment --ver 11 --scale m --device cuda:0 --processor-device cuda:0
```

**NVIDIA TensorRT + CUDA image processor**

```shell
cargo run -r -F tensorrt-full --example yolo -- --device tensorrt:0 --processor-device cuda:0
```

**NVIDIA TensorRT-RTX + CUDA image processor**

```shell
cargo run -r -F nvrtx-full --example yolo -- --device nvrtx:0 --processor-device cuda:0
```

**Apple Silicon CoreML**

```shell
cargo run -r -F coreml --example yolo -- --device coreml
```

**Intel OpenVINO (CPU/GPU/VPU)**

```shell
cargo run -r -F openvino -F ort-load-dynamic --example yolo -- --device openvino:CPU
```

📊 Performance Benchmarks

Environment: NVIDIA RTX 3060Ti (TensorRT-10.11.0.33, CUDA 12.8, TensorRT-RTX-1.3.0.35) / Intel i5-12400F

Setup: YOLO26 Detection, COCO2017-val (5,000 images), 640x640, Conf thresholds: [0.35, 0.3, ..]

Results are for rough reference only.

| Scale | EP | Image Processor | DType | Batch | Preprocess | Inference | Postprocess | Total |
|-------|----|-----------------|-------|-------|------------|-----------|-------------|-------|
| n | TensorRT | CUDA | FP16 | 1 | ~233µs | ~1.3ms | ~14µs | ~1.55ms |
| n | TensorRT-RTX | CUDA | FP32 | 1 | ~233µs | ~2.0ms | ~10µs | ~2.24ms |
| n | TensorRT-RTX | CUDA | FP16 | 1 | | | | |
| n | CUDA | CUDA | FP32 | 1 | ~233µs | ~5.0ms | ~17µs | ~5.25ms |
| n | CUDA | CUDA | FP16 | 1 | ~233µs | ~3.6ms | ~17µs | ~3.85ms |
| n | CUDA | CPU | FP32 | 1 | ~800µs | ~6.5ms | ~14µs | ~7.31ms |
| n | CUDA | CPU | FP16 | 1 | ~800µs | ~5.0ms | ~14µs | ~5.81ms |
| n | CPU | CPU | FP32 | 1 | ~970µs | ~20.5ms | ~14µs | ~21.48ms |
| n | CPU | CPU | FP16 | 1 | ~970µs | ~25.0ms | ~14µs | ~25.98ms |
| n | TensorRT | CUDA | FP16 | 8 | ~1.2ms | ~6.0ms | ~55µs | ~7.26ms |
| n | TensorRT | CPU | FP16 | 8 | ~18.0ms | ~25.5ms | ~55µs | ~43.56ms |
| m | TensorRT | CUDA | FP16 | 1 | ~233µs | ~3.6ms | ~14µs | ~3.85ms |
| m | TensorRT | CUDA | Int8 | 1 | ~233µs | ~2.6ms | ~14µs | ~2.84ms |
| m | CUDA | CUDA | FP32 | 1 | ~233µs | ~16.1ms | ~17µs | ~16.35ms |
| m | CUDA | CUDA | FP16 | 1 | ~233µs | ~8.8ms | ~17µs | ~9.05ms |
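
As a rough sanity check on the table, the Total column is approximately the sum of the three stage timings, and single-image throughput follows as 1/total. A quick sketch using the first row's numbers (all values taken from the table above):

```rust
fn main() {
    // First table row (n / TensorRT / CUDA / FP16 / batch 1), in milliseconds.
    let (pre, infer, post) = (0.233_f64, 1.3, 0.014);

    let total = pre + infer + post;   // 1.547 ms, matching the table's ~1.55ms
    let fps = 1000.0 / total;         // batch 1 -> roughly 646 images per second

    println!("total = {total:.2} ms, ~{fps:.0} img/s");
}
```

The same arithmetic applied to any other row shows why moving preprocessing to the GPU matters: at these inference times, a ~800µs CPU preprocess is a large fraction of the end-to-end budget.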

What's Next?

🤝 Contributing

This is a personal project maintained in spare time, so progress on performance optimization and new model support may vary.

We highly welcome PRs for model optimization! If you have expertise in specific models and can help optimize their interfaces or post-processing, your contributions would be invaluable. Feel free to open an issue or submit a pull request for suggestions, bug reports, or new features.

🙏 Acknowledgments

Thanks to all the open-source libraries and their maintainers that make this project possible. See Cargo.toml for a complete list of dependencies.

📜 License

This project is licensed under the terms in the LICENSE file.
