AI Document Digitizer is an intelligent tool powered by Qwen-VL that extracts structured data from scanned forms and invoices. It features a strict extraction mode that ignores boilerplate text to focus on key business values. The system provides a FastAPI backend and Gradio UI with visual verification and multi-format exports.

ituvtu/qwen-doc-parser

📄 AI Document Digitizer (Qwen3-VL)

An intelligent document processing API & UI powered by the Qwen3-VL multimodal model. It extracts structured data from scanned forms, invoices, and documents, ignoring boilerplate text and focusing on business values.

🚀 Features

  • Strict Data Extraction: Ignores legal text and instructions and extracts only the field values.
  • Smart Formatting: Converts tables and forms into structured JSON.
  • Visual Verification: Interactive UI highlights extracted fields on the image.
  • Multiple Exports: Download results as JSON, CSV, or Excel.
  • Dual Mode: Works as a Web UI (Gradio) and a REST API (FastAPI) simultaneously.
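As a rough illustration of the structured-JSON and export features, the sketch below flattens a nested extraction result and writes it as CSV using only the standard library. The field names (`invoice_number`, `items`, and so on) and the `flatten` and `to_csv` helpers are hypothetical; the project's actual output schema and export code may differ.

```python
import csv
import io
import json


def flatten(value, prefix=""):
    """Flatten nested dicts/lists into a single {dotted.path: scalar} dict."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}{key}."))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}{index}."))
    else:
        flat[prefix.rstrip(".")] = value
    return flat


def to_csv(result):
    """Render a flattened result as two-column CSV text (field, value)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["field", "value"])
    for field, value in flatten(result).items():
        writer.writerow([field, value])
    return buffer.getvalue()


# Hypothetical extraction result; the real schema is defined by the model output.
sample = json.loads(
    '{"invoice_number": "INV-001", "total": 125.5,'
    ' "items": [{"name": "Widget", "qty": 2}]}'
)
print(to_csv(sample))
```

Dotted paths such as `items.0.name` keep table rows distinguishable in a flat file; a real export might instead write one CSV row per table line.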

🛠️ Installation

  1. Clone the repository:

    git clone https://github.com/ituvtu/qwen-doc-parser.git
    cd qwen-doc-parser
  2. Create a virtual environment and install dependencies:

    python -m venv .venv
    # Windows:
    .venv\Scripts\activate
    # Mac/Linux:
    source .venv/bin/activate
    pip install -r requirements.txt
  3. Set up environment variables: copy .env.example to .env and add your Hugging Face token:

    HF_TOKEN=your_token_here
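A minimal sketch of how an application could read this token at startup and fail early when it is missing. The `get_hf_token` helper is hypothetical and not taken from the project; only the `HF_TOKEN` variable name and the `your_token_here` placeholder come from the step above. It assumes the variable is already present in the process environment (for example, passed in by Docker or exported in the shell), since Python does not read .env files on its own.

```python
import os


def get_hf_token(env=None):
    """Return the Hugging Face token from the environment, or raise if unset."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    # Treat the placeholder from .env.example as "not configured".
    if not token or token == "your_token_here":
        raise RuntimeError(
            "HF_TOKEN is not set; copy .env.example to .env and add your token"
        )
    return token


if __name__ == "__main__":
    try:
        get_hf_token()
        print("HF_TOKEN found")
    except RuntimeError as error:
        print(f"Configuration problem: {error}")
```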
▶️ Usage

Run with Docker (Recommended)

Option A: Using a .env file (best for security)

    docker build -t qwen-doc-parser .
    docker run -p 7860:7860 --env-file .env qwen-doc-parser

Option B: Passing the token directly (build the image first, as in Option A)

    docker run -p 7860:7860 -e HF_TOKEN=hf_YourTokenHere qwen-doc-parser

Run Locally

    uvicorn app.main:app --host 0.0.0.0 --port 7860 --reload

Open your browser at http://localhost:7860.
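If the page does not load right away, a small readiness check can help distinguish "still starting" from "not running". The sketch below only polls the TCP port from the commands above; the `wait_for_port` helper is hypothetical and does not verify that the UI or API is actually healthy.

```python
import socket
import time


def wait_for_port(host="localhost", port=7860, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Nothing is listening yet; wait a little and try again.
            time.sleep(interval)
    return False


if __name__ == "__main__":
    if wait_for_port(timeout=3.0):
        print("Server is accepting connections on port 7860")
    else:
        print("Server did not come up in time")
```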

📡 API Example

You can use the API to extract data programmatically:

curl -X POST "http://localhost:7860/api/v1/extract" \
     -H "accept: application/json" \
     -H "Content-Type: multipart/form-data" \
     -F "file=@/path/to/invoice.jpg"
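The same request can be sketched in Python with only the standard library. The endpoint path and the `file` form field are taken from the curl command above; the `build_multipart` and `extract` helpers are hypothetical, and the code assumes the endpoint answers with a JSON body, as the `accept` header suggests.

```python
import json
import mimetypes
import uuid
from pathlib import Path
from urllib import request


def build_multipart(field, filename, data):
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def extract(image_path, base_url="http://localhost:7860"):
    """POST an image to /api/v1/extract and return the decoded JSON response."""
    path = Path(image_path)
    body, content_type = build_multipart("file", path.name, path.read_bytes())
    req = request.Request(
        f"{base_url}/api/v1/extract",
        data=body,
        headers={"Content-Type": content_type, "Accept": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as response:
        return json.load(response)
```

With the server running, `print(extract("/path/to/invoice.jpg"))` would show the parsed response. The structure of that response is not documented in this README, so inspect it before relying on specific keys.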

🧪 Quick Test Script

The project includes a Python script, test_api.py, for quickly verifying that the API works.

  1. Open test_api.py and update the IMAGE_PATH variable to point to your test image.
  2. Run the script:
    python test_api.py
