Learn Java programming and more through streaming answers, citations, and guided lessons, all grounded in a knowledge base of ingested documentation (retrieval-augmented generation, RAG).
Built with Spring Boot + WebFlux, Svelte, and Qdrant.
- Streaming chat over SSE (/api/chat/stream) with a final citation event
- Guided learning mode (/learn) with lesson-scoped chat (/api/guided/*)
- Documentation ingestion pipeline (fetch → chunk → embed → dedupe → index)
- Chunking uses JTokkit's CL100K_BASE tokenizer (GPT-3.5/4 style) for token counting
- Embeddings use strict provider selection and fail fast when the provider is unavailable (no runtime fallbacks)
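
The fail-fast behavior in the last bullet can be illustrated with a minimal sketch. This is not the project's actual code: the `EmbeddingClient` interface, the `select` method, and the provider names `local` and `remote` are all hypothetical and exist only to show the idea that a missing provider raises an error instead of silently switching to another one.

```java
import java.util.Map;

public class StrictProviderSelectionSketch {

    /** Hypothetical embedding client: turns text into a vector. */
    public interface EmbeddingClient {
        float[] embed(String text);
    }

    /**
     * Returns the client for the configured provider, or throws.
     * There is deliberately no fallback to any other registered provider.
     */
    public static EmbeddingClient select(String configured, Map<String, EmbeddingClient> available) {
        if (configured == null || configured.isBlank()) {
            throw new IllegalStateException("No embedding provider configured");
        }
        EmbeddingClient client = available.get(configured);
        if (client == null) {
            throw new IllegalStateException(
                "Embedding provider '" + configured + "' is unavailable; refusing to fall back");
        }
        return client;
    }

    public static void main(String[] args) {
        // Toy client for illustration: a one-dimensional "embedding" holding the text length.
        EmbeddingClient local = text -> new float[] {text.length()};
        Map<String, EmbeddingClient> available = Map.of("local", local);

        System.out.println(select("local", available).embed("hello")[0]);

        try {
            select("remote", available); // "remote" is not registered
        } catch (IllegalStateException e) {
            System.out.println("Startup aborted: " + e.getMessage());
        }
    }
}
```

Failing at selection time surfaces a misconfigured provider immediately, rather than indexing vectors from an unintended model that would not be comparable with the existing collection.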
This project uses Gradle Toolchains with Temurin JDK 25 and mise (or asdf) for reproducible builds.
# Install mise if you don't have it: https://mise.jdx.dev/
mise install

# Install asdf if you don't have it: https://asdf-vm.com/
asdf plugin add java https://github.com/halcyon/asdf-java.git
asdf install

What happens: Gradle Toolchains will auto-download Temurin JDK 25 on first build if not present locally. The mise/asdf setup ensures your shell and IDE (IntelliJ) use the correct Java version.
cp .env.example .env
# edit .env and set GITHUB_TOKEN or OPENAI_API_KEY
make compose-up # optional local Qdrant
make dev

Open http://localhost:8085/.
make full-pipeline # fetch all docs + ingest into Qdrant
make process-all # ingest only (incremental, upload to Qdrant)
REPO_PATH=/absolute/path/to/repository make process-github-repo
REPO_URL=https://github.com/owner/repository make process-github-repo
SYNC_EXISTING=1 make process-github-repo

Ingestion writes dense + BM25 sparse vectors to four hybrid Qdrant collections, queried via RRF fusion.
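
For readers unfamiliar with RRF (reciprocal rank fusion), the sketch below shows the general technique of merging a dense ranking and a sparse ranking into one list. It only illustrates the idea and is not how this project or Qdrant implements it: the class name, the document ids, and the constant `K = 60` (a value commonly cited for RRF) are assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RrfFusionSketch {

    // Smoothing constant; 60 is a commonly cited default and is an assumption here.
    static final int K = 60;

    /**
     * Fuses several ranked id lists. Each id scores 1 / (K + rank) per list
     * (rank starts at 1), and the scores are summed across lists.
     */
    public static List<String> fuse(List<List<String>> rankings) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int i = 0; i < ranking.size(); i++) {
                scores.merge(ranking.get(i), 1.0 / (K + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties broken by id so the order is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }

    public static void main(String[] args) {
        List<String> dense = List.of("doc-a", "doc-b", "doc-c");  // hypothetical dense-vector hits
        List<String> sparse = List.of("doc-b", "doc-d", "doc-a"); // hypothetical BM25 hits
        System.out.println(fuse(List.of(dense, sparse)));
        // prints [doc-b, doc-a, doc-d, doc-c]
    }
}
```

Because RRF uses only rank positions, the dense similarity scores and the BM25 scores never need to be normalized onto a common scale.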
Full command reference (scrape flags, doc set filtering, HTTP API, full re-ingest): docs/pipeline-commands.md
GitHub source repository ingestion details: docs/github-repository-ingestion.md
Start with docs/README.md.
See CONTRIBUTING.md.
See LICENSE.md.
