A fast CLI tool for Agents to convert their text output to speech using Chatterbox TTS on Apple Silicon. Agent SKILL files included.

EmZod/speak

                          ███████╗██████╗ ███████╗ █████╗ ██╗  ██╗
                          ██╔════╝██╔══██╗██╔════╝██╔══██╗██║ ██╔╝
                          ███████╗██████╔╝█████╗  ███████║█████╔╝ 
                          ╚════██║██╔═══╝ ██╔══╝  ██╔══██║██╔═██╗ 
                          ███████║██║     ███████╗██║  ██║██║  ██╗
                          ╚══════╝╚═╝     ╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝

Talk to your Claude.

Voice cloning. Long documents. Audiobook quality. Local & private.

speak article.md --stream → Audio starts in seconds


Install

For AI Agents (Claude Code, Cursor, Windsurf):

npx skills add EmZod/speak

CLI:

git clone https://github.com/EmZod/speak.git
cd speak && bun install
alias speak="bun run $(pwd)/src/index.ts"

Requirements: macOS Apple Silicon · Bun · Python 3.10+ · sox (brew install sox)
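
The requirements above can be checked before installing. The following is a minimal sketch, not part of the project: the function names `python_version_ok` and `check_requirements` are hypothetical, and it assumes the Python interpreter is on the PATH as `python3`.

```shell
# Hypothetical helper: succeeds if a "major.minor[.patch]" version string is at least 3.10.
# The ( ) body runs in a subshell, so the temporary variables do not leak.
python_version_ok() (
  major=${1%%.*}      # text before the first dot
  rest=${1#*.}        # text after the first dot
  minor=${rest%%.*}   # text before the next dot, if any
  [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
)

# Hypothetical helper: reports each unmet requirement on stderr and returns non-zero if any is unmet.
check_requirements() (
  status=0
  # uname -m prints arm64 on Apple Silicon (x86_64 if the shell runs under Rosetta).
  if [ "$(uname -s)" != "Darwin" ] || [ "$(uname -m)" != "arm64" ]; then
    echo "not macOS on Apple Silicon" >&2
    status=1
  fi
  for tool in bun python3 sox; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool" >&2; status=1; }
  done
  if command -v python3 >/dev/null 2>&1; then
    version=$(python3 -c 'import platform; print(platform.python_version())')
    python_version_ok "$version" || { echo "python3 $version is older than 3.10" >&2; status=1; }
  fi
  return "$status"
)
```

Running `check_requirements && echo "ready to install"` prints the confirmation only when every check passes.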


Usage

speak "Hello, world!" --play        # Generate and play
speak article.md --stream           # Stream long content  
speak document.md --output out.wav  # Save to file
speak --clipboard --play            # Read from clipboard

Voice Cloning

Clone any voice from a 10-30 second WAV sample, passed with --voice:

# Use your cloned voice
speak "Hello" --voice ~/.chatter/voices/morgan_freeman.wav --play
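
Since the sample is expected to be 10-30 seconds long, it may help to check its length before using it. This sketch relies on `soxi -D`, which ships with sox and prints a file's duration in seconds. The helper names `duration_in_range` and `check_voice_sample` are hypothetical and not part of speak.

```shell
# Hypothetical helper: succeeds if a duration in seconds lies between 10 and 30 inclusive.
# awk is used because plain shell arithmetic cannot compare fractional seconds.
duration_in_range() {
  awk -v d="$1" 'BEGIN { exit !(d >= 10 && d <= 30) }'
}

# Hypothetical helper: check a WAV sample before passing it to --voice.
check_voice_sample() (
  sample=$1
  seconds=$(soxi -D "$sample") || exit 1   # soxi -D prints the duration in seconds
  if duration_in_range "$seconds"; then
    echo "ok: ${seconds}s"
  else
    echo "sample is ${seconds}s; expected 10-30s" >&2
    exit 1
  fi
)
```

For example, `check_voice_sample ~/.chatter/voices/morgan_freeman.wav && speak "Hello" --voice ~/.chatter/voices/morgan_freeman.wav --play` only generates speech when the sample length is in range.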

Long Documents

speak book.md --auto-chunk --output book.wav    # Auto-chunk for reliability
speak --resume manifest.json                     # Resume interrupted generation
speak *.md --output-dir ~/Audio/                 # Batch processing
speak --estimate document.md                     # Estimate duration first
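
The commands above can be combined into one step that estimates first and then generates. This is only a sketch: `narrate` is a hypothetical wrapper name, it uses only the flags listed above, and it assumes `speak --estimate` exits non-zero when it cannot read the document.

```shell
# Hypothetical wrapper: print the duration estimate, then generate with auto-chunking.
# Assumption: a failed estimate returns a non-zero exit status, which stops the run early.
narrate() {
  doc=$1
  out=$2
  speak --estimate "$doc" || return 1
  speak "$doc" --auto-chunk --output "$out"
}
```

A call such as `narrate book.md book.wav` runs both steps in order. If generation is interrupted, `speak --resume manifest.json` picks it up again; where the manifest is written is not stated here, so check the output of the interrupted run.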

Commands

speak <text|file>      Generate speech
speak health           Check system status
speak models           List available models
speak concat <files>   Combine audio files
speak daemon kill      Stop TTS server
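
In scripts, the health command can act as a guard before generating speech. The sketch below uses a hypothetical wrapper name, `speak_checked`, and assumes that `speak health` exits non-zero when something is wrong. The README does not confirm that, so verify it on your setup.

```shell
# Hypothetical wrapper: run the health check first and only generate if it succeeds.
# Assumption: `speak health` signals problems through a non-zero exit status.
speak_checked() {
  if ! speak health >/dev/null; then
    echo "speak health reported a problem; run it directly for details" >&2
    return 1
  fi
  speak "$@"
}
```

Usage mirrors the plain command, for example `speak_checked "Hello, world!" --play`.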

Options

--play          Play after generation
--stream        Stream as it generates
--output        Output file or directory
--voice         Custom voice file (WAV)
--auto-chunk    Chunk long documents
--estimate      Show duration estimate
--dry-run       Preview without generating

Performance

Long documents     ████████████████████  Streaming, auto-chunk
Voice cloning      ████████████████████  Any voice from sample
Emotion tags       ████████████████████  [laugh], [sigh], etc.
Quality            ████████████████████  Audiobook grade

See Also

Need instant audio (~90ms)? Try speakturbo.


Documentation

File                      Content
SKILL.md                  Full usage guide for agents
docs/usage.md             Complete CLI reference
docs/troubleshooting.md   Common issues & fixes
AGENTS.md                 Architecture & development

MIT License · Built on Chatterbox TTS
