GitHub - EmZod/speak: A fast CLI tool for Agents to convert their text output to speech using Chatterbox TTS on Apple Silicon. Agent SKILL files included.

                          ███████╗██████╗ ███████╗ █████╗ ██╗  ██╗
                          ██╔════╝██╔══██╗██╔════╝██╔══██╗██║ ██╔╝
                          ███████╗██████╔╝█████╗  ███████║█████╔╝ 
                          ╚════██║██╔═══╝ ██╔══╝  ██╔══██║██╔═██╗ 
                          ███████║██║     ███████╗██║  ██║██║  ██╗
                          ╚══════╝╚═╝     ╚══════╝╚═╝  ╚═╝╚═╝  ╚═╝

Talk to your Claude.

Voice cloning. Long documents. Audiobook quality. Local & private.

speak article.md --stream → Audio starts in seconds

Install

For AI Agents (Claude Code, Cursor, Windsurf):

npx skills add EmZod/speak

CLI:

git clone https://github.com/EmZod/speak.git
cd speak && bun install
alias speak="bun run $(pwd)/src/index.ts"

Requirements: macOS Apple Silicon · Bun · Python 3.10+ · sox (brew install sox)
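
The requirements above can be verified before installing. What follows is a minimal preflight sketch and is not part of the repository: the function names `version_ge` and `check_requirements` are hypothetical, and it assumes the Python interpreter is available as `python3`. It relies only on standard tools (`uname`, `command -v`).

```shell
#!/bin/sh
# Hypothetical preflight check for the requirements listed above.

# version_ge A B: succeed if dotted version A >= B (compares major.minor only).
version_ge() {
  a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
  b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
  [ "$a_major" -gt "$b_major" ] ||
    { [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]; }
}

# check_requirements: print each unmet requirement; return non-zero if any.
check_requirements() {
  missing=0
  # A native shell on Apple Silicon reports Darwin / arm64.
  if [ "$(uname -s)" != "Darwin" ] || [ "$(uname -m)" != "arm64" ]; then
    echo "missing: macOS on Apple Silicon"; missing=1
  fi
  for tool in bun sox python3; do
    command -v "$tool" >/dev/null 2>&1 || { echo "missing: $tool"; missing=1; }
  done
  if command -v python3 >/dev/null 2>&1; then
    py=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
    version_ge "$py" 3.10 || { echo "python3 is $py, need 3.10+"; missing=1; }
  fi
  return "$missing"
}

check_requirements || echo "Some requirements are missing."
```

If the script prints nothing, every listed requirement was found. A shell running under Rosetta reports `x86_64` and would be flagged by the architecture check.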

Usage

speak "Hello, world!" --play        # Generate and play
speak article.md --stream           # Stream long content  
speak document.md --output out.wav  # Save to file
speak --clipboard --play            # Read from clipboard

Voice Cloning

Clone any voice from a 10-30 second sample:

# Use your cloned voice
speak "Hello" --voice ~/.chatter/voices/morgan_freeman.wav --play
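
Since sox is already a requirement, it can also be used to confirm that a recording fits the 10-30 second guideline before it is used as a voice. This is a sketch under assumptions: `duration_in_range` and `check_voice_sample` are hypothetical helper names, and the README does not state any sample-rate or channel requirements, so none are checked here. `soxi -D` (shipped with sox) prints a file's duration in seconds.

```shell
#!/bin/sh
# Hypothetical helpers for vetting a voice sample; not part of speak itself.

# duration_in_range SECONDS: succeed if 10 <= SECONDS <= 30 (decimals allowed).
duration_in_range() {
  awk -v d="$1" 'BEGIN { exit !(d + 0 >= 10 && d + 0 <= 30) }'
}

# check_voice_sample FILE: report whether FILE fits the 10-30 second guideline.
check_voice_sample() {
  secs=$(soxi -D "$1") || return 1
  if duration_in_range "$secs"; then
    echo "ok: $1 is ${secs}s"
  else
    echo "outside 10-30s: $1 is ${secs}s"
    echo "to keep the first 30 seconds: sox \"$1\" trimmed.wav trim 0 30"
    return 1
  fi
}
```

For example, `check_voice_sample ~/.chatter/voices/morgan_freeman.wav` reports the duration of the sample used above and suggests a `sox ... trim` command if it is too long.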

Long Documents

speak book.md --auto-chunk --output book.wav    # Auto-chunk for reliability
speak --resume manifest.json                     # Resume interrupted generation
speak *.md --output-dir ~/Audio/                 # Batch processing
speak --estimate document.md                     # Estimate duration first
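
The commands above can be combined into a small batch script that estimates each document and then generates one WAV per file. This is only a sketch: `wav_path_for`, `narrate_all`, and the `SPEAK` variable are hypothetical names, and it uses `--output` per file rather than `--output-dir` so the output names are explicit. An alias defined in an interactive shell is not available inside scripts, so the command is read from `SPEAK` and falls back to `speak` on the PATH.

```shell
#!/bin/sh
# Hypothetical batch wrapper around the documented speak flags.

# wav_path_for FILE DIR: print DIR/<name>.wav, where <name> is FILE without .md.
wav_path_for() {
  name=$(basename "$1" .md)
  printf '%s/%s.wav\n' "${2%/}" "$name"
}

# narrate_all DIR FILE...: estimate, then generate each file with --auto-chunk.
# Files whose output already exists are skipped, so a rerun continues the batch.
narrate_all() {
  out_dir=$1; shift
  mkdir -p "$out_dir"
  for doc in "$@"; do
    out=$(wav_path_for "$doc" "$out_dir")
    if [ -e "$out" ]; then
      echo "skip: $out already exists"
      continue
    fi
    # SPEAK is intentionally unquoted so a value like "bun run <path>" splits.
    ${SPEAK:-speak} --estimate "$doc"
    ${SPEAK:-speak} "$doc" --auto-chunk --output "$out" || echo "failed: $doc" >&2
  done
}
```

A call such as `narrate_all ~/Audio chapter1.md chapter2.md` would process both files in order. Setting `SPEAK=echo` first prints the commands instead of running them, which is a cheap way to preview the batch.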

Commands

speak <text|file>      Generate speech
speak health           Check system status
speak models           List available models
speak concat <files>   Combine audio files
speak daemon kill      Stop TTS server

Options

--play          Play after generation
--stream        Stream as it generates
--output        Output file or directory
--voice         Custom voice file (WAV)
--auto-chunk    Chunk long documents
--estimate      Show duration estimate
--dry-run       Preview without generating

Performance

Long documents     ████████████████████  Streaming, auto-chunk
Voice cloning      ████████████████████  Any voice from sample
Emotion tags       ████████████████████  [laugh], [sigh], etc.
Quality            ████████████████████  Audiobook grade

Documentation

File                     Content
SKILL.md                 Full usage guide for agents
docs/usage.md            Complete CLI reference
docs/troubleshooting.md  Common issues & fixes
AGENTS.md                Architecture & development

MIT License · Built on Chatterbox TTS
