by hex
Claude Code plugin to consult multiple AI coding agents (Gemini, OpenAI, Grok, Perplexity) for diverse perspectives
# Add to your Claude Code skills
git clone https://github.com/hex/claude-council

A Claude Code plugin that consults multiple AI coding agents to get diverse perspectives on coding problems.
# Add the hex-plugins marketplace
/plugin marketplace add hex/claude-marketplace
# Install claude-council
/plugin install claude-council
/plugin install hex/claude-council
# Clone to your plugins directory
git clone https://github.com/hex/claude-council.git
claude --plugin-dir /path/to/claude-council
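Whichever install route you use, you can confirm the plugin is loaded with the status command (it also reports provider connectivity once API keys are configured, covered next):
# Verify the plugin loads and which models are configured
/claude-council:status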
Set environment variables (recommended):
export GEMINI_API_KEY="your-key"
export OPENAI_API_KEY="your-key"
export XAI_API_KEY="your-key" # GROK_API_KEY also accepted
export PERPLEXITY_API_KEY="your-key"
Or create .claude/claude-council.local.md in your project:
---
providers:
  gemini:
    api_key: "your-key"
  openai:
    api_key: "your-key"
  grok:
    api_key: "your-key"
  perplexity:
    api_key: "your-key"
---
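If you prefer the file-based route, one way to create it from the shell (a sketch; the filename and structure mirror the example above, and keeping the file out of version control is your call):
# Create the project-local config (sketch: same format as shown above)
mkdir -p .claude
cat > .claude/claude-council.local.md <<'EOF'
---
providers:
  gemini:
    api_key: "your-key"
---
EOF
# Consider ignoring it so keys stay out of git
echo ".claude/claude-council.local.md" >> .gitignore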
Override default models via environment variables:
export GEMINI_MODEL="gemini-3.1-pro-preview" # default
export OPENAI_MODEL="gpt-5.5-pro" # default
export GROK_MODEL="grok-4.20-reasoning" # default
export PERPLEXITY_MODEL="sonar-reasoning-pro" # default (reasoning + search)
Control max tokens per response (default: 2048):
export COUNCIL_MAX_TOKENS=4096 # longer responses
export COUNCIL_MAX_TOKENS=1024 # shorter, faster responses
Shape how providers respond by prepending a directive to their system prompts. Affects style and depth, not just length:
export COUNCIL_VERBOSITY=brief # ~3-5 sentences, bullets, no code
export COUNCIL_VERBOSITY=standard # default — balanced thoroughness
export COUNCIL_VERBOSITY=detailed # thorough analysis with code + edge cases
Or set it per call with --verbosity=brief|standard|detailed. The slash command also asks for a level as part of the provider-selection prompt.
| Level | Typical output |
|-------|----------------|
| brief | 3-5 sentences max, bullets where possible, skips code blocks unless asked |
| standard | Balanced — current default behavior, no directive prepended |
| detailed | Thorough — includes code examples, edge cases, trade-offs, and rationale |
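For example, using the per-call flag (the prompts here are illustrative):
# Brief, bullet-style answers for a quick comparison
/claude-council:ask --verbosity=brief "Pros and cons of cursor-based pagination here?"
# Full analysis with code and edge cases for a design decision
/claude-council:ask --verbosity=detailed "How should we model multi-tenancy in this schema?"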
For reasoning models from any provider, the token limit is automatically increased to 8x the base value (minimum 32768). This is because reasoning models combine internal thinking tokens and visible output tokens into a single max_output_tokens limit — without the bump, the model can run out mid-response.
The bump applies to:
- codex-*, *-codex, o3-*, o4-*, gpt-5.[4-9]*
- gemini-3*, *thinking*
- *reasoning*, grok-4*, grok-3-mini-*
- sonar-reasoning*, *deep-research*

| Model Type | COUNCIL_MAX_TOKENS | Actual Limit |
|------------|-------------------|--------------|
| Standard (gpt-4o) | 2048 (default) | 2048 |
| Reasoning (gpt-5.5-pro) | 2048 (default) | 32768 |
| Reasoning (gpt-5.5-pro) | 4096 | 32768 |
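As a sketch of the documented rule (this mirrors the description above, not the plugin's actual code), the effective limit for a reasoning model works out as:
# effective = max(COUNCIL_MAX_TOKENS * 8, 32768) for reasoning models (sketch only)
base=${COUNCIL_MAX_TOKENS:-2048}
bumped=$(( base * 8 ))
(( bumped < 32768 )) && bumped=32768
echo "reasoning-model output limit: $bumped"  # 2048 -> 32768, 4096 -> 32768, 8192 -> 65536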
Control reasoning effort to balance speed vs thoroughness:
export OPENAI_REASONING_EFFORT="low" # faster, less reasoning overhead
export OPENAI_REASONING_EFFORT="medium" # default - balanced
export OPENAI_REASONING_EFFORT="high" # thorough reasoning, slower
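A typical combination, with an illustrative prompt:
# Favor speed for a small, well-scoped question
export OPENAI_REASONING_EFFORT="low"
/claude-council:ask --providers=openai "Is this regex safe against catastrophic backtracking?"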
Gemini and Grok handle reasoning/thinking tokens separately, so they use the base limit directly.
Perplexity's sonar models are search-augmented, providing web-grounded responses with citations:
# Filter search results by recency: day, week, month, year
export PERPLEXITY_RECENCY="week"
Available models:
- sonar - Fast, search-enabled
- sonar-pro - More capable, search-enabled
- sonar-reasoning - Chain-of-thought reasoning + search
- sonar-reasoning-pro - Best reasoning + search (default)

Perplexity is useful when you need current information (latest framework versions, recent best practices) rather than just training-data knowledge.
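For example, pairing the recency filter with a Perplexity-only query (the topic and prompt are illustrative):
# Web-grounded answer restricted to sources from the past week
export PERPLEXITY_RECENCY="week"
/claude-council:ask --providers=perplexity "What changed in the latest React release that affects data fetching?"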
Automatic retry on transient failures (429 rate limits, 5xx server errors):
export COUNCIL_MAX_RETRIES=3 # default: 3 retries
export COUNCIL_RETRY_DELAY=1 # default: 1 second initial delay (doubles each retry)
export COUNCIL_TIMEOUT=120 # default: 120 seconds per request
Timeouts fail fast (no retry) to prevent blocking on hung providers.
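In a CI job where waiting is worse than failing, you might tighten these (values are illustrative):
# Fail fast: one retry, short per-request timeout
export COUNCIL_MAX_RETRIES=1
export COUNCIL_TIMEOUT=30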
When run inside tmux, council opens a streaming side pane that shows live provider status (querying, complete, cached, error with timing) and renders each response as it lands — markdown is piped through bat (preferred), glow, or plain cat depending on what's installed. Press Esc to close the pane.
When the outer terminal is iTerm2, council also drives:
- Tab/background color via it2setcolor, an ambient state signal without looking at the terminal.
- Dock bounce (attention request) when a run exceeds COUNCIL_ATTENTION_THRESHOLD (default 2000ms). Useful for slow --debate queries when you've switched apps.
- SetMark navigation: emits OSC 1337 SetMark before each provider response inside the pane. Cmd+Shift+↑/↓ in iTerm2 jumps between provider sections.

export COUNCIL_NO_PANE=1 # disable the streaming pane globally
export COUNCIL_ATTENTION_THRESHOLD=5000 # only bounce dock if run > 5s
Per-call opt-out via --no-pane. iTerm2 features no-op silently outside iTerm2; pane no-ops outside tmux.
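A long debate run where you only care about the dock bounce might look like this (prompt is illustrative):
# Skip the pane, raise the attention threshold, let the debate run while you work elsewhere
export COUNCIL_ATTENTION_THRESHOLD=5000
/claude-council:ask --debate --no-pane "Should we migrate the monolith to services now or after the launch?"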
| Flag | Description |
|------|-------------|
| --providers=list | Query specific providers (e.g., gemini,openai) |
| --roles=list | Assign roles to providers (e.g., security,performance or balanced) |
| --debate | Enable two-round debate mode |
| --file=path | Include specific file in context |
| --output=path | Export response to markdown file |
| --quiet | Show only synthesis, hide individual responses |
| --agents | Agent-enhanced analysis with subagents (slower, deeper) |
| --no-cache | Force fresh queries, skip cache |
| --no-auto-context | Disable automatic file detection |
| --no-pane | Disable streaming tmux pane (default: on inside tmux) |
| --verbosity=LEVEL | Response style: brief / standard / detailed |
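Flags compose; for instance, roles plus two-round debate mode (the prompt and role choices are illustrative):
# Two providers, each arguing from an assigned role, across two rounds
/claude-council:ask --providers=gemini,openai --roles=security,performance --debate "Review this session-handling approach"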
# Query all configured providers
/claude-council:ask "How should I structure authentication in this Express app?"
# Query specific providers
/claude-council:ask --providers=gemini,openai "What's the best approach for caching here?"
# Include a specific file for review
/claude-council:ask --file=src/auth.ts "What's wrong with this implementation?"
# Export response to markdown file
/claude-council:ask --output=docs/auth-decision.md "How should we implement authentication?"
# Quiet mode - show only synthesis
/claude-council:ask --quiet "What's the best caching strategy?"
# Check connectivity and configured models for each provider
/claude-council:status
Save council responses as clean markdown files for documentation or sharing:
/claude-council:ask --output=docs/decision.md "Should we use REST or GraphQL?"
The exported file includes:
Great for:
Get just the bottom line without individual provider responses:
/claude-council:ask --quiet "Should I use Redis or Memcached?"
Quiet mode still queries all providers and analyzes their responses, but only shows the synthesis with consensus/divergence analysis. Use when you want a quick answer without scrolling through multiple perspectives.
Responses are automatically cached to speed up repeated queries and save API costs:
# Uses cache if available (default)
/claude-council:ask "What's the best testing framework?"
# Force fresh queries, skip cache
/claude-council:ask --no-cache "What's the best testing framework?"
Cache configuration:
export COUNCIL_CACHE_DIR=".claude/council-cache" # Cache location (default)
export COUNCIL_CACHE_TTL=3600 # Cache lifetime in seconds (default: 1 hour)
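To start fresh between runs, removing the cache directory is one option (a sketch; this assumes entries live as plain files under COUNCIL_CACHE_DIR, and the plugin may offer its own mechanism):
# Wipe cached responses (assumes the default cache location shown above)
rm -rf .claude/council-cache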
Cached responses show cached instead of success in the status output. Cache is keyed by prompt + provider + model + role, so:
- --roles creates separate cache entries (same prompt with different role = cache miss)

The council automatically detects and includes relevant files based on your question:
/claude-council:ask "How should I refactor the authentication flow?"
# Auto-detects and includes: src/auth/*.ts, middleware/auth.ts, etc.
Before querying, you'll see which files were auto-included:
Auto-included context (3 files):
- src/auth/handler.ts (keyword: "auth")
- middleware/session.ts (keyword: "session")
- types/user.ts (keyword: "user")
To disable auto-context (for general questions not about your code):
/claude-council:ask --no-auto-context "What are best practices for API design?"
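The two flags also combine when you want to pin context explicitly rather than rely on detection (the path and prompt are illustrative):
# Disable detection and hand-pick the file under review
/claude-council:ask --no-auto-context --file=src/payments/checkout.ts "Is the retry logic here idempotent?"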
Auto-context limits: