by Lekssays
A containerized Model Context Protocol (MCP) server providing static code analysis using Joern's Code Property Graph (CPG) with support for Java, C/C++, JavaScript, Python, Go, Kotlin, C#, Ghidra, Jimple, PHP, Ruby, and Swift.
# Add to your Claude Code skills
git clone https://github.com/Lekssays/codebadger
codebadger and its accompanying paper — Bridging Code Property Graphs and Language Models for Program Analysis — have been accepted at the Software Vulnerability Management Workshop @ ICSE 2026. 🎉
@article{lekssays2026bridging,
title={Bridging Code Property Graphs and Language Models for Program Analysis},
author={Lekssays, Ahmed},
journal={arXiv preprint arXiv:2603.24837},
year={2026}
}
If codebadger helped you discover a real-world vulnerability, we'd love to hear about it. Open a pull request adding it to TROPHIES.md — include the CVE ID, project, a one-line description, and the date.
Before you begin, make sure you have Docker, Docker Compose, and Python installed.
To verify your setup:
docker --version
docker-compose --version
python --version
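The same check can be scripted. The following is a minimal sketch, assuming only that the three commands above should be on PATH; the helper name missing_tools is hypothetical and not part of codebadger.

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the command names that cannot be found on PATH."""
    return [name for name in names if which(name) is None]


if __name__ == "__main__":
    # Command names taken from the verification steps above.
    missing = missing_tools(["docker", "docker-compose", "python"])
    print("missing:", ", ".join(missing) if missing else "none")
```

The which parameter is injectable only so the helper can be exercised without touching the real PATH.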
# Create a virtual environment (optional but recommended)
python -m venv venv
# Activate it (on Windows: venv\Scripts\activate)
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
docker compose up -d
This starts:
Verify services are running:
docker compose ps
# Start the server
python main.py &
The MCP server will be available at http://localhost:4242.
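To confirm that something is listening on that port before configuring a client, a plain TCP connection test is usually enough. This is a sketch under that assumption; is_port_open is a hypothetical helper, and a successful connection only shows that the port is open, not that the MCP endpoint is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    print("MCP server reachable:", is_port_open("localhost", 4242))
```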
# Stop MCP server (Ctrl+C in terminal)
# Stop Docker services
docker-compose down
# Optional: Clean up everything
bash cleanup.sh
Use the provided cleanup script to reset your environment:
bash cleanup.sh
Among other things, this removes Python cache directories (__pycache__, .pytest_cache).
Edit the MCP configuration file for VS Code (GitHub Copilot):
Path:
~/.config/Code/User/mcp.json
Example configuration:
{
  "inputs": [],
  "servers": {
    "codebadger": {
      "url": "http://localhost:4242/mcp",
      "type": "http"
    }
  }
}
To integrate codebadger into Claude Desktop, edit:
Path:
Claude → Settings → Developer → Edit Config → claude_desktop_config.json
Add the following:
{
  "mcpServers": {
    "codebadger": {
      "url": "http://localhost:4242/mcp",
      "type": "http"
    }
  }
}
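If the configuration file already lists other servers, the codebadger entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that for both clients, assuming the file contains valid JSON; add_server and update_config_file are hypothetical helpers, and the only difference between the two clients is the top-level key (servers for VS Code, mcpServers for Claude Desktop).

```python
import json
from pathlib import Path


def add_server(config: dict, name: str, url: str, key: str = "servers") -> dict:
    """Return a copy of config with an HTTP server entry added under `key`."""
    updated = dict(config)
    servers = dict(updated.get(key, {}))  # copy so the input is not mutated
    servers[name] = {"url": url, "type": "http"}
    updated[key] = servers
    return updated


def update_config_file(path: Path, key: str) -> None:
    """Read a JSON config (or start empty), add codebadger, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server(config, "codebadger", "http://localhost:4242/mcp", key)
    path.write_text(json.dumps(updated, indent=2) + "\n")
```

For VS Code this could be called as update_config_file(Path.home() / ".config/Code/User/mcp.json", "servers"); for Claude Desktop, pass the path of claude_desktop_config.json and the key "mcpServers".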
- generate_cpg: Generate a Code Property Graph (CPG) for a codebase (local path or GitHub URL).
- get_cpg_status: Check whether a CPG exists and retrieve status metadata.
- run_cpgql_query: Execute a raw CPGQL query against a CPG and return structured results.
- get_cpgql_syntax_help: Show CPGQL syntax helpers, tips, and common error fixes.
- list_methods: List methods/functions with optional regex/file filters.
- list_files: Show source files as a paginated tree view.
- get_method_source: Retrieve the source code for a named method.
- list_calls: List call sites between functions (caller → callee).
- get_call_graph: Build a human-readable call graph (incoming or outgoing).
- list_parameters: Get parameter names, types, and order for a method.
- get_codebase_summary: High-level metrics (files, methods, calls, language).
- get_code_snippet: Return a file snippet by start/end line numbers.
- get_cfg: Produce a control-flow graph (nodes and edges) for a method.
- get_type_definition: Inspect struct/class types and their members.
- get_macro_expansion: Heuristically detect likely macro-expanded calls.
- find_taint_sources: Find likely external input points (sources).
- find_taint_sinks: Locate dangerous sinks where tainted data can flow.
- find_taint_flows: Detect dataflows from sources to sinks (taint analysis).
- get_program_slice: Build backward/forward program slices for a call.
- get_variable_flow: Trace data dependencies for a variable at a location.
- find_bounds_checks: Search for bounds-checks near a buffer access.
- find_use_after_free: Heuristic detection of use-after-free patterns.
- find_double_free: Detect potential double-free issues.
- find_null_pointer_deref: Find likely null pointer dereferences.
- find_integer_overflow: Detect integer overflow patterns.
- find_format_string_vulns: Detect format string vulnerabilities (CWE-134) where non-literal format arguments are passed to printf-family functions.
- find_heap_overflow: Detect heap overflow vulnerabilities (CWE-122) where writes to heap buffers may exceed their allocated size.
- find_stack_overflow: Detect stack buffer overflow vulnerabilities (CWE-121) where writes to fixed-size local arrays (e.g. char buf[64]) may exceed their declared dimension.
- find_toctou: Detect Time-of-Check-Time-of-Use race conditions (CWE-367) where a file is checked with access()/stat() and then opened or operated on in a separate step.
- find_uninitialized_reads: Detect uninitialized variable reads (CWE-457) where local variables are used before being assigned a value.

You can add your own detectors without modifying the core codebase:
- src/tools/queries/your_query.scala
- src/tools/custom_tools.py

See CUSTOM_TOOLS_GUIDE.md for the full step-by-step guide, CPGQL reference, and design decisions.
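MCP clients invoke each of the tools listed above through the protocol's tools/call method, which is a JSON-RPC 2.0 request. The sketch below only builds such a request body so its shape is visible. The argument name source_path is a hypothetical placeholder, since the real parameter names are defined by each tool's schema, and a real client must also complete the MCP initialize handshake before sending it.

```python
import itertools
import json

# Monotonically increasing request ids, as JSON-RPC expects a unique id per call.
_ids = itertools.count(1)


def tool_call_request(tool: str, arguments: dict) -> dict:
    """Build an MCP JSON-RPC 2.0 `tools/call` request for the given tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


if __name__ == "__main__":
    # "source_path" is a hypothetical argument name, used only for illustration.
    request = tool_call_request("generate_cpg", {"source_path": "/path/to/project"})
    print(json.dumps(request, indent=2))
```

In practice an MCP client (VS Code, Claude Desktop, or a client library) constructs these requests for you; the sketch is mainly useful for reading traces and logs.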
Thanks for contributing! Here's a quick guide to get started with running tests and contributing code.
python -m venv venv
# Activate the environment (on Windows: venv\Scripts\activate)
source venv/bin/activate
pip install -r requirements.txt
docker-compose up -d
pytest tests/ -q
# Start MCP server in background
python main.py &
# Run integration tests
pytest tests/integration -q
# Stop MCP server
pkill -f "python main.py"
pytest tests/ -q
bash cleanup.sh
docker-compose down
Please follow these guidelines when contributing:
The MCP server can be configured via environment variables or config.yaml.
Key settings (optional - defaults shown):
# Server
MCP_HOST=0.0.0.0
MCP_PORT=4242
# Joern
JOERN_BINARY_PATH=joern
JOERN_JAVA_OPTS="-Xmx4G -Xms2G -XX:+UseG1GC -Dfile.encoding=UTF-8"
# CPG Generation
CPG_GENERATION_TIMEOUT=600
MAX_REPO_SIZE_MB=500
# Query
QUERY_TIMEOUT=30
QUERY_CACHE_ENABLED=true
QUERY_CACHE_TTL=300
# Telemetry (OpenTelemetry)
OTEL_ENABLED=false
OTEL_SERVICE_NAME=codebadger
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
Create a config.yaml from config.example.yaml:
cp config.example.yaml config.yaml
Then customize as needed.
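To illustrate how environment variables with defaults of this kind are typically resolved, here is a small sketch. It is not codebadger's actual loader: load_settings is a hypothetical helper, the defaults are copied from the list above, and how the project combines environment variables with config.yaml is not covered here.

```python
import os

# Defaults copied from the settings listed above (a subset, for brevity).
DEFAULTS = {
    "MCP_HOST": "0.0.0.0",
    "MCP_PORT": "4242",
    "CPG_GENERATION_TIMEOUT": "600",
    "MAX_REPO_SIZE_MB": "500",
    "QUERY_TIMEOUT": "30",
    "QUERY_CACHE_ENABLED": "true",
    "QUERY_CACHE_TTL": "300",
}
INT_KEYS = {
    "MCP_PORT",
    "CPG_GENERATION_TIMEOUT",
    "MAX_REPO_SIZE_MB",
    "QUERY_TIMEOUT",
    "QUERY_CACHE_TTL",
}
BOOL_KEYS = {"QUERY_CACHE_ENABLED"}


def load_settings(env=None) -> dict:
    """Resolve settings from a mapping (os.environ by default), applying defaults."""
    env = os.environ if env is None else env
    settings = {}
    for key, default in DEFAULTS.items():
        raw = env.get(key, default)
        if key in INT_KEYS:
            settings[key] = int(raw)
        elif key in BOOL_KEYS:
            settings[key] = raw.strip().lower() in {"1", "true", "yes"}
        else:
            settings[key] = raw
    return settings
```

For example, load_settings({"MCP_PORT": "5000"}) overrides only the port and leaves every other value at its default.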
CodeBadger has built-in OpenTelemetry support for distributed tracing. When enabled, all MCP tool calls are automatically traced, plus custom spans for CPG generation, Joern server management, and query execution.
Install the OpenTelemetry packages (see requirements.txt):
pip install opentelemetry-sdk opentelemetry-exporter-otlp
export OTEL_ENABLED=true
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
python main.py
Or via config.yaml:
telemetry:
  enabled: true
  service_name: codebadger
  otlp_endpoint: http://localhost:4317
  otlp_protocol: grpc # or "http/protobuf"
# Start Jaeger (provides UI at http://localhost:16686)
docker run -d --name jaeger \
-p 16686:16686 \
-p 4317:4317 \
jaegertracing/all-in-one:latest
# Start CodeBadger with telemetry
OTEL_ENABLED=true python main.py
| Span | Description |
|------|-------------|
| tools/call {name} | Every MCP tool invocation (automatic via FastMCP) |
| cpg.generate | Full CPG generation pipeline |
| cpg.joern_cli_exec | Joern CLI command execution inside Docker |
| cpg.spawn_server | Joern server instance creation |
| cpg.load_cpg | CPG file loading into Joern server |
| query.execute | CPGQL query execution with timing and success attributes |
| Setting | Env Variable | Default | Description |
|---------|-------------|---------|-------------|
| enabled | OTEL_ENABLED | false | Enable/disable telemetry |
| service_name | OTEL_SERVICE_NAME | codebadger | Service name in traces |
| otlp_endpoint | OTEL_EXPORTER_OTLP_ENDPOINT | http://localhost:4317 | OTLP collector endpoint |
| `otlp_prot