by berabuddies
Semia, security audit for AI agent skills.
# Add to your Claude Code skills
git clone https://github.com/berabuddies/Semia
Security audit for AI agent skills. Know what a skill can do before you trust it.
Agent skills are markdown files with embedded shell commands, network calls, and tool invocations. They run with your credentials, on your machine, with your data. Semia reads a skill as data — never executes it — and produces an evidence-backed report of every capability it may exercise.
It is the difference between
"I trust this skill because the README looks fine."
and
"I trust this skill because Semia extracted 14 actions, 6 effects, and 2 secret reads — and every one is grounded in a specific source line."
Pick whichever fits how you already work.
pip install semia-audit
semia scan ./some-skill
scan runs prepare → synthesize (via your configured LLM provider) →
detect → report in one shot. Output lands under
.semia/runs/<skill-slug>/ by default; pass --out <path> to override.
You'll need an LLM provider configured first; see the provider setup
below.
Install the plugin once. Each host has its own flow.
Codex — pick either path:
Shell (scripts and CI):
codex plugin marketplace add berabuddies/Semia
Then enable the plugin by appending to ~/.codex/config.toml:
[plugins."semia@semia"]
enabled = true
Interactive plugin manager inside the Codex CLI:
1. Start codex.
2. Enter /plugins (plural; opens the plugin panel).
3. Add the marketplace berabuddies/Semia.
4. Turn semia on from the newly-added marketplace.

Claude Code - pick either path:
Shell (one-liner):
claude plugin marketplace add berabuddies/Semia
claude plugin install semia@semia
Interactive plugin manager inside the Claude Code CLI:
1. Start claude.
2. Enter /plugins (plural; opens the plugin panel).
3. Add the marketplace berabuddies/Semia.

Either path registers the marketplace; finish installing semia from
the panel or with claude plugin install semia@semia.
OpenClaw — one shell command registers the marketplace and installs:
openclaw plugins install clawhub:semia
Then in any chat with the host agent just ask:
Run Semia audit on ./some-skill
The host agent itself acts as the synthesize step — no API key needed.
The bundled semia.pyz handles prepare / detect / report deterministically.
semia repair .semia/runs/some-skill --from-scan
repair reads the findings and synthesized facts from an existing scan,
traces each violation back through the Datalog rules to identify the root
cause, then calls an LLM to generate a SKILL.md patch — either fixing
the problematic content directly or adding specific security constraints.
# Or scan + repair in one shot:
semia repair ./some-skill
You get report.md — findings ranked by severity, every one tied to a
specific source line. Need SARIF 2.1.0
for GitHub Code Scanning, or structured JSON for downstream tooling? One
more command:
semia report .semia/runs/some-skill --format sarif # for GitHub Code Scanning
semia report .semia/runs/some-skill --format json # structured payload
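For downstream tooling, SARIF output can be flattened with a few lines of Python. This sketch relies only on the generic SARIF 2.1.0 layout (runs, then results, then locations). The sample document is invented for illustration and is not actual Semia output; the tool name, rule ID, and line number in it are hypothetical.

```python
def summarize_sarif(sarif: dict) -> list[tuple[str, str, str, int]]:
    """Flatten SARIF results into (level, ruleId, file, line) tuples."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A result may carry zero or more locations; report the first one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            uri = physical.get("artifactLocation", {}).get("uri", "?")
            line = physical.get("region", {}).get("startLine", 0)
            # Simplification: treat a missing level as "warning".
            level = result.get("level", "warning")
            rows.append((level, result.get("ruleId", "?"), uri, line))
    return rows


# Invented sample, not real Semia output.
sample = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [{
            "ruleId": "example-rule",
            "level": "error",
            "message": {"text": "Example finding"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": "SKILL.md"},
                    "region": {"startLine": 12},
                }
            }],
        }],
    }],
}

for row in summarize_sarif(sample):
    print(row)
```

In practice you would load the file written by semia report --format sarif with json.load and pass the resulting dict to the same function.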
semia scan needs an LLM for the synthesize step (the other three
stages are deterministic, no key required). If you run Semia via a host
plugin (Codex / Claude Code / OpenClaw) skip this — the host agent already
does synthesize for you.
Four providers are supported. Pick one and export its credentials:
# OpenAI Responses API — default; also works for DeepSeek / OpenRouter / vLLM
export OPENAI_API_KEY=sk-...
# optional: export OPENAI_BASE_URL=https://api.deepseek.com/v1
# Anthropic Messages API
export SEMIA_LLM_PROVIDER=anthropic
export ANTHROPIC_API_KEY=sk-ant-...
# optional: export ANTHROPIC_BASE_URL=https://api.anthropic.com
# Locally-installed Claude Code CLI (uses your Claude Code login)
export SEMIA_LLM_PROVIDER=claude
# Locally-installed Codex CLI (uses your Codex login)
export SEMIA_LLM_PROVIDER=codex
Override the model with --model <name> on any semia scan invocation, or
persist it via SEMIA_LLM_MODEL. Models are free-form strings — anything
the endpoint accepts (gpt-5.5, deepseek-v4, claude-opus-4-7, …).
See Configuration for the full provider matrix, base-URL support, timeout/retry knobs, and synthesis-loop tuning.
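A CI job may want to fail early when credentials are missing rather than partway through a scan. Here is a minimal preflight sketch based only on the environment variables listed above. The function name missing_credentials is hypothetical, and the provider value "openai" is an assumption: the text says the OpenAI provider is the default but does not name its SEMIA_LLM_PROVIDER value.

```python
from typing import Mapping

# Credentials each provider needs, per the exports above.
# "openai" is an assumed name for the default provider.
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY"],
    "anthropic": ["ANTHROPIC_API_KEY"],
    "claude": [],  # uses the local Claude Code login
    "codex": [],   # uses the local Codex login
}


def missing_credentials(env: Mapping[str, str]) -> list[str]:
    """Return the credential variables the selected provider still needs."""
    provider = env.get("SEMIA_LLM_PROVIDER", "openai")
    if provider not in REQUIRED_VARS:
        raise ValueError(f"unknown provider: {provider!r}")
    return [name for name in REQUIRED_VARS[provider] if not env.get(name)]


# Example: Anthropic selected but no key exported.
print(missing_credentials({"SEMIA_LLM_PROVIDER": "anthropic"}))
# → ['ANTHROPIC_API_KEY']
```

Passing os.environ instead of a literal dict checks the real environment of the current process.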
A run writes everything under .semia/runs/<run-id>/. Most users only
ever open the reports:
| Report | When |
| ------------------- | --------------------------------------------------------- |
| report.md | always produced by semia scan — read this first |
| report.sarif.json | on demand via semia report --format sarif — feed to GitHub Code Scanning |
| report.json | on demand via semia report --format json — structured payload (check + evidence + detector) for programmatic consumers |
Because every finding traces back to a source line, the SARIF drops cleanly into GitHub Code Scanning and reviewers see annotations directly on the skill PR.
| Artifact | Purpose |
| ------------------------- | -------------------------------------------------- |
| synthesized_facts.dl | the behavior map (Datalog facts) — re-queryable |
| detection_findings.dl | findings derived by rule evaluation |
| prepared_skill.md | normalized skill text with stable line anchors |
| prepare_units.json | reference units the evidence text aligns against |
| synthesis_metadata.json | provider, model, retries, score, stop reason |
| run_manifest.json | end-to-end manifest of the run |
| repair_result.json | repair outcomes (when semia repair is run) |
| patched/SKILL.md | the repaired SKILL.md (when semia repair is run) |
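To see at a glance which of these files a given run produced, a short inventory script is enough. This sketch assumes the artifacts sit directly inside the run directory with the names shown in the tables; the flat layout and the function name inventory are assumptions. It demonstrates on a temporary directory rather than a real run.

```python
import tempfile
from pathlib import Path

# File names taken from the tables above.
ARTIFACTS = [
    "report.md",
    "synthesized_facts.dl",
    "detection_findings.dl",
    "prepared_skill.md",
    "prepare_units.json",
    "synthesis_metadata.json",
    "run_manifest.json",
]
# Present only when semia repair has been run.
REPAIR_ARTIFACTS = ["repair_result.json", "patched/SKILL.md"]


def inventory(run_dir: Path) -> dict[str, bool]:
    """Map each expected artifact name to whether it exists as a file."""
    return {name: (run_dir / name).is_file() for name in ARTIFACTS + REPAIR_ARTIFACTS}


# Demonstration on a throwaway directory holding a single artifact.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "report.md").write_text("# placeholder\n")
    for name, present in inventory(run_dir).items():
        print(f"{'ok     ' if present else 'missing'}  {name}")
```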
The technique behind Semia is described in the Semia paper (arXiv:2605.00314 · PDF). Semia is the deterministic acceptance boundary around behavior mapping: agents may extract facts, but only checked, evidence-grounded facts make it into a report.
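The idea of an acceptance boundary can be illustrated with a toy check. This is not Semia's actual checker: the Fact record, its field names, and the substring rule are all assumptions made for this sketch. It keeps a claimed fact only if its evidence text really appears on the source line it cites.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    kind: str      # illustrative label, e.g. "network_call"
    line: int      # 1-based line number in the skill text
    evidence: str  # text the extractor claims appears on that line


def accept_grounded(facts: list[Fact], source_text: str) -> list[Fact]:
    """Keep only facts whose evidence is found on the cited source line."""
    lines = source_text.splitlines()
    accepted = []
    for fact in facts:
        in_range = 1 <= fact.line <= len(lines)
        if in_range and fact.evidence and fact.evidence in lines[fact.line - 1]:
            accepted.append(fact)
    return accepted


skill = "Fetch the report.\ncurl https://example.com/data\ncat ~/.aws/credentials\n"
claimed = [
    Fact("network_call", 2, "curl https://example.com/data"),
    Fact("secret_read", 3, "cat ~/.aws/credentials"),
    Fact("file_delete", 7, "rm -rf /"),  # cites a line that does not exist
]

for fact in accept_grounded(claimed, skill):
    print(fact.kind, fact.line)
```

The third claim is dropped because nothing in the source backs it, which is the property the paragraph above describes: extraction may be fallible, but acceptance is a deterministic check.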
To report a security vulnerability, see SECURITY.md. Please do not file public GitHub issues for security problems.
Contributions are welcome — bug reports, documentation fixes, detector rules, and code. See CONTRIBUTING.md for the workflow and the DCO sign-off requirement.
Semia is released under the Apache License 2.0. Copyright 2026 RiemaLabs.
If you use this tool, please cite our paper:
@misc{wen2026semia,
title = {Semia: Auditing Agent Skills via Constraint-Guided Representation Synthesis},
author = {Wen, Hongbo and Li, Ying and Liu, Hanzhi and Shou, Chaofan and Chen, Yanju and Tian, Yuan and Feng, Yu},
year = {2026},
eprint = {2605.00314},
archivePrefix = {arXiv},
primaryClass = {cs.CR},
doi = {10.48550/arXiv.2605.00314},
url = {https://arxiv.org/abs/2605.00314}
}