by harumiWeb
Excel to structured JSON (tables, shapes, charts) for LLM/RAG pipelines
# Add to your Claude Code skills
git clone https://github.com/harumiWeb/exstruct
ExStruct reads Excel workbooks and outputs structured data (cells, table candidates, shapes, charts, SmartArt, merged cell ranges, print areas/views, auto page-break areas, hyperlinks) as JSON by default, with optional YAML/TOON formats. It targets both COM/Excel environments (rich extraction) and non-COM environments (cells + table candidates + print areas), and offers tunable detection heuristics and multiple output modes to fit LLM/RAG pipelines.
- Output modes: `light`, `standard`, `verbose`.
- `colors_map`
- `formulas_map` (formula string -> cell coordinates) via openpyxl/COM; enabled by default in `verbose` or via `include_formulas_map`.
- Output formats: JSON (`--pretty` available), YAML, TOON (optional dependencies).
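
To make the "formula string -> cell coordinates" shape of `formulas_map` concrete, here is a minimal sketch in plain Python. It does not use ExStruct's own API: the `build_formulas_map` helper, the `(row, column)` tuple coordinates, and the sample data are all hypothetical, chosen only to illustrate the idea of grouping cells by the formula they contain.

```python
from collections import defaultdict


def build_formulas_map(cells):
    """Group cell coordinates by formula string.

    `cells` maps (row, column) tuples to raw cell values. Any string value
    starting with "=" is treated as a formula (an assumption of this sketch,
    not necessarily how ExStruct decides).
    """
    formulas = defaultdict(list)
    # Visit cells in row-major order so each coordinate list is sorted.
    for coord in sorted(cells):
        value = cells[coord]
        if isinstance(value, str) and value.startswith("="):
            formulas[value].append(coord)
    return dict(formulas)


# Hypothetical sheet contents keyed by (row, column).
sample_cells = {
    (1, 1): 10,
    (1, 2): 20,
    (1, 3): "=A1+B1",
    (2, 3): "=A1+B1",
    (3, 1): "=SUM(A1:A2)",
    (3, 2): "note",
}

formulas_map = build_formulas_map(sample_cells)
for formula, coords in formulas_map.items():
    print(formula, coords)
```

Keying the map by formula string rather than by cell keeps repeated formulas compact, which may help when the output is passed to an LLM with a limited context window.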
This repository includes benchmark reports focused on RAG/LLM preprocessing of Excel documents. We track two perspectives: (...