catai

Name: catai
Author: withcatai


Run an AI ✨ assistant locally, with a simple API for Node.js 🚀

496 stars

40 forks

TypeScript

Added 3/9/2026


Tags: AI Agents, ai, ai-assistant, catai, chatbot, chatgpt

Installation

# Add to your Claude Code skills
git clone https://github.com/withcatai/catai


Security Report: Verified

Last scanned: 5/16/2026

{
  "issues": [],
  "status": "PASSED",
  "scannedAt": "2026-05-16T06:22:39.568Z",
  "semgrepRan": false,
  "npmAuditRan": true,
  "pipAuditRan": true
}

README.md


🚀 Exciting Updates Coming Soon! New UI, Function Calling, and more amazing features are on the way! Stay tuned for updates.

Run GGUF models on your computer with a chat UI.

Your own AI assistant runs locally on your computer.

Inspired by Node-Llama-Cpp, Llama.cpp

Installation & Use

Make sure you have Node.js (download current) installed.

npm install -g catai

catai install qwen3-4b-q4_k_m
catai up


Features

Auto detect programming language 🧑‍💻
Click on user icon to show original message 💬
Real time text streaming ⏱️
Fast model downloads 🚀

CLI

Usage: catai [options] [command]

Options:
  -V, --version                    output the version number
  -h, --help                       display help for command

Commands:
  install|i [options] [models...]  Install any GGUF model
  models|ls [options]              List all available models
  use [model]                      Set model to use
  serve|up [options]               Open the chat website
  update                           Update server to the latest version
  active                           Show active model
  remove|rm [options] [models...]  Remove a model
  uninstall                        Uninstall server and delete all models
  node-llama-cpp|cpp [options]     Node llama.cpp CLI - recompile node-llama-cpp binaries
  help [command]                   display help for command

Install command

Usage: cli install|i [options] [models...]

Install any GGUF model

Arguments:
  models                Model name/url/path

Options:
  -t --tag [tag]        The name of the model in local directory
  -l --latest           Install the latest version of a model (may be unstable)
  -b --bind [bind]      The model binding method
  -bk --bind-key [key]  key/cookie that the binding requires
  -h, --help            display help for command
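
When the install step is scripted, it can help to build the argument list in one place. The following is a minimal sketch that uses only the `--tag` and `--latest` flags from the help output above; the helper name `buildInstallArgs` and its `options` object are hypothetical and not part of catai.

```javascript
// Hypothetical helper (not part of catai): builds the argument list
// for `catai install` from the flags documented in the help output.
function buildInstallArgs(models, options = {}) {
    const args = ["install"];
    if (options.tag) {
        // -t --tag [tag]: the name of the model in the local directory
        args.push("--tag", options.tag);
    }
    if (options.latest) {
        // -l --latest: install the latest version (may be unstable)
        args.push("--latest");
    }
    // Model names, URLs, or paths go last
    return args.concat(models);
}

const args = buildInstallArgs(["qwen3-4b-q4_k_m"], {tag: "my-qwen"});
console.log(["catai", ...args].join(" "));
// prints "catai install --tag my-qwen qwen3-4b-q4_k_m"
```

The resulting array could then be handed to Node's `child_process.spawn("catai", args)`, assuming the `catai` binary is on the PATH.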

Cross-platform

You can use it on Windows, Linux and Mac.

This package uses node-llama-cpp which supports the following platforms:

darwin-x64
darwin-arm64
linux-x64
linux-arm64
linux-armv7l
linux-ppc64le
win32-x64-msvc

Good to know

All downloaded data is stored in the ~/catai folder by default.
Downloads are multi-threaded, so they finish faster but may use a lot of bandwidth.

Web API

There is also a simple API that you can use to ask the model questions.

const response = await fetch('http://127.0.0.1:3000/api/chat/prompt', {
    method: 'POST',
    body: JSON.stringify({
        prompt: 'Write me 100 words story'
    }),
    headers: {
        'Content-Type': 'application/json'
    }
});

const data = await response.text();

For more information, please read the API guide
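
The request above does not check for HTTP errors. A small wrapper, sketched below, adds that check. It assumes the server listens on 127.0.0.1:3000 as in the example and that the reply body is plain text; the function name `askCatAI` and its `baseUrl` and `fetchImpl` options are hypothetical and not part of catai.

```javascript
// Hypothetical wrapper (not part of catai) around the request shown above.
// Assumptions: the server address matches the example, and the reply body
// is plain text. `fetchImpl` can be swapped out, e.g. for testing.
async function askCatAI(prompt, {baseUrl = "http://127.0.0.1:3000", fetchImpl = globalThis.fetch} = {}) {
    const response = await fetchImpl(`${baseUrl}/api/chat/prompt`, {
        method: "POST",
        body: JSON.stringify({prompt}),
        headers: {"Content-Type": "application/json"}
    });

    // fetch() only rejects on network failures, so HTTP error
    // statuses have to be checked explicitly
    if (!response.ok) {
        throw new Error(`CatAI request failed with status ${response.status}`);
    }

    return response.text();
}
```

With the server running, a call would look like `const story = await askCatAI("Write me a 100-word story");`.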

Development API

You can also use the development API to interact with the model.

import {createChat, downloadModel, initCatAILlama, LlamaJsonSchemaGrammar} from "catai";

// skip downloading the model if you already have it
await downloadModel("qwen3-4b-q4_k_m");

const llama = await initCatAILlama();
const chat = await createChat({
    model: "qwen3-4b-q4_k_m"
});

const fullResponse = await chat.prompt("Give me array of random numbers (10 numbers)", {
    grammar: new LlamaJsonSchemaGrammar(llama, {
        type: "array",
        items: {
            type: "number",
            minimum: 0,
            maximum: 100
        },
    }),
    topP: 0.8,
    temperature: 0.8,
});

console.log(fullResponse); // [10, 2, 3, 4, 6, 9, 8, 1, 7, 5]

(For the full list of models, run catai models)
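
The grammar in the example above constrains the shape of the reply, but the caller still has to turn the text into a value. The sketch below assumes the reply arrives as a JSON string. The text above does not say whether the grammar enforces the numeric bounds, so the sketch re-checks them. The helper name `parseNumberArray` is hypothetical and not part of catai.

```javascript
// Hypothetical helper (not part of catai): parse a grammar-constrained
// reply and re-check it against the schema used in the example above.
// Assumption: the reply is a JSON string.
function parseNumberArray(text, {minimum = 0, maximum = 100} = {}) {
    const value = JSON.parse(text); // throws SyntaxError on invalid JSON
    if (!Array.isArray(value)) {
        throw new TypeError("Expected a JSON array");
    }
    for (const item of value) {
        if (typeof item !== "number" || item < minimum || item > maximum) {
            throw new RangeError(`Unexpected array item: ${JSON.stringify(item)}`);
        }
    }
    return value;
}

const numbers = parseNumberArray("[10, 2, 3, 4, 6, 9, 8, 1, 7, 5]");
console.log(numbers.length); // prints 10
```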

Node-llama-cpp@beta low-level integration

You can use the model directly with node-llama-cpp@beta.

CatAI lets you manage the models easily and chat with them.

import {downloadModel, getModelPath, initCatAILlama, LlamaChatSession} from 'catai';

// download the model, skip if you already have the model
await downloadModel(
    "https://huggingface.co/giladgd/Qwen3-Reranker-4B-GGUF/resolve/main/Qwen3-Reranker-4B.Q3_K_M.gguf?download=true",
    "qwen3-reranker-4b"
);

// get the model path with catai
const modelPath = getModelPath("qwen3-reranker-4b");

const llama = await initCatAILlama();
const model = await llama.loadModel({
    modelPath
});

const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

const a1 = await session.prompt("Hi there, how are you?");
console.log("AI: " + a1);

Configuration

You can edit the configuration via the web UI.

More information here

Contributing

Contributions are welcome!

Please read our contributing guide to get started.

License

This project uses Llama.cpp to run models on your computer, so any license that applies to Llama.cpp also applies to this project.