Forge CLI Guide
A professional terminal-native prompt engineering and AI model evaluation studio with intelligent failover, real-time health tracking, and multi-provider benchmarking.
Installation
Install globally from npm to make the forge command available anywhere on your machine.
npm install -g prompt-forge-ai-cli
Requires Node.js ≥ 18.0.0. All dependencies install automatically.
Authentication
The CLI connects to your Prompt Forge Studio account via API key. On first launch, you'll be prompted to enter it. Get yours at prompt-forge-studio.vercel.app.
No Prompt Forge API key found. Let's get you authenticated.

? Enter your Prompt Forge API Key: ****************************
✔ Prompt Forge API key saved securely locally.
Initializing a Project
Create a structured repository for saving prompts, sessions, and configs:
forge init my-project
╭──────────────────────────────────────────────────────────╮
│                                                          │
│   Initializing Forge Studio Project: my-project          │
│                                                          │
╰──────────────────────────────────────────────────────────╯

✔ Directories created
✔ Configuration files generated
✔ Git repository initialized
✔ Project setup complete!

Next steps:
  cd my-project
  forge
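The exact layout `forge init` generates isn't listed in this guide; a plausible sketch, inferred from the prompts, sessions, and configs mentioned above and the `prompts/` path used by `forge push` later (directory and file names are assumptions):

```text
my-project/
├── prompts/            # prompt JSON configurations (used by forge push)
├── sessions/           # saved studio sessions
├── forge.config.json   # assumed name for the generated project config
└── .git/               # created by "Git repository initialized"
```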
Hub Registry Commands
The CLI integrates directly with the PromptForge Hub. You can search, pull, and push prompts without leaving your terminal.
forge search <query>
Search the global prompt registry for publicly available prompts.
forge search "marketing outline"
forge pull <identifier>
Pull a prompt from the registry by its username/slug combination to use it locally.
forge pull "ani12004/marketing-outline"
forge push <file>
Publish a local prompt JSON configuration to your PromptForge Hub profile.
forge push prompts/my-awesome-prompt.json
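The prompt JSON schema isn't documented in this guide; a minimal sketch of what a pushable file might contain, consistent with the `username/slug` identifiers used by `forge pull` (all field names are assumptions):

```json
{
  "slug": "my-awesome-prompt",
  "description": "Generates a structured marketing outline",
  "prompt": "You are a marketing strategist. Draft an outline for the given topic.",
  "tags": ["marketing", "outline"]
}
```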
Interactive Studio Mode
Launch the studio REPL to start generating AI responses, testing models, and running diagnostics — all from one terminal.
forge
[FORGE STUDIO ASCII banner]

Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF
forge>
Commands Reference
Type any command below inside Studio Mode. Type :help to see them all at any time.
- :help
Show all available commands and the API key URL.
Available Commands:

  :help       Show this help menu
  :model      Switch primary AI model
  :auto       Toggle auto-failover ON/OFF
  :debug      Toggle debug mode ON/OFF
  :doctor     Run connectivity diagnostics
  :benchmark  Compare all models on same prompt
  :health     Show model health statistics
  :history    Show session prompt history
  :key        Set your Prompt Forge API key
  :clear      Clear the terminal
  :exit       Exit Forge Studio

Get your API key at: https://prompt-forge-studio.vercel.app
- :model
Interactively switch your default primary AI model provider.
? Select a primary model: (Use arrow keys)
❯ nvidia
  gemini

Model: gemini | Status: Online | Latency: - | Auto: ON | Debug: OFF
- :doctor
Run full connectivity diagnostics across all AI providers. Validates API key format, tests network, and reports latency.
╭────────────────────────────╮
│   Running Diagnostics...   │
╰────────────────────────────╯

✔ nvidia is healthy (2376ms)
✔ gemini is healthy (2228ms)

┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
│ Provider │ Status │ Latency      │ Success Count │ Fail Measure │
├──────────┼────────┼──────────────┼───────────────┼──────────────┤
│ nvidia   │ Active │ 2376ms (avg) │ 1             │ 0            │
├──────────┼────────┼──────────────┼───────────────┼──────────────┤
│ gemini   │ Active │ 2228ms (avg) │ 1             │ 0            │
└──────────┴────────┴──────────────┴───────────────┴──────────────┘
- :benchmark
Run the same prompt on all models side-by-side. Compare latency, token usage, and response preview.
╭──────────────────────────────────╮
│   Benchmarking all providers...  │
╰──────────────────────────────────╯

✔ nvidia completed in 2376ms
✔ gemini completed in 1890ms

Benchmark Report

┌──────────┬────────┬─────────┬─────────────────┬──────────────────────────────┐
│ Provider │ Status │ Latency │ Tokens (In/Out) │ Response Preview             │
├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
│ nvidia   │ ✓ OK   │ 2376ms  │ 33 / 624        │ Node.js is an open-source... │
├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
│ gemini   │ ✓ OK   │ 1890ms  │ 15 / 412        │ Node.js is a JavaScript...   │
└──────────┴────────┴─────────┴─────────────────┴──────────────────────────────┘

⚡ Fastest: gemini at 1890ms
- :health
Display persistent health statistics, tracked across sessions and stored in ~/.forge/health.json.
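The on-disk format of ~/.forge/health.json isn't documented here; a plausible sketch consistent with the columns in the table below (field names are assumptions):

```json
{
  "nvidia": { "status": "Online", "avgLatencyMs": 2376, "successCount": 14, "failCount": 2 },
  "gemini": { "status": "Online", "avgLatencyMs": 1890, "successCount": 8, "failCount": 0 }
}
```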
┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
│ Provider │ Status │ Latency      │ Success Count │ Fail Measure │
├──────────┼────────┼──────────────┼───────────────┼──────────────┤
│ nvidia   │ Online │ 2376ms (avg) │ 14            │ 2            │
├──────────┼────────┼──────────────┼───────────────┼──────────────┤
│ gemini   │ Online │ 1890ms (avg) │ 8             │ 0            │
└──────────┴────────┴──────────────┴───────────────┴──────────────┘
- :debug
Toggle debug mode. When ON, shows HTTP status codes, request URLs, latency, tokens, and raw error payloads.
╭──────────────────────────╮
│   Debug mode is now ON   │
╰──────────────────────────╯

Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: ON

forge> hello
- Generating with nvidia...
[DEBUG] Nvidia → HTTP 200 | 4587ms | Model: nvidia/nemotron-3-nano-30b-a3b
[DEBUG] Tokens: in=33 out=624
- :auto
Toggle auto-failover. When active, if the primary model fails after retries, it automatically cascades to the backup model with exponential backoff.
╭──────────────────────────────────╮
│   Auto-failover is now ON        │
╰──────────────────────────────────╯

Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF
- :history
Show all prompts submitted in the current session along with the model used and response latency.
Session History:

1. What is Node.js? → [nvidia] 4587ms
2. Explain async/await → [gemini] 1890ms
3. Write a REST API example → [nvidia] 3241ms
- :key
Update your Prompt Forge Studio API key. Stored securely in ~/.forge/auth.json.
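The guide states only that the key lives in ~/.forge/auth.json; a plausible shape for that file (the field name is an assumption):

```json
{ "apiKey": "****************************" }
```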
? Enter new Prompt Forge API Key: ****************************

╭───────────────────────────────╮
│   API Key updated safely.     │
╰───────────────────────────────╯
- :clear
Clear the terminal screen and redraw the status bar.
(screen cleared)

Model: nvidia | Status: Online | Latency: 4587ms | Auto: ON | Debug: OFF
forge>
- :exit
Safely exit Forge Studio.
Exiting Forge Studio. Goodbye!
Intelligent Failover
The CLI features production-grade failover logic. Each model gets up to 3 attempts, with exponential backoff between retries (500ms, then 1000ms). If the primary model exhausts all attempts, the CLI cascades to the next healthy model automatically.
Retries On
- HTTP 429 (Rate Limited)
- HTTP 500-599 (Server Error)
- Network Timeout (15s)
Fails Immediately On
- HTTP 401 (Invalid API Key)
- HTTP 403 (Forbidden)
forge> Explain microservices
- Generating with nvidia...
[WARN] Nvidia failed attempt 1/3 — HTTP 500. Retrying in 500ms...
[WARN] Nvidia failed attempt 2/3 — HTTP 500. Retrying in 1000ms...
[WARN] Nvidia failed attempt 3/3 — Exhausted. Falling back...
✔ gemini responded in 1890ms

╭─────────────────────────────────────────────────────╮
│   Warning: Responded using Fallback Model: gemini   │
╰─────────────────────────────────────────────────────╯

╭ AI Response ──────────────────────────────────────╮
│                                                   │
│   Microservices is an architectural pattern...    │
│                                                   │
│   ---                                             │
│   [Model: gemini | Latency: 1890ms]               │
│                                                   │
╰───────────────────────────────────────────────────╯
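The retry-then-cascade behavior above can be sketched in plain Node.js. This is an illustrative model of the logic described in this section, not the CLI's actual source; `generateWithFailover`, the `provider.generate()` interface, and the `err.status` field are assumptions.

```javascript
// Sketch of the failover loop: up to 3 attempts per provider with
// exponential backoff (500ms, then 1000ms), then cascade to the next provider.
const BASE_DELAY_MS = 500;
const MAX_ATTEMPTS = 3;
const FATAL_STATUSES = new Set([401, 403]); // fail immediately, no retry

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function generateWithFailover(providers, prompt) {
  for (const provider of providers) {
    for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
      try {
        return await provider.generate(prompt);
      } catch (err) {
        if (FATAL_STATUSES.has(err.status)) throw err; // 401/403: don't retry
        if (attempt < MAX_ATTEMPTS) {
          // Backoff doubles each retry: 500ms, 1000ms.
          await sleep(BASE_DELAY_MS * 2 ** (attempt - 1));
        }
      }
    }
    // All attempts exhausted for this provider — cascade to the next one.
  }
  throw new Error("All providers exhausted");
}
```

Note that 429, 5xx, and timeouts all land in the retry path, while auth errors abort straight away, matching the two lists above.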
