CLI v1.0.0

Forge CLI Guide

A professional, terminal-native studio for prompt engineering and AI model evaluation, with intelligent failover, real-time health tracking, and multi-provider benchmarking.

Installation

Install globally from npm to make the forge command available anywhere on your machine.

bash
npm install -g prompt-forge-ai-cli

Requires Node.js ≥ 18.0.0. All dependencies install automatically.
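Before installing, you can confirm the Node.js requirement is met. This is an illustrative check (it assumes node is already on your PATH), not part of the CLI itself:

```shell
# Check that the local Node.js satisfies the >= 18.0.0 requirement.
check_node() {
  command -v node >/dev/null 2>&1 || { echo "node not found; install Node.js 18+"; return 1; }
  # Parse the major version out of `node --version` (e.g. v20.11.1 -> 20)
  major=$(node --version | sed 's/^v//' | cut -d. -f1)
  if [ "$major" -ge 18 ]; then
    echo "Node.js v$major: OK"
  else
    echo "Node.js v$major: upgrade to 18 or later"
    return 1
  fi
}

check_node || true
```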

Authentication

The CLI connects to your Prompt Forge Studio account via API key. On first launch, you'll be prompted to enter it. Get yours at prompt-forge-studio.vercel.app.

Terminal Output
No Prompt Forge API key found. Let's get you authenticated.

? Enter your Prompt Forge API Key: ****************************

✔ Prompt Forge API key saved securely on this machine.

Initializing a Project

Create a structured repository for saving prompts, sessions, and configs:

bash
forge init my-project
Terminal Output
╭──────────────────────────────────────────────────────────╮
│                                                          │
│   Initializing Forge Studio Project: my-project          │
│                                                          │
╰──────────────────────────────────────────────────────────╯

✔ Directories created
✔ Configuration files generated
✔ Git repository initialized

✔ Project setup complete!

Next steps:
  cd my-project
  forge

Hub Registry Commands

The CLI integrates directly with the PromptForge Hub. You can search, pull, and push prompts without leaving your terminal.

forge search <query>

Search the global prompt registry for publicly available prompts.

bash
forge search "marketing outline"

forge pull <identifier>

Pull a prompt from the registry by its username/slug combination to use it locally.

bash
forge pull "ani12004/marketing-outline"

forge push <file>

Publish a local prompt JSON configuration to your PromptForge Hub profile.

bash
forge push prompts/my-awesome-prompt.json

Interactive Studio Mode

Launch the studio REPL to start generating AI responses, testing models, and running diagnostics — all from one terminal.

bash
forge
Terminal Output
  _____  ___   ____    ____  _____   ____  _____  _   _  ____  ___  ___
 |  ___|| _ \ |  _ \  / ___|| ____| / ___||_   _|| | | ||  _ \|_ _|/ _ \
 | |_  | | | || |_) || |  _ |  _|   \___ \  | |  | | | || | | || || | | |
 |  _| | |_| ||  _ < | |_| || |___   ___) | | |  | |_| || |_| || || |_| |
 |_|    \___/ |_| \_\ \____||_____| |____/  |_|   \___/ |____/|___|\___/

Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF

forge>

Commands Reference

Run any of the commands below inside Studio Mode; type :help at any time to list them all.

  • :help

    Show all available commands and the API key URL.

    Terminal Output
      Available Commands:
    
      :help       Show this help menu
      :model      Switch primary AI model
      :auto       Toggle auto-failover ON/OFF
      :debug      Toggle debug mode ON/OFF
      :doctor     Run connectivity diagnostics
      :benchmark  Compare all models on same prompt
      :health     Show model health statistics
      :history    Show session prompt history
      :key        Set your Prompt Forge API key
      :clear      Clear the terminal
      :exit       Exit Forge Studio
    
      Get your API key at:
      https://prompt-forge-studio.vercel.app
  • :model

    Interactively switch your default primary AI model provider.

    Terminal Output
    ? Select a primary model: (Use arrow keys)
    ❯ nvidia
      gemini
    
    Model: gemini | Status: Online | Latency: - | Auto: ON | Debug: OFF
  • :doctor

    Run full connectivity diagnostics across all AI providers. Validates API key format, tests network, and reports latency.

    Terminal Output
    ╭────────────────────────────╮
    │   Running Diagnostics...   │
    ╰────────────────────────────╯
    
    ✔ nvidia is healthy (2376ms)
    ✔ gemini is healthy (2228ms)
    
    ┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
    │ Provider │ Status │ Latency      │ Success Count │ Fail Count   │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ nvidia   │ Active │ 2376ms (avg) │ 1             │ 0            │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ gemini   │ Active │ 2228ms (avg) │ 1             │ 0            │
    └──────────┴────────┴──────────────┴───────────────┴──────────────┘
  • :benchmark

    Run the same prompt on all models side-by-side. Compare latency, token usage, and response preview.

    Terminal Output
    ╭──────────────────────────────────╮
    │   Benchmarking all providers...  │
    ╰──────────────────────────────────╯
    
    ✔ nvidia completed in 2376ms
    ✔ gemini completed in 1890ms
    
      Benchmark Report
    
    ┌──────────┬────────┬─────────┬─────────────────┬──────────────────────────────┐
    │ Provider │ Status │ Latency │ Tokens (In/Out) │ Response Preview             │
    ├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
    │ nvidia   │ ✓ OK   │ 2376ms  │ 33 / 624        │ Node.js is an open-source... │
    ├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
    │ gemini   │ ✓ OK   │ 1890ms  │ 15 / 412        │ Node.js is a JavaScript...   │
    └──────────┴────────┴─────────┴─────────────────┴──────────────────────────────┘
    
      ⚡ Fastest: gemini at 1890ms
  • :health

    Display persistent health statistics tracked across sessions in ~/.forge/health.json.

    Terminal Output
    ┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
    │ Provider │ Status │ Latency      │ Success Count │ Fail Count   │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ nvidia   │ Online │ 2376ms (avg) │ 14            │ 2            │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ gemini   │ Online │ 1890ms (avg) │ 8             │ 0            │
    └──────────┴────────┴──────────────┴───────────────┴──────────────┘
  • :debug

    Toggle debug mode. When ON, shows HTTP status codes, request URLs, latency, tokens, and raw error payloads.

    Terminal Output
    ╭──────────────────────────╮
    │   Debug mode is now ON   │
    ╰──────────────────────────╯
    
    Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: ON
    
    forge> hello
    - Generating with nvidia...
      [DEBUG] Nvidia → HTTP 200 | 4587ms | Model: nvidia/nemotron-3-nano-30b-a3b
      [DEBUG] Tokens: in=33 out=624
  • :auto

    Toggle auto-failover. When active, if the primary model fails after retries, it automatically cascades to the backup model with exponential backoff.

    Terminal Output
    ╭──────────────────────────────────╮
    │   Auto-failover is now ON        │
    ╰──────────────────────────────────╯
    
    Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF
  • :history

    Show all prompts submitted in the current session along with the model used and response latency.

    Terminal Output
    Session History:
    1. What is Node.js?
       → [nvidia] 4587ms
    
    2. Explain async/await
       → [gemini] 1890ms
    
    3. Write a REST API example
       → [nvidia] 3241ms
  • :key

    Update your Prompt Forge Studio API key. Stored securely in ~/.forge/auth.json.

    Terminal Output
    ? Enter new Prompt Forge API Key: ****************************
    
    ╭───────────────────────────────╮
    │   API Key updated safely.     │
    ╰───────────────────────────────╯
  • :clear

    Clear the terminal screen and redraw the status bar.

    Terminal Output
    (screen cleared)
    
    Model: nvidia | Status: Online | Latency: 4587ms | Auto: ON | Debug: OFF
    
    forge>
  • :exit

    Safely exit Forge Studio.

    Terminal Output
    Exiting Forge Studio. Goodbye!
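As the :key and :health entries note, the CLI persists its state under ~/.forge. A quick, illustrative way to see which of those files exist yet on your machine (the filenames come from the command descriptions above):

```shell
# List the CLI's state files; both live under ~/.forge per the docs above.
FORGE_DIR="$HOME/.forge"
for f in auth.json health.json; do
  if [ -f "$FORGE_DIR/$f" ]; then
    echo "$FORGE_DIR/$f: present"
  else
    echo "$FORGE_DIR/$f: not created yet"
  fi
done
```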

Intelligent Failover

The CLI features production-grade failover logic. Each model gets up to three attempts (two retries) with exponential backoff (500ms, then 1000ms). If the primary model exhausts all attempts, requests cascade automatically to the next healthy model.

Retries On

  • HTTP 429 (Rate Limited)
  • HTTP 500-599 (Server Error)
  • Network Timeout (15s)

Fails Immediately On

  • HTTP 401 (Invalid API Key)
  • HTTP 403 (Forbidden)
Terminal Output
forge> Explain microservices
- Generating with nvidia...
  [WARN] Nvidia failed attempt 1/3 — HTTP 500. Retrying in 500ms...
  [WARN] Nvidia failed attempt 2/3 — HTTP 500. Retrying in 1000ms...
  [WARN] Nvidia failed attempt 3/3 — Exhausted. Falling back...

✔ gemini responded in 1890ms

╭─────────────────────────────────────────────────────╮
│   Warning: Responded using Fallback Model: gemini   │
╰─────────────────────────────────────────────────────╯

╭  AI Response  ──────────────────────────────────────╮
│                                                     │
│   Microservices is an architectural pattern...      │
│                                                     │
│   ---                                               │
│   [Model: gemini | Latency: 1890ms]                 │
│                                                     │
╰─────────────────────────────────────────────────────╯
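The retry-and-cascade behavior shown above can be sketched in plain shell. This is an illustrative model of the documented schedule (three attempts, with 500ms and 1000ms waits), not the CLI's actual source; try_provider is a hypothetical stand-in for a real provider call:

```shell
# Sketch of the documented failover schedule; not the CLI's implementation.
retry_with_backoff() {
  delay=0.5                          # first backoff: 500ms
  for attempt in 1 2 3; do           # 1 initial try + 2 retries
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    if [ "$attempt" -lt 3 ]; then
      echo "attempt $attempt/3 failed; retrying in ${delay}s..."
      sleep "$delay"
      delay=$(awk "BEGIN { print $delay * 2 }")   # 0.5s -> 1s
    fi
  done
  echo "attempt 3/3 failed; cascading to fallback model"
  return 1
}

try_provider() { return 1; }         # always fails, to demonstrate the cascade
retry_with_backoff try_provider || echo "fallback model would answer here"
```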