CLI v1.0.0

Forge CLI Guide

A professional, terminal-native studio for prompt engineering and AI model evaluation, with intelligent failover, real-time health tracking, and multi-provider benchmarking.

Installation

Install globally from npm to make the forge command available anywhere on your machine.

bash
npm install -g prompt-forge-ai-cli

Requires Node.js ≥ 18.0.0. All dependencies install automatically.
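Before installing, you can confirm the Node.js requirement is met. This is an illustrative check (it assumes node is already on your PATH), not part of the CLI itself:

```shell
# Check that the local Node.js satisfies the >= 18.0.0 requirement.
check_node() {
  command -v node >/dev/null 2>&1 || { echo "node not found; install Node.js 18+"; return 1; }
  # Parse the major version out of `node --version` (e.g. v20.11.1 -> 20)
  major=$(node --version | sed 's/^v//' | cut -d. -f1)
  if [ "$major" -ge 18 ]; then
    echo "Node.js v$major: OK"
  else
    echo "Node.js v$major: upgrade to 18 or later"
    return 1
  fi
}

check_node || true
```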

Authentication

The CLI connects to your Prompt Forge Studio account via API key. On first launch, you'll be prompted to enter it. Get yours at prompt-forge-studio.vercel.app.

Terminal Output
No Prompt Forge API key found. Let's get you authenticated.

? Enter your Prompt Forge API Key: ****************************

✔ Prompt Forge API key saved securely on this machine.

Initializing a Project

Create a structured repository for saving prompts, sessions, and configs:

bash
forge init my-project
Terminal Output
╭──────────────────────────────────────────────────────────╮
│                                                          │
│   Initializing Forge Studio Project: my-project          │
│                                                          │
╰──────────────────────────────────────────────────────────╯

✔ Directories created
✔ Configuration files generated
✔ Git repository initialized

✔ Project setup complete!

Next steps:
  cd my-project
  forge

Hub Registry Commands

The CLI integrates directly with the PromptForge Hub. You can search, pull, and push prompts without leaving your terminal.

forge search <query>

Search the global prompt registry for publicly available prompts.

bash
forge search "marketing outline"

forge pull <identifier>

Pull a prompt from the registry by its username/slug combination to use it locally.

bash
forge pull "ani12004/marketing-outline"

forge push <file>

Publish a local prompt JSON configuration to your PromptForge Hub profile.

bash
forge push prompts/my-awesome-prompt.json

Interactive Studio Mode

Launch the studio REPL to start generating AI responses, testing models, and running diagnostics — all from one terminal.

bash
forge
Terminal Output
  _____  ___   ____    ____  _____   ____  _____  _   _  ____  ___  ___
 |  ___|| _ \ |  _ \  / ___|| ____| / ___||_   _|| | | ||  _ \|_ _|/ _ \
 | |_  | | | || |_) || |  _ |  _|   \___ \  | |  | | | || | | || || | | |
 |  _| | |_| ||  _ < | |_| || |___   ___) | | |  | |_| || |_| || || |_| |
 |_|    \___/ |_| \_\ \____||_____| |____/  |_|   \___/ |____/|___|\___/

Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF

forge>

Commands Reference

Run any of the commands below inside Studio Mode; type :help at any time to list them all.

  • :help

    Show all available commands and the API key URL.

    Terminal Output
      Available Commands:
    
      :help       Show this help menu
      :model      Switch primary AI model
      :auto       Toggle auto-failover ON/OFF
      :debug      Toggle debug mode ON/OFF
      :doctor     Run connectivity diagnostics
      :benchmark  Compare all models on same prompt
      :health     Show model health statistics
      :history    Show session prompt history
      :key        Set your Prompt Forge API key
      :clear      Clear the terminal
      :exit       Exit Forge Studio
    
      Get your API key at:
      https://prompt-forge-studio.vercel.app
  • :model

    Interactively switch your default primary AI model provider.

    Terminal Output
    ? Select a primary model: (Use arrow keys)
    ❯ nvidia
      gemini
    
    Model: gemini | Status: Online | Latency: - | Auto: ON | Debug: OFF
  • :doctor

    Run full connectivity diagnostics across all AI providers. Validates API key format, tests network, and reports latency.

    Terminal Output
    ╭────────────────────────────╮
    │   Running Diagnostics...   │
    ╰────────────────────────────╯
    
    ✔ nvidia is healthy (2376ms)
    ✔ gemini is healthy (2228ms)
    
    ┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
    │ Provider │ Status │ Latency      │ Success Count │ Fail Count   │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ nvidia   │ Active │ 2376ms (avg) │ 1             │ 0            │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ gemini   │ Active │ 2228ms (avg) │ 1             │ 0            │
    └──────────┴────────┴──────────────┴───────────────┴──────────────┘
  • :benchmark

    Run the same prompt on all models side-by-side. Compare latency, token usage, and response preview.

    Terminal Output
    ╭──────────────────────────────────╮
    │   Benchmarking all providers...  │
    ╰──────────────────────────────────╯
    
    ✔ nvidia completed in 2376ms
    ✔ gemini completed in 1890ms
    
      Benchmark Report
    
    ┌──────────┬────────┬─────────┬─────────────────┬──────────────────────────────┐
    │ Provider │ Status │ Latency │ Tokens (In/Out) │ Response Preview             │
    ├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
    │ nvidia   │ ✓ OK   │ 2376ms  │ 33 / 624        │ Node.js is an open-source... │
    ├──────────┼────────┼─────────┼─────────────────┼──────────────────────────────┤
    │ gemini   │ ✓ OK   │ 1890ms  │ 15 / 412        │ Node.js is a JavaScript...   │
    └──────────┴────────┴─────────┴─────────────────┴──────────────────────────────┘
    
      ⚡ Fastest: gemini at 1890ms
  • :health

    Display persistent health statistics tracked across sessions in ~/.forge/health.json.

    Terminal Output
    ┌──────────┬────────┬──────────────┬───────────────┬──────────────┐
    │ Provider │ Status │ Latency      │ Success Count │ Fail Count   │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ nvidia   │ Online │ 2376ms (avg) │ 14            │ 2            │
    ├──────────┼────────┼──────────────┼───────────────┼──────────────┤
    │ gemini   │ Online │ 1890ms (avg) │ 8             │ 0            │
    └──────────┴────────┴──────────────┴───────────────┴──────────────┘
  • :debug

    Toggle debug mode. When ON, shows HTTP status codes, request URLs, latency, tokens, and raw error payloads.

    Terminal Output
    ╭──────────────────────────╮
    │   Debug mode is now ON   │
    ╰──────────────────────────╯
    
    Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: ON
    
    forge> hello
    - Generating with nvidia...
      [DEBUG] Nvidia → HTTP 200 | 4587ms | Model: nvidia/nemotron-3-nano-30b-a3b
      [DEBUG] Tokens: in=33 out=624
  • :auto

    Toggle auto-failover. When active, if the primary model fails after retries, it automatically cascades to the backup model with exponential backoff.

    Terminal Output
    ╭──────────────────────────────────╮
    │   Auto-failover is now ON        │
    ╰──────────────────────────────────╯
    
    Model: nvidia | Status: Online | Latency: - | Auto: ON | Debug: OFF
  • :history

    Show all prompts submitted in the current session along with the model used and response latency.

    Terminal Output
    Session History:
    1. What is Node.js?
       → [nvidia] 4587ms
    
    2. Explain async/await
       → [gemini] 1890ms
    
    3. Write a REST API example
       → [nvidia] 3241ms
  • :key

    Update your Prompt Forge Studio API key. Stored securely in ~/.forge/auth.json.

    Terminal Output
    ? Enter new Prompt Forge API Key: ****************************
    
    ╭───────────────────────────────╮
    │   API Key updated safely.     │
    ╰───────────────────────────────╯
  • :clear

    Clear the terminal screen and redraw the status bar.

    Terminal Output
    (screen cleared)
    
    Model: nvidia | Status: Online | Latency: 4587ms | Auto: ON | Debug: OFF
    
    forge>
  • :exit

    Safely exit Forge Studio.

    Terminal Output
    Exiting Forge Studio. Goodbye!
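As the :key and :health entries note, the CLI persists its state under ~/.forge. A quick, illustrative way to see which of those files exist yet on your machine (the filenames come from the command descriptions above):

```shell
# List the CLI's state files; both live under ~/.forge per the docs above.
FORGE_DIR="$HOME/.forge"
for f in auth.json health.json; do
  if [ -f "$FORGE_DIR/$f" ]; then
    echo "$FORGE_DIR/$f: present"
  else
    echo "$FORGE_DIR/$f: not created yet"
  fi
done
```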

Intelligent Failover

The CLI features production-grade failover logic. Each model gets up to three attempts (two retries) with exponential backoff (500ms, then 1000ms). If the primary model exhausts all attempts, requests cascade automatically to the next healthy model.

Retries On

  • HTTP 429 (Rate Limited)
  • HTTP 500-599 (Server Error)
  • Network Timeout (15s)

Fails Immediately On

  • HTTP 401 (Invalid API Key)
  • HTTP 403 (Forbidden)
Terminal Output
forge> Explain microservices
- Generating with nvidia...
  [WARN] Nvidia failed attempt 1/3 — HTTP 500. Retrying in 500ms...
  [WARN] Nvidia failed attempt 2/3 — HTTP 500. Retrying in 1000ms...
  [WARN] Nvidia failed attempt 3/3 — Exhausted. Falling back...

✔ gemini responded in 1890ms

╭─────────────────────────────────────────────────────╮
│   Warning: Responded using Fallback Model: gemini   │
╰─────────────────────────────────────────────────────╯

╭  AI Response  ──────────────────────────────────────╮
│                                                     │
│   Microservices is an architectural pattern...      │
│                                                     │
│   ---                                               │
│   [Model: gemini | Latency: 1890ms]                 │
│                                                     │
╰─────────────────────────────────────────────────────╯
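The retry-and-cascade behavior shown above can be sketched in plain shell. This is an illustrative model of the documented schedule (three attempts, with 500ms and 1000ms waits), not the CLI's actual source; try_provider is a hypothetical stand-in for a real provider call:

```shell
# Sketch of the documented failover schedule; not the CLI's implementation.
retry_with_backoff() {
  delay=0.5                          # first backoff: 500ms
  for attempt in 1 2 3; do           # 1 initial try + 2 retries
    if "$@"; then
      echo "succeeded on attempt $attempt"
      return 0
    fi
    if [ "$attempt" -lt 3 ]; then
      echo "attempt $attempt/3 failed; retrying in ${delay}s..."
      sleep "$delay"
      delay=$(awk "BEGIN { print $delay * 2 }")   # 0.5s -> 1s
    fi
  done
  echo "attempt 3/3 failed; cascading to fallback model"
  return 1
}

try_provider() { return 1; }         # always fails, to demonstrate the cascade
retry_with_backoff try_provider || echo "fallback model would answer here"
```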