Get up and running with groq-rag in 5 minutes.
groq-rag includes all functions from the official groq-sdk. You can use it as a drop-in replacement with additional RAG, web browsing, and agent capabilities.
npm install groq-rag
# Add to your .npmrc
echo "@mithun50:registry=https://npm.pkg.github.com" >> .npmrc
# Install
npm install @mithun50/groq-rag
# Option 1: Environment variable
export GROQ_API_KEY=gsk_xxxxxxxxxxxx
# Option 2: .env file
echo "GROQ_API_KEY=gsk_xxxxxxxxxxxx" > .env
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
const response = await client.complete({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'Explain quantum computing in simple terms' }]
});
console.log(response.choices[0].message.content);
Answer questions using your own documents.
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
// 1. Initialize RAG
await client.initRAG();
// 2. Add your documents
await client.rag.addDocument(`
Acme Corp was founded in 2020.
We build AI-powered healthcare solutions.
Our flagship product is MedAssist.
`);
// 3. Ask questions about your documents
const response = await client.chat.withRAG({
messages: [{ role: 'user', content: 'When was Acme Corp founded?' }]
});
console.log(response.content);
// → "Acme Corp was founded in 2020."
Key methods:
| Method | Description |
|--------|-------------|
| `client.rag.addDocument(text)` | Add text to the knowledge base |
| `client.rag.addUrl(url)` | Add webpage content |
| `client.chat.withRAG({...})` | Chat using the knowledge base |
Get real-time information from the internet.
const response = await client.chat.withWebSearch({
messages: [{ role: 'user', content: 'What happened in tech news today?' }],
maxResults: 5
});
console.log(response.content);
console.log('Sources:', response.sources.map(s => s.url));
Key methods:
| Method | Description |
|--------|-------------|
| `client.web.search(query)` | Search the web |
| `client.web.fetch(url)` | Fetch and parse a URL |
| `client.chat.withWebSearch({...})` | Chat with web search |
Create AI that reasons and uses tools autonomously.
// Create agent with built-in tools
const agent = await client.createAgentWithBuiltins({
model: 'llama-3.3-70b-versatile',
verbose: true // See reasoning steps
});
// Agent automatically chooses which tools to use
const result = await agent.run(
'Search for the latest SpaceX launch and summarize it'
);
console.log(result.output);
console.log('Tools used:', result.toolCalls.map(t => t.name));
Built-in tools:
| Tool | What it does |
|------|--------------|
| `web_search` | Search the internet |
| `fetch_url` | Read webpage content |
| `calculator` | Math operations |
| `get_datetime` | Current date/time |
| `rag_query` | Query the knowledge base (added if RAG is initialized) |
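As an illustration of how a built-in tool like `calculator` might be wired up, here is a sketch only: the `Tool` interface and handler shape below are assumptions for illustration, not the library's actual internals.

```typescript
// Hypothetical tool shape: a name, a description the model sees,
// and a handler the agent invokes with parsed arguments.
interface Tool {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => Promise<string>;
}

// A minimal calculator handler supporting "a op b" expressions.
const calculator: Tool = {
  name: 'calculator',
  description: 'Evaluate a simple arithmetic expression such as "6 * 7"',
  handler: async (args) => {
    const match = String(args.expression).match(
      /^\s*(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)\s*$/
    );
    if (!match) throw new Error('Unsupported expression');
    const [, a, op, b] = match;
    const ops: Record<string, (x: number, y: number) => number> = {
      '+': (x, y) => x + y,
      '-': (x, y) => x - y,
      '*': (x, y) => x * y,
      '/': (x, y) => x / y,
    };
    return String(ops[op](Number(a), Number(b)));
  },
};
```

The agent decides when to call a tool from its description, so descriptions should state exactly what input the handler accepts.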
import fs from 'node:fs';
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
await client.initRAG();
// Load your documents
await client.rag.addDocument(fs.readFileSync('docs/faq.txt', 'utf-8'));
await client.rag.addDocument(fs.readFileSync('docs/manual.txt', 'utf-8'));
await client.rag.addUrl('https://docs.example.com/guide');
// Answer questions
async function askQuestion(question: string) {
const response = await client.chat.withRAG({
messages: [{ role: 'user', content: question }],
topK: 5,
minScore: 0.5
});
return response.content;
}
console.log(await askQuestion('How do I reset my password?'));
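`topK` caps how many chunks are retrieved and `minScore` filters out weak matches. Assuming the score is cosine similarity between embeddings (an assumption; the library may score differently), the retrieval step looks roughly like:

```typescript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep the topK chunks whose similarity to the query meets minScore.
function retrieve(
  queryEmbedding: number[],
  chunks: { text: string; embedding: number[] }[],
  topK: number,
  minScore: number
): string[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .filter((c) => c.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((c) => c.text);
}
```

Raising `minScore` trades recall for precision: fewer off-topic chunks reach the model, but borderline matches are dropped.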
const agent = await client.createAgentWithBuiltins({
model: 'llama-3.3-70b-versatile',
systemPrompt: `You are a research assistant. When asked about a topic:
1. Search for current information
2. Fetch relevant articles
3. Synthesize findings into a clear summary
Always cite your sources.`
});
const result = await agent.run('Research the current state of fusion energy');
console.log(result.output);
const agent = await client.createAgentWithBuiltins();
for await (const event of agent.runStream('Explain machine learning')) {
switch (event.type) {
case 'content':
process.stdout.write(event.data as string);
break;
case 'tool_call':
console.log('\n[Using tool...]');
break;
case 'done':
console.log('\n[Complete]');
break;
}
}
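The stream yields tagged events, which can be modeled in TypeScript as a discriminated union. The field names below mirror the loop above, but the exact event types are an assumption:

```typescript
// Discriminated union matching the event.type values used above.
type AgentEvent =
  | { type: 'content'; data: string }
  | { type: 'tool_call'; data: { name: string } }
  | { type: 'done' };

// Narrowing on `type` gives each branch the right `data` shape,
// so the string cast in the loop above becomes unnecessary.
function describe(event: AgentEvent): string {
  switch (event.type) {
    case 'content':
      return event.data;
    case 'tool_call':
      return `[tool: ${event.data.name}]`;
    case 'done':
      return '[complete]';
  }
}
```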
groq-rag supports all Groq models. Here are recommended models by use case:
| Model | Speed | Best For |
|-------|-------|----------|
| `llama-3.3-70b-versatile` | 280 T/s | General purpose, highest quality |
| `llama-3.1-8b-instant` | 560 T/s | Fast responses, cost-effective |
| `openai/gpt-oss-120b` | 500 T/s | Complex reasoning with tools |
| `qwen/qwen3-32b` | — | Math & reasoning tasks |
| `deepseek-r1-distill-qwen-32b` | 140 T/s | Math (94% MATH-500), code |
| `meta-llama/llama-4-scout-17b-16e-instruct` | — | Vision + text tasks |
| `groq/compound` | 450 T/s | Built-in web search & code execution |
📚 See Groq Models Documentation for the complete list.
await client.initRAG({
// Embedding provider
embedding: {
provider: 'groq', // 'groq' (free) or 'openai' (better quality)
},
// Vector store
vectorStore: {
provider: 'memory', // 'memory' (dev) or 'chroma' (production)
},
// How to split documents
chunking: {
strategy: 'recursive',
chunkSize: 1000,
chunkOverlap: 200
}
});
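To picture what `chunkSize` and `chunkOverlap` do, here is a simplified character-based splitter. This is a sketch only; the library's `recursive` strategy also splits on separators such as paragraphs and sentences before falling back to characters:

```typescript
// Split text into chunks of at most chunkSize characters,
// where consecutive chunks share chunkOverlap characters.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far the window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.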
await client.initRAG({
embedding: {
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY,
model: 'text-embedding-3-small'
},
vectorStore: {
provider: 'chroma',
connectionString: 'http://localhost:8000',
indexName: 'production-kb'
}
});
# Check if set
echo $GROQ_API_KEY
# Set it
export GROQ_API_KEY=gsk_xxxxxxxxxxxx
Add delays between requests:
import { sleep } from 'groq-rag';
await sleep(1000); // Wait 1 second
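For repeated rate-limit errors, exponential backoff is a common pattern: double the delay after each failed attempt. A sketch (the `sleep` helper is reimplemented locally so the example stands alone; `withRetries` and `backoffDelay` are illustrative names, not part of the library):

```typescript
// Promise-based sleep, equivalent to the helper exported by groq-rag.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before retry attempt n (0-based): base * 2^n, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry fn up to maxRetries times on failure, backing off between attempts.
async function withRetries<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      await sleep(backoffDelay(attempt));
    }
  }
}
```

For example, `withRetries(() => client.complete({ ... }))` would retry a rate-limited call with 1 s, 2 s, and 4 s pauses before giving up.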
Check Groq’s model list for valid model names. You can also get the current list programmatically:
curl https://api.groq.com/openai/v1/models \
-H "Authorization: Bearer $GROQ_API_KEY"