Get up and running with groq-rag in 5 minutes.
groq-rag includes all functions from the official groq-sdk. You can use it as a drop-in replacement with additional RAG, web browsing, and agent capabilities.
npm install groq-rag
# Add to your .npmrc
echo "@mithun50:registry=https://npm.pkg.github.com" >> .npmrc
# Install
npm install @mithun50/groq-rag
# Option 1: Environment variable
export GROQ_API_KEY=gsk_xxxxxxxxxxxx
# Option 2: .env file
echo "GROQ_API_KEY=gsk_xxxxxxxxxxxx" > .env
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
const response = await client.complete({
model: 'llama-3.3-70b-versatile',
messages: [{ role: 'user', content: 'Explain quantum computing in simple terms' }]
});
console.log(response.choices[0].message.content);
Answer questions using your own documents.
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
// 1. Initialize RAG
await client.initRAG();
// 2. Add your documents
await client.rag.addDocument(`
Acme Corp was founded in 2020.
We build AI-powered healthcare solutions.
Our flagship product is MedAssist.
`);
// 3. Ask questions about your documents
const response = await client.chat.withRAG({
messages: [{ role: 'user', content: 'When was Acme Corp founded?' }]
});
console.log(response.content);
// → "Acme Corp was founded in 2020."
Key methods:
| Method | Description |
|--------|-------------|
| `client.rag.addDocument(text)` | Add text to the knowledge base |
| `client.rag.addUrl(url)` | Add webpage content |
| `client.chat.withRAG({...})` | Chat using the knowledge base |
Get real-time information from the internet.
const response = await client.chat.withWebSearch({
messages: [{ role: 'user', content: 'What happened in tech news today?' }],
maxResults: 5
});
console.log(response.content);
console.log('Sources:', response.sources.map(s => s.url));
Key methods:
| Method | Description |
|--------|-------------|
| `client.web.search(query)` | Search the web |
| `client.web.fetch(url)` | Fetch and parse a URL |
| `client.chat.withWebSearch({...})` | Chat with web search |
Create AI that reasons and uses tools autonomously.
// Create agent with built-in tools
const agent = await client.createAgentWithBuiltins({
model: 'llama-3.3-70b-versatile',
verbose: true // See reasoning steps
});
// Agent automatically chooses which tools to use
const result = await agent.run(
'Search for the latest SpaceX launch and summarize it'
);
console.log(result.output);
console.log('Tools used:', result.toolCalls.map(t => t.name));
Built-in tools:
| Tool | What it does |
|------|--------------|
| `web_search` | Search the internet |
| `fetch_url` | Read webpage content |
| `calculator` | Math operations |
| `get_datetime` | Current date/time |
| `rag_query` | Query the knowledge base (added if RAG is initialized) |
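As an illustration of how a built-in tool like `calculator` might be wired up, here is a sketch only: the `Tool` interface and handler shape below are assumptions for illustration, not the library's actual internals.

```typescript
// Hypothetical tool shape: a name, a description the model sees,
// and a handler the agent invokes with parsed arguments.
interface Tool {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => Promise<string>;
}

// A minimal calculator handler supporting "a op b" expressions.
const calculator: Tool = {
  name: 'calculator',
  description: 'Evaluate a simple arithmetic expression such as "6 * 7"',
  handler: async (args) => {
    const match = String(args.expression).match(
      /^\s*(-?\d+(?:\.\d+)?)\s*([+\-*/])\s*(-?\d+(?:\.\d+)?)\s*$/
    );
    if (!match) throw new Error('Unsupported expression');
    const [, a, op, b] = match;
    const ops: Record<string, (x: number, y: number) => number> = {
      '+': (x, y) => x + y,
      '-': (x, y) => x - y,
      '*': (x, y) => x * y,
      '/': (x, y) => x / y,
    };
    return String(ops[op](Number(a), Number(b)));
  },
};
```

The agent decides when to call a tool from its description, so descriptions should state exactly what input the handler accepts.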
import fs from 'node:fs';
import GroqRAG from 'groq-rag';
const client = new GroqRAG();
await client.initRAG();
// Load your documents
await client.rag.addDocument(fs.readFileSync('docs/faq.txt', 'utf-8'));
await client.rag.addDocument(fs.readFileSync('docs/manual.txt', 'utf-8'));
await client.rag.addUrl('https://docs.example.com/guide');
// Answer questions
async function askQuestion(question: string) {
const response = await client.chat.withRAG({
messages: [{ role: 'user', content: question }],
topK: 5,
minScore: 0.5
});
return response.content;
}
console.log(await askQuestion('How do I reset my password?'));
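`topK` caps how many chunks are retrieved and `minScore` filters out weak matches. Assuming the score is cosine similarity between embeddings (an assumption; the library may score differently), the retrieval step looks roughly like:

```typescript
// Cosine similarity between two embedding vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep the topK chunks whose similarity to the query meets minScore.
function retrieve(
  queryEmbedding: number[],
  chunks: { text: string; embedding: number[] }[],
  topK: number,
  minScore: number
): string[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosineSimilarity(queryEmbedding, c.embedding) }))
    .filter((c) => c.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((c) => c.text);
}
```

Raising `minScore` trades recall for precision: fewer off-topic chunks reach the model, but borderline matches are dropped.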
const agent = await client.createAgentWithBuiltins({
model: 'llama-3.3-70b-versatile',
systemPrompt: `You are a research assistant. When asked about a topic:
1. Search for current information
2. Fetch relevant articles
3. Synthesize findings into a clear summary
Always cite your sources.`
});
const result = await agent.run('Research the current state of fusion energy');
console.log(result.output);
const agent = await client.createAgentWithBuiltins();
for await (const event of agent.runStream('Explain machine learning')) {
switch (event.type) {
case 'content':
process.stdout.write(event.data as string);
break;
case 'tool_call':
console.log('\n[Using tool...]');
break;
case 'done':
console.log('\n[Complete]');
break;
}
}
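The stream yields tagged events, which can be modeled in TypeScript as a discriminated union. The field names below mirror the loop above, but the exact event types are an assumption:

```typescript
// Discriminated union matching the event.type values used above.
type AgentEvent =
  | { type: 'content'; data: string }
  | { type: 'tool_call'; data: { name: string } }
  | { type: 'done' };

// Narrowing on `type` gives each branch the right `data` shape,
// so the string cast in the loop above becomes unnecessary.
function describe(event: AgentEvent): string {
  switch (event.type) {
    case 'content':
      return event.data;
    case 'tool_call':
      return `[tool: ${event.data.name}]`;
    case 'done':
      return '[complete]';
  }
}
```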
groq-rag supports all Groq models. Here are recommended models by use case:
| Model | Speed | Best For |
|-------|-------|----------|
| `llama-3.3-70b-versatile` | 280 T/s | General purpose, highest quality |
| `llama-3.1-8b-instant` | 560 T/s | Fast responses, cost-effective |
| `openai/gpt-oss-120b` | 500 T/s | Complex reasoning with tools |
| `qwen/qwen3-32b` | — | Math & reasoning tasks |
| `deepseek-r1-distill-qwen-32b` | 140 T/s | Math (94% MATH-500), code |
| `meta-llama/llama-4-scout-17b-16e-instruct` | — | Vision + text tasks |
| `groq/compound` | 450 T/s | Built-in web search & code execution |
📚 See Groq Models Documentation for the complete list.
await client.initRAG({
// Embedding provider
embedding: {
provider: 'groq', // 'groq' (free) or 'openai' (better quality)
},
// Vector store
vectorStore: {
provider: 'memory', // 'memory' (dev) or 'chroma' (production)
},
// How to split documents
chunking: {
strategy: 'recursive',
chunkSize: 1000,
chunkOverlap: 200
}
});
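To picture what `chunkSize` and `chunkOverlap` do, here is a simplified character-based splitter. This is a sketch only; the library's `recursive` strategy also splits on separators such as paragraphs and sentences before falling back to characters:

```typescript
// Split text into chunks of at most chunkSize characters,
// where consecutive chunks share chunkOverlap characters.
function chunkText(text: string, chunkSize: number, chunkOverlap: number): string[] {
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap; // how far the window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap means a sentence falling on a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.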
await client.initRAG({
embedding: {
provider: 'openai',
apiKey: process.env.OPENAI_API_KEY,
model: 'text-embedding-3-small'
},
vectorStore: {
provider: 'chroma',
connectionString: 'http://localhost:8000',
indexName: 'production-kb'
}
});
# Check if set
echo $GROQ_API_KEY
# Set it
export GROQ_API_KEY=gsk_xxxxxxxxxxxx
Add delays between requests:
import { sleep } from 'groq-rag';
await sleep(1000); // Wait 1 second
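For repeated rate-limit errors, exponential backoff is a common pattern: double the delay after each failed attempt. A sketch (the `sleep` helper is reimplemented locally so the example stands alone; `withRetries` and `backoffDelay` are illustrative names, not part of the library):

```typescript
// Promise-based sleep, equivalent to the helper exported by groq-rag.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Delay before retry attempt n (0-based): base * 2^n, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry fn up to maxRetries times on failure, backing off between attempts.
async function withRetries<T>(fn: () => Promise<T>, maxRetries = 3): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      await sleep(backoffDelay(attempt));
    }
  }
}
```

For example, `withRetries(() => client.complete({ ... }))` would retry a rate-limited call with 1 s, 2 s, and 4 s pauses before giving up.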
Check Groq’s model list for valid model names. You can also get the current list programmatically:
curl https://api.groq.com/openai/v1/models \
-H "Authorization: Bearer $GROQ_API_KEY"