Groq

Overview

Groq runs on an LPU™ (Language Processing Unit) architecture, engineered specifically to deliver real-time AI inference with latency far below that of traditional GPU deployments.

Unmatched Speed

Hundreds of tokens per second for popular models like Llama 3.

Open Source Models

First-class support for Llama, Mixtral, and Gemma model variants.

Configuration

1. Get your API Key

   Sign up on the GroqCloud Console.

2. Environment Settings

   Add the credentials to your environment:

   GROQ_API_KEY="gsk_..."
   DEFAULT_MODEL="llama3-70b-8192"
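With the variables above set, a request can be built against Groq's OpenAI-compatible REST endpoint. This is a minimal sketch: the prompt, the fallback placeholder values, and the choice of Python are illustrative assumptions, and actually sending the request requires a valid key.

```python
import json
import os

# Read the credentials configured above; the fallbacks are
# illustrative placeholders, not working values.
api_key = os.environ.get("GROQ_API_KEY", "gsk_...")
model = os.environ.get("DEFAULT_MODEL", "llama3-70b-8192")

# Groq exposes an OpenAI-compatible chat completions endpoint.
url = "https://api.groq.com/openai/v1/chat/completions"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json",
}
payload = {
    "model": model,
    "messages": [{"role": "user", "content": "Hello!"}],
}

# Serialize the request body; POST this to `url` with any HTTP client.
body = json.dumps(payload)
```

Because the endpoint is OpenAI-compatible, existing OpenAI client libraries can typically be pointed at the Groq base URL unchanged.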
Because of its generation speed, Groq is especially well suited to streaming applications such as voice agents and real-time chatbots.
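For streaming, setting "stream": true makes the OpenAI-compatible endpoint emit server-sent events whose deltas carry content fragments. The sketch below assembles a reply from such chunks; the sample lines are fabricated for illustration, under the assumption that the response follows the standard OpenAI streaming shape.

```python
import json

def collect_stream_text(sse_lines):
    """Assemble the streamed reply from OpenAI-style SSE lines.

    Each `data: {...}` chunk carries a content delta; the stream
    ends with `data: [DONE]`.
    """
    parts = []
    for line in sse_lines:
        if not line.startswith("data: "):
            continue  # skip keep-alives and blank lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

# Illustrative chunks in the shape the endpoint streams back.
sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # Hello!
```

In a real application each fragment would be forwarded to the user as it arrives rather than buffered, which is where Groq's token throughput pays off.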