# Google (Gemini)

OpenAI-compatible interface for Google's Gemini models.
The llm-polyglot library provides comprehensive support for Google's Gemini API, including:

- Standard chat completions with an OpenAI-compatible interface
- Streaming chat completions with delta updates
- Function/tool calling with automatic schema conversion
- Context caching for token optimization (requires a paid API key)
- Grounding support with Google Search integration
- Safety settings and model generation config
- Session management for stateful conversations
- Automatic response transformation with source attribution
## Installation

```bash
bun add llm-polyglot openai @google/generative-ai
```
## Basic Usage

```typescript
import { createLLMClient } from "llm-polyglot";

const client = createLLMClient({ provider: "google" });

// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});

// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the latest AI developments?" }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});

// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});

// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
```
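Streaming is listed among the supported features above but not demonstrated; a minimal sketch, assuming the client mirrors the standard OpenAI `stream: true` contract and yields delta chunks:

```typescript
// Sketch only: assumes the OpenAI-compatible streaming contract,
// where stream: true returns an async iterable of delta chunks.
const stream = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Write a haiku about the ocean" }],
  max_tokens: 200,
  stream: true
});

for await (const chunk of stream) {
  // Each chunk carries an incremental delta, as in the OpenAI format
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```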
## Context Caching

Context caching helps reduce token usage by caching context with a TTL:
```typescript
// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});

// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});

// Cache management
await client.cacheManager.list();                    // List all caches
await client.cacheManager.get(cacheName);            // Get specific cache
await client.cacheManager.update(cacheName, params); // Update cache
await client.cacheManager.delete(cacheName);         // Delete cache
```
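Caches expire once their TTL elapses. A minimal lifecycle sketch, assuming `update` accepts the same `ttlSeconds` field used at creation (the exact params shape is not documented here):

```typescript
// Assumption: update() takes a ttlSeconds field mirroring
// the one accepted by cacheManager.create() above.
await client.cacheManager.update(cache.name, {
  ttlSeconds: 3600 // extend the cache for another hour
});

// Drop the cache once the conversation is finished
await client.cacheManager.delete(cache.name);
```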
## Function Calling

Function/tool calling uses the standard OpenAI tools format; the library converts the JSON Schema parameters to Gemini's function declaration format automatically:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});
```
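To consume the result, a short sketch assuming the response follows the standard OpenAI `tool_calls` shape, with `arguments` delivered as a JSON string:

```typescript
// Assumption: the OpenAI-compatible response shape, where tool calls
// live on choices[0].message.tool_calls and arguments are a JSON string.
const toolCall = completion.choices[0]?.message?.tool_calls?.[0];

if (toolCall?.function?.name === "analyze") {
  const args = JSON.parse(toolCall.function.arguments);
  console.log("sentiment:", args.sentiment);
}
```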
## Grounding with Google Search

Grounding lets the model back its answer with Google Search results; `groundingThreshold` controls how strongly the response relies on search:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the best restaurants in Boston?" }],
  groundingThreshold: 0.7, // 0.0 to 1.0; higher means more reliance on search
  max_tokens: 1000
});

// Response includes grounding metadata
console.log(completion.choices[0].grounding_metadata);
// {
//   search_queries: string[],
//   sources: Array<{ url: string, title: string }>,
//   search_suggestion_html: string,
//   supports: Array<{
//     text: string,
//     sources: Array<{ url: string, title: string }>,
//     confidence: number[]
//   }>
// }
```
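A small sketch of turning that metadata into a readable source list, using only the fields shown in the shape above:

```typescript
// Uses only the grounding_metadata fields documented above.
const metadata = completion.choices[0].grounding_metadata;

if (metadata) {
  console.log("Searched for:", metadata.search_queries.join(", "));
  for (const source of metadata.sources) {
    console.log(`- ${source.title}: ${source.url}`);
  }
}
```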
## Safety Settings & Generation Config

Gemini-specific options such as safety settings and generation config are passed through `additionalProperties`:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    // Safety settings
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }],
    // Generation config
    modelGenerationConfig: {
      temperature: 0.9,
      topK: 40,
      topP: 0.8,
      maxOutputTokens: 200
    }
  }
});
```
## Session Management

Passing the same `sessionId` keeps conversation state across calls, so later messages can reference earlier ones without resending the full history:

```typescript
// First message in session
const msg1 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "My name is Alice" }],
  additionalProperties: { sessionId: "user-123" }
});

// Second message in same session (maintains context)
const msg2 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What's my name?" }],
  additionalProperties: { sessionId: "user-123" }
});
```
## OpenAI Compatibility Mode

While Gemini does offer OpenAI compatibility, we recommend using our native integration for better type safety and feature support. However, if you prefer the OpenAI-compatible endpoint:
```typescript
const client = createLLMClient({
  provider: "openai",
  apiKey: "gemini_api_key",
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash",
  messages: [{ role: "user", content: "Hello!" }]
});
```
Note that the OpenAI compatibility mode has some limitations around structured output and images.