Google (Gemini)

OpenAI-compatible interface for Google's Gemini models

The llm-polyglot library provides comprehensive support for Google's Gemini API, including:

  • Standard chat completions with OpenAI-compatible interface
  • Streaming chat completions with delta updates
  • Function/tool calling with automatic schema conversion
  • Context caching for token optimization (requires paid API key)
  • Grounding support with Google Search integration
  • Safety settings and model generation config
  • Session management for stateful conversations
  • Automatic response transformation with source attribution

Installation

bun add llm-polyglot openai @google/generative-ai

Basic Usage

import { createLLMClient } from "llm-polyglot";

const client = createLLMClient({ provider: "google" });
 
// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});
 
// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the latest AI developments?" }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});
 
// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});
 
// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
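
Streaming uses the same OpenAI-style call; here is a minimal sketch, assuming the standard stream: true flag and async-iterable delta chunks from the OpenAI-compatible interface:

// Streaming completion with delta updates (stream flag assumed from the OpenAI-compatible interface)
const stream = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Write a haiku about the ocean" }],
  stream: true,
  max_tokens: 1000
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}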

Advanced Features

Context Caching

Context Caching helps reduce token usage by caching context with a TTL:

// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});
 
// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});
 
// Cache management
await client.cacheManager.list(); // List all caches
await client.cacheManager.get(cacheName); // Get specific cache
await client.cacheManager.update(cacheName, params); // Update cache
await client.cacheManager.delete(cacheName); // Delete cache

Function/Tool Calling

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});
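
The forced tool call comes back in the OpenAI response shape; a minimal sketch of reading it (field names assumed from that shape rather than documented here):

// Read the tool call from the OpenAI-shaped response (tool_calls field assumed)
const toolCall = completion.choices[0]?.message?.tool_calls?.[0];
if (toolCall?.function) {
  // In the OpenAI shape, arguments is a JSON string
  const args = JSON.parse(toolCall.function.arguments);
  console.log(toolCall.function.name, args.sentiment);
}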

Grounding with Google Search

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the best restaurants in Boston?" }],
  groundingThreshold: 0.7, // 0.0 to 1.0, higher means more reliance on search
  max_tokens: 1000
});
 
// Response includes grounding metadata
console.log(completion.choices[0].grounding_metadata);
// {
//   search_queries: string[],
//   sources: Array<{ url: string, title: string }>,
//   search_suggestion_html: string,
//   supports: Array<{
//     text: string,
//     sources: Array<{ url: string, title: string }>,
//     confidence: number[]
//   }>
// }
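
For example, to print source attribution from a grounded response (field names taken from the metadata shape above):

// Walk the grounding sources returned with the completion
const meta = completion.choices[0].grounding_metadata;
for (const source of meta?.sources ?? []) {
  console.log(`${source.title}: ${source.url}`);
}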

Safety Settings & Generation Config

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    // Safety settings
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }],
    // Generation config
    modelGenerationConfig: {
      temperature: 0.9,
      topK: 40,
      topP: 0.8,
      maxOutputTokens: 200
    }
  }
});

Session Management

// First message in session
const msg1 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "My name is Alice" }],
  additionalProperties: { sessionId: "user-123" }
});
 
// Second message in same session (maintains context)
const msg2 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What's my name?" }],
  additionalProperties: { sessionId: "user-123" }
});
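
Responses come back in the usual OpenAI shape, so you can check the shared context directly:

// msg2 is answered using the history accumulated under sessionId "user-123"
console.log(msg2.choices[0].message.content);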

OpenAI Compatibility Mode

While Gemini does offer OpenAI compatibility, we recommend using our native integration for better type safety and feature support. However, if you prefer the OpenAI-compatible endpoint:

const client = createLLMClient({
  provider: "openai",
  apiKey: "gemini_api_key",
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});
 
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash",
  messages: [{ role: "user", content: "Hello!" }]
});

Note that the OpenAI compatibility mode has some limitations around structured output and images.
