# Google (Gemini)

OpenAI-compatible interface for Google's Gemini models.
The llm-polyglot library provides comprehensive support for Google's Gemini API, including:

- Standard chat completions with an OpenAI-compatible interface
- Streaming chat completions with delta updates
- Function/tool calling with automatic schema conversion
- Context caching for token optimization (requires a paid API key)
- Grounding support with Google Search integration
- Safety settings and model generation config
- Session management for stateful conversations
- Automatic response transformation with source attribution
## Installation

```bash
bun add llm-polyglot openai @google/generative-ai
```
## Basic Usage

```typescript
import { createLLMClient } from "llm-polyglot";

const client = createLLMClient({ provider: "google" });

// Standard completion
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Hello!" }],
  max_tokens: 1000
});

// With grounding (Google Search)
const groundedCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the latest AI developments?" }],
  groundingThreshold: 0.7,
  max_tokens: 1000
});

// With safety settings
const safeCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }]
  }
});

// With session management
const sessionCompletion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Remember this: I'm Alice" }],
  additionalProperties: {
    sessionId: "user-123"
  }
});
```
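Streaming is listed among the supported features above but not demonstrated; a minimal sketch, assuming the client mirrors the standard OpenAI `stream: true` contract and yields delta chunks:

```typescript
// Sketch only: assumes the OpenAI-compatible streaming contract,
// where stream: true returns an async iterable of delta chunks.
const stream = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Write a haiku about the ocean" }],
  max_tokens: 200,
  stream: true
});

for await (const chunk of stream) {
  // Each chunk carries an incremental delta, as in the OpenAI format
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```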
## Context Caching

Context caching helps reduce token usage by caching context with a TTL:
```typescript
// Create a cache
const cache = await client.cacheManager.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Context to cache" }],
  ttlSeconds: 3600 // Cache for 1 hour
});

// Use the cached context
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-8b",
  messages: [{ role: "user", content: "Follow-up question" }],
  additionalProperties: {
    cacheName: cache.name
  }
});

// Cache management
await client.cacheManager.list();                    // List all caches
await client.cacheManager.get(cacheName);            // Get specific cache
await client.cacheManager.update(cacheName, params); // Update cache
await client.cacheManager.delete(cacheName);         // Delete cache
```
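Caches expire once their TTL elapses. A minimal lifecycle sketch, assuming `update` accepts the same `ttlSeconds` field used at creation (the exact params shape is not documented here):

```typescript
// Assumption: update() takes a ttlSeconds field mirroring
// the one accepted by cacheManager.create() above.
await client.cacheManager.update(cache.name, {
  ttlSeconds: 3600 // extend the cache for another hour
});

// Drop the cache once the conversation is finished
await client.cacheManager.delete(cache.name);
```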
## Function Calling

Function/tool calling uses the standard OpenAI tools format; the library converts the JSON Schema parameters to Gemini's function declaration format automatically:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Analyze this data" }],
  tools: [{
    type: "function",
    function: {
      name: "analyze",
      parameters: {
        type: "object",
        properties: {
          sentiment: { type: "string" }
        }
      }
    }
  }],
  tool_choice: {
    type: "function",
    function: { name: "analyze" }
  }
});
```
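To consume the result, a short sketch assuming the response follows the standard OpenAI `tool_calls` shape, with `arguments` delivered as a JSON string:

```typescript
// Assumption: the OpenAI-compatible response shape, where tool calls
// live on choices[0].message.tool_calls and arguments are a JSON string.
const toolCall = completion.choices[0]?.message?.tool_calls?.[0];

if (toolCall?.function?.name === "analyze") {
  const args = JSON.parse(toolCall.function.arguments);
  console.log("sentiment:", args.sentiment);
}
```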
## Grounding with Google Search

Grounding lets the model back its answer with Google Search results; `groundingThreshold` controls how strongly the response relies on search:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What are the best restaurants in Boston?" }],
  groundingThreshold: 0.7, // 0.0 to 1.0; higher means more reliance on search
  max_tokens: 1000
});

// Response includes grounding metadata
console.log(completion.choices[0].grounding_metadata);
// {
//   search_queries: string[],
//   sources: Array<{ url: string, title: string }>,
//   search_suggestion_html: string,
//   supports: Array<{
//     text: string,
//     sources: Array<{ url: string, title: string }>,
//     confidence: number[]
//   }>
// }
```
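A small sketch of turning that metadata into a readable source list, using only the fields shown in the shape above:

```typescript
// Uses only the grounding_metadata fields documented above.
const metadata = completion.choices[0].grounding_metadata;

if (metadata) {
  console.log("Searched for:", metadata.search_queries.join(", "));
  for (const source of metadata.sources) {
    console.log(`- ${source.title}: ${source.url}`);
  }
}
```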
## Safety Settings & Generation Config

Gemini-specific options such as safety settings and generation config are passed through `additionalProperties`:

```typescript
const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "Tell me a story" }],
  additionalProperties: {
    // Safety settings
    safetySettings: [{
      category: "HARM_CATEGORY_HARASSMENT",
      threshold: "BLOCK_MEDIUM_AND_ABOVE"
    }],
    // Generation config
    modelGenerationConfig: {
      temperature: 0.9,
      topK: 40,
      topP: 0.8,
      maxOutputTokens: 200
    }
  }
});
```
## Session Management

Passing the same `sessionId` keeps conversation state across calls, so later messages can reference earlier ones without resending the full history:

```typescript
// First message in session
const msg1 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "My name is Alice" }],
  additionalProperties: { sessionId: "user-123" }
});

// Second message in same session (maintains context)
const msg2 = await client.chat.completions.create({
  model: "gemini-1.5-flash-latest",
  messages: [{ role: "user", content: "What's my name?" }],
  additionalProperties: { sessionId: "user-123" }
});
```
## OpenAI Compatibility Mode

While Gemini does offer OpenAI compatibility, we recommend using our native integration for better type safety and feature support. However, if you prefer the OpenAI-compatible endpoint:
```typescript
const client = createLLMClient({
  provider: "openai",
  apiKey: "gemini_api_key",
  baseURL: "https://generativelanguage.googleapis.com/v1beta/openai/"
});

const completion = await client.chat.completions.create({
  model: "gemini-1.5-flash",
  messages: [{ role: "user", content: "Hello!" }]
});
```
Note that the OpenAI compatibility mode has some limitations around structured output and images.