We support Google's Gemini API, including:
- standard chat completions
- streaming chat completions
- function calling
- context caching to reduce repeated token usage (requires a paid API key)
## Installation

## Context Caching
Context caching is a Gemini-specific feature that cuts down on duplicate token usage. You create a cache with a TTL and load it with context you've already obtained elsewhere; the model can then draw on that context without you resending it on every request.
To use context caching, create a cache before you call `generate`, via `googleClient.cacheManager.create({})`, like so:
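A minimal sketch of what that might look like. The `googleClient` name and the exact shape of the options object (`model`, `contents`, `ttlSeconds`) are assumptions modeled on Google's cache manager API, not a guaranteed signature of this library:

```typescript
// Hedged sketch: `googleClient` and the option names below are assumptions
// based on Google's Gemini cache manager; check this library's types for
// the exact shape. A cache needs a model, the contents to cache, and a TTL.

// Small helper so the TTL reads clearly.
const minutes = (n: number): number => n * 60;

const cacheOptions = {
  model: "models/gemini-1.5-flash-001",
  contents: [
    {
      role: "user",
      parts: [{ text: "<your large, reusable context here>" }],
    },
  ],
  ttlSeconds: minutes(5), // cache expires after 5 minutes
};

// With a paid API key, creating the cache and generating against it
// might look like this (not executed here):
//
//   const cache = await googleClient.cacheManager.create(cacheOptions);
//   const result = await googleClient.generate({
//     model: cacheOptions.model,
//     cachedContent: cache.name,
//     prompt: "Summarize the cached context.",
//   });

console.log(cacheOptions.ttlSeconds); // 300
```

Once the cache exists, you reference it by name on subsequent calls instead of resending the cached contents, which is where the token savings come from.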
Gemini does offer OpenAI compatibility for its Node client, but because that layer is in beta and has limitations around structured output and images, we don't use it directly in this library.