Gemini Compatibility Guide
Use the Google Gemini API format with Ask Sage — powered by Vertex AI
What’s New?
Ask Sage now supports the Google Gemini API format, making it easy to:
- Switch from Google AI Studio or Vertex AI to Ask Sage with minimal code changes
- Use the same Gemini API format and patterns you already know
- Get automatic regional failover across multiple US regions for high availability
Gemini-Compatible Endpoints
Generate Content
The main endpoint for generating content using Gemini models. Requests are routed to Google Vertex AI with automatic regional failover.
Authorization header
Bearer token authentication: Bearer YOUR_API_KEY
Request Parameters
model: The Gemini model to use. Supports multiple naming formats:
- Simple names: flash, pro
- Standard: gemini-2.5-pro, gemini-2.5-flash
- Preview versions: gemini-2.5-pro-preview-05-06, gemini-2.5-flash-preview-04-17
- With prefix: models/gemini-2.5-pro
- Vertex format: publishers/google/models/gemini-2.5-pro
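Ask Sage resolves these naming formats on the server, so clients do not need to normalize them. As an illustration only, the sketch below shows how the listed formats could reduce to one bare model ID. The function name and the alias table are hypothetical, and the versions that flash and pro resolve to are an assumption, since the guide does not state them.

```python
# Hypothetical alias table: the guide lists "flash" and "pro" as simple names
# but does not say which versions they resolve to; 2.5 is assumed here.
SIMPLE_NAMES = {"flash": "gemini-2.5-flash", "pro": "gemini-2.5-pro"}


def normalize_model_name(name: str) -> str:
    """Reduce any of the listed naming formats to a bare model ID."""
    # "models/..." and "publishers/google/models/..." both end in the model ID,
    # so keeping the last path segment strips either prefix.
    bare = name.strip().rsplit("/", 1)[-1]
    return SIMPLE_NAMES.get(bare, bare)


for example in (
    "flash",
    "models/gemini-2.5-pro",
    "publishers/google/models/gemini-2.5-pro",
):
    print(example, "->", normalize_model_name(example))
```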
contents: Array of content objects with roles and parts
- role: user or model (note: assistant is automatically mapped to model)
- parts: Array of part objects containing text, inlineData, functionCall, or functionResponse
systemInstruction: System-level instructions for the model. Object with role and parts fields.
generationConfig: Generation configuration:
- temperature: Controls randomness (0.0–2.0)
- topP: Nucleus sampling parameter
- topK: Top-k sampling parameter
- maxOutputTokens: Maximum tokens in the response
- candidateCount: Number of candidates to generate
- stopSequences: Sequences that stop generation
- responseMimeType: Response format (e.g., application/json)
- responseSchema: JSON schema for structured output
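A small helper can keep generation settings consistent across requests. The following is a minimal sketch, not part of any Ask Sage SDK: the function name and its keyword arguments are hypothetical, and only the JSON field names and the 0.0 to 2.0 temperature range come from the list above.

```python
import json


def build_generation_config(temperature=None, top_p=None, top_k=None,
                            max_output_tokens=None, stop_sequences=None,
                            response_schema=None):
    """Assemble a generationConfig object, omitting anything left unset."""
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    config = {
        "temperature": temperature,
        "topP": top_p,
        "topK": top_k,
        "maxOutputTokens": max_output_tokens,
        "stopSequences": stop_sequences,
    }
    if response_schema is not None:
        # Structured output: ask for JSON and attach the schema it must follow.
        config["responseMimeType"] = "application/json"
        config["responseSchema"] = response_schema
    # Drop unset fields so the API falls back to its own defaults for them.
    return {key: value for key, value in config.items() if value is not None}


print(json.dumps(build_generation_config(temperature=0.3, max_output_tokens=2048)))
```

Validating the temperature on the client is a design choice in this sketch; it surfaces an out-of-range value before a request is sent.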
tools: Array of tool objects containing functionDeclarations for function calling
safetySettings: Safety filter configuration per category
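When a request declares functions in tools, the model may answer with a functionCall part instead of text, and the application is expected to run the function and send back a functionResponse part. The sketch below shows one way to handle that turn. It assumes the standard Gemini part shapes (functionCall with name and args, functionResponse with name and response) and sends the result back under the user role. The get_weather implementation, the HANDLERS table, and the sample turn are hypothetical.

```python
def get_weather(location, unit="celsius"):
    """Hypothetical local implementation; returns canned data for illustration."""
    return {"location": location, "temperature": 18, "unit": unit}


# Maps declared function names to the local callables that implement them.
HANDLERS = {"get_weather": get_weather}


def answer_function_calls(model_content):
    """Run every functionCall part in a model turn and build the reply turn."""
    reply_parts = []
    for part in model_content.get("parts", []):
        call = part.get("functionCall")
        if call is None:
            continue  # plain text or another part type: nothing to execute
        result = HANDLERS[call["name"]](**call.get("args", {}))
        reply_parts.append(
            {"functionResponse": {"name": call["name"], "response": result}}
        )
    return {"role": "user", "parts": reply_parts}


# Hand-written sample of a model turn that requests a function call.
sample_turn = {
    "role": "model",
    "parts": [
        {"functionCall": {"name": "get_weather",
                          "args": {"location": "San Francisco, CA"}}}
    ],
}
print(answer_function_calls(sample_turn))
```

To continue the conversation, append both the model turn and the returned turn to contents and send the request again with the same tools array.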
Response Details
Response Structure:
- candidates: Array of candidate responses:
  - content: Response content with role and parts
  - finishReason: Reason for completion (STOP, MAX_TOKENS, SAFETY, etc.)
  - safetyRatings: Safety rating per category
- usageMetadata: Token usage:
  - promptTokenCount: Input tokens
  - candidatesTokenCount: Output tokens
  - totalTokenCount: Total tokens
- modelVersion: Model version used
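Reading a response usually means taking the first candidate, joining its text parts, and checking the token counts. A minimal sketch, assuming the response has already been decoded from JSON into a dictionary; the function name and the sample values are hypothetical, while the field names follow the structure above.

```python
def summarize_response(response):
    """Pull the generated text and usage figures out of a decoded response."""
    candidate = response["candidates"][0]
    # A candidate can carry several parts; only text parts are joined here.
    text = "".join(
        part.get("text", "") for part in candidate["content"]["parts"]
    )
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finish_reason": candidate.get("finishReason"),
        "total_tokens": usage.get("totalTokenCount"),
        "model_version": response.get("modelVersion"),
    }


# Hand-written sample shaped like the structure above (values are made up).
sample_response = {
    "candidates": [
        {
            "content": {"role": "model", "parts": [{"text": "Hello"}, {"text": " there"}]},
            "finishReason": "STOP",
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 5,
        "candidatesTokenCount": 2,
        "totalTokenCount": 7,
    },
    "modelVersion": "gemini-2.5-flash",
}
print(summarize_response(sample_response))
```

A finishReason other than STOP, such as MAX_TOKENS or SAFETY, is worth checking before trusting the text, since the output may be truncated or withheld.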
Error Responses:
- Invalid request format or parameters
- Authentication failure: invalid or missing API key
- Rate limit exceeded: automatic regional failover will be attempted
Example: Basic Text Generation
curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is Ask Sage?"}]
      }
    ]
  }'

Example: Multi-Turn Conversation with System Instruction
curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-pro:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "systemInstruction": {
      "role": "user",
      "parts": [{"text": "You are a helpful cybersecurity assistant."}]
    },
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is zero trust architecture?"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources."}]
      },
      {
        "role": "user",
        "parts": [{"text": "How does it apply to cloud environments?"}]
      }
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 2048
    }
  }'

Example: Function Calling
Use functionDeclarations within the tools array for function calling.
curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What'\''s the weather like in San Francisco?"}]
      }
    ],
    "tools": [
      {
        "functionDeclarations": [
          {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": ["celsius", "fahrenheit"]
                }
              },
              "required": ["location"]
            }
          }
        ]
      }
    ]
  }'

Example: Structured JSON Output
curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "List three cloud security best practices"}]
      }
    ],
    "generationConfig": {
      "responseMimeType": "application/json",
      "responseSchema": {
        "type": "object",
        "properties": {
          "practices": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"}
              }
            }
          }
        }
      }
    }
  }'

Available Models
| Model | Provider | Capability |
|---|---|---|
| gemini-2.5-flash | Google Vertex AI | Default: fast and efficient for most tasks |
| gemini-2.5-pro | Google Vertex AI | Most capable: complex reasoning and analysis |
How It Works
Ask Sage's google/v1/* endpoints follow the Google Gemini API specification, so requests and responses use the same JSON format as the Gemini API.
Request Flow
- Your application sends a request to https://api.asksage.ai/server/google/v1/models/{model}:generateContent
- Ask Sage validates your authentication token
- The model name is resolved and the request is routed to Google Vertex AI
- If a region hits rate limits, the request automatically retries in the next available region
- The response is returned in standard Gemini API format
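From the client's side, the flow above amounts to one authenticated POST. The sketch below mirrors the curl examples using only the Python standard library. The helper names are hypothetical, YOUR_API_KEY is a placeholder, and nothing is sent until generate_content is called.

```python
import json
import urllib.request

BASE_URL = "https://api.asksage.ai/server/google/v1"


def build_request(model, contents, api_key, generation_config=None):
    """Prepare a generateContent request without sending it."""
    body = {"contents": contents}
    if generation_config:
        body["generationConfig"] = generation_config
    return urllib.request.Request(
        f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate_content(request, timeout=60):
    """Send a prepared request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_request(
    "gemini-2.5-flash",
    [{"role": "user", "parts": [{"text": "What is Ask Sage?"}]}],
    api_key="YOUR_API_KEY",  # placeholder; substitute a real key before sending
)
print(request.get_method(), request.full_url)
```

Calling generate_content(request) with a valid key performs the actual request. Regional failover happens on the server, so this sketch does not add its own retry loop.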
Base URL: https://api.asksage.ai/server/google/v1, with Bearer token authentication

Role Mapping
Ask Sage automatically handles role normalization for convenience:
| Input Role | Mapped To | Notes |
|---|---|---|
| user | user | No change |
| model | model | No change |
| assistant | model | Automatically mapped for OpenAI compatibility |
| system | systemInstruction | Extracted and merged into the system instruction field |
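Ask Sage applies this mapping on the server, so OpenAI-style message lists can be sent as they are. The sketch below reproduces the table on the client purely to illustrate its effect. The function name is hypothetical, and merging several system messages by concatenating their parts is an assumption, since the guide does not say how the merge is done.

```python
def normalize_roles(messages):
    """Apply the role mapping table to a list of {"role", "parts"} messages."""
    contents, system_parts = [], []
    for message in messages:
        role = message["role"]
        if role == "system":
            # Pulled out of the conversation and merged into systemInstruction.
            system_parts.extend(message["parts"])
        elif role == "assistant":
            contents.append({"role": "model", "parts": message["parts"]})
        else:
            contents.append(message)  # user and model pass through unchanged
    body = {"contents": contents}
    if system_parts:
        # The role value mirrors the guide's own systemInstruction example.
        body["systemInstruction"] = {"role": "user", "parts": system_parts}
    return body


openai_style = [
    {"role": "system", "parts": [{"text": "You are a helpful cybersecurity assistant."}]},
    {"role": "user", "parts": [{"text": "What is zero trust architecture?"}]},
    {"role": "assistant", "parts": [{"text": "Zero trust is a security framework."}]},
]
print(normalize_roles(openai_style))
```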
Supported Features
- Generate text with configurable temperature, top-p, and top-k
- Define function declarations and let Gemini call them
- Set system-level instructions to guide model behavior
- Maintain context across multiple user/model exchanges
- Get JSON responses conforming to a specified schema
- Configure safety filters per harm category