Gemini Compatibility Guide

Use the Google Gemini API format with Ask Sage — powered by Vertex AI

Table of contents

What’s New?
Gemini-Compatible Endpoints
1. Generate Content
Available Models
How It Works
Supported Features

What’s New?

Ask Sage now supports the Google Gemini API format. Key benefits:

Easy Migration

Switch from Google AI Studio or Vertex AI to Ask Sage with minimal code changes

Standard Format

Use the same Gemini API format and patterns you already know

Regional Failover

Automatic regional failover across multiple US regions for high availability

Gemini-Compatible Endpoints

Generate Content

POST https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent

The main endpoint for generating content using Gemini models. Requests are routed to Google Vertex AI with automatic regional failover.

Authentication: Use YOUR_API_KEY in the x-access-tokens header

Request Parameters

x-access-tokens string (header) Required

Your Ask Sage API key (shown as YOUR_API_KEY in the examples), used to authenticate the request

model string (URL path) Required

The Gemini model to use. Supports multiple naming formats:

Simple names: flash, pro
Standard: gemini-2.5-pro, gemini-2.5-flash
Preview versions: gemini-2.5-pro-preview-05-06, gemini-2.5-flash-preview-04-17
With prefix: models/gemini-2.5-pro
Vertex format: publishers/google/models/gemini-2.5-pro
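The naming formats above all refer to the same model, so a client that logs or compares model names may want to reduce them to one form. The following is a minimal sketch of that idea; the helper names `strip_model_prefix` and `build_generate_url` are hypothetical, and stripping the prefix client-side is optional, since the endpoint is documented to accept every format listed.

```python
BASE_URL = "https://api.asksage.ai/server/google/v1beta"

# Prefixes taken from the naming formats listed in this guide.
KNOWN_PREFIXES = ("publishers/google/models/", "models/")


def strip_model_prefix(model: str) -> str:
    """Return the bare model name, dropping a 'models/' or Vertex-style prefix."""
    for prefix in KNOWN_PREFIXES:
        if model.startswith(prefix):
            return model[len(prefix):]
    return model


def build_generate_url(model: str, base_url: str = BASE_URL) -> str:
    """Build the generateContent URL for a model given in any listed format."""
    return f"{base_url}/models/{strip_model_prefix(model)}:generateContent"
```

Simple names such as flash and pro are passed through unchanged here, because how they resolve to a concrete model is decided by Ask Sage rather than by the client.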

contents array Required

Array of content objects with roles and parts

role: user or model (note: assistant is automatically mapped to model)
parts: Array of part objects containing text, inlineData, functionCall, or functionResponse

systemInstruction object Optional

System-level instructions for the model. Object with role and parts fields.

generationConfig object Optional

Generation configuration:

temperature: Controls randomness (0.0–2.0)
topP: Nucleus sampling parameter
topK: Top-k sampling parameter
maxOutputTokens: Maximum tokens in the response
candidateCount: Number of candidates to generate
stopSequences: Sequences that stop generation
responseMimeType: Response format (e.g., application/json)
responseSchema: JSON schema for structured output
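As an illustration of the fields above, here is a small sketch that assembles a generationConfig object and checks the documented temperature range before the request is sent. The function name `make_generation_config` and its keyword arguments are hypothetical; only the JSON field names come from this guide.

```python
def make_generation_config(temperature=None, top_p=None, top_k=None,
                           max_output_tokens=None, stop_sequences=None):
    """Build a generationConfig dict, omitting any option left unset."""
    if temperature is not None and not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    config = {
        "temperature": temperature,
        "topP": top_p,
        "topK": top_k,
        "maxOutputTokens": max_output_tokens,
        "stopSequences": stop_sequences,
    }
    # Drop unset options so defaults are left to the service.
    return {key: value for key, value in config.items() if value is not None}
```

The result can be placed under the generationConfig key of the request body.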

tools array Optional

Array of tool objects containing functionDeclarations for function calling

safetySettings array Optional

Safety filter configuration per category
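A sketch of what a safetySettings array might look like is shown below. The category and threshold strings are the ones used by Google's public Gemini API; whether Ask Sage accepts every one of them is an assumption, and the helper name `make_safety_settings` is hypothetical.

```python
# Harm categories as named in Google's public Gemini API.
HARM_CATEGORIES = [
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
]


def make_safety_settings(threshold="BLOCK_MEDIUM_AND_ABOVE"):
    """Return one safety setting per category, all using the same threshold."""
    return [{"category": category, "threshold": threshold}
            for category in HARM_CATEGORIES]
```

The returned list would go under the safetySettings key of the request body.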

Response Details

200 Success

Response Structure:

candidates: Array of candidate responses:
- content: Response content with role and parts
- finishReason: Reason for completion (STOP, MAX_TOKENS, SAFETY, etc.)
- safetyRatings: Safety rating per category
usageMetadata: Token usage:
- promptTokenCount: Input tokens
- candidatesTokenCount: Output tokens
- totalTokenCount: Total tokens
modelVersion: Model version used
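To show how the response structure above might be consumed, the following sketch pulls the generated text and the token counts out of a parsed response. The helper names are hypothetical, and `sample_response` is a hand-written illustration of the documented shape, not captured output.

```python
def extract_text(response_json):
    """Join the text parts of the first candidate; return '' if there is none."""
    candidates = response_json.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def summarize_usage(response_json):
    """Return prompt, output, and total token counts, defaulting to 0."""
    usage = response_json.get("usageMetadata", {})
    return {
        "prompt": usage.get("promptTokenCount", 0),
        "output": usage.get("candidatesTokenCount", 0),
        "total": usage.get("totalTokenCount", 0),
    }


# Hand-written example following the documented fields.
sample_response = {
    "candidates": [{
        "content": {"role": "model", "parts": [{"text": "Hello"}, {"text": " there"}]},
        "finishReason": "STOP",
    }],
    "usageMetadata": {"promptTokenCount": 5, "candidatesTokenCount": 2, "totalTokenCount": 7},
}
```

Checking finishReason before using the text is worthwhile, since a value such as MAX_TOKENS or SAFETY means the output may be truncated or empty.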

400 Error

Invalid request format or parameters

401 Error

Authentication failure — invalid or missing API key

429 Error

Rate limit exceeded — automatic regional failover will be attempted
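One way a client might react to these status codes is sketched below: fail immediately on 400 and 401, and retry with exponential backoff if a 429 still reaches the caller after failover. The names `AskSageError` and `post_with_retry`, the retry count, and the delays are all assumptions. The `send` argument is any zero-argument callable returning a `(status_code, parsed_body)` pair, which keeps the sketch independent of a particular HTTP library.

```python
import time


class AskSageError(Exception):
    """Hypothetical error type for failed requests."""


def post_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only on HTTP 429."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:
            return body
        if status == 400:
            raise AskSageError("Invalid request format or parameters")
        if status == 401:
            raise AskSageError("Authentication failure: invalid or missing API key")
        if status == 429 and attempt < max_attempts:
            # Back off 1x, 2x, 4x ... base_delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
            continue
        raise AskSageError(f"Request failed with status {status}")
    raise AskSageError("No attempts were made")
```

With the requests library, `send` could wrap a `requests.post(...)` call and return `(response.status_code, response.json())`.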

Example: Basic Text Generation

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is Ask Sage?"}]
      }
    ]
  }'

import requests

model = 'gemini-2.5-flash'
url = f'https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent'
headers = {
    'x-access-tokens': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}
data = {
    'contents': [
        {
            'role': 'user',
            'parts': [{'text': 'What is Ask Sage?'}]
        }
    ]
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

const model = 'gemini-2.5-flash';
const response = await fetch(
  `https://api.asksage.ai/server/google/v1beta/models/${model}:generateContent`,
  {
    method: 'POST',
    headers: {
      'x-access-tokens': 'YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{text: 'What is Ask Sage?'}]
        }
      ]
    })
  }
);

const data = await response.json();
console.log(data);

Example: Multi-Turn Conversation with System Instruction

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-pro:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "systemInstruction": {
      "role": "user",
      "parts": [{"text": "You are a helpful cybersecurity assistant."}]
    },
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is zero trust architecture?"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources."}]
      },
      {
        "role": "user",
        "parts": [{"text": "How does it apply to cloud environments?"}]
      }
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 2048
    }
  }'

import requests

model = 'gemini-2.5-pro'
url = f'https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent'
headers = {
    'x-access-tokens': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}
data = {
    'systemInstruction': {
        'role': 'user',
        'parts': [{'text': 'You are a helpful cybersecurity assistant.'}]
    },
    'contents': [
        {
            'role': 'user',
            'parts': [{'text': 'What is zero trust architecture?'}]
        },
        {
            'role': 'model',
            'parts': [{'text': 'Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources.'}]
        },
        {
            'role': 'user',
            'parts': [{'text': 'How does it apply to cloud environments?'}]
        }
    ],
    'generationConfig': {
        'temperature': 0.3,
        'maxOutputTokens': 2048
    }
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

const model = 'gemini-2.5-pro';
const response = await fetch(
  `https://api.asksage.ai/server/google/v1beta/models/${model}:generateContent`,
  {
    method: 'POST',
    headers: {
      'x-access-tokens': 'YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      systemInstruction: {
        role: 'user',
        parts: [{text: 'You are a helpful cybersecurity assistant.'}]
      },
      contents: [
        {
          role: 'user',
          parts: [{text: 'What is zero trust architecture?'}]
        },
        {
          role: 'model',
          parts: [{text: 'Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources.'}]
        },
        {
          role: 'user',
          parts: [{text: 'How does it apply to cloud environments?'}]
        }
      ],
      generationConfig: {
        temperature: 0.3,
        maxOutputTokens: 2048
      }
    })
  }
);

const data = await response.json();
console.log(data);

Example: Function Calling

Tip: Gemini uses functionDeclarations within the tools array for function calling

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What'\''s the weather like in San Francisco?"}]
      }
    ],
    "tools": [
      {
        "functionDeclarations": [
          {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": ["celsius", "fahrenheit"]
                }
              },
              "required": ["location"]
            }
          }
        ]
      }
    ]
  }'

import requests

model = 'gemini-2.5-flash'
url = f'https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent'
headers = {
    'x-access-tokens': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}
data = {
    'contents': [
        {
            'role': 'user',
            'parts': [{'text': "What's the weather like in San Francisco?"}]
        }
    ],
    'tools': [
        {
            'functionDeclarations': [
                {
                    'name': 'get_weather',
                    'description': 'Get the current weather in a location',
                    'parameters': {
                        'type': 'object',
                        'properties': {
                            'location': {
                                'type': 'string',
                                'description': 'The city and state, e.g. San Francisco, CA'
                            },
                            'unit': {
                                'type': 'string',
                                'enum': ['celsius', 'fahrenheit']
                            }
                        },
                        'required': ['location']
                    }
                }
            ]
        }
    ]
}

response = requests.post(url, headers=headers, json=data)
print(response.json())

const model = 'gemini-2.5-flash';
const response = await fetch(
  `https://api.asksage.ai/server/google/v1beta/models/${model}:generateContent`,
  {
    method: 'POST',
    headers: {
      'x-access-tokens': 'YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      contents: [
        {
          role: 'user',
          parts: [{text: "What's the weather like in San Francisco?"}]
        }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_weather',
              description: 'Get the current weather in a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'The city and state, e.g. San Francisco, CA'
                  },
                  unit: {
                    type: 'string',
                    enum: ['celsius', 'fahrenheit']
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    })
  }
);

const data = await response.json();
console.log(data);
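The requests above only declare the function. When the model decides to call it, the response contains a functionCall part, and the application is expected to run the function and send the result back as a functionResponse part. The sketch below shows one way to build that follow-up turn. It assumes the functionCall shape (`name` and `args`) of Google's public Gemini API; `get_weather` returns canned data, and all helper names are hypothetical.

```python
def get_weather(location, unit="celsius"):
    """Hypothetical local tool; returns canned data, not a real forecast."""
    return {"location": location, "temperature": 18, "unit": unit}


TOOLS = {"get_weather": get_weather}


def find_function_calls(response_json):
    """Collect functionCall parts from the first candidate."""
    candidates = response_json.get("candidates") or []
    if not candidates:
        return []
    parts = candidates[0].get("content", {}).get("parts", [])
    return [part["functionCall"] for part in parts if "functionCall" in part]


def build_follow_up_contents(contents, response_json, tools=TOOLS):
    """Append the model's call and the local results to the conversation."""
    calls = find_function_calls(response_json)
    if not calls:
        return contents
    result_parts = []
    for call in calls:
        result = tools[call["name"]](**call.get("args", {}))
        result_parts.append(
            {"functionResponse": {"name": call["name"], "response": result}}
        )
    return contents + [
        response_json["candidates"][0]["content"],  # the model turn with the call
        {"role": "user", "parts": result_parts},    # our results for the model
    ]
```

Sending the returned list as contents in a second request, with the same tools array, lets the model turn the function result into a natural-language answer.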

Example: Structured JSON Output

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "List three cloud security best practices"}]
      }
    ],
    "generationConfig": {
      "responseMimeType": "application/json",
      "responseSchema": {
        "type": "object",
        "properties": {
          "practices": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"}
              }
            }
          }
        }
      }
    }
  }'

import requests

model = 'gemini-2.5-flash'
url = f'https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent'
headers = {
    'x-access-tokens': 'YOUR_API_KEY',
    'Content-Type': 'application/json'
}
data = {
    'contents': [
        {
            'role': 'user',
            'parts': [{'text': 'List three cloud security best practices'}]
        }
    ],
    'generationConfig': {
        'responseMimeType': 'application/json',
        'responseSchema': {
            'type': 'object',
            'properties': {
                'practices': {
                    'type': 'array',
                    'items': {
                        'type': 'object',
                        'properties': {
                            'name': {'type': 'string'},
                            'description': {'type': 'string'}
                        }
                    }
                }
            }
        }
    }
}

response = requests.post(url, headers=headers, json=data)
print(response.json())
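With responseMimeType set to application/json, the structured result is expected to arrive as a JSON string inside the text part of the candidate, so it needs a second decoding step. The sketch below assumes that behavior, which matches Google's public Gemini API; the helper name `parse_structured_output` is hypothetical and the sample is hand-written.

```python
import json


def parse_structured_output(response_json):
    """Decode the JSON document carried in the first candidate's text parts."""
    parts = response_json["candidates"][0]["content"]["parts"]
    text = "".join(part.get("text", "") for part in parts)
    return json.loads(text)


# Hand-written example matching the schema used in the request above.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"text": '{"practices": [{"name": "MFA", "description": "Require multi-factor authentication"}]}'}],
        },
        "finishReason": "STOP",
    }]
}
```

If finishReason is MAX_TOKENS, the text may be cut off mid-document and `json.loads` will raise `json.JSONDecodeError`, so callers may want to handle that case.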

Available Models

Model	Provider	Capability
`gemini-2.5-flash`	Google Vertex AI	Default — fast and efficient for most tasks
`gemini-2.5-pro`	Google Vertex AI	Most capable — complex reasoning and analysis

Intelligent Fallback: If the requested model is unavailable on your account, Ask Sage automatically falls back to the best available model in the same capability tier.

Regional Failover: Requests automatically fail over across 7 US regions (us-central1, us-east1, us-east4, us-east5, us-south1, us-west1, us-west4) for high availability on rate limit or capacity errors.
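The failover described above happens inside Ask Sage, so clients do not need to implement it. Purely to illustrate the idea, the sketch below tries each listed region in turn and moves on when a region reports a rate limit. The try order, the use of HTTP 429 as the only trigger, and the names `call_with_failover` and `call_region` are assumptions, not Ask Sage's actual implementation.

```python
# Regions as listed in this guide; the try order here is an assumption.
REGIONS = [
    "us-central1", "us-east1", "us-east4", "us-east5",
    "us-south1", "us-west1", "us-west4",
]


def call_with_failover(call_region, regions=REGIONS):
    """Try each region until one returns something other than HTTP 429.

    call_region(region) must return a (status_code, body) pair.
    """
    for region in regions:
        status, body = call_region(region)
        if status != 429:
            return region, status, body
    raise RuntimeError("All regions are rate limited")
```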

How It Works

Ask Sage's google/v1beta/* endpoints follow the Google Gemini API specification, so existing Gemini client code needs only a new base URL and authentication header.

Request Flow

1. Your application sends a request to https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent
2. Ask Sage validates your authentication token
3. The model name is resolved and the request is routed to Google Vertex AI
4. If a region hits rate limits, the request automatically retries in the next available region
5. The response is returned in standard Gemini API format

Key Difference: Instead of using Google's Vertex AI endpoint directly, use Ask Sage's base URL (https://api.asksage.ai/server/google/v1beta) with x-access-tokens header authentication
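Because the base URL and the header are the only parts that differ from a direct Gemini call, it can help to keep both in configuration. The sketch below reads them from environment variables; the variable names `ASKSAGE_BASE_URL` and `ASKSAGE_API_KEY` and the helper `prepare_request` are hypothetical.

```python
import os

DEFAULT_BASE_URL = "https://api.asksage.ai/server/google/v1beta"


def prepare_request(model, contents, environ=None):
    """Return url, headers, and json body for a generateContent call."""
    env = os.environ if environ is None else environ
    # Hypothetical variable names; a trailing slash on the base URL is tolerated.
    base_url = env.get("ASKSAGE_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    api_key = env.get("ASKSAGE_API_KEY", "YOUR_API_KEY")
    return {
        "url": f"{base_url}/models/{model}:generateContent",
        "headers": {
            "x-access-tokens": api_key,
            "Content-Type": "application/json",
        },
        "json": {"contents": contents},
    }
```

The returned dictionary uses the keyword names accepted by `requests.post`, so it can be passed as `requests.post(**prepare_request(model, contents))`.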

Role Mapping

Ask Sage automatically handles role normalization for convenience:

Input Role	Mapped To	Notes
`user`	`user`	No change
`model`	`model`	No change
`assistant`	`model`	Automatically mapped for OpenAI compatibility
`system`	`systemInstruction`	Extracted and merged into the system instruction field
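The mapping in the table is applied by Ask Sage, so clients can send any of these roles as-is. For readers who want to see the effect, here is a client-side sketch of an equivalent normalization. The function name `normalize_contents` is hypothetical, and using role `user` on the systemInstruction object simply mirrors the examples in this guide.

```python
ROLE_MAP = {"user": "user", "model": "model", "assistant": "model"}


def normalize_contents(contents):
    """Split system turns into systemInstruction and map the remaining roles."""
    system_parts = []
    normalized = []
    for item in contents:
        role = item["role"]
        if role == "system":
            # System turns are merged into a single system instruction.
            system_parts.extend(item["parts"])
        elif role in ROLE_MAP:
            normalized.append({"role": ROLE_MAP[role], "parts": item["parts"]})
        else:
            raise ValueError(f"Unsupported role: {role}")
    body = {"contents": normalized}
    if system_parts:
        body["systemInstruction"] = {"role": "user", "parts": system_parts}
    return body
```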

Full Compatibility: Use the same request structure, parameters, and response formats you're already familiar with from the Google Gemini API

Supported Features

Text Generation

Generate text with configurable temperature, top-p, and top-k

Function Calling

Define function declarations and let Gemini call them

System Instructions

Set system-level instructions to guide model behavior

Multi-Turn Conversations

Maintain context across multiple user/model exchanges

Structured Output

Get JSON responses conforming to a specified schema

Safety Settings

Configure safety filters per harm category

Important Note: Base URLs may vary depending on your environment. For assistance, please contact us at support@asksage.ai