Gemini Compatibility Guide

Use the Google Gemini API format with Ask Sage — powered by Vertex AI

Table of contents
  1. What’s New?
  2. Gemini-Compatible Endpoints
    1. Generate Content
  3. Available Models
  4. How It Works
  5. Supported Features

What’s New?

Ask Sage now supports the Google Gemini API format. Key benefits:

Easy Migration

Switch from Google AI Studio or Vertex AI to Ask Sage with minimal code changes

Standard Format

Use the same Gemini API format and patterns you already know

Regional Failover

Automatic regional failover across multiple US regions for high availability


Gemini-Compatible Endpoints

Generate Content

POST https://api.asksage.ai/server/google/v1/models/{model}:generateContent

The main endpoint for generating content using Gemini models. Requests are routed to Google Vertex AI with automatic regional failover.

Authentication: Use a Bearer token in the Authorization header.

Request Parameters

Authorization string (header) Required

Bearer token authentication: Bearer YOUR_API_KEY

model string (URL path) Required

The Gemini model to use. Supports multiple naming formats:

  • Simple names: flash, pro
  • Standard: gemini-2.5-pro, gemini-2.5-flash
  • Preview versions: gemini-2.5-pro-preview-05-06, gemini-2.5-flash-preview-04-17
  • With prefix: models/gemini-2.5-pro
  • Vertex format: publishers/google/models/gemini-2.5-pro

contents array Required

Array of content objects with roles and parts

  • role: user or model (note: assistant is automatically mapped to model)
  • parts: Array of part objects containing text, inlineData, functionCall, or functionResponse

systemInstruction object Optional

System-level instructions for the model. Object with role and parts fields.

generationConfig object Optional

Generation configuration:

  • temperature: Controls randomness (0.0–2.0)
  • topP: Nucleus sampling parameter
  • topK: Top-k sampling parameter
  • maxOutputTokens: Maximum tokens in the response
  • candidateCount: Number of candidates to generate
  • stopSequences: Sequences that stop generation
  • responseMimeType: Response format (e.g., application/json)
  • responseSchema: JSON schema for structured output

tools array Optional

Array of tool objects containing functionDeclarations for function calling

safetySettings array Optional

Safety filter configuration per category
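
The optional parameters above can be combined in a single request body. The following is a minimal sketch, not an official client: the helper name build_request_body is hypothetical, and the category and threshold strings are the enum values used by the Google Gemini API, on the assumption that Ask Sage forwards them unchanged.

```python
import json

# Harm categories and thresholds as named in the Google Gemini API.
# Assumption: Ask Sage forwards these values unchanged to Vertex AI.
SAFETY_SETTINGS = [
    {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
]


def build_request_body(prompt, temperature=None, max_output_tokens=None,
                       safety_settings=None):
    """Assemble a generateContent body, omitting optional fields left unset."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

    generation_config = {}
    if temperature is not None:
        # This guide documents a temperature range of 0.0 to 2.0.
        if not 0.0 <= temperature <= 2.0:
            raise ValueError("temperature must be between 0.0 and 2.0")
        generation_config["temperature"] = temperature
    if max_output_tokens is not None:
        generation_config["maxOutputTokens"] = max_output_tokens
    if generation_config:
        body["generationConfig"] = generation_config

    if safety_settings:
        body["safetySettings"] = safety_settings
    return body


body = build_request_body("What is Ask Sage?", temperature=0.3,
                          safety_settings=SAFETY_SETTINGS)
print(json.dumps(body, indent=2))
```

Leaving optional fields out entirely, rather than sending null values, keeps the body close to the curl examples in this guide.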

Response Details

200 Success

Response Structure:

  • candidates: Array of candidate responses:
    • content: Response content with role and parts
    • finishReason: Reason for completion (STOP, MAX_TOKENS, SAFETY, etc.)
    • safetyRatings: Safety rating per category
  • usageMetadata: Token usage:
    • promptTokenCount: Input tokens
    • candidatesTokenCount: Output tokens
    • totalTokenCount: Total tokens
  • modelVersion: Model version used

400 Error

Invalid request format or parameters

401 Error

Authentication failure — invalid or missing API key

429 Error

Rate limit exceeded — automatic regional failover will be attempted
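
To make the response structure and status codes above concrete, here is a small sketch that reads the documented fields from an already-decoded response. The function names and the sample payload are illustrative assumptions; the sample is hand-written to match the documented shape, not captured from the service.

```python
def summarize_response(response):
    """Pull the first candidate's text, finish reason, and token usage."""
    candidates = response.get("candidates", [])
    if not candidates:
        raise ValueError("response contains no candidates")
    first = candidates[0]
    parts = first.get("content", {}).get("parts", [])
    # Join only the text parts; functionCall parts carry no "text" key.
    text = "".join(part["text"] for part in parts if "text" in part)
    usage = response.get("usageMetadata", {})
    return {
        "text": text,
        "finish_reason": first.get("finishReason"),
        "total_tokens": usage.get("totalTokenCount"),
    }


def describe_status(status_code):
    """Map the HTTP status codes documented above to short descriptions."""
    return {
        200: "success",
        400: "invalid request format or parameters",
        401: "invalid or missing API key",
        429: "rate limit exceeded",
    }.get(status_code, "unexpected status")


# Hand-written sample shaped like the documented response structure.
sample = {
    "candidates": [{
        "content": {"role": "model",
                    "parts": [{"text": "Hello"}, {"text": " world"}]},
        "finishReason": "STOP",
    }],
    "usageMetadata": {"promptTokenCount": 4, "candidatesTokenCount": 2,
                      "totalTokenCount": 6},
    "modelVersion": "gemini-2.5-flash",
}
summary = summarize_response(sample)
print(summary["text"], summary["finish_reason"], describe_status(200))
```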

Example: Basic Text Generation

curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is Ask Sage?"}]
      }
    ]
  }'
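
For readers working in Python, the same request can be sketched with the standard library alone. This is a rough equivalent of the curl command above, not an official SDK: make_request is a hypothetical helper, and the base URL is the one shown in this guide, which may differ in your environment. The request is only built here; the commented lines show how it might be sent.

```python
import json
import urllib.request

# Assumption: the base URL shown in this guide; it may vary by environment.
BASE_URL = "https://api.asksage.ai/server/google/v1"


def make_request(model, body, api_key, base_url=BASE_URL):
    """Build (but do not send) a POST request for the generateContent endpoint."""
    url = f"{base_url}/models/{model}:generateContent"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {"contents": [{"role": "user", "parts": [{"text": "What is Ask Sage?"}]}]}
request = make_request("gemini-2.5-flash", body, "YOUR_API_KEY")

# To send it (requires network access and a valid API key):
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)
```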

Example: Multi-Turn Conversation with System Instruction

curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-pro:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "systemInstruction": {
      "role": "user",
      "parts": [{"text": "You are a helpful cybersecurity assistant."}]
    },
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is zero trust architecture?"}]
      },
      {
        "role": "model",
        "parts": [{"text": "Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources."}]
      },
      {
        "role": "user",
        "parts": [{"text": "How does it apply to cloud environments?"}]
      }
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 2048
    }
  }'
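
Because each request must carry the whole conversation, client code usually keeps the history itself. The sketch below shows one way to do that; the Conversation class is hypothetical, and it follows the request shape in the example above, including the "role": "user" field on systemInstruction.

```python
class Conversation:
    """Keeps user/model turns in the Gemini `contents` form."""

    def __init__(self, system_text=None):
        self.system_text = system_text
        self.contents = []

    def add(self, role, text):
        """Append one turn; only the two native Gemini roles are accepted here."""
        if role not in ("user", "model"):
            raise ValueError("role must be 'user' or 'model'")
        self.contents.append({"role": role, "parts": [{"text": text}]})

    def to_body(self, **generation_config):
        """Return a request body containing the full history so far."""
        body = {"contents": list(self.contents)}
        if self.system_text:
            body["systemInstruction"] = {
                "role": "user",
                "parts": [{"text": self.system_text}],
            }
        if generation_config:
            body["generationConfig"] = generation_config
        return body


chat = Conversation("You are a helpful cybersecurity assistant.")
chat.add("user", "What is zero trust architecture?")
chat.add("model", "Zero trust requires every request to be authenticated and authorized.")
chat.add("user", "How does it apply to cloud environments?")
body = chat.to_body(temperature=0.3, maxOutputTokens=2048)
```

After each reply, appending the returned text with the model role and the next question with the user role keeps the context intact for the following request.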

Example: Function Calling

Tip: Gemini uses functionDeclarations within the tools array for function calling.

curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What'\''s the weather like in San Francisco?"}]
      }
    ],
    "tools": [
      {
        "functionDeclarations": [
          {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {
                  "type": "string",
                  "description": "The city and state, e.g. San Francisco, CA"
                },
                "unit": {
                  "type": "string",
                  "enum": ["celsius", "fahrenheit"]
                }
              },
              "required": ["location"]
            }
          }
        ]
      }
    ]
  }'
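
The request above only declares the function; when the model decides to use it, the reply contains a functionCall part that the client is expected to run and answer with a functionResponse part. The following sketch illustrates that second half under stated assumptions: get_weather is a stand-in returning fixed data, answer_function_calls is a hypothetical helper, and sending the result back under the user role is an assumption based on the two roles this guide lists.

```python
def get_weather(location, unit="celsius"):
    """Stand-in implementation; a real one would query a weather service."""
    return {"location": location, "temperature": 18, "unit": unit}


# Local functions keyed by the names used in functionDeclarations.
LOCAL_FUNCTIONS = {"get_weather": get_weather}


def answer_function_calls(model_content):
    """Run each functionCall part and return a content object with the results."""
    response_parts = []
    for part in model_content.get("parts", []):
        call = part.get("functionCall")
        if call is None:
            continue  # plain text parts need no answer
        func = LOCAL_FUNCTIONS[call["name"]]
        result = func(**call.get("args", {}))
        response_parts.append(
            {"functionResponse": {"name": call["name"], "response": result}}
        )
    # Assumption: function results are sent back under the "user" role.
    return {"role": "user", "parts": response_parts}


# Hand-written example of a model turn that requests a function call.
model_turn = {
    "role": "model",
    "parts": [{"functionCall": {"name": "get_weather",
                                "args": {"location": "San Francisco, CA"}}}],
}
reply = answer_function_calls(model_turn)
```

Appending both model_turn and reply to contents and sending the request again would let the model phrase its final answer from the function result.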

Example: Structured JSON Output

curl -X POST 'https://api.asksage.ai/server/google/v1/models/gemini-2.5-flash:generateContent' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "List three cloud security best practices"}]
      }
    ],
    "generationConfig": {
      "responseMimeType": "application/json",
      "responseSchema": {
        "type": "object",
        "properties": {
          "practices": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"}
              }
            }
          }
        }
      }
    }
  }'
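
With responseMimeType set to application/json, the structured result is expected to arrive as a JSON string inside the candidate's text part, so the client still has to decode it. A minimal sketch of that step follows; parse_practices is a hypothetical helper, and the sample response is hand-written to match the schema in the request above.

```python
import json


def parse_practices(response):
    """Decode the JSON text of the first candidate and check its basic shape."""
    text = response["candidates"][0]["content"]["parts"][0]["text"]
    data = json.loads(text)
    practices = data.get("practices", [])
    for item in practices:
        # Each entry should carry the two string fields named in the schema.
        if not isinstance(item.get("name"), str):
            raise ValueError("practice is missing a string 'name'")
        if not isinstance(item.get("description"), str):
            raise ValueError("practice is missing a string 'description'")
    return practices


# Hand-written sample; real output depends on the model.
sample = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"text": json.dumps({"practices": [
                {"name": "Least privilege",
                 "description": "Grant only the access each role needs."},
            ]})}],
        },
        "finishReason": "STOP",
    }]
}
practices = parse_practices(sample)
print(practices[0]["name"])
```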

Available Models

Model             Provider          Capability
gemini-2.5-flash  Google Vertex AI  Default: fast and efficient for most tasks
gemini-2.5-pro    Google Vertex AI  Most capable: complex reasoning and analysis

Intelligent Fallback: If the requested model is unavailable on your account, Ask Sage automatically falls back to the best available model in the same capability tier.

Regional Failover: On rate limit or capacity errors, requests automatically fail over across 7 US regions (us-central1, us-east1, us-east4, us-east5, us-south1, us-west1, us-west4) for high availability.
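
Regional failover happens on the Ask Sage side, so clients do not need to implement it. Purely to illustrate the idea, the sketch below tries a list of regions in turn and returns the first success. It is not Ask Sage's implementation: the order of the regions, the RegionUnavailable exception, and the fake_send function are all assumptions made for the example.

```python
# Regions named in this guide; the order in which they are tried is an assumption.
REGIONS = ["us-central1", "us-east1", "us-east4", "us-east5",
           "us-south1", "us-west1", "us-west4"]


class RegionUnavailable(Exception):
    """Hypothetical signal for a rate-limit or capacity error in one region."""


def call_with_failover(send, regions=REGIONS):
    """Try each region in order and return the first successful result."""
    errors = []
    for region in regions:
        try:
            return region, send(region)
        except RegionUnavailable as exc:
            errors.append((region, str(exc)))
    raise RuntimeError(f"all regions failed: {errors}")


def fake_send(region):
    """Simulated backend: pretend the first two regions are rate limited."""
    if region in ("us-central1", "us-east1"):
        raise RegionUnavailable("429 rate limit")
    return {"served_by": region}


region, result = call_with_failover(fake_send)
print(region)  # us-east4
```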

How It Works

Ask Sage's google/v1/* endpoints follow the Google Gemini API specification, making integration seamless.

Request Flow

  1. Your application sends a request to https://api.asksage.ai/server/google/v1/models/{model}:generateContent
  2. Ask Sage validates your authentication token
  3. The model name is resolved and the request is routed to Google Vertex AI
  4. If a region hits rate limits, the request automatically retries in the next available region
  5. The response is returned in standard Gemini API format

Key Difference: Instead of using Google's Vertex AI endpoint directly, use Ask Sage's base URL (https://api.asksage.ai/server/google/v1) with Bearer token authentication.
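
Step 3 of the request flow resolves the model name from any of the accepted naming formats. How Ask Sage does this internally is not documented here; the sketch below only shows one plausible reading, in which known prefixes are stripped and the short names flash and pro are treated as aliases for the two models listed in this guide. Both the alias mapping and the helper names are assumptions.

```python
# Prefixes from the naming formats listed in this guide, longest first.
PREFIXES = ("publishers/google/models/", "models/")

# Assumption: the short names map to the two models listed in this guide.
ALIASES = {"flash": "gemini-2.5-flash", "pro": "gemini-2.5-pro"}


def resolve_model(name):
    """Reduce any accepted naming format to a plain model identifier."""
    for prefix in PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    return ALIASES.get(name, name)


def endpoint_url(model, base_url="https://api.asksage.ai/server/google/v1"):
    """Build the generateContent URL for a model given in any format."""
    return f"{base_url}/models/{resolve_model(model)}:generateContent"


for raw in ("flash", "models/gemini-2.5-pro",
            "publishers/google/models/gemini-2.5-pro",
            "gemini-2.5-flash-preview-04-17"):
    print(raw, "->", resolve_model(raw))
```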

Role Mapping

Ask Sage automatically handles role normalization for convenience:

Input Role  Mapped To          Notes
user        user               No change
model       model              No change
assistant   model              Automatically mapped for OpenAI compatibility
system      systemInstruction  Extracted and merged into the system instruction field

Full Compatibility: Use the same request structure, parameters, and response formats you're already familiar with from the Google Gemini API.
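
The role mapping above is applied by Ask Sage, but it can be helpful to see the equivalent transformation spelled out. The following sketch mirrors the table for a list of (role, text) pairs. The function name is hypothetical, and joining several system messages with newlines is an assumption about what "merged" means.

```python
def normalize_messages(messages):
    """Split (role, text) pairs into Gemini contents and a systemInstruction."""
    role_map = {"user": "user", "model": "model", "assistant": "model"}
    contents, system_texts = [], []
    for role, text in messages:
        if role == "system":
            # System messages leave the turn list and feed systemInstruction.
            system_texts.append(text)
        elif role in role_map:
            contents.append({"role": role_map[role], "parts": [{"text": text}]})
        else:
            raise ValueError(f"unsupported role: {role}")
    body = {"contents": contents}
    if system_texts:
        # Assumption: multiple system messages are merged with newlines.
        body["systemInstruction"] = {
            "role": "user",
            "parts": [{"text": "\n".join(system_texts)}],
        }
    return body


body = normalize_messages([
    ("system", "Be brief."),
    ("user", "Hi"),
    ("assistant", "Hello"),
    ("user", "What is Ask Sage?"),
])
print([turn["role"] for turn in body["contents"]])  # ['user', 'model', 'user']
```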

Supported Features

Text Generation

Generate text with configurable temperature, top-p, and top-k

Function Calling

Define function declarations and let Gemini call them

System Instructions

Set system-level instructions to guide model behavior

Multi-Turn Conversations

Maintain context across multiple user/model exchanges

Structured Output

Get JSON responses conforming to a specified schema

Safety Settings

Configure safety filters per harm category


Important Note: Base URLs may vary depending on your environment. For assistance, please contact us at support@asksage.ai


Copyright © 2026 Ask Sage Inc. All Rights Reserved. Ask Sage is a BigBear.ai company.