Gemini Compatibility Guide

Use the Google Gemini API format with Ask Sage — powered by Vertex AI

Table of Contents
  1. What’s New?
  2. Generate Content
    1. Request Parameters
    2. Model Naming Formats
    3. Response
    4. Example: Basic Text Generation
    5. Example: Multi-Turn Conversation with System Instruction
    6. Example: Function Calling
    7. Example: Structured JSON Output
  3. Available Models
  4. How It Works
    1. Role Mapping
  5. Supported Features

What’s New?


Generate Content

POST https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent

The main endpoint for generating content using Gemini models. Requests are routed to Google Vertex AI with automatic regional failover.

Authentication: Pass your API key (shown as YOUR_API_KEY in the examples) in the x-access-tokens header.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| x-access-tokens | string (header) | Required | Authentication: YOUR_API_KEY |
| model | string (URL path) | Required | Gemini model; see naming formats below |
| contents | array | Required | Array of content objects with role (user/model) and parts (text/inlineData/functionCall/functionResponse) |
| systemInstruction | object | Optional | System-level instructions: object with role and parts fields |
| generationConfig | object | Optional | temperature, topP, topK, maxOutputTokens, candidateCount, stopSequences, responseMimeType, responseSchema |
| tools | array | Optional | Array of tool objects containing functionDeclarations for function calling |
| safetySettings | array | Optional | Safety filter configuration per category |
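
As a rough illustration of how the parameters above fit together, the following Python sketch assembles one request using only the standard library. The helper name build_generate_request and its argument names are assumptions made for this example, not part of any Ask Sage SDK, and the request is built but not sent.

```python
import json
import urllib.request

# Base URL taken from this guide; it may differ in your environment.
BASE_URL = "https://api.asksage.ai/server/google/v1beta"


def build_generate_request(api_key, model, contents, system_text=None,
                           generation_config=None, tools=None,
                           safety_settings=None):
    """Assemble a generateContent request (hypothetical helper; nothing is sent)."""
    body = {"contents": contents}  # the only required body field
    if system_text is not None:
        body["systemInstruction"] = {"role": "user", "parts": [{"text": system_text}]}
    if generation_config:
        body["generationConfig"] = generation_config
    if tools:
        body["tools"] = tools
    if safety_settings:
        body["safetySettings"] = safety_settings
    return urllib.request.Request(
        url=f"{BASE_URL}/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-access-tokens": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request(
    api_key="YOUR_API_KEY",
    model="gemini-2.5-flash",
    contents=[{"role": "user", "parts": [{"text": "What is Ask Sage?"}]}],
    generation_config={"temperature": 0.3, "maxOutputTokens": 2048},
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```

Optional fields are left out of the body entirely when unset, so the smallest valid request carries only contents.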

Model Naming Formats

The {model} segment of the URL path accepts several naming formats:

  • Simple names: flash, pro
  • Standard: gemini-2.5-pro, gemini-2.5-flash
  • Preview versions: gemini-2.5-pro-preview-05-06, gemini-2.5-flash-preview-04-17
  • With prefix: models/gemini-2.5-pro
  • Vertex format: publishers/google/models/gemini-2.5-pro
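
The prefixed formats above differ from the standard names only by a leading path. If you want one consistent key on the client side (for logging or caching, say), a small sketch like the one below strips those prefixes. The function name is hypothetical, and it deliberately leaves simple names such as flash untouched, because this guide does not state which full model they resolve to; the server performs the real resolution.

```python
# Prefixes listed in this guide, longest first so the Vertex form matches before "models/".
KNOWN_PREFIXES = ("publishers/google/models/", "models/")


def bare_model_name(model):
    """Return the model name without a 'models/' or Vertex-style prefix."""
    for prefix in KNOWN_PREFIXES:
        if model.startswith(prefix):
            return model[len(prefix):]
    return model  # simple and standard names pass through unchanged


names = [
    "publishers/google/models/gemini-2.5-pro",
    "models/gemini-2.5-pro",
    "gemini-2.5-pro",
    "flash",
]
normalized = [bare_model_name(name) for name in names]
```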

Response

| Status | Result | Description |
| --- | --- | --- |
| 200 | Success | Returns candidates (with content, finishReason, safetyRatings), usageMetadata (promptTokenCount, candidatesTokenCount, totalTokenCount), and modelVersion |
| 400 | Error | Invalid request format or parameters |
| 401 | Error | Authentication failure: invalid or missing API key |
| 429 | Error | Rate limit exceeded; automatic regional failover will be attempted |
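
To make the 200 response shape concrete, here is a minimal Python sketch that pulls the commonly used fields out of a decoded response. The function name and the sample payload are made up for illustration; only the field names come from the table above.

```python
def summarize_response(response):
    """Extract text, finish reason, and token usage from a decoded 200 response."""
    candidate = response["candidates"][0]
    parts = candidate.get("content", {}).get("parts", [])
    # A candidate may hold several parts; join the text-bearing ones.
    text = "".join(part.get("text", "") for part in parts)
    return {
        "text": text,
        "finish_reason": candidate.get("finishReason"),
        "total_tokens": response.get("usageMetadata", {}).get("totalTokenCount"),
        "model_version": response.get("modelVersion"),
    }


# Illustrative payload only; real responses carry model-generated values.
sample = {
    "candidates": [{
        "content": {"role": "model", "parts": [{"text": "Hello"}]},
        "finishReason": "STOP",
        "safetyRatings": [],
    }],
    "usageMetadata": {"promptTokenCount": 4, "candidatesTokenCount": 1, "totalTokenCount": 5},
    "modelVersion": "gemini-2.5-flash",
}
summary = summarize_response(sample)
```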

Example: Basic Text Generation

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {
        "role": "user",
        "parts": [{"text": "What is Ask Sage?"}]
      }
    ]
  }'

Example: Multi-Turn Conversation with System Instruction

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-pro:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "systemInstruction": {
      "role": "user",
      "parts": [{"text": "You are a helpful cybersecurity assistant."}]
    },
    "contents": [
      {"role": "user", "parts": [{"text": "What is zero trust architecture?"}]},
      {"role": "model", "parts": [{"text": "Zero trust is a security framework that requires all users to be authenticated and authorized before accessing resources."}]},
      {"role": "user", "parts": [{"text": "How does it apply to cloud environments?"}]}
    ],
    "generationConfig": {
      "temperature": 0.3,
      "maxOutputTokens": 2048
    }
  }'
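
The curl example above resends the whole exchange on every call. One way to manage that in application code is a small history object, sketched below in Python. The Conversation class is a hypothetical helper written for this guide, not an Ask Sage or Google API.

```python
class Conversation:
    """Accumulates user/model turns and renders the next request body."""

    def __init__(self, system_text=None):
        self.system_text = system_text
        self.contents = []

    def add(self, role, text):
        if role not in ("user", "model"):
            raise ValueError(f"unsupported role: {role}")
        self.contents.append({"role": role, "parts": [{"text": text}]})

    def to_body(self, generation_config=None):
        """Return the JSON body for the next generateContent call."""
        body = {"contents": list(self.contents)}
        if self.system_text:
            body["systemInstruction"] = {
                "role": "user",
                "parts": [{"text": self.system_text}],
            }
        if generation_config:
            body["generationConfig"] = generation_config
        return body


chat = Conversation("You are a helpful cybersecurity assistant.")
chat.add("user", "What is zero trust architecture?")
chat.add("model", "(text returned by the previous call)")
chat.add("user", "How does it apply to cloud environments?")
body = chat.to_body({"temperature": 0.3, "maxOutputTokens": 2048})
```

After each response, append the model's reply with add("model", ...) before adding the next user turn, so the history stays in order.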

Example: Function Calling

Tip: Gemini uses functionDeclarations within the tools array for function calling.

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "What'\''s the weather like in San Francisco?"}]}
    ],
    "tools": [
      {
        "functionDeclarations": [
          {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
              "type": "object",
              "properties": {
                "location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
              },
              "required": ["location"]
            }
          }
        ]
      }
    ]
  }'
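
The request above only declares the function. When the model decides to call it, the response carries a functionCall part, and your code is expected to run the function and send the result back as a functionResponse part. The Python sketch below shows one way to build that follow-up contents array. The get_weather stub, the dispatch table, the sample response, and the choice of the user role for the function result are all assumptions made for illustration.

```python
def get_weather(location, unit="celsius"):
    """Stand-in implementation; a real one would query a weather service."""
    return {"location": location, "unit": unit, "conditions": "placeholder"}


LOCAL_FUNCTIONS = {"get_weather": get_weather}  # name -> callable


def build_follow_up(contents, response):
    """Append the model's turn and any function results to the history."""
    model_content = response["candidates"][0]["content"]
    new_contents = list(contents) + [model_content]
    result_parts = []
    for part in model_content.get("parts", []):
        call = part.get("functionCall")
        if call is None:
            continue  # plain text part, nothing to execute
        result = LOCAL_FUNCTIONS[call["name"]](**call.get("args", {}))
        result_parts.append(
            {"functionResponse": {"name": call["name"], "response": result}}
        )
    if result_parts:
        new_contents.append({"role": "user", "parts": result_parts})
    return new_contents


# Illustrative response containing a function call.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"functionCall": {"name": "get_weather",
                                        "args": {"location": "San Francisco, CA"}}}],
        },
        "finishReason": "STOP",
    }]
}
contents = [{"role": "user",
             "parts": [{"text": "What's the weather like in San Francisco?"}]}]
follow_up = build_follow_up(contents, sample_response)
```

Sending follow_up as the contents of a second request (with the same tools array) lets the model turn the function result into a natural-language answer.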

Example: Structured JSON Output

curl -X POST 'https://api.asksage.ai/server/google/v1beta/models/gemini-2.5-flash:generateContent' \
  -H 'x-access-tokens: YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "contents": [
      {"role": "user", "parts": [{"text": "List three cloud security best practices"}]}
    ],
    "generationConfig": {
      "responseMimeType": "application/json",
      "responseSchema": {
        "type": "object",
        "properties": {
          "practices": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": {"type": "string"},
                "description": {"type": "string"}
              }
            }
          }
        }
      }
    }
  }'
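
With responseMimeType set to application/json, the structured result still arrives as a JSON string inside the candidate's text part, so it needs one more decoding step. A minimal Python sketch follows; the function name and the sample payload are illustrative assumptions.

```python
import json


def parse_structured(response):
    """Decode the JSON document carried in the first candidate's text parts."""
    parts = response["candidates"][0]["content"]["parts"]
    raw = "".join(part.get("text", "") for part in parts)
    return json.loads(raw)  # raises json.JSONDecodeError if the text is not valid JSON


# Illustrative payload shaped like the schema in the request above.
sample_response = {
    "candidates": [{
        "content": {
            "role": "model",
            "parts": [{"text": '{"practices": [{"name": "Example", "description": "Placeholder"}]}'}],
        },
        "finishReason": "STOP",
    }]
}
data = parse_structured(sample_response)
first_name = data["practices"][0]["name"]
```

Checking finishReason before parsing is worthwhile, since output cut off by maxOutputTokens may not be complete JSON.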

Available Models

Supported Gemini Models

| Model | Provider | Capability |
| --- | --- | --- |
| gemini-2.5-flash | Google Vertex AI | Default; fast and efficient for most tasks |
| gemini-2.5-pro | Google Vertex AI | Most capable; complex reasoning and analysis |

Intelligent Fallback: If the requested model is unavailable on your account, Ask Sage automatically falls back to the best available model in the same capability tier.

Regional Failover: On rate-limit or capacity errors, requests automatically fail over across 7 US regions (us-central1, us-east1, us-east4, us-east5, us-south1, us-west1, us-west4) for high availability.

How It Works

Request Flow

Ask Sage's google/v1beta/* endpoints follow the Google Gemini API specification, so existing Gemini request code can be pointed at Ask Sage with minimal changes.

  1. Your application sends a request to https://api.asksage.ai/server/google/v1beta/models/{model}:generateContent
  2. Ask Sage validates your authentication token
  3. The model name is resolved and the request is routed to Google Vertex AI
  4. If a region hits rate limits, the request automatically retries in the next available region
  5. The response is returned in standard Gemini API format

Key Difference: Instead of using Google's Vertex AI endpoint directly, use Ask Sage's base URL (https://api.asksage.ai/server/google/v1beta) with x-access-tokens header authentication.
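
Step 4 of the flow happens on Ask Sage's side, so clients do not need to implement it. Purely to illustrate the idea, the Python sketch below simulates a retry across regions with a stand-in sender. It is not Ask Sage's actual implementation; the region order, the RateLimited exception, and the function names are assumptions.

```python
# Regions named in this guide; the order in which they are tried is assumed.
REGIONS = ["us-central1", "us-east1", "us-east4", "us-east5",
           "us-south1", "us-west1", "us-west4"]


class RateLimited(Exception):
    """Raised by the stand-in sender when a region is rate limited."""


def send_with_failover(send, regions=REGIONS):
    """Try each region in turn and return the first successful result."""
    last_error = None
    for region in regions:
        try:
            return send(region)
        except RateLimited as error:
            last_error = error  # move on to the next region
    raise RuntimeError("all regions exhausted") from last_error


def fake_send(region):
    """Simulated backend: the first two regions are rate limited."""
    if region in ("us-central1", "us-east1"):
        raise RateLimited(region)
    return f"ok from {region}"


result = send_with_failover(fake_send)
```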

Role Mapping

Ask Sage automatically handles role normalization for convenience:

| Input Role | Mapped To | Notes |
| --- | --- | --- |
| user | user | No change |
| model | model | No change |
| assistant | model | Automatically mapped for OpenAI compatibility |
| system | systemInstruction | Extracted and merged into the system instruction field |

Full Compatibility: Use the same request structure, parameters, and response formats you're already familiar with from the Google Gemini API.
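
The mapping in the table can be pictured with the Python sketch below, which applies the same rules to a request body on the client side. It illustrates the documented behavior only; how Ask Sage actually merges system messages (ordering, the role placed on systemInstruction) is not specified here, so those details are assumptions.

```python
ROLE_MAP = {"user": "user", "model": "model", "assistant": "model"}


def normalize_roles(contents, system_instruction=None):
    """Map assistant -> model and move system messages into systemInstruction."""
    system_parts = list(system_instruction["parts"]) if system_instruction else []
    normalized = []
    for item in contents:
        if item["role"] == "system":
            system_parts.extend(item["parts"])  # merged, not kept as a turn
        else:
            normalized.append({"role": ROLE_MAP[item["role"]], "parts": item["parts"]})
    body = {"contents": normalized}
    if system_parts:
        body["systemInstruction"] = {"role": "user", "parts": system_parts}
    return body


openai_style = [
    {"role": "system", "parts": [{"text": "Answer briefly."}]},
    {"role": "user", "parts": [{"text": "What is zero trust?"}]},
    {"role": "assistant", "parts": [{"text": "A security model."}]},
    {"role": "user", "parts": [{"text": "Give one example."}]},
]
body = normalize_roles(openai_style)
```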

Supported Features

Feature Coverage

Text Generation: Generate text with configurable temperature, top-p, and top-k.
Function Calling: Define function declarations and let Gemini call them.
System Instructions: Set system-level instructions to guide model behavior.
Multi-Turn Conversations: Maintain context across multiple user/model exchanges.
Structured Output: Get JSON responses conforming to a specified schema.
Safety Settings: Configure safety filters per harm category.
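
As a small illustration of the text generation and safety settings features together, the Python sketch below builds a request body with sampling parameters and per-category filters. The category and threshold names are those of Google's public Gemini API; whether every value is accepted in your Ask Sage environment is an assumption, and the safety_setting helper is hypothetical.

```python
# Enum values from Google's public Gemini API.
HARM_CATEGORIES = {
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
}
THRESHOLDS = {
    "BLOCK_NONE",
    "BLOCK_ONLY_HIGH",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_LOW_AND_ABOVE",
}


def safety_setting(category, threshold):
    """Build one safetySettings entry, rejecting unknown names early."""
    if category not in HARM_CATEGORIES:
        raise ValueError(f"unknown harm category: {category}")
    if threshold not in THRESHOLDS:
        raise ValueError(f"unknown threshold: {threshold}")
    return {"category": category, "threshold": threshold}


body = {
    "contents": [{"role": "user", "parts": [{"text": "Summarize zero trust."}]}],
    "generationConfig": {"temperature": 0.3, "topP": 0.9, "topK": 40},
    "safetySettings": [
        safety_setting("HARM_CATEGORY_HARASSMENT", "BLOCK_MEDIUM_AND_ABOVE"),
        safety_setting("HARM_CATEGORY_DANGEROUS_CONTENT", "BLOCK_ONLY_HIGH"),
    ],
}
```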

Important Note: Base URLs may vary depending on your environment. For assistance, please contact us at support@asksage.ai.

Copyright © 2026 Ask Sage Inc. All Rights Reserved. Ask Sage is a BigBear.ai company.