VS Code Copilot BYOK Integration

VS Code GitHub Copilot Integration

Use Ask Sage models directly in VS Code Copilot Chat via Bring Your Own Key (BYOK)

Bring Ask Sage's models into Visual Studio Code through GitHub Copilot Chat's Bring Your Own Key (BYOK) language-model support. This integration uses VS Code's Custom Endpoint provider and works with the same Ask Sage API key you already use for other integrations.


Table of Contents
  1. At a Glance
  2. Prerequisites
  3. Step 1 — Add Ask Sage as a Custom Endpoint Provider
  4. Step 2 — Configure Models
    1. Option A — OpenAI Chat Completions
    2. Option B — OpenAI Responses (with reasoning)
    3. Option C — Anthropic Messages
    4. Option D — Combined (Chat Completions + Responses + Messages)
  5. Step 3 — Verify the Configuration
  6. Available Models by Environment
    1. Commercial (SaaS) Tenants
    2. Gov Tenants (FedRAMP / IL2–IL4)
    3. DoD Tenants (IL5 / IL6)
  7. Drop-in Configurations by Environment
    1. Commercial Drop-in
    2. Gov Drop-in
    3. DoD Drop-in
  8. Why Custom Endpoint (and not OpenAI / Anthropic vendor)
  9. Optional — Use Ask Sage for Copilot Utility Tasks
  10. Configuration Reference
  11. DoD and Managed Network Connectivity
  12. Troubleshooting
  13. Privacy and Data Usage
  14. Additional Resources

At a Glance

API key only — no Entra ID, no extra sign-in
Works in Commercial, Gov, and managed networks
Chat Completions, Responses, and Anthropic Messages in one provider group
Per-model reasoning effort, tool calling, and vision toggles

Prerequisites

Before you begin

  • Visual Studio Code 1.122 or later — the Custom Endpoint provider was added in this release
  • GitHub Copilot Chat enabled in VS Code
  • An Ask Sage API key (from your account settings)
  • Network access to the Ask Sage endpoint from your workstation
  • For DoD or other managed networks, your organization-provided root certificate may need to be configured for VS Code or your OS certificate store

Copilot Business / Enterprise users

If you are on a Copilot Business or Enterprise plan, your organization administrator must first enable the "Bring Your Own Language Model Key in VS Code" policy in the GitHub Copilot policy settings. Without that policy, the Custom Endpoint flow will not appear.


Step 1 — Add Ask Sage as a Custom Endpoint Provider

  1. Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P)
  2. Run Chat: Manage Language Models
  3. Select Add Models...
  4. Choose Custom Endpoint
  5. Enter Ask Sage as the group name
  6. Paste your Ask Sage API key — VS Code stores it in OS secret storage, not in the JSON file
  7. Choose the default API type for the group (you can mix shapes later):
    • Chat Completions — for /openai/v1/chat/completions models
    • Responses — for /openai/v1/responses models
    • Messages — for /anthropic/v1/messages models
[Screenshots: Select Custom Endpoint from Add Models; Create the Ask Sage group; Paste the Ask Sage API key; Choose the Ask Sage API type]

After you choose the API type, VS Code opens chatLanguageModels.json with a starter Ask Sage provider group and an empty model entry. The next step is filling that in.

Do not paste a raw API key into chatLanguageModels.json. Secret fields are resolved through VS Code secret storage. After you enter the key in the UI, VS Code writes an ${input:chat.lm.secret...} reference into the file. That reference is what should live in the JSON.
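As a quick self-check for the rule above, the following Python sketch flags provider groups whose apiKey looks like a pasted key rather than a secret reference. It assumes that every valid reference has the form ${input:<name>}, which is inferred from the ${input:chat.lm.secret...} pattern described above and is not a documented VS Code contract. The function names are hypothetical.

```python
import re

# Assumption: a secret reference always looks like ${input:<name>}.
SECRET_REF = re.compile(r"^\$\{input:[^}]+\}$")


def is_secret_reference(api_key_value: str) -> bool:
    """Return True when the value is a ${input:...} reference, not a raw key."""
    return bool(SECRET_REF.match(api_key_value.strip()))


def groups_with_raw_keys(provider_groups: list) -> list:
    """Names of provider groups whose apiKey does not look like a secret reference."""
    return [
        group.get("name", "<unnamed>")
        for group in provider_groups
        if "apiKey" in group and not is_secret_reference(str(group["apiKey"]))
    ]


if __name__ == "__main__":
    groups = [
        {"name": "Ask Sage", "apiKey": "${input:chat.lm.secret.example}"},
        {"name": "Pasted by hand", "apiKey": "not-a-real-key-123"},
    ]
    print(groups_with_raw_keys(groups))
```

Any group name this reports is a candidate for re-entering the key through Chat: Manage Language Models.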

Step 2 — Configure Models

VS Code stores BYOK model groups as a top-level JSON array. Each entry is one provider group. Fill in the models array with one or more Ask Sage models. Pick the configuration shape that matches the endpoint you are calling.

Looking for the complete model list? The four Option snippets below are minimal starters. See Available Models by Environment and Drop-in Configurations by Environment further down for full per-tenant catalogs (Commercial / Gov / DoD) and copy-paste configurations.
[Screenshot: VS Code starter chatLanguageModels.json]

Option A — OpenAI Chat Completions

chatLanguageModels.json — Chat Completions
[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "chat-completions",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      }
    ]
  }
]

Option B — OpenAI Responses (with reasoning)

chatLanguageModels.json — Responses
[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "responses",
    "models": [
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high"],
        "reasoningEffortFormat": "responses",
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      }
    }
  }
]

Option C — Anthropic Messages

chatLanguageModels.json — Anthropic Messages
[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "messages",
    "models": [
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (Ask Sage)",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000
      }
    ],
    "settings": {
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Option D — Combined (Chat Completions + Responses + Messages)

You can place all three API shapes in a single Ask Sage provider group. Set apiType on each individual model to override the group default.

chatLanguageModels.json — Combined (live-tested)
[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high"],
        "reasoningEffortFormat": "responses",
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (Ask Sage)",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]
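The override rule for combined groups (a model-level apiType takes precedence over the group default) can be sketched in Python as follows. This only illustrates the rule as stated in this guide; it is not VS Code's actual resolution code, and the function names are hypothetical.

```python
from typing import Optional


def effective_api_type(group: dict, model: dict) -> Optional[str]:
    """Model-level apiType wins; otherwise fall back to the group default."""
    return model.get("apiType") or group.get("apiType")


def api_types_by_model(group: dict) -> dict:
    """Map each model id in a provider group to the API shape it would use."""
    return {
        model["id"]: effective_api_type(group, model)
        for model in group.get("models", [])
    }


if __name__ == "__main__":
    group = {
        "name": "Ask Sage",
        "apiType": "chat-completions",  # group default
        "models": [
            {"id": "gpt-4.1"},  # inherits the group default
            {"id": "claude-opus-4-7", "apiType": "messages"},  # per-model override
        ],
    }
    print(api_types_by_model(group))
```

A model that resolves to None has neither a group default nor its own apiType and would need one set explicitly.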

Step 3 — Verify the Configuration

  1. Save chatLanguageModels.json
  2. Open the Command Palette and run Chat: Manage Language Models again
  3. Your configured Ask Sage models should appear in the Language Models pane
  4. Open Copilot Chat, click the model picker, and pick one of the Ask Sage models
  5. Send a simple prompt such as "hello world!". A response confirms VS Code is reaching Ask Sage through the configured BYOK endpoint.
[Screenshots: Configured Ask Sage models in Manage Language Models; Ask Sage model responding in Copilot Chat]

Available Models by Environment

The Ask Sage models exposed through the OpenAI- and Anthropic-compatible endpoints depend on which Ask Sage environment your API key is provisioned in. The catalog below mirrors the canonical per-environment allow-lists from the Ask Sage Client (src/config.js) enriched with model metadata from the Ask Sage CoreUI shared model catalog (src/Data/models.ts).

Image, video, and embedding models are intentionally omitted — the VS Code Copilot Chat picker only consumes chat / reasoning / Anthropic Messages shapes.

To list the model IDs the compatibility endpoints currently advertise, call:

curl https://api.asksage.ai/server/openai/v1/models \
  -H "Authorization: Bearer $ASKSAGE_API_KEY"

curl https://api.asksage.ai/server/anthropic/v1/models \
  -H "Authorization: Bearer $ASKSAGE_API_KEY"
Heads up: the live /openai/v1/models and /anthropic/v1/models endpoints currently return a static catalog. They do not yet enforce per-environment filtering or expose the full set of supported IDs (including internal aliases). Treat the tables below as the source of truth for now; a server-side fix is tracked in the Ask Sage Server repo to bring those endpoints in line.

Commercial (SaaS) Tenants

Default profile for accounts on api.asksage.ai not tagged as Gov or DoD.

Commercial — 47 chat / reasoning models

API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx
Anthropic Messages | claude-haiku-4-5 (alias: claude-haiku-4-5-com) | Anthropic Claude Haiku 4.5 | Direct | 200,000 | 32,000
Anthropic Messages | claude-opus-4 (alias: google-claude-4-opus) | Google Anthropic Claude 4.1 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-5 (alias: google-claude-45-opus) | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-6 (alias: google-claude-46-opus) | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-7 (alias: claude-opus-4-7-com) | Anthropic Claude Opus 4.7 | Direct | 200,000 | 64,000
Anthropic Messages | claude-opus-4-8 (alias: google-claude-48-opus) | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-sonnet-4 (alias: google-claude-4-sonnet) | Google Anthropic Claude 4 Sonnet | Google Vertex AI | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-5-vertex (alias: google-claude-45-sonnet) | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-6 (alias: claude-sonnet-4-6-com) | Anthropic Claude Sonnet 4.6 | Direct | 200,000 | 32,768
Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | deepseek-v3.2-com | DeepSeek V3.2 | Direct | 128,000 | 8,192
Chat Completions | deepseek-v4-flash | DeepSeek V4 Flash | Direct | 128,000 | 8,192
Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-20-flash | Google Gemini 2.0 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3-flash-com | Google Gemini 3 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.1-flash-lite-com | Google Gemini 3.1 Flash Lite | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.1-pro-com | Google Gemini 3.1 Pro | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.5-flash-com | Google Gemini 3.5 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | gpt-4.1 | Azure OpenAI GPT-4.1 | Azure OpenAI (Commercial) | 128,000 | 32,768
Chat Completions | gpt-4.1-mini | Azure OpenAI GPT-4.1-mini | Azure OpenAI (Commercial) | 128,000 | 32,768
Chat Completions | gpt-4.1-nano | Azure OpenAI GPT-4.1-nano | Azure OpenAI (Commercial) | 128,000 | 16,384
Chat Completions | grok-4-1-fast-non-reasoning | X.AI Grok 4.1 Fast | Direct | 256,000 | 16,384
Chat Completions | grok-4-1-fast-reasoning | X.AI Grok 4.1 Fast (Reasoning) | Direct | 256,000 | 16,384
Chat Completions | grok-4-20-non-reasoning | X.AI Grok 4.20 (Fast) | Direct | 256,000 | 16,384
Chat Completions | grok-4-20-reasoning | X.AI Grok 4.20 (Reasoning) | Direct | 256,000 | 16,384
Chat Completions | groq-70b | Groq-70B | Groq Cloud | 128,000 | 8,192
Chat Completions | groq-llama33 | Groq LLAMA 3.3 | Groq Cloud | 128,000 | 8,192
Chat Completions | groq-llama4-scout | Groq LLAMA 4-Scout | Groq Cloud | 128,000 | 8,192
Chat Completions | kimi-2.6-com | Moonshot Kimi K2.6 | Direct | 200,000 | 16,384
Chat Completions | mistral-large-3 | Mistral Large 3 | Azure OpenAI (Commercial) | 128,000 | 32,000
Responses | gpt-5 | Azure OpenAI GPT-5 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5-mini | Azure OpenAI GPT-5-mini | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5-nano | Azure OpenAI GPT-5-nano | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.1 | Azure OpenAI GPT-5.1 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.2 | Azure OpenAI GPT-5.2 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.4 | Azure OpenAI GPT-5.4 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.4-nano | Azure OpenAI GPT-5.4-nano | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-o1 | Azure OpenAI GPT-o1 | Azure OpenAI (Commercial) | 200,000 | 100,000
Responses | gpt-o3 | Azure OpenAI GPT-o3 | Azure OpenAI (Commercial) | 200,000 | 100,000
Responses | gpt-o3-mini | Azure OpenAI GPT-o3-mini | Azure OpenAI (Commercial) | 200,000 | 100,000
Responses | gpt-o4-mini | Azure OpenAI GPT-o4-mini | Azure OpenAI (Commercial) | 200,000 | 100,000

Gov Tenants (FedRAMP / IL2–IL4)

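Profile applied when the tenant has force_gov_models=true. It overlaps heavily with the Commercial catalog and adds -gov variants for partner models that are not yet generally available in Commercial. It is not a strict superset: for example, the DeepSeek, Groq, and Kimi models listed under Commercial do not appear here.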

Gov — 44 chat / reasoning models

API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx
Anthropic Messages | claude-haiku-4-5 (alias: google-claude-45-haiku) | Google Anthropic Claude 4.5 Haiku | Google Vertex AI | 200,000 | 32,000
Anthropic Messages | claude-opus-4 (alias: google-claude-4-opus) | Google Anthropic Claude 4.1 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-5 (alias: google-claude-45-opus) | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-6 (alias: google-claude-46-opus) | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-7 (alias: google-claude-47-opus) | Google Anthropic Claude 4.7 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-8 (alias: google-claude-48-opus) | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-sonnet-4 (alias: google-claude-4-sonnet) | Google Anthropic Claude 4 Sonnet | Google Vertex AI | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-5 (alias: aws-bedrock-claude-45-sonnet-gov) | AWS Gov Bedrock Claude 4.5 Sonnet | AWS Bedrock GovCloud | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-5-vertex (alias: google-claude-45-sonnet) | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-6 (alias: google-claude-46-sonnet) | Google Anthropic Claude 4.6 Sonnet | Google Vertex AI | 200,000 | 32,768
Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nova-lite-gov | AWS Gov Bedrock Nova Lite | AWS Bedrock GovCloud | 128,000 | 5,000
Chat Completions | aws-bedrock-nova-micro-gov | AWS Gov Bedrock Nova Micro | AWS Bedrock GovCloud | 128,000 | 5,000
Chat Completions | aws-bedrock-nova-pro-gov | AWS Gov Bedrock Nova Pro | AWS Bedrock GovCloud | 300,000 | 5,000
Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-20-flash | Google Gemini 2.0 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.1-flash-lite-gov | Google Gemini 3.1 Flash Lite Gov | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.5-flash-gov | Google Gemini 3.5 Flash Gov | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | gpt-4.1 | Azure OpenAI GPT-4.1 | Azure OpenAI (Commercial) | 128,000 | 32,768
Chat Completions | gpt-4.1-mini | Azure OpenAI GPT-4.1-mini | Azure OpenAI (Commercial) | 128,000 | 32,768
Chat Completions | gpt-4.1-nano | Azure OpenAI GPT-4.1-nano | Azure OpenAI (Commercial) | 128,000 | 16,384
Chat Completions | grok-4-1-fast-non-reasoning | X.AI Grok 4.1 Fast | Direct | 256,000 | 16,384
Chat Completions | grok-4-1-fast-reasoning | X.AI Grok 4.1 Fast (Reasoning) | Direct | 256,000 | 16,384
Chat Completions | grok-4-20-non-reasoning | X.AI Grok 4.20 (Fast) | Direct | 256,000 | 16,384
Chat Completions | grok-4-20-reasoning | X.AI Grok 4.20 (Reasoning) | Direct | 256,000 | 16,384
Chat Completions | llma3 | LLAMA 3 | AWS Bedrock GovCloud | 128,000 | 8,192
Chat Completions | llma3-8b | Meta Llama 3 8B | AWS Bedrock GovCloud | 128,000 | 8,192
Chat Completions | mistral-large-3 | Mistral Large 3 | Azure OpenAI (Commercial) | 128,000 | 32,000
Responses | gpt-5 | Azure OpenAI GPT-5 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5-mini | Azure OpenAI GPT-5-mini | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5-nano | Azure OpenAI GPT-5-nano | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.1 | Azure OpenAI GPT-5.1 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.1-gov | Azure Gov OpenAI GPT-5.1 | Azure OpenAI Gov | 272,000 | 128,000
Responses | gpt-5.2 | Azure OpenAI GPT-5.2 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.4 | Azure OpenAI GPT-5.4 | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-5.4-nano | Azure OpenAI GPT-5.4-nano | Azure OpenAI (Commercial) | 272,000 | 128,000
Responses | gpt-o1 | Azure OpenAI GPT-o1 | Azure OpenAI (Commercial) | 200,000 | 100,000
Responses | gpt-o3-mini | Azure OpenAI GPT-o3-mini | Azure OpenAI (Commercial) | 200,000 | 100,000

DoD Tenants (IL5 / IL6)

DoD operators: only the models in this table are approved in the DoD-locked profile (force_dod_models=true). Calling any other model ID will return 403 model_not_allowed. This list mirrors the canonical allow-list in Ask Sage Client src/config.js and is the safe set to publish in a DoD environment.

DoD-Approved — 26 chat / reasoning models

API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx
Anthropic Messages | claude-haiku-4-5 (alias: google-claude-45-haiku) | Google Anthropic Claude 4.5 Haiku | Google Vertex AI | 200,000 | 32,000
Anthropic Messages | claude-opus-4-5 (alias: google-claude-45-opus) | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-6 (alias: google-claude-46-opus) | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-7 (alias: google-claude-47-opus) | Google Anthropic Claude 4.7 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-opus-4-8 (alias: google-claude-48-opus) | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000
Anthropic Messages | claude-sonnet-4-5-vertex (alias: google-claude-45-sonnet) | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768
Anthropic Messages | claude-sonnet-4-6 (alias: google-claude-46-sonnet) | Google Anthropic Claude 4.6 Sonnet | Google Vertex AI | 200,000 | 32,768
Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192
Chat Completions | aws-bedrock-nova-lite-gov | AWS Gov Bedrock Nova Lite | AWS Bedrock GovCloud | 128,000 | 5,000
Chat Completions | aws-bedrock-nova-micro-gov | AWS Gov Bedrock Nova Micro | AWS Bedrock GovCloud | 128,000 | 5,000
Chat Completions | aws-bedrock-nova-pro-gov | AWS Gov Bedrock Nova Pro | AWS Bedrock GovCloud | 300,000 | 5,000
Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.1-flash-lite-gov | Google Gemini 3.1 Flash Lite Gov | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | google-gemini-3.5-flash-gov | Google Gemini 3.5 Flash Gov | Google Vertex AI | 1,000,000 | 8,192
Chat Completions | gpt-4.1-gov | Azure Gov OpenAI GPT-4.1 | Azure OpenAI Gov | 128,000 | 32,768
Chat Completions | gpt-4.1-mini-gov | Azure Gov OpenAI GPT-4.1-mini | Azure OpenAI Gov | 128,000 | 32,768
Chat Completions | llma3 | LLAMA 3 | AWS Bedrock GovCloud | 128,000 | 8,192
Chat Completions | llma3-8b | Meta Llama 3 8B | AWS Bedrock GovCloud | 128,000 | 8,192
Responses | gpt-5.1-gov | Azure Gov OpenAI GPT-5.1 | Azure OpenAI Gov | 272,000 | 128,000
Responses | gpt-o3-mini-gov | Azure Gov OpenAI GPT-o3-mini | Azure OpenAI Gov | 200,000 | 100,000
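
Because a DoD-locked tenant rejects any model ID outside this table, it can help to check a configuration before distributing it. The Python sketch below copies the 26 public IDs from the table and reports configured IDs that are not among them. Whether the alias IDs are also accepted is not stated here, so the sketch only considers public IDs. The function name is hypothetical.

```python
# Public IDs copied from the DoD-Approved table above (aliases not included).
DOD_PUBLIC_IDS = frozenset({
    "claude-haiku-4-5", "claude-opus-4-5", "claude-opus-4-6", "claude-opus-4-7",
    "claude-opus-4-8", "claude-sonnet-4-5-vertex", "claude-sonnet-4-6",
    "aws-bedrock-gpt-oss-120b-gov", "aws-bedrock-gpt-oss-20b-gov",
    "aws-bedrock-nemotron-12b-vl-gov", "aws-bedrock-nemotron-30b-gov",
    "aws-bedrock-nemotron-9b-gov", "aws-bedrock-nemotron-super-3-120b-gov",
    "aws-bedrock-nova-lite-gov", "aws-bedrock-nova-micro-gov",
    "aws-bedrock-nova-pro-gov", "google-gemini-2.5-flash", "google-gemini-2.5-pro",
    "google-gemini-3.1-flash-lite-gov", "google-gemini-3.5-flash-gov",
    "gpt-4.1-gov", "gpt-4.1-mini-gov", "llma3", "llma3-8b",
    "gpt-5.1-gov", "gpt-o3-mini-gov",
})


def disallowed_model_ids(provider_groups: list, allow_list=DOD_PUBLIC_IDS) -> list:
    """Sorted model ids found in the config that are not in the allow-list."""
    configured = {
        model["id"]
        for group in provider_groups
        for model in group.get("models", [])
    }
    return sorted(configured - set(allow_list))


if __name__ == "__main__":
    config = [{
        "name": "Ask Sage (DoD)",
        "models": [{"id": "gpt-4.1"}, {"id": "gpt-4.1-gov"}, {"id": "claude-sonnet-4-5"}],
    }]
    print(disallowed_model_ids(config))
```

An empty result means every configured ID is in the table; anything listed should be swapped for an approved ID before the file is published.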

Drop-in Configurations by Environment

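Paste one of the snippets below into chatLanguageModels.json based on the Ask Sage tenant your API key is associated with. The snippets are curated starter sets; expand them from the catalog above as needed. The apiKey value ${input:asksage-api-key} shown in these snippets stands in for your secret reference: keep the ${input:chat.lm.secret...} reference that VS Code wrote when you entered your key in Step 1.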

Commercial Drop-in

chatLanguageModels.json — Commercial
[
  {
    "name": "Ask Sage (Commercial)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Reasoning) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-opus-4-8",
        "name": "Claude Opus 4.8 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      },
      "claude-opus-4-8": {
        "reasoningEffort": "medium"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Gov Drop-in

chatLanguageModels.json — Gov (FedRAMP / IL2–IL4)
[
  {
    "name": "Ask Sage (Gov)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.1-gov",
        "name": "GPT 5.1 (Gov Reasoning) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.1-gov": {
        "reasoningEffort": "high"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

DoD Drop-in

DoD-only: The Claude IDs in this snippet (claude-sonnet-4-6, claude-opus-4-7) resolve to Google Vertex AI deployed inside an IL5 Assured Workloads folder — not commercial Vertex. claude-sonnet-4-5 (which routes to AWS Bedrock GovCloud) is not in the force_dod_models allow-list, so it is omitted here.
chatLanguageModels.json — DoD (IL5 / IL6)
[
  {
    "name": "Ask Sage (DoD)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1-gov",
        "name": "GPT 4.1 (Gov) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-4.1-mini-gov",
        "name": "GPT 4.1 Mini (Gov) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.1-gov",
        "name": "GPT 5.1 (Gov) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "gpt-o3-mini-gov",
        "name": "GPT o3 Mini (Gov) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": false,
        "maxInputTokens": 200000,
        "maxOutputTokens": 100000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 (IL5 Vertex) - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (IL5 Vertex) - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.1-gov": {
        "reasoningEffort": "high"
      },
      "gpt-o3-mini-gov": {
        "reasoningEffort": "high"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Why Custom Endpoint (and not OpenAI / Anthropic vendor)

Always use vendor: "customendpoint" for Ask Sage. The Ask Sage compatibility endpoints are OpenAI- and Anthropic-style, but they are hosted by Ask Sage — not by OpenAI or Anthropic directly.
  • Do not use vendor: "openai" — VS Code's built-in OpenAI provider targets the official OpenAI API.
  • Do not use vendor: "anthropic" — VS Code's built-in Anthropic provider targets Anthropic's official API.

For OpenAI-shaped endpoints VS Code sends the key as Authorization: Bearer .... For Anthropic-shaped endpoints with apiType: "messages", VS Code sends it as x-api-key — both are supported by Ask Sage.
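
The header behavior described above can be summarized in a few lines of Python. This is a sketch of the mapping only, useful when reproducing a request by hand; it is not how VS Code builds requests internally, and the function name is hypothetical.

```python
def auth_headers(api_type: str, api_key: str) -> dict:
    """Return the auth header for an Ask Sage endpoint of the given API shape."""
    if api_type == "messages":
        # Anthropic-shaped endpoint: key goes in x-api-key.
        return {"x-api-key": api_key}
    if api_type in ("chat-completions", "responses"):
        # OpenAI-shaped endpoints: key goes in a Bearer token.
        return {"Authorization": f"Bearer {api_key}"}
    raise ValueError(f"unknown apiType: {api_type!r}")


if __name__ == "__main__":
    print(auth_headers("messages", "EXAMPLE_KEY"))
    print(auth_headers("responses", "EXAMPLE_KEY"))
```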


Optional — Use Ask Sage for Copilot Utility Tasks

VS Code uses lightweight background models for utility tasks like title generation, commit messages, and intent detection. You can route those through Ask Sage too:

VS Code settings.json
{
  "chat.utilityModel": "customendpoint/gpt-4.1",
  "chat.utilitySmallModel": "customendpoint/gpt-4.1-mini"
}

The format is ${vendor}/${modelId}. For Ask Sage models, the vendor is always customendpoint, so values look like customendpoint/gpt-4.1-mini. A fast, inexpensive model is recommended for chat.utilitySmallModel since it is invoked frequently.
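
To make the ${vendor}/${modelId} format concrete, here is a small Python sketch that builds the two settings values from model IDs. The helper names are hypothetical; only the setting keys and the customendpoint vendor come from this guide.

```python
import json

VENDOR = "customendpoint"  # always this value for Ask Sage models


def utility_model_ref(model_id: str, vendor: str = VENDOR) -> str:
    """Build a '<vendor>/<modelId>' reference for the utility model settings."""
    return f"{vendor}/{model_id}"


def utility_settings(utility_model_id: str, small_model_id: str) -> dict:
    """Return the two settings.json entries for Copilot utility tasks."""
    return {
        "chat.utilityModel": utility_model_ref(utility_model_id),
        "chat.utilitySmallModel": utility_model_ref(small_model_id),
    }


if __name__ == "__main__":
    print(json.dumps(utility_settings("gpt-4.1", "gpt-4.1-mini"), indent=2))
```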


Configuration Reference

The fields most relevant to Ask Sage models:

Property | Type | Notes
id | string | The model identifier Ask Sage expects (e.g., gpt-4.1, claude-opus-4-7)
name | string | Display name in the Copilot Chat model picker
url | string | Full Ask Sage endpoint URL for this model's API shape
apiType | string | chat-completions, responses, or messages; overrides the group default
toolCalling | boolean | Set to true only if the model supports tool calling
vision | boolean | Set to true only if the model supports image inputs
maxInputTokens | integer | Context window for input tokens
maxOutputTokens | integer | Maximum response length
thinking | boolean | Set to true for reasoning-capable models (Responses or Anthropic with thinking)
supportsReasoningEffort | array | Effort levels: typically ["low", "medium", "high"]; Anthropic also supports "xhigh" and "max"
reasoningEffortFormat | string | For /responses endpoints set to "responses" (sends nested reasoning.effort); defaults follow URL otherwise
streaming | boolean | Optional, defaults to true; Ask Sage supports streaming via SSE
requestHeaders | object | Optional extra headers; reserved/forwarding headers are ignored

For the complete reference (including provider-level fields and advanced options) see the VS Code language models documentation.


DoD and Managed Network Connectivity

If your users connect through a DoD, DoW, or other managed network, certificate and proxy configuration may be required before VS Code can reach the Ask Sage endpoint. For the model allow-list approved in DoD environments, see the DoD Tenants section above.

  • If your environment requires a custom root certificate, configure it according to your organization policy — typically through the OS certificate store or the http.proxyStrictSSL / http.systemCertificates VS Code settings
  • Test reachability from a terminal: curl -I https://api.asksage.ai/server/openai/v1/models -H "Authorization: Bearer $ASKSAGE_API_KEY"
  • If using a proxy, ensure VS Code's http.proxy setting is configured and matches your shell environment

Troubleshooting

Manage Language Models shows nothing

Make sure chatLanguageModels.json is a top-level array, not an object with a providers property.

Correct:

[
  { "name": "Ask Sage", "vendor": "customendpoint" }
]

Incorrect:

{
  "providers": []
}
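
The shape check above can also be scripted. The Python sketch below parses the file text and reports whether the top level is an array of objects. It assumes the file is strict JSON; if your file contains comments, json.loads will reject it even though the shape may be fine. The function name is hypothetical.

```python
import json


def top_level_shape_problems(text: str) -> list:
    """Return a list of shape problems; an empty list means the shape looks right."""
    try:
        data = json.loads(text)  # assumption: strict JSON, no comments
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, list):
        return ["top level must be an array of provider groups"]
    return [
        f"entry {index} is not an object"
        for index, group in enumerate(data)
        if not isinstance(group, dict)
    ]


if __name__ == "__main__":
    print(top_level_shape_problems('[{"name": "Ask Sage", "vendor": "customendpoint"}]'))
    print(top_level_shape_problems('{"providers": []}'))
```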

API key not found or authentication fails

  • Re-enter the key through Chat: Manage Language Models so VS Code stores it as a secret
  • Confirm the JSON contains an ${input:chat.lm.secret...} reference for apiKey (not the raw key)
  • Verify the Ask Sage API key is still active in your account settings
  • Verify the endpoint URL matches the configured apiType — an Anthropic URL with apiType: "chat-completions" will fail authentication
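
The last check in the list (URL versus apiType) is easy to get wrong when copying entries between models. A minimal Python sketch, using the three path suffixes listed in Step 1, is shown below. The helper name is hypothetical.

```python
# Path suffix expected for each API shape, as listed in Step 1.
EXPECTED_SUFFIX = {
    "chat-completions": "/openai/v1/chat/completions",
    "responses": "/openai/v1/responses",
    "messages": "/anthropic/v1/messages",
}


def url_matches_api_type(url: str, api_type: str) -> bool:
    """True when the model URL ends with the path expected for its apiType."""
    suffix = EXPECTED_SUFFIX.get(api_type)
    return suffix is not None and url.rstrip("/").endswith(suffix)


if __name__ == "__main__":
    anthropic_url = "https://api.asksage.ai/server/anthropic/v1/messages"
    print(url_matches_api_type(anthropic_url, "messages"))
    print(url_matches_api_type(anthropic_url, "chat-completions"))
```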

Model does not appear in the picker

  • Confirm the provider group vendor is customendpoint
  • Confirm each model has id, name, url, toolCalling, vision, maxInputTokens, and maxOutputTokens
  • For agent / tool-use scenarios, the model must have toolCalling: true — otherwise it is hidden from the picker
  • Reload the VS Code window after editing chatLanguageModels.json directly
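
The required-field check in the list above can be automated with a short Python sketch. The expected types are taken from the Configuration Reference table; the function name is hypothetical, and the check is a convenience, not a reproduction of VS Code's own validation.

```python
# Required per-model fields and their expected Python types.
REQUIRED_FIELDS = {
    "id": str,
    "name": str,
    "url": str,
    "toolCalling": bool,
    "vision": bool,
    "maxInputTokens": int,
    "maxOutputTokens": int,
}


def model_field_problems(model: dict) -> list:
    """List missing or mistyped required fields for one model entry."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in model:
            problems.append(f"missing {field}")
            continue
        value = model[field]
        # bool is a subclass of int in Python, so reject True/False for integer fields.
        is_bool = isinstance(value, bool)
        if is_bool != (expected is bool) or not isinstance(value, expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


if __name__ == "__main__":
    entry = {
        "id": "gpt-4.1",
        "name": "GPT 4.1 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "toolCalling": True,
        "maxInputTokens": 128000,
        "maxOutputTokens": "32768",  # string instead of integer
    }
    print(model_field_problems(entry))
```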

Reasoning effort does not appear in the picker

  • Set thinking: true on the model
  • Add supportsReasoningEffort with the effort values your endpoint accepts
  • For /responses endpoints, set reasoningEffortFormat: "responses"
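
The three checks above can be combined into one Python helper that reports what a model entry is missing. This is a sketch under the assumption that a /responses endpoint can be recognized by its URL suffix; the function name is hypothetical.

```python
def reasoning_picker_issues(model: dict) -> list:
    """Return the reasoning-related fixes a model entry still needs."""
    issues = []
    if model.get("thinking") is not True:
        issues.append("set thinking: true")
    if not model.get("supportsReasoningEffort"):
        issues.append("add supportsReasoningEffort")
    # Assumption: Responses endpoints are identified by a URL ending in /responses.
    is_responses = model.get("url", "").rstrip("/").endswith("/responses")
    if is_responses and model.get("reasoningEffortFormat") != "responses":
        issues.append('set reasoningEffortFormat: "responses"')
    return issues


if __name__ == "__main__":
    entry = {
        "id": "gpt-5.5",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "thinking": True,
        "supportsReasoningEffort": ["low", "medium", "high"],
    }
    print(reasoning_picker_issues(entry))
```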

Privacy and Data Usage

When configured through BYOK, chat requests for the selected model are sent to the Ask Sage endpoint you configured. The same Ask Sage tenant data-handling and logging policies apply as for any other Ask Sage API consumer. Refer to the Ask Sage Privacy & Security FAQ for the specifics of retention, logging, and training behavior in your environment.


Additional Resources



Copyright © 2026 Ask Sage Inc. All Rights Reserved. Ask Sage is a BigBear.ai company.