VS Code GitHub Copilot Integration
Use Ask Sage models directly in VS Code Copilot Chat via Bring Your Own Key (BYOK)
Bring Ask Sage's models into Visual Studio Code through GitHub Copilot Chat's Bring Your Own Key (BYOK) language-model support. This integration uses VS Code's Custom Endpoint provider and works with the same Ask Sage API key you already use for other integrations.
Table of Contents
- At a Glance
- Prerequisites
- Step 1 — Add Ask Sage as a Custom Endpoint Provider
- Step 2 — Configure Models
- Step 3 — Verify the Configuration
- Available Models by Environment
- Drop-in Configurations by Environment
- Why Custom Endpoint (and not OpenAI / Anthropic vendor)
- Optional — Use Ask Sage for Copilot Utility Tasks
- Configuration Reference
- DoD and Managed Network Connectivity
- Troubleshooting
- Privacy and Data Usage
- Additional Resources
At a Glance
VS Code 1.122 added a Custom Endpoint BYOK provider that speaks OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages. This page shows how to point that provider at Ask Sage so GPT, Claude, and Gemini-style models become first-class options in the Copilot Chat model picker — with the same security boundary, logging, and policy controls you already get from Ask Sage.
Prerequisites
- Visual Studio Code 1.122 or later — the Custom Endpoint provider was added in this release
- GitHub Copilot Chat enabled in VS Code
- An Ask Sage API key (from your account settings)
- Network access to the Ask Sage endpoint from your workstation
- For DoD or other managed networks, your organization-provided root certificate may need to be configured for VS Code or your OS certificate store
Step 1 — Add Ask Sage as a Custom Endpoint Provider
- Open the Command Palette (
Ctrl+Shift+P/Cmd+Shift+P) - Run Chat: Manage Language Models
- Select Add Models...
- Choose Custom Endpoint
- Enter
Ask Sageas the group name - Paste your Ask Sage API key — VS Code stores it in OS secret storage, not in the JSON file
- Choose the default API type for the group (you can mix shapes later):
- Chat Completions — for
/openai/v1/chat/completionsmodels - Responses — for
/openai/v1/responsesmodels - Messages — for
/anthropic/v1/messagesmodels
- Chat Completions — for
After you choose the API type, VS Code opens chatLanguageModels.json with a starter Ask Sage provider group and an empty model entry. The next step is filling that in.
chatLanguageModels.json. Secret fields are resolved through VS Code secret storage. After you enter the key in the UI, VS Code writes an ${input:chat.lm.secret...} reference into the file. That reference is what should live in the JSON. Step 2 — Configure Models
VS Code stores BYOK model groups as a top-level JSON array. Each entry is one provider group. Fill in the models array with one or more Ask Sage models. Pick the configuration shape that matches the endpoint you are calling.
Option A — OpenAI Chat Completions
[
{
"name": "Ask Sage",
"vendor": "customendpoint",
"apiKey": "${input:chat.lm.secret.example}",
"apiType": "chat-completions",
"models": [
{
"id": "gpt-4.1",
"name": "GPT 4.1 (Ask Sage)",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
}
]
}
]Option B — OpenAI Responses (with reasoning)
[
{
"name": "Ask Sage",
"vendor": "customendpoint",
"apiKey": "${input:chat.lm.secret.example}",
"apiType": "responses",
"models": [
{
"id": "gpt-5.5",
"name": "GPT 5.5 (Ask Sage)",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": true,
"thinking": true,
"supportsReasoningEffort": ["low", "medium", "high"],
"reasoningEffortFormat": "responses",
"maxInputTokens": 272000,
"maxOutputTokens": 128000
}
],
"settings": {
"gpt-5.5": {
"reasoningEffort": "high"
}
}
}
]Option C — Anthropic Messages
[
{
"name": "Ask Sage",
"vendor": "customendpoint",
"apiKey": "${input:chat.lm.secret.example}",
"apiType": "messages",
"models": [
{
"id": "claude-opus-4-7",
"name": "Claude Opus 4.7 (Ask Sage)",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"thinking": true,
"supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
"maxInputTokens": 200000,
"maxOutputTokens": 64000
}
],
"settings": {
"claude-opus-4-7": {
"reasoningEffort": "medium"
}
}
}
]Option D — Combined (Chat Completions + Responses + Messages)
You can place all three API shapes in a single Ask Sage provider group. Set apiType on each individual model to override the group default.
[
{
"name": "Ask Sage",
"vendor": "customendpoint",
"apiKey": "${input:chat.lm.secret.example}",
"models": [
{
"id": "gpt-4.1",
"name": "GPT 4.1 (Ask Sage)",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
},
{
"id": "gpt-5.5",
"name": "GPT 5.5 (Ask Sage)",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": true,
"thinking": true,
"supportsReasoningEffort": ["low", "medium", "high"],
"reasoningEffortFormat": "responses",
"maxInputTokens": 272000,
"maxOutputTokens": 128000
},
{
"id": "claude-opus-4-7",
"name": "Claude Opus 4.7 (Ask Sage)",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"thinking": true,
"supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
"maxInputTokens": 200000,
"maxOutputTokens": 64000
}
],
"settings": {
"gpt-5.5": {
"reasoningEffort": "high"
},
"claude-opus-4-7": {
"reasoningEffort": "medium"
}
}
}
]Step 3 — Verify the Configuration
- Save
chatLanguageModels.json - Open the Command Palette and run Chat: Manage Language Models again
- Your configured Ask Sage models should appear in the Language Models pane
- Open Copilot Chat, click the model picker, and pick one of the Ask Sage models
- Send a simple prompt such as
hello world!. A response confirms VS Code is reaching Ask Sage through the configured BYOK endpoint.
Available Models by Environment
The Ask Sage models exposed through the OpenAI- and Anthropic-compatible endpoints depend on which Ask Sage environment your API key is provisioned in. The catalog below mirrors the canonical per-environment allow-lists from the Ask Sage Client (src/config.js) enriched with model metadata from the Ask Sage CoreUI shared model catalog (src/Data/models.ts).
Image, video, and embedding models are intentionally omitted — the VS Code Copilot Chat picker only consumes chat / reasoning / Anthropic Messages shapes.
You can always confirm what your specific account is entitled to by calling:
curl https://api.asksage.ai/server/openai/v1/models \
-H "Authorization: Bearer $ASKSAGE_API_KEY"
curl https://api.asksage.ai/server/anthropic/v1/models \
-H "Authorization: Bearer $ASKSAGE_API_KEY"/openai/v1/models and /anthropic/v1/models endpoints currently return a static catalog that does not yet enforce per-environment filtering or expose the full set of supported IDs (including internal aliases). Treat the tables below as the source of truth for now; a Server-side fix is tracked in the Ask Sage Server repo to bring those endpoints in line. Commercial (SaaS) Tenants
Default profile for accounts on api.asksage.ai not tagged as Gov or DoD.
Commercial — 47 chat / reasoning models
| API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|---|---|
| Anthropic Messages | claude-haiku-4-5alias: claude-haiku-4-5-com | Anthropic Claude Haiku 4.5 | Direct | 200,000 | 32,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4alias: google-claude-4-opus | Google Anthropic Claude 4.1 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-5alias: google-claude-45-opus | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-6alias: google-claude-46-opus | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-7alias: claude-opus-4-7-com | Anthropic Claude Opus 4.7 | Direct | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-8alias: google-claude-48-opus | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4alias: google-claude-4-sonnet | Google Anthropic Claude 4 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-5-vertexalias: google-claude-45-sonnet | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-6alias: claude-sonnet-4-6-com | Anthropic Claude Sonnet 4.6 | Direct | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | deepseek-v3.2-com | DeepSeek V3.2 | Direct | 128,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | deepseek-v4-flash | DeepSeek V4 Flash | Direct | 128,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-20-flash | Google Gemini 2.0 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3-flash-com | Google Gemini 3 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.1-flash-lite-com | Google Gemini 3.1 Flash Lite | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.1-pro-com | Google Gemini 3.1 Pro | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.5-flash-com | Google Gemini 3.5 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1 | Azure OpenAI GPT-4.1 | Azure OpenAI (Commercial) | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-mini | Azure OpenAI GPT-4.1-mini | Azure OpenAI (Commercial) | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-nano | Azure OpenAI GPT-4.1-nano | Azure OpenAI (Commercial) | 128,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-1-fast-non-reasoning | X.AI Grok 4.1 Fast | Direct | 256,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-1-fast-reasoning | X.AI Grok 4.1 Fast (Reasoning) | Direct | 256,000 | 16,384 | ✅ | ✅ | ✅ |
| Chat Completions | grok-4-20-non-reasoning | X.AI Grok 4.20 (Fast) | Direct | 256,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-20-reasoning | X.AI Grok 4.20 (Reasoning) | Direct | 256,000 | 16,384 | ✅ | ✅ | ✅ |
| Chat Completions | groq-70b | Groq-70B | Groq Cloud | 128,000 | 8,192 | — | — | — |
| Chat Completions | groq-llama33 | Groq LLAMA 3.3 | Groq Cloud | 128,000 | 8,192 | — | — | — |
| Chat Completions | groq-llama4-scout | Groq LLAMA 4-Scout | Groq Cloud | 128,000 | 8,192 | — | — | — |
| Chat Completions | kimi-2.6-com | Moonshot Kimi K2.6 | Direct | 200,000 | 16,384 | ✅ | — | — |
| Chat Completions | mistral-large-3 | Mistral Large 3 | Azure OpenAI (Commercial) | 128,000 | 32,000 | ✅ | — | — |
| Responses | gpt-5 | Azure OpenAI GPT-5 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5-mini | Azure OpenAI GPT-5-mini | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5-nano | Azure OpenAI GPT-5-nano | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.1 | Azure OpenAI GPT-5.1 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.2 | Azure OpenAI GPT-5.2 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.4 | Azure OpenAI GPT-5.4 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.4-nano | Azure OpenAI GPT-5.4-nano | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o1 | Azure OpenAI GPT-o1 | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o3 | Azure OpenAI GPT-o3 | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o3-mini | Azure OpenAI GPT-o3-mini | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o4-mini | Azure OpenAI GPT-o4-mini | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
Gov Tenants (FedRAMP / IL2–IL4)
Profile when the tenant has force_gov_models=true. Superset of the commercial-equivalent models with -gov variants for partner models that are not yet generally available in commercial.
Gov — 44 chat / reasoning models
| API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|---|---|
| Anthropic Messages | claude-haiku-4-5alias: google-claude-45-haiku | Google Anthropic Claude 4.5 Haiku | Google Vertex AI | 200,000 | 32,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4alias: google-claude-4-opus | Google Anthropic Claude 4.1 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-5alias: google-claude-45-opus | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-6alias: google-claude-46-opus | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-7alias: google-claude-47-opus | Google Anthropic Claude 4.7 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-8alias: google-claude-48-opus | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4alias: google-claude-4-sonnet | Google Anthropic Claude 4 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-5alias: aws-bedrock-claude-45-sonnet-gov | AWS Gov Bedrock Claude 4.5 Sonnet | AWS Bedrock GovCloud | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-5-vertexalias: google-claude-45-sonnet | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-6alias: google-claude-46-sonnet | Google Anthropic Claude 4.6 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nova-lite-gov | AWS Gov Bedrock Nova Lite | AWS Bedrock GovCloud | 128,000 | 5,000 | ✅ | ✅ | — |
| Chat Completions | aws-bedrock-nova-micro-gov | AWS Gov Bedrock Nova Micro | AWS Bedrock GovCloud | 128,000 | 5,000 | ✅ | — | — |
| Chat Completions | aws-bedrock-nova-pro-gov | AWS Gov Bedrock Nova Pro | AWS Bedrock GovCloud | 300,000 | 5,000 | ✅ | ✅ | — |
| Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-20-flash | Google Gemini 2.0 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.1-flash-lite-gov | Google Gemini 3.1 Flash Lite Gov | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.5-flash-gov | Google Gemini 3.5 Flash Gov | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1 | Azure OpenAI GPT-4.1 | Azure OpenAI (Commercial) | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-mini | Azure OpenAI GPT-4.1-mini | Azure OpenAI (Commercial) | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-nano | Azure OpenAI GPT-4.1-nano | Azure OpenAI (Commercial) | 128,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-1-fast-non-reasoning | X.AI Grok 4.1 Fast | Direct | 256,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-1-fast-reasoning | X.AI Grok 4.1 Fast (Reasoning) | Direct | 256,000 | 16,384 | ✅ | ✅ | ✅ |
| Chat Completions | grok-4-20-non-reasoning | X.AI Grok 4.20 (Fast) | Direct | 256,000 | 16,384 | ✅ | ✅ | — |
| Chat Completions | grok-4-20-reasoning | X.AI Grok 4.20 (Reasoning) | Direct | 256,000 | 16,384 | ✅ | ✅ | ✅ |
| Chat Completions | llma3 | LLAMA 3 | AWS Bedrock GovCloud | 128,000 | 8,192 | ✅ | — | — |
| Chat Completions | llma3-8b | Meta Llama 3 8B | AWS Bedrock GovCloud | 128,000 | 8,192 | ✅ | — | — |
| Chat Completions | mistral-large-3 | Mistral Large 3 | Azure OpenAI (Commercial) | 128,000 | 32,000 | ✅ | — | — |
| Responses | gpt-5 | Azure OpenAI GPT-5 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5-mini | Azure OpenAI GPT-5-mini | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5-nano | Azure OpenAI GPT-5-nano | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.1 | Azure OpenAI GPT-5.1 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.1-gov | Azure Gov OpenAI GPT-5.1 | Azure OpenAI Gov | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.2 | Azure OpenAI GPT-5.2 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.4 | Azure OpenAI GPT-5.4 | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-5.4-nano | Azure OpenAI GPT-5.4-nano | Azure OpenAI (Commercial) | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o1 | Azure OpenAI GPT-o1 | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o3-mini | Azure OpenAI GPT-o3-mini | Azure OpenAI (Commercial) | 200,000 | 100,000 | ✅ | ✅ | ✅ |
DoD Tenants (IL5 / IL6)
force_dod_models=true). Calling any other model ID will return 403 model_not_allowed. This list mirrors the canonical allow-list in Ask Sage Client src/config.js and is the safe set to publish in a DoD environment. DoD-Approved — 26 chat / reasoning models
| API Shape | Public ID | Display Name | Provider / Hosting | Input Ctx | Output Ctx | Tools | Vision | Reasoning |
|---|---|---|---|---|---|---|---|---|
| Anthropic Messages | claude-haiku-4-5alias: google-claude-45-haiku | Google Anthropic Claude 4.5 Haiku | Google Vertex AI | 200,000 | 32,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-5alias: google-claude-45-opus | Google Anthropic Claude 4.5 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-6alias: google-claude-46-opus | Google Anthropic Claude 4.6 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-7alias: google-claude-47-opus | Google Anthropic Claude 4.7 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-opus-4-8alias: google-claude-48-opus | Google Anthropic Claude 4.8 Opus | Google Vertex AI | 200,000 | 64,000 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-5-vertexalias: google-claude-45-sonnet | Google Anthropic Claude 4.5 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Anthropic Messages | claude-sonnet-4-6alias: google-claude-46-sonnet | Google Anthropic Claude 4.6 Sonnet | Google Vertex AI | 200,000 | 32,768 | ✅ | ✅ | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-120b-gov | OpenAI GPT-OSS 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-gpt-oss-20b-gov | OpenAI GPT-OSS 20B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | ✅ |
| Chat Completions | aws-bedrock-nemotron-12b-vl-gov | NVIDIA Nemotron Nano 12B v2 VL | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | aws-bedrock-nemotron-30b-gov | NVIDIA Nemotron Nano 3 30B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-9b-gov | NVIDIA Nemotron Nano 9B v2 | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nemotron-super-3-120b-gov | NVIDIA Nemotron Super 3 120B | AWS Bedrock GovCloud | 131,000 | 8,192 | ✅ | — | — |
| Chat Completions | aws-bedrock-nova-lite-gov | AWS Gov Bedrock Nova Lite | AWS Bedrock GovCloud | 128,000 | 5,000 | ✅ | ✅ | — |
| Chat Completions | aws-bedrock-nova-micro-gov | AWS Gov Bedrock Nova Micro | AWS Bedrock GovCloud | 128,000 | 5,000 | ✅ | — | — |
| Chat Completions | aws-bedrock-nova-pro-gov | AWS Gov Bedrock Nova Pro | AWS Bedrock GovCloud | 300,000 | 5,000 | ✅ | ✅ | — |
| Chat Completions | google-gemini-2.5-flash | Google Gemini 2.5 Flash | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-2.5-pro | Google Gemini 2.5 Pro | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.1-flash-lite-gov | Google Gemini 3.1 Flash Lite Gov | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | google-gemini-3.5-flash-gov | Google Gemini 3.5 Flash Gov | Google Vertex AI | 1,000,000 | 8,192 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-gov | Azure Gov OpenAI GPT-4.1 | Azure OpenAI Gov | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | gpt-4.1-mini-gov | Azure Gov OpenAI GPT-4.1-mini | Azure OpenAI Gov | 128,000 | 32,768 | ✅ | ✅ | — |
| Chat Completions | llma3 | LLAMA 3 | AWS Bedrock GovCloud | 128,000 | 8,192 | ✅ | — | — |
| Chat Completions | llma3-8b | Meta Llama 3 8B | AWS Bedrock GovCloud | 128,000 | 8,192 | ✅ | — | — |
| Responses | gpt-5.1-gov | Azure Gov OpenAI GPT-5.1 | Azure OpenAI Gov | 272,000 | 128,000 | ✅ | ✅ | ✅ |
| Responses | gpt-o3-mini-gov | Azure Gov OpenAI GPT-o3-mini | Azure OpenAI Gov | 200,000 | 100,000 | ✅ | ✅ | ✅ |
Drop-in Configurations by Environment
Paste one of the snippets below into chatLanguageModels.json based on the Ask Sage tenant your API key is associated with. The snippets are curated starter sets — expand from the catalog above as needed.
Commercial Drop-in
[
{
"name": "Ask Sage (Commercial)",
"vendor": "customendpoint",
"apiKey": "${input:asksage-api-key}",
"models": [
{
"id": "gpt-4.1",
"name": "GPT 4.1 - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
},
{
"id": "gpt-5.5",
"name": "GPT 5.5 (Reasoning) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": true,
"maxInputTokens": 272000,
"maxOutputTokens": 128000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high"
],
"reasoningEffortFormat": "responses"
},
{
"id": "claude-opus-4-8",
"name": "Claude Opus 4.8 - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 64000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
},
{
"id": "claude-sonnet-4-6",
"name": "Claude Sonnet 4.6 - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 32768,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
}
],
"settings": {
"gpt-5.5": {
"reasoningEffort": "high"
},
"claude-opus-4-8": {
"reasoningEffort": "medium"
},
"claude-sonnet-4-6": {
"reasoningEffort": "medium"
}
}
}
]Gov Drop-in
[
{
"name": "Ask Sage (Gov)",
"vendor": "customendpoint",
"apiKey": "${input:asksage-api-key}",
"models": [
{
"id": "gpt-4.1",
"name": "GPT 4.1 - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
},
{
"id": "gpt-5.1-gov",
"name": "GPT 5.1 (Gov Reasoning) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": true,
"maxInputTokens": 272000,
"maxOutputTokens": 128000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high"
],
"reasoningEffortFormat": "responses"
},
{
"id": "claude-sonnet-4-6",
"name": "Claude Sonnet 4.6 - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 32768,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
},
{
"id": "claude-opus-4-7",
"name": "Claude Opus 4.7 - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 64000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
}
],
"settings": {
"gpt-5.1-gov": {
"reasoningEffort": "high"
},
"claude-sonnet-4-6": {
"reasoningEffort": "medium"
},
"claude-opus-4-7": {
"reasoningEffort": "medium"
}
}
}
]DoD Drop-in
claude-sonnet-4-6, claude-opus-4-7) resolve to Google Vertex AI deployed inside an IL5 Assured Workloads folder — not commercial Vertex. claude-sonnet-4-5 (which routes to AWS Bedrock GovCloud) is not in the force_dod_models allow-list, so it is omitted here. [
{
"name": "Ask Sage (DoD)",
"vendor": "customendpoint",
"apiKey": "${input:asksage-api-key}",
"models": [
{
"id": "gpt-4.1-gov",
"name": "GPT 4.1 (Gov) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
},
{
"id": "gpt-4.1-mini-gov",
"name": "GPT 4.1 Mini (Gov) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/chat/completions",
"apiType": "chat-completions",
"toolCalling": true,
"vision": true,
"maxInputTokens": 128000,
"maxOutputTokens": 32768
},
{
"id": "gpt-5.1-gov",
"name": "GPT 5.1 (Gov) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": true,
"maxInputTokens": 272000,
"maxOutputTokens": 128000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high"
],
"reasoningEffortFormat": "responses"
},
{
"id": "gpt-o3-mini-gov",
"name": "GPT o3 Mini (Gov) - Ask Sage",
"url": "https://api.asksage.ai/server/openai/v1/responses",
"apiType": "responses",
"toolCalling": true,
"vision": false,
"maxInputTokens": 200000,
"maxOutputTokens": 100000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high"
],
"reasoningEffortFormat": "responses"
},
{
"id": "claude-sonnet-4-6",
"name": "Claude Sonnet 4.6 (IL5 Vertex) - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 32768,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
},
{
"id": "claude-opus-4-7",
"name": "Claude Opus 4.7 (IL5 Vertex) - Ask Sage",
"url": "https://api.asksage.ai/server/anthropic/v1/messages",
"apiType": "messages",
"toolCalling": true,
"vision": true,
"maxInputTokens": 200000,
"maxOutputTokens": 64000,
"thinking": true,
"supportsReasoningEffort": [
"low",
"medium",
"high",
"xhigh",
"max"
]
}
],
"settings": {
"gpt-5.1-gov": {
"reasoningEffort": "high"
},
"gpt-o3-mini-gov": {
"reasoningEffort": "high"
},
"claude-sonnet-4-6": {
"reasoningEffort": "medium"
},
"claude-opus-4-7": {
"reasoningEffort": "medium"
}
}
}
]
Why Custom Endpoint (and not OpenAI / Anthropic vendor)
vendor: "customendpoint" for Ask Sage. The Ask Sage compatibility endpoints are OpenAI- and Anthropic-style, but they are hosted by Ask Sage — not by OpenAI or Anthropic directly. - Do not use
vendor: "openai"— VS Code's built-in OpenAI provider targets the official OpenAI API. - Do not use
vendor: "anthropic"— VS Code's built-in Anthropic provider targets Anthropic's official API.
For OpenAI-shaped endpoints VS Code sends the key as Authorization: Bearer .... For Anthropic-shaped endpoints with apiType: "messages", VS Code sends it as x-api-key — both are supported by Ask Sage.
Optional — Use Ask Sage for Copilot Utility Tasks
VS Code uses lightweight background models for utility tasks like title generation, commit messages, and intent detection. You can route those through Ask Sage too:
{
"chat.utilityModel": "customendpoint/gpt-4.1",
"chat.utilitySmallModel": "customendpoint/gpt-4.1-mini"
}The format is ${vendor}/${modelId}. For Ask Sage models, the vendor is always customendpoint, so values look like customendpoint/gpt-4.1-mini. A fast, inexpensive model is recommended for chat.utilitySmallModel since it is invoked frequently.
Configuration Reference
The fields most relevant to Ask Sage models:
| Property | Type | Notes |
|---|---|---|
id | string | The model identifier Ask Sage expects (e.g., gpt-4.1, claude-opus-4-7) |
name | string | Display name in the Copilot Chat model picker |
url | string | Full Ask Sage endpoint URL for this model's API shape |
apiType | string | chat-completions, responses, or messages — overrides the group default |
toolCalling | boolean | Set to true only if the model supports tool calling |
vision | boolean | Set to true only if the model supports image inputs |
maxInputTokens | integer | Context window for input tokens |
maxOutputTokens | integer | Maximum response length |
thinking | boolean | Set to true for reasoning-capable models (Responses or Anthropic with thinking) |
supportsReasoningEffort | array | Effort levels: typically ["low", "medium", "high"]; Anthropic also supports "xhigh" and "max" |
reasoningEffortFormat | string | For /responses endpoints set to "responses" (sends nested reasoning.effort); defaults follow URL otherwise |
streaming | boolean | Optional, defaults to true; Ask Sage supports streaming via SSE |
requestHeaders | object | Optional extra headers; reserved/forwarding headers are ignored |
For the complete reference (including provider-level fields and advanced options) see the VS Code language models documentation.
DoD and Managed Network Connectivity
If your users connect through a DoD, DoW, or other managed network, certificate and proxy configuration may be required before VS Code can reach the Ask Sage endpoint. For the model allow-list approved in DoD environments, see the DoD Tenants section above.
- If your environment requires a custom root certificate, configure it according to your organization policy — typically through the OS certificate store or the
http.proxyStrictSSL/http.systemCertificatesVS Code settings - Test reachability from a terminal:
curl -I https://api.asksage.ai/server/openai/v1/models -H "Authorization: Bearer $KEY" - If using a proxy, ensure VS Code's
http.proxysetting is configured and matches your shell environment
Troubleshooting
Manage Language Models shows nothing
Make sure chatLanguageModels.json is a top-level array, not an object with a providers property.
Correct:
[
{ "name": "Ask Sage", "vendor": "customendpoint" }
]Incorrect:
{
"providers": []
}API key not found or authentication fails
- Re-enter the key through Chat: Manage Language Models so VS Code stores it as a secret
- Confirm the JSON contains an
${input:chat.lm.secret...}reference forapiKey(not the raw key) - Verify the Ask Sage API key is still active in your account settings
- Verify the endpoint URL matches the configured
apiType— an Anthropic URL withapiType: "chat-completions"will fail authentication
Model does not appear in the picker
- Confirm the provider group
vendoriscustomendpoint - Confirm each model has
id,name,url,toolCalling,vision,maxInputTokens, andmaxOutputTokens - For agent / tool-use scenarios, the model must have
toolCalling: true— otherwise it is hidden from the picker - Reload the VS Code window after editing
chatLanguageModels.jsondirectly
Privacy and Data Usage
When configured through BYOK, chat requests for the selected model are sent to the Ask Sage endpoint you configured. The same Ask Sage tenant data-handling and logging policies apply as for any other Ask Sage API consumer. Refer to the Ask Sage Privacy & Security FAQ for the specifics of retention, logging, and training behavior in your environment.