VS Code Copilot BYOK Integration

VS Code GitHub Copilot Integration

Use Ask Sage models directly in VS Code Copilot Chat via Bring Your Own Key (BYOK)

Bring Ask Sage's models into Visual Studio Code through GitHub Copilot Chat's Bring Your Own Key (BYOK) language-model support. This integration uses VS Code's Custom Endpoint provider and works with the same Ask Sage API key you already use for other integrations.

Table of Contents

At a Glance
Prerequisites
Step 1 — Add Ask Sage as a Custom Endpoint Provider
Step 2 — Configure Models
Step 3 — Verify the Configuration
Available Models by Environment
Drop-in Configurations by Environment
Why Custom Endpoint (and not OpenAI / Anthropic vendor)
Optional — Use Ask Sage for Copilot Utility Tasks
Configuration Reference
DoD and Managed Network Connectivity
Troubleshooting
Privacy and Data Usage
Additional Resources

Instance-Specific Base URL: The endpoints and configuration shown reflect the instance at chat.asksage.ai. The api. prefix and path suffix stay the same across deployments — only the instance segment in the middle changes based on which Ask Sage instance you are logging into. Always use the instance approved by your organization and applicable regulatory requirements, and match the base URL in your configuration to the instance you authenticate against.

At a Glance

What this integration does

VS Code 1.122 added a Custom Endpoint BYOK provider that speaks OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages. This page shows how to point that provider at Ask Sage so GPT, Claude, and Gemini-style models become first-class options in the Copilot Chat model picker — with the same security boundary, logging, and policy controls you already get from Ask Sage.

API key only — no Entra ID, no extra sign-in

Works in Commercial, Gov, and managed networks

Chat Completions, Responses, and Anthropic Messages in one provider group

Per-model reasoning effort, tool calling, and vision toggles

Prerequisites

Before you begin

Visual Studio Code 1.122 or later — the Custom Endpoint provider was added in this release
GitHub Copilot Chat enabled in VS Code
An Ask Sage API key (from your account settings)
Network access to the Ask Sage endpoint from your workstation
For DoD or other managed networks, your organization-provided root certificate may need to be configured for VS Code or your OS certificate store

Copilot Business / Enterprise users

If you are on a Copilot Business or Enterprise plan, your organization administrator must first enable the Bring Your Own Language Model Key in VS Code policy in GitHub Copilot policy settings. Without that policy, the Custom Endpoint flow will not appear.

Step 1 — Add Ask Sage as a Custom Endpoint Provider

Open the Command Palette (Ctrl+Shift+P / Cmd+Shift+P)
Run Chat: Manage Language Models
Select Add Models...
Choose Custom Endpoint
Enter Ask Sage as the group name
Paste your Ask Sage API key — VS Code stores it in OS secret storage, not in the JSON file
Choose the default API type for the group (you can mix shapes later):
- Chat Completions — for /openai/v1/chat/completions models
- Responses — for /openai/v1/responses models
- Messages — for /anthropic/v1/messages models

After you choose the API type, VS Code opens chatLanguageModels.json with a starter Ask Sage provider group and an empty model entry. The next step is filling that in.

Do not paste a raw API key into chatLanguageModels.json. Secret fields are resolved through VS Code secret storage. After you enter the key in the UI, VS Code writes an ${input:chat.lm.secret...} reference into the file. That reference is what should live in the JSON.

Step 2 — Configure Models

VS Code stores BYOK model groups as a top-level JSON array. Each entry is one provider group. Fill in the models array with one or more Ask Sage models. Pick the configuration shape that matches the endpoint you are calling.

Looking for the complete model list? The four Option snippets below are minimal starters. See Available Models by Environment and Drop-in Configurations by Environment further down for full per-tenant catalogs (Commercial / Gov / DoD) and copy-paste configurations.

Option A — OpenAI Chat Completions

chatLanguageModels.json — Chat Completions

[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "chat-completions",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      }
    ]
  }
]

Option B — OpenAI Responses (with reasoning)

chatLanguageModels.json — Responses

[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "responses",
    "models": [
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high"],
        "reasoningEffortFormat": "responses",
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      }
    }
  }
]

Option C — Anthropic Messages

chatLanguageModels.json — Anthropic Messages

[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "apiType": "messages",
    "models": [
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (Ask Sage)",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000
      }
    ],
    "settings": {
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Option D — Combined (Chat Completions + Responses + Messages)

You can place all three API shapes in a single Ask Sage provider group. Set apiType on each individual model to override the group default.

chatLanguageModels.json — Combined (live-tested)

[
  {
    "name": "Ask Sage",
    "vendor": "customendpoint",
    "apiKey": "${input:chat.lm.secret.example}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Ask Sage)",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high"],
        "reasoningEffortFormat": "responses",
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (Ask Sage)",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "thinking": true,
        "supportsReasoningEffort": ["low", "medium", "high", "xhigh", "max"],
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Step 3 — Verify the Configuration

Save chatLanguageModels.json
Open the Command Palette and run Chat: Manage Language Models again
Your configured Ask Sage models should appear in the Language Models pane
Open Copilot Chat, click the model picker, and pick one of the Ask Sage models
Send a simple prompt such as hello world!. A response confirms VS Code is reaching Ask Sage through the configured BYOK endpoint.

Configured Ask Sage models in Manage Language Models

Ask Sage model responding in Copilot Chat

Available Models by Environment

The Ask Sage models exposed through the OpenAI- and Anthropic-compatible endpoints depend on which Ask Sage environment your API key is provisioned in. The catalog below mirrors the canonical per-environment allow-lists from the Ask Sage Client (src/config.js) enriched with model metadata from the Ask Sage CoreUI shared model catalog (src/Data/models.ts).

Image, video, and embedding models are intentionally omitted — the VS Code Copilot Chat picker only consumes chat / reasoning / Anthropic Messages shapes.

You can always confirm what your specific account is entitled to by calling:

curl https://api.asksage.ai/server/openai/v1/models \
  -H "Authorization: Bearer $ASKSAGE_API_KEY"

curl https://api.asksage.ai/server/anthropic/v1/models \
  -H "Authorization: Bearer $ASKSAGE_API_KEY"

Heads up: the live /openai/v1/models and /anthropic/v1/models endpoints currently return a static catalog that does not yet enforce per-environment filtering or expose the full set of supported IDs (including internal aliases). Treat the tables below as the source of truth for now; a Server-side fix is tracked in the Ask Sage Server repo to bring those endpoints in line.

Commercial (SaaS) Tenants

Default profile for accounts on api.asksage.ai not tagged as Gov or DoD.

Commercial — 47 chat / reasoning models

API Shape	Public ID	Display Name	Provider / Hosting	Input Ctx	Output Ctx	Tools	Vision	Reasoning
Anthropic Messages	`claude-haiku-4-5` alias: `claude-haiku-4-5-com`	Anthropic Claude Haiku 4.5	Direct	200,000	32,000	✅	✅	✅
Anthropic Messages	`claude-opus-4` alias: `google-claude-4-opus`	Google Anthropic Claude 4.1 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-5` alias: `google-claude-45-opus`	Google Anthropic Claude 4.5 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-6` alias: `google-claude-46-opus`	Google Anthropic Claude 4.6 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-7` alias: `claude-opus-4-7-com`	Anthropic Claude Opus 4.7	Direct	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-8` alias: `google-claude-48-opus`	Google Anthropic Claude 4.8 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-sonnet-4` alias: `google-claude-4-sonnet`	Google Anthropic Claude 4 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-5-vertex` alias: `google-claude-45-sonnet`	Google Anthropic Claude 4.5 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-6` alias: `claude-sonnet-4-6-com`	Anthropic Claude Sonnet 4.6	Direct	200,000	32,768	✅	✅	✅
Chat Completions	`aws-bedrock-gpt-oss-120b-gov`	OpenAI GPT-OSS 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-gpt-oss-20b-gov`	OpenAI GPT-OSS 20B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-nemotron-12b-vl-gov`	NVIDIA Nemotron Nano 12B v2 VL	AWS Bedrock GovCloud	131,000	8,192	✅	✅	—
Chat Completions	`aws-bedrock-nemotron-30b-gov`	NVIDIA Nemotron Nano 3 30B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-9b-gov`	NVIDIA Nemotron Nano 9B v2	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-super-3-120b-gov`	NVIDIA Nemotron Super 3 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`deepseek-v3.2-com`	DeepSeek V3.2	Direct	128,000	8,192	✅	—	✅
Chat Completions	`deepseek-v4-flash`	DeepSeek V4 Flash	Direct	128,000	8,192	✅	—	✅
Chat Completions	`google-gemini-2.5-flash`	Google Gemini 2.5 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-2.5-pro`	Google Gemini 2.5 Pro	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-20-flash`	Google Gemini 2.0 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3-flash-com`	Google Gemini 3 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.1-flash-lite-com`	Google Gemini 3.1 Flash Lite	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.1-pro-com`	Google Gemini 3.1 Pro	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.5-flash-com`	Google Gemini 3.5 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`gpt-4.1`	Azure OpenAI GPT-4.1	Azure OpenAI (Commercial)	128,000	32,768	✅	✅	—
Chat Completions	`gpt-4.1-mini`	Azure OpenAI GPT-4.1-mini	Azure OpenAI (Commercial)	128,000	32,768	✅	✅	—
Chat Completions	`gpt-4.1-nano`	Azure OpenAI GPT-4.1-nano	Azure OpenAI (Commercial)	128,000	16,384	✅	✅	—
Chat Completions	`grok-4-1-fast-non-reasoning`	X.AI Grok 4.1 Fast	Direct	256,000	16,384	✅	✅	—
Chat Completions	`grok-4-1-fast-reasoning`	X.AI Grok 4.1 Fast (Reasoning)	Direct	256,000	16,384	✅	✅	✅
Chat Completions	`grok-4-20-non-reasoning`	X.AI Grok 4.20 (Fast)	Direct	256,000	16,384	✅	✅	—
Chat Completions	`grok-4-20-reasoning`	X.AI Grok 4.20 (Reasoning)	Direct	256,000	16,384	✅	✅	✅
Chat Completions	`groq-70b`	Groq-70B	Groq Cloud	128,000	8,192	—	—	—
Chat Completions	`groq-llama33`	Groq LLAMA 3.3	Groq Cloud	128,000	8,192	—	—	—
Chat Completions	`groq-llama4-scout`	Groq LLAMA 4-Scout	Groq Cloud	128,000	8,192	—	—	—
Chat Completions	`kimi-2.6-com`	Moonshot Kimi K2.6	Direct	200,000	16,384	✅	—	—
Chat Completions	`mistral-large-3`	Mistral Large 3	Azure OpenAI (Commercial)	128,000	32,000	✅	—	—
Responses	`gpt-5`	Azure OpenAI GPT-5	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5-mini`	Azure OpenAI GPT-5-mini	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5-nano`	Azure OpenAI GPT-5-nano	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.1`	Azure OpenAI GPT-5.1	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.2`	Azure OpenAI GPT-5.2	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.4`	Azure OpenAI GPT-5.4	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.4-nano`	Azure OpenAI GPT-5.4-nano	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-o1`	Azure OpenAI GPT-o1	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅
Responses	`gpt-o3`	Azure OpenAI GPT-o3	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅
Responses	`gpt-o3-mini`	Azure OpenAI GPT-o3-mini	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅
Responses	`gpt-o4-mini`	Azure OpenAI GPT-o4-mini	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅

Gov Tenants (FedRAMP / IL2–IL4)

Profile when the tenant has force_gov_models=true. Superset of the commercial-equivalent models with -gov variants for partner models that are not yet generally available in commercial.

Gov — 44 chat / reasoning models

API Shape	Public ID	Display Name	Provider / Hosting	Input Ctx	Output Ctx	Tools	Vision	Reasoning
Anthropic Messages	`claude-haiku-4-5` alias: `google-claude-45-haiku`	Google Anthropic Claude 4.5 Haiku	Google Vertex AI	200,000	32,000	✅	✅	✅
Anthropic Messages	`claude-opus-4` alias: `google-claude-4-opus`	Google Anthropic Claude 4.1 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-5` alias: `google-claude-45-opus`	Google Anthropic Claude 4.5 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-6` alias: `google-claude-46-opus`	Google Anthropic Claude 4.6 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-7` alias: `google-claude-47-opus`	Google Anthropic Claude 4.7 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-8` alias: `google-claude-48-opus`	Google Anthropic Claude 4.8 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-sonnet-4` alias: `google-claude-4-sonnet`	Google Anthropic Claude 4 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-5` alias: `aws-bedrock-claude-45-sonnet-gov`	AWS Gov Bedrock Claude 4.5 Sonnet	AWS Bedrock GovCloud	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-5-vertex` alias: `google-claude-45-sonnet`	Google Anthropic Claude 4.5 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-6` alias: `google-claude-46-sonnet`	Google Anthropic Claude 4.6 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Chat Completions	`aws-bedrock-gpt-oss-120b-gov`	OpenAI GPT-OSS 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-gpt-oss-20b-gov`	OpenAI GPT-OSS 20B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-nemotron-12b-vl-gov`	NVIDIA Nemotron Nano 12B v2 VL	AWS Bedrock GovCloud	131,000	8,192	✅	✅	—
Chat Completions	`aws-bedrock-nemotron-30b-gov`	NVIDIA Nemotron Nano 3 30B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-9b-gov`	NVIDIA Nemotron Nano 9B v2	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-super-3-120b-gov`	NVIDIA Nemotron Super 3 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nova-lite-gov`	AWS Gov Bedrock Nova Lite	AWS Bedrock GovCloud	128,000	5,000	✅	✅	—
Chat Completions	`aws-bedrock-nova-micro-gov`	AWS Gov Bedrock Nova Micro	AWS Bedrock GovCloud	128,000	5,000	✅	—	—
Chat Completions	`aws-bedrock-nova-pro-gov`	AWS Gov Bedrock Nova Pro	AWS Bedrock GovCloud	300,000	5,000	✅	✅	—
Chat Completions	`google-gemini-2.5-flash`	Google Gemini 2.5 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-2.5-pro`	Google Gemini 2.5 Pro	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-20-flash`	Google Gemini 2.0 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.1-flash-lite-gov`	Google Gemini 3.1 Flash Lite Gov	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.5-flash-gov`	Google Gemini 3.5 Flash Gov	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`gpt-4.1`	Azure OpenAI GPT-4.1	Azure OpenAI (Commercial)	128,000	32,768	✅	✅	—
Chat Completions	`gpt-4.1-mini`	Azure OpenAI GPT-4.1-mini	Azure OpenAI (Commercial)	128,000	32,768	✅	✅	—
Chat Completions	`gpt-4.1-nano`	Azure OpenAI GPT-4.1-nano	Azure OpenAI (Commercial)	128,000	16,384	✅	✅	—
Chat Completions	`grok-4-1-fast-non-reasoning`	X.AI Grok 4.1 Fast	Direct	256,000	16,384	✅	✅	—
Chat Completions	`grok-4-1-fast-reasoning`	X.AI Grok 4.1 Fast (Reasoning)	Direct	256,000	16,384	✅	✅	✅
Chat Completions	`grok-4-20-non-reasoning`	X.AI Grok 4.20 (Fast)	Direct	256,000	16,384	✅	✅	—
Chat Completions	`grok-4-20-reasoning`	X.AI Grok 4.20 (Reasoning)	Direct	256,000	16,384	✅	✅	✅
Chat Completions	`llma3`	LLAMA 3	AWS Bedrock GovCloud	128,000	8,192	✅	—	—
Chat Completions	`llma3-8b`	Meta Llama 3 8B	AWS Bedrock GovCloud	128,000	8,192	✅	—	—
Chat Completions	`mistral-large-3`	Mistral Large 3	Azure OpenAI (Commercial)	128,000	32,000	✅	—	—
Responses	`gpt-5`	Azure OpenAI GPT-5	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5-mini`	Azure OpenAI GPT-5-mini	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5-nano`	Azure OpenAI GPT-5-nano	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.1`	Azure OpenAI GPT-5.1	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.1-gov`	Azure Gov OpenAI GPT-5.1	Azure OpenAI Gov	272,000	128,000	✅	✅	✅
Responses	`gpt-5.2`	Azure OpenAI GPT-5.2	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.4`	Azure OpenAI GPT-5.4	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-5.4-nano`	Azure OpenAI GPT-5.4-nano	Azure OpenAI (Commercial)	272,000	128,000	✅	✅	✅
Responses	`gpt-o1`	Azure OpenAI GPT-o1	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅
Responses	`gpt-o3-mini`	Azure OpenAI GPT-o3-mini	Azure OpenAI (Commercial)	200,000	100,000	✅	✅	✅

DoD Tenants (IL5 / IL6)

DoD operators: only the models in this table are approved in the DoD-locked profile (force_dod_models=true). Calling any other model ID will return 403 model_not_allowed. This list mirrors the canonical allow-list in Ask Sage Client src/config.js and is the safe set to publish in a DoD environment.

DoD-Approved — 26 chat / reasoning models

API Shape	Public ID	Display Name	Provider / Hosting	Input Ctx	Output Ctx	Tools	Vision	Reasoning
Anthropic Messages	`claude-haiku-4-5` alias: `google-claude-45-haiku`	Google Anthropic Claude 4.5 Haiku	Google Vertex AI	200,000	32,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-5` alias: `google-claude-45-opus`	Google Anthropic Claude 4.5 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-6` alias: `google-claude-46-opus`	Google Anthropic Claude 4.6 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-7` alias: `google-claude-47-opus`	Google Anthropic Claude 4.7 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-opus-4-8` alias: `google-claude-48-opus`	Google Anthropic Claude 4.8 Opus	Google Vertex AI	200,000	64,000	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-5-vertex` alias: `google-claude-45-sonnet`	Google Anthropic Claude 4.5 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Anthropic Messages	`claude-sonnet-4-6` alias: `google-claude-46-sonnet`	Google Anthropic Claude 4.6 Sonnet	Google Vertex AI	200,000	32,768	✅	✅	✅
Chat Completions	`aws-bedrock-gpt-oss-120b-gov`	OpenAI GPT-OSS 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-gpt-oss-20b-gov`	OpenAI GPT-OSS 20B	AWS Bedrock GovCloud	131,000	8,192	✅	—	✅
Chat Completions	`aws-bedrock-nemotron-12b-vl-gov`	NVIDIA Nemotron Nano 12B v2 VL	AWS Bedrock GovCloud	131,000	8,192	✅	✅	—
Chat Completions	`aws-bedrock-nemotron-30b-gov`	NVIDIA Nemotron Nano 3 30B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-9b-gov`	NVIDIA Nemotron Nano 9B v2	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nemotron-super-3-120b-gov`	NVIDIA Nemotron Super 3 120B	AWS Bedrock GovCloud	131,000	8,192	✅	—	—
Chat Completions	`aws-bedrock-nova-lite-gov`	AWS Gov Bedrock Nova Lite	AWS Bedrock GovCloud	128,000	5,000	✅	✅	—
Chat Completions	`aws-bedrock-nova-micro-gov`	AWS Gov Bedrock Nova Micro	AWS Bedrock GovCloud	128,000	5,000	✅	—	—
Chat Completions	`aws-bedrock-nova-pro-gov`	AWS Gov Bedrock Nova Pro	AWS Bedrock GovCloud	300,000	5,000	✅	✅	—
Chat Completions	`google-gemini-2.5-flash`	Google Gemini 2.5 Flash	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-2.5-pro`	Google Gemini 2.5 Pro	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.1-flash-lite-gov`	Google Gemini 3.1 Flash Lite Gov	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`google-gemini-3.5-flash-gov`	Google Gemini 3.5 Flash Gov	Google Vertex AI	1,000,000	8,192	✅	✅	—
Chat Completions	`gpt-4.1-gov`	Azure Gov OpenAI GPT-4.1	Azure OpenAI Gov	128,000	32,768	✅	✅	—
Chat Completions	`gpt-4.1-mini-gov`	Azure Gov OpenAI GPT-4.1-mini	Azure OpenAI Gov	128,000	32,768	✅	✅	—
Chat Completions	`llma3`	LLAMA 3	AWS Bedrock GovCloud	128,000	8,192	✅	—	—
Chat Completions	`llma3-8b`	Meta Llama 3 8B	AWS Bedrock GovCloud	128,000	8,192	✅	—	—
Responses	`gpt-5.1-gov`	Azure Gov OpenAI GPT-5.1	Azure OpenAI Gov	272,000	128,000	✅	✅	✅
Responses	`gpt-o3-mini-gov`	Azure Gov OpenAI GPT-o3-mini	Azure OpenAI Gov	200,000	100,000	✅	✅	✅

Drop-in Configurations by Environment

Paste one of the snippets below into chatLanguageModels.json based on the Ask Sage tenant your API key is associated with. The snippets are curated starter sets — expand from the catalog above as needed.

Commercial Drop-in

chatLanguageModels.json — Commercial

[
  {
    "name": "Ask Sage (Commercial)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.5",
        "name": "GPT 5.5 (Reasoning) - Ask Sage",
        "url": "https://api.asksage.ai/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-opus-4-8",
        "name": "Claude Opus 4.8 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 - Ask Sage",
        "url": "https://api.asksage.ai/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.5": {
        "reasoningEffort": "high"
      },
      "claude-opus-4-8": {
        "reasoningEffort": "medium"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Gov Drop-in

Replace the base URL: The url values below use a YOUR-GOV-TENANT placeholder. Before pasting this configuration into VS Code, swap that placeholder for the base URL your government Ask Sage tenant issued when you generated your API key. Do not point a government workload at the commercial endpoint.

chatLanguageModels.json — Gov (FedRAMP / IL2–IL4)

[
  {
    "name": "Ask Sage (Gov)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1",
        "name": "GPT 4.1 - Ask Sage",
        "url": "https://api.YOUR-GOV-TENANT/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.1-gov",
        "name": "GPT 5.1 (Gov Reasoning) - Ask Sage",
        "url": "https://api.YOUR-GOV-TENANT/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 - Ask Sage",
        "url": "https://api.YOUR-GOV-TENANT/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 - Ask Sage",
        "url": "https://api.YOUR-GOV-TENANT/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.1-gov": {
        "reasoningEffort": "high"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

DoD Drop-in

Replace the base URL: The url values below use a YOUR-DOD-TENANT placeholder. Before pasting this configuration into VS Code, swap that placeholder for the base URL your DoD Ask Sage tenant issued when you generated your API key. Do not point a DoD workload at the commercial endpoint.

DoD-only: The Claude IDs in this snippet (claude-sonnet-4-6, claude-opus-4-7) resolve to Google Vertex AI deployed inside an IL5 Assured Workloads folder — not commercial Vertex. claude-sonnet-4-5 (which routes to AWS Bedrock GovCloud) is not in the force_dod_models allow-list, so it is omitted here.

chatLanguageModels.json — DoD (IL5 / IL6)

[
  {
    "name": "Ask Sage (DoD)",
    "vendor": "customendpoint",
    "apiKey": "${input:asksage-api-key}",
    "models": [
      {
        "id": "gpt-4.1-gov",
        "name": "GPT 4.1 (Gov) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-4.1-mini-gov",
        "name": "GPT 4.1 Mini (Gov) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/openai/v1/chat/completions",
        "apiType": "chat-completions",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 128000,
        "maxOutputTokens": 32768
      },
      {
        "id": "gpt-5.1-gov",
        "name": "GPT 5.1 (Gov) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 272000,
        "maxOutputTokens": 128000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "gpt-o3-mini-gov",
        "name": "GPT o3 Mini (Gov) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/openai/v1/responses",
        "apiType": "responses",
        "toolCalling": true,
        "vision": false,
        "maxInputTokens": 200000,
        "maxOutputTokens": 100000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high"
        ],
        "reasoningEffortFormat": "responses"
      },
      {
        "id": "claude-sonnet-4-6",
        "name": "Claude Sonnet 4.6 (IL5 Vertex) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 32768,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      },
      {
        "id": "claude-opus-4-7",
        "name": "Claude Opus 4.7 (IL5 Vertex) - Ask Sage",
        "url": "https://api.YOUR-DOD-TENANT/server/anthropic/v1/messages",
        "apiType": "messages",
        "toolCalling": true,
        "vision": true,
        "maxInputTokens": 200000,
        "maxOutputTokens": 64000,
        "thinking": true,
        "supportsReasoningEffort": [
          "low",
          "medium",
          "high",
          "xhigh",
          "max"
        ]
      }
    ],
    "settings": {
      "gpt-5.1-gov": {
        "reasoningEffort": "high"
      },
      "gpt-o3-mini-gov": {
        "reasoningEffort": "high"
      },
      "claude-sonnet-4-6": {
        "reasoningEffort": "medium"
      },
      "claude-opus-4-7": {
        "reasoningEffort": "medium"
      }
    }
  }
]

Why Custom Endpoint (and not OpenAI / Anthropic vendor)

Always use vendor: "customendpoint" for Ask Sage. The Ask Sage compatibility endpoints are OpenAI- and Anthropic-style, but they are hosted by Ask Sage — not by OpenAI or Anthropic directly.

Do not use vendor: "openai" — VS Code's built-in OpenAI provider targets the official OpenAI API.
Do not use vendor: "anthropic" — VS Code's built-in Anthropic provider targets Anthropic's official API.

For OpenAI-shaped endpoints VS Code sends the key as Authorization: Bearer .... For Anthropic-shaped endpoints with apiType: "messages", VS Code sends it as x-api-key — both are supported by Ask Sage.

Optional — Use Ask Sage for Copilot Utility Tasks

VS Code uses lightweight background models for utility tasks like title generation, commit messages, and intent detection. You can route those through Ask Sage too:

VS Code settings.json

{
  "chat.utilityModel": "customendpoint/gpt-4.1",
  "chat.utilitySmallModel": "customendpoint/gpt-4.1-mini"
}

The format is ${vendor}/${modelId}. For Ask Sage models, the vendor is always customendpoint, so values look like customendpoint/gpt-4.1-mini. A fast, inexpensive model is recommended for chat.utilitySmallModel since it is invoked frequently.

Configuration Reference

The fields most relevant to Ask Sage models:

Property	Type	Notes
`id`	string	The model identifier Ask Sage expects (e.g., `gpt-4.1`, `claude-opus-4-7`)
`name`	string	Display name in the Copilot Chat model picker
`url`	string	Full Ask Sage endpoint URL for this model's API shape
`apiType`	string	`chat-completions`, `responses`, or `messages` — overrides the group default
`toolCalling`	boolean	Set to `true` only if the model supports tool calling
`vision`	boolean	Set to `true` only if the model supports image inputs
`maxInputTokens`	integer	Context window for input tokens
`maxOutputTokens`	integer	Maximum response length
`thinking`	boolean	Set to `true` for reasoning-capable models (Responses or Anthropic with thinking)
`supportsReasoningEffort`	array	Effort levels: typically `["low", "medium", "high"]`; Anthropic also supports `"xhigh"` and `"max"`
`reasoningEffortFormat`	string	For `/responses` endpoints set to `"responses"` (sends nested `reasoning.effort`); defaults follow URL otherwise
`streaming`	boolean	Optional, defaults to `true`; Ask Sage supports streaming via SSE
`requestHeaders`	object	Optional extra headers; reserved/forwarding headers are ignored

For the complete reference (including provider-level fields and advanced options) see the VS Code language models documentation.

DoD and Managed Network Connectivity

If your users connect through a DoD, DoW, or other managed network, certificate and proxy configuration may be required before VS Code can reach the Ask Sage endpoint. For the model allow-list approved in DoD environments, see the DoD Tenants section above.

If your environment requires a custom root certificate, configure it according to your organization policy — typically through the OS certificate store or the http.proxyStrictSSL / http.systemCertificates VS Code settings
Test reachability from a terminal, substituting the base URL issued to your government / DoD tenant: curl -I https://api.YOUR-TENANT/server/openai/v1/models -H "Authorization: Bearer $KEY"
If using a proxy, ensure VS Code's http.proxy setting is configured and matches your shell environment

Troubleshooting

Manage Language Models shows nothing

Make sure chatLanguageModels.json is a top-level array, not an object with a providers property.

Correct:

[
  { "name": "Ask Sage", "vendor": "customendpoint" }
]

Incorrect:

{
  "providers": []
}

API key not found or authentication fails

Re-enter the key through Chat: Manage Language Models so VS Code stores it as a secret
Confirm the JSON contains an ${input:chat.lm.secret...} reference for apiKey (not the raw key)
Verify the Ask Sage API key is still active in your account settings
Verify the endpoint URL matches the configured apiType — an Anthropic URL with apiType: "chat-completions" will fail authentication

Model does not appear in the picker

Confirm the provider group vendor is customendpoint
Confirm each model has id, name, url, toolCalling, vision, maxInputTokens, and maxOutputTokens
For agent / tool-use scenarios, the model must have toolCalling: true — otherwise it is hidden from the picker
Reload the VS Code window after editing chatLanguageModels.json directly

Reasoning effort does not appear in the picker

Set thinking: true on the model
Add supportsReasoningEffort with the effort values your endpoint accepts
For /responses endpoints, set reasoningEffortFormat: "responses"

Privacy and Data Usage

When configured through BYOK, chat requests for the selected model are sent to the Ask Sage endpoint you configured. The same Ask Sage tenant data-handling and logging policies apply as for any other Ask Sage API consumer. Refer to the Ask Sage Privacy & Security FAQ for the specifics of retention, logging, and training behavior in your environment.

Additional Resources

VS Code language model docs Ask Sage OpenAI compatibility Ask Sage Anthropic compatibility