This article lists a selection of Microsoft Foundry Models sold directly by Azure that are available in Azure Government, along with their capabilities, deployment types, and regions of availability.
Models sold directly by Azure include all Azure OpenAI models offered in Azure Government. These models are billed through your Azure subscription, covered by Azure service-level agreements, and supported by Microsoft.
To learn more about attributes of all Foundry Models sold directly by Azure across all clouds, see Models Sold Directly by Azure.
Azure OpenAI in Microsoft Foundry models in Azure Government
Azure OpenAI is powered by a diverse set of models with different capabilities and price points. Model availability in Azure Government varies by region.
| Models | Description |
|---|---|
| GPT-5.1 series | **NEW** gpt-5.1 |
| GPT-4.1 series | gpt-4.1, gpt-4.1-mini |
| o-series models | Reasoning models with advanced problem solving and increased focus and capability. |
| GPT-4o | Capable Azure OpenAI models with multimodal versions, which can accept both text and images as input. |
| Embeddings | A set of models that can convert text into numerical vector form to facilitate text similarity. |
GPT-5.1
Region availability
| Model | Region |
|---|---|
| gpt-5.1 | See the models table. |
| Model ID | Description | Context window | Max output tokens | Training data (up to) |
|---|---|---|---|---|
| gpt-5.1 (2025-11-13) | - Reasoning<br>- Chat Completions API<br>- Responses API<br>- Structured outputs<br>- Text and image processing<br>- Functions, tools, and parallel tool calling<br>- Full summary of capabilities | 400,000<br>Input: 272,000<br>Output: 128,000 | 128,000 | September 30, 2024 |
Important
For `gpt-5.1`, `reasoning_effort` defaults to `none`. When upgrading from previous reasoning models to `gpt-5.1`, keep in mind that you might need to update your code to explicitly pass a `reasoning_effort` level if you want reasoning to occur.
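As a minimal sketch of passing an explicit `reasoning_effort`, the helper below assembles the request keyword arguments for the `chat.completions.create` call in the openai 1.x Python SDK. The deployment name `gpt-5.1` and the commented-out client setup are placeholders; substitute your own Azure Government endpoint, key, and API version.

```python
# Sketch: explicitly setting reasoning_effort for gpt-5.1 (openai>=1.x SDK).
# Deployment name, endpoint, and API version are placeholders.

def build_chat_request(messages, effort="medium"):
    """Assemble chat.completions.create kwargs with reasoning explicitly on."""
    return {
        "model": "gpt-5.1",          # your *deployment* name, not the model ID
        "messages": messages,
        "reasoning_effort": effort,  # gpt-5.1 defaults to "none"
    }

request = build_chat_request(
    [{"role": "user", "content": "Summarize this contract clause."}]
)

# With an AzureOpenAI client this would be sent as:
#   from openai import AzureOpenAI
#   client = AzureOpenAI(azure_endpoint=..., api_key=..., api_version=...)
#   response = client.chat.completions.create(**request)
print(request["reasoning_effort"])
```

Omitting `reasoning_effort` (or passing `none`) keeps the model in its default non-reasoning mode.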
GPT-4.1 series
Region availability
| Model | Region |
|---|---|
| gpt-4.1 (2025-04-14) | See the models table. |
| gpt-4.1-mini (2025-04-14) | See the models table. |
Capabilities
Important
A known issue affects all GPT-4.1 series models: large tool or function call definitions that exceed 300,000 tokens result in failures, even when the models' 1-million-token context limit hasn't been reached.
The errors vary based on the API call and the characteristics of the underlying payload.
Here are the error messages for the Chat Completions API:

```
Error code: 400 - {'error': {'message': "This model's maximum context length is 300000 tokens. However, your messages resulted in 350564 tokens (100 in the messages, 350464 in the functions). Please reduce the length of the messages or functions.", 'type': 'invalid_request_error', 'param': 'messages', 'code': 'context_length_exceeded'}}
```

```
Error code: 400 - {'error': {'message': "Invalid 'tools[0].function.description': string too long. Expected a string with maximum length 1048576, but got a string with length 2778531 instead.", 'type': 'invalid_request_error', 'param': 'tools[0].function.description', 'code': 'string_above_max_length'}}
```
Here's the error message for the Responses API:

```
Error code: 500 - {'error': {'message': 'The server had an error processing your request. Sorry about that! You can retry your request, or contact us through an Azure support request at: https://go.microsoft.com/fwlink/?linkid=2213926 if you keep seeing this error. (Please include the request ID d2008353-291d-428f-adc1-defb5d9fb109 in your email.)', 'type': 'server_error', 'param': None, 'code': None}}
```
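One way to avoid hitting this known issue is a pre-flight size check on the tools array before sending the request. The sketch below estimates the token footprint from the serialized JSON length using a rough 4-characters-per-token ratio; that ratio is a heuristic of this example, not an exact tokenizer, so leave headroom below the 300,000-token threshold.

```python
# Rough pre-flight check for the GPT-4.1 known issue: tool/function
# definitions above 300,000 tokens fail. The 4-chars-per-token ratio
# is a heuristic estimate, not an exact token count.
import json

TOOL_TOKEN_LIMIT = 300_000

def estimate_tool_tokens(tools, chars_per_token=4):
    """Estimate the token footprint of a tools array via its JSON length."""
    return len(json.dumps(tools)) // chars_per_token

def check_tools(tools):
    """Raise before sending a request likely to hit the 300K-token failure."""
    estimate = estimate_tool_tokens(tools)
    if estimate > TOOL_TOKEN_LIMIT:
        raise ValueError(
            f"Tool definitions estimated at ~{estimate} tokens; "
            f"exceeds the {TOOL_TOKEN_LIMIT}-token known-issue threshold."
        )
    return estimate

tools = [{"type": "function",
          "function": {"name": "lookup", "description": "Find a record."}}]
print(check_tools(tools))  # small payload passes the check
```

Running the check before each call that includes tools turns an opaque 400/500 response into an actionable client-side error.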
| Model ID | Description | Context window | Max output tokens | Training data (up to) |
|---|---|---|---|---|
| gpt-4.1 (2025-04-14) | - Text and image input<br>- Text output<br>- Chat completions API<br>- Responses API<br>- Streaming<br>- Function calling<br>- Structured outputs (chat completions) | - 1,047,576 (not available)<br>- 128,000 (standard & provisioned managed deployments)<br>- 300,000 (batch deployments) | 32,768 | May 31, 2024 |
| gpt-4.1-mini (2025-04-14) | - Text and image input<br>- Text output<br>- Chat completions API<br>- Responses API<br>- Streaming<br>- Function calling<br>- Structured outputs (chat completions) | - 1,047,576 (not available)<br>- 128,000 (standard & provisioned managed deployments)<br>- 300,000 (batch deployments) | 32,768 | May 31, 2024 |
o-series models
The Azure OpenAI o-series models are designed to tackle reasoning and problem-solving tasks with increased focus and capability. These models spend more time processing and understanding the user's request, making them exceptionally strong in areas like science, coding, and math, compared to previous iterations.
| Model ID | Description | Max request (tokens) | Training data (up to) |
|---|---|---|---|
| o3-mini (2025-01-31) | - Enhanced reasoning abilities<br>- Structured outputs<br>- Text-only processing<br>- Functions and tools | Input: 200,000<br>Output: 100,000 | October 2023 |
To learn more about advanced o-series models, see Getting started with reasoning models.
Region availability
| Model | Region |
|---|---|
| o3-mini | See the models table. |
GPT-4o
GPT-4o integrates text and images in a single model, which enables it to handle multiple data types simultaneously. This multimodal approach enhances accuracy and responsiveness in human-computer interactions. GPT-4o matches GPT-4 Turbo in English text and coding tasks while offering superior performance in non-English language tasks and vision tasks, setting new benchmarks for AI capabilities.
| Model ID | Description | Max request (tokens) | Training data (up to) |
|---|---|---|---|
| gpt-4o (2024-11-20)<br>GPT-4o (Omni) | - Structured outputs<br>- Text and image processing<br>- JSON Mode<br>- Parallel function calling<br>- Enhanced accuracy and responsiveness<br>- Parity with English text and coding tasks compared to GPT-4 Turbo with Vision<br>- Superior performance in non-English languages and in vision tasks<br>- Enhanced creative writing ability | Input: 128,000<br>Output: 16,384 | October 2023 |
Embeddings
text-embedding-3-large is the latest and most capable embedding model. You can't upgrade between embeddings models. To move from using text-embedding-ada-002 to text-embedding-3-large, you need to generate new embeddings.
The available embedding models are `text-embedding-3-large`, `text-embedding-3-small`, and `text-embedding-ada-002`.
OpenAI reports that both the large and small third-generation embedding models offer better average multi-language retrieval performance on the MIRACL benchmark, while still maintaining performance for English tasks on the MTEB benchmark.
| Evaluation benchmark | text-embedding-ada-002 | text-embedding-3-small | text-embedding-3-large |
|---|---|---|---|
| MIRACL average | 31.4 | 44.0 | 54.9 |
| MTEB average | 61.0 | 62.3 | 64.6 |
The third generation embeddings models support reducing the size of the embedding via a new dimensions parameter. Typically, larger embeddings are more expensive from a compute, memory, and storage perspective. When you can adjust the number of dimensions, you gain more control over overall cost and performance. The dimensions parameter isn't supported in all versions of the OpenAI 1.x Python library. To take advantage of this parameter, we recommend that you upgrade to the latest version: pip install openai --upgrade.
OpenAI's MTEB benchmark testing found that even when the third-generation models' dimensions are reduced to fewer than the 1,536 dimensions of text-embedding-ada-002, performance remains slightly better.
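As a minimal sketch of using the `dimensions` parameter, the helper below builds the keyword arguments for an `embeddings.create` call in the openai 1.x Python SDK. The deployment name and the 1,024-dimension choice are placeholders; only third-generation embedding models accept `dimensions`.

```python
# Sketch: requesting reduced-dimension embeddings (openai>=1.x SDK).
# Deployment name and dimension count below are placeholder choices.

def build_embedding_request(texts, dims=1024):
    """Assemble embeddings.create kwargs with a reduced vector size."""
    return {
        "model": "text-embedding-3-large",  # your deployment name
        "input": texts,
        "dimensions": dims,  # supported only by third-generation models
    }

request = build_embedding_request(["contract clause", "policy excerpt"])

# With an AzureOpenAI client this would be sent as:
#   response = client.embeddings.create(**request)
#   vector = response.data[0].embedding   # len(vector) == request["dimensions"]
print(request["dimensions"])
```

Smaller vectors reduce storage and similarity-search cost; benchmark your own retrieval quality at each size before committing to one.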
Model summary table and region availability
Models by deployment type
Azure OpenAI provides customers with choices on the hosting structure that fits their business and usage patterns. The service offers two main types of deployment:
- Standard: Offers a US Gov data zone deployment option, which routes traffic within Azure Government to provide higher throughput.
- Provisioned: Also offers a data zone deployment option, which allows customers to purchase and deploy provisioned throughput units across Azure Government infrastructure.
All deployments can perform the exact same inference operations, but the billing, scale, and performance are substantially different. To learn more about Azure OpenAI deployment types, see our Deployment types guide.
Data Zone Standard model availability
| Region | gpt-5.1, 2025-11-13 | gpt-4.1, 2025-04-14 | gpt-4.1-mini, 2025-04-14 | o3-mini, 2025-01-31 | gpt-4o, 2024-11-20 | text-embedding-ada-002, 2 | text-embedding-3-large, 1 | text-embedding-3-small, 1 |
|---|---|---|---|---|---|---|---|---|
| usgovarizona | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | - |
| usgovvirginia | ✅ | ✅ | ✅ | ✅ | ✅ | - | - | - |
Embeddings models
These models can be used only with Embedding API requests.
Note
text-embedding-3-large is the latest and most capable embedding model. You can't upgrade between embedding models. To migrate from using text-embedding-ada-002 to text-embedding-3-large, you need to generate new embeddings.
| Model ID | Max request (tokens) | Output dimensions | Training data (up to) |
|---|---|---|---|
| text-embedding-ada-002 (version 2) | 8,192 | 1,536 | Sep 2021 |
| text-embedding-ada-002 (version 1) | 2,046 | 1,536 | Sep 2021 |
| text-embedding-3-large | 8,192 | 3,072 | Sep 2021 |
| text-embedding-3-small | 8,192 | 1,536 | Sep 2021 |
Note
When you send an array of inputs for embedding, the maximum number of input items in the array per call to the embedding endpoint is 2,048.
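Because of the 2,048-item ceiling, embedding a larger corpus means splitting the input into batches before calling the endpoint. A minimal batching sketch (the corpus contents and the commented client call are illustrative):

```python
# The embedding endpoint accepts at most 2,048 input items per call,
# so larger corpora must be split into batches.

MAX_ITEMS_PER_CALL = 2_048

def batched(items, size=MAX_ITEMS_PER_CALL):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

corpus = [f"document {i}" for i in range(5_000)]
batch_sizes = [len(b) for b in batched(corpus)]
print(batch_sizes)  # -> [2048, 2048, 904]

# Each batch then goes to the endpoint as a separate call, e.g.:
#   for batch in batched(corpus):
#       client.embeddings.create(model="text-embedding-3-small", input=batch)
```

Each batch also counts against the model's max request tokens, so very long documents may force smaller batches than the 2,048-item cap alone suggests.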
Model retirement
In some cases, models are retired in Azure Government earlier or later than in the commercial cloud. For the latest information on model retirements, refer to the model retirement guide.
| Model | Version | Public Retirement Plan | Azure Government Plan |
|---|---|---|---|
| gpt-4o | 0513 | 3/31/2026 for Regional Standard. 10/1/2026 for DataZone and PTU. | 3/31/2026 for all Deployment Types. |
| gpt-4o-mini | 0718 | 3/31/2026 for Regional Standard. 10/1/2026 for DataZone and PTU. | 3/31/2026 for all Deployment Types. |