Hello Jesse Mota,
Welcome to Microsoft Q&A. Thank you for reaching out.
The deployment behavior described indicates that the request successfully passed initial validation checks and entered the provisioning phase, where backend capacity is allocated for the selected model. The model appearing in the catalog, the region being supported, and quota approval together confirm eligibility to deploy. However, these checks do not guarantee that usable capacity is immediately available at deployment time.
When a deployment is accepted and later fails with ResourceOperationFailure and InternalServerError, this most commonly points to a temporary backend provisioning or regional capacity constraint. In such cases, the allocation request reaches the control plane successfully, but the underlying infrastructure is unable to bind capacity at that moment. This behavior is consistent with how Azure AI Foundry model provisioning operates, particularly for high‑demand models such as Claude Opus, where capacity is dynamically allocated and shared across tenants and regions.
At present, the error surfaced in the Activity Log is intentionally generic. Failures that occur behind the resource provider boundary, such as allocator or hosting layer issues, do not always propagate a detailed reason to the portal or CLI. As a result, the absence of a specific failure message does not indicate a configuration issue or an invalid request.
Please consider the following workarounds, listed in order of effectiveness:
- Retrying the deployment after some time - backend capacity constraints are often transient. Retrying later can succeed without any changes once capacity becomes available.
- Attempting deployment in another supported region - if the same model deploys successfully in a different region, this strongly indicates that the failure is isolated to temporary capacity pressure in the original region. Regional capacity availability can vary at any given time.
- Reviewing Azure Service Health - the Service Health blade or the Azure status page can help identify broader regional service incidents that may affect provisioning. Model‑specific capacity constraints may not always appear, but this check helps rule out wider platform issues.
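If it helps, the first two workarounds (retry later, then try another region) can be combined in a small script. The following is only a minimal sketch, not an official Azure sample: `deploy` is a hypothetical callable that you would supply yourself (for example, a wrapper around whatever CLI or SDK call you already use to create the deployment), `TransientDeploymentError` is a hypothetical exception your wrapper would raise for failures such as InternalServerError, and the attempt count and delay values are assumptions to tune.

```python
import time
from typing import Callable, Optional, Sequence


class TransientDeploymentError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. InternalServerError)."""


def deploy_with_fallback(
    deploy: Callable[[str], None],
    regions: Sequence[str],
    attempts_per_region: int = 3,
    base_delay_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Try each region in order, retrying with exponential backoff.

    `deploy` is a caller-supplied function (an assumption, not an Azure API)
    that attempts the deployment in the given region and raises
    TransientDeploymentError when the failure looks transient.
    Returns the region that succeeded, or None if every attempt failed.
    """
    for region in regions:
        for attempt in range(attempts_per_region):
            try:
                deploy(region)
                return region
            except TransientDeploymentError:
                # Wait before the next attempt in the same region; skip the
                # wait after the last attempt and move on to the next region.
                if attempt < attempts_per_region - 1:
                    sleep(base_delay_s * 2 ** attempt)
    return None
```

Only errors that your wrapper classifies as transient are retried; anything else (for example, a quota or validation error) propagates immediately so that a genuine configuration problem is not hidden behind retries.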
The following references might be helpful; please check them out:
- Troubleshoot common Azure deployment errors - Azure Resource Manager | Microsoft Learn
- What Is Provisioned Throughput for Foundry Models? - Microsoft Foundry | Microsoft Learn
- Microsoft Foundry Models quotas and limits - Microsoft Foundry | Microsoft Learn
Thank you
Please 'Upvote' (Thumbs-up) and 'Accept' as answer if the response was helpful. This will benefit other community members who face the same issue.