Throughput for Azure Document Intelligence is controlled by service limits (TPS and concurrency), not by S1/S2/S3 tiers.
To increase throughput and reduce throttling/latency:
- Check for throttling (HTTP 429)
If responses return status code 429 (Too many requests), the current transactions-per-second (TPS) limits are being exceeded. There are separate limits for:
- Analyze transactions per second (POST)
- Get operations per second (polling GET)
- Model management operations per second
- List operations per second
- Implement client-side best practices
Before requesting a quota increase, apply these patterns:
- Implement retry logic with exponential backoff for 429 responses.
- Avoid sharp spikes in load; ramp up gradually. For example, if current load is 10 TPS, do not jump immediately to 40 TPS or throttling will occur.
- For polling GET analyze results, do not call more than once every ~2 seconds per document, and honor the retry-after header in the analyze response. A recommended delay pattern is 2–5–13–34 seconds between retries.
- If throttled on POST requests, add delays between submissions so the workload stays under the current TPS limit.
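The retry and pacing patterns above can be sketched roughly as follows. This is a minimal illustration, not an official client: `send` is a hypothetical callable that performs the POST and returns `(status_code, headers)` with lowercase header names, and the `retry-after` value is assumed to be a number of seconds. The default retry count, base delay, and jitter are placeholder choices.

```python
import random
import time


def _delay_for(attempt, headers, base_delay, max_delay):
    """Use the server's retry-after hint if it parses, else exponential backoff."""
    try:
        return float(headers.get("retry-after"))
    except (TypeError, ValueError):
        backoff = min(max_delay, base_delay * 2 ** attempt)
        # Small jitter so parallel workers do not retry in lockstep.
        return backoff + random.uniform(0, backoff * 0.1)


def submit_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                        sleep=time.sleep):
    """Call send() until it stops returning HTTP 429 or retries run out.

    `send` is a hypothetical zero-argument callable returning
    (status_code, headers); it is not part of any Azure SDK.
    """
    for attempt in range(max_retries + 1):
        status, headers = send()
        if status != 429 or attempt == max_retries:
            return status, headers
        sleep(_delay_for(attempt, headers, base_delay, max_delay))


def paced(items, tps_limit, sleep=time.sleep, clock=time.monotonic):
    """Yield items no faster than tps_limit per second (even spacing)."""
    interval = 1.0 / tps_limit
    next_slot = clock()
    for item in items:
        now = clock()
        if now < next_slot:
            sleep(next_slot - now)
        next_slot = max(now, next_slot) + interval
        yield item
```

For example, `for doc in paced(documents, tps_limit=10): submit_with_backoff(lambda: post(doc))` would keep submissions near 10 per second and back off on 429, where `documents` and `post` are stand-ins for the caller's own data and HTTP call.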
- Understand default TPS limits
By default, a Document Intelligence resource is limited to 15 TPS for analyze requests. This is independent of pricing tiers like S1/S2/S3 (those apply to other services such as older Cosmos DB performance levels, not Document Intelligence).
- Request a TPS (throughput) increase via Azure Support
For the Standard pricing tier, TPS can be increased by submitting a support request. The increase is free, but approval and the final TPS value depend on daily usage patterns and adherence to best practices.
Steps:
- Sign in to the Azure portal.
- Open the Document Intelligence resource that needs higher throughput.
- Under Support + troubleshooting, select New support request.
- In Summary, enter something like "Increase Document Intelligence TPS limit".
- Set Problem type to Quota or usage validation.
- Proceed to Next: Solutions and continue to the Details tab.
- In Description, clearly state:
- That the request is for Document Intelligence quota.
- The TPS level required for the workload (only request what is realistically needed).
- The Azure resource information (resource name, region, subscription).
- Complete the required fields and select Create on the Review + create tab.
- Note the support request number and wait for Support to contact you for further processing.
- Cost behavior
Increasing the concurrent request/TPS limit does not directly change pricing. Document Intelligence uses a “pay only for what is used” model; the limit only defines how high the service can scale before throttling.
- Latency expectations
For very large documents, processing time will naturally be higher. Document Intelligence is asynchronous and allows retrieving results for up to 24 hours after submission using the request ID. Establish a latency baseline for typical documents and use the asynchronous model plus proper polling intervals to avoid unnecessary load.
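A polling loop that follows the 2–5–13–34 second delay pattern and honors a server-provided retry hint might look like the sketch below. It makes several assumptions: `get_status` is a hypothetical callable that issues the GET and returns `(state, retry_after_seconds_or_None)`, the terminal states are taken to be `"succeeded"` and `"failed"`, and the cap on the number of polls is arbitrary.

```python
import time

# Seconds to wait before each successive poll; the last value repeats.
POLL_DELAYS = (2, 5, 13, 34)


def poll_result(get_status, delays=POLL_DELAYS, max_polls=20, sleep=time.sleep):
    """Poll an analyze operation until it reaches a terminal state.

    `get_status` is a hypothetical zero-argument callable returning
    (state, retry_after), where retry_after is seconds or None.
    """
    for i in range(max_polls):
        state, retry_after = get_status()
        if state in ("succeeded", "failed"):
            return state
        scheduled = delays[min(i, len(delays) - 1)]
        # Never poll sooner than the server asked, nor sooner than the schedule.
        sleep(max(scheduled, retry_after or 0))
    raise TimeoutError("analyze operation did not finish within max_polls")
```

Recording the elapsed time of each call to `poll_result` for a sample of typical documents is one simple way to build the latency baseline mentioned above.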
If throughput is still insufficient after applying these best practices, request a TPS increase through a support request as described above.