Issues with ConversationTranscriber Realtime STT over Private Endpoint – Requires OCSP/CRL Internet Access and Non-Documented Endpoint Format

Question

Nguyen Duc Cuong 40

I am using Azure Speech Service and a VM inside the same VNET. The Speech resource has a Private Endpoint configured, and the VM accesses it through the private IP. Inside the VM, I run a Python backend that uses ConversationTranscriber (Realtime Speech-to-Text) from the Speech SDK.

Problem

Realtime transcription only works if the VM has outbound Internet access. Initially, I configured an NSG rule that denies outbound Internet traffic, and ConversationTranscriber stopped receiving transcribed events. After switching to Azure Firewall and explicitly allowing several CA/OCSP/CRL endpoints (such as DigiCert and Microsoft certificate services), Realtime transcription started working again.

This behavior is unexpected because both the VM and Speech Service use the private endpoint, and communication should remain inside the Azure backbone.
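One way to confirm that traffic to the Speech resource really targets the private endpoint is to check what the resource hostname resolves to from inside the VM. The following is a minimal sketch using only the Python standard library. The helper names `resolve_addresses` and `is_private_ip` and the hostname in the usage comment are hypothetical and are not part of the Speech SDK.

```python
import ipaddress
import socket
from typing import List


def is_private_ip(address: str) -> bool:
    """Return True if the address is in a private or special-purpose range
    (for example 10.0.0.0/8), i.e. not a public Internet address."""
    return ipaddress.ip_address(address).is_private


def resolve_addresses(hostname: str, port: int = 443) -> List[str]:
    """Resolve a hostname to the distinct IP addresses the OS resolver returns."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


# Usage (hypothetical resource name, run from inside the VM):
#
#   for ip in resolve_addresses("my-speech-resource.cognitiveservices.azure.com"):
#       print(ip, "private" if is_private_ip(ip) else "PUBLIC")
```

If every resolved address is private, DNS is pointing at the private endpoint, and any remaining outbound dependency has to come from something other than the Speech connection itself.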

Unexpected SDK Behavior

The documentation describes using a private endpoint like:

wss://<private-endpoint-name>.cognitiveservices.azure.com

However, this endpoint does not work with ConversationTranscriber in my environment.

I must use the full STT endpoint path for the service to function:

wss://<service-name>.cognitiveservices.azure.com/stt/speech/recognition/conversation/cognitiveservices/v1?language=ja-JP

Only with this extended endpoint does the SDK succeed in establishing a connection and receiving transcription events.
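For reference, the extended endpoint can be assembled as in the sketch below. The path and the `language` query parameter are copied from the working URL above. The helper name `build_stt_endpoint` and the resource name in the example are hypothetical.

```python
from urllib.parse import urlencode, urlunsplit

# Path copied from the endpoint that works in my environment.
STT_CONVERSATION_PATH = "/stt/speech/recognition/conversation/cognitiveservices/v1"


def build_stt_endpoint(service_name: str, language: str) -> str:
    """Build the full wss:// STT endpoint for a Speech resource's custom domain."""
    host = f"{service_name}.cognitiveservices.azure.com"
    query = urlencode({"language": language})
    return urlunsplit(("wss", host, STT_CONVERSATION_PATH, query, ""))


# Hypothetical resource name, for illustration only.
print(build_stt_endpoint("my-speech-resource", "ja-JP"))
```

The resulting string is what I pass as the `endpoint` argument of `speechsdk.SpeechConfig(subscription=..., endpoint=...)` before creating the ConversationTranscriber.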

Questions

1. Why does Realtime STT using ConversationTranscriber require access to public OCSP/CRL certificate validation endpoints (DigiCert, Microsoft), even when using a Private Endpoint?
   - Is this expected behavior for Speech Realtime?
   - Is certificate revocation checking mandatory and not supported over private link?
2. Why does the private-endpoint URL described in the documentation fail for ConversationTranscriber, while the full STT endpoint (/stt/speech/recognition/...) works?
3. With the private endpoint + Azure Firewall rules already configured, is there anything additional required to ensure stable Realtime STT operation in a private network?
   - Additional domains/ports?
   - Any SDK-specific configuration?
4. Do other Azure Cognitive Services also require outbound Internet access for OCSP/CRL validation when using Private Endpoints?
5. Does Batch Transcription work entirely through the Private Endpoint without requiring any outbound Internet connectivity?
   - Can Batch STT operate in a VM with zero Internet access (only private link)?

Additional Info

Speech Service: Private Endpoint enabled

VM: Private subnet

NSG Deny Internet → Realtime fails

Azure Firewall + allow CA/OCSP → Realtime works

Using Python SDK: ConversationTranscriber

Screenshot attached showing the CA domains required for outbound connectivity.
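The CA domains involved can also be read from the TLS certificate the endpoint presents, which may help when deciding what a firewall has to allow. This is a rough sketch using only the Python standard library. It relies on the `OCSP`, `caIssuers` and `crlDistributionPoints` keys in the dictionary returned by `ssl.SSLSocket.getpeercert()`. The helper names are hypothetical, and only the leaf certificate is inspected, so intermediate certificates may reference further hosts.

```python
import socket
import ssl
from urllib.parse import urlsplit


def revocation_hosts(cert: dict) -> set:
    """Collect hostnames from the OCSP, CA Issuers and CRL URLs of a
    certificate dict as returned by ssl.SSLSocket.getpeercert()."""
    hosts = set()
    for key in ("OCSP", "caIssuers", "crlDistributionPoints"):
        for url in cert.get(key, ()):
            host = urlsplit(url).hostname
            if host:
                hosts.add(host)
    return hosts


def fetch_peer_cert(hostname: str, port: int = 443) -> dict:
    """Open a TLS connection and return the validated leaf certificate."""
    context = ssl.create_default_context()
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()


# Usage (hypothetical resource name, run from inside the VM):
#
#   cert = fetch_peer_cert("my-speech-resource.cognitiveservices.azure.com")
#   print(sorted(revocation_hosts(cert)))
```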
