You are seeing timeouts with the Grok-4-fast-reasoning model deployed in the Korea Central region. Although the provisioning state reports success, the symptoms you describe (long-running requests that return very few output tokens) suggest an issue specific to this model in that region.
No known issue with Grok-4-fast-reasoning in Korea Central is documented in the material available for this answer. If other models work properly in the same resource and region, consider deploying Grok-4-fast-reasoning in a different region and checking whether the timeouts persist. If they stop, that points to a region-specific problem and gives you a workaround in the meantime.
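To make the comparison between regions concrete, a small timing probe can help. The following is a minimal sketch using only the Python standard library. The names `probe` and `compare` are hypothetical helpers, and each `send_request` callable is a placeholder you would replace with your own call to a given deployment; no Azure SDK call is shown or assumed here.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def probe(send_request, timeout_s=30.0):
    """Run send_request() once; report whether it finished, failed, or timed out.

    send_request is a placeholder (assumption): any zero-argument callable
    that sends one request to a deployment and returns its response.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    start = time.monotonic()
    future = pool.submit(send_request)
    try:
        result = future.result(timeout=timeout_s)
        status = "ok"
    except FutureTimeout:
        result, status = None, "timeout"
    except Exception as exc:  # the request itself raised an error
        result, status = repr(exc), "error"
    finally:
        # Do not block on a request that is still running.
        pool.shutdown(wait=False)
    elapsed = round(time.monotonic() - start, 2)
    return {"status": status, "seconds": elapsed, "result": result}


def compare(deployments, timeout_s=30.0):
    """Probe each named deployment and collect the outcomes side by side."""
    return {name: probe(fn, timeout_s) for name, fn in deployments.items()}
```

For example, passing `{"koreacentral": call_a, "other-region": call_b}` to `compare`, where `call_a` and `call_b` are your own request functions, shows whether only the Korea Central deployment times out under the same prompt and timeout.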
For further assistance, check Azure's support channels or documentation for updates on this model's performance in specific regions.