How to train a custom model in Azure Document Intelligence Studio to correctly read comma and period in currency values

Question

Milan Raval 20 Reputation points

How can I train a custom model in Azure Document Intelligence Studio to correctly read commas and periods in currency values?

I have tried providing it with many samples like this one, manually correcting the values in the .ocr.json and labels.json files that Studio creates in the Azure storage folder, and retraining the model. It still reads a comma as a period.

For example, in the document below it reads 29,000 as 29.000, even though the character is clearly a comma.
User's image

Answer accepted by question author

Answer 1

Document Intelligence Studio custom models do not expose training controls to change how the OCR engine interprets punctuation such as commas and periods inside numbers. Editing the generated .ocr.json or labels.json files in storage and retraining does not change the underlying OCR behavior; those files are only used to map fields and values for training, not to retrain the OCR engine itself.

For numeric and currency fields, the supported way to influence interpretation is through value types on labels (for v2.1 Sample Labeling tool) or equivalent field typing in Studio. Setting a tag or field to number or number.currency standardizes the returned format, but it does not override whether a specific glyph is read as a comma or a period when the OCR engine has already recognized it incorrectly.

If the OCR consistently misreads the punctuation for this handwriting style or scan quality, this limitation cannot be corrected by additional custom-model training in Studio. In that case, the options are:

1. Use the numeric value type so that values are normalized as numbers where possible, then post-process the extracted text/values in application code (for example, replace . with , in specific positions according to business rules).
2. Improve input quality (higher-resolution scans, clearer handwriting, better contrast) so the OCR engine can more reliably distinguish commas from periods.
3. If using the older v2.1 stack and Sample Labeling tool, ensure tags for these fields are set to number with currency subtype so that numeric parsing is as robust as possible.

There is no documented mechanism in Document Intelligence Studio to explicitly train or override comma/period recognition in currency values beyond these field-typing and input-quality measures.
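As a rough illustration of the post-processing option, here is a minimal Python sketch that normalizes the raw text of an extracted amount. It is not part of Document Intelligence: the function name `normalize_amount` is hypothetical, and the business rule is an assumption (amounts have at most two decimal places, so a separator followed by exactly three digits is treated as a thousands separator regardless of whether OCR read it as a comma or a period). Adjust the rule to match your own documents.

```python
import re
from decimal import Decimal


def normalize_amount(raw: str) -> Decimal:
    """Convert OCR text such as '29.000' or '$1,234.50' to a Decimal.

    Assumed business rule (not an Azure feature): amounts never have more
    than two decimal places, so a final group of three digits means the
    preceding separator is a thousands separator.
    """
    # Keep digits and separators only; drop currency symbols and spaces.
    text = re.sub(r"[^\d.,]", "", raw).rstrip(".,")
    if not re.search(r"\d", text):
        raise ValueError(f"no digits found in {raw!r}")

    # Locate the last separator, whichever character OCR produced.
    index = max(text.rfind("."), text.rfind(","))
    if index == -1:
        return Decimal(text)  # plain integer, nothing to fix

    head = re.sub(r"[.,]", "", text[:index]) or "0"
    tail = text[index + 1:]

    if len(tail) == 3:
        # Three digits after the separator: treat it as a thousands separator.
        return Decimal(head + tail)
    if len(tail) in (1, 2):
        # One or two digits after the separator: treat it as the decimal point.
        return Decimal(f"{head}.{tail}")
    raise ValueError(f"ambiguous amount: {raw!r}")


print(normalize_amount("29.000"))     # 29000
print(normalize_amount("29,000"))     # 29000
print(normalize_amount("$1,234.50"))  # 1234.50
```

Applying the rule to the raw text of the field, rather than to an already-parsed number, avoids depending on how the misread separator was interpreted upstream. Values that do not fit the rule raise an error so they can be routed to manual review instead of being silently changed.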

Milan Raval 20 Reputation points

2026-03-03T12:35:03.35+00:00

Regarding the options: the field is already a numeric.currency field. For option 2, I can't control the scan quality. For option 3, I am using Azure Document Intelligence Studio online. Is there any other alternative?
SRILAKSHMI C 16,625 Reputation points Microsoft External Staff Moderator

2026-04-09T11:10:53.1633333+00:00

Hello Milan Raval,

I hope the issue was resolved. If you have any further queries, please let me know and I will be happy to assist you.

Thank you!
