Hello Mory,
Thanks for raising this in the Q&A forum! Azure AI Language's PII detection does not have a general "Password" category for detecting plain-text passwords in arbitrary text. The service focuses on structured, identifiable PII entities like Social Security Numbers, phone numbers, email addresses, and identification documents.
However, the service does detect specific credential-related entities that follow recognizable patterns:
AzurePublishSettingPassword: Azure-specific publish setting passwords
SQLServerConnectionString: SQL Server connection strings, which may contain embedded passwords
AzureStorageAccountKey: Storage account authentication keys that function as passwords
AzureDocumentDBAuthKey: Azure Document DB authentication keys
All of these require the secret to appear in a structured format (such as a connection string or an Azure-specific credential); standalone plain-text passwords are not matched.
Why Generic Passwords Aren't Detected
PII detection relies on Named Entity Recognition (NER) with machine learning models trained on structured patterns. Generic passwords in free-form text (like "my password is xyz123") don't follow predictable patterns that distinguish them from regular words, making reliable detection extremely difficult.
The service is designed to identify formatted, standardized PII rather than context-dependent sensitive information that requires semantic understanding.
Recommended Approaches
Use Specific PII Categories: If you're trying to detect Azure-related credentials, explicitly specify the relevant categories in your piiCategories parameter:
json
{
  "piiCategories": [
    "AzurePublishSettingPassword",
    "SQLServerConnectionString",
    "AzureStorageAccountKey"
  ]
}
Custom Detection: For plain-text password detection in unstructured text, consider implementing custom pattern matching or keyword-based detection alongside Azure PII detection. Look for contextual phrases like "password:", "pwd:", "pass:", followed by alphanumeric strings.
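As a rough illustration of the keyword-based approach, here is a minimal Python sketch. The keyword list, the separators ("is", ":", "="), and the function names are my own assumptions for the example and are not part of any Azure API; you would need to tune them for your data and expect false positives (for example, "password is required" would flag "required").

```python
import re

# Hypothetical pattern: a password-like keyword, a separator, then one token.
# Keywords and separators are illustrative assumptions, not an Azure feature.
PASSWORD_HINT = re.compile(
    r"""
    \b(?:password|passwd|pwd|pass)\b   # contextual keyword
    \s*(?:is\b|[:=])\s*                # separator: "is", ":" or "="
    (?P<secret>\S+)                    # the candidate secret (one non-space token)
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_password_candidates(text):
    """Return the tokens that follow a password-like keyword."""
    return [m.group("secret") for m in PASSWORD_HINT.finditer(text)]


def _mask(match):
    # Keep the keyword and separator, replace only the secret with asterisks.
    prefix_len = match.start("secret") - match.start()
    return match.group(0)[:prefix_len] + "*" * len(match.group("secret"))


def redact_passwords(text):
    """Mask candidate secrets while leaving the surrounding text intact."""
    return PASSWORD_HINT.sub(_mask, text)


if __name__ == "__main__":
    sample = "my password is xyz123"
    print(find_password_candidates(sample))
    print(redact_passwords(sample))
```

You could run this over the text before or after the PII call and merge the results. Note that the secret token is captured up to the next whitespace, so trailing punctuation is included in the match.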
PHI Domain Filtering: If you're dealing with healthcare data where passwords might be mentioned in medical records, use domain=phi to focus on Protected Health Information entities, though this still won't detect generic passwords.
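To show where piiCategories and domain fit in a request, here is a small Python sketch that only builds the request body and does not call the service. The overall body shape (kind, parameters, analysisInput) reflects my understanding of the analyze-text REST API, and the helper name build_pii_request is hypothetical, so please verify the field names against the current API reference before relying on it.

```python
import json


def build_pii_request(text, categories=None, domain=None, language="en"):
    """Build a PII-detection request body (assumed shape, no network call)."""
    parameters = {"modelVersion": "latest"}
    if categories:
        # Restrict detection to the listed categories only.
        parameters["piiCategories"] = list(categories)
    if domain:
        # For example "phi" to focus on Protected Health Information.
        parameters["domain"] = domain
    return {
        "kind": "PiiEntityRecognition",
        "parameters": parameters,
        "analysisInput": {
            "documents": [{"id": "1", "language": language, "text": text}]
        },
    }


if __name__ == "__main__":
    body = build_pii_request(
        "Sample text to scan",
        categories=["AzurePublishSettingPassword", "SQLServerConnectionString"],
    )
    print(json.dumps(body, indent=2))
```

Omitting both optional arguments leaves the service defaults in place, which is usually the easiest baseline to compare against.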
Known Limitations
Azure AI Language PII detection accuracy varies based on language, locale, and entity type. The documentation doesn't provide a detailed mapping of which categories are fully supported per language, and some expected entities may not be detected even when the formatting appears correct.
If this helps, kindly accept the answer.
Best Regards,
Jerald Felix