Hello @Chi Hsun Wang
Thank you again for the detailed repro and for raising GitHub issue #3024.
What you’re observing is not an issue in your Python code, but rather an undocumented behavior in the Speech Translation service within Azure Speech Service.
Today, when overlapping language codes are provided (e.g., en + en-US, fil + fi), the service internally normalizes and deduplicates them, and one of the entries is dropped without warning.
What’s happening under the hood
- Normalization & deduplication (effectively “by design”)
Generic language codes like en are internally resolved to a default locale (commonly en-US).
When both are provided:
- en and en-US map to the same internal target
- The service keeps one and drops the other
This explains:
- en + en-US → collision
- en + en-GB → works (distinct mapping)
- Prefix-based collisions
For cases like:
- fil (Filipino)
- fi (Finnish)
The behavior indicates prefix-based matching or tokenization, where:
- One code can overshadow the other depending on order
- Example:
  - ["fil", "fi"] → fi dropped
  - ["fi", "fil"] → works
- This is not clearly documented and behaves more like a service limitation or bug-like edge case
While deduplication itself may be expected:
- No error or warning is raised
- Missing translations are silently dropped
This is not ideal behavior, and we agree the service should:
- Either fail fast with validation, or
- Surface a warning in the SDK/service
This gap has been noted (as in your GitHub issue), and the expectation is that future updates will improve this behavior.
Your questions answered
Is this by design or a bug?
The deduplication itself currently behaves as “by design”.
However, the silent dropping and the prefix collisions are not well defined or documented, so this is considered a product limitation/gap.
Can the SDK raise a warning/error?
- Not in the current version
- This has been raised internally and via GitHub (#3024)
- Future improvements may include validation or warnings
Other known collisions?
Yes, any pair where one code is a strict prefix of another can collide (not every such pair does; as noted above, en + en-GB currently works). For example:
- en vs en-US, en-GB
- es vs es-MX
- pt vs pt-BR
- fil vs fi
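To check a target-language list before sending it, a small client-side helper along these lines may be useful. This is only a sketch: find_prefix_collisions is a hypothetical name, not part of the Speech SDK, and the strict-prefix rule is an assumption based on the behavior described above rather than the service's actual matching logic. It is deliberately conservative, so it will also flag pairs such as en + en-GB that currently work.

```python
from itertools import combinations


def find_prefix_collisions(codes):
    """Return (shorter, longer) pairs where one code is a strict prefix of the other.

    Hypothetical helper: the comparison is case-insensitive and flags both
    generic/specific pairs such as ("en", "en-US") and plain prefix pairs
    such as ("fi", "fil"). Exact duplicates are not reported.
    """
    risky = []
    for a, b in combinations(codes, 2):
        # Order the pair by length so the possible prefix comes first.
        short, long_ = sorted((a, b), key=len)
        if len(short) < len(long_) and long_.lower().startswith(short.lower()):
            risky.append((short, long_))
    return risky


if __name__ == "__main__":
    for short, long_ in find_prefix_collisions(["en", "en-US", "fil", "fi", "de"]):
        print(f"Possible collision: {short} / {long_}")
```

You could call this before adding target languages and either log a warning or raise an error when the returned list is not empty.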
Recommended workarounds
- Avoid mixing generic + specific codes
Do NOT combine:
- en + en-US
- es + es-MX
Use either:
- Only generic (en), OR
- Only specific (en-US, en-GB)
- Avoid prefix-colliding pairs
Avoid combinations like fil + fi.
If both are required, use separate requests.
- Prefer explicit locales
Using en-US or en-GB instead of en helps avoid ambiguity in routing.
- Defensive validation
After receiving results:
- Compare requested vs returned languages
- Retry missing ones if needed
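The "separate requests" and "defensive validation" workarounds can be sketched as two small helpers. Both names (split_into_safe_batches, missing_languages) are hypothetical and not part of the Speech SDK. The sketch assumes that the translations you get back can be treated as a mapping from language code to text, and that its keys match the codes you requested; if the service returns normalized codes, the comparison would need adjusting.

```python
def _is_prefix_pair(a, b):
    """True if the two codes differ and one is a prefix of the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a != b and (a.startswith(b) or b.startswith(a))


def split_into_safe_batches(codes):
    """Greedily group codes so that no batch contains a prefix-colliding pair.

    Each returned batch is intended to be sent as its own translation request.
    """
    batches = []
    for code in codes:
        for batch in batches:
            if not any(_is_prefix_pair(code, other) for other in batch):
                batch.append(code)
                break
        else:
            # No existing batch can take this code without a collision.
            batches.append([code])
    return batches


def missing_languages(requested, translations):
    """Return the requested codes that have no entry in the returned translations."""
    returned = {key.lower() for key in translations}
    return [code for code in requested if code.lower() not in returned]


if __name__ == "__main__":
    print(split_into_safe_batches(["en", "en-US", "fil", "fi", "de"]))
    print(missing_languages(["fil", "fi"], {"fil": "Kumusta"}))
```

Any codes returned by missing_languages could then be retried in a separate request.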
Please refer to the following:
Speech Translation language support: https://learn.microsoft.com/azure/ai-services/speech-service/language-support#text-languages
How to translate speech (add_target_language usage): https://learn.microsoft.com/azure/ai-services/speech-service/how-to-translate-speech#add-a-translation-language
GitHub issue tracking this behavior: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/3024
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".
Thank you!