Is it expected that add_target_language() silently drops translations when language codes overlap (e.g. en + en-US, fil + fi)?

Question

Chi Hsun Wang 0

I've encountered an issue where certain combinations of target languages in TranslationRecognizer cause translations to silently go missing — no error is raised, but the language key is absent from result.translations. I've filed this as GitHub issue #3024 as well.

Is this a known limitation? If so, is there documentation covering which language code combinations are unsupported for translation targets?

Environment

SDK: azure-cognitiveservices-speech 1.49.0
Python: 3.8
OS: Linux (Ubuntu-based), x64
Endpoint: wss://{region}.stt.speech.microsoft.com/speech/universal/v2

Problem description

When calling add_target_language() with multiple target languages on SpeechTranslationConfig, certain combinations cause one or more translations to be silently missing from result.translations.

Two patterns trigger this:

- Same base language with different specificity, e.g., en combined with en-US. en followed by en-GB does not collide, suggesting the service resolves en → en-US internally.
- Prefix collision between different languages, e.g., fil (Filipino) followed by fi (Finnish). The reversed order works fine.

No error is raised. result.reason is still TranslatedSpeech. The failing language key is simply absent from the translations dictionary.

Minimal reproduction

```python
import os
import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech import translation, languageconfig

speech_key = os.environ.get("AZURE_SPEECH_KEY")
speech_region = os.environ.get("AZURE_SPEECH_REGION")

endpoint = f"wss://{speech_region}.stt.speech.microsoft.com/speech/universal/v2"

translation_config = translation.SpeechTranslationConfig(
    subscription=speech_key, endpoint=endpoint
)

# ❌ BUG: "en-US" will be silently dropped from results
target_langs = ["en", "en-US"]

# ✅ WORKS: both translations appear
# target_langs = ["en", "en-GB"]

for lang in target_langs:
    translation_config.add_target_language(lang)

auto_detect = languageconfig.AutoDetectSourceLanguageConfig(
    languages=["ja-JP"]
)

audio_config = speechsdk.audio.AudioConfig(filename="test_audio.wav")

recognizer = translation.TranslationRecognizer(
    translation_config=translation_config,
    audio_config=audio_config,
    auto_detect_source_language_config=auto_detect,
)

result = recognizer.recognize_once()

if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print(f"Recognized: {result.text}")
    print(f"Translation keys: {list(result.translations.keys())}")
    for lang, text in result.translations.items():
        print(f"  [{lang}] {text}")
    # ⚠️ "en-US" key will be missing here; no error is raised
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation = result.cancellation_details
    print(f"Canceled: {cancellation.reason}, {cancellation.error_details}")
```

Test results

English locale combinations

| `add_target_language()` order | Result |
| --- | --- |
| `['en', 'en-GB', 'en-US']` | ❌ en-US missing from translations |
| `['en', 'en-US', 'en-GB']` | ❌ en-US missing from translations |
| `['en', 'en-GB']` | ✅ Pass |
| `['en', 'en-US']` | ❌ en-US missing from translations |
| `['en-GB', 'en', 'en-US']` | ❌ en missing from translations |
| `['en-GB', 'en-US', 'en']` | ❌ en missing from translations |
| `['en-GB', 'en-US']` | ✅ Pass |
| `['en-GB', 'en']` | ❌ en missing from translations |
| `['en-US', 'en', 'en-GB']` | ❌ en missing from translations |
| `['en-US', 'en-GB', 'en']` | ❌ en missing from translations |
| `['en-US', 'en-GB']` | ✅ Pass |
| `['en-US', 'en']` | ❌ en missing from translations |

Prefix collision (`fi` vs `fil`)

| `add_target_language()` order | Result |
| --- | --- |
| `['fi', 'fil']` | ✅ Pass |
| `['fil', 'fi']` | ❌ fi missing from translations |

Observed patterns

- en and en-US always collide, and exactly one of them is dropped: en-US when en is the first code added, en in every other tested order.
- en followed by en-GB passes, and en-US + en-GB passes in either order, suggesting en is internally resolved to en-US. However, en-GB followed by en drops en, so en-GB is never dropped itself but can still cause en to be dropped.
- fil before fi drops fi, but the reversed order works. Together with the en-GB before en case, this points to a prefix-matching issue in the internal routing.
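
The test results above happen to be consistent with a simple two-rule model, sketched below. This is only a hypothesis fitted to the 14 tested orderings, not the service's actual implementation: the function name `predict_dropped` and both rules are assumptions introduced here for illustration.

```python
from typing import List


def predict_dropped(targets: List[str]) -> List[str]:
    """Predict which target codes go missing under a hypothetical two-rule model."""
    accepted: List[str] = []
    dropped: List[str] = []
    for code in targets:
        # Assumed rule 1: a code is dropped if it is a string prefix of a code
        # that was accepted earlier (e.g. "fi" after "fil", "en" after "en-GB").
        is_prefix_of_earlier = any(prev.startswith(code) for prev in accepted)
        # Assumed rule 2: bare "en" behaves like "en-US", so "en-US" is dropped
        # when "en" has already been accepted.
        duplicates_default = code == "en-US" and "en" in accepted
        if is_prefix_of_earlier or duplicates_default:
            dropped.append(code)
        else:
            accepted.append(code)
    return dropped


# Observed results copied from the two tables above: order -> missing keys.
OBSERVED = {
    ("en", "en-GB", "en-US"): ["en-US"],
    ("en", "en-US", "en-GB"): ["en-US"],
    ("en", "en-GB"): [],
    ("en", "en-US"): ["en-US"],
    ("en-GB", "en", "en-US"): ["en"],
    ("en-GB", "en-US", "en"): ["en"],
    ("en-GB", "en-US"): [],
    ("en-GB", "en"): ["en"],
    ("en-US", "en", "en-GB"): ["en"],
    ("en-US", "en-GB", "en"): ["en"],
    ("en-US", "en-GB"): [],
    ("en-US", "en"): ["en"],
    ("fi", "fil"): [],
    ("fil", "fi"): ["fi"],
}

mismatches = [
    order for order, missing in OBSERVED.items()
    if predict_dropped(list(order)) != missing
]
print(f"Orderings the model fails to reproduce: {mismatches}")
```

If the model holds beyond these cases, it would predict the same order-dependent behavior for other pairs where one code is a prefix of another, but that has not been tested here.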

What I've checked in the documentation

The Language Identification docs state: "Don't include multiple locales of the same language, for example, en-US and en-GB" — but this is specifically for Language Identification candidate languages, not for translation target languages.
The Speech Translation how-to guide has no equivalent warning for add_target_language().
The language support page recommends using language codes (e.g., es instead of es-ES) for translation targets, but does not document any collision behavior.

Questions

Is this behavior by design or a bug?
If by design, could the SDK raise an error or warning instead of silently dropping translations?
Are there other language code combinations known to have similar conflicts?

Related: GitHub issue #3024

2 answers

Answer 1

Hello @Chi Hsun Wang

Thank you again for the detailed repro and for raising GitHub issue #3024.

What you’re observing is not an issue in your Python code, but rather an undocumented behavior in the Speech Translation service within Azure Speech Service.

Today, when overlapping language codes are provided (e.g., en + en-US, fil + fi), the service internally normalizes and deduplicates them, and one of the entries is dropped without warning.

What’s happening under the hood

Normalization & deduplication (effectively “by design”)

Generic language codes like en are internally resolved to a default locale (commonly en-US)

When both are provided:

en and en-US map to the same internal target
The service keeps one and drops the other

This explains:

en + en-US → collision
en + en-GB → works (distinct mapping)

Prefix-based collisions

For cases like:

fil (Filipino)
fi (Finnish)

The behavior indicates prefix-based matching or tokenization, where one code can overshadow the other depending on order. For example:

- ["fil", "fi"] → fi dropped
- ["fi", "fil"] → works

This is not clearly documented and behaves more like a service limitation or bug-like edge case.

While deduplication itself may be expected:

- No error or warning is raised
- Missing translations are silently dropped

This is not ideal behavior, and we agree it should:

- Either fail fast with validation, or
- Surface a warning in the SDK/service

This gap has been noted (as in your GitHub issue), and the expectation is that future updates will improve this behavior.

Your questions answered

Is this by design or a bug?

It currently behaves as “by design” (deduplication). However, silent dropping and prefix collisions are not well-defined or documented, so this is considered a product limitation/gap.

Can the SDK raise a warning/error?

Not in the current version
This has been raised internally and via GitHub (#3024)
Future improvements may include validation or warnings

Other known collisions?

Yes, any pair where one code is a strict prefix of another can collide, for example:

en vs en-US, en-GB
es vs es-MX
pt vs pt-BR
fil vs fi
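
As a rough pre-flight check, the strict-prefix rule described above can be applied to a target list before any call to `add_target_language()`. The sketch below is client-side only and makes no SDK calls; the helper name `find_prefix_pairs` is hypothetical, and the case-insensitive comparison is an assumption.

```python
from itertools import combinations
from typing import List, Tuple


def find_prefix_pairs(codes: List[str]) -> List[Tuple[str, str]]:
    """Return (shorter, longer) pairs where one code is a strict prefix of the other."""
    pairs = []
    for a, b in combinations(codes, 2):
        shorter, longer = sorted((a, b), key=len)
        # Strict prefix: the codes differ and the longer one starts with the shorter one.
        if shorter != longer and longer.lower().startswith(shorter.lower()):
            pairs.append((shorter, longer))
    return pairs


risky = find_prefix_pairs(["en", "en-US", "fil", "fi", "de"])
print(risky)
```

An empty result only means no prefix overlap was found; it does not guarantee that the service returns every requested translation.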

Recommended workarounds

Avoid mixing generic + specific codes

Do NOT combine:

en + en-US
es + es-MX

Use either:

Only generic (en)
OR only specific (en-US, en-GB)

Avoid prefix-colliding pairs

Avoid combinations like fil + fi

If both are required, use separate requests.
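
One way to plan those separate requests is to group the target codes so that no group contains two overlapping codes. The following is a minimal sketch under that assumption; `split_into_safe_batches` and `_overlaps` are hypothetical helpers, not part of the Speech SDK. It is deliberately conservative and treats any prefix overlap as a conflict regardless of order.

```python
from typing import List


def _overlaps(a: str, b: str) -> bool:
    """True if either code is a prefix of the other (case-insensitive)."""
    a, b = a.lower(), b.lower()
    return a.startswith(b) or b.startswith(a)


def split_into_safe_batches(codes: List[str]) -> List[List[str]]:
    """Greedily place each code in the first batch it does not overlap with."""
    batches: List[List[str]] = []
    for code in codes:
        for batch in batches:
            if not any(_overlaps(code, other) for other in batch):
                batch.append(code)
                break
        else:
            # No existing batch is conflict-free for this code, so start a new one.
            batches.append([code])
    return batches


batches = split_into_safe_batches(["en", "en-US", "en-GB", "fil", "fi"])
print(batches)
```

Each resulting batch would then get its own `SpeechTranslationConfig` and recognizer, at the cost of recognizing the same audio more than once.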

Prefer explicit locales

Using en-US or en-GB instead of en helps avoid ambiguity in routing.

Defensive validation

After receiving results:

Compare requested vs returned languages
Retry missing ones if needed
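
The comparison step can be sketched as a small helper that takes the requested codes and the translations dictionary from the result. The name `find_missing_targets` is hypothetical, and the sketch assumes the returned keys match the requested codes exactly, as they appear to in the question's output.

```python
from typing import Iterable, List, Mapping


def find_missing_targets(
    requested: Iterable[str], translations: Mapping[str, str]
) -> List[str]:
    """Return requested target codes that have no key in the translations mapping."""
    returned = set(translations.keys())
    return [code for code in requested if code not in returned]


# Stand-in for result.translations when "en-US" was silently dropped.
example_translations = {"en": "Hello"}
missing = find_missing_targets(["en", "en-US"], example_translations)
print(f"Missing targets: {missing}")
```

With the reproduction from the question, this would be called as `find_missing_targets(target_langs, result.translations)`, and any codes it returns could be retried in a separate request.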

Please refer to the following:

Speech Translation language support: https://learn.microsoft.com/azure/ai-services/speech-service/language-support#text-languages

How to translate speech (add_target_language usage): https://learn.microsoft.com/azure/ai-services/speech-service/how-to-translate-speech#add-a-translation-language

GitHub issue tracking this behavior: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/3024

I hope this helps. Do let me know if you have any further queries.

Thank you!

SRILAKSHMI C 18,035 Reputation points Microsoft External Staff Moderator


Answer 2

Hello Chi Hsun!

Thank you for posting on Microsoft Learn Q&A.

The doc you shared explains two things.

add_target_language() adds a new target translation language, and the how-to guide states that each target translation should be available in the result. I don't see any warning there that overlapping targets may be dropped silently.

The only explicit "don’t include multiple locales of the same language" warning I found is for Language Identification candidate languages. It applies to AutoDetectSourceLanguageConfig, not to translation targets added with add_target_language().

For speech translation target text, the language support page says you should usually specify only the language code before the dash, for example es instead of es-ES. The translation layer is therefore probably normalized at the base-language level, which would explain why en and en-US collide in practice.

So my assumption is that en + en-US overlap because of service-side normalization to a base language or default locale. This may be an implementation limitation, but I couldn't find any details about it.

fil + fi does not fit the "use base language codes" rule as an explanation for a collision, because those are distinct languages, so the silent omission here looks much more like a bug.

My recommendation for translate-to-text scenarios is to use only base target codes where possible, such as en, fr, and de, and to avoid en-US/en-GB unless the documentation confirms that locale-specific targets are supported in that path.

Also treat prefix-sensitive pairs like fil/fi as unsupported until the SDK or service team confirms a fix.
