Is it expected that add_target_language() silently drops translations when language codes overlap (e.g. en + en-US, fil + fi)?

Chi Hsun Wang 0 Reputation points
2026-04-14T07:06:40.2+00:00

I've encountered an issue where certain combinations of target languages in TranslationRecognizer cause translations to silently go missing — no error is raised, but the language key is absent from result.translations. I've filed this as GitHub issue #3024 as well.

Is this a known limitation? If so, is there documentation covering which language code combinations are unsupported for translation targets?


Environment

  • SDK: azure-cognitiveservices-speech 1.49.0
  • Python: 3.8
  • OS: Linux (Ubuntu-based), x64
  • Endpoint: wss://{region}.stt.speech.microsoft.com/speech/universal/v2

Problem description

When calling add_target_language() with multiple target languages on SpeechTranslationConfig, certain combinations cause one or more translations to be silently missing from result.translations.

Two patterns trigger this:

  1. Same base language with different specificity, e.g., en combined with en-US. en followed by en-GB does not collide, suggesting the service resolves en to en-US internally.
  2. Prefix collision between different languages — e.g., fil (Filipino) followed by fi (Finnish). Reversed order works fine.

No error is raised. result.reason is still TranslatedSpeech. The failing language key is simply absent from the translations dictionary.

Minimal reproduction

import os
import azure.cognitiveservices.speech as speechsdk
from azure.cognitiveservices.speech import translation, languageconfig

speech_key = os.environ.get("AZURE_SPEECH_KEY")
speech_region = os.environ.get("AZURE_SPEECH_REGION")

endpoint = f"wss://{speech_region}.stt.speech.microsoft.com/speech/universal/v2"

translation_config = translation.SpeechTranslationConfig(
    subscription=speech_key, endpoint=endpoint
)

# ❌ BUG: "en-US" will be silently dropped from results
target_langs = ["en", "en-US"]

# ✅ WORKS: both translations appear
# target_langs = ["en", "en-GB"]

for lang in target_langs:
    translation_config.add_target_language(lang)

auto_detect = languageconfig.AutoDetectSourceLanguageConfig(
    languages=["ja-JP"]
)

audio_config = speechsdk.audio.AudioConfig(filename="test_audio.wav")

recognizer = translation.TranslationRecognizer(
    translation_config=translation_config,
    audio_config=audio_config,
    auto_detect_source_language_config=auto_detect,
)

result = recognizer.recognize_once()

if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print(f"Recognized: {result.text}")
    print(f"Translation keys: {list(result.translations.keys())}")
    for lang, text in result.translations.items():
        print(f"  [{lang}] {text}")
    # ⚠️ "en-US" key will be missing here — no error raised
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation = result.cancellation_details
    print(f"Canceled: {cancellation.reason}, {cancellation.error_details}")

Test results

English locale combinations

add_target_language() order    Result
['en', 'en-GB', 'en-US']       en-US missing from translations
['en', 'en-US', 'en-GB']       en-US missing from translations
['en', 'en-GB']                ✅ Pass
['en', 'en-US']                en-US missing from translations
['en-GB', 'en', 'en-US']       en missing from translations
['en-GB', 'en-US', 'en']       en missing from translations
['en-GB', 'en-US']             ✅ Pass
['en-GB', 'en']                en missing from translations
['en-US', 'en', 'en-GB']       en missing from translations
['en-US', 'en-GB', 'en']       en missing from translations
['en-US', 'en-GB']             ✅ Pass
['en-US', 'en']                en missing from translations

Prefix collision (fi vs fil)

add_target_language() order    Result
['fi', 'fil']                  ✅ Pass
['fil', 'fi']                  fi missing from translations

Observed patterns

  • en and en-US always collide: when only these two are requested, the later one is dropped. en-US + en-GB passes in either order and en followed by en-GB passes, suggesting en is internally resolved to en-US.
  • en is also dropped whenever a longer en-* code is added before it, e.g. ['en-GB', 'en'] and ['en-GB', 'en', 'en-US']. When 3 English variants are combined, the one dropped is always either en or en-US. en-GB is never affected.
  • fil before fi drops fi, but reversed order works. Together with the previous point, this points to a prefix-matching issue in the internal routing.
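
The rows in both tables are consistent with a simple two-rule model, sketched below. This is only a hypothesis fitted to the fourteen observed rows, not the service's actual implementation: predict_dropped and ASSUMED_DEFAULT_LOCALE are names made up for illustration, and the en to en-US mapping is an assumption inferred from the results.

```python
from typing import Dict, List

# Assumption inferred from the test results: a bare "en" behaves like "en-US".
ASSUMED_DEFAULT_LOCALE: Dict[str, str] = {"en": "en-US"}


def predict_dropped(targets: List[str]) -> List[str]:
    """Predict which codes go missing, processing targets in add order.

    Hypothetical rules:
      1. A code is dropped if a code kept earlier starts with it
         (e.g. "fi" after "fil", "en" after "en-GB").
      2. A code is dropped if it resolves to the same locale as a code
         kept earlier (e.g. "en-US" after "en").
    """
    kept: List[str] = []
    dropped: List[str] = []
    for code in targets:
        resolved = ASSUMED_DEFAULT_LOCALE.get(code, code)
        prefix_hit = any(k != code and k.startswith(code) for k in kept)
        duplicate = any(ASSUMED_DEFAULT_LOCALE.get(k, k) == resolved for k in kept)
        if prefix_hit or duplicate:
            dropped.append(code)
        else:
            kept.append(code)
    return dropped
```

If the model holds, a combination that has not been tested yet can be checked against it before it is sent to the service, but any prediction should still be confirmed with a real request.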

What I've checked in the documentation

  • The Language Identification docs state: "Don't include multiple locales of the same language, for example, en-US and en-GB" — but this is specifically for Language Identification candidate languages, not for translation target languages.
  • The Speech Translation how-to guide has no equivalent warning for add_target_language().
  • The language support page recommends using language codes (e.g., es instead of es-ES) for translation targets, but does not document any collision behavior.

Questions

  1. Is this behavior by design or a bug?
  2. If by design, could the SDK raise an error or warning instead of silently dropping translations?
  3. Are there other language code combinations known to have similar conflicts?

Related: GitHub issue #3024

Azure Speech in Foundry Tools

2 answers

  1. SRILAKSHMI C 18,035 Reputation points Microsoft External Staff Moderator
    2026-04-22T13:53:47.1933333+00:00

    Hello @Chi Hsun Wang

    Thank you again for the detailed repro and for raising GitHub issue #3024.

    What you’re observing is not an issue in your Python code, but rather an undocumented behavior in the Speech Translation service within Azure Speech Service.

    Today, when overlapping language codes are provided (e.g., en + en-US, fil + fi), the service internally normalizes and deduplicates them, and one of the entries is dropped without warning.

    What’s happening under the hood

    1. Normalization & deduplication (effectively “by design”)

    Generic language codes like en are internally resolved to a default locale (commonly en-US).

    When both are provided:

    • en and en-US map to the same internal target
    • The service keeps one and drops the other

    This explains:

    • en + en-US → collision
    • en + en-GB → works (distinct mapping)

    2. Prefix-based collisions

    For cases like:

    • fil (Filipino)
    • fi (Finnish)

    The behavior indicates prefix-based matching or tokenization, where:

    • One code can overshadow the other depending on order
    • Example:
      • ["fil", "fi"] → fi dropped
      • ["fi", "fil"] → works

    This is not clearly documented and behaves more like a service limitation or bug-like edge case.

    While deduplication itself may be expected, two aspects are problematic:

    • No error or warning is raised
    • Missing translations are silently dropped

    This is not ideal behavior, and we agree it should:

    • Either fail fast with validation, or
    • Surface a warning in the SDK/service

    This gap has been noted (as in your GitHub issue), and the expectation is that future updates will improve this behavior.

    Your questions answered

    Is this by design or a bug?

    The deduplication currently behaves as "by design".

    However, the silent dropping and the prefix collisions are not well defined or documented, so this is considered a product limitation/gap.

    Can the SDK raise a warning/error?

    • Not in the current version
    • This has been raised internally and via GitHub (#3024)
    • Future improvements may include validation or warnings

    Other known collisions?

    Yes, any pair where one code is a strict prefix of another can collide, for example:

    • en vs en-US, en-GB
    • es vs es-MX
    • pt vs pt-BR
    • fil vs fi

    Recommended workarounds

    1. Avoid mixing generic + specific codes

    Do NOT combine:

    • en + en-US
    • es + es-MX

    Use either:

    • Only generic (en)
    • OR only specific (en-US, en-GB)

    2. Avoid prefix-colliding pairs

    Avoid combinations like fil + fi.

    If both are required, use separate requests.

    3. Prefer explicit locales

    Using en-US or en-GB instead of en helps avoid ambiguity in routing.

    4. Defensive validation

    After receiving results:

    • Compare requested vs returned languages
    • Retry missing ones if needed
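
The defensive validation step can be sketched as follows. This is a minimal illustration, not an SDK feature: find_missing and translate_with_retry are hypothetical helpers, translate_fn stands for any wrapper you write around a TranslationRecognizer call that returns dict(result.translations), and the sketch assumes the returned keys match the requested codes verbatim.

```python
from typing import Callable, Dict, List


def find_missing(requested: List[str], translations: Dict[str, str]) -> List[str]:
    """Return the requested codes that are absent from the result, in order."""
    returned = set(translations)
    return [code for code in requested if code not in returned]


def translate_with_retry(
    requested: List[str],
    translate_fn: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Run one request for all targets, then retry each missing target alone.

    translate_fn is a hypothetical callable: it takes the target codes for a
    single request and returns the translations dictionary of that request.
    """
    results = dict(translate_fn(list(requested)))
    for code in find_missing(requested, results):
        # A request with a single target cannot collide with another target.
        retry = translate_fn([code])
        if code in retry:
            results[code] = retry[code]
    return results
```

Retrying each missing code in its own request costs one extra recognition pass per dropped language, so this is a fallback rather than a replacement for choosing non-colliding targets up front.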

    Please refer to the following:

    Speech Translation language support: https://learn.microsoft.com/azure/ai-services/speech-service/language-support#text-languages

    How to translate speech (add_target_language usage): https://learn.microsoft.com/azure/ai-services/speech-service/how-to-translate-speech#add-a-translation-language

    GitHub issue tracking this behavior: https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/3024

    I hope this helps. Do let me know if you have any further queries.

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful".

    Thank you!


  2. Amira Bedhiafi 41,386 Reputation points MVP Volunteer Moderator
    2026-04-14T09:18:20.5433333+00:00

    Hello Chi Hsun!

    Thank you for posting on Microsoft Learn Q&A.

    The doc you shared explains 2 things.

    add_target_language() adds a new target translation language, and the how-to guide states that each target translation should be available in the result. I don't see any warning there that overlapping targets may be dropped silently.

    The only explicit "don't include multiple locales of the same language" warning I found is for language identification candidate languages. It applies to AutoDetectSourceLanguageConfig, not to add_target_language().

    For speech translation target text, the language support page says you should usually specify only the language code before the dash, for example es instead of es-ES. This suggests the translation layer normalizes targets at the base-language level, which would explain why en and en-US collide in practice.

    So my assumption is that with en + en-US the overlap is caused by service-side normalization to a base language or default locale. It may be an implementation limitation, but I couldn't find any documentation about it.

    fil + fi does not fit the "use base language codes" explanation as a valid collision rule, because those are distinct languages, so the silent omission here looks much more like a bug.

    For translate-to-text, I recommend using only the base target codes where possible, like en, fr, de, and not en-US/en-GB unless the docs confirm that locale-specific targets are supported in that path.

    Treat prefix-sensitive pairs like fil/fi as unsupported until the SDK or service team confirms a fix.
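
One way to keep prefix-sensitive pairs out of the same request is to split the target list into batches before calling add_target_language(). The sketch below is a hypothetical pre-processing step, not part of the SDK: conflicts and split_into_safe_batches are illustrative names, and the conflict rule (one code is a string prefix of the other) is an assumption based on the combinations reported in this thread.

```python
from typing import List


def conflicts(a: str, b: str) -> bool:
    """Assumed rule: two distinct codes conflict if one is a prefix of the other."""
    return a != b and (a.startswith(b) or b.startswith(a))


def split_into_safe_batches(targets: List[str]) -> List[List[str]]:
    """Greedily group targets so that no batch contains a conflicting pair.

    Each batch is meant to be sent as its own translation request.
    """
    batches: List[List[str]] = []
    for code in dict.fromkeys(targets):  # drop exact duplicates, keep order
        for batch in batches:
            if not any(conflicts(code, other) for other in batch):
                batch.append(code)
                break
        else:
            batches.append([code])
    return batches
```

The rule is deliberately conservative: it also separates en from en-GB, a pair that passed in one order in the tests above, so that the outcome does not depend on the order in which targets are added.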
