Significant Decrease in Translation Quality in Azure AI Translator (English → Swedish).

Anton Olivestam 0 Reputation points
2026-03-16T17:31:17.1+00:00

Hi,

I’ve noticed a substantial drop in translation quality when using Azure AI Translator for English → Swedish translations.

For example:

  • Source: “The account deletion link was only valid for 30 minutes and has expired”
  • Current translation: “Länken till kontoborttagning var bara geldig i 30 minuter och har gått ut.”

The translated sentence reads unnaturally in Swedish, and the word “geldig” is not a Swedish word at all. Until recently, Azure Translator produced much more natural and accurate translations for similar sentences.

What has happened to the translation models that is causing this decrease in translation quality?

Foundry Tools

Formerly known as Azure AI Services or Azure Cognitive Services, Foundry Tools is a unified collection of prebuilt AI capabilities within the Microsoft Foundry platform.


2 answers

Sort by: Most helpful
  1. SAI JAGADEESH KUDIPUDI 2,710 Reputation points Microsoft External Staff Moderator
    2026-04-28T17:39:15.6833333+00:00

    Hi Anton Olivestam,

    Thank you for sharing the detailed example — I can see why the output (especially “geldig”) is concerning, as it’s not valid Swedish and impacts overall readability.

    Azure AI Translator is a neural machine translation (NMT) service that is continuously updated. Because of this, translation outputs can change over time, and quality may vary depending on the language pair and phrasing.

    Additionally:

    • Translations are typically handled at the sentence level, without full document context.
    • This can sometimes lead to overly literal or unnatural phrasing, or occasional incorrect tokens.

    At this time, there isn’t a way to lock or revert to a previous model version, so this behavior is most likely due to a backend model update rather than anything in your request.

    While the base model cannot be controlled directly, there are supported approaches to improve consistency and correctness:

    1. Use Custom Translator

    You can train a custom model with your own English–Swedish examples to ensure preferred phrasing.

    • Helps avoid incorrect words like “geldig”
    • Improves fluency for your specific domain
    2. Apply a dictionary for key terminology

    If certain phrases must always be translated consistently (e.g., “account deletion”), you can enforce them using a dictionary.

    3. Evaluate with representative samples

    Microsoft recommends validating translation quality using your own test set, since quality can vary by language pair and use case.

    This helps:

    • Detect regressions early
    • Measure improvements from customization

    In summary: the behavior you’re seeing is likely due to ongoing model updates, not a configuration issue. There’s no rollback option, but Custom Translator and dictionaries can significantly improve and stabilize output.
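    To make the dictionary approach concrete: Azure Translator’s dynamic dictionary feature lets you pin the translation of a specific phrase inline, without training a custom model. A minimal sketch using the Translator v3 REST API and only the Python standard library; the key, region, and the forced Swedish term `kontoradering` are placeholders for illustration, not vetted values:

    ```python
    import json
    import urllib.request

    ENDPOINT = "https://api.cognitive.microsofttranslator.com/translate"

    def with_dynamic_dictionary(text: str, term: str, forced: str) -> str:
        """Wrap `term` in dynamic-dictionary markup so Translator emits
        `forced` for that phrase instead of its own choice."""
        markup = f'<mstrans:dictionary translation="{forced}">{term}</mstrans:dictionary>'
        return text.replace(term, markup, 1)

    def translate_en_to_sv(text: str, key: str, region: str) -> str:
        """Call the Translator v3 REST API for a single string (network call)."""
        url = f"{ENDPOINT}?api-version=3.0&from=en&to=sv"
        body = json.dumps([{"Text": text}]).encode("utf-8")
        req = urllib.request.Request(
            url,
            data=body,
            headers={
                "Ocp-Apim-Subscription-Key": key,       # placeholder credential
                "Ocp-Apim-Subscription-Region": region,
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)[0]["translations"][0]["text"]

    source = "The account deletion link was only valid for 30 minutes and has expired"
    # "kontoradering" is an illustrative forced term, not a verified translation.
    prepared = with_dynamic_dictionary(source, "account deletion", "kontoradering")
    print(prepared)
    ```

    Passing `prepared` to `translate_en_to_sv` would then force the marked phrase; the documentation recommends using dictionary entries sparingly, since over-constraining the model can hurt fluency elsewhere in the sentence.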




  2. Q&A Assist
    2026-03-16T17:31:45.7233333+00:00
    AI-generated content. This answer was created with AI from Q&A Assist.

    Azure AI Translator is a neural machine translation system whose quality can vary over time and across language pairs. The documentation describes several important characteristics and limitations that explain issues like the one observed, but it does not list specific recent model changes for English → Swedish.

    Key points relevant to the scenario:

    1. Model behavior and limitations
      Translator is trained on previously translated documents and operates mainly at the sentence level without broader real‑world or document‑level context. This can lead to:
      • Literal or awkward translations that do not reflect natural usage or nuance in the target language.
      • Errors where tone, style, or idiomatic correctness are not preserved.
      The documentation explicitly notes that automatic translations can be too literal and that the system may not correctly capture implication, nuance, or culture‑specific usage.
    2. Quality varies by language pair and scenario
      Translation quality is not uniform across all language pairs. The service documentation states that:
      • Quality “differs by language pair,” and suitability must be evaluated per scenario.
      • Quality should be measured on a representative test set for the specific use case.
      This means that an English → Swedish model can perform differently from other pairs and that changes in training data or model tuning can affect perceived quality, positively or negatively, for certain constructions.
    3. Ongoing model evaluation and updates
      Translator quality is continuously measured using automatic metrics (such as BLEU and COMET) and human evaluation. The service is under ongoing improvement, which can involve model updates. However, the documentation does not enumerate or timestamp specific model changes for particular language pairs, nor does it distinguish between “temporary regression” and “intentional update” at the level of individual examples. From the available information, it can only be said that:
      • Models are periodically improved and evaluated using multiple techniques.
      • Human evaluation is used to guide quality, but individual regressions in specific phrases or domains can still occur.
    4. What can be done in response to a perceived regression
      Based on the guidance for evaluating and integrating Translator:
      • Evaluate on a representative test set:
        Build a small but representative English → Swedish test set from the application domain and systematically measure output quality over time. This helps document regressions and provides concrete evidence when engaging support.
      • Use human-in-the-loop review where quality is critical:
        For user‑facing or high‑impact text, keep human oversight in the workflow so that mistranslations or unnatural phrasing can be corrected before reaching end users.
      • Provide feedback and maintain a feedback loop:
        The service guidance recommends having a feedback channel and monitoring Translator in production. When specific problematic outputs like the “geldig” example are identified, they can be reported through Azure support or product feedback channels so they can be considered in future model tuning.
      • Consider customizations if applicable:
        For domains where terminology or phrasing must be very precise, Custom Translator with domain‑specific training data and (where appropriate) dictionaries can help steer translations toward preferred outputs. The documentation cautions that dictionaries should be used sparingly and mainly for compound nouns, but domain training data can significantly improve quality.
    5. What cannot be confirmed from the documentation
      The provided documentation does not specify:
      • The exact change that would have introduced the specific “geldig” error in English → Swedish.
      • Whether this is a known temporary regression or a side effect of an intentional update.
      • Any scheduled fix or rollback for this particular behavior.
      For authoritative information on a specific regression and any planned fix, the appropriate next step is to open an Azure support request with concrete examples and timestamps so the product team can investigate the underlying model version and training data involved.
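    The test-set approach in point 4 can be automated in a few lines: keep a baseline file of previously accepted translations, re-translate the sources periodically, and flag any drift. A minimal sketch in plain Python, where the `translate` callable stands in for whatever Translator client you use (a fake is substituted here so the example is self-contained):

    ```python
    from typing import Callable, Dict, List

    def find_regressions(
        baseline: Dict[str, str],
        translate: Callable[[str], str],
    ) -> List[dict]:
        """Re-translate every source sentence in the baseline and
        report the ones whose output no longer matches."""
        regressions = []
        for source, expected in baseline.items():
            current = translate(source)
            if current != expected:
                regressions.append(
                    {"source": source, "expected": expected, "current": current}
                )
        return regressions

    # Tiny illustrative baseline; in practice this would be a
    # representative sample drawn from your own domain.
    baseline = {
        "The link has expired": "Länken har gått ut",
    }

    # Stand-in for a real Translator call, returning a changed output.
    fake_translate = lambda s: "Länken har utgått"

    for hit in find_regressions(baseline, fake_translate):
        print(f"REGRESSION: {hit['source']!r}")
        print(f"  expected: {hit['expected']}")
        print(f"  current:  {hit['current']}")
    ```

    Running this on a schedule documents regressions with concrete before/after pairs and timestamps, which is exactly the evidence a support request needs.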


