Hello Roy Aad,
The error occurs because the value passed in the `models` property must match one of the formats supported by the Speech SDK. `MAI-Transcribe-1` is not a valid value for this parameter, which is why you receive the following error: `HttpResponseError: (InvalidArgument) The specified model does not have the expected format.`
Why This Happens
`MAI-Transcribe-1` is a Microsoft-managed internal base model used by Azure Speech services. It is not exposed as a selectable model identifier in the `TranscriptionOptions` API.
The models parameter supports only:
- Built-in transcription model identifiers, or
- The full resource URI of a custom speech model
Supported Built-In Model Identifiers
You can specify one of the following built-in models:
- azure-speech
- whisper-1
- gpt-4o-transcribe
- gpt-4o-mini-transcribe
For example (a sketch; the exact `TranscriptionOptions` parameter names may differ in your SDK version):

```python
options = TranscriptionOptions(
    locales=["en-US"],
    models={"en-US": "azure-speech"},  # one of the built-in identifiers above
)
```
If you want Azure Speech's default optimized model, azure-speech is the recommended option.
Using a Custom Speech Model
If you have trained a custom speech model, the models dictionary must contain the model's full self URI, not just its name.
Example (the URI below is a placeholder; substitute your model's actual self URI):

```python
options = TranscriptionOptions(
    locales=["en-US"],
    models={
        # full self URI of your custom model (placeholder values)
        "en-US": "https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>"
    },
)
```
You can obtain this URI as follows:
1. In the Azure portal, open your Speech resource
2. Open Speech Studio
3. Navigate to Custom Speech
4. Select your trained model
5. Copy the model's Self URI
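As an illustration, a self URI typically has the shape `https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>` (the API-version segment may vary between SDK releases). A small check like this hedged sketch can catch the common mistake of passing a bare model name instead of the full URI:

```python
import re

# Assumed shape of a Custom Speech model self URI (API-version segment may vary):
#   https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>
SELF_URI_PATTERN = re.compile(
    r"^https://[\w.-]+\.api\.cognitive\.microsoft\.com"
    r"/speechtotext/v[\d.]+/models/[\w-]+$"
)

def looks_like_self_uri(value: str) -> bool:
    """Return True if value resembles a full custom-model self URI
    rather than a bare model name like 'MAI-Transcribe-1'."""
    return bool(SELF_URI_PATTERN.match(value))
```

This only validates the shape of the string; the service itself remains the authority on whether the model exists.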
Recommended Approach
If you are using a standard Microsoft model, specify one of the supported built-in identifiers listed above.
If you are using a custom-trained model, provide its full model URI.
If you do not specify the models parameter, Azure Speech will automatically select the most appropriate default model for the chosen locale.
For most Fast Transcription scenarios, omitting `models` is sufficient:

```python
options = TranscriptionOptions(
    locales=["en-US"]  # Azure Speech picks the default model for the locale
)
```
In summary: `MAI-Transcribe-1` cannot be used directly in `TranscriptionOptions`.
Use one of the supported built-in identifiers:
- azure-speech
- whisper-1
- gpt-4o-transcribe
- gpt-4o-mini-transcribe
Or provide the full URI of a custom speech model.
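Putting these rules together, here is a minimal hedged sketch of the selection logic (the set of built-in identifiers is taken from the docs linked below; confirm against your SDK version) that validates a model choice before you build `TranscriptionOptions`:

```python
# Built-in identifiers from the linked documentation; verify for your SDK version.
BUILT_IN_MODELS = {
    "azure-speech",
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
}

def resolve_model(value: str) -> str:
    """Accept a built-in identifier or a full custom-model self URI;
    reject bare internal names like 'MAI-Transcribe-1'."""
    if value in BUILT_IN_MODELS:
        return value
    if value.startswith("https://"):
        # Assume the caller passed a custom model's full self URI.
        return value
    raise ValueError(
        f"{value!r} is not a supported model: use a built-in identifier "
        f"({', '.join(sorted(BUILT_IN_MODELS))}) "
        "or a custom speech model's full self URI."
    )
```

For example, `resolve_model("MAI-Transcribe-1")` raises a `ValueError` before the request ever reaches the service, which surfaces the problem earlier and with a clearer message than the `InvalidArgument` response.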
Please refer to these resources:
- AudioInputTranscriptionOptions.Model – supported values: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, azure-speech: https://learn.microsoft.com/dotnet/api/azure.ai.voicelive.audioinputtranscriptionoptions.model
- TranscriptionOptions class – Python wrapper for transcription settings: https://learn.microsoft.com/dotnet/api/azure.ai.speech.transcription.transcriptionoptions
- Example usage in the Speech SDK samples repo: https://github.com/MicrosoftDocs/azure-ai-docs/blob/main/articles/ai-services/speech-service/includes/common/llm-speech-sdk-python.md
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful?".
Thank you!