Hello Roy Aad,
The error occurs because the value passed in the `models` property must match one of the formats supported by the Speech SDK. `MAI-Transcribe-1` is not a valid value for this parameter, which is why you receive the following error: `HttpResponseError: (InvalidArgument) The specified model does not have the expected format.`
Why This Happens
`MAI-Transcribe-1` is a Microsoft-managed internal base model used by Azure Speech services. It is not exposed as a selectable model identifier in the `TranscriptionOptions` API.
The models parameter supports only:
- Built-in transcription model identifiers, or
- The full resource URI of a custom speech model
Supported Built-In Model Identifiers
You can specify one of the following built-in models:
- azure-speech
- whisper-1
- gpt-4o-transcribe
- gpt-4o-mini-transcribe
For example (a sketch; the exact `TranscriptionOptions` parameter names may differ in your SDK version):

```python
options = TranscriptionOptions(
    locales=["en-US"],
    models={"en-US": "azure-speech"},  # one of the built-in identifiers above
)
```
If you want Azure Speech's default optimized model, azure-speech is the recommended option.
Using a Custom Speech Model
If you have trained a custom speech model, the models dictionary must contain the model's full self URI, not just its name.
Example (the URI below is a placeholder; substitute your model's actual self URI):

```python
options = TranscriptionOptions(
    locales=["en-US"],
    models={
        # full self URI of your custom model (placeholder values)
        "en-US": "https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>"
    },
)
```
You can obtain this URI as follows:
1. In the Azure portal, open your Speech resource
2. Open Speech Studio
3. Navigate to Custom Speech
4. Select your trained model
5. Copy the model's Self URI
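As an illustration, a self URI typically has the shape `https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>` (the API-version segment may vary between SDK releases). A small check like this hedged sketch can catch the common mistake of passing a bare model name instead of the full URI:

```python
import re

# Assumed shape of a Custom Speech model self URI (API-version segment may vary):
#   https://<region>.api.cognitive.microsoft.com/speechtotext/v3.2/models/<model-id>
SELF_URI_PATTERN = re.compile(
    r"^https://[\w.-]+\.api\.cognitive\.microsoft\.com"
    r"/speechtotext/v[\d.]+/models/[\w-]+$"
)

def looks_like_self_uri(value: str) -> bool:
    """Return True if value resembles a full custom-model self URI
    rather than a bare model name like 'MAI-Transcribe-1'."""
    return bool(SELF_URI_PATTERN.match(value))
```

This only validates the shape of the string; the service itself remains the authority on whether the model exists.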
Recommended Approach
If you are using a standard Microsoft model, specify one of the supported built-in identifiers listed above.
If you are using a custom-trained model, provide its full model URI.
If you do not specify the models parameter, Azure Speech will automatically select the most appropriate default model for the chosen locale.
For most Fast Transcription scenarios, omitting `models` is sufficient:

```python
options = TranscriptionOptions(
    locales=["en-US"]  # Azure Speech picks the default model for the locale
)
```
In summary: `MAI-Transcribe-1` cannot be used directly in `TranscriptionOptions`.
Use one of the supported built-in identifiers:
- azure-speech
- whisper-1
- gpt-4o-transcribe
- gpt-4o-mini-transcribe
Or provide the full URI of a custom speech model.
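Putting these rules together, here is a minimal hedged sketch of the selection logic (the set of built-in identifiers is taken from the docs linked below; confirm against your SDK version) that validates a model choice before you build `TranscriptionOptions`:

```python
# Built-in identifiers from the linked documentation; verify for your SDK version.
BUILT_IN_MODELS = {
    "azure-speech",
    "whisper-1",
    "gpt-4o-transcribe",
    "gpt-4o-mini-transcribe",
}

def resolve_model(value: str) -> str:
    """Accept a built-in identifier or a full custom-model self URI;
    reject bare internal names like 'MAI-Transcribe-1'."""
    if value in BUILT_IN_MODELS:
        return value
    if value.startswith("https://"):
        # Assume the caller passed a custom model's full self URI.
        return value
    raise ValueError(
        f"{value!r} is not a supported model: use a built-in identifier "
        f"({', '.join(sorted(BUILT_IN_MODELS))}) "
        "or a custom speech model's full self URI."
    )
```

For example, `resolve_model("MAI-Transcribe-1")` raises a `ValueError` before the request ever reaches the service, which surfaces the problem earlier and with a clearer message than the `InvalidArgument` response.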
Please refer to these resources:
- AudioInputTranscriptionOptions.Model – supported values: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe, azure-speech: https://learn.microsoft.com/dotnet/api/azure.ai.voicelive.audioinputtranscriptionoptions.model
- TranscriptionOptions class – Python wrapper for transcription settings: https://learn.microsoft.com/dotnet/api/azure.ai.speech.transcription.transcriptionoptions
- Example usage in the Speech SDK samples repo: https://github.com/MicrosoftDocs/azure-ai-docs/blob/main/articles/ai-services/speech-service/includes/common/llm-speech-sdk-python.md
I hope this helps. Do let me know if you have any further queries.
If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful?".
Thank you!