Models - Create

Creates a new voice model.

PUT {endpoint}/customvoice/models/{id}?api-version=2026-01-01

URI Parameters

Name In Required Type Description
endpoint
path True

string (uri)

Supported Cognitive Services endpoints (protocol and hostname, for example: https://eastus.api.cognitive.microsoft.com).

id
path True

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

api-version
query True

string

minLength: 1

The API version to use for this operation.
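The constraints on the id path parameter can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service or any SDK; the helper name is_valid_resource_id is hypothetical, and only the pattern itself comes from the table above.

```python
import re

# Pattern copied from the id parameter above. It already enforces the
# documented length limits (minLength 3, maxLength 64).
RESOURCE_ID_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$")


def is_valid_resource_id(value: str) -> bool:
    """Return True if value satisfies the documented id constraints."""
    return RESOURCE_ID_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("Jessica", "Jessica-300", "ab", "-Jessica"):
        print(candidate, is_valid_resource_id(candidate))
```

The same pattern applies to the consentId, projectId, and trainingSetId fields of the request body.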

Request Header

Name Required Type Description
Operation-Id

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

ID of the status monitor for the operation. If the Operation-Id header matches an existing operation and the request is not identical to the prior request, the request fails with a 400 Bad Request.
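One way to get an Operation-Id that meets these constraints is to generate a UUID per logical operation and reuse it only when retrying the identical request. This is a sketch under that assumption: the reference does not require UUIDs, and the helper name new_operation_id is hypothetical.

```python
import re
import uuid

# Pattern copied from the Operation-Id header description above.
OPERATION_ID_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$")


def new_operation_id() -> str:
    """Generate a fresh Operation-Id (a UUID is a convenient choice, not a requirement)."""
    operation_id = str(uuid.uuid4())
    # A canonical UUID string is 36 characters of hex digits and hyphens,
    # so it is expected to pass; the check guards against future changes.
    if OPERATION_ID_PATTERN.fullmatch(operation_id) is None:
        raise ValueError(f"generated id does not match the pattern: {operation_id}")
    return operation_id


if __name__ == "__main__":
    print(new_operation_id())
```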

Request Body

Name Required Type Description
consentId True

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

projectId True

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

recipe True

Recipe

Recipe for model building. Different recipes have different capabilities.

trainingSetId True

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

description

string

Model description

locale

string

The locale of this model. The locale code follows BCP-47. For the list of text to speech locales, see https://learn.microsoft.com/azure/ai-services/speech-service/language-support?tabs=tts.

properties

ModelProperties

Model properties

status

Status

Status of a resource.

voiceName

string

minLength: 1

Voice name

Responses

Name Type Description
200 OK

Model

The request has succeeded.

Headers

Operation-Location: string

201 Created

Model

The request has succeeded and a new resource has been created as a result.

Headers

Operation-Location: string

Other Status Codes

Azure.Core.Foundations.ErrorResponse

An unexpected error response.

Headers

x-ms-error-code: string

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

OAuth2Auth

Type: oauth2
Flow: implicit
Authorization URL: https://login.microsoftonline.com/common/oauth2/authorize

Scopes

Name Description
https://cognitiveservices.azure.com/.default

Examples

Create a model

Sample request

PUT {endpoint}/customvoice/models/Jessica?api-version=2026-01-01


{
  "description": "Jessica voice",
  "consentId": "Jessica",
  "projectId": "Jessica",
  "recipe": {
    "kind": "Default"
  },
  "trainingSetId": "Jessica-300",
  "voiceName": "JessicaNeural"
}
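The sample request can be assembled with the Python standard library. This is a sketch, not an official client: the function name build_create_model_request is hypothetical, "YOUR_SPEECH_KEY" is a placeholder, and the Content-Type header is an assumption that the reference does not state. The code only builds the request and does not send it.

```python
import json
import urllib.request

API_VERSION = "2026-01-01"


def build_create_model_request(endpoint, model_id, body, subscription_key, operation_id=None):
    """Build (but do not send) the PUT request described in this reference."""
    url = f"{endpoint.rstrip('/')}/customvoice/models/{model_id}?api-version={API_VERSION}"
    headers = {
        # Assumption: the body is sent as JSON; the reference does not list this header.
        "Content-Type": "application/json",
        # API key scheme from the Security section.
        "Ocp-Apim-Subscription-Key": subscription_key,
    }
    if operation_id is not None:
        headers["Operation-Id"] = operation_id
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


request = build_create_model_request(
    "https://eastus.api.cognitive.microsoft.com",
    "Jessica",
    {
        "description": "Jessica voice",
        "consentId": "Jessica",
        "projectId": "Jessica",
        "recipe": {"kind": "Default"},
        "trainingSetId": "Jessica-300",
        "voiceName": "JessicaNeural",
    },
    "YOUR_SPEECH_KEY",  # placeholder, not a real key
)
```

Passing the resulting object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access or credentials.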

Sample response

Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/1f4352df-f247-40c0-a7b1-a54d017933e1?api-version=2026-01-01
{
  "description": "Jessica voice",
  "consentId": "Jessica",
  "createdDateTime": "2023-04-01T05:30:00.000Z",
  "engineVersion": "2023.07.04.0",
  "id": "Jessica",
  "lastActionDateTime": "2023-04-02T10:15:30.000Z",
  "locale": "en-US",
  "projectId": "Jessica",
  "recipe": {
    "kind": "Default",
    "version": "V7.2023.03"
  },
  "status": "NotStarted",
  "trainingSetId": "Jessica-300",
  "voiceName": "JessicaNeural"
}
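
The Operation-Location header in the sample response points at an operations resource. A small sketch for pulling the operation ID and API version out of it follows. Treating the last path segment as the operation ID is an inference from the sample URL, not a documented guarantee, and the helper name parse_operation_location is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_operation_location(value):
    """Split an Operation-Location URL into (operation_id, api_version)."""
    parts = urlsplit(value)
    # Assumption: the operation ID is the final path segment, as in the sample.
    operation_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    api_version = parse_qs(parts.query).get("api-version", [None])[0]
    return operation_id, api_version


SAMPLE = (
    "https://eastus.api.cognitive.microsoft.com/customvoice/operations/"
    "1f4352df-f247-40c0-a7b1-a54d017933e1?api-version=2026-01-01"
)
operation_id, api_version = parse_operation_location(SAMPLE)
```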

Create a multi style model

Sample request

PUT {endpoint}/customvoice/models/JessicaMultiStyle?api-version=2026-01-01


{
  "description": "Jessica multi style voice",
  "consentId": "Jessica",
  "locale": "en-US",
  "projectId": "Jessica",
  "properties": {
    "presetStyles": [
      "cheerful",
      "sad"
    ],
    "styleTrainingSetIds": {
      "happy": "JessicaHappy-300",
      "myStyle2": "JessicaStyle2"
    }
  },
  "recipe": {
    "kind": "MultiStyle"
  },
  "trainingSetId": "Jessica-300",
  "voiceName": "JessicaMultiStyleNeural"
}

Sample response

Operation-Location: https://eastus.api.cognitive.microsoft.com/customvoice/operations/a01a127a-c204-4e46-a8c1-fab01559b05b?api-version=2026-01-01
{
  "description": "Jessica multi style voice",
  "consentId": "Jessica",
  "createdDateTime": "2023-04-01T05:30:00.000Z",
  "engineVersion": "2023.07.04.0",
  "id": "JessicaMultiStyle",
  "lastActionDateTime": "2023-04-02T10:15:30.000Z",
  "locale": "en-US",
  "projectId": "Jessica",
  "properties": {
    "presetStyles": [
      "cheerful",
      "sad"
    ],
    "styleTrainingSetIds": {
      "happy": "JessicaHappy-300",
      "myStyle2": "JessicaStyle2"
    },
    "voiceStyles": [
      "cheerful",
      "sad",
      "happy",
      "myStyle2"
    ]
  },
  "recipe": {
    "kind": "MultiStyle",
    "version": "V3.2023.06"
  },
  "status": "NotStarted",
  "trainingSetId": "Jessica-300",
  "voiceName": "JessicaMultiStyleNeural"
}
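
In this sample, voiceStyles in the response contains the presetStyles entries followed by the keys of styleTrainingSetIds. The sketch below reproduces that combination on the client. It is an observation drawn from this one sample, not a documented rule, and the helper name expected_voice_styles is hypothetical.

```python
def expected_voice_styles(properties):
    """Combine preset styles and custom style names, preserving order.

    Assumption: mirrors the ordering seen in the sample response only.
    """
    styles = list(properties.get("presetStyles", []))
    for style in properties.get("styleTrainingSetIds", {}):
        if style not in styles:
            styles.append(style)
    return styles


request_properties = {
    "presetStyles": ["cheerful", "sad"],
    "styleTrainingSetIds": {
        "happy": "JessicaHappy-300",
        "myStyle2": "JessicaStyle2",
    },
}
combined = expected_voice_styles(request_properties)
```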

Definitions

Name Description
Azure.Core.Foundations.Error

The error object.

Azure.Core.Foundations.ErrorResponse

A response containing error details.

Azure.Core.Foundations.InnerError

An object containing more specific information about the error, as described in the Azure REST API guidelines: https://aka.ms/AzureRestApiGuidelines#handling-errors.

Model

Model object

ModelFailureReason

Model training failure reason

ModelProperties

Model properties

PresetStyleItem

Preset styles supported by the recipe. The voice model can support these styles without any style training set.

Recipe

Recipe for model building. Different recipes have different capabilities.

Status

Status of a resource.

Azure.Core.Foundations.Error

The error object.

Name Type Description
code

string

One of a server-defined set of error codes.

details

Azure.Core.Foundations.Error[]

An array of details about specific errors that led to this reported error.

innererror

Azure.Core.Foundations.InnerError

An object containing more specific information than the current object about the error.

message

string

A human-readable representation of the error.

target

string

The target of the error.

Azure.Core.Foundations.ErrorResponse

A response containing error details.

Name Type Description
error

Azure.Core.Foundations.Error

The error object.

Azure.Core.Foundations.InnerError

An object containing more specific information about the error, as described in the Azure REST API guidelines: https://aka.ms/AzureRestApiGuidelines#handling-errors.

Name Type Description
code

string

One of a server-defined set of error codes.

innererror

Azure.Core.Foundations.InnerError

Inner error.
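
Because innererror can nest, a client that wants the most specific code has to walk the chain. A minimal sketch follows, assuming the error body has already been parsed from JSON into a dict. The codes in the example payload are made up for illustration, and the helper name most_specific_error_code is hypothetical.

```python
def most_specific_error_code(error_response):
    """Return the deepest non-empty code in the error / innererror chain."""
    error = error_response.get("error") or {}
    code = error.get("code")
    inner = error.get("innererror")
    while isinstance(inner, dict):
        if inner.get("code"):
            code = inner["code"]
        inner = inner.get("innererror")
    return code


# Hypothetical payload: these code values are illustrative, not real service codes.
example = {
    "error": {
        "code": "BadRequest",
        "message": "The request is invalid.",
        "innererror": {
            "code": "ExampleOuterDetail",
            "innererror": {"code": "ExampleInnerDetail"},
        },
    }
}
deepest = most_specific_error_code(example)
```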

Model

Model object

Name Type Description
consentId

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

createdDateTime

string (date-time)

The timestamp when the object was created, encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ"; see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations).

description

string

Model description

engineVersion

string

Engine version. Updating this version picks up the latest pronunciation bug fixes.

id

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

lastActionDateTime

string (date-time)

The timestamp when the current status was entered, encoded in ISO 8601 date and time format ("YYYY-MM-DDThh:mm:ssZ"; see https://en.wikipedia.org/wiki/ISO_8601#Combined_date_and_time_representations).

locale

string

The locale of this model. The locale code follows BCP-47. For the list of text to speech locales, see https://learn.microsoft.com/azure/ai-services/speech-service/language-support?tabs=tts.

projectId

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

properties

ModelProperties

Model properties

recipe

Recipe

Recipe for model building. Different recipes have different capabilities.

status

Status

Status of a resource.

trainingSetId

string

minLength: 3
maxLength: 64
pattern: ^[a-zA-Z0-9][a-zA-Z0-9._-]{1,62}[a-zA-Z0-9]$

Resource id

voiceName

string

minLength: 1

Voice name
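
The createdDateTime and lastActionDateTime fields of a Model can be parsed with the standard library. The sketch below uses the timestamps from the sample response on this page. The helper name parse_timestamp is hypothetical, and the trailing "Z" is rewritten to "+00:00" so that the code also runs on Python versions before 3.11.

```python
from datetime import datetime


def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2023-04-01T05:30:00.000Z."""
    if value.endswith("Z"):
        # datetime.fromisoformat accepts a literal "Z" only from Python 3.11.
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


created = parse_timestamp("2023-04-01T05:30:00.000Z")
last_action = parse_timestamp("2023-04-02T10:15:30.000Z")
elapsed = last_action - created
```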

ModelFailureReason

Model training failure reason

Value Description
InaccessibleCustomerStorage

The customer uses Bring Your Own Storage in the Speech account, but the storage is currently inaccessible. Check the documentation.

SpeakerVerificationFailed

The consent and training audio are not from the same speaker.

TerminateByUser

The customer canceled model training.

Internal

Custom Voice Service error.

DataNotReady

Training data is not ready for model training.

DataNotEnough

There is not enough training data for model training.

ModelProperties

Model properties

Name Type Description
failedTrainingsets

string[]

IDs of failed training sets.

failureReason

ModelFailureReason

Model training failure reason

presetStyles

string[]

Preset styles of this model.

secondaryLocales

string[]

Secondary locales that this model can speak. Locale code follows BCP-47.

styleTrainingSetIds

object

Customized styles and associated training sets.

voiceStyles

string[]

All styles supported by this model.

PresetStyleItem

Preset styles supported by the recipe. The voice model can support these styles without any style training set.

Name Type Description
female

string[]

Preset styles supported on female voice model.

male

string[]

Preset styles supported on male voice model.

Recipe

Recipe for model building. Different recipes have different capabilities.

Name Type Description
datasetLocales

string[]

The locale of the training dataset. The locale code follows BCP-47. For the list of text to speech locales, see https://learn.microsoft.com/azure/ai-services/speech-service/language-support?tabs=tts.

description

string

Recipe description

kind

string

Recipe kind

maxCustomStyleNum

integer (int32)

Maximum number of customized styles supported in one voice model.

minDurationInSeconds

number (double)

Minimum audio duration in seconds required to train a voice model with this recipe.

minStyleUtteranceCount

integer (int32)

Minimum utterance count required to train each customized style.

minUtteranceCount

integer (int32)

Minimum utterance count required to train a voice model with this recipe.

modelLocales

string[]

The locale that a voice model can speak with this recipe. The locale code follows BCP-47. For the list of text to speech locales, see https://learn.microsoft.com/azure/ai-services/speech-service/language-support?tabs=tts.

presetStyles

<string, PresetStyleItem>

Preset styles supported by this recipe per locale. You can get these styles without any style training set.

version

string

Recipe version
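
The minUtteranceCount and minDurationInSeconds fields lend themselves to a client-side check before a model is created. The following is a sketch only: the function name recipe_shortfalls is hypothetical, the recipe values in the example are made up, and how utterance counts and durations are obtained for a training set is outside this reference.

```python
def recipe_shortfalls(recipe, utterance_count, duration_seconds):
    """Return human-readable reasons why the data falls short of the recipe minimums."""
    problems = []
    min_count = recipe.get("minUtteranceCount")
    if min_count is not None and utterance_count < min_count:
        problems.append(f"need at least {min_count} utterances, have {utterance_count}")
    min_duration = recipe.get("minDurationInSeconds")
    if min_duration is not None and duration_seconds < min_duration:
        problems.append(f"need at least {min_duration} seconds of audio, have {duration_seconds}")
    return problems


# Hypothetical recipe values, for illustration only.
example_recipe = {"kind": "Default", "minUtteranceCount": 300, "minDurationInSeconds": 1200.0}
ok = recipe_shortfalls(example_recipe, 300, 1500.0)
short = recipe_shortfalls(example_recipe, 120, 600.0)
```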

Status

Status of a resource.

Value Description
NotStarted

NotStarted

Running

Running

Succeeded

Succeeded

Failed

Failed

Disabling

Disabling

Disabled

Disabled
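
When polling a model after creation, it helps to model the status values as an enumeration and decide which ones mean "keep waiting". The grouping below is an assumption: the reference lists the values but does not say which are transient. The names IN_PROGRESS and should_keep_polling are hypothetical.

```python
from enum import Enum


class Status(str, Enum):
    """Status values listed in this reference."""

    NOT_STARTED = "NotStarted"
    RUNNING = "Running"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"
    DISABLING = "Disabling"
    DISABLED = "Disabled"


# Assumption: these values are transient and the rest are final.
IN_PROGRESS = {Status.NOT_STARTED, Status.RUNNING, Status.DISABLING}


def should_keep_polling(status_value):
    """Return True while the resource is assumed to still be changing state."""
    # Status(...) raises ValueError for a value not listed in the reference.
    return Status(status_value) in IN_PROGRESS
```

Raising on an unknown value is a deliberate choice in this sketch, so that a status introduced by a later API version is noticed rather than silently treated as final.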