mistral-document-ai-2512: intermittent 408 and 503 errors on PDF inputs (document_url) while image_url works

Noe G 0 Reputation points
2026-05-05T14:30:01.54+00:00

Summary

Calls to a mistral-document-ai-2512 serverless deployment on Microsoft Foundry intermittently return HTTP 408 "upstream request timeout" and HTTP 503 "service unavailable" when the payload uses document_url with a base64-encoded PDF. The same endpoint, deployment, model, and credentials succeed reliably when the payload uses image_url with a base64-encoded image. PDF inputs worked consistently until a few days ago, with no change on our side.

Endpoint pattern

https://<resource>.cognitiveservices.azure.com/providers/mistral/azure/ocr

Model

mistral-document-ai-2512 (deployed as global-standard serverless)

Failing request body (PDF)


{
  "model": "mistral-document-ai-2512",
  "document": {
    "type": "document_url",
    "document_url": "data:application/pdf;base64,<...>"
  },
  "include_image_base64": false
}

Working request body (image, same endpoint)


{
  "model": "mistral-document-ai-2512",
  "document": {
    "type": "image_url",
    "image_url": "data:image/png;base64,<...>"
  },
  "include_image_base64": false
}
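
For completeness, both bodies are sent as a plain HTTPS POST. A minimal sketch of our client call (Python with requests; the api-key header name and the placeholder values come from our own setup, so treat them as assumptions):

import base64
import requests

ENDPOINT = "https://<resource>.cognitiveservices.azure.com/providers/mistral/azure/ocr"
HEADERS = {"api-key": "<key>", "Content-Type": "application/json"}  # our auth scheme

def ocr_pdf(path: str) -> requests.Response:
    # Base64-encode the PDF and wrap it in the document_url payload above.
    b64 = base64.b64encode(open(path, "rb").read()).decode("ascii")
    body = {
        "model": "mistral-document-ai-2512",
        "document": {
            "type": "document_url",
            "document_url": f"data:application/pdf;base64,{b64}",
        },
        "include_image_base64": False,
    }
    return requests.post(ENDPOINT, headers=HEADERS, json=body, timeout=180)

print(ocr_pdf("sample.pdf").status_code)  # intermittently 408 or 503 for us

Swapping the document block for the image_url variant above is the only change needed to reproduce the working image case.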

Error responses

Two distinct errors are returned, both server-side:

408 - upstream request timeout

503 - service unavailable

The errors appear non-deterministic — the same PDF payload can return 408 on one attempt and 503 on the next. We are far below our 50 RPM quota in every case (no 429 has ever been observed).

Reproduction details

  • PDF size tested: small (single page, well under the documented 30 MB / 30 page limit)
  • Also tested with pages: [0] to restrict processing to the first page only — same 408 / 503 mix
  • Tried with include_image_base64 set to both true and false — same behavior
  • Client-side timeout set to 180000 ms — errors are clearly server-side, not client-side
  • Multiple retries with 30 to 45 second spacing all fail
  • Image inputs (PNG, same byte range) succeed within seconds on the same endpoint

What we have tried

  • Increased client timeout up to 180 seconds
  • Added retry logic (up to 5 attempts, 45 seconds between tries); a sketch follows this list
  • Sent only pages: [0] to minimize processing load
  • Toggled include_image_base64
  • Verified the payload matches the documented API contract
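
The retry logic is nothing exotic; roughly the following, reusing ENDPOINT and HEADERS from the sketch above with our 5-attempt / 45-second settings:

import time
import requests

RETRYABLE = {408, 503}  # the two statuses we actually observe

def post_with_retries(body: dict, attempts: int = 5, delay_s: int = 45) -> requests.Response:
    resp = None
    for i in range(attempts):
        resp = requests.post(ENDPOINT, headers=HEADERS, json=body, timeout=180)
        if resp.status_code not in RETRYABLE:
            return resp  # success, or an error that retrying will not fix
        if i < attempts - 1:
            time.sleep(delay_s)  # fixed spacing, matching what we tried
    return resp  # exhausted: still 408/503 for PDF payloads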

Questions

  1. Is there a known regression or capacity issue affecting the document_url (PDF) path of mistral-document-ai-2512 over the past few days?
  2. Is the PDF processing pipeline routed differently from the image_url pipeline? The clear asymmetry (PDFs fail, images succeed on the same endpoint) suggests separate backend handling.
  3. Are 408 and 503 expected responses when underlying capacity is constrained, even when the client is well below its assigned RPM quota?
  4. Are there recommended workarounds while this is investigated, other than client-side PDF rasterization to PNG images? (The fallback we are prototyping is sketched below.)
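
For reference, the rasterization fallback we are prototyping converts each PDF page to a PNG client-side and sends it through the image_url path that still works. This is only a sketch; pdf2image is our library choice (it requires poppler) and the helper name is ours:

import base64
import io

from pdf2image import convert_from_path  # requires poppler to be installed

def pdf_pages_as_png_data_urls(path: str, dpi: int = 200) -> list[str]:
    # Render each PDF page to a PIL image, encode it as PNG, and wrap it in
    # the data-URL form expected by the image_url request body above.
    urls = []
    for page in convert_from_path(path, dpi=dpi):
        buf = io.BytesIO()
        page.save(buf, format="PNG")
        urls.append("data:image/png;base64,"
                    + base64.b64encode(buf.getvalue()).decode("ascii"))
    return urls

Each returned data URL is then submitted with the working image_url body shown earlier; the obvious costs are extra CPU, larger payloads, and loss of any text layer the PDF already had.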

Thanks for any guidance from the Foundry or Mistral team.

Foundry Models

A catalog of AI models in Microsoft Foundry that you can discover, compare, and deploy using Azure’s built‑in tools for evaluation, fine‑tuning, and inference


2 answers

  1. Noe G 0 Reputation points
    2026-05-06T07:19:09.5266667+00:00

    Sweden Central, but it has been fixed this morning. I think it was just a server overload.


  2. Q&A Assist
    2026-05-05T14:31:17.38+00:00
    AI-generated content: this answer was created with AI from Q&A Assist.

    The context only documents mistral-document-ai-2512 as an image-to-text model that accepts image or PDF pages as input and describes general behavior for capacity and throttling, but it doesn’t contain any product note or incident information about regressions, nor internal routing details for document_url vs image_url. It also doesn’t document 408/503 patterns specific to this model.

    From the available information, only the following can be stated:

    1. mistral-document-ai-2512 capabilities

    mistral-document-ai-2512 is listed as an Image-to-Text model with:

    • Input: image or PDF pages (30 pages, max 30 MB PDF file)
    • Output: text
    • Languages: en
    • Response formats: Text, JSON, Markdown
    • Deployment type: Global standard (all regions) and Data zone standard (US and EU)

    This confirms that PDF input is officially supported for this model, but the context doesn’t describe any current regression or known issue with the PDF (document_url) path.

    2. Capacity and error semantics

    The provisioned-throughput documentation explains how capacity issues are normally surfaced:

    • When capacity is exceeded, the API returns HTTP 429 and continues to do so until utilization drops below 100%.
    • 429 is explicitly described as the signal for capacity limits, with retry-after headers to guide client behavior.

    There is no mention in the context of 408 or 503 being the expected signal for capacity exhaustion on Foundry Models; instead, 429 is the documented behavior when capacity is constrained.
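
    As an illustration only (not taken from the cited documentation): a client that honors the documented 429 retry-after signal, and falls back to exponential backoff for the 503s the question reports, could look roughly like this in Python:

    import time
    import requests

    def post_with_capacity_backoff(url, headers, body, attempts=5):
        resp = None
        for i in range(attempts):
            resp = requests.post(url, headers=headers, json=body, timeout=180)
            if resp.status_code not in (429, 503):
                return resp
            # 429 should carry a Retry-After header per the capacity docs;
            # 503 may not, so fall back to exponential backoff (1, 2, 4, ... s).
            time.sleep(float(resp.headers.get("Retry-After", 2 ** i)))
        return resp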

    3. Known error patterns for Foundry / Azure OpenAI

    The batch and “on your data” troubleshooting content documents other error codes and conditions (e.g., too_many_tasks, model_not_found, invalid_request, search 503s due to Azure AI Search throttling), but none of these are specific to mistral-document-ai-2512 or to PDF ingestion via document_url.

    The 503 guidance that does exist is for Azure AI Search queries in “on your data” scenarios, where 503 can be caused by insufficient replicas/partitions and mitigated by scaling or adding retry/sleep logic. That guidance is specific to search resources and doesn’t describe the Mistral OCR endpoint.

    4. Answers to the specific questions (limited to what is in context)

       4.1. Known regression or capacity issue on document_url for mistral-document-ai-2512

    The provided documentation and Q&A snippets contain no statement about any regression, outage, or capacity issue specific to mistral-document-ai-2512, nor to its document_url (PDF) path.

       4.2. Whether PDF processing is routed differently from image_url

    The context only states that the model accepts “image or PDF pages” as input. It does not describe internal routing, separate pipelines, or architectural differences between PDF and image handling for mistral-document-ai-2512.

       4.3. Whether 408 and 503 are expected for capacity constraints

    The capacity documentation for Foundry Models states that when capacity is exceeded, the API returns 429 and that this is by design as the capacity signal. There is no documentation here that 408 or 503 are used as the standard signal for capacity issues on Foundry Models, even when within RPM quota.

       4.4. Recommended workarounds beyond rasterizing PDFs

    The context does not provide any model-specific workaround guidance for mistral-document-ai-2512 PDF issues. General guidance for capacity or backend issues includes:

    • Retrying with backoff when receiving capacity-related responses (documented for 429).
    • Redirecting traffic to other deployments or models when a given deployment is fully utilized.

    However, there is no explicit recommendation in the context for handling intermittent 408/503 on this specific OCR endpoint, nor any alternative PDF ingestion pattern beyond what is already being used.

    Given the constraints of the provided documentation, any further guidance (such as confirming a regression, explaining backend routing, or proposing product-level mitigations) would require escalation through Azure support or service health channels, which are not described in detail in this context.

