Adding new requirements to a Synapse Spark pool can surface dependency conflicts or library-installation errors that cause previously working packages to fail. In Synapse, Python libraries for a Spark pool are resolved as a single Conda environment from the requirements file; if any package (including the new one) fails to install or conflicts with existing ones, the pool silently falls back to the base runtime and your previous requirements appear to “break.”
To add new requirements without breaking existing ones, follow these steps:
- Recreate the Synapse runtime environment locally
  - Download the base Synapse runtime environment template YAML for the Spark version in use (for example, from the runtime documentation such as Apache Spark 3.4 GA).
  - Create a local Conda environment from that YAML and activate it:

    ```bash
    conda env create -n myenv -f environment.yml
    conda activate myenv
    ```

  - This approximates the Synapse base runtime where the pool requirements are applied; a snapshot tip follows below.
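  As a convenience (not a documented Synapse step), snapshotting the base environment right after creating it makes later conflicts easier to diagnose; the file name here is arbitrary:

  ```bash
  # Record what the base runtime ships with, for diffing after pip installs
  pip list --format=freeze > base-runtime-packages.txt
  ```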
- Test the combined requirements locally
  - Take the exact requirements file used by EV2 (the one that includes both `azure.kusto` and the new `azure.cosmos` requirement) and install it into the local environment:

    ```bash
    pip install -r requirements.txt
    ```

  - If installation fails locally, there is a dependency conflict between the base runtime and the requirements, or between `azure.kusto` and `azure.cosmos` (or their transitive dependencies). Adjust versions or remove conflicting packages until `pip install -r requirements.txt` succeeds, as in the sketch below.
  - Only when the combined requirements install cleanly locally should they be deployed to the Synapse Spark pool.
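  A minimal validation loop, assuming the PyPI package names are `azure-kusto-data` and `azure-cosmos` (the actual names and pins come from your EV2 requirements file); `pip check` catches incompatibilities that a seemingly successful install can still leave behind:

  ```bash
  # Fresh throwaway env built from the runtime template, then the combined requirements
  conda env create -n synapse-test -f environment.yml
  conda activate synapse-test

  # Hypothetical file containing both the existing and the new packages,
  # e.g. azure-kusto-data and azure-cosmos
  pip install -r requirements.txt

  # A clean install can still leave broken dependency metadata; pip check reports it
  pip check
  ```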
- Deploy the corrected requirements to the Spark pool
  - Update the pool-level requirements file (via EV2/ARM) with the validated set of packages; a CLI alternative is sketched below.
  - After updating, force the Spark pool to pick up the new libraries by using Force new settings on the pool. This ends all current sessions and restarts the pool so the new environment is applied.
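  For pools managed outside EV2, the same update can be sketched with the Azure CLI. The workspace, resource group, and pool names here are placeholders, and you should verify the `--library-requirements` parameter of `az synapse spark pool update` against your CLI version:

  ```bash
  # Hypothetical names; adjust to your workspace and pool
  az synapse spark pool update \
    --workspace-name my-synapse-ws \
    --resource-group my-rg \
    --name mysparkpool \
    --library-requirements requirements.txt

  # Then apply "Force new settings" from Synapse Studio so running sessions
  # end and the pool restarts with the new environment.
  ```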
- If using a DEP-enabled workspace
  - Installing packages directly from public repositories is not supported. Upload all required wheels (for example, for `azure.cosmos` and any additional dependencies) as workspace libraries and attach them to the Spark pool instead of pulling from PyPI.
  - Use the same local-environment approach to discover all required wheel dependencies before uploading; see the sketch below.
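  One way to collect the full closure of wheels for offline upload is `pip download` run inside the local environment built earlier; `--only-binary=:all:` refuses source distributions, which a DEP-enabled pool could not build anyway:

  ```bash
  # Gather every wheel the requirements pull in; the directory name is arbitrary
  pip download -r requirements.txt -d wheels/ --only-binary=:all:

  # If your local platform differs from the pool's Linux runtime, add
  # --platform and --python-version (both require --only-binary=:all:).

  # Upload everything in wheels/ as workspace libraries, then attach them to the pool
  ls wheels/
  ```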
- General guidance
  - Any error in the requirements file (invalid package, incompatible version, or dependency conflict) causes the pool to revert to the base runtime, which can look like previously working requirements suddenly failing.
  - Always validate the full, combined requirements set in a local environment that mirrors the Synapse runtime before updating the pool.
References:
- Manage libraries for Apache Spark in Azure Synapse Analytics
- Azure Synapse runtimes
- Troubleshoot library installation errors
- LibraryRequirements Class
- Microsoft.Synapse workspaces/bigDataPools 2021-06-01 (deployment-language-terraform)
- Microsoft.Synapse workspaces/bigDataPools 2021-06-01-preview (deployment-language-terraform)