Hi Inquisitive,
It sounds like you're trying to manage the degree of parallelism in Azure Data Factory (ADF) for your copy and foreach activities, and you're curious about how these settings interact when fetching data from multiple tables.
Here's a bit of insight:
- Degree of Parallelism: This setting on the copy activity controls the maximum number of parallel threads a single copy activity can use to read from the source or write to the sink. You mentioned setting it to 5, so each copy run can use up to 5 parallel connections, although the service may use fewer at run time. If this number is set too high, it can lead to performance issues or throttling, especially if you're working with sources that cap concurrent queries (like 32 queries at once).
- ForEach Batch Count: The batch count decides how many iterations the foreach activity runs in parallel when sequential execution is turned off. You've set this to 15, which means ADF creates up to 15 queues that run in parallel, while the items inside each queue are processed one after another.
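To make the queue behaviour more concrete, here is a minimal Python sketch of how a batch count splits a list of tables into queues. It is only an illustration, not how the service is implemented: the `build_queues` helper, the table names, and the round-robin assignment are all assumptions.

```python
from typing import List


def build_queues(items: List[str], batch_count: int) -> List[List[str]]:
    """Distribute items over at most `batch_count` queues created up front.

    Round-robin assignment is an assumption made for illustration only.
    """
    if batch_count < 1:
        raise ValueError("batch_count must be at least 1")
    queue_count = min(batch_count, len(items))
    queues: List[List[str]] = [[] for _ in range(queue_count)]
    for index, item in enumerate(items):
        queues[index % queue_count].append(item)
    return queues


# 40 hypothetical table names, split with a batch count of 15
tables = [f"table_{i:02d}" for i in range(1, 41)]
queues = build_queues(tables, batch_count=15)

print(len(queues))  # queues that can run side by side
print(queues[0])    # tables the first queue works through one after another
```

Each inner list stands for one queue: the queues run side by side, but the tables inside a queue are copied one at a time.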
In your scenario:
- The foreach activity can run up to 15 copy activities at the same time (one per queue), and each of those copy activities can use up to 5 parallel copies. The two settings apply at different levels, so they multiply rather than cap each other: in the worst case the source could see up to 15 × 5 = 75 concurrent connections, which is well above a limit like 32 concurrent queries. The actual number may be lower, because the service decides how many parallel copies each run really uses.
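As a rough way to check a configuration against a source limit, the arithmetic can be sketched in Python. This assumes the degree of parallelism applies within each copy activity and the batch count applies across foreach iterations. The function name, the item count of 40, and the constant are hypothetical, and the result is an upper bound rather than what the service will actually use.

```python
def max_source_connections(item_count: int, batch_count: int, parallel_copies: int) -> int:
    """Upper bound on simultaneous source connections for a foreach wrapping one copy activity."""
    concurrent_activities = min(item_count, batch_count)
    return concurrent_activities * parallel_copies


SOURCE_QUERY_LIMIT = 32  # the example limit mentioned above; check the limit of your own source

worst_case = max_source_connections(item_count=40, batch_count=15, parallel_copies=5)
print(worst_case)                       # 75
print(worst_case > SOURCE_QUERY_LIMIT)  # True

# Largest batch count that stays within the limit at 5 parallel copies
safe_batch_count = SOURCE_QUERY_LIMIT // 5
print(safe_batch_count)                 # 6
```

If the bound comes out above the source limit, lowering either the batch count or the degree of parallelism brings it back down.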
Recommendations:
- Monitor Performance: Keep an eye on the performance metrics to ensure that your pipeline is running efficiently, and adjust the degree of parallelism if you notice any bottlenecks.
- Testing: Test different configurations to find the sweet spot that works best for your workload.
Feel free to reach out if you need more specific guidance or have further questions!
Hope this helps!