Copy Activity with maximum N files in 1 trigger?

Hrithik Purwar 25 Reputation points
2024-08-13T16:10:30.12+00:00

How can I configure my pipeline/copy activity to copy a maximum of 100 files at a time (one trigger) from one blob storage to another in binary mode (using the delete after copy option), given that the source contains thousands of files?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. Pinaki Ghatak 5,690 Reputation points Microsoft Employee Volunteer Moderator
    2024-08-14T15:26:14.56+00:00

    Hello @Hrithik Purwar

    To copy a maximum of 100 files at a time from one blob storage to another in binary mode, you can use the "Get Metadata" activity to get the list of files in the source blob storage, and then use the "ForEach" activity to iterate over the list of files and copy them to the destination blob storage.
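The intended per-trigger behaviour can be sketched in plain Python before building anything in Data Factory. This is only a simulation under stated assumptions: two in-memory dicts stand in for the source and destination containers, and `run_trigger` is a hypothetical helper name, not an ADF API.

```python
def run_trigger(source, sink, max_files=100):
    """Simulate one pipeline run: copy at most max_files blobs, then delete them."""
    # Stand-in for listing the source folder, capped at max_files entries.
    batch = sorted(source)[:max_files]
    for name in batch:
        sink[name] = source[name]  # binary copy: the bytes are moved unchanged
        del source[name]           # "delete after copy" on the source side
    return len(batch)


# 250 hypothetical files; three triggers drain them in batches of 100, 100 and 50.
source = {f"file_{i:04d}.bin": b"payload" for i in range(250)}
sink = {}
copied_per_run = [run_trigger(source, sink) for _ in range(3)]
print(copied_per_run)  # [100, 100, 50]
```

Because every run deletes what it copied, each later trigger naturally picks up the next batch without any extra bookkeeping.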

    Here are the steps to configure your pipeline:

    1. Add a "Get Metadata" activity to your pipeline and point its dataset at the source folder in blob storage. In the settings of the "Get Metadata" activity, add "Child items" to the "Field list". The activity then returns a childItems array containing the name and type (File or Folder) of each entry directly under that folder. It does not descend into subfolders.
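    2. Add a "ForEach" activity to your pipeline and configure it to iterate over the list of files returned by the "Get Metadata" activity. In the settings of the "ForEach" activity, set the "Items" option to the childItems array from the output, for example @activity('Get Metadata1').output.childItems, where Get Metadata1 stands for whatever you named the activity in step 1.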
    3. Inside the "ForEach" activity, add a "Copy Data" activity to copy each file to the destination blob storage. Use Binary datasets for both source and sink. Give the source dataset a file name parameter and pass @item().name to it, so that each iteration copies only the file currently being iterated over, and point the sink dataset at the destination blob storage.
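    4. To limit the number of files copied in one trigger to 100, cap the list itself by wrapping the "Items" expression of the "ForEach" activity in take, for example @take(activity('Get Metadata1').output.childItems, 100). Do not rely on the "Batch Count" option for this: it only controls how many iterations run in parallel (up to 50) when "Sequential" is unchecked, and it does not limit the total number of files processed.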
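    5. To delete the source files after they have been copied to the destination blob storage, enable the "Delete files after completion" option in the source settings of the "Copy Data" activity (deleteFilesAfterCompletion in the pipeline JSON). Because each run removes the files it has copied, the next trigger picks up the next batch of up to 100 files.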
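The steps above can be sketched as a pipeline definition. The Python below only builds the JSON as a dict and prints it, so that it can be compared with what the authoring UI generates. Treat it as a rough sketch: the activity, dataset, and parameter names (`GetFileList`, `SourceFolder`, `SourceFile`, `SinkFolder`, `fileName`) are hypothetical placeholders, and the property names reflect my understanding of the pipeline JSON schema, so please verify them against the JSON exported from your own factory.

```python
import json

MAX_FILES = 100  # cap on files handled per pipeline run


def build_pipeline(max_files=MAX_FILES, batch_count=20):
    """Return a pipeline definition as a dict. All names are placeholders."""
    get_metadata = {
        "name": "GetFileList",
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": {"referenceName": "SourceFolder", "type": "DatasetReference"},
            "fieldList": ["childItems"],  # list the entries under the source folder
        },
    }
    copy_file = {
        "name": "CopyOneFile",
        "type": "Copy",
        "inputs": [{
            "referenceName": "SourceFile",
            "type": "DatasetReference",
            # Pass the current item's name to the (assumed) dataset parameter.
            "parameters": {"fileName": {"value": "@item().name", "type": "Expression"}},
        }],
        "outputs": [{"referenceName": "SinkFolder", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {
                "type": "BinarySource",
                "storeSettings": {
                    "type": "AzureBlobStorageReadSettings",
                    "deleteFilesAfterCompletion": True,  # delete after copy
                },
            },
            "sink": {
                "type": "BinarySink",
                "storeSettings": {"type": "AzureBlobStorageWriteSettings"},
            },
        },
    }
    for_each = {
        "name": "ForEachFile",
        "type": "ForEach",
        "dependsOn": [{"activity": "GetFileList", "dependencyConditions": ["Succeeded"]}],
        "typeProperties": {
            # take() caps the total number of items; batchCount is only parallelism.
            "items": {
                "value": f"@take(activity('GetFileList').output.childItems, {max_files})",
                "type": "Expression",
            },
            "isSequential": False,
            "batchCount": batch_count,
            "activities": [copy_file],
        },
    }
    return {"name": "CopyMax100Files", "properties": {"activities": [get_metadata, for_each]}}


pipeline = build_pipeline()
print(json.dumps(pipeline, indent=2))
```

The cap lives in the "Items" expression rather than in "Batch Count", so changing the limit later only means changing one number.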

    I hope that this response has addressed your query and helped you overcome your challenges. If so, please mark this response as Answered. This will not only acknowledge our efforts, but also assist other community members who may be looking for similar solutions.

