Dockerfile: Choosing Between COPY and ADD

Understanding COPY and ADD in Dockerfiles

Dockerfiles are the blueprints for building Docker images. Within a Dockerfile, the COPY and ADD instructions bring files and directories from the build context (the directory on your host machine that you pass to docker build) into the image. They look similar, but they differ in ways that affect how predictable and maintainable your images are.

Core Functionality

Both COPY and ADD serve the primary purpose of copying files from a source location to a destination within the Docker image’s filesystem. The basic syntax is:

COPY <src> <dest>
ADD <src> <dest>

<src> specifies the file or directory to copy, and <dest> is the destination path inside the image. Multiple COPY or ADD instructions can be used in a single Dockerfile.
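As a small sketch of the common forms, assuming an Alpine base image and hypothetical file names (config.yml, package.json, package-lock.json, and src/ are placeholders, not part of any real project):

```dockerfile
FROM alpine:3.20

# Copy a single file to an explicit destination path.
COPY config.yml /etc/myapp/config.yml

# Copy several sources at once; the destination must then be a
# directory and end with a slash.
COPY package.json package-lock.json /app/

# Copy the contents of a directory (not the directory itself) into /app/src/.
COPY src/ /app/src/
```

Source paths are resolved relative to the build context, so these files would need to exist in the directory passed to docker build.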

Key Differences

The core distinction lies in their capabilities. COPY does one thing: it copies files. ADD does the same but adds two extra features, which can be both helpful and potentially problematic:

Tar Archive Extraction: If <src> is a local tar archive (gzip, bzip2, xz), ADD will automatically extract it into the destination directory. COPY simply copies the archive file itself.
Remote URL Support: ADD can accept a URL as the source, downloading the file and placing it in the image. A file fetched this way is stored as-is and is not extracted, even if it is a tar archive. COPY does not accept URLs.
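The contrast can be sketched as follows. The archive name vendor.tar.gz, the destination paths, and the URL are hypothetical placeholders, so this is an illustration rather than a buildable recipe:

```dockerfile
FROM alpine:3.20

# ADD with a local tar archive: the archive is unpacked into /opt/vendor/.
ADD vendor.tar.gz /opt/vendor/

# COPY with the same archive: the file is copied unchanged
# and ends up as /opt/archives/vendor.tar.gz.
COPY vendor.tar.gz /opt/archives/

# ADD with a URL: the file is downloaded and stored as /tmp/tool.tar.gz.
# Archives fetched from a URL are not unpacked.
ADD https://example.com/tool.tar.gz /tmp/
```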

When to Use COPY

In most scenarios, COPY is the preferred choice. Here’s why:

Transparency: COPY’s behavior is predictable – it copies files as-is. This makes your Dockerfile easier to understand and maintain.
Image Layering: Each COPY, ADD, and RUN instruction creates a new layer, and a file written in one layer still takes up space in the image even if a later instruction deletes it. A file that ADD downloads from a URL therefore stays in the image permanently. COPY cannot download anything, so the usual alternative is a single RUN instruction that downloads, extracts, and deletes the file; only the extracted contents end up in the layer, which can produce a smaller image.
Best Practice: Docker’s own Dockerfile best-practices guidance recommends COPY for basic file copying, reserving ADD for cases that need its extra features.

Example:

COPY application.jar /app/application.jar

This simply copies the application.jar file from your host machine into the /app directory within the Docker image.

When to Use ADD

ADD is best suited to one specific scenario:

Local Tar Archive Extraction: If you need to automatically extract a local tar archive during image creation, ADD provides a convenient shortcut.

Example:

ADD rootfs.tar.gz /

This will extract the contents of rootfs.tar.gz directly into the root directory of the image.

Avoiding Common Pitfalls

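Remote URL Downloads: ADD supports remote URLs, but it is often discouraged for files you only need temporarily. The downloaded file is stored in its layer as-is, a downloaded archive is not extracted, and deleting the file in a later instruction does not reclaim the space. Instead, use RUN with a tool like curl or wget, and download, extract, and delete the file within the same RUN instruction. No COPY is needed afterwards, because the file is already inside the image being built. This keeps the archive itself out of the image and reduces image size.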
Unexpected Archive Extraction: Be careful when using ADD with local paths. ADD decides whether to extract based on the file’s contents, not its name, so any local file in a recognized archive format is unpacked automatically, even when you intended to ship the archive itself.
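One way to sketch the RUN-based download, assuming an Alpine base image and a placeholder URL (https://example.com/tool.tar.gz and /opt/tool are hypothetical):

```dockerfile
FROM alpine:3.20

# Download, unpack, and delete the archive in one RUN instruction so the
# archive itself is never stored in a layer.
RUN apk add --no-cache curl \
 && mkdir -p /opt/tool \
 && curl -fsSL https://example.com/tool.tar.gz -o /tmp/tool.tar.gz \
 && tar -xzf /tmp/tool.tar.gz -C /opt/tool \
 && rm /tmp/tool.tar.gz
```

Because every step happens inside a single RUN instruction, the resulting layer contains only the extracted files, plus curl, which this sketch leaves installed.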

Best Practices

Favor COPY: Choose COPY unless you specifically need the archive extraction or remote URL features of ADD.
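Explicit Extraction: If you need to extract an archive and want the step to be visible in the Dockerfile, use COPY to bring the archive into the image and then RUN with a tool like tar to unpack it. Keep in mind that the archive remains in the COPY layer even if a later instruction deletes it.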
Minimize Layers: Combine related instructions to reduce the number of image layers.
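Clean Up: Delete unnecessary files in the same RUN instruction that created them. Deleting them in a later instruction hides them from the final filesystem but does not shrink the image, because the earlier layer still contains them.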
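As a rough sketch of explicit extraction combined with cleanup, one possible approach is a multi-stage build: the archive is unpacked in a throwaway stage and only the extracted files are copied into the final image. The archive name app.tar.gz, the stage name unpack, and the path /opt/app are hypothetical.

```dockerfile
# Throwaway stage: holds the archive and unpacks it explicitly.
FROM alpine:3.20 AS unpack
COPY app.tar.gz /tmp/app.tar.gz
RUN mkdir -p /opt/app \
 && tar -xzf /tmp/app.tar.gz -C /opt/app

# Final stage: receives only the extracted files, never the archive.
FROM alpine:3.20
COPY --from=unpack /opt/app /opt/app
```

The layers of the unpack stage are not part of the final image, so the archive does not contribute to its size.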

By understanding the nuances of COPY and ADD, you can create more efficient, maintainable, and predictable Docker images.