
dcp's Introduction

dcp: docker cp made easy

GitHub Actions Latest version MIT licensed

Summary

Containers are great tools that encapsulate an application and its dependencies, allowing apps to run anywhere in a streamlined way. Some container images contain commands to start a long-lived binary, whereas others simply contain data that needs to be available in an environment such as a Kubernetes cluster. For example, operator-framework bundles and crossplane packages both use container images to store Kubernetes manifests. These manifests are unpacked and applied to the cluster.

One of the downsides of using container images to store data is that they are opaque. There's no way to quickly tell what's inside the image, although the hash digest is useful in seeing whether the image has changed from a previous version. The options are to use docker cp or something similar using podman or containerd.

Using docker cp by itself can be cumbersome. Say you have a remote image somewhere in a registry. You have to pull the image, create a container from that image, and only then run docker cp <container-id> using an unintuitive syntax for selecting what should be copied to the local filesystem.

dcp is a small binary that simplifies this workflow. A user can run dcp <image-name>, and dcp extracts the contents of that image onto the local filesystem. From there, users are free to view and edit the files locally. Any OCI-based image is supported.

Installing

Installing from crates.io

If you're a Rust programmer and have Rust installed locally, you can install dcp by simply entering cargo install dcp, which will fetch the latest version from crates.io. dcp relies on the stable Rust toolchain.

Download compiled binary

The release section has a number of precompiled versions of dcp for different platforms. Linux, macOS, and Windows (experimental) binaries are pre-built. For macOS, both arm and x86 targets are provided, and for Linux only x86 is provided. If your system is not supported, building dcp from source is straightforward.

Build from source

To build from source, ensure that you have the Rust toolchain installed locally. This project does not rely on nightly and uses the 1.62-stable toolchain. Clone the repository and run cargo build --release to build a release version of the binary. From there, you can move the binary to a folder on your $PATH to access it easily.

Implementation

Because there wasn't a suitable containerd client implementation in Rust at the time of writing, dcp relies on APIs provided by external docker and podman crates. This limits dcp to working on systems where docker or podman is the container runtime.

By default, dcp will look for an active docker socket to connect to at the standard path. If the docker socket is unavailable, dcp will fall back to the current user's podman socket based on the $XDG_RUNTIME_DIR environment variable.

If the docker socket is on a remote host, or in a custom location, use the -s flag with the path to the custom socket.
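The lookup order described above can be sketched as a small function. This is only an illustration, not dcp's actual code: the function name resolve_socket, the docker path /var/run/docker.sock, and the podman path podman/podman.sock under $XDG_RUNTIME_DIR are assumptions based on common defaults, and the exists callback stands in for a real filesystem check.

```rust
use std::path::{Path, PathBuf};

/// Assumed default docker socket location (common on Linux and macOS).
const DOCKER_SOCKET: &str = "/var/run/docker.sock";

/// Hypothetical helper: picks the socket to connect to.
/// `custom` models the value of the -s flag, `xdg_runtime_dir` the
/// $XDG_RUNTIME_DIR variable, and `exists` a check that a socket is present.
pub fn resolve_socket(
    custom: Option<&str>,
    xdg_runtime_dir: Option<&str>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    // An explicitly provided socket always wins.
    if let Some(path) = custom {
        return Some(PathBuf::from(path));
    }

    // Otherwise prefer the docker socket at its standard path.
    let docker = PathBuf::from(DOCKER_SOCKET);
    if exists(&docker) {
        return Some(docker);
    }

    // Fall back to the current user's podman socket (assumed layout).
    let podman = PathBuf::from(xdg_runtime_dir?)
        .join("podman")
        .join("podman.sock");
    if exists(&podman) {
        Some(podman)
    } else {
        None
    }
}
```

In a real program the callback could simply be |p| p.exists(), with the environment variable read through std::env::var.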

Flags and Examples

By default, dcp will copy content to the current working directory. For example, let's try issuing the following command:

$ dcp tyslaton/sample-catalog:v0.0.4 -c configs

This command will copy the configs directory (specified via the -c flag) from the image to the current directory.

For further configuration, let's try:

$ dcp tyslaton/sample-catalog:v0.0.4 -d output -c configs

This command pulls down the requested image, only extracting the configs directory and copying it to the output directory locally (specified via the -d flag). If output does not exist locally, it will be created as part of the process.

Another example, for copying only the manifests directory:

$ dcp quay.io/tflannag/bundles:resolveset-v0.0.2 -c manifests

Lastly, we can copy from a private image by providing a username and password (specified via the -u and -p flags).

$ dcp quay.io/tyslaton/sample-catalog-private:latest -u <username> -p <password>

⚠️ This is a convenient way to copy contents from a private image, but it is insecure because your registry credentials are saved in your shell history. For a more secure approach, log in via <container_runtime> login and pull the image first. dcp will then be able to find the image locally and process it.

FAQ

Q: I hit an unexpected error unpacking the root filesystem of an image: trying to unpack outside of destination path. How can I avoid this?

A: dcp relies on the underlying tar Rust library to unpack the image filesystem represented as a tar file. The unpack method is sensitive in that it will not write files outside of the path specified by the destination. So things like symlinks will cause errors when unpacking. Whenever possible, use the -c flag to specify a directory to unpack, instead of the container filesystem root, to avoid this error.
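To make the idea of "not writing outside the destination" concrete, here is a simplified, purely lexical containment check. It is a sketch and not the tar crate's implementation: the real library also consults the filesystem (which is where symlinks come into play), and the name resolve_inside is hypothetical.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical helper: joins `entry` onto `dest` component by component and
/// returns the resulting path only if it stays inside `dest`.
/// This is a lexical check only; it does not follow symlinks on disk.
pub fn resolve_inside(dest: &Path, entry: &Path) -> Option<PathBuf> {
    let mut resolved = dest.to_path_buf();
    for component in entry.components() {
        match component {
            // Ordinary path segments extend the resolved path.
            Component::Normal(part) => resolved.push(part),
            // "." changes nothing.
            Component::CurDir => {}
            // ".." may step back, but never above the destination itself.
            Component::ParentDir => {
                if resolved.as_path() == dest {
                    return None;
                }
                resolved.pop();
            }
            // Absolute entries (or Windows prefixes) would escape `dest`.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```

An entry such as ../etc/passwd is rejected, while configs/../a.yaml is accepted because it never leaves the destination directory.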


Q: I would like to use dcp to pull content from an image but I don't know where in the image the content is stored. Is there an ls command or similar functionality in dcp?

A: Check out the excellent dive tool to easily explore a container filesystem by layer. After finding the path of the files to copy, you can then use dcp to extract just those specific files.


Q: Is dcp supported on Windows?

A: Yes, but Windows support is experimental, as there is no CI coverage. It will likely work in your Windows environment. The only non-default change you need to make is to expose the docker daemon so that dcp can connect to it. This can be done in one of two ways:

  1. Adding the following to your %userprofile%\.docker\daemon.json file.

    {
        "hosts": ["tcp://0.0.0.0:2375"]
    }
  2. Going through the Docker Desktop UI and enabling the setting for Expose daemon on tcp://localhost:2375 without TLS under General.


Q: I would like to inspect image labels to figure out where in the filesystem I should copy from. Does dcp have an inspect command to list image labels?

A: Listing an image's labels can be done easily using the underlying container runtime. For example, run docker image inspect <image-id> | grep Labels to see labels attached to an image. From there, dcp can be used to copy files from the container filesystem.

Testing

If you would like to run the test suite, you just need to run the standard cargo command. This will run all relevant unit, integration and documentation tests.

$ cargo test

dcp's People

Contributors

dependabot[bot], exdx, timflannagan, tylerslaton

dcp's Issues

logging: Replace "." with the pwd

When copying content to the local disk, by default "." is the destination used. This is intentional, but it would be more informative in the logs if the pwd was provided to the user instead of "."
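One possible way to produce a more informative path for the log line is sketched below. The helper name display_destination is hypothetical and this is not taken from the dcp source; it only shows how a relative destination such as "." could be resolved against the working directory before logging.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: turns the destination given on the command line into
/// the path that should be shown in logs, resolving "." and other relative
/// paths against `cwd`.
pub fn display_destination(dest: &Path, cwd: &Path) -> PathBuf {
    // Drop a leading "." so that "./output" is shown as "<cwd>/output".
    let relative = dest.strip_prefix(".").unwrap_or(dest);
    if relative.as_os_str().is_empty() {
        // The destination was exactly ".".
        cwd.to_path_buf()
    } else {
        // `join` returns `relative` unchanged when it is already absolute.
        cwd.join(relative)
    }
}
```

The caller would typically obtain cwd from std::env::current_dir() and fall back to printing the raw destination if that call fails.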

Error when copying content from local image

$ dcp quay.io/operator-framework/rukpak:latest
 ERROR dcp > ❌ error 404 Not Found - manifest for quay.io/operator-framework/rukpak:latest not found: manifest unknown: manifest unknown
 DEBUG dcp > 📦 Created container with id: "b529d9210b1ec503ff8e90a0e8d83ef6cbc232161f0f59d68bceedebed919eb4"
 INFO  dcp > ✅ Copied content to . successfully
 DEBUG dcp > 📦 Cleaned up container "b529d9210b1ec503ff8e90a0e8d83ef6cbc232161f0f59d68bceedebed919eb4" successfully

dcp does successfully copy the container filesystem, but it gives a misleading error when presented with an image that is not on a remote registry (just one built locally). It should first check to see if the image is available locally before attempting to pull it.
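The proposed ordering (look locally first, pull only when needed) might look roughly like the following. The function ensure_image and its two callbacks are hypothetical placeholders for whatever the docker or podman client provides; the sketch only illustrates the control flow.

```rust
/// Hypothetical helper: makes sure `image` is available before a container is
/// created from it. Returns Ok(true) if a pull was performed and Ok(false) if
/// the image was already present locally.
pub fn ensure_image<E>(
    image: &str,
    is_present: impl Fn(&str) -> bool,
    pull: impl FnOnce(&str) -> Result<(), E>,
) -> Result<bool, E> {
    // A locally built image is used as-is, so no registry error is reported.
    if is_present(image) {
        return Ok(false);
    }
    // Only images that are missing locally are pulled; pull errors propagate.
    pull(image)?;
    Ok(true)
}
```

With this ordering, a 404 from the registry would only surface when the image is genuinely unavailable both locally and remotely.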

Add support for listing container image labels

Use-case:
I want to be able to see the labels on the particular container image I am interested in, to inform what I would like to copy.

A/C:
Provide a dcp describe command that can show the labels associated with a particular image.

Support windows builds

The current method of connecting to the docker socket, via the unix:// path, does not work on windows systems. Connecting in an alternative way would enable support for windows.

manifests directory added after test run

Running cargo test results in a manifests directory being added to the root-level, producing a git diff. This directory should probably go into the /tmp test directory for now

Support a shell into a container to browse the filesystem

This feature is for a shell that opens so one can explore the files of a container without actually running the ls command inside it.

This functionality is similar to the dive tool. Potentially the feature should just be a dive integration but that project is in Go versus Rust.

cc @tylerslaton

Support custom socket locations

As a user, I would like to use dcp with a custom socket, either a remote socket or one on my host in a different location.

dcp should have a command-line flag to set a custom socket path.

Notes:

  • The user would also have to supply which runtime the socket is meant for, since both docker and podman sockets are options

Release dcp on crates.io

Since dcp has stabilized, it would make sense to release it to the wider community. As a first step, releasing it on crates.io lets Rust users install dcp via cargo install.

Preserve file permissions flag

When copying data from the container to the local filesystem, there should be a choice whether or not to preserve the original permissions of the files in the container. I think this may be possible, based on the underlying tar unpack command.

About how dcp unpacks when it faces an unpack issue

Hi, I came across an issue with Error: failed to unpack when unpacking an image to a local folder, and I understand from the FAQ why this happens.
But I have a question about how it works. Does dcp unpack all the image files except the ones like symlinks, or does it stop unpacking as soon as it fails to unpack a file? Thanks!

Support extracting more than one directory from the container filesystem

Right now, dcp only extracts one directory from the image, provided via a mandatory command-line argument. As a user, I want to extract multiple directories (for example, /manifests and /metadata) from the image in one invocation of dcp.

A/C:

  • Support specifying multiple directories as command line arguments, either comma separated or space separated, and have dcp copy these multiple directories out
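As a rough illustration of the acceptance criterion, the sketch below normalizes both comma-separated and space-separated input into one list of paths. The function name parse_content_paths is hypothetical, and it assumes the shell has already split space-separated values into separate arguments.

```rust
/// Hypothetical helper: flattens command-line values such as
/// ["/manifests,/metadata"] or ["/manifests", "/metadata"] into a list of
/// content paths, ignoring empty pieces and surrounding whitespace.
pub fn parse_content_paths(args: &[&str]) -> Vec<String> {
    args.iter()
        // Each argument may itself hold several comma-separated paths.
        .flat_map(|arg| arg.split(','))
        .map(str::trim)
        // Skip empty pieces produced by trailing or doubled commas.
        .filter(|path| !path.is_empty())
        .map(String::from)
        .collect()
}
```

Each resulting path could then be copied out of the container in turn, in the same way a single -c value is handled today.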

Clean up the container

Right now the container is created, the files are copied, and then the program exits. The program should clean up the container after it's done.

Support extracting files from k8s container

As a user, I would like to debug a live container running in my k8s cluster by copying files from that container's filesystem to my local filesystem.

For example,

dcp --cluster-container namespace/pod -d /tmp/data -c /data

Refactor is_image_present functions

The functions that currently check whether an image is on-disk are very imperative, written as a sequence of for statements. They do the job, but can be rewritten to be more functional and clear.

A/C:
Rewrite the loop at line 328 of dcp/src/lib.rs (commit a8c6920), which begins with for image in images {, as well as the podman version beneath it, to use .iter() and .map() instead of for loops.
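The kind of rewrite being asked for can be illustrated with a toy version of the check. The ImageSummary struct and its repo_tags field are hypothetical stand-ins for the type returned by the docker client crate, and the sketch uses flat_map and any rather than map, since the goal here is a boolean answer.

```rust
/// Hypothetical stand-in for the image summary returned by a docker client.
pub struct ImageSummary {
    pub repo_tags: Vec<String>,
}

/// Imperative version: nested for loops with an early return.
pub fn is_image_present_loop(images: &[ImageSummary], wanted: &str) -> bool {
    for image in images {
        for tag in &image.repo_tags {
            if tag == wanted {
                return true;
            }
        }
    }
    false
}

/// Iterator version: the same check expressed as a single chain.
pub fn is_image_present(images: &[ImageSummary], wanted: &str) -> bool {
    images
        .iter()
        // Look at every tag of every image.
        .flat_map(|image| image.repo_tags.iter())
        // Stop at the first tag that matches.
        .any(|tag| tag == wanted)
}
```

Both versions short-circuit on the first match, so the iterator form should not change behavior.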

Add questions label and backfill existing FAQs as issues

In a recent PR we were going to introduce an FAQ section to the README. However, we decided it would be a bit more scalable if we just added a question label for issues and backfilled it with the frequently asked questions. From there we could just link an issue filter in the README.md for easy access.

Restructure the logging

The logging output right now is not very clean. Using a logging library and providing structured logs should be sufficient. Also, provide a log indicating that the content was copied successfully (from where to where).

dcp-demo: ./dcp quay.io/tflannag/bundles:resolveset-v0.0.2
PullStatus { status: "Pulling from tflannag/bundles", id: Some("resolveset-v0.0.2"), progress: None, progress_detail: None }
PullStatus { status: "Pulling fs layer", id: Some("8d646b05728d"), progress: None, progress_detail: Some(ProgressDetail { current: None, total: None }) }
PullStatus { status: "Downloading", id: Some("8d646b05728d"), progress: Some("[==================================================>]     411B/411B"), progress_detail: Some(ProgressDetail { current: Some(411), total: Some(411) }) }
PullStatus { status: "Verifying Checksum", id: Some("8d646b05728d"), progress: None, progress_detail: Some(ProgressDetail { current: None, total: None }) }
PullStatus { status: "Download complete", id: Some("8d646b05728d"), progress: None, progress_detail: Some(ProgressDetail { current: None, total: None }) }
PullStatus { status: "Extracting", id: Some("8d646b05728d"), progress: Some("[==================================================>]     411B/411B"), progress_detail: Some(ProgressDetail { current: Some(411), total: Some(411) }) }
PullStatus { status: "Extracting", id: Some("8d646b05728d"), progress: Some("[==================================================>]     411B/411B"), progress_detail: Some(ProgressDetail { current: Some(411), total: Some(411) }) }
PullStatus { status: "Pull complete", id: Some("8d646b05728d"), progress: None, progress_detail: Some(ProgressDetail { current: None, total: None }) }
PullStatus { status: "Digest: sha256:145ccb5e7e73d4ae914160c066e49f35bc2be2bb86e4ab0002a802aa436599bf", id: None, progress: None, progress_detail: None }
PullStatus { status: "Status: Downloaded newer image for quay.io/tflannag/bundles:resolveset-v0.0.2", id: None, progress: None, progress_detail: None }
"1fbb062dd17ccfd444843b29a5f99a886ef11b4c8ef03c522f1c17fbd224b5d7"

Add support for listing files on the container filesystem

Use-case:
I want to extract a particular file from a container, but am not sure where it is located ahead of time.

A/C:
Provide a dcp ls command that can list the files on the container filesystem and write the list to stdout. The user can then use the -c flag to extract just the particular file they are interested in.

Add support for digest image references

The input image reference currently requires a tag and does not accept a digest.

Example error with version 0.3.1 on linux:

dcp icr.io/cpopen/ibm-iam-operator-bundle@sha256:c63fdfbbeb3314d3ee889bee86139aa0bf6dd4a69f7e63139b6608abae0b3e5 -c manifests
 DEBUG dcp::image > 📦 Searching for image icr.io/cpopen/ibm-iam-operator-bundle@sha256:c63fdfbbeb3314d3ee889bee86139aa0bf6dd4a69f7e63139b6608abae0b3e5 locally
Error: ❌ error pulling image: error 400 Bad Request - invalid reference format
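One way to distinguish the two reference forms is sketched below. It is not how dcp parses references: the Reference enum and parse_reference function are hypothetical, the default tag of latest is an assumption that mirrors common docker behavior, and no validation of the name or digest is attempted.

```rust
/// Hypothetical representation of a parsed image reference.
#[derive(Debug, PartialEq)]
pub enum Reference<'a> {
    Tag { name: &'a str, tag: &'a str },
    Digest { name: &'a str, digest: &'a str },
}

/// Hypothetical helper: splits an image reference into a name plus either a
/// tag or a digest. No validation of the individual parts is performed.
pub fn parse_reference(input: &str) -> Reference<'_> {
    // A digest reference looks like name@sha256:<hex>.
    if let Some((name, digest)) = input.split_once('@') {
        return Reference::Digest { name, digest };
    }

    // A ':' only introduces a tag when it appears after the last '/';
    // earlier colons belong to a registry port such as localhost:5000.
    let last_segment = input.rfind('/').map_or(0, |i| i + 1);
    match input[last_segment..].rfind(':') {
        Some(offset) => {
            let split = last_segment + offset;
            Reference::Tag {
                name: &input[..split],
                tag: &input[split + 1..],
            }
        }
        // Assumption: a missing tag defaults to "latest".
        None => Reference::Tag { name: input, tag: "latest" },
    }
}
```

Once the two cases are told apart, the name and the tag or digest can be passed to the pull call separately instead of always treating the suffix as a tag.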

feature: create directory as part of unpacking

As a user, I would like dcp to be able to create a directory (a subdirectory of the current working directory, for example) when unpacking contents, so I don't need to create the directory manually beforehand.

One option is to introduce a new flag, -o, that when set creates a directory on the local filesystem.

Another option is to use the existing -d flag to create a directory if one does not exist already.

Error trying to unpack outside of destination path

Running dcp registry.ci.openshift.org/ci/determinize-ci-operator:latest I encountered the following error:

 DEBUG dcp > 📦 Created container with id: "263a233eb2dda832375e57d078bbce5d407ec42239a6b924f2d251b7b2758147"
Error: Custom { kind: Other, error: TarError { desc: "failed to unpack `/Users/dsover/code/src/ci-operator-config/usr/bin/perl5.26.3`", io: Custom { kind: Other, error: TarError { desc: "trying to unpack outside of destination path: /Users/dsover/code/src/ci-operator-config", io: Custom { kind: Other, error: "Invalid argument" } } } } }

I believe this error is coming from the tar library. It suggests that a file in the container filesystem is outside of the destination filepath, and tar doesn't allow this for security reasons. This may be a symlink file. I'm not sure how to recover from this error, since fundamentally unpacking outside the destination path is insecure.

Create some introductory E2E tests

Summary

We should go ahead and add some simple E2E tests that establish how we create them moving forward. As a first pass, it likely makes sense to make a few tests that guarantee the flags being passed are working as intended.

Support sha256 image tags

I tried running dcp on an OLM registry+v1 bundle image but that failed with the following error:

$ dcp --content-path output registry.redhat.io/openshift-update-service/cincinnati-operator-bundle@sha256:3feccb95e912f2d710958e701a1b52acf16a676fde9a8219491ad9381e6284ea
error 400 Bad Request - invalid reference format
"d8f1b22b1239673912ad2555e16e3e790c994d9d87f599f044c808160b246b56"
Error: Error(Fault { code: 404, message: "Could not find the file output in container d8f1b22b1239673912ad2555e16e3e790c994d9d87f599f044c808160b246b56" })

And that image is pullable via docker:

$ docker pull registry.redhat.io/openshift-update-service/cincinnati-operator-bundle@sha256:3feccb95e912f2d710958e701a1b52acf16a676fde9a8219491ad9381e6284ea
registry.redhat.io/openshift-update-service/cincinnati-operator-bundle@sha256:3feccb95e912f2d710958e701a1b52acf16a676fde9a8219491ad9381e6284ea: Pulling from openshift-update-service/cincinnati-operator-bundle
d83fa9b96fa1: Pull complete 
Digest: sha256:3feccb95e912f2d710958e701a1b52acf16a676fde9a8219491ad9381e6284ea
Status: Downloaded newer image for registry.redhat.io/openshift-update-service/cincinnati-operator-bundle@sha256:3feccb95e912f2d710958e701a1b52acf16a676fde9a8219491ad9381e6284ea
