opencontainers / umoci

umoci modifies Open Container images

Home Page: https://umo.ci

License: Apache License 2.0

Makefile 1.13% Go 76.63% Shell 21.57% Awk 0.20% HTML 0.01% Dockerfile 0.47%

oci oci-image containers go rootless-containers container-image docker-image

umoci's Introduction

umoci modifies Open Container images.

umoci (pronounced /uːmoˈʨi/ or approximately "oo-mo-tchee") is a reference implementation of the OCI image specification and provides users with the ability to create, manipulate, and otherwise interact with container images. It is designed to be as small and unopinionated as possible, so as to act as a foundation for larger systems to be built on top of. The primary method of using umoci is as a command-line tool:

# Extract image "leap" from image directory "opensuse" and place it
# inside an OCI runtime-spec bundle at the path "bundle".
% umoci unpack --image opensuse:leap bundle

# Make some changes to the root filesystem ("bundle/rootfs").
% runc run -b bundle ctr
ctr-sh$ zypper install -y foobarbaz
ctr-sh$ exit
% echo foo > bundle/rootfs/README

# Create a new image (called "new-leap") in the image directory "opensuse",
# based on "leap" which contains the changes made to "bundle/rootfs".
% umoci repack --image opensuse:new-leap bundle

# Modify the configuration of the "new-leap" image to specify a new author.
% umoci config --image opensuse:new-leap \
>              --author="Aleksa Sarai <[email protected]>" \
>              --config.workingdir="/var/www"

# Garbage-collect any unreferenced blobs in the image directory "opensuse".
% umoci gc --layout opensuse

See the quick start guide for more accessible documentation about how to use umoci. Notable users of umoci include:

KIWI, which uses umoci to support building both base and derived container images which are then converted to Docker images.
The Open Build Service, which uses umoci (through KIWI) to support building and publishing container images from its built-in container registry. The openSUSE project has been using this method of building container images in production since 2016.
Stacker, which uses umoci as its core building primitive, and is used by Cisco to build container images for some of their appliances since 2018.
LXC provides support for OCI container images through an OCI template, which is implemented as a shell script that wraps umoci. The fact that a container runtime with a vastly different model to OCI container runtimes can make use of umoci is further evidence of its unopinionated design.

If you wish to provide feedback or contribute, read the CONTRIBUTING.md for this project to refresh your knowledge about how to submit good bug reports and patches. Information about how to privately submit security disclosures is also provided.

Install

Pre-built binaries can be downloaded from umoci's releases page. As umoci's builds are reproducible, a cryptographic checksum file is included in the release assets. All of the assets are also signed with a release key, whose fingerprint is:

pub   rsa4096 2016-06-21 [SC] [expires: 2031-06-18]
      5F36C6C61B5460124A75F5A69E18AA267DDB8DB4
uid           [ultimate] Aleksa Sarai <[email protected]>
uid           [ultimate] Aleksa Sarai <[email protected]>
sub   rsa4096 2016-06-21 [E] [expires: 2031-06-18]

umoci is also available from several distributions' repositories:

To build umoci from the source code, a simple make should work on most machines, as should make install.

Usage

umoci has a subcommand-based command-line interface. For more detailed information, see the generated man pages (which you can build with make docs). You can also read through our quick start guide.

% umoci --help
NAME:
   umoci - umoci modifies Open Container images

USAGE:
   umoci [global options] command [command options] [arguments...]

VERSION:
   0.4.6

AUTHOR:
   Aleksa Sarai <[email protected]>

COMMANDS:
   raw      advanced internal image tooling
   help, h  Shows a list of commands or help for one command

   image:
     config      modifies the image configuration of an OCI image
     unpack      unpacks a reference into an OCI runtime bundle
     repack      repacks an OCI runtime bundle into a reference
     new         creates a blank tagged OCI image
     tag         creates a new tag in an OCI image
     remove, rm  removes a tag from an OCI image
     stat        displays status information of an image manifest
     insert      insert content into an OCI image

   layout:
     gc        garbage-collects an OCI image's blobs
     init      create a new OCI layout
     list, ls  lists the set of tags in an OCI layout

GLOBAL OPTIONS:
   --verbose      alias for --log=info
   --log value    set the log level (debug, info, [warn], error, fatal) (default: "warn")
   --help, -h     show help
   --version, -v  print the version

Releases and Stability

We regularly publish new releases, with each release being given a unique identifying version number (as governed by Semantic Versioning (SemVer)). Information about previous releases including the list of new features, bug fixes and resolved security issues is available in the change log.

Note that while umoci is currently usable as a Go library (and we do have several users of the Go APIs), the API is explicitly considered unstable until umoci 1.0 is released. However, the umoci CLI API is considered to be stable despite umoci not being a 1.0 project.

Governance

umoci is an Open Container Initiative project, and is thus bound by the OCI Code of Conduct and the OCI Charter. In addition, the umoci project has its own specific governance rules which determine how changes are accepted into the project, how maintainers are added or removed, how releases are proposed and released, and how the governance rules are changed. In the case of any conflict which cannot be resolved by this project's governance rules, the OCI Technical Oversight Board may step in to help resolve the issue.

History

umoci was originally developed in 2016 by Aleksa Sarai as part of the openSUSE project, and was donated to the Open Container Initiative as a reference implementation of the OCI image specification in mid-2020.

License

umoci is licensed under the terms of the Apache 2.0 license.

umoci: Umoci Modifies Open Containers' Images
Copyright (C) 2016-2020 SUSE LLC
Copyright (C) 2018 Cisco Systems

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

   http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

Citation

If you have used umoci in your research, please cite it like you would any other useful software. Here is a handy BibTeX citation.

@misc{umoci,
	title = {umoci - Standalone Tool For Manipulating Container Images},
	author = {Aleksa Sarai et al.},
	year = {2016},
	url = {https://umo.ci/},
	doi = {http://dx.doi.org/10.5281/zenodo.1188474},
}

Thank you.

umoci's Issues

cas: fix oci-layout handling

We need to output the correct version, and also verify it properly. Currently we're not verifying it at all simply because it breaks with skopeo (we're doing the wrong thing right now). This is bad.

We should also verify that the blobs and refs directories exist and are laid out in a way that makes sense (no subdirectories or incorrect algorithms).

umoci: implement reference

We need to provide an interface for someone to re-tag a given tag with another tag. This touches on a core issue of the MediaType model of the image specification -- you can't really tag a random digest because you have to know what type it is. There's been a lot of (tiring) discussion about this upstream in opencontainers/image-spec#411, and currently upstream doesn't really like the idea of peek-inside detection. This makes an interface like:

% umoci reference tag --digest sha256:<digest> sometag

Not really possible to implement at the moment. Here are the sub-subcommands we'd like:

umoci reference create
umoci reference remove
umoci reference list

repack: diffid is not of the uncompressed layer

According to the spec, the DiffIDs in the configuration are meant to be the uncompressed layer's digest. So we probably need to also return some more DiffID information from GenerateLayer. Or we'll have to parse things twice which will be a pain.

umoci-tag: completely rework

umoci tag's interface is just silly. It needs to be switched to work properly (though #39 will change this as well).

unpack: insufficient pathname sanitisation

Currently we aren't properly cleaning paths inside unpackEntry. In particular, if you have an invalid tar archive that contains entries such as ../../ or if it contains an entry that resolves through a symlink, then umoci will start touching parts of the host.

To fix this we need to use this library, which I helped write specifically to solve this problem inside Docker https://github.com/docker/docker/tree/master/pkg/symlink.
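
To illustrate the ../../ half of the problem, here is a small sketch of purely lexical sanitisation. It is not the library mentioned above and it does not resolve symlinks, so it only covers part of the issue; safeJoin is a hypothetical helper name, and the example assumes Unix-style paths.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// safeJoin (hypothetical) joins an untrusted archive path onto root so that
// the result lexically stays inside root. It does NOT follow or check
// symlinks, which need a scoped resolver on top of this.
func safeJoin(root, unsafePath string) string {
	// Rooting the path at "/" before cleaning makes leading ".." components
	// collapse instead of climbing out of root.
	cleaned := filepath.Clean(string(filepath.Separator) + unsafePath)
	return filepath.Join(root, cleaned)
}

func main() {
	fmt.Println(safeJoin("/bundle/rootfs", "../../etc/passwd"))
	fmt.Println(safeJoin("/bundle/rootfs", "usr/bin/../lib/libc.so"))
}
```

A tar entry named ../../etc/passwd ends up under the rootfs rather than on the host, but an entry that traverses a symlink planted by an earlier entry would still escape with this approach alone.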

handle removal of Memory, MemorySwap and CPUShares

These never made sense in the spec, and they've been recently removed so we should drop all mention of them from everywhere. opencontainers/image-spec#495

layer: Xattr unpack code is broken on Fedora/RHEL

just have a busybox image copied with skopeo to OCI format, then:

$ umoci unpack --image busybox bundle
INFO[0000] parsed mappings                               map.gid=[] map.uid=[]
FATA[0000] create runtime bundle: chown rootfs: lchown bundle/rootfs: operation not permitted

$ sudo $GOPATH/bin/umoci unpack --image busybox bundle
INFO[0000] parsed mappings                               map.gid=[] map.uid=[]
INFO[0000] unpack manifest: unpacking layer sha256:56bec22e355981d8ba0878c6c2f23b21f422f30ab0aba188b54f1ffeff59c190  diffid="sha256:e88b3f82283bc59d5e0df427c824e9f95557e661fcb0ea15fb0fb6f97760f9d9"
FATA[0000] create runtime bundle: unpack layer: unpack entry: bin: apply hdr metadata: clear xattr metadata: /home/amurdaca/go/src/github.com/docker/containerd/bundle/rootfs/bin: lclearxattrs: lremovexattr(/home/amurdaca/go/src/github.com/docker/containerd/bundle/rootfs/bin, security.selinux): permission denied

tests: add oci-*-tool validate test

We need to add some tests that ensure that our unpacked bundle is actually a valid OCI runtime bundle. However, currently this is blocked on opencontainers/runtime-tools#268 -- since we're generating v1.0.0-rc2 bundles that have different required fields to the v1.0.0-rc1-dev bundles which is what oci-runtime-tool currently uses.

There is also the fact that we need to have oci-image-validate run after every build.

In essence we should run this after every unpack in every test.

history is broken

Currently we have to manually modify the history with umoci config --history. While this sounds like a good idea, tools like skopeo have certain assumptions about the history. Essentially, we will have to have a separate history entry for every change (especially repacking).

We'll have to come up with a different UX.

cmd: combine --from and --image

Currently we have --from and --image tags as separate arguments. This is just silly (it doesn't add any useful information and in fact just makes everything more clunky). So we really should switch to path:tag-style. The only downside is that it will make umoci tag need to have a different interface.

man: add examples to documentation

Currently there are no example shell sessions in the man pages; we should really add some.

config: history setting not implemented

Because v1.History is a full structure (and v1.Config stores a slice of them), implementing the CLI interface for this might be more than a little painful. We could just require that someone pass the JSON for a v1.History but that's a really gross interface.

*pack: implement rootless unpacking/repacking

Currently umoci unpack requires root privileges to set the owners of the files, and thus umoci repack sometimes needs root privileges to even read the files into the archive. This makes rootless image manipulation impractical (and makes it effectively impossible to include umoci as part of a build system).

To fix this we should implement some sort of user mapping into the new version of unpack (which is saved in the unpacked bundle) and then on repack we use the same mapping to modify the *.mtree diffs we got.
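
As a rough sketch of what such a user mapping could look like, the code below translates IDs in both directions: container to host when unpacking, and host back to container when repacking. The container:host:size triple is an assumption borrowed from how Linux user namespace ID maps are usually written; idMap, toHost, and toContainer are hypothetical names, not umoci's API.

```go
package main

import "fmt"

// idMap (hypothetical) describes one contiguous range of mapped IDs.
type idMap struct {
	ContainerID int // first ID as seen inside the image
	HostID      int // first ID as seen on the host filesystem
	Size        int // number of consecutive IDs covered
}

// toHost translates an ID stored in the image to the ID used on disk.
func toHost(maps []idMap, id int) (int, bool) {
	for _, m := range maps {
		if id >= m.ContainerID && id < m.ContainerID+m.Size {
			return m.HostID + (id - m.ContainerID), true
		}
	}
	return 0, false // unmapped ID
}

// toContainer is the inverse, used when turning on-disk owners back into
// the owners recorded in the image.
func toContainer(maps []idMap, id int) (int, bool) {
	for _, m := range maps {
		if id >= m.HostID && id < m.HostID+m.Size {
			return m.ContainerID + (id - m.HostID), true
		}
	}
	return 0, false // unmapped ID
}

func main() {
	// Map container root onto an unprivileged host user (1000 is arbitrary).
	maps := []idMap{{ContainerID: 0, HostID: 1000, Size: 1}}
	host, ok := toHost(maps, 0)
	fmt.Println(host, ok)
	ctr, ok := toContainer(maps, 1000)
	fmt.Println(ctr, ok)
}
```

Saving the same mapping alongside the unpacked bundle is what would let repack apply the inverse translation consistently.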

unpack: cannot handle non-numeric user/groups

When generating the config.json, we can't handle cases where the user specification is something like "cyphar:users" because of the way the interfaces are designed (we do the extraction and generation separately from one another). This can be fixed by using libcontainer/users.

In addition, we'll also be able to fill the AdditionalGids array, which we couldn't before.

We could also add a $HOME env variable.

unpack: useless config.json

The config.json generated when unpacking is not useful because it is a config for running in the host, which isn't helpful. While this can be corrected by a user (using oci-runtime-tools generate) this is quite a bit of a pain.

The best choice IMO would be to create a library that can take a v1.Image and then apply it to a runtime-tools/generate.Generator. We can then use the default config for runC as the base (which we know is actually sane).

The upstream bug for this is opencontainers/image-tools#76 but we should implement this ourselves.

cmd: add umoci-stat(1)

Currently in order to actually understand the history of an image you need to convert it to a Docker image so you can use docker history. We should really be including our own history-parsing code.

One issue with all of this is that we're creating special-purpose commands. Maybe we should just implement stat with options to narrow down what sections to output.

*: add tests

Currently all the testing I'm doing is manual, we need to have some actual testing. Namely integration testing (using bats) for the actual umoci tool as well as unit tests for the libraries. Another nice thing would be to have validation testing against the OCI validation tooling.

umoci: implement raw-unpack

This is like unpack except it just does a simple extraction of a given layer diff blob (no mtree or config.json). This probably will end up being in a set of raw-* subcommands that are more hardcore versions of the nice interfaces.

umoci: implement direct digest referencing

Currently we have to go through refs exclusively, which means that you can't tag a blob that doesn't already have a tag. This is a bit of a pain (though with umoci init it won't be a blocking issue).

However, a nice way of fixing this would be implementing an indexer for an OCI image (which does the same sort of indexing that umoci gc does but in addition storing the MediaType of the object referenced). Then we could always guarantee that we had blob references for most blobs, and we could then go with a backup option of just using MediaType as gospel.

The only downside to this approach is that layer diff blobs will not be practically usable (since we can't tell between a distributable and non-distributable blob -- though I'm not sure if the spec even says that a particular layer diff blob is in a strict binary between the two types). But we can just check for a gzip header (or just check that it's not JSON) and move on with our lives, and then not allow operations that directly reference those blobs.
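
The "check for a gzip header (or just check that it's not JSON)" idea could look roughly like the sketch below. sniffBlob is a hypothetical helper; it only inspects the first bytes of a blob, so it is a heuristic and not a validator.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffBlob (hypothetical) guesses what kind of blob this is from its first
// bytes. It returns "gzip", "json", or "unknown".
func sniffBlob(head []byte) string {
	// Every gzip stream starts with the magic bytes 0x1f 0x8b.
	if len(head) >= 2 && head[0] == 0x1f && head[1] == 0x8b {
		return "gzip"
	}
	// Assumption: JSON blobs of interest are objects, so after optional
	// whitespace they start with '{'.
	trimmed := bytes.TrimLeft(head, " \t\r\n")
	if len(trimmed) > 0 && trimmed[0] == '{' {
		return "json"
	}
	return "unknown"
}

func main() {
	fmt.Println(sniffBlob([]byte{0x1f, 0x8b, 0x08, 0x00}))
	fmt.Println(sniffBlob([]byte(`{"schemaVersion": 2}`)))
	fmt.Println(sniffBlob([]byte("plain text")))
}
```

A blob classified as "gzip" would then be treated as a layer diff and excluded from operations that need a typed reference.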

We could also add an explicit --mediatype option that allows for overwriting of the detected mediatype (probably dangerous). Then we can just not allow operations on any untyped blob.

cmd: use positional arguments for mandatory arguments

Subcommand-specific mandatory arguments (like --bundle) really should be positional arguments.

*pack: implement --{u,g}id-map

As part of #26 we need to just implement --uid-map and --gid-map for unpack and repack.

Add support for "read-only" CAS opening

If the layout is on a read-only filesystem then cas.Open will fail because we create temporary directories even if we don't use them. We should either:

Create the temporary directory immediately as we need it. This means that cas.Open will be opportunistic.
Make two different "open" functions.

The former sounds like a better idea.

layer: generated atime is invalid

Because of how Go's archive/tar.Writer currently works, AccessTime and ChangeTime are not correctly written to the output archive stream. This causes several issues which are really disappointing.

I've filed a bug upstream about it golang/go#17876.

umoci: implement repack

Sort of like oci-create-layer from opencontainers/image-tools#8, except it uses mtree(8) manifests. It's also automatic and generates all of the necessary blobs to represent the new tagged image.

unpack/repack: save the --from argument

When we do an umoci unpack we currently only keep track of the manifest hash that we extracted. We should also save (in umoci.state or something) the name of the reference that we dereferenced in order to get the extracted rootfs. This will allow us to remove the tedious --from argument in umoci repack.

The only real question to ask is whether we should make it possible for someone to forcefully create a rootfs that we cannot be sure is valid (namely specify --from and --mtree manually).

unpack: doesn't preserve mtime

This is because the OCI tooling doesn't appear to handle this properly (image.CreateRuntimeLayoutBundle is to blame). It will probably be a good idea to from-scratch implement this feature inside umoci so that there is more than one implementation of this unpacking functionality.

% umoci unpack --image opensuse --ref latest --bundle bundle1
% sleep 30s
% umoci unpack --image opensuse --ref latest --bundle bundle2
% gomtree -f bundle1/*.mtree -p bundle2/rootfs >/dev/null
% echo $?
1

Where all of the gomtree failed keywords are related to time.

gc: clean up tmpdirs with dirEngine

This is going to be a bit of a layer violation, so I'll have to think about it. But basically we need to have a way to remove all of the temporary directories that have been left behind inside an OCI image.

*: track upstream PRs

There are several components of umoci that are not very well defined by the image-spec specification. I'm working on improving this upstream, so that we can be sure that umoci is actually doing things correctly (because we defined what is the correct way of doing them). Here's a current list of upstream PRs:

opencontainers/image-spec#492 -- conversion.md
opencontainers/image-spec#504 -- Forbid Config.Volumes snapshotting.

This is quite important for us.

umoci: add man pages

We need to have proper docs so that people understand what the hell is going on. My only concern is that I don't really want to use go-md2man because of the fun issues we've had in the openSUSE community with packages like that. But we'll see how that goes.

repack: *.mtree should not contain ':'

Some lovely setups (such as inside VMs with shared filesystems) have issues if a path has : in its name. So we should just avoid this overall (the version in the design doc just replaces it with a _ which should be fine).

add annotation support

cmd: split create into init and create

It doesn't make sense to have a mixed command like umoci create that will create an empty image in one mode (no --tag) and create a manifest in another (with --tag).

reimplement unpacking ourselves

The current upstream unpacking utilities have some flaws:

They are completely separate to my CAS implementation, making using them quite ugly.
They don't expose a "single layer" unpacking method, meaning that we can't implement raw-unpack (#23) and we can't really test it properly.
They have bugs (like #1) and also don't support things like "user:group"-style specification of users, which is a problem.
Features like #26 are unlikely to be nicely solved upstream.

All of this can be solved if we just implement it ourselves.

repack: pipe errors don't propagate

In layer.GenerateLayer we use CloseWithError to try to propagate layer generation errors to the reader. However, it looks like the errors are dropped at some point (or the pipe is already closed when we error out). This causes us to generate new invalid layers when we should be erroring out.

This is a critical issue and must be solved before 0.0.0.

generate: duplicate environment variables

When we generate the rspec.Spec, we append to the environment variable set. This is a bit dangerous, and the more sane thing to do would be to overwrite variables if necessary. It would be nice if this was implemented upstream though...

switch to a proper error workflow

Let's not end up in the same boat as Docker. We should be using https://github.com/pkg/errors to safely manage wrapping of errors while also allowing for os.IsNotExist to work. Let's not build up on the same technical debt that Docker has.

umoci: implement gc

Because of how umoci repack is implemented (we create a bunch of new blobs and leave the old ones) we need to include a garbage collector that can clean up the trash after we've regenerated everything. A standard mark-and-sweep garbage collector will do.

A nice way of doing this would be to use reflect and to parse every Descriptor, adding every referenced descriptor to "black" coloured objects (with the default being "white"). Then the "white" coloured objects are removed.
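
The colouring scheme can be sketched over an explicit reference graph. This version swaps the reflect-based descriptor walk for a plain map from digest to referenced digests, purely to keep the example self-contained; blobStore and sweep are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// blobStore (hypothetical) maps each blob digest to the digests it references.
type blobStore map[string][]string

// sweep marks everything reachable from roots as "black", then deletes the
// remaining "white" blobs and returns their digests in sorted order.
func sweep(store blobStore, roots []string) []string {
	black := make(map[string]bool)
	stack := append([]string(nil), roots...)
	for len(stack) > 0 {
		d := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if black[d] {
			continue // already visited
		}
		refs, ok := store[d]
		if !ok {
			continue // dangling reference, nothing to mark
		}
		black[d] = true
		stack = append(stack, refs...)
	}

	var removed []string
	for d := range store {
		if !black[d] {
			delete(store, d) // deleting during range is allowed in Go
			removed = append(removed, d)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	store := blobStore{
		"manifest":    {"config", "layer1"},
		"config":      nil,
		"layer1":      nil,
		"oldmanifest": {"oldlayer", "layer1"},
		"oldlayer":    nil,
	}
	fmt.Println(sweep(store, []string{"manifest"}))
}
```

The roots here would be whatever the references in the layout point at; blobs shared between a live and a stale manifest survive because they are reachable from a root.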

config: created time cannot be parsed

If you use something like time.RFC3339 as the format for time, it won't work when parsing the time. We need to either make (*igen.Generator).{Set,}Created just take a string or figure out why the time parsing isn't working.

umoci: implement config

This subcommand will be very similar to oci-runtime-tools generate, which allows you to modify a config.json. In contrast, this tool will allow you to modify the image configuration of a manifest.

umoci: implement info

We need to have some way of providing information about an image (or a tag, or a blob). We could also call this umoci stat (and have a -L option to not dereference a reference). This is not a necessary feature, but would be pretty useful for debugging.

umoci: implement unpack

Like oci-create-runtime-bundle but it also generates an mtree(8) manifest for the rootfs which allows us to create diff layers without needing to copy the rootfs twice.

refactor image modification to library

Currently umoci unpack and umoci repack implement the same functionality in a giant block of code. We should move it to a library so we can create unit tests for it, as well as reduce the duplication.

license: sort out licensing

I really would like to license this project under GPLv3+, but currently the image/ code needs to be Apache licensed if we want to have any hope of the code being merged into the OCI. We could do some sort of dual-licensing thing, but I feel like that wouldn't end well.

layerdiff: generates redundant whiteouts

As far as I'm aware, a whiteout of a directory implies that all of its children are also to be removed. However, currently we are creating a new whiteout for every mtree.InodeDelta element in the diff from gomtree -- which can include files that are underneath directories.

There's also a valid question about whether we should generate directory entries if we only detected a deletion between two layers. But that's a question for another time (and according to upstream our current method is fine).

umoci: consolidate UX code

Currently --from and --image are both implemented by every subcommand separately, which is just unwieldy. It would be nice if we could implement them as global flags or something like that. There's also the question of whether the current flag-based interface even makes sense (should it be positional arguments instead?).

In addition, one of the really annoying things with umoci tag is how you have to understand the spec to use the damn command. umoci stat should be far more friendly and we should have a way of creating a copy of a reference (with umoci add only being used for hardcore users).

layer: don't fake hardlinks with symlinks

Currently, we have to fake hardlinks using symlinks because of the fact that we cannot be sure of the ordering of tar archive entries. In particular we can't be sure that the thing we're linking to will be hit before we hit the link.

The solution to this problem is creating an inode map to create hardlinks after the fact. In principle this should work without any major hitches (we don't need to save the metadata because the original inode determines everything) and we'd just have to delay it until the end.

umoci: implement init

We need to have a way of creating a new image. The only real question is how do we get a user to specify what they want the "default" rootfs to be. In my opinion, the best thing to do would be to have an empty set of layers that the user can then add the first layer to using unpack (as expected). However, I have a feeling that we'll have to mess around with go-mtree's generated .mtree spec file -- because it will analyse the root even though the image doesn't have a proper root.

One obvious option is to go with the Docker route, where they just added a dummy scratch rootfs that they then added on top of. The reason this is not a great idea (and Docker has since moved away from that) should be obvious.

We could also implement something like the ADD a.tar.xz / functionality that Dockerfiles have, but the problem is that it will make the interface for creating the initial image a bit janky. I'll have to ask the KIWI developers what they think about this.

*: support manifest lists

Currently all of the tooling only really supports tags that reference manifests (not manifest lists). The main reason for this is that I'm not really sure how to implement manifest list handling. Should we just take the current OS and architecture (and should that be taken from runtime.GOOS which is not entirely accurate, or from some other source)? Or we could just mandate that --os and --arch be specified with every call (this is annoying).

Not to mention how should we handle repacking? Should we not allow repacking over a different manifest than the one that was used to extract the damn thing (which is how the current implementation works)?

*: remove old comments

There's a lot of // TODO comments lying around that we've already fixed.

config: implement raw-config

It would be nice if we could have a --dry-run flag that allows a user to take an input template (from anywhere) and outputs the modified config. If we correctly break out mutateConfig then this should actually be entirely doable by adding a new subcommand (raw-config or something).