What are Containers

About
#

In this article, we explore concepts related to container technology. We start with the basics: what containers and virtual machines are. We then look at the Open Container Initiative (its runtime-spec and image-spec) and, later, at Docker- and Podman-specific concepts.

Introduction
#

What are Virtual Machines?
#

  • Applications have traditionally been deployed in virtual machines.
  • A Virtual Machine (VM) is a compute resource that uses software instead of a physical computer to run programs and deploy apps.
  • Each virtual machine runs its own operating system and functions separately from the other VMs, even when they are all running on the same host.

What are Containers?
#

Containers are a technology that allows applications to be packaged and isolated with their entire runtime environment.

Why use Containers?
#

  • Virtual machines spend substantial computational overhead virtualizing hardware for each guest OS; containers avoid that cost.
  • Containers make it easier to maintain consistent behavior and functionality while moving the contained application between environments.
  • Containers share the host machine’s OS kernel and therefore do not require an OS per application, which drives higher server efficiency.

What is the Open Container Initiative (OCI)?
#

The Open Container Initiative (OCI) is a lightweight, open governance structure (project) for the express purpose of creating open industry standards around container formats and runtimes.

The OCI currently contains three specifications:

  • The Runtime Specification (runtime-spec).
  • The Image Specification (image-spec).
  • The Distribution Specification (distribution-spec).

Runtime Specification
#

The Open Container Initiative Runtime Specification aims to specify the configuration, execution environment, and lifecycle of a container. The Runtime Specification outlines how to run a “filesystem bundle” that is unpacked on disk.

A container’s configuration is specified in the config.json for the supported platforms and details the fields that enable the creation of a container. The execution environment is specified to ensure that applications running inside a container have a consistent environment between runtimes, along with common actions defined for the container’s lifecycle.

Application bundle builders can create a bundle directory that includes all the files required to launch an application as a container. The bundle contains an OCI configuration file (config.json) where the builder can specify host-independent details such as which executable to launch (process object defined in the config.json file) and host-specific settings such as mount locations, hook paths, Linux namespaces and cgroups.

What is a file system bundle?
#

A set of files organized in a certain way and containing all the necessary data and metadata for any compliant runtime to perform all standard operations against it.

A container is encoded as a filesystem bundle on disk. The definition of a bundle is concerned only with how a container and its configuration data are stored on a local filesystem so that they can be consumed by a compliant runtime.

A Standard Container bundle contains all the information needed to load and run a container. This includes the following artifacts:

  1. config.json containing all configuration data. (File is mandatory)
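  2. The container’s root filesystem, the directory referenced by root.path in the config.json file. (The root property is optional in the schema, but it is required on most platforms; on Windows it is required for Windows Server Containers and must not be set for Hyper-V Containers.)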
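
To make the bundle layout concrete, here is a minimal sketch that writes a bundle directory containing a config.json and an empty root filesystem. The directory name demo-bundle, the helper create_bundle, and the ociVersion value are assumptions made for illustration, and the generated config.json is deliberately incomplete: a real runtime would expect more fields than shown here.

```python
import json
import tempfile
from pathlib import Path


def create_bundle(bundle_dir, args):
    """Create a minimal bundle: config.json plus an empty rootfs directory."""
    rootfs = bundle_dir / "rootfs"
    rootfs.mkdir(parents=True, exist_ok=True)
    config = {
        "ociVersion": "1.0.2",  # assumed version string, for illustration only
        "process": {"args": args, "cwd": "/"},
        "root": {"path": "rootfs"},  # interpreted relative to the bundle directory
    }
    config_path = bundle_dir / "config.json"
    config_path.write_text(json.dumps(config, indent=2))
    return config_path


# Hypothetical bundle location inside a temporary directory.
bundle = Path(tempfile.mkdtemp()) / "demo-bundle"
config_path = create_bundle(bundle, ["/bin/sh"])
loaded = json.loads(config_path.read_text())
print(sorted(p.name for p in bundle.iterdir()))
```

The two entries printed at the end correspond to the two artifacts listed above.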

Scope of a Container
#

The entity using a runtime to create a container MUST be able to use the operations defined in this specification against that same container. Whether other entities using the same, or other, instance of the runtime can see that container is out of scope of this specification.

State of a Container
#

The state of a container includes the following properties:

  • ociVersion
  • id
  • status (Additional values MAY be defined by the runtime, however, they MUST be used to represent new runtime states not defined below.)
    • creating: The container is being created.
    • created: The runtime has finished the create operation, and the container process has neither exited nor executed the user-specified program.
    • running: The container process has executed the user-specified program but has not exited.
    • stopped: The container process has exited.
  • pid
  • bundle
  • annotations

Example of state:

{
    "ociVersion": "0.2.0",
    "id": "oci-container1",
    "status": "running",
    "pid": 4422,
    "bundle": "/containers/redis",
    "annotations": {
        "myKey": "myValue"
    }
}
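
As a rough illustration of how a tool might consume such a state document, the sketch below parses it and checks the status value. The helper check_state and the list of required fields are assumptions made for this example rather than an official validator. Because runtimes may define additional status values, an unknown status is reported instead of rejected.

```python
import json

# The four status values described above.
SPEC_STATUSES = {"creating", "created", "running", "stopped"}
# Assumed set of always-present fields for this sketch.
REQUIRED_FIELDS = ("ociVersion", "id", "status", "bundle")


def check_state(raw):
    """Parse a state document and report whether its status is a standard one."""
    state = json.loads(raw)
    missing = [name for name in REQUIRED_FIELDS if name not in state]
    if missing:
        raise ValueError("state is missing fields: " + ", ".join(missing))
    # Runtimes MAY add their own statuses, so an unknown value is not an error.
    is_standard = state["status"] in SPEC_STATUSES
    return state, is_standard


example = """
{
    "ociVersion": "0.2.0",
    "id": "oci-container1",
    "status": "running",
    "pid": 4422,
    "bundle": "/containers/redis",
    "annotations": {"myKey": "myValue"}
}
"""

state, is_standard = check_state(example)
print(state["id"], state["status"], is_standard)
```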

Runtime Lifecycle
#

The lifecycle describes the timeline of events that happen from when a container is created to when it ceases to exist.

  1. The runtime’s create command is invoked.
  2. The container’s runtime environment MUST be created according to the configuration in config.json. While the resources requested in the config.json MUST be created, the user-specified program MUST NOT be run at this time. Any updates to config.json after this step MUST NOT affect the container.
  3. The prestart hooks are invoked.
  4. The createRuntime hooks are invoked.
  5. The createContainer hooks are invoked.
  6. The runtime’s start command is invoked with the unique identifier of the container.
  7. The startContainer hooks are invoked.
  8. The runtime MUST run the user-specified program, as specified by process (the process object defined in config.json).
  9. The poststart hooks are invoked.
  10. The container process exits. This MAY happen due to erroring out, exiting, crashing or the runtime’s kill operation being invoked.
  11. The runtime’s delete command is invoked with the unique identifier of the container.
  12. The container MUST be destroyed by undoing the steps performed during the create phase (step 2).
  13. The poststop hooks are invoked.

Operations
#

Unless otherwise stated, runtimes MUST support the following operations. (These operations are not specifying any command-line APIs, and the parameters are inputs for general operations.)

Container tools expose these operations through their own CLIs. The command names may differ from tool to tool, but the underlying operations should behave consistently with the OCI runtime-spec.

  • query state: This operation MUST return the state of a container, as described in the State of a Container section.
  • create: This operation MUST create a new container. Any changes made to the config.json file after this operation will not have an effect on the container.
  • start: This operation MUST run the user-specified program as specified by process.
  • kill: This operation MUST send the specified signal to the container process.
  • delete: Attempting to delete a container that is not stopped MUST have no effect on the container and MUST generate an error. Deleting a container MUST delete the resources that were created during the create step. Note that resources associated with the container, but not created by this container, MUST NOT be deleted.
    • For example, volumes or mounts that exist independently of the container are not deleted.
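
The operations above can be pictured as a small state machine. The sketch below is a toy, in-memory model and not a real runtime: the class ToyRuntime and its method names are hypothetical, no process is ever started, and any signal is treated as stopping the container immediately. The errors raised by start and kill for containers in the wrong state are assumptions of this sketch; only the delete rule is taken directly from the list above.

```python
import copy


class ToyRuntime:
    """In-memory model of the create/start/kill/delete operations."""

    def __init__(self):
        self._containers = {}

    def create(self, container_id, config):
        if container_id in self._containers:
            raise ValueError("container id already in use")
        # Deep-copy the config so later edits by the caller have no effect.
        self._containers[container_id] = {
            "status": "created",
            "config": copy.deepcopy(config),
        }

    def state(self, container_id):
        return {"id": container_id, "status": self._containers[container_id]["status"]}

    def start(self, container_id):
        container = self._containers[container_id]
        if container["status"] != "created":
            raise RuntimeError("start requires a created container")
        container["status"] = "running"

    def kill(self, container_id, signal="SIGTERM"):
        container = self._containers[container_id]
        if container["status"] not in ("created", "running"):
            raise RuntimeError("kill requires a created or running container")
        container["status"] = "stopped"  # toy model: every signal stops the process

    def delete(self, container_id):
        if self._containers[container_id]["status"] != "stopped":
            raise RuntimeError("cannot delete a container that is not stopped")
        del self._containers[container_id]


runtime = ToyRuntime()
config = {"process": {"args": ["sleep", "60"]}}
runtime.create("demo", config)
config["process"]["args"] = ["changed"]  # edit after create: must be ignored
stored_args = runtime._containers["demo"]["config"]["process"]["args"]
runtime.start("demo")
try:
    runtime.delete("demo")  # still running, so this must fail
    delete_error = None
except RuntimeError as exc:
    delete_error = str(exc)
runtime.kill("demo")
final_status = runtime.state("demo")["status"]
runtime.delete("demo")
print(stored_args, delete_error, final_status)
```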

Configuration
#

The configuration file (config.json) contains the metadata necessary to implement standard operations against the container. This includes the process to run, environment variables to inject, sandboxing features to use, etc.

Image Specification
#

This specification defines an OCI Image, consisting of an image manifest, an image index (optional), a set of filesystem layers, and a configuration.

Image Manifest
#

At a high level, the image manifest contains metadata about the contents and dependencies of the image, including the content-addressable identity of one or more filesystem layer changeset archives that will be unpacked to make up the final runnable filesystem.

Image Configuration
#

The image configuration includes information such as application arguments, environment variables, etc.

Image Index
#

The image index is a higher-level manifest that points to a list of manifests and descriptors. Typically, these manifests may provide different implementations of the image, possibly varying by platform or other attributes.

Content Descriptors
#

  • An OCI image consists of several different components arranged in a Merkle Directed Acyclic Graph (DAG).
  • References between components in the graph are expressed through Content Descriptors.
  • A Content Descriptor, or simply Descriptor, describes the disposition of the targeted content, that is, how it is placed or arranged in relation to other things.
  • The content identifier is the digest.
  • The media type defining the descriptor is: application/vnd.oci.descriptor.v1+json
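
As a small illustration, the sketch below builds a descriptor for a piece of content and later uses it to verify that content. The helper names make_descriptor and matches are hypothetical, and the media type application/octet-stream is just a placeholder; the three fields shown (mediaType, digest, size) are the core properties of a descriptor.

```python
import hashlib


def make_descriptor(media_type, content):
    """Describe content by media type, sha256 digest and size in bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


def matches(descriptor, content):
    """Check that content is exactly what the descriptor points at."""
    algorithm, _, encoded = descriptor["digest"].partition(":")
    if algorithm != "sha256":
        raise ValueError("this sketch only handles sha256 digests")
    return (
        len(content) == descriptor["size"]
        and hashlib.sha256(content).hexdigest() == encoded
    )


blob = b"hello"
descriptor = make_descriptor("application/octet-stream", blob)
print(descriptor["digest"], descriptor["size"])
print(matches(descriptor, blob), matches(descriptor, b"hellO"))
```

Because the digest is derived from the content, any change to the content makes the descriptor stop matching, which is what allows the components of an image to reference each other safely.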

A canonical form is a representation in which every object has exactly one representation, so the equality of two objects can be tested by comparing their canonical forms. Canonicalization is the process of putting a representation into its canonical form. For example, {'a':1, 'b':2} and {'b':2, 'a':1} describe the same content, yet their serialized bytes differ and therefore produce different digests. This is why canonicalization is used when saving content in OCI.

echo -n {'a':1,'b':2} | sha256sum
d8766531781e268ee6fe73b2333041ca231ac61f059874afe0d10c395421b388

echo -n {'b':2,'a':1} | sha256sum
d644ddd8c7d5668b270da1e1d8a51a3c8b0a4c7458513a85cbf056b4414f4b65
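
The same effect can be reproduced in Python. In the sketch below, the canonical form chosen (sorted keys and compact separators) is an assumption made for illustration; it is one possible canonicalization, not necessarily the one a given OCI tool applies.

```python
import hashlib
import json


def digest(data):
    return "sha256:" + hashlib.sha256(data).hexdigest()


def canonical_digest(obj):
    """Digest of an assumed canonical form: sorted keys, no extra whitespace."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return digest(canonical.encode("utf-8"))


first = {"a": 1, "b": 2}
second = {"b": 2, "a": 1}

# Serialized as written, key order leaks into the bytes and so into the digest.
raw_first = digest(json.dumps(first).encode("utf-8"))
raw_second = digest(json.dumps(second).encode("utf-8"))

print(first == second)            # same content
print(raw_first == raw_second)    # different raw digests
print(canonical_digest(first) == canonical_digest(second))
```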

Image Layout Specification
#

  • The OCI Image Layout is the directory structure for OCI content-addressable blobs and location-addressable references (refs).

Given an image layout and a ref, a tool can create an OCI Runtime Specification bundle by:

  • Following the ref to find a manifest, possibly via an image index
  • Applying the filesystem layers in the specified order
  • Converting the image configuration into an OCI Runtime Specification config.json

Structure
#

The image layout is as follows:

  • blobs directory:
    • Contains content-addressable blobs
    • A blob has no schema and SHOULD be considered opaque
    • Directory MUST exist and MAY be empty
  • oci-layout file:
    • It MUST exist and be a JSON object.
    • It MUST contain an imageLayoutVersion field
  • index.json file
    • It MUST exist and be an image index JSON object.

Blobs
#

  • Object names in the blobs subdirectories are composed of a directory for each hash algorithm, the children of which will contain the actual content.
  • The content of blobs/<alg>/<encoded> MUST match the digest <alg>:<encoded> (referenced per descriptor). For example, the content of blobs/sha256/da39a3ee5e6b4b0d3255bfef95601890afd80709 MUST match the digest sha256:da39a3ee5e6b4b0d3255bfef95601890afd80709.

oci-layout file
#

  • This JSON object serves as a marker for the base of an Open Container Image Layout and provides the version of the image-layout in use.
  • The media type defining the image layout specification is: application/vnd.oci.layout.header.v1+json

index.json file
#

  • It is the entry point for references and descriptors of the image layout.
  • The image index is a multi-descriptor entry point.
  • This index provides an established path (/index.json) to have an entry point for an image-layout and to discover auxiliary descriptors.
  • In general the mediaType of each descriptor object in the manifests field will be either application/vnd.oci.image.index.v1+json or application/vnd.oci.image.manifest.v1+json.
  • An encountered mediaType that is unknown MUST NOT generate an error.
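
Putting the three pieces together, the following sketch creates a minimal image layout in a temporary directory and then checks that every blob matches the digest encoded in its path. The helper names init_layout, write_blob and verify_blobs are hypothetical, the imageLayoutVersion and schemaVersion values are assumptions based on our reading of the spec, and only sha256 is handled.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def init_layout(layout):
    """Create the blobs directory, the oci-layout marker and an empty index.json."""
    (layout / "blobs").mkdir(parents=True, exist_ok=True)
    (layout / "oci-layout").write_text(json.dumps({"imageLayoutVersion": "1.0.0"}))
    (layout / "index.json").write_text(json.dumps({"schemaVersion": 2, "manifests": []}))


def write_blob(layout, content):
    """Store content under blobs/sha256/<encoded> and return its digest."""
    encoded = hashlib.sha256(content).hexdigest()
    blob_dir = layout / "blobs" / "sha256"
    blob_dir.mkdir(parents=True, exist_ok=True)
    (blob_dir / encoded).write_bytes(content)
    return "sha256:" + encoded


def verify_blobs(layout):
    """Return the paths of sha256 blobs whose content does not match their name."""
    mismatched = []
    for path in (layout / "blobs").glob("*/*"):
        if path.parent.name != "sha256":
            continue  # other algorithms are skipped in this sketch
        if hashlib.sha256(path.read_bytes()).hexdigest() != path.name:
            mismatched.append(path)
    return mismatched


layout = Path(tempfile.mkdtemp()) / "layout"
init_layout(layout)
blob_digest = write_blob(layout, b"example blob content")
clean = verify_blobs(layout)

# Overwrite the blob to simulate corruption; verification should now flag it.
blob_path = layout / "blobs" / "sha256" / blob_digest.split(":", 1)[1]
blob_path.write_bytes(b"tampered")
corrupted = verify_blobs(layout)
print(len(clean), len(corrupted))
```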

Image Index Specification
#

  • The image index is a higher-level manifest that points to specific image manifests, ideal for one or more platforms. While the use of an image index is OPTIONAL for image providers, image consumers SHOULD be prepared to process them.
  • This section defines the application/vnd.oci.image.index.v1+json media type.

Image Manifest Specification
#

The media type defined by this section is application/vnd.oci.image.manifest.v1+json. The Image Manifest Specification has three main goals:

  • To support content-addressable images, through an image model where the image’s configuration can be hashed to generate a unique ID for the image and its components.
  • To allow multi-architecture images, through a “fat manifest” which references image manifests for platform-specific versions of an image. In OCI, this is codified in an image index.
  • To be translatable to the OCI Runtime Specification.

An image manifest provides a configuration and set of layers for a single container image for a specific architecture and operating system.
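
For a feel of what such a manifest looks like, the sketch below assembles one as a Python dictionary and prints it as JSON. The config contents and the layer bytes are placeholders invented for this example, and the helper descriptor is hypothetical; the media types are the ones named in this article.

```python
import hashlib
import json


def descriptor(media_type, content):
    """Build a content descriptor (media type, digest, size) for some bytes."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(content).hexdigest(),
        "size": len(content),
    }


# Placeholder payloads; a real image would carry a full config and real tarballs.
config_bytes = json.dumps(
    {"architecture": "amd64", "os": "linux", "rootfs": {"type": "layers", "diff_ids": []}}
).encode("utf-8")
layer_bytes = b"placeholder for a gzip-compressed layer tarball"

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": descriptor("application/vnd.oci.image.config.v1+json", config_bytes),
    "layers": [
        descriptor("application/vnd.oci.image.layer.v1.tar+gzip", layer_bytes),
    ],
}

print(json.dumps(manifest, indent=2))
```

Note that the manifest does not embed the configuration or the layers; it only points at them through descriptors.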

Image Configuration
#

  • An OCI Image is an ordered collection of root filesystem changes and the corresponding execution parameters for use within a container runtime.
  • This specification outlines the JSON format describing images for use with a container runtime and execution tool and its relationship to filesystem changesets.
  • The media type application/vnd.oci.image.config.v1+json defines the image configuration.

Terminology
#

Layer
#

  • Image filesystems are composed of layers.
  • Each layer represents a set of filesystem changes in a tar-based layer format, recording files to be added, changed, or deleted relative to its parent layer.
  • Layers do not have configuration metadata such as environment variables or default arguments; these are properties of the image as a whole rather than of any particular layer.
  • Using a layer-based or union filesystem such as AUFS, or by computing the diff from filesystem snapshots, the filesystem changeset can be used to present a series of image layers as if they were one cohesive filesystem.
  • One or more layers are applied on top of each other to create a complete filesystem.
    • The media type application/vnd.oci.image.layer.v1.tar+gzip represents an application/vnd.oci.image.layer.v1.tar payload which has been compressed with gzip.
    • The media type application/vnd.oci.image.layer.v1.tar+zstd represents an application/vnd.oci.image.layer.v1.tar payload which has been compressed with zstd.
    • Layer changesets for the media type application/vnd.oci.image.layer.v1.tar MUST be packaged in a tar archive.

Change Types
#

Types of changes that can occur in a changeset are:

  • Additions
  • Modifications
  • Removals
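
The three change types can be modelled with plain dictionaries. This is a deliberately simplified sketch: the filesystem is represented as a mapping from path to file content, and the changeset format with added, modified and removed keys is hypothetical. Real layers encode the same information inside tar archives.

```python
def apply_changeset(filesystem, changeset):
    """Return a new filesystem with one changeset applied on top of it."""
    result = dict(filesystem)  # never mutate the parent layer's view
    for path, content in changeset.get("added", {}).items():
        result[path] = content
    for path, content in changeset.get("modified", {}).items():
        result[path] = content
    for path in changeset.get("removed", []):
        result.pop(path, None)
    return result


base = {"/etc/motd": b"welcome\n", "/tmp/scratch": b"x"}
layer = {
    "added": {"/app/main.py": b"print('hi')\n"},
    "modified": {"/etc/motd": b"welcome back\n"},
    "removed": ["/tmp/scratch"],
}

merged = apply_changeset(base, layer)
print(sorted(merged))
```

Applying several changesets in order, each on top of the result of the previous one, yields the complete filesystem described in the Layer section.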

JSON
#

  • Each image has an associated JSON structure that describes some basic information about the image, such as date created and author, as well as execution/runtime configuration like its entrypoint, default arguments, networking, and volumes.
  • The JSON structure also references a cryptographic hash of each layer used by the image, and provides history information for those layers.
  • This JSON is considered to be immutable because changing it would change the computed ImageID.
  • Changing it means creating a new derived image, instead of changing the existing image.

Layer DiffID
#

  • A layer DiffID is the digest over the layer’s uncompressed tar archive, serialized in the descriptor digest format.
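
The difference between a DiffID and the digest of the stored (compressed) blob is easy to miss, so here is a small sketch. It builds a tiny tar archive in memory, compresses it with gzip, and computes both digests. The file name etc/motd and the helper names are made up for the example.

```python
import gzip
import hashlib
import io
import tarfile


def build_layer_tar(files):
    """Pack a mapping of name -> bytes into an uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def sha256_digest(data):
    return "sha256:" + hashlib.sha256(data).hexdigest()


uncompressed = build_layer_tar({"etc/motd": b"hello\n"})
compressed = gzip.compress(uncompressed)

diff_id = sha256_digest(uncompressed)     # digest over the uncompressed tar
blob_digest = sha256_digest(compressed)   # digest of the blob as it is stored

# The DiffID can always be recovered by decompressing the stored blob.
recovered_diff_id = sha256_digest(gzip.decompress(compressed))
print(diff_id == recovered_diff_id, diff_id == blob_digest)
```

The DiffID therefore stays stable even if the same layer is recompressed with a different algorithm or compression level.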

Chain ID
#

  • It is sometimes useful to refer to a stack of layers with a single identifier. While a layer’s DiffID identifies a single changeset, the ChainID identifies the subsequent application of those changesets.
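
A ChainID can be computed recursively from the DiffIDs. The sketch below follows our understanding of the image-spec definition: the ChainID of a single layer is its DiffID, and each further layer is folded in by hashing the previous ChainID, a space, and the next DiffID. The function name chain_id and the sample DiffIDs are illustrative only.

```python
import hashlib


def chain_id(diff_ids):
    """Fold an ordered list of DiffIDs (first layer first) into one ChainID."""
    if not diff_ids:
        raise ValueError("at least one DiffID is required")
    current = diff_ids[0]  # ChainID of a single layer is its DiffID
    for diff_id in diff_ids[1:]:
        combined = (current + " " + diff_id).encode("ascii")
        current = "sha256:" + hashlib.sha256(combined).hexdigest()
    return current


# Made-up DiffIDs derived from placeholder content.
layer_a = "sha256:" + hashlib.sha256(b"layer a").hexdigest()
layer_b = "sha256:" + hashlib.sha256(b"layer b").hexdigest()

print(chain_id([layer_a]) == layer_a)
print(chain_id([layer_a, layer_b]) == chain_id([layer_b, layer_a]))
```

Because the order of application is part of the hash, the same two layers stacked in a different order produce a different ChainID.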

Image ID
#

  • Each image’s ID is given by the SHA256 hash of its configuration JSON, which is why that JSON is treated as immutable.

Properties
#

  • created: A combined date and time at which the image was created.
  • author: The name and/or email address of the person or entity that created and is responsible for maintaining the image.
  • architecture: The CPU architecture on which the binaries in this image are built to run.
  • os: The name of the operating system on which the image is built to run.
  • os.version: The version of the operating system targeted by the referenced blob.
  • os.features: An array of strings, each specifying a mandatory OS feature.
  • variant: The variant of the specified CPU architecture.
  • config: The execution parameters that SHOULD be used as a base when running a container using the image.
    • User: The username or UID, a platform-specific structure that allows specific control over which user the process runs as.
    • ExposedPorts: A set of ports to expose from a container running this image. Its keys can be in the format port/tcp, port/udp, or port, with tcp as the default protocol if none is specified.
    • Env: Entries are in the format VARNAME=VARVALUE. These values act as defaults and are merged with any specified when creating a container.
    • Entrypoint: A list of arguments to use as the command to execute when the container starts. These values act as defaults and may be replaced by an entrypoint specified when creating a container.
    • Cmd: Default arguments to the entrypoint of the container. If an Entrypoint value is not specified, then the first entry of the Cmd array SHOULD be interpreted as the executable to run.
    • Volumes: A set of directories describing where the process is likely to write data specific to a container instance.
    • WorkingDir: Sets the current working directory of the entrypoint process in the container. This value acts as a default and may be replaced by a working directory specified when creating a container.
    • Labels: Arbitrary metadata for the container.
    • StopSignal: The system call signal that will be sent to the container to exit.
  • rootfs: References the layer content addresses used by the image. This makes the image config hash depend on the filesystem hash.
    • type: MUST be set to layers.
    • diff_ids: An array of layer content hashes (DiffIDs), in order from first to last.
  • history: Describes the history of each layer. The array is ordered from first to last.
    • created: A combined date and time at which the layer was created.
    • author: The author of the build point.
    • created_by: The command that created the layer.
    • comment: A custom message set when creating the layer.
    • empty_layer: Marks whether the history item created a filesystem diff. It is set to true if this history item doesn’t correspond to an actual layer in the rootfs section.
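
To show how the defaults in config might be combined with values supplied at container creation time, here is a hedged sketch. The function resolve_process and its keyword arguments are hypothetical, and the precedence rules are a simplified reading of the property descriptions above (overrides replace Entrypoint, Cmd and WorkingDir, while Env entries are merged by variable name); actual tools may behave differently in edge cases.

```python
def resolve_process(image_config, entrypoint=None, cmd=None, env=None, cwd=None):
    """Combine image defaults with creation-time overrides into process settings."""
    defaults = image_config.get("config") or {}
    final_entrypoint = entrypoint if entrypoint is not None else (defaults.get("Entrypoint") or [])
    final_cmd = cmd if cmd is not None else (defaults.get("Cmd") or [])
    # With no Entrypoint, the first Cmd entry naturally becomes the executable.
    args = list(final_entrypoint) + list(final_cmd)
    if not args:
        raise ValueError("no executable: both Entrypoint and Cmd are empty")

    # Image Env entries act as defaults; later entries override by name.
    merged_env = {}
    for entry in list(defaults.get("Env") or []) + list(env or []):
        name, _, value = entry.partition("=")
        merged_env[name] = value

    return {
        "args": args,
        "env": [f"{name}={value}" for name, value in merged_env.items()],
        "cwd": cwd or defaults.get("WorkingDir") or "/",
    }


image_config = {
    "config": {
        "Entrypoint": ["python3"],
        "Cmd": ["app.py"],
        "Env": ["MODE=prod", "PORT=8080"],
        "WorkingDir": "/srv",
    }
}

process = resolve_process(image_config, env=["MODE=debug"])
cmd_only = resolve_process({"config": {"Cmd": ["/bin/sh", "-c", "true"]}})
print(process)
print(cmd_only["args"])
```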

Conversion to OCI Runtime Configuration
#

When extracting an OCI Image into an OCI Runtime bundle, two orthogonal components of the extraction are relevant:

  1. Extraction of the root filesystem from the set of filesystem layers.
  2. Conversion of the image configuration blob to an OCI Runtime configuration blob.

All the necessary system libraries and dependencies of the application are referenced as layers.

The image manifest specifies the CPU architecture and operating system for which the layers and configuration are suitable. The image index contains information about a set of images that can span a variety of architectures and operating systems.

A file system is a structure used by an operating system to organise and manage files on a storage device such as a hard drive, solid state drive (SSD), or USB flash drive. It defines how data is stored, accessed, and organised on the storage device.

Common file systems:

  • FAT (File Allocation Table), FAT16, FAT32
  • exFAT (Extended File Allocation Table)
  • NTFS (New Technology File System)
  • APFS (Apple File System)
  • HFS, HFS+ (Hierarchical File System)
  • Ext4 (Fourth Extended File System)

BLOB stands for “Binary Large Object,” a data type that stores binary data. Unlike plain strings that hold only letters and numbers, BLOBs can hold complex files such as images or videos, for example multimedia objects stored in a database.

An archive file stores the content of one or more computer files, possibly compressed and/or encrypted, with associated metadata such as file name, directory structure, error detection and correction information, and commentary. In computing, tar is a shell command for combining multiple computer files into a single archive file. A tarball contains metadata for the contained files including the name, ownership, timestamps, permissions and directory organization.

A changeset describes the exact differences between two successive versions in the version control system’s repository of changes.

Vaibhav
Author
Vaibhav
Full Stack Developer