Container Image Management

Container Image Management provides a programmatic and declarative way to define and manage container images. This approach allows developers to construct images by layering various components, ensuring reproducibility and version control. Image objects are immutable once defined; any modification operation returns a new Image instance.

Defining Base Images

Images are constructed by starting with a base image using one of the from_* class methods.

  • Default Debian Base: The Image.from_debian_base() method creates a multi-architecture (amd64/arm64) Debian-based image with a specified Python version. It can optionally install the core library. This is the recommended starting point for most Python projects.

    from flytekit.image import Image

    # Creates a default Debian-based image with the current Python version and installs the core library
    my_default_image = Image.from_debian_base()

    # Specify Python 3.11 and prevent core library installation
    custom_python_image = Image.from_debian_base(python_version=(3, 11), install_flyte=False)

    # Specify a custom registry and name for the image
    named_image = Image.from_debian_base(registry="my-registry.com", name="my-app")
  • Existing Image URI: The Image.from_base(image_uri: str) method uses a pre-existing image from a container registry. The image_uri must be a complete URI, including the registry, name, and tag (e.g., my-registry.com/my-app:latest).

    # Pass the complete URI: registry, name, and tag
    existing_image = Image.from_base("my-registry.com/my-app:latest")
  • Custom Dockerfile: The Image.from_dockerfile(file: Path, registry: str, name: str, platform: Optional[Tuple[Architecture, ...]]) method builds an image from a local Dockerfile. When using this method, additional layers cannot be programmatically added using with_* methods, as the system does not parse or understand the Dockerfile's internal structure. All image logic must reside within the Dockerfile itself. The build context for the Dockerfile is its parent directory.

    from pathlib import Path
    dockerfile_path = Path("./Dockerfile")
    dockerfile_image = Image.from_dockerfile(
        file=dockerfile_path,
        registry="my-registry.com",
        name="my-custom-dockerfile-app",
    )
  • UV Script: The Image.from_uv_script(script: Path | str, name: str, registry: Optional[str], ...) method creates an image based on a uv script. This method automatically parses the script's header to determine the required Python version and dependencies, streamlining dependency management for uv-based projects.

    from pathlib import Path
    uv_script_path = Path("./my_script.py")
    uv_image = Image.from_uv_script(
        script=uv_script_path,
        name="my-uv-app",
        registry="my-registry.com",
    )

Layering Customizations

Once a base image is defined, use the with_* methods to add layers of customization. Each with_* method returns a new Image object, reflecting the layered approach.

Python Dependencies

  • Pip Packages: The with_pip_packages(*packages: str, ...) method installs specified Python packages using pip. It supports custom index URLs, pre-releases, and extra arguments for pip install.

    my_image = Image.from_debian_base().with_pip_packages("requests", "pandas==2.0.0")
  • Requirements File: The with_requirements(file: Path | str, ...) method installs Python packages listed in a requirements.txt file. The file must have a .txt extension.

    from pathlib import Path
    my_image = Image.from_debian_base().with_requirements(Path("./requirements.txt"))
  • UV Project: The with_uv_project(pyproject_file: Path | str, uvlock: Optional[Path], ...) method installs dependencies defined in a pyproject.toml and uv.lock file. This method is suitable for projects managed with uv. If uvlock is not specified, it defaults to pyproject_file.parent / "uv.lock".

    from pathlib import Path
    my_image = Image.from_debian_base().with_uv_project(Path("./pyproject.toml"))

System Dependencies

  • APT Packages: The with_apt_packages(*packages: str, ...) method installs system-level packages using apt.

    my_image = Image.from_debian_base().with_apt_packages("git", "build-essential")

File Operations

  • Copy Source Folder: The with_source_folder(src: Path, dst: str = ".") method copies a local directory (src) into the image at the specified destination (dst). If dst is not provided, it defaults to the image's working directory.

    from pathlib import Path
    my_image = Image.from_debian_base().with_source_folder(Path("./my_code"), "/app")
  • Copy Source File: The with_source_file(src: Path, dst: str = ".") method copies a local file (src) into the image at the specified destination (dst). If dst is not provided, it defaults to the image's working directory.

    from pathlib import Path
    my_image = Image.from_debian_base().with_source_file(Path("./config.yaml"), "/etc/config.yaml")
  • Dockerignore: The with_dockerignore(path: Path) method specifies a .dockerignore file to exclude files and directories from the build context when copying source code.

    from pathlib import Path
    my_image = Image.from_debian_base().with_dockerignore(Path("./.dockerignore"))

Environment and Commands

  • Environment Variables: The with_env_vars(env_vars: Dict[str, str]) method sets environment variables within the image.

    my_image = Image.from_debian_base().with_env_vars({"MY_VAR": "value", "DEBUG": "true"})
  • Working Directory: The with_workdir(workdir: str) method sets the working directory for subsequent commands in the image. This overrides any previously set working directory.

    my_image = Image.from_debian_base().with_workdir("/app")
  • Custom Commands: The with_commands(commands: List[str], ...) method executes arbitrary shell commands during the image build process. Do not include RUN in the commands, as it is implicitly added.

    my_image = Image.from_debian_base().with_commands(["mkdir -p /data", "chmod 777 /data"])

Image Immutability and Cloning

Image objects are immutable. Each with_* method, and the clone() method, returns a new Image instance with the changes applied and leaves the original untouched. Because an existing definition can never change after it is created, the same definition always describes the same image, which supports reproducibility.

The clone(registry: Optional[str], name: Optional[str], python_version: Optional[Tuple[int, int]], addl_layer: Optional[Layer]) method explicitly creates a new image based on the current one. It can change the registry, name, or Python version, or add a single new layer.

    base_image = Image.from_debian_base(name="my-app", registry="my-registry.com")

    # Creates a new image with a different name; base_image is unchanged
    dev_image = base_image.clone(name="my-app-dev")
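
The "every operation returns a new object" pattern can be sketched with a frozen dataclass. This is a hypothetical stand-in, not the library's Image class: SketchImage, with_layer, and the string-valued layers are assumptions made for illustration only.

```python
import dataclasses
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class SketchImage:
    """Hypothetical stand-in for an immutable image definition."""

    name: str
    registry: Optional[str] = None
    layers: Tuple[str, ...] = ()

    def with_layer(self, layer: str) -> "SketchImage":
        # Build a new instance; the original is left untouched.
        return replace(self, layers=self.layers + (layer,))

    def clone(
        self, name: Optional[str] = None, registry: Optional[str] = None
    ) -> "SketchImage":
        # Only override the fields that were explicitly passed.
        return replace(
            self,
            name=name if name is not None else self.name,
            registry=registry if registry is not None else self.registry,
        )


base = SketchImage(name="my-app", registry="my-registry.com")
dev = base.clone(name="my-app-dev").with_layer("pip: requests")

print(base.name, base.layers)  # my-app ()
print(dev.name, dev.layers)    # my-app-dev ('pip: requests',)
```

Assigning to a field of a frozen dataclass raises dataclasses.FrozenInstanceError, so a shared base definition cannot be altered by accident.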

Image Identification and URI

  • Unique Identifier: The identifier property provides a stable, hash-based string representing the image's complete definition (base image, Dockerfile, and all layered customizations). This identifier is crucial for caching previously built images, avoiding redundant builds.

  • Image URI: The uri property constructs the full image URI in the format <registry>/<name>:<tag>. The tag is either explicitly set during image creation or automatically derived from the image's hash digest.

    my_image = Image.from_debian_base(registry="my-registry.com", name="my-app")
    print(my_image.identifier) # Example: "some_hash_string"
    print(my_image.uri) # Example: "my-registry.com/my-app:some_hash_string"
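
One way to picture a hash-based identifier is to hash a canonical serialization of the whole definition. The sketch below assumes SHA-256 over a JSON payload and a 16-character prefix; the actual hashing scheme is not described here, and sketch_identifier and sketch_uri are hypothetical helper names.

```python
import hashlib
import json
from typing import Optional, Sequence


def sketch_identifier(base: str, layers: Sequence[str]) -> str:
    """Hash the full definition so equal definitions give equal identifiers."""
    payload = json.dumps({"base": base, "layers": list(layers)}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def sketch_uri(
    registry: str, name: str, identifier: str, tag: Optional[str] = None
) -> str:
    """Build <registry>/<name>:<tag>, using the identifier when no tag is set."""
    return f"{registry}/{name}:{tag or identifier}"


ident_a = sketch_identifier("debian", ["apt: git", "pip: requests"])
ident_b = sketch_identifier("debian", ["apt: git", "pip: requests"])
ident_c = sketch_identifier("debian", ["apt: git", "pip: pandas"])

print(ident_a == ident_b)  # True: same definition, same identifier
print(ident_a == ident_c)  # False: a changed layer changes the identifier
print(sketch_uri("my-registry.com", "my-app", ident_a))
```

Because the identifier depends only on the definition, a build system can look it up before building and skip the build when a matching image already exists.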

Advanced Considerations

  • Multi-Architecture Support: Images can be built for multiple architectures (e.g., linux/amd64, linux/arm64) by specifying the platform parameter in base image constructors like from_debian_base or from_dockerfile.

  • Secret Mounts for Build Process: Several with_* methods (e.g., with_pip_packages, with_apt_packages, with_commands) support secret_mounts. This allows securely providing credentials or sensitive files during the image build process, for example, to access private package repositories. Secrets can be mounted as environment variables or at specific file paths within the build environment.

    from flytekit.image import Image, Secret

    # Mount GITHUB_PAT as an environment variable during the build
    image_with_secret = Image.from_debian_base().with_pip_packages(
        "private-package",
        secret_mounts=[Secret(key="GITHUB_PAT")],
    )

    # Mount a secret to a specific file path within the build environment
    image_with_apt_secret = Image.from_debian_base().with_apt_packages(
        "my-private-apt-package",
        secret_mounts=[Secret(key="apt-secret", mount="/etc/apt/apt-secret")],
    )
  • Validation: The validate() method on an Image object checks its layers, for example by verifying that referenced source files or scripts exist. These checks typically run at build time, before the build itself starts, so that problems such as a missing source file surface early.
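
The kind of pre-build check described under Validation can be sketched as a simple existence test on the referenced paths. This is an assumption about what such a check might look like, not the body of validate(); missing_sources and validate_sources are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_sources(paths: Iterable[Path]) -> List[Path]:
    """Return the source paths that do not exist on disk."""
    return [p for p in paths if not p.exists()]


def validate_sources(paths: Iterable[Path]) -> None:
    """Raise before any build starts if a referenced source is missing."""
    missing = missing_sources(paths)
    if missing:
        names = ", ".join(str(p) for p in missing)
        raise FileNotFoundError(f"missing source paths: {names}")


# Demonstrate with one file that exists and one that does not.
with tempfile.TemporaryDirectory() as tmp:
    present = Path(tmp) / "config.yaml"
    present.write_text("debug: true\n")
    absent = Path(tmp) / "does_not_exist.txt"
    still_missing = missing_sources([present, absent])

print([p.name for p in still_missing])  # ['does_not_exist.txt']
```

Failing at this stage costs almost nothing, whereas the same mistake discovered midway through a build wastes the time spent on every earlier layer.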

Best Practices

  • Start with from_debian_base: For most Python-based applications, Image.from_debian_base() provides a robust and well-maintained starting point with sensible defaults.
  • Layer Order: Add system dependencies (with_apt_packages) before Python dependencies (with_pip_packages, with_requirements, with_uv_project) to ensure all necessary build tools are available. Add application code (with_source_folder, with_source_file) last to leverage Docker's layer caching effectively, minimizing rebuilds when only code changes.
  • Explicit Naming: Always provide a registry and name when defining images to ensure they can be pushed and pulled correctly from your container registry.
  • Leverage uv for Dependency Management: For modern Python projects, from_uv_script or with_uv_project offer efficient and reproducible dependency management.
  • Absolute Paths for Source: When using with_source_file or with_source_folder, use Path objects with absolute paths to avoid ambiguity during the build process, ensuring files are copied from the expected locations.
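
The layer-order advice can be made concrete with a simplified model of layer caching, in which each layer's cache key depends on that layer and on every layer before it. The model, the string-valued layers, and the function names below are assumptions for illustration; real build caches track more than this.

```python
import hashlib
from typing import List, Sequence


def cumulative_cache_keys(layers: Sequence[str]) -> List[str]:
    """Each key depends on the layer and on every layer before it."""
    keys: List[str] = []
    previous = ""
    for layer in layers:
        digest = hashlib.sha256((previous + "\n" + layer).encode("utf-8"))
        previous = digest.hexdigest()
        keys.append(previous)
    return keys


def reusable_layers(old: Sequence[str], new: Sequence[str]) -> int:
    """Count the leading layers whose cache keys are unchanged."""
    count = 0
    for old_key, new_key in zip(cumulative_cache_keys(old), cumulative_cache_keys(new)):
        if old_key != new_key:
            break
        count += 1
    return count


# Code last: editing the code keeps both dependency layers cached.
code_last_v1 = ["apt: git", "pip: pandas", "copy: app@v1"]
code_last_v2 = ["apt: git", "pip: pandas", "copy: app@v2"]

# Code first: the same edit invalidates every later layer.
code_first_v1 = ["copy: app@v1", "apt: git", "pip: pandas"]
code_first_v2 = ["copy: app@v2", "apt: git", "pip: pandas"]

kept_code_last = reusable_layers(code_last_v1, code_last_v2)
kept_code_first = reusable_layers(code_first_v1, code_first_v2)

print(kept_code_last)   # 2
print(kept_code_first)  # 0
```

With the code layer last, a code change rebuilds only that final layer; with it first, the system and Python dependency layers are rebuilt as well.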