Environments & Image Building

This section describes how to define and manage execution environments and build custom container images for your tasks. These capabilities ensure that your code runs consistently with the required dependencies and resources.

Defining Execution Environments

The Environment class defines the runtime context for your tasks. It encapsulates configuration such as the container image, resource allocations, environment variables, secrets, and dependencies on other environments.

Key Attributes of Environment:

  • name: A unique identifier for the environment (snake_case or kebab-case).
  • image: Specifies the container image to use. This can be a string URI or an Image object for programmatic image building. Using "auto" defaults to a standard Python image.
  • resources: Defines CPU, memory, and GPU allocations.
  • env_vars: A dictionary of environment variables to set within the container.
  • secrets: Specifies secrets to inject into the environment at runtime.
  • depends_on: A list of other Environment instances that must be deployed alongside this environment. This is useful for orchestrating related services.
  • pod_template: An optional PodTemplate object to customize the underlying Kubernetes Pod specification.

Example: Creating an Environment

from flytekit.core.base_task import task
from flytekit.core.environment import Environment
from flytekit.core.resources import Resources
from flytekit.core.secrets import Secret
from flytekit.core.pod_template import PodTemplate
from kubernetes.client import V1PodSpec, V1Container

# Define a custom PodTemplate
custom_pod_template = PodTemplate(
    pod_spec=V1PodSpec(
        containers=[
            V1Container(
                name="primary",
                image="ubuntu:latest",
                command=["bash", "-c", "sleep infinity"]
            )
        ]
    ),
    primary_container_name="primary"
)

# Define an environment with specific resources, env vars, and secrets
my_environment = Environment(
    name="my-custom-env",
    image="my-registry/my-app:v1.0",
    resources=Resources(cpu="1", mem="2Gi"),
    env_vars={"DEBUG_MODE": "true"},
    secrets=[Secret(group="my-secrets", key="API_KEY")],
    pod_template=custom_pod_template
)

# An environment can depend on other environments
db_environment = Environment(name="database-env", image="postgres:13")
app_environment = Environment(name="application-env", image="my-app:latest", depends_on=[db_environment])

# You can also add dependencies programmatically
another_env = Environment(name="another-env", image="alpine")
app_environment.add_dependency(another_env)

@task(environment=app_environment)
def my_task():
    print("Running in my custom environment!")

Important Considerations:

  • Environment names must be in snake_case or kebab-case.
  • The depends_on attribute hints at deployment order but does not automatically manage the lifecycle of dependent services. You are responsible for ensuring these services are available.
  • The clone_with method allows creating new environments based on existing ones with overridden properties.
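
The naming rule and the clone_with behavior can be illustrated with a small standalone sketch. EnvSketch below is a hypothetical stand-in, not the library's Environment class. The regular expressions are an assumption about what "snake_case or kebab-case" means (lowercase words joined by one separator style), and clone_with is modeled here with dataclasses.replace; the real method's signature may differ.

```python
import re
from dataclasses import dataclass, field, replace

# Assumed reading of "snake_case or kebab-case": lowercase alphanumeric words
# joined by a single separator style. The library's actual rule may differ.
_SNAKE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")
_KEBAB = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)*$")


def is_valid_env_name(name: str) -> bool:
    """Return True if name is snake_case or kebab-case under the assumption above."""
    return bool(_SNAKE.match(name) or _KEBAB.match(name))


@dataclass(frozen=True)
class EnvSketch:
    """Hypothetical stand-in for an environment definition."""
    name: str
    image: str = "auto"
    env_vars: dict = field(default_factory=dict)

    def __post_init__(self):
        if not is_valid_env_name(self.name):
            raise ValueError(f"invalid environment name: {self.name!r}")

    def clone_with(self, **overrides) -> "EnvSketch":
        # Copy this definition, replacing only the fields that were passed in.
        return replace(self, **overrides)


base = EnvSketch(name="my-custom-env", image="my-registry/my-app:v1.0")
staging = base.clone_with(name="my_staging_env", env_vars={"DEBUG_MODE": "true"})
```

The clone keeps every field that was not overridden (here, the image), and the original definition is left unchanged.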

Programmatic Image Building

The Image class lets you define and build container images programmatically. You specify a base image, then add dependencies, copy files, and run commands as successive layers, which keeps builds reproducible and keeps the image definition in code.

Core Principles:

  • Layered Construction: Images are built by starting with a base and adding successive layers. Each with_* method returns a new Image instance, preserving the original.
  • Identifier Hashing: Each Image instance has a unique identifier derived from its base image and all applied layers. This identifier is used for caching and ensures that identical image definitions result in the same image.
  • URI Generation: The uri property generates the full image URI (<registry>/<name>:<tag>) based on the defined properties and the computed identifier.
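
To make the first two principles concrete, here is a minimal sketch of an immutable layered definition whose identifier is a hash of the base and the ordered layers. ImageSketch, with_layer, the layer strings, and the choice of a truncated SHA-256 are all illustrative assumptions; this is not the Image class's actual implementation.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImageSketch:
    """Hypothetical stand-in for a layered image definition."""
    base: str
    layers: Tuple[str, ...] = ()

    def with_layer(self, layer: str) -> "ImageSketch":
        # Return a new instance; the original is left untouched.
        return ImageSketch(self.base, self.layers + (layer,))

    @property
    def identifier(self) -> str:
        # Hash the base image and every layer, in order.
        digest = hashlib.sha256()
        for part in (self.base, *self.layers):
            digest.update(part.encode("utf-8"))
            digest.update(b"\0")  # separator so ("ab", "c") differs from ("a", "bc")
        return digest.hexdigest()[:16]


base = ImageSketch("python:3.10-slim")
a = base.with_layer("pip:pandas").with_layer("apt:git")
b = ImageSketch("python:3.10-slim").with_layer("pip:pandas").with_layer("apt:git")
c = base.with_layer("apt:git").with_layer("pip:pandas")
```

In this sketch, a and b are built independently but share an identifier, while c differs because layer order is part of the hash. That is the property that makes the identifier usable as a cache key.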

Creating a Base Image:

You can start building an image using one of the following class methods:

  • Image.from_debian_base(): Creates an image based on a Debian slim Python image, suitable for most Python applications. You can specify the Python version, Flyte version, registry, name, and target platforms (e.g., ("linux/amd64", "linux/arm64")).
    from flytekit.core.image import Image

    # Default Python 3.10 image with Flyte installed
    default_image = Image.from_debian_base()

    # Python 3.11 image with a custom name and registry
    custom_python_image = Image.from_debian_base(
        python_version=(3, 11),
        registry="my-org-registry",
        name="my-python-app"
    )
  • Image.from_base(image_uri: str): Starts with an arbitrary pre-built image from a registry. The image_uri must be a complete URI (e.g., ubuntu:latest, my-registry/my-base:v1).
    from flytekit.core.image import Image

    ubuntu_base = Image.from_base("ubuntu:22.04")
  • Image.from_uv_script(script: Path | str, name: str, ...): Creates an image from a uv script. This method parses the script's inline metadata header to determine the Python version and dependencies, which gives uv-managed scripts a concise way to declare their environment.
    from flytekit.core.image import Image
    from pathlib import Path

    # Write an example 'my_script.uv' with a uv header so this snippet is self-contained
    # Content of my_script.uv:
    # #!/usr/bin/env -S uv run --script
    # # /// script
    # # requires-python = ">=3.12"
    # # dependencies = ["httpx"]
    # # ///
    Path("my_script.uv").write_text("#!/usr/bin/env -S uv run --script\n# /// script\n# requires-python = \">=3.12\"\n# dependencies = [\"httpx\"]\n# ///\nprint('Hello from uv script!')")

    uv_image = Image.from_uv_script(
        script="my_script.uv",
        name="my-uv-app",
        registry="my-org-registry"
    )
    Path("my_script.uv").unlink() # Clean up
  • Image.from_dockerfile(file: Path, registry: str, name: str, ...): Builds an image from a local Dockerfile.
    from flytekit.core.image import Image
    from pathlib import Path

    # Write an example 'Dockerfile' in the current directory so this snippet is self-contained
    # Content of Dockerfile:
    # FROM python:3.10-slim-bookworm
    # WORKDIR /app
    # COPY . .
    # RUN pip install requests
    Path("Dockerfile").write_text("FROM python:3.10-slim-bookworm\nWORKDIR /app\nCOPY . .\nRUN pip install requests")

    dockerfile_image = Image.from_dockerfile(
        file=Path("./Dockerfile"),
        registry="my-org-registry",
        name="my-dockerfile-app"
    )
    Path("Dockerfile").unlink() # Clean up
    Limitation: Images created with from_dockerfile() cannot have additional layers added using with_* methods, as the system does not parse or understand the Dockerfile's contents. All image logic must reside within the Dockerfile itself.
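
The from_dockerfile() limitation can be pictured as a guard on the layering methods. The sketch below is purely illustrative: BuildSpecSketch and with_layer are hypothetical names, and raising ValueError is an assumption, since the exception the library actually raises is not stated here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BuildSpecSketch:
    """Hypothetical image definition: either a base image or a Dockerfile."""
    base: Optional[str] = None
    dockerfile: Optional[str] = None
    layers: Tuple[str, ...] = ()

    def with_layer(self, layer: str) -> "BuildSpecSketch":
        # A Dockerfile is opaque to the builder, so layering on top is refused.
        if self.dockerfile is not None:
            raise ValueError(
                "cannot add layers to a Dockerfile-based image; "
                "put the step in the Dockerfile instead"
            )
        return BuildSpecSketch(self.base, self.dockerfile, self.layers + (layer,))


layered = BuildSpecSketch(base="python:3.10-slim").with_layer("pip:requests")
from_file = BuildSpecSketch(dockerfile="./Dockerfile")
```

In practice this means a step such as installing an extra package belongs in the Dockerfile itself (for example as another RUN line) rather than in a chained with_* call.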

Adding Layers to an Image:

Once you have a base image, you can add various layers using the with_* methods. Each method returns a new Image instance with the added layer.

  • with_pip_packages(*packages: str, ...): Installs Python packages using pip (or uv internally). Supports specifying index_url, extra_index_urls, pre-release flags, and extra_args.
    from flytekit.core.image import Image
    from flytekit.core.secrets import Secret

    my_image = Image.from_debian_base().with_pip_packages("pandas", "scikit-learn==1.0.0")

    # Private Packages: Use secret_mounts to provide credentials for private package repositories during the build process.
    private_repo_image = Image.from_debian_base().with_pip_packages(
        "my-private-package",
        index_url="https://private.pypi.org/simple",
        secret_mounts=[Secret(group="build-secrets", key="PYPI_TOKEN")]
    )
  • with_requirements(file: Path | str, ...): Installs Python packages from a requirements.txt file.
    from flytekit.core.image import Image
    from pathlib import Path

    Path("requirements.txt").write_text("requests==2.28.1\n")
    my_image = Image.from_debian_base().with_requirements(Path("./requirements.txt"))
    Path("requirements.txt").unlink() # Clean up
  • with_uv_project(pyproject_file: Path | str, uvlock: Path | None = None, ...): Installs dependencies defined in a pyproject.toml and uv.lock file. This is ideal for uv-managed Python projects.
    from flytekit.core.image import Image
    from pathlib import Path

    # Create dummy pyproject.toml and uv.lock for example
    Path("pyproject.toml").write_text("[project]\nname = \"my-project\"\nversion = \"0.1.0\"\ndependencies = [\"requests\"]\n")
    Path("uv.lock").write_text("# This would be a generated uv.lock file\n")

    my_image = Image.from_debian_base().with_uv_project(Path("./pyproject.toml"))

    Path("pyproject.toml").unlink() # Clean up
    Path("uv.lock").unlink() # Clean up
  • with_apt_packages(*packages: str, ...): Installs system-level packages using apt.
    from flytekit.core.image import Image
    from flytekit.core.secrets import Secret

    my_image = Image.from_debian_base().with_apt_packages("git", "curl")

    # Private APT Repositories: Similar to pip packages, secret_mounts can be used for private APT repositories.
    apt_secret_image = Image.from_debian_base().with_apt_packages(
        "my-private-apt-package",
        secret_mounts=[Secret(group="build-secrets", key="APT_KEY", mount="/etc/apt/apt-secret")]
    )
  • with_env_vars(env_vars: Dict[str, str]): Sets environment variables within the image.
    from flytekit.core.image import Image

    my_image = Image.from_debian_base().with_env_vars({"APP_ENV": "production"})
  • with_source_folder(src: Path, dst: str = "."): Copies a local directory into the image.
    from flytekit.core.image import Image
    from pathlib import Path

    Path("my_code").mkdir(exist_ok=True)
    Path("my_code/file.txt").write_text("hello")
    my_image = Image.from_debian_base().with_source_folder(Path("./my_code"), "/app")
    Path("my_code/file.txt").unlink()
    Path("my_code").rmdir() # Clean up
  • with_source_file(src: Path, dst: str = "."): Copies a local file into the image.
    from flytekit.core.image import Image
    from pathlib import Path

    Path("config.yaml").write_text("key: value")
    my_image = Image.from_debian_base().with_source_file(Path("./config.yaml"), "/etc/app/config.yaml")
    Path("config.yaml").unlink() # Clean up
  • with_commands(commands: List[str], ...): Executes arbitrary shell commands during the image build. Pass each command as a plain shell string; do not prefix it with the Dockerfile RUN keyword.
    from flytekit.core.image import Image

    my_image = Image.from_debian_base().with_commands(["mkdir -p /data", "chmod 777 /data"])
  • with_workdir(workdir: str): Sets the working directory for subsequent commands in the image.
    from flytekit.core.image import Image

    my_image = Image.from_debian_base().with_workdir("/app")
  • with_dockerignore(path: Path): Specifies a .dockerignore file to exclude files during context copying.
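
As a rough picture of what excluding files during context copying means, the sketch below filters a list of relative paths against .dockerignore-style patterns. It is a deliberately simplified approximation built on fnmatch: it ignores "!" negation and "**", and its "*" also matches across "/", so it does not reproduce Docker's exact matching rules or the library's behavior. The function names are hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable, List


def parse_ignore_file(text: str) -> List[str]:
    """Return the non-empty, non-comment patterns from .dockerignore-style text."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def filter_context(paths: Iterable[str], patterns: List[str]) -> List[str]:
    """Keep the paths that match no pattern (simplified matching, see lead-in)."""
    normalized = [p.rstrip("/") for p in patterns]
    kept = []
    for path in paths:
        parts = PurePosixPath(path).parts
        # Test the full path and every leading directory, so that ignoring
        # a directory also ignores everything beneath it.
        prefixes = ["/".join(parts[: i + 1]) for i in range(len(parts))]
        ignored = any(
            fnmatchcase(prefix, pattern)
            for prefix in prefixes
            for pattern in normalized
        )
        if not ignored:
            kept.append(path)
    return kept


patterns = parse_ignore_file("# build artifacts\n__pycache__/\n*.pyc\n\n.git\n")
context = filter_context(
    ["main.py", "__pycache__/main.cpython-310.pyc", ".git/config", "data/input.csv", "old.pyc"],
    patterns,
)
```

Excluding caches and version-control metadata this way keeps the build context small and avoids copying files that would change the image contents for no reason.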

Image Identification and URI:

  • identifier: A cached property that returns a unique hash representing the image's definition (base image + all layers). This is crucial for caching and ensuring consistent image builds.
  • uri: A cached property that constructs the full image URI (<registry>/<name>:<tag>). The tag is either explicitly set or derived from the image's hash digest.
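
The tag fallback described above can be sketched as a small function. build_uri is a hypothetical helper, not part of the Image class, and the handling of a missing registry (omitting the registry prefix) is an assumption.

```python
from typing import Optional


def build_uri(
    registry: Optional[str],
    name: str,
    identifier: str,
    tag: Optional[str] = None,
) -> str:
    """Compose <registry>/<name>:<tag>, using the identifier when no tag is set."""
    resolved_tag = tag if tag else identifier
    # Assumption: without a registry, the URI is just <name>:<tag>.
    repository = f"{registry}/{name}" if registry else name
    return f"{repository}:{resolved_tag}"


explicit = build_uri("my-org-registry", "my-python-app", "3f9a1c0de4b27a61", tag="v1.0")
derived = build_uri("my-org-registry", "my-python-app", "3f9a1c0de4b27a61")
```

Because the derived tag comes from the definition's hash, two identical definitions resolve to the same URI, and any change to a layer produces a new one.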

Example: Building a complete image

from flytekit.core.base_task import task
from flytekit.core.environment import Environment
from flytekit.core.image import Image
from flytekit.core.resources import Resources
from pathlib import Path

# Create a dummy requirements.txt and source file for the example
Path("requirements.txt").write_text("requests==2.28.1\n")
Path("my_script.py").write_text("import requests; print(requests.__version__)")

# Define a custom image
my_custom_image = (
    Image.from_debian_base(
        python_version=(3, 10),
        registry="my-company-registry",
        name="my-data-app",
        platform=("linux/amd64", "linux/arm64")  # Build for multiple architectures
    )
    .with_apt_packages("git")
    .with_requirements(Path("requirements.txt"))
    .with_source_file(Path("my_script.py"), "/app/my_script.py")
    .with_env_vars({"APP_VERSION": "1.0"})
    .with_commands(["echo 'Image build complete!'"])
)

# Use the custom image in an Environment
my_env_with_custom_image = Environment(
    name="data-processing-env",
    image=my_custom_image,
    resources=Resources(cpu="2", mem="4Gi")
)

@task(environment=my_env_with_custom_image)
def process_data(input_path: str) -> str:
    # This task will run in the environment defined by my_env_with_custom_image
    # It will have git, requests, my_script.py, and APP_VERSION env var
    import os
    import subprocess
    print(f"Running in environment: {os.getenv('APP_VERSION')}")
    subprocess.run(["python", "/app/my_script.py"])
    return f"Processed {input_path}"

# Clean up dummy files
Path("requirements.txt").unlink()
Path("my_script.py").unlink()

Customizing Pod Specifications

The PodTemplate class allows fine-grained control over the Kubernetes Pod specification that hosts your task's container. This is useful for advanced scenarios requiring specific node selectors, tolerations, volumes, or init containers.

Key Attributes of PodTemplate:

  • pod_spec: A V1PodSpec object from the Kubernetes client library. This is the core of the customization, allowing you to define almost any aspect of the Pod.
  • primary_container_name: Specifies the name of the main container within the pod_spec that will execute your task. Defaults to primary.
  • labels: A dictionary of labels to apply to the Pod.
  • annotations: A dictionary of annotations to apply to the Pod.

Integration with Environment:

You associate a PodTemplate with an Environment using the pod_template attribute.

Example: Customizing a Pod with PodTemplate

from flytekit.core.base_task import task
from flytekit.core.environment import Environment
from flytekit.core.pod_template import PodTemplate
from kubernetes.client import V1PodSpec, V1Container, V1ResourceRequirements

# Define a PodTemplate with custom resource requests/limits and a node selector
custom_pod_spec = V1PodSpec(
    containers=[
        V1Container(
            name="primary",  # Must match primary_container_name
            image="my-registry/my-app:v1.0",  # This image will be overridden by Environment.image if specified
            resources=V1ResourceRequirements(
                requests={"cpu": "500m", "memory": "1Gi"},
                limits={"cpu": "1", "memory": "2Gi"}
            )
        )
    ],
    node_selector={"disktype": "ssd"}
)

my_pod_template = PodTemplate(
    pod_spec=custom_pod_spec,
    primary_container_name="primary",
    labels={"app": "my-special-app"},
    annotations={"owner": "data-team"}
)

# Use the PodTemplate in an Environment
env_with_custom_pod = Environment(
    name="high-perf-env",
    image="my-registry/my-app:v1.0",  # This image will be used for the primary container
    pod_template=my_pod_template
)

@task(environment=env_with_custom_pod)
def high_performance_task(data: int) -> int:
    # This task will run on a node with disktype: ssd
    # and have the specified resource requests/limits, labels, and annotations.
    return data * 2

Important Considerations:

  • When using PodTemplate within an Environment, the Environment.image will override the image specified in the V1Container within pod_spec for the primary_container_name. Other containers defined in pod_spec will retain their specified images.
  • Ensure the primary_container_name in PodTemplate matches the name of one of the containers in V1PodSpec.containers.
  • Direct manipulation of V1PodSpec requires familiarity with Kubernetes API objects.
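
The first two considerations can be summarized in a short sketch that uses plain dictionaries in place of V1Container objects. apply_environment_image is a hypothetical helper written only to show the override rule; it is not how the library resolves images internally, and raising ValueError on a name mismatch is an assumption.

```python
from typing import Dict, List


def apply_environment_image(
    containers: List[Dict[str, str]],
    primary_container_name: str,
    environment_image: str,
) -> List[Dict[str, str]]:
    """Return copies of the containers with the primary container's image replaced."""
    names = [container["name"] for container in containers]
    if primary_container_name not in names:
        raise ValueError(
            f"no container named {primary_container_name!r}; found {names}"
        )
    return [
        # Only the primary container takes the environment's image.
        {**container, "image": environment_image}
        if container["name"] == primary_container_name
        else dict(container)
        for container in containers
    ]


pod_containers = [
    {"name": "primary", "image": "ubuntu:latest"},
    {"name": "log-shipper", "image": "my-registry/log-shipper:v2"},
]
resolved = apply_environment_image(pod_containers, "primary", "my-registry/my-app:v1.0")
```

Here the primary container ends up with the environment's image while the hypothetical log-shipper sidecar keeps its own, and a mismatched primary_container_name is rejected early instead of leaving the task with no container to run in.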