File

A generic class representing a file of a specified format. It provides both async and sync interfaces for file operations. Users handle all I/O themselves and create instances through the appropriate class methods, such as from_local, from_existing_remote, or new_remote.

Attributes

  • path: str

    • The path to the file (can be local or remote)
  • name: Optional[str] = None

    • Optional name for the file (defaults to basename of path)
  • format: str = ""

    • The format of the file.
  • hash: Optional[str] = None

    • Optional hash value of the file content.
  • hash_method: Optional[HashMethod] = None

    • The hashing method used for the file.

Constructors

  • Initializes a new instance of the File class.

  • Parameters

    • path: str
      • The path to the file (can be local or remote).
    • name: Optional[str]
      • Optional name for the file (defaults to the basename of path).
    • format: str
      • The format of the file (defaults to an empty string).
    • hash: Optional[str]
      • Optional hash value for the file.
    • hash_method: Optional[HashMethod]
      • Optional hash method to use for the file.

Methods

def pre_init(data: dict) -> any
  • Validator to set the file name if it's not provided.

  • Parameters

    • data: dict
      • The input data dictionary.
  • Return Value: any

    • The data with the name field set.
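
The defaulting behavior described above can be sketched in a few lines. This is a minimal standalone illustration, not the library's implementation: `pre_init_sketch` is a hypothetical name, and treating only a missing or None `name` as "not provided" is an assumption.

```python
import posixpath
from typing import Any


def pre_init_sketch(data: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `data` with `name` filled in from `path` when absent."""
    result = dict(data)  # copy so the caller's dict is not mutated
    if result.get("name") is None:
        # posixpath splits on "/", so it works for POSIX paths and
        # URL-style remote paths alike (assumed path shapes).
        result["name"] = posixpath.basename(result["path"])
    return result


print(pre_init_sketch({"path": "s3://bucket/reports/data.csv"})["name"])
print(pre_init_sketch({"path": "/tmp/a.txt", "name": "custom"})["name"])
```

An explicitly supplied name is left untouched; only the missing case falls back to the last path component.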
def schema_match(incoming: dict) -> bool
  • Checks if the schema of an incoming dictionary matches the current File schema.

  • Parameters

    • incoming: dict
      • The incoming dictionary to compare schemas with.
  • Return Value: bool

    • True if the schemas match, False otherwise.
def new_remote(hash_method: Optional[HashMethod | str] = None) -> File[T]
  • Creates a new File reference for a remote file that will be written to.

  • Parameters

    • hash_method: Optional[HashMethod | str]
      • Optional hash method or string to use for cache key determination.
  • Return Value: File[T]

    • A new File instance representing a remote file.
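
To make the idea of hashing content for a cache key concrete, here is a rough sketch of a writer that hashes bytes as they are written. `HashingWriterSketch` is a hypothetical stand-in built on the standard library's hashlib; it is not the library's HashingWriter, and the choice of SHA-256 is an assumption.

```python
import hashlib
import io
from typing import BinaryIO


class HashingWriterSketch:
    """Hypothetical stand-in: forwards writes to a handle and hashes the bytes."""

    def __init__(self, handle: BinaryIO, algorithm: str = "sha256") -> None:
        self._handle = handle
        self._hasher = hashlib.new(algorithm)

    def write(self, data: bytes) -> int:
        # Update the running digest before passing the bytes through.
        self._hasher.update(data)
        return self._handle.write(data)

    def hexdigest(self) -> str:
        return self._hasher.hexdigest()


buffer = io.BytesIO()
writer = HashingWriterSketch(buffer)
writer.write(b"hello ")
writer.write(b"world")
# The incremental digest equals the digest of the full payload.
assert writer.hexdigest() == hashlib.sha256(b"hello world").hexdigest()
```

Hashing during the write avoids a second pass over the data to compute the key afterwards.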
def from_existing_remote(remote_path: str, file_cache_key: Optional[str] = None) -> File[T]
  • Creates a File reference from an existing remote file.

  • Parameters

    • remote_path: str
      • The remote path to the existing file.
    • file_cache_key: Optional[str]
      • Optional hash value to use for discovery purposes.
  • Return Value: File[T]

    • A File instance representing an existing remote file.
def open(mode: str = 'rb', block_size: Optional[int] = None, cache_type: str = 'readahead', cache_options: Optional[dict] = None, compression: Optional[str] = None) -> AsyncGenerator[Union[IO[Any], HashingWriter], None]
  • Asynchronously opens the file and returns an async file-like object.

  • Parameters

    • mode: str
      • The mode to open the file in.
    • block_size: Optional[int]
      • Size of blocks for reading (bytes).
    • cache_type: str
      • Caching mechanism to use.
    • cache_options: Optional[dict]
      • Dictionary of options for the cache.
    • compression: Optional[str]
      • Compression format or None for auto-detection.
  • Return Value: AsyncGenerator[Union[IO[Any], HashingWriter], None]

    • An async file-like object.
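
The AsyncGenerator return type suggests the method is meant to be used as an async context manager; that reading is an inference from the signature, not something this page states. The sketch below shows the calling pattern with a hypothetical stand-in, `open_sketch`, that opens a local file using only the standard library.

```python
import asyncio
import os
import tempfile
from contextlib import asynccontextmanager
from typing import IO, Any, AsyncGenerator


@asynccontextmanager
async def open_sketch(path: str, mode: str = "rb") -> AsyncGenerator[IO[Any], None]:
    """Hypothetical stand-in for an async open(): yields a handle, then closes it."""
    handle = open(path, mode)
    try:
        yield handle
    finally:
        handle.close()  # runs even if the body of `async with` raises


async def read_all(path: str) -> bytes:
    async with open_sketch(path, "rb") as fh:
        return fh.read()


def demo() -> bytes:
    # Write a small temporary file, then read it back through the sketch.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as fh:
            fh.write(b"payload")
        return asyncio.run(read_all(path))
```

The try/finally inside the generator is what guarantees the handle is released when the `async with` block exits.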
def exists_sync() -> bool
  • Synchronously checks if the file exists.

  • Return Value: bool

    • True if the file exists, False otherwise.
def open_sync(mode: str = 'rb', block_size: Optional[int] = None, cache_type: str = 'readahead', cache_options: Optional[dict] = None, compression: Optional[str] = None) -> Generator[IO[Any]]
  • Synchronously opens the file and returns a file-like object.

  • Parameters

    • mode: str
      • The mode to open the file in.
    • block_size: Optional[int]
      • Size of blocks for reading (bytes).
    • cache_type: str
      • Caching mechanism to use.
    • cache_options: Optional[dict]
      • Dictionary of options for the cache.
    • compression: Optional[str]
      • Compression format or None for auto-detection.
  • Return Value: Generator[IO[Any]]

    • A file-like object.
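
A synchronous counterpart can be sketched the same way, here combined with chunked reading. `open_sync_sketch` and `read_chunks` are hypothetical names, and `chunk_size` is a caller-side read size chosen for illustration; it is not the `block_size` parameter documented above.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import IO, Any, Generator


@contextmanager
def open_sync_sketch(path: str, mode: str = "rb") -> Generator[IO[Any], None, None]:
    """Hypothetical stand-in for a sync open: yields a handle, then closes it."""
    handle = open(path, mode)
    try:
        yield handle
    finally:
        handle.close()


def read_chunks(path: str, chunk_size: int = 4) -> list[bytes]:
    """Read the file in fixed-size chunks instead of loading it all at once."""
    chunks = []
    with open_sync_sketch(path) as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:  # empty bytes signals end of file
                break
            chunks.append(chunk)
    return chunks


def demo_chunks() -> list[bytes]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as fh:
            fh.write(b"abcdefghij")
        return read_chunks(path, chunk_size=4)
```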
def download(local_path: Optional[Union[str, Path]] = None) -> str
  • Asynchronously downloads the file to a local path.

  • Parameters

    • local_path: Optional[Union[str, Path]]
      • The local path to download the file to. If None, a temporary directory will be used.
  • Return Value: str

    • The path to the downloaded file.
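
The fallback to a temporary directory might look roughly like the following. `resolve_download_target` is a hypothetical helper, and keeping the remote file's basename inside the temporary directory is an assumption; the page only says that a temporary directory is used when local_path is None.

```python
import posixpath
import tempfile
from pathlib import Path
from typing import Optional, Union


def resolve_download_target(
    remote_path: str, local_path: Optional[Union[str, Path]] = None
) -> str:
    """Pick where a download should land on the local filesystem."""
    if local_path is not None:
        return str(local_path)
    # No destination given: create a fresh temporary directory and
    # reuse the remote file's basename inside it (assumed naming).
    tmp_dir = tempfile.mkdtemp()
    return str(Path(tmp_dir) / posixpath.basename(remote_path))
```

Returning a string in both branches mirrors the documented `str` return type of download.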
def from_local(local_path: Union[str, Path], remote_destination: Optional[str] = None, hash_method: Optional[HashMethod | str] = None) -> File[T]
  • Creates a new File object from a local file that will be uploaded to the configured remote store.

  • Parameters

    • local_path: Union[str, Path]
      • Path to the local file.
    • remote_destination: Optional[str]
      • Optional path to store the file remotely. If None, a path will be generated.
    • hash_method: Optional[HashMethod | str]
      • Hash method or string for cache key determination.
  • Return Value: File[T]

    • A new File instance pointing to the uploaded file.