Data Visualization and Rendering
Data Visualization and Rendering provides a set of utilities for converting various data types and content into HTML for display. This enables rich, interactive outputs within environments that support HTML rendering.
The core mechanism is the Renderable protocol, which defines a to_html method. Any object implementing this protocol can be converted into an HTML string, ensuring a consistent interface for rendering diverse content.
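As a rough illustration of the idea, the sketch below assumes the protocol is declared with typing.Protocol and takes the value to render as a single argument. The names RenderableSketch and UpperCaseRenderer are hypothetical, and the library's actual definition may differ.

```python
import html
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class RenderableSketch(Protocol):
    """Hypothetical stand-in for the Renderable protocol described above."""

    def to_html(self, python_value: Any) -> str:
        """Convert a Python value into an HTML string."""
        ...


class UpperCaseRenderer:
    """Toy renderer: satisfies the protocol structurally, without inheriting from it."""

    def to_html(self, python_value: Any) -> str:
        # Escape the text so characters such as '<' are displayed, not interpreted
        return f"<p>{html.escape(str(python_value).upper())}</p>"


renderer = UpperCaseRenderer()
# runtime_checkable only checks that a to_html attribute exists, not its signature
is_renderable = isinstance(renderer, RenderableSketch)
rendered = renderer.to_html("a<b")
```

Because the check is structural, any class with a to_html method qualifies, which is what allows renderers for unrelated data types to share one interface.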
Markdown Content Rendering
The MarkdownRenderer converts Markdown formatted strings into HTML. This is useful for displaying rich text content, documentation, or narrative alongside data.
from src.flyte.types._renderer import MarkdownRenderer
renderer = MarkdownRenderer()
markdown_text = "# Project Report\n\nThis report details the **key findings** from the latest analysis."
html_output = renderer.to_html(markdown_text)
print(html_output)
Python Source Code Highlighting
The SourceCodeRenderer transforms Python source code into syntax-highlighted HTML. This is ideal for presenting code snippets, function definitions, or entire scripts in a readable format. It applies a "colorful" style and ensures a white background for readability.
from src.flyte.types._renderer import SourceCodeRenderer
renderer = SourceCodeRenderer(title="Example Function")
code = """
def calculate_sum(a: int, b: int) -> int:
    \"\"\"Adds two integers and returns the result.\"\"\"
    return a + b
"""
html_output = renderer.to_html(code)
print(html_output)
Tabular Data Rendering
Pandas DataFrames
The TopFrameRenderer converts pandas.DataFrame objects into HTML tables. It supports limiting the number of rows and columns displayed, which is useful for managing the output size of large datasets.
import pandas as pd
from src.flyte.types._renderer import TopFrameRenderer
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [24, 27, 22], 'City': ['NY', 'LA', 'SF']}
df = pd.DataFrame(data)
renderer = TopFrameRenderer(max_rows=2, max_cols=2)
html_output = renderer.to_html(df)
print(html_output)
Arrow Tables
The ArrowRenderer converts pyarrow.Table objects into a string representation, which is then embedded as HTML.
Currently, this renderer uses pyarrow.Table.to_string(), which produces a plain-text representation of the table. That text is placed directly into the HTML output. Rich HTML table rendering of Arrow tables would require a custom renderer implementing the Renderable protocol.
import pyarrow as pa
from src.flyte.types._renderer import ArrowRenderer
table = pa.table({'id': [1, 2, 3], 'value': ['A', 'B', 'C']})
renderer = ArrowRenderer()
html_output = renderer.to_html(table)
print(html_output)
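One way such a custom renderer might look is sketched below. ColumnTableRenderer and its max_rows parameter are hypothetical and not part of the library. To stay free of third-party imports, the sketch accepts column-oriented data as a plain mapping of column name to values rather than a pyarrow.Table.

```python
import html
from typing import Any, Mapping, Optional, Sequence


class ColumnTableRenderer:
    """Hypothetical renderer: turns column-oriented data into an HTML table."""

    def __init__(self, max_rows: Optional[int] = None) -> None:
        self.max_rows = max_rows

    def to_html(self, columns: Mapping[str, Sequence[Any]]) -> str:
        names = list(columns)
        # Transpose the column sequences into row tuples
        rows = list(zip(*(columns[name] for name in names)))
        if self.max_rows is not None:
            rows = rows[: self.max_rows]
        # Escape every header and cell so data cannot inject markup
        header = "".join(f"<th>{html.escape(str(name))}</th>" for name in names)
        body = "".join(
            "<tr>" + "".join(f"<td>{html.escape(str(v))}</td>" for v in row) + "</tr>"
            for row in rows
        )
        return f"<table><thead><tr>{header}</tr></thead><tbody>{body}</tbody></table>"


table_html = ColumnTableRenderer(max_rows=2).to_html(
    {"id": [1, 2, 3], "value": ["A", "B", "C"]}
)
```

For an actual pyarrow.Table, the mapping could be obtained with table.to_pydict() before calling to_html.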
Python Environment Dependencies
The PythonDependencyRenderer generates an HTML page listing the Python packages installed in the current environment. It shows package names and versions and provides a button to copy the requirements.txt content. This is useful for debugging environment issues or documenting dependencies for reproducibility.
from src.flyte.types._renderer import PythonDependencyRenderer
renderer = PythonDependencyRenderer(title="Project Dependencies")
html_output = renderer.to_html()
print(html_output)
This renderer executes the pip list and pip freeze commands, which adds some overhead on each call and requires pip to be available in the execution environment.
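To make the pip requirement concrete, the sketch below shows one way to collect pip freeze output for the current interpreter while tolerating a missing pip. The function frozen_requirements and its timeout parameter are hypothetical; this is not the renderer's actual implementation.

```python
import subprocess
import sys
from typing import List


def frozen_requirements(timeout: float = 60.0) -> List[str]:
    """Return `pip freeze` lines for this interpreter, or [] if pip cannot be run."""
    try:
        result = subprocess.run(
            [sys.executable, "-m", "pip", "freeze"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,  # inspect the return code ourselves instead of raising
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        # Typically means pip is not installed for this interpreter
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]


requirements = frozen_requirements()
```

Running pip through sys.executable ties the result to the interpreter that is executing the code, rather than to whichever pip happens to be first on PATH.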
Extending Rendering Capabilities
To support rendering new data types or custom objects, implement the Renderable protocol. This involves defining a to_html method that accepts the object and returns its HTML representation as a string. This approach ensures consistency and allows for seamless integration with existing rendering mechanisms.
import html
from typing import Any
from src.flyte.types._renderer import Renderable

class CustomData:
    def __init__(self, label: str, value: Any):
        self.label = label
        self.value = value

class CustomDataRenderer(Renderable):
    def to_html(self, python_value: CustomData) -> str:
        if not isinstance(python_value, CustomData):
            raise TypeError("Expected CustomData object")
        # Escape both fields so markup in the data is displayed as text, not interpreted
        label = html.escape(str(python_value.label))
        value = html.escape(str(python_value.value))
        return f"<div><strong>{label}:</strong> {value}</div>"

# Usage of a custom renderer
custom_obj = CustomData("Status", "Completed Successfully")
renderer = CustomDataRenderer()
html_output = renderer.to_html(custom_obj)
print(html_output)
Key Considerations
- Security: When rendering arbitrary user-provided content (especially Markdown), be mindful of Cross-Site Scripting (XSS) vulnerabilities if the HTML is embedded directly into a web page without sanitization. Whether MarkdownIt passes raw HTML through depends on how it is configured, so do not rely on it alone; always consider the context of deployment.
- Performance: Generating HTML for very large datasets (e.g., millions of rows in a DataFrame) can be memory-intensive and slow. Use max_rows and max_cols with TopFrameRenderer to manage output size and prevent browser performance issues.
- Dependencies: Renderers like SourceCodeRenderer (Pygments), MarkdownRenderer (MarkdownIt), and PythonDependencyRenderer (subprocess, pip) rely on external libraries or system commands. Ensure these are available in the execution environment where rendering occurs.
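The dependency point can be checked up front. The sketch below uses a hypothetical helper, missing_modules, to report which importable modules are absent before any renderer is constructed. The names pygments, markdown_it, and pip are the import names of Pygments, markdown-it-py, and pip; that these are exactly what the renderers import is an assumption.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(module_names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed import names for the libraries mentioned above
missing = missing_modules(["pygments", "markdown_it", "pip"])
if missing:
    print("Rendering dependencies not found: " + ", ".join(missing))
```

A check like this fails fast with a readable message instead of raising ImportError in the middle of rendering.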