fugue.extensions

fugue.extensions.context

class fugue.extensions.context.ExtensionContext[source]

Bases: object

Context variables that extensions can access. It’s also the base class of all extensions.

property callback: RPCClient

RPC client to talk to driver, this is for transformers only, and available on both driver and workers

property cursor: PartitionCursor

Cursor of the current logical partition, this is for transformers only, and only available on worker side

property execution_engine: ExecutionEngine

Execution engine for the current execution, this is only available on driver side

property has_callback: bool

Whether this transformer has callback

property key_schema: Schema

Partition keys schema, this is for transformers only, and available on both driver and workers

property output_schema: Schema

Output schema of the operation. This is accessible for all extensions ( if defined), and on both driver and workers

property params: ParamDict

Parameters set for using this extension.

Examples

>>> FugueWorkflow().df(...).transform(using=dummy, params={"a": 1})

You will get {"a": 1} as params in the dummy transformer

property partition_spec: PartitionSpec

Partition specification, this is for all extensions except for creators, and available on both driver and workers

property rpc_server: RPCServer

RPC client to talk to driver, this is for transformers only, and available on both driver and workers

validate_on_compile()[source]
Return type:

None

validate_on_runtime(data)[source]
Parameters:

data (DataFrame | DataFrames)

Return type:

None

property validation_rules: Dict[str, Any]

Extension input validation rules defined by user

property workflow_conf: ParamDict

Workflow level configs, this is accessible even in Transformer and CoTransformer

Examples

>>> dag = FugueWorkflow().df(...).transform(using=dummy)
>>> dag.run(NativeExecutionEngine(conf={"b": 10}))

You will get {"b": 10} as workflow_conf in the dummy transformer on both driver and workers.