fugue.extensions#

fugue.extensions.context#

class fugue.extensions.context.ExtensionContext[source]#

Bases: object

Context variables that extensions can access. It’s also the base class of all extensions.

property callback: RPCClient#

RPC client to talk to driver, this is for transformers only, and available on both driver and workers

property cursor: PartitionCursor#

Cursor of the current logical partition, this is for transformers only, and only available on worker side

property execution_engine: ExecutionEngine#

Execution engine for the current execution, this is only available on driver side

property has_callback: bool#

Whether this transformer has callback

property key_schema: Schema#

Partition keys schema, this is for transformers only, and available on both driver and workers

property output_schema: Schema#

Output schema of the operation. This is accessible for all extensions ( if defined), and on both driver and workers

property params: ParamDict#

Parameters set for using this extension.

Examples

>>> FugueWorkflow().df(...).transform(using=dummy, params={"a": 1})

You will get {"a": 1} as params in the dummy transformer

property partition_spec: PartitionSpec#

Partition specification, this is for all extensions except for creators, and available on both driver and workers

property rpc_server: RPCServer#

RPC client to talk to driver, this is for transformers only, and available on both driver and workers

validate_on_compile()[source]#
Return type

None

validate_on_runtime(data)[source]#
Parameters

data (Union[DataFrame, DataFrames]) –

Return type

None

property validation_rules: Dict[str, Any]#

Extension input validation rules defined by user

property workflow_conf: ParamDict#

Workflow level configs, this is accessible even in Transformer and CoTransformer

Examples

>>> dag = FugueWorkflow().df(...).transform(using=dummy)
>>> dag.run(NativeExecutionEngine(conf={"b": 10}))

You will get {"b": 10} as workflow_conf in the dummy transformer on both driver and workers.