Decorator for creators

Please read Creator Tutorial


Parameters:

schema (Any | None)

Return type:

Callable[[Any], _FuncAsCreator]
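To make the decorator's shape concrete, here is a minimal plain-Python sketch of what a creator decorator can do: wrap a function into an object that carries a schema hint and exposes a create() method. The _FuncAsCreator stand-in below is an assumption for illustration only, not fugue's actual implementation.

```python
from typing import Any, Callable, Optional

def creator(schema: Optional[Any] = None) -> Callable:
    """Sketch of a creator decorator: returns a wrapper factory,
    mirroring the documented Callable[[Any], _FuncAsCreator] shape."""
    def deco(func: Callable):
        class _FuncAsCreator:  # simplified stand-in for fugue's wrapper class
            def __init__(self, f: Callable):
                self._func = f
                self.schema = schema  # schema hint carried with the function
            def create(self):
                # delegate to the wrapped function to produce the data
                return self._func()
        return _FuncAsCreator(func)
    return deco

@creator("a:int")
def make_rows():
    return [{"a": 1}, {"a": 2}]
```

After decoration, make_rows is no longer a bare function but an object bundling the function with its schema hint, which is the general idea behind turning a function into a creator extension.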

fugue.extensions.creator.convert.register_creator(alias, obj, on_dup=0)[source]

Register a creator with an alias. This is a simplified version of parse_creator().

Return type:

None

Registering an extension with an alias is particularly useful for projects such as libraries. By using an alias, users don’t have to import the specific extension or provide its full path. This makes user code less dependent on your internal module layout and easier to understand.
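The alias registry idea can be sketched in plain Python. The registry dictionary and the on_dup constants below (0 overwrite, 1 throw, 2 ignore) are assumptions for illustration; check fugue's own documentation for the actual on_dup semantics.

```python
from typing import Any, Dict

_REGISTRY: Dict[str, Any] = {}

# Assumed duplicate-handling policies for illustration
OVERWRITE, THROW, IGNORE = 0, 1, 2

def register_creator(alias: str, obj: Any, on_dup: int = 0) -> None:
    """Register obj under alias, resolving duplicates per on_dup."""
    if alias in _REGISTRY:
        if on_dup == THROW:
            raise KeyError(f"alias {alias!r} is already registered")
        if on_dup == IGNORE:
            return  # keep the existing registration
    _REGISTRY[alias] = obj  # default: overwrite

def lookup(alias: str) -> Any:
    """Resolve an alias back to the registered object."""
    return _REGISTRY[alias]
```

The key point is that users only ever see the alias string; the library controls which object the alias resolves to, so the implementation can move without breaking user code.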



See also

Please read Creator Tutorial


Here is an example of how to set up your project so your users can benefit from this feature. Assume your project name is pn.

The creator implementation in file pn/pn/creators.py

import pandas as pd

def my_creator() -> pd.DataFrame:
    return pd.DataFrame()

Then in pn/pn/__init__.py

from .creators import my_creator
from fugue import register_creator

def register_extensions():
    register_creator("mc", my_creator)
    # ... register more extensions


In the user’s code:

import pn  # register_extensions will be called
from fugue import FugueWorkflow

dag = FugueWorkflow()
dag.create("mc").show()  # use my_creator by alias


class fugue.extensions.creator.creator.Creator[source]

Bases: ExtensionContext, ABC

The interface for generating a single DataFrame from parameters. For example, reading data from a file should be a type of Creator. Creator is a task-level extension that runs on the driver and is execution-engine aware.

To implement this class, do not define __init__; implement the interface functions directly.


Before implementing this class, ask whether you really need this interface. Are you aware of Fugue’s interfaceless feature? Implementing Creator is commonly unnecessary; the interfaceless approach may decouple your code from Fugue.
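As a hedged sketch of the interfaceless approach: a creator can be just a typed function with no fugue import at all, which is exactly what keeps user code decoupled. Whether fugue accepts this particular return annotation depends on its conversion rules, so treat the example as illustrative.

```python
from typing import Dict, List

# Interfaceless creator: a plain typed function, no fugue dependency.
# This runs (and can be unit-tested) without fugue installed at all.
def create_people() -> List[Dict[str, str]]:
    return [{"name": "Ada"}, {"name": "Grace"}]
```

With fugue available, such a function could plausibly be passed to dag.create (with a schema hint where needed), while remaining an ordinary, framework-free function everywhere else.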

See also

Please read Creator Tutorial

abstract create()[source]

Create DataFrame on driver side


  • It runs on driver side

  • The output dataframe is not necessarily local, for example a SparkDataFrame

  • It is engine aware: you can put platform-dependent code in it (for example, native PySpark code), but doing so may make your code non-portable. If you only use the functions of the general ExecutionEngine interface, it remains portable.


Returns:

result dataframe

Return type:

DataFrame
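To show the shape of the interface, here is a minimal plain-Python sketch of a Creator subclass. The stand-in base class and the plain-list return value are simplifications: fugue's real Creator inherits ExtensionContext (which supplies params, the execution engine, and more) and create() returns a fugue DataFrame.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class Creator(ABC):
    """Simplified stand-in for fugue's Creator base class."""
    params: Dict[str, Any] = {}  # in fugue, provided by ExtensionContext

    @abstractmethod
    def create(self) -> Any:  # in fugue, returns a fugue DataFrame
        ...

class RangeCreator(Creator):
    """Creates rows 0..n-1, reading n from params as a fugue Creator would."""
    def create(self) -> List[Dict[str, int]]:
        n = self.params.get("n", 3)
        return [{"a": i} for i in range(n)]
```

The subclass implements only create() and has no __init__, matching the guidance above; all configuration arrives through the context (here, the params stand-in) rather than constructor arguments.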