fugue_ibis

fugue_ibis.extensions

fugue_ibis.extensions.as_fugue(expr, ibis_engine=None)

Convert a lazy ibis object to a Fugue workflow dataframe.

Parameters
  • expr (ibis.expr.types.TableExpr) – the actual instance should be LazyIbisObject

  • ibis_engine (Optional[Any]) –

Returns

the Fugue workflow dataframe

Return type

fugue.workflow.workflow.WorkflowDataFrame

Examples

# non-magical approach
from fugue import FugueWorkflow
from fugue_ibis import as_ibis, as_fugue

dag = FugueWorkflow()
df1 = dag.df([[0]], "a:int")
df2 = dag.df([[1]], "a:int")
idf1 = as_ibis(df1)
idf2 = as_ibis(df2)
idf3 = idf1.union(idf2)
result = idf3.mutate(b=idf3.a+1)
as_fugue(result).show()
dag.run()  # the workflow is lazy; nothing executes until run() is called

# magical approach
from fugue import FugueWorkflow
import fugue_ibis  # must import

dag = FugueWorkflow()
idf1 = dag.df([[0]], "a:int").as_ibis()
idf2 = dag.df([[1]], "a:int").as_ibis()
idf3 = idf1.union(idf2)
result = idf3.mutate(b=idf3.a+1).as_fugue()
result.show()
dag.run()  # the workflow is lazy; nothing executes until run() is called

Note

The magic is that importing fugue_ibis adds the functions as_ibis and as_fugue to the corresponding classes, so you can use them as if they were part of the original classes.

This idea is similar to monkey patching, a programming model that ibis uses heavily. Fugue provides it as an option.
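
The patching idea described in this note can be sketched in plain Python. The classes below are hypothetical stand-ins, not the real Fugue or ibis classes, and the real fugue_ibis implementation may attach its methods differently; the sketch only shows how a free function becomes usable as a method once it is assigned to a class.

```python
# Hypothetical stand-ins for illustration; NOT the real Fugue or ibis classes.
class WorkflowDataFrame:
    def __init__(self, name):
        self.name = name


class LazyTable:
    def __init__(self, source):
        self.source = source


def as_ibis(df):
    """Free function: wrap a workflow dataframe in a lazy table."""
    return LazyTable(df)


def as_fugue(expr):
    """Free function: recover the workflow dataframe behind a lazy table."""
    return expr.source


# The "magic": assigning the functions to the classes makes them callable
# as methods. In a real package this assignment would run at import time.
WorkflowDataFrame.as_ibis = as_ibis
LazyTable.as_fugue = as_fugue

df = WorkflowDataFrame("df1")
round_trip = df.as_ibis().as_fugue()  # method style, same as as_fugue(as_ibis(df))
```

Because the assignment only happens when the patching module is executed, the method style works only after that module has been imported, which is why the magical example needs the otherwise unused import.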

Note

The object returned by as_ibis is not really a TableExpr; it is a ‘super lazy’ object that will be translated into a TableExpr at runtime. This is because compiling an ibis execution graph requires the input schemas to be known, which is not always the case in Fugue. For example, if the previous step pivots a table, the output schema can only be known at runtime. So, in order to be a part of Fugue, we need to be able to construct ibis expressions before knowing the input schemas.

fugue_ibis.extensions.as_ibis(df)

Convert the Fugue workflow dataframe to an ibis table for ibis operations.

Parameters

df (fugue.workflow.workflow.WorkflowDataFrame) – the Fugue workflow dataframe

Returns

the object representing the ibis table

Return type

ibis.expr.types.TableExpr

Examples

# non-magical approach
from fugue import FugueWorkflow
from fugue_ibis import as_ibis, as_fugue

dag = FugueWorkflow()
df1 = dag.df([[0]], "a:int")
df2 = dag.df([[1]], "a:int")
idf1 = as_ibis(df1)
idf2 = as_ibis(df2)
idf3 = idf1.union(idf2)
result = idf3.mutate(b=idf3.a+1)
as_fugue(result).show()
dag.run()  # the workflow is lazy; nothing executes until run() is called

# magical approach
from fugue import FugueWorkflow
import fugue_ibis  # must import

dag = FugueWorkflow()
idf1 = dag.df([[0]], "a:int").as_ibis()
idf2 = dag.df([[1]], "a:int").as_ibis()
idf3 = idf1.union(idf2)
result = idf3.mutate(b=idf3.a+1).as_fugue()
result.show()
dag.run()  # the workflow is lazy; nothing executes until run() is called

Note

The magic is that importing fugue_ibis adds the functions as_ibis and as_fugue to the corresponding classes, so you can use them as if they were part of the original classes.

This idea is similar to monkey patching, a programming model that ibis uses heavily. Fugue provides it as an option.

Note

The returned object is not really a TableExpr; it is a ‘super lazy’ object that will be translated into a TableExpr at runtime. This is because compiling an ibis execution graph requires the input schemas to be known, which is not always the case in Fugue. For example, if the previous step pivots a table, the output schema can only be known at runtime. So, in order to be a part of Fugue, we need to be able to construct ibis expressions before knowing the input schemas.
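
One way to picture a ‘super lazy’ object is as a recorder that stores attribute accesses and calls, then replays them once a concrete table with a known schema exists. The sketch below is only an illustration of that idea under stated assumptions: LazyExpr, Table, and resolve are hypothetical names, and the actual LazyIbisObject may work quite differently.

```python
class LazyExpr:
    """Hypothetical recorder: stores operations, evaluates nothing."""

    def __init__(self, ops=()):
        self._ops = tuple(ops)

    def __getattr__(self, name):
        # Only called for attributes that are not found normally.
        if name.startswith("_"):
            raise AttributeError(name)
        return LazyExpr(self._ops + (("getattr", name),))

    def __call__(self, *args, **kwargs):
        return LazyExpr(self._ops + (("call", args, kwargs),))

    def resolve(self, root):
        """Replay the recorded operations against a concrete object."""
        obj = root
        for op in self._ops:
            if op[0] == "getattr":
                obj = getattr(obj, op[1])
            else:
                obj = obj(*op[1], **op[2])
        return obj


class Table:
    """Hypothetical concrete table whose schema is a list of column names."""

    def __init__(self, columns):
        self.columns = list(columns)

    def select(self, *names):
        missing = [n for n in names if n not in self.columns]
        if missing:
            raise KeyError(missing)
        return Table(names)

    def rename(self, **mapping):
        return Table([mapping.get(c, c) for c in self.columns])


# The plan is built before any schema is known ...
plan = LazyExpr().select("a", "b").rename(a="x")
# ... and is only checked against a schema when it is resolved.
result = plan.resolve(Table(["a", "b", "c"]))
```

In this sketch, a plan that refers to a missing column is still constructed without complaint; the error surfaces only in resolve, which mirrors the trade-off of deferring schema checks to runtime.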

fugue_ibis.extensions.run_ibis(ibis_func, ibis_engine=None, **dfs)

Run an ibis workflow wrapped in ibis_func

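Parameters
  • ibis_func – a function that takes an ibis backend and returns an ibis table expression

  • ibis_engine (Optional[Any]) –

  • dfs (fugue.workflow.workflow.WorkflowDataFrame) – workflow dataframes passed as keyword arguments; each keyword is the table name that ibis_func looks up on the backend

Returns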

the output workflow dataframe

Return type

fugue.workflow.workflow.WorkflowDataFrame

Examples

from fugue import FugueWorkflow
from fugue_ibis import run_ibis

def func(backend):
    t = backend.table("tb")
    return t.mutate(b=t.a+1)

dag = FugueWorkflow()
df = dag.df([[0]], "a:int")
result = run_ibis(func, tb=df)
result.show()
dag.run()  # the workflow is lazy; nothing executes until run() is called
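
In the example above, the keyword tb passed to run_ibis matches the name given to backend.table inside func. The calling convention this suggests can be sketched without Fugue or ibis installed. FakeBackend and run_with_backend are hypothetical names invented for this illustration, rows are plain dictionaries rather than dataframes, and nothing here reflects how run_ibis is actually implemented.

```python
class FakeBackend:
    """Hypothetical backend that resolves table names to registered data."""

    def __init__(self, tables):
        self._tables = dict(tables)

    def table(self, name):
        return self._tables[name]


def run_with_backend(func, **dfs):
    """Hypothetical helper: each keyword name becomes a table name."""
    return func(FakeBackend(dfs))


def add_b(backend):
    # Same shape as the documented example: fetch "tb", then add b = a + 1.
    t = backend.table("tb")
    return [dict(row, b=row["a"] + 1) for row in t]


rows = run_with_backend(add_b, tb=[{"a": 0}])
```

Passing the data under a different keyword than the one the function looks up would fail at lookup time in this sketch, so the keyword and the table name have to agree.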