pyiron_workflow.nodes.for_loop module

class pyiron_workflow.nodes.for_loop.For(*args, label: str | None = None, parent: Composite | None = None, autoload: Literal['h5bag', 'pickle'] | StorageInterface | None = None, delete_existing_savefiles: bool = False, autorun: bool = False, checkpoint: Literal['h5bag', 'pickle'] | StorageInterface | None = None, strict_naming: bool = True, body_node_executor: Executor | tuple[Callable[[...], Executor], tuple, dict] | None = None, **kwargs)

Bases: Composite, StaticNode, ABC

Specifies fixed fields of some other node class to iterate over, but allows the length of looped input to vary by dynamically destroying and recreating (most of) its subgraph at run-time.

Collects the looped output and collates it with the looped input values in a dataframe.

The body_node_executor gets applied to each body node instance on each run.

property ncols: int | None

property nrows: int | None

classmethod output_column_map() → dict[str, str]

How to transform body node output labels to dataframe column names.

exception pyiron_workflow.nodes.for_loop.MapsToNonexistentOutputError

Bases: ValueError

When a for-node tries to map body node output channels that don’t exist.

exception pyiron_workflow.nodes.for_loop.UnmappedConflictError

Bases: ValueError

When a for-node gets a body whose output label conflicts with a looped input label and no map was provided to avoid this.

pyiron_workflow.nodes.for_loop.dictionary_to_index_maps(data: dict, nested_keys: list[str] | tuple[str, ...] | None = None, zipped_keys: list[str] | tuple[str, ...] | None = None)

Given a dictionary where some data is iterable, and list(s) of keys over which to make a nested and/or zipped loop, return dictionaries mapping these keys to all the indices of the data they hold. Zipped loops are nested outside the nesting loops.

Parameters:

data (dict) – The dictionary of data, some of which must be iterable.
nested_keys (list[str] | tuple[str, ...] | None) – The keys whose data to make a nested for-loop over.
zipped_keys (list[str] | tuple[str, ...] | None) – The keys whose data to make a zipped for-loop over.

Returns:

A tuple of dictionaries, each of which maps the iterated dictionary keys to an index into that key’s value.

Return type:

(tuple[dict[…, int], …])

Raises:

(KeyError) – If any of the provided keys are not keys of the provided dictionary.
(TypeError) – If any of the data held in a provided key cannot be operated on with len.
(ValueError) – If neither set of keys to iterate on is provided, or if all values being iterated over have a length of zero.
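The nesting and zipping described above can be illustrated with a minimal pure-Python sketch. The name sketch_index_maps is hypothetical and this is not the library's implementation; it only mirrors the documented behaviour, and it assumes that a zipped loop stops at its shortest member (which is consistent with the 16-row example further down this page).

```python
from itertools import product


def sketch_index_maps(data, nested_keys=None, zipped_keys=None):
    """Hypothetical stand-in that mimics the documented behaviour."""
    nested_keys = tuple(nested_keys or ())
    zipped_keys = tuple(zipped_keys or ())
    if not nested_keys and not zipped_keys:
        raise ValueError("Provide nested_keys and/or zipped_keys.")

    # data[key] raises KeyError for unknown keys; len() raises TypeError
    # for values that have no length.
    nested_lengths = [len(data[key]) for key in nested_keys]
    zipped_lengths = [len(data[key]) for key in zipped_keys]

    # Assumption: the zipped loop is truncated to its shortest member.
    zipped_range = range(min(zipped_lengths)) if zipped_keys else range(1)

    maps = []
    for zip_index in zipped_range:  # the zipped loop is the outer loop
        for nested_indices in product(*(range(n) for n in nested_lengths)):
            index_map = dict(zip(nested_keys, nested_indices))
            index_map.update({key: zip_index for key in zipped_keys})
            maps.append(index_map)

    # Simplification: complain whenever there is nothing to iterate over.
    if not maps:
        raise ValueError("Nothing to iterate over.")
    return tuple(maps)


data = {"a": [1, 2], "b": [3, 4, 5, 6], "c": [7, 8], "d": [9, 10, 11], "e": "e"}
maps = sketch_index_maps(data, nested_keys=("a", "b"), zipped_keys=("c", "d"))
print(len(maps))  # 2 * 4 nested combinations times 2 zipped steps
print(maps[0])
```

Each returned dictionary selects one element per iterated key, so a caller can build the inputs for one loop step with data[key][index].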

pyiron_workflow.nodes.for_loop.for_node(body_node_class: type[StaticNode], *node_args, iter_on: tuple[str, ...] | str = (), zip_on: tuple[str, ...] | str = (), output_as_dataframe: bool = True, output_column_map: dict[str, str] | None = None, use_cache: bool = True, **node_kwargs)

Makes a new For node which internally creates instances of the body_node_class and loops input onto them in nested and/or zipped loop(s).

Output is a single channel, “df”, which holds a pandas.DataFrame whose rows couple (looped) input to their respective body node outputs.

The internal node structure gets re-created each run, so the same inputs must consistently be iterated over, but their lengths can change freely.

An executor can be applied to all body node instances at run-time by assigning it to the body_node_executor attribute of the for-node.

Parameters:

body_node_class (type[StaticNode]) – The class of node to loop on.
*node_args – Regular positional node arguments.
iter_on (tuple[str, ...] | str) – Input label(s) in the body_node_class to nested-loop on.
zip_on (tuple[str, ...] | str) – Input label(s) in the body_node_class to zip-loop on.
output_as_dataframe (bool) – Whether to package the output (and iterated input) as a dataframe, or leave them as individual lists. (Default is True, package as dataframe.)
output_column_map (dict[str, str] | None) – A map for generating dataframe column names (values) from body node output channel labels (keys). Necessary iff the body node has the same label for an output channel and an input channel being looped over. (Default is None, just use the output channel labels as column names.)
use_cache (bool) – Whether this node should default to caching its values. (Default is True.)
**node_kwargs – Regular keyword node arguments.

Returns:

An instance of a dynamically-subclassed For node.

Return type:

(For)

Examples

>>> from pyiron_workflow import Workflow
>>>
>>> @Workflow.wrap.as_function_node("together")
... def FiveTogether(a: int, b: int, c: int, d: int, e: str = "foobar"):
...     return (a, b, c, d, e),
>>>
>>> for_instance = Workflow.create.for_node(
...     FiveTogether,
...     iter_on=("a", "b"),
...     zip_on=("c", "d"),
...     a=[1, 2],
...     b=[3, 4, 5, 6],
...     c=[7, 8],
...     d=[9, 10, 11],
...     e="e"
... )
>>>
>>> out = for_instance()
>>> type(out.df)
<class 'pandas...DataFrame'>

Internally, the loop node has made a bunch of body nodes, as well as nodes to index and collect data:

>>> len(for_instance)
48

We get one dataframe row for each possible combination of looped input:

>>> len(out.df)
16

We are stuck iterating on the fields we defined, but we can change the length of the input and the loop node’s body will get reconstructed at run-time to accommodate this:

>>> out = for_instance(a=[1], b=[3], d=[7])
>>> len(for_instance), len(out)
(12, 1)

Note that if we had simply returned each input individually, without any output labels on the node, we’d need to specify a map on the for-node so that the (looped) input and output columns on the resulting dataframe are all unique:

>>> @Workflow.wrap.as_function_node
... def FiveApart(a: int, b: int, c: int, d: int, e: str = "foobar"):
...     return a, b, c, d, e,
>>>
>>> for_instance = Workflow.create.for_node(
...     FiveApart,
...     iter_on=("a", "b"),
...     zip_on=("c", "d"),
...     a=[1, 2],
...     b=[3, 4, 5, 6],
...     c=[7, 8],
...     d=[9, 10, 11],
...     e="e",
...     output_column_map={
...         "a": "out_a",
...         "b": "out_b",
...         "c": "out_c",
...         "d": "out_d"
...     }
... )
>>>
>>> out = for_instance()
>>> out.df.columns  # doctest: +ELLIPSIS
Index(['a', 'b', 'c', 'd', 'out_a', 'out_b', 'out_c', 'out_d', 'e'], dtype='...')
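The column naming in this last example can be sketched in plain Python. The helper sketch_column_names is hypothetical and is not part of pyiron_workflow; it assumes that the body outputs of FiveApart are labelled a, b, c, d, e, and it raises a plain ValueError where the library documents the ValueError subclasses MapsToNonexistentOutputError and UnmappedConflictError.

```python
def sketch_column_names(looped_inputs, body_outputs, output_column_map=None):
    """Hypothetical helper: column names for looped input plus body output."""
    output_column_map = output_column_map or {}

    # A map key must name an output channel that the body actually has.
    missing = set(output_column_map) - set(body_outputs)
    if missing:
        raise ValueError(f"Mapped outputs do not exist: {sorted(missing)}")

    # Rename mapped outputs; unmapped outputs keep their channel label.
    output_columns = [output_column_map.get(label, label) for label in body_outputs]

    # After mapping, no output column may share a name with a looped input.
    conflicts = set(looped_inputs) & set(output_columns)
    if conflicts:
        raise ValueError(f"Unmapped conflicts: {sorted(conflicts)}")

    return list(looped_inputs) + output_columns


looped = ("a", "b", "c", "d")          # iter_on + zip_on labels
outputs = ("a", "b", "c", "d", "e")    # assumed output labels of FiveApart
column_map = {"a": "out_a", "b": "out_b", "c": "out_c", "d": "out_d"}
columns = sketch_column_names(looped, outputs, column_map)
print(columns)
```

The output e needs no entry in the map because the input e is passed as a constant rather than looped over, so it never becomes an input column.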