pyiron_workflow.node module
A base class for objects that can form nodes in the graph representation of a computational workflow.
The workhorse class for the entire concept.
- exception pyiron_workflow.node.AmbiguousOutputError[source]
Bases:
ValueError
Raised when searching for exactly one output, but multiple are found.
- exception pyiron_workflow.node.ConnectionCopyError[source]
Bases:
ValueError
Raised when trying to copy IO, but connections cannot be copied.
- class pyiron_workflow.node.Node(*args, label: str | None = None, parent: Composite | None = None, delete_existing_savefiles: bool = False, autoload: BackendIdentifier | StorageInterface | None = None, autorun: bool = False, checkpoint: BackendIdentifier | StorageInterface | None = None, **kwargs)[source]
Bases:
HasStateDisplay, Lexical[Composite], Runnable, InjectsOnChannel, ABC
Nodes are elements of a computational graph. They have inputs and outputs to interface with the wider world, and perform some operation. By connecting multiple nodes’ inputs and outputs together, computational graphs can be formed. These can be collected under a parent, such that new graphs can be composed of one or more sub-graphs.
This is an abstract class. Children must define how inputs and outputs are constructed, what will happen _on_run(), the run_args that will get passed to _on_run(), and how to process_run_result() once _on_run() finishes. They may optionally add additional signal channels to the signals IO.
- future
A futures object, if the node is currently running or has already run using an executor.
- Type:
concurrent.futures.Future | None
- label
A name for the node.
- Type:
str
- parent
The parent object owning this, if any.
- Type:
pyiron_workflow.composite.Composite | None
- recovery
The storage backend to use for saving a “recovery” file if the node execution crashes and this is the parent-most node. (Default is “pickle”; setting None will prevent any file from being saved.)
- Type:
BackendIdentifier | StorageInterface | None
- running
Whether the node has called run() and has not yet received output from this call. (Default is False.)
- Type:
bool
- checkpoint
Whether to trigger a save of the entire graph after each run of the node, and if so what storage back end to use. (Default is None, don’t do any checkpoint saving.)
- Type:
BackendIdentifier | StorageInterface | None
- use_cache
Whether or not to cache the inputs and, when the current inputs match the cached input (by == comparison), to bypass running the node and simply continue using the existing outputs. Note that you may be able to trigger a false cache hit in some special case of non-idempotent nodes working on mutable data.
- Type:
bool
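The caching behaviour described for use_cache can be illustrated with a small stand-alone sketch. It does not use pyiron_workflow: CachingRunner and append_and_count are hypothetical names, and the assumption that the cache keeps references to the inputs (rather than copies) is made here only to show how a non-idempotent function working on mutable data could produce a false cache hit.

```python
from typing import Any, Callable, Optional


class CachingRunner:
    """Hypothetical stand-in for a node; not the pyiron_workflow implementation."""

    def __init__(self, function: Callable[..., Any]):
        self.function = function
        self.cached_inputs: Optional[dict] = None
        self.output: Any = None
        self.run_count = 0

    def run(self, **inputs: Any) -> Any:
        # Cache hit: the current inputs compare equal (==) to the stored ones,
        # so the function is bypassed and the existing output is reused.
        if self.cached_inputs is not None and inputs == self.cached_inputs:
            return self.output
        self.run_count += 1
        self.output = self.function(**inputs)
        # Assumption: only references are stored, no deep copy is made.
        self.cached_inputs = dict(inputs)
        return self.output


def append_and_count(values: list) -> int:
    """Non-idempotent: mutates its input in place on every call."""
    values.append(0)
    return len(values)


runner = CachingRunner(append_and_count)
data = [1, 2, 3]
first = runner.run(values=data)   # really runs; data is now [1, 2, 3, 0]
second = runner.run(values=data)  # compares equal to the cache, so it is skipped
```

Because the cached input is the same (already mutated) list, the second call is treated as a hit and returns the old result, even though actually running the function again would have appended another element.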
- property cache_hit: bool
- property channel: OutputDataWithInjection
The single output channel. Fulfills the interface expectations for the HasChannel mixin and allows this object to be used directly for forming connections, etc.
- Returns:
The single output channel.
- Return type:
OutputDataWithInjection
- Raises:
AmbiguousOutputError – If there is not exactly one output channel.
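The “exactly one output” rule can be sketched as follows. This is a minimal illustration using a plain dictionary in place of the node’s outputs; the local AmbiguousOutputError class only mirrors the documented exception name, and single_channel is a hypothetical helper, not part of the library.

```python
class AmbiguousOutputError(ValueError):
    """Local stand-in mirroring the exception name documented above."""


def single_channel(outputs: dict):
    """Return the only value in ``outputs``; raise if the count is not one."""
    if len(outputs) != 1:
        raise AmbiguousOutputError(
            f"Expected exactly one output channel, found {len(outputs)}: "
            f"{list(outputs)}"
        )
    return next(iter(outputs.values()))


only = single_channel({"result": 42})

try:
    single_channel({"x": 1, "y": 2})
    ambiguous = False
except AmbiguousOutputError:
    ambiguous = True
```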
- property color: str
A hex code color for use in drawing.
- property connected: bool
Whether _any_ of the IO (including signals) are connected.
- delete_storage(backend: BackendIdentifier | StorageInterface | None = None, only_requested: bool = False, filename: str | Path | None = None, *, delete_even_if_not_empty: bool = False, **kwargs)[source]
Remove save file(s).
- Parameters:
backend (str | StorageInterface) – The interface to use for serializing the node. (Default is “pickle”, which loads the standard pickling back end.)
only_requested (bool) – Whether to _only_ search for files using the specified backend, or to loop through all available backends. (Default is False, try to remove whatever you can find.)
filename (str | Path | None) – The name of the file (without extensions) to remove. (Default is None, which uses the node’s lexical path.)
delete_even_if_not_empty (bool) – Whether to delete the file even if it is not empty. (Default is False, which will only delete the file if it is empty, i.e. has no content in it.)
**kwargs – Back end-specific arguments (only likely to work in combination with only_requested, otherwise there’s nothing to be specific _to_).
- disconnect() → list[tuple[Channel, Channel]][source]
Disconnect all connections belonging to inputs, outputs, and signals channels.
- display_state(state=None, ignore_private=True)[source]
A dictionary of JSON-compatible objects based on the object state (plus whatever modifications to the state the class designer has chosen to make).
Anything that fails to dump to JSON gets cast as a string and then dumped.
- Parameters:
state (dict|None) – The starting state. Default is None which uses __getstate__, but available in case child classes want to first sanitize the state values.
ignore_private (bool) – Whether to ignore or include any state element whose key starts with ‘_’. Default is True, only show public data.
- Return type:
dict
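The “dump to JSON or fall back to a string” rule can be sketched with the standard library. This is an illustration under stated assumptions, not the HasStateDisplay implementation: to_display_state is a hypothetical free function that takes the state explicitly instead of calling __getstate__.

```python
import json


def to_display_state(state: dict, ignore_private: bool = True) -> dict:
    """Build a JSON-compatible view of ``state``."""
    shown = {}
    for key, value in state.items():
        if ignore_private and key.startswith("_"):
            continue  # skip private entries unless explicitly requested
        try:
            json.dumps(value)  # probe: can this value be serialized as-is?
            shown[key] = value
        except (TypeError, ValueError):
            shown[key] = str(value)  # anything that fails is cast to a string
    return shown


state = {"label": "my_node", "_secret": 1, "tags": {1}}
public_only = to_display_state(state)
everything = to_display_state(state, ignore_private=False)
```

Here the set under "tags" is not JSON-serializable, so it appears as its string form, while "_secret" is only included when ignore_private is False.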
- draw(depth: int = 1, rankdir: Literal['LR', 'TB'] = 'LR', size: tuple | None = None, save: bool = False, view: bool = False, directory: Path | str | None = None, filename: Path | str | None = None, format: str | None = None, cleanup: bool = True) → graphviz.graphs.Digraph[source]
Draw the node structure and return it as a graphviz object.
A selection of the graphviz.Graph.render() method options are exposed, and if view or filename is provided, this will be called before returning the graph. The graph file and rendered image will be stored in a directory based on the node’s lexical path, unless a directory is explicitly set. This is purely for convenience; since we directly return a graphviz object, you can instead use this to leverage the full power of graphviz.
- Parameters:
depth (int) – How deeply to decompose the representation of composite nodes to reveal their inner structure. (Default is 1, which will show owned nodes if _this_ is a composite node, but all children will be drawn at the level of showing their IO only.) A depth value greater than the max depth of the node will have no adverse side effects.
rankdir ("LR" | "TB") – Use left-right or top-bottom graphviz rankdir to orient the flow of the graph.
size (tuple[int | float, int | float] | None) – The size of the diagram, in inches; respects ratio by scaling until at least one dimension matches the requested size. (Default is None, automatically size.)
save (bool) – Render the graph image. (Default is False. When True, all other defaults will yield a PDF in the node’s working directory.)
view (bool) – graphviz.Graph.render argument, open the rendered result with the default application. (Default is False. When True, default values for the directory and filename are supplied by the node working directory and label.)
directory (Path|str|None) – graphviz.Graph.render argument, (sub)directory for source saving and rendering. (Default is None, which uses the node’s working directory.)
filename (Path|str) – graphviz.Graph.render argument, filename for saving the source. (Default is None, which uses the node label + “_graph”.)
format (str|None) – graphviz.Graph.render argument, the output format used for rendering (‘pdf’, ‘png’, etc.).
cleanup (bool) – graphviz.Graph.render argument, delete the source file after successful rendering. (Default is True – unlike graphviz.)
- Returns:
The resulting graph object.
- Return type:
(graphviz.graphs.Digraph)
- property emitting_channels: tuple[OutputSignal, ...]
- execute(*args, **kwargs)[source]
A shortcut for run() with particular flags.
Run the node with whatever input it currently has (or is given as kwargs here), run it on this python process, and don’t emit the ran signal afterwards.
Intended to be useful for debugging by just forcing the node to do its thing right here, right now, and as-is.
- property fully_connected: bool
Whether _all_ of the IO (including signals) are connected.
- property graph_path: str
The path of node labels from the graph root (parent-most node in this lexical path) down to this node.
- has_saved_content(backend: BackendIdentifier | StorageInterface | None = None, only_requested: bool = False, filename: str | Path | None = None, **kwargs)[source]
Whether any save files can be found at the canonical location for this node.
- Parameters:
backend (str | StorageInterface) – The interface to use for serializing the node. (Default is “pickle”, which loads the standard pickling back end.)
only_requested (bool) – Whether to _only_ search for files using the specified backend, or to loop through all available backends. (Default is False, try to find whatever you can.)
filename (str | Path | None) – The name of the file (without extensions) to look for. (Default is None, which uses the node’s lexical path.)
**kwargs – Back end-specific arguments (only likely to work in combination with only_requested, otherwise there’s nothing to be specific _to_).
- Returns:
Whether any save files were found.
- Return type:
bool
- property import_readiness_report
- property import_ready: bool
Checks whether importlib can find this node’s class, and if so whether the imported object matches the node’s type.
- Returns:
Whether the imported module and name of this node’s class match its type.
- Return type:
(bool)
- load(backend: BackendIdentifier | StorageInterface = 'pickle', only_requested=False, filename: str | Path | None = None, _node: Node | None = None, **kwargs)[source]
Loads a node from file and returns its instance.
- Parameters:
backend (str | StorageInterface) – The interface to use for deserializing the node. (Default is “pickle”, which loads the standard pickling back end.)
only_requested (bool) – Whether to _only_ try loading from the specified backend, or to loop through all available backends. (Default is False, try to load whatever you can find.)
filename (str | Path | None) – The name of the file (without extensions) from which to load the node. (Default is None, which uses the node’s lexical path.)
**kwargs – Back end-specific arguments (only likely to work in combination with only_requested, otherwise there’s nothing to be specific _to_).
- Raises:
FileNotFoundError – when nothing got loaded.
- abstract property outputs: OutputsWithInjection
- pull(*args, run_parent_trees_too=False, **kwargs)[source]
A shortcut for run() with particular flags.
Runs nodes upstream in the data graph, then runs this node without triggering any downstream runs. By default only runs sibling nodes, but can optionally require the parent node to pull in its own upstream runs (this is recursive up to the parent-most object).
- Parameters:
run_parent_trees_too (bool) – Whether to (recursively) require the parent to first pull.
- push(*args, **kwargs)[source]
Exactly like run() with all the same flags, _except_ it handles an edge case where you are trying to directly run the child node of a pyiron_workflow.workflow.Workflow before it has had any chance to configure its execution signals. _If_ the parent is a workflow set up to automate execution flow, it does that _first_, then runs as usual.
- property ready: bool
Whether the inputs are all ready and the node is neither already running nor already failed.
- run(*args, run_data_tree: bool = False, run_parent_trees_too: bool = False, fetch_input: bool = True, check_readiness: bool = True, raise_run_exceptions: bool = True, rerun: bool = False, emit_ran_signal: bool = True, **kwargs)[source]
The master method for running in a variety of ways. By default, whatever data is currently available in upstream nodes will be fetched, if the input all conforms to type hints then this node will be run (perhaps using an executor), and finally the ran signal will be emitted to trigger downstream runs.
If executor information is specified, execution happens on that process, a callback is registered, and a futures object is returned.
Input values can be updated at call time with kwargs, but this happens _first_ so any input updates that happen as a result of the computation graph will override these by default. If you really want to execute the node with a particular set of input, set it all manually and use execute (or run with carefully chosen flags).
- Parameters:
run_data_tree (bool) – Whether to first run all upstream nodes in the data graph. (Default is False.)
run_parent_trees_too (bool) – Whether to recursively run the data tree in parent nodes (if any). (Default is False.)
fetch_input (bool) – Whether to first update inputs with the highest-priority connections holding data (i.e. the first valid connection; and the most recently formed connections appear first unless the connections list has been manually tampered with). (Default is True.)
check_readiness (bool) – Whether to raise an exception if the node is not ready to run after fetching new input. (Default is True.)
raise_run_exceptions (bool) – Whether to raise exceptions encountered during the run, or just ignore them. (Default is True, raise them!)
rerun (bool) – Whether to force-set running and failed to False before running. (Default is False.)
emit_ran_signal (bool) – Whether to fire off the output ran signal afterwards. (Default is True.)
**kwargs – Keyword arguments matching input channel labels; used to update the input channel values before running anything.
- Returns:
The result of running the node, or a futures object (if running on an executor).
- Return type:
(Any | Future)
Note
Running data trees is a pull-based paradigm and only compatible with graphs whose data forms a directed acyclic graph (DAG).
Note
Kwargs updating input channel values happens _first_ and will get overwritten by any subsequent graph-based data manipulation.
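The ordering described in the note above (kwargs first, fetched data second) can be sketched with plain dictionaries. The function resolve_inputs and its arguments are hypothetical and exist only for illustration; they are not part of the Node API.

```python
def resolve_inputs(current: dict, call_kwargs: dict, upstream: dict,
                   fetch_input: bool = True) -> dict:
    """Illustrate the order in which input values are applied before a run."""
    values = dict(current)
    values.update(call_kwargs)   # step 1: kwargs given at call time
    if fetch_input:
        values.update(upstream)  # step 2: connected data overrides step 1
    return values


current = {"x": 0, "y": 0}
with_fetch = resolve_inputs(current, {"x": 1, "y": 5}, upstream={"x": 2})
without_fetch = resolve_inputs(current, {"x": 1, "y": 5}, upstream={"x": 2},
                               fetch_input=False)
```

In this sketch the value passed for x survives only when fetching is switched off, which is why the text recommends setting input manually and using execute (or run with carefully chosen flags) when a particular set of input must be used.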
- run_data_tree(run_parent_trees_too=False) → None[source]
Use topological analysis to build a tree of all upstream dependencies and run them.
- Parameters:
run_parent_trees_too (bool) – First, call the same method on this node’s parent (if one exists), and recursively up the parentage tree. (Default is False, only run nodes in this scope, i.e. sharing the same parent.)
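The idea of ordering upstream dependencies topologically can be sketched with the standard library’s graphlib module. This is an illustration only: the library’s own topological analysis may differ, the graph is represented here as a hypothetical mapping from node label to the labels it depends on, and including the target itself at the end of the order is a choice made for the example.

```python
from graphlib import TopologicalSorter


def upstream_run_order(dependencies: dict, target: str) -> list:
    """Return ``target`` and everything upstream of it, dependencies first."""
    needed = set()
    stack = [target]
    while stack:  # walk upstream from the target, collecting labels
        label = stack.pop()
        if label in needed:
            continue
        needed.add(label)
        stack.extend(dependencies.get(label, ()))
    subgraph = {label: set(dependencies.get(label, ())) for label in needed}
    # static_order() yields predecessors before the nodes that depend on them
    # and raises graphlib.CycleError if the data graph is not a DAG.
    return list(TopologicalSorter(subgraph).static_order())


dependencies = {"a": set(), "b": {"a"}, "c": {"a", "b"}, "d": {"c"}}
order = upstream_run_order(dependencies, "c")
```

Node "d" is downstream of "c" and therefore does not appear in the order; a cyclic mapping raises an error instead of returning, matching the DAG restriction noted for data trees.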
- save(backend: BackendIdentifier | StorageInterface = 'pickle', filename: str | Path | None = None, **kwargs)[source]
Writes the node to file using the requested interface as a back end.
- Parameters:
backend (str | StorageInterface) – The interface to use for serializing the node. (Default is “pickle”, which loads the standard pickling back end.)
filename (str | Path | None) – The name of the file (without extensions) at which to save the node. (Default is None, which uses the node’s lexical path.)
**kwargs – Back end-specific keyword arguments.
HERE BE DRAGONS!!!
Warning
This almost certainly only fails for subclasses of Node that don’t override node_function or macro_creator directly, as these are expected to be part of the class itself (and thus already present on our instantiated object) and are never stored. Nodes created using the provided decorators should all work.
Warning
If you modify a Macro class in any way (changing its IO maps, rewiring internal connections, or replacing internal nodes), don’t expect saving/loading to work.
Warning
If the underlying source code has changed since saving (i.e. the node doing the loading does not use the same code as the node doing the saving, or the nodes in some node package have been modified), then all bets are off.
- save_checkpoint(backend: Literal['h5bag', 'pickle'] | StorageInterface = 'pickle')[source]
Triggers a save on the parent-most node.
- Parameters:
backend (str | StorageInterface) – The interface to use for serializing the node. (Default is “pickle”, which loads the standard pickling back end.)
- set_input_values(*args, **kwargs) → None[source]
Match keywords to input channels and update their values.
Throws a warning if a keyword is provided that cannot be found among the input keys.
- Parameters:
*args – values assigned to inputs in order of appearance.
**kwargs – input key - input value (including channels for connection) pairs.
- Raises:
(ValueError) – If more args are received than there are inputs available.
(ValueError) – If there is any overlap between channels receiving values from args and those from kwargs.
(ValueError) – If any of the kwargs keys do not match available input labels.
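The three error conditions listed above can be sketched as a stand-alone function. This is a minimal illustration assuming inputs are identified by an ordered list of labels; match_inputs is a hypothetical name and the function returns a dictionary rather than updating any channels.

```python
def match_inputs(input_labels, /, *args, **kwargs) -> dict:
    """Pair positional and keyword values with input labels."""
    if len(args) > len(input_labels):
        raise ValueError(
            f"Received {len(args)} args but only {len(input_labels)} inputs exist"
        )
    # Positional values are assigned to inputs in order of appearance.
    positional = dict(zip(input_labels, args))
    overlap = set(positional) & set(kwargs)
    if overlap:
        raise ValueError(f"Inputs given by both args and kwargs: {sorted(overlap)}")
    unknown = set(kwargs) - set(input_labels)
    if unknown:
        raise ValueError(f"No input with label(s): {sorted(unknown)}")
    return {**positional, **kwargs}


matched = match_inputs(["x", "y"], 1, y=2)
```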
- property signals: Signals
A container for input and output signals, which are channels for controlling execution flow. By default, has a signals.inputs.run channel which has a callback to the run() method that fires whenever _any_ of its connections sends a signal to it, a signals.inputs.accumulate_and_run channel which has a callback to the run() method but only fires after _all_ its connections send at least one signal to it, and signals.outputs.ran which gets called when the run method is finished.
Additional signal channels in derived classes can be added to signals.inputs and signals.outputs after this mixin class is initialized.
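The difference between firing on _any_ signal and firing only after _all_ connections have signalled can be sketched with two small classes. These are hypothetical stand-ins, not the library’s signal channels; in particular, clearing the accumulated set after firing is an assumption made for the example.

```python
class RunSignal:
    """Fires its callback every time any connection sends a signal."""

    def __init__(self, callback):
        self.callback = callback
        self.connections = []

    def connect(self, source_label):
        self.connections.append(source_label)

    def receive(self, source_label):
        self.callback()


class AccumulateAndRunSignal(RunSignal):
    """Fires only once every connection has sent at least one signal."""

    def __init__(self, callback):
        super().__init__(callback)
        self.received = set()

    def receive(self, source_label):
        self.received.add(source_label)
        if self.received >= set(self.connections):
            self.received.clear()  # assumption: start accumulating afresh
            self.callback()


fired = []
any_signal = RunSignal(lambda: fired.append("run"))
all_signal = AccumulateAndRunSignal(lambda: fired.append("accumulate_and_run"))
for signal in (any_signal, all_signal):
    signal.connect("upstream_a")
    signal.connect("upstream_b")

any_signal.receive("upstream_a")  # fires immediately
all_signal.receive("upstream_a")  # waits: upstream_b has not signalled yet
all_signal.receive("upstream_a")  # still waiting
all_signal.receive("upstream_b")  # every connection has now signalled
```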
- use_cache = True