The alkymi API
Decorators
- alkymi.decorators.foreach(mapped_inputs: alkymi.recipe.Recipe, ingredients=(), name: Optional[str] = None, transient: bool = False, cache: alkymi.config.CacheType = CacheType.Auto) Callable[[Callable[[...], alkymi.decorators.R]], alkymi.foreach_recipe.ForeachRecipe[alkymi.decorators.R]] [source]¶
Convert a function into an alkymi Recipe to enable caching and conditional evaluation
- Parameters
mapped_inputs – A single Recipe to whose output (a list or dictionary) the bound function will be applied to generate the new outputs (similar to Python’s built-in map() function)
ingredients – The dependencies of this Recipe - the outputs of these Recipes will be provided as arguments to the bound function when called in the order that they were provided. If not all arguments are provided directly, alkymi will look up recipes that match the name of arguments automatically
name – The name to assign to the created recipe - if not provided, the bound function’s name will be used
transient – Whether to always (re)evaluate the created Recipe
cache – The type of caching to use for this Recipe
- Returns
A callable that will yield the Recipe created from the bound function
- alkymi.decorators.recipe(ingredients=(), name: Optional[str] = None, transient: bool = False, cache: alkymi.config.CacheType = CacheType.Auto) Callable[[Callable[[...], alkymi.decorators.R]], alkymi.recipe.Recipe[alkymi.decorators.R]] [source]¶
Convert a function into an alkymi Recipe to enable caching and conditional evaluation
- Parameters
ingredients – The dependencies of this Recipe - the outputs of these Recipes will be provided as arguments to the bound function when called in the order that they were provided. If not all arguments are provided directly, alkymi will look up recipes that match the name of arguments automatically
name – The name to assign to the created recipe - if not provided, the bound function’s name will be used
transient – Whether to always (re)evaluate the created Recipe
cache – The type of caching to use for this Recipe
- Returns
A callable that will yield the Recipe created from the bound function
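The decorator pattern described above can be illustrated with a minimal, standard-library-only sketch. It is not alkymi's implementation: SketchRecipe and sketch_recipe are hypothetical names, results are cached in memory only, and no checksums are used, so changed inputs are not detected.

```python
from typing import Any, Callable, Optional, Tuple


class SketchRecipe:
    """Hypothetical stand-in for a recipe: binds a function and caches its result."""

    def __init__(self, func: Callable[..., Any], ingredients: Tuple["SketchRecipe", ...],
                 name: str, transient: bool) -> None:
        self.func = func
        self.ingredients = ingredients
        self.name = name
        self.transient = transient
        self.calls = 0  # number of times the bound function has been evaluated
        self._has_result = False
        self._result: Any = None

    def brew(self) -> Any:
        # Evaluate dependencies first; their outputs become positional arguments
        inputs = tuple(ingredient.brew() for ingredient in self.ingredients)
        # Only call the bound function if nothing is cached or the recipe is transient
        if self.transient or not self._has_result:
            self._result = self.func(*inputs)
            self._has_result = True
            self.calls += 1
        return self._result


def sketch_recipe(ingredients: Tuple[SketchRecipe, ...] = (), name: Optional[str] = None,
                  transient: bool = False) -> Callable[[Callable[..., Any]], SketchRecipe]:
    def decorator(func: Callable[..., Any]) -> SketchRecipe:
        # Fall back to the bound function's name if no name was given
        return SketchRecipe(func, tuple(ingredients), name or func.__name__, transient)
    return decorator


@sketch_recipe()
def numbers():
    return [1, 2, 3]


@sketch_recipe(ingredients=(numbers,))
def total(values):
    return sum(values)


print(total.brew())
```

Calling total.brew() a second time returns the stored value without invoking either bound function again.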
Recipe
- class alkymi.recipe.Recipe(func: Callable[[...], alkymi.recipe.R], ingredients: Iterable[alkymi.recipe.Recipe], name: str, transient: bool, cache: alkymi.config.CacheType, cleanliness_func: Optional[Callable[[alkymi.recipe.R], bool]] = None)[source]¶
Recipe is the basic building block of alkymi’s evaluation approach. It binds a function (provided by the user) that it then calls when asked to by alkymi’s execution engine. The result of the bound function evaluation can be automatically cached to disk to allow for checking of cleanliness (whether a Recipe is up-to-date), and to avoid invoking the bound function unless necessary on subsequent evaluations
- brew(*, jobs: int = 1, progress_type: Optional[alkymi.types.ProgressType] = None) alkymi.recipe.R [source]¶
Evaluate this Recipe and all dependent inputs - this will build the computational graph and execute any needed dependencies to produce the outputs of this Recipe
- Parameters
jobs – The number of jobs to use for evaluating this recipe in parallel. Defaults to 1 (no parallelism); zero or negative values will cause alkymi to use the system’s default number of jobs
progress_type – The method to use for showing progress. If None, the setting in alkymi’s config is used
- Returns
The outputs of this Recipe (which correspond to the outputs of the bound function)
- property cache: alkymi.config.CacheType¶
- Returns
The type of caching to use for this Recipe
- property function_hash: str¶
- Returns
The hash of the bound function as a string
- property ingredients: List[alkymi.recipe.Recipe]¶
- Returns
The dependencies of this Recipe - the outputs of these Recipes will be provided as arguments to the bound function when called (for a ForeachRecipe, they follow the item from the mapped_inputs sequence)
- property input_checksums: Optional[Tuple[Optional[str], ...]]¶
- Returns
The checksum(s) for the inputs
- property name: str¶
- Returns
The name of this Recipe
- property output_checksum: Optional[str]¶
- Returns
The computed checksums for the outputs (this is set when outputs are set)
- property outputs: Optional[alkymi.recipe.R]¶
- Returns
The outputs of this Recipe
- property outputs_valid: bool¶
Check whether an output is still valid - this is currently only used to check files that may have been deleted or altered outside alkymi’s cache. If no outputs have been produced yet, True will be returned.
- Returns
Whether all outputs are still valid
- restore_from_dict(old_state) None [source]¶
Restores the state of this Recipe from a previously cached state
- Parameters
old_state – The old cached state to restore
- set_result(outputs: alkymi.recipe.R, input_checksums: Tuple[Optional[str], ...]) None [source]¶
Stores the provided result in the recipe and caches it to disk if applicable
- Parameters
outputs – The outputs to store in the recipe
input_checksums – The checksums of the inputs that were used to calculate the outputs
- status() alkymi.types.Status [source]¶
- Returns
The status of this recipe (will evaluate all upstream dependencies)
- property transient: bool¶
- Returns
Whether to always (re)evaluate the created Recipe
ForeachRecipe
- class alkymi.foreach_recipe.ForeachRecipe(mapped_recipe: alkymi.recipe.Recipe, ingredients: Iterable[alkymi.recipe.Recipe], func: Callable[[...], alkymi.foreach_recipe.R], name: str, transient: bool, cache: alkymi.config.CacheType, cleanliness_func: Optional[Callable[[alkymi.recipe.R], bool]] = None)[source]¶
Special type of Recipe that applies its bound function to each input from a list or dictionary (similar to Python’s built-in map() function). Evaluations of the bound function are cached and used to avoid reevaluating previously seen inputs. This means that changing the inputs to a ForeachRecipe may only trigger reevaluation of the bound function for some inputs, avoiding the overhead of recomputing results that are already known
- property mapped_inputs: Optional[Union[List[Any], Dict[Any, Any]]]¶
- Returns
The sequence of inputs to apply the bound function to
- property mapped_inputs_checksum: Optional[str]¶
- Returns
The summary of the mapped inputs checksum
- property mapped_inputs_checksums: Optional[Union[List[Optional[str]], Dict[Any, Optional[str]]]]¶
- Returns
The computed checksums for the sequence of mapped inputs (this is set when mapped_inputs is set)
- property mapped_outputs_checksums: Optional[Union[List[Optional[str]], Dict[Any, Optional[str]]]]¶
- Returns
The computed checksums for the sequence of mapped outputs
- property output_checksum: Optional[str]¶
- Returns
The computed checksums for the outputs (this is set when outputs are set)
- property outputs: Optional[Union[Dict, List]]¶
- Returns
The outputs of this ForeachRecipe (None, or a list or dictionary of results)
- property outputs_valid: bool¶
Check whether an output is still valid - this is currently only used to check files that may have been deleted or altered outside alkymi’s cache. If no outputs have been produced yet, True will be returned.
- Returns
Whether all outputs are still valid
- restore_from_dict(old_state: Dict) None [source]¶
Restores the state of this ForeachRecipe from a previously cached state
- Parameters
old_state – The old cached state to restore
- set_current_result(evaluated: Union[List[Any], Dict[Any, Any]], outputs: Union[List[alkymi.serialization.Output], Dict[Any, alkymi.serialization.Output]], mapped_inputs_checksum: Optional[str], other_input_checksums: Tuple[Optional[str], ...], completed: bool) None [source]¶
Stores the provided results in the recipe and caches them to disk if applicable
- Parameters
evaluated – The inputs that were used to generate the provided outputs
outputs – The outputs to store in this recipe
mapped_inputs_checksum – The checksum of all mapped inputs
other_input_checksums – The checksums of other (non-mapped) inputs
completed – Bool indicating whether all mapped inputs have been processed
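The per-item caching behaviour of a ForeachRecipe can be sketched roughly as follows. This is an illustration under assumptions, not alkymi's code: ForeachSketch and item_checksum are hypothetical names, only lists are handled, the cache lives in memory, and repr() is used as a simplistic checksum source that is only suitable for simple values.

```python
import hashlib
from typing import Any, Callable, Dict, List


def item_checksum(item: Any) -> str:
    # repr() is only a stable key for simple values such as ints and strings
    return hashlib.md5(repr(item).encode("utf-8")).hexdigest()


class ForeachSketch:
    """Hypothetical map-like evaluator that reuses results for previously seen items."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func
        self.calls = 0  # number of times the bound function was actually invoked
        self._previous: Dict[str, Any] = {}  # checksum -> result from the last evaluation

    def evaluate(self, mapped_inputs: List[Any]) -> List[Any]:
        fresh: Dict[str, Any] = {}
        outputs: List[Any] = []
        for item in mapped_inputs:
            key = item_checksum(item)
            if key in fresh:
                result = fresh[key]  # duplicate within this evaluation
            elif key in self._previous:
                result = self._previous[key]  # seen in the previous evaluation
            else:
                result = self.func(item)
                self.calls += 1
            fresh[key] = result
            outputs.append(result)
        # Only results from this evaluation are kept for the next one
        self._previous = fresh
        return outputs


squares = ForeachSketch(lambda x: x * x)
first = squares.evaluate([1, 2, 3])
calls_after_first = squares.calls
second = squares.evaluate([2, 3, 4])
calls_after_second = squares.calls
```

In the second call only the new item (4) triggers the bound function; the results for 2 and 3 are reused.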
Checksums
- class alkymi.checksums.Checksummer[source]¶
Class used to compute a stable hash/checksum of an object recursively. Currently uses MD5.
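As a rough illustration of what computing a stable checksum recursively with MD5 can look like, consider the sketch below. It uses only hashlib and is not the Checksummer implementation: stable_checksum and _update are hypothetical names, and only a handful of built-in types are covered.

```python
import hashlib
from typing import Any


def _update(md5: Any, obj: Any) -> None:
    # Feed the type name first so that 1 and "1" produce different checksums
    md5.update(type(obj).__name__.encode("utf-8"))
    if isinstance(obj, dict):
        # Sort by the key's repr so that insertion order does not affect the result
        for key in sorted(obj, key=repr):
            _update(md5, key)
            _update(md5, obj[key])
    elif isinstance(obj, (list, tuple)):
        # Sequence order is significant, so items are fed in order
        for item in obj:
            _update(md5, item)
    elif isinstance(obj, bytes):
        md5.update(obj)
    else:
        # Fallback for simple values (ints, floats, strings, None)
        md5.update(repr(obj).encode("utf-8"))


def stable_checksum(obj: Any) -> str:
    md5 = hashlib.md5()
    _update(md5, obj)
    return md5.hexdigest()


same = stable_checksum({"a": 1, "b": [1, 2]}) == stable_checksum({"b": [1, 2], "a": 1})
```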
Serialization
- class alkymi.serialization.CachedOutput(value: Optional[alkymi.serialization.T], checksum: str, serializable_representation: Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]])[source]¶
An Output that has been cached - may or may not have its associated value in-memory
- property serialized: Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]]¶
- Returns
A serializable representation of the value of this output
- property valid: bool¶
- Returns
Whether this Output is still valid (e.g. an external file pointed to by a Path can have been altered)
- class alkymi.serialization.Output(checksum: str)[source]¶
Abstract base class for keeping track of outputs of Recipes
- property checksum: str¶
- Returns
The checksum of this Output
- property valid: bool¶
- Returns
Whether this Output is still valid (e.g. an external file pointed to by a Path can have been altered)
- class alkymi.serialization.OutputWithValue(value: alkymi.serialization.T, checksum: str)[source]¶
An Output that is guaranteed to have an in-memory value - all outputs start out as this before being cached
- property valid: bool¶
- Returns
Whether this Output is still valid (e.g. an external file pointed to by a Path can have been altered)
- class alkymi.serialization.Serializer(*args, **kwds)[source]¶
Abstract base class for classes that enable serialization/deserialization of classes not in the standard library
- alkymi.serialization.cache(output: alkymi.serialization.OutputWithValue, base_path: pathlib.Path) alkymi.serialization.CachedOutput [source]¶
Cache an in-memory OutputWithValue, thus converting it to a CachedOutput. The resulting output will retain the value in-memory
- Parameters
output – The Output to cache
base_path – The directory to use for this serialization. A subdirectory will be created to store complex serialized objects
- Returns
The cached output
- alkymi.serialization.create_token(name) str [source]¶
Creates a token using the ‘TOKEN_TEMPLATE’ to signify to the deserializer function how to deserialize a given value
- Parameters
name – The name of the token
- Returns
A new token for the given token name
- alkymi.serialization.deserialize_item(item: Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]]) Any [source]¶
Deserializes an item (potentially recursively)
- Parameters
item – The item to deserialize (may be nested)
- Returns
The deserialized item
- alkymi.serialization.from_cache(serializable_representation: Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]]) Any [source]¶
Deserialize an output from the cache using its serialized representation
- Parameters
serializable_representation – The serialized representation to deserialize
- Returns
The deserialized object
- alkymi.serialization.is_valid_serialized(item: Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]]) bool [source]¶
Recursively check validity of a serialized representation. Currently just looks for external files represented by Path objects, and then compares the stored checksum of each such item with the current checksum (computed from the current file contents)
- Parameters
item – The serialized representation to check validity for
- Returns
True if the input is still valid
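The validity check for external files described above (compare a stored checksum with one computed from the current file contents) can be sketched as follows. file_checksum and file_still_valid are hypothetical helpers written for this illustration; MD5 over the file's bytes is an assumption, and the actual method is governed by alkymi's configuration.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path: Path) -> str:
    # Checksum computed from the current file contents
    return hashlib.md5(path.read_bytes()).hexdigest()


def file_still_valid(path: Path, stored_checksum: str) -> bool:
    # A deleted file, or one whose contents changed, no longer matches
    return path.is_file() and file_checksum(path) == stored_checksum


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "output.txt"
    target.write_text("first version")
    stored = file_checksum(target)
    valid_before = file_still_valid(target, stored)

    # Simulate the file being altered outside the cache
    target.write_text("edited outside the cache")
    valid_after_edit = file_still_valid(target, stored)

    # Simulate the file being deleted outside the cache
    target.unlink()
    valid_after_delete = file_still_valid(target, stored)
```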
- alkymi.serialization.serialize_item(item: Any, cache_path_generator: Generator[pathlib.Path, None, None]) Union[str, int, float, None, Iterable[Optional[Union[str, int, float]]], Dict[str, Optional[Union[str, int, float]]]] [source]¶
Serializes an item (potentially recursively)
- Parameters
item – The item to serialize (may be nested)
cache_path_generator – The generator to use for generating cache paths
- Returns
The serialized item
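The general idea of recursive, token-based serialization can be sketched with the standard library alone. The sketch below is an assumption-laden illustration, not alkymi's format: PATH_TOKEN is a made-up marker (it is not the library's TOKEN_TEMPLATE), only Path objects get a token, and tuples are turned into lists.

```python
import json
from pathlib import Path
from typing import Any

PATH_TOKEN = "!#path#!"  # hypothetical marker, not alkymi's TOKEN_TEMPLATE


def serialize(item: Any) -> Any:
    # Replace non-JSON types with a tagged string, recursing into containers
    if isinstance(item, Path):
        return PATH_TOKEN + str(item)
    if isinstance(item, dict):
        return {key: serialize(value) for key, value in item.items()}
    if isinstance(item, (list, tuple)):
        return [serialize(value) for value in item]
    return item


def deserialize(item: Any) -> Any:
    # Reverse of serialize(): tagged strings are turned back into Path objects
    if isinstance(item, str) and item.startswith(PATH_TOKEN):
        return Path(item[len(PATH_TOKEN):])
    if isinstance(item, dict):
        return {key: deserialize(value) for key, value in item.items()}
    if isinstance(item, list):
        return [deserialize(value) for value in item]
    return item


original = {"file": Path("data/in.txt"), "sizes": [1, 2.5, None]}
as_text = json.dumps(serialize(original))  # what could be written to a cache file
restored = deserialize(json.loads(as_text))
```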
Lab
- class alkymi.lab.Lab(name: str)[source]¶
Class used to define a collection of alkymi recipes and expose them as a command line interface (CLI)
This can be used to create files that bear resemblance to Makefiles (see alkymi/labfile.py as an example)
- add_recipe(recipe: alkymi.recipe.Recipe) alkymi.recipe.Recipe [source]¶
Add a new recipe to the Lab (this will make the recipe available through the CLI)
- Parameters
recipe – The recipe to add
- Returns
The input recipe (to allow chaining calls)
- add_recipes(*recipes: alkymi.recipe.Recipe) None [source]¶
Add a set of recipes to the Lab (this will make the recipes available through the CLI)
- Parameters
recipes – The recipes to add
- property args: Dict[str, alkymi.recipes.Arg]¶
- Returns
The dictionary of args registered with this Lab
- brew(target_recipe: Union[alkymi.recipe.Recipe, str], *, jobs=1, progress_type: Optional[alkymi.types.ProgressType] = ProgressType.Fancy) Any [source]¶
Brew (evaluate) a target recipe defined by its reference or name, and return the results
- Parameters
target_recipe – The recipe to evaluate, as a reference or by name
jobs – The number of jobs to use for evaluating this recipe in parallel. Defaults to 1 (no parallelism); zero or negative values will cause alkymi to use the system’s default number of jobs
progress_type – The method to use for showing progress. If None, the setting in alkymi’s config is used
- Returns
The output of the evaluated recipe
- property name: str¶
- Returns
The name of this Lab
- open(args: Optional[List[str]] = None, stream: TextIO = sys.stdout) None [source]¶
Runs the command line interface for this Lab by parsing command line arguments and carrying out the designated command
- Parameters
args – The input arguments to use - will default to system args
stream – The stream to print output to
- property recipes: List[alkymi.recipe.Recipe]¶
- Returns
The list of recipes contained in this Lab
- register_arg(arg: alkymi.recipes.Arg) None [source]¶
Register an argument with the Lab (this will make the argument settable through the CLI)
- Parameters
arg – The argument to register
Built-in Recipes
- class alkymi.recipes.Arg(arg: alkymi.recipes.T, name: str, cache=CacheType.Auto)[source]¶
Class providing a stateful argument
To use, create an Arg instance with the initial value for your argument, e.g. Arg(0), then provide as a recipe to downstream recipes. To change the input arguments, call set() again - this will mark the recipe as dirty and cause reevaluation of downstream recipe(s)
- set(_arg: alkymi.recipes.T) None [source]¶
Change the argument, causing the recipe to need reevaluation
- Parameters
_arg – The new argument
- property subtype: Optional[Any]¶
- Returns
The subtype of the argument (e.g. the type of items contained in a list). Will be None for non-iterable types
- property type: Type[alkymi.recipes.T]¶
- Returns
The type of the argument
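How a stateful argument can cause downstream reevaluation is sketched below using plain Python. ArgSketch and Doubler are hypothetical classes made up for this illustration; the dirtiness check via an MD5 of repr() is an assumption and not necessarily how Arg tracks changes.

```python
import hashlib
from typing import Any, Optional


class ArgSketch:
    """Hypothetical stateful argument whose checksum changes when set() is called."""

    def __init__(self, value: Any, name: str) -> None:
        self._value = value
        self.name = name

    def set(self, value: Any) -> None:
        self._value = value

    def brew(self) -> Any:
        return self._value

    @property
    def checksum(self) -> str:
        return hashlib.md5(repr(self._value).encode("utf-8")).hexdigest()


class Doubler:
    """Hypothetical downstream step that only recomputes when its input changed."""

    def __init__(self, arg: ArgSketch) -> None:
        self.arg = arg
        self.calls = 0
        self._seen_checksum: Optional[str] = None
        self._result: Any = None

    def brew(self) -> Any:
        checksum = self.arg.checksum
        if checksum != self._seen_checksum:  # input is new or has changed: dirty
            self._result = self.arg.brew() * 2
            self._seen_checksum = checksum
            self.calls += 1
        return self._result


factor = ArgSketch(2, name="factor")
doubled = Doubler(factor)
first = doubled.brew()
again = doubled.brew()   # clean: served from the stored result
factor.set(5)            # marks the downstream step as dirty
updated = doubled.brew()
```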
- alkymi.recipes.arg(_arg: alkymi.recipes.T, name: str, cache=CacheType.Auto) alkymi.recipes.Arg[alkymi.recipes.T] [source]¶
Shorthand for creating an Arg instance
- Parameters
_arg – The initial argument to use
name – The name to give the created Recipe
cache – The type of caching to use for this Recipe
- Returns
The created Arg instance
- alkymi.recipes.file(name: str, path: pathlib.Path, cache=CacheType.Auto) alkymi.recipe.Recipe[pathlib.Path] [source]¶
Create a Recipe that outputs a single file
- Parameters
name – The name to give the created Recipe
path – The path to the file to output
cache – The type of caching to use for this Recipe
- Returns
The created Recipe
- alkymi.recipes.glob_files(name: str, directory: pathlib.Path, pattern: str, recursive: bool, cache=CacheType.Auto) alkymi.recipe.Recipe[List[pathlib.Path]] [source]¶
Create a Recipe that will glob files in a directory and return them as a list. The created recipe will only be considered dirty if the set of file paths matched by the glob changes (not if the contents of any one file change)
- Parameters
name – The name to give the created Recipe
directory – The directory in which to perform the glob
pattern – The pattern to use for the glob, e.g. *.py
recursive – Whether to glob recursively into subdirectories
cache – The type of caching to use for this Recipe
- Returns
The created Recipe
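The globbing behaviour, including the point that editing a file's contents does not change the list of matched paths, can be illustrated with pathlib. glob_paths is a hypothetical helper for this sketch; whether glob_files uses Path.glob and Path.rglob internally is an assumption.

```python
import tempfile
from pathlib import Path
from typing import List


def glob_paths(directory: Path, pattern: str, recursive: bool) -> List[Path]:
    # rglob() descends into subdirectories, glob() only looks at the directory itself
    matches = directory.rglob(pattern) if recursive else directory.glob(pattern)
    return sorted(matches)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.py").write_text("print('a')")
    (root / "sub" / "b.py").write_text("print('b')")
    (root / "notes.txt").write_text("not python")

    flat = [p.relative_to(root).as_posix() for p in glob_paths(root, "*.py", False)]
    deep = [p.relative_to(root).as_posix() for p in glob_paths(root, "*.py", True)]

    # Editing a file's contents leaves the list of matched paths unchanged
    before = glob_paths(root, "*.py", True)
    (root / "a.py").write_text("print('edited')")
    unchanged = glob_paths(root, "*.py", True) == before
```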
- alkymi.recipes.zip_results(name: str, recipes: Iterable[alkymi.recipe.Recipe], cache=CacheType.Auto) alkymi.recipe.Recipe[Union[List[Tuple[Any, ...]], Dict[Any, Tuple[Any, ...]]]] [source]¶
Create a Recipe that zips the outputs from a number of recipes into elements, similar to Python’s built-in zip(). Notably, dictionaries are handled a bit differently, in that a dictionary is returned with keys mapping to tuples from the different inputs, i.e.:
{"1": 1} zip {"1": "one"} -> {"1": (1, "one")}
- Parameters
name – The name to give the created Recipe
recipes – The recipes to zip. These must return lists or dictionaries
cache – The type of caching to use for this Recipe
- Returns
The created Recipe
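The zipping semantics for lists and dictionaries can be sketched in plain Python. zip_outputs is a hypothetical function operating on already-evaluated outputs rather than recipes, and it assumes that all dictionaries share the same keys.

```python
from typing import Any, Dict, List, Sequence, Tuple, Union

Zippable = Union[List[Any], Dict[Any, Any]]
Zipped = Union[List[Tuple[Any, ...]], Dict[Any, Tuple[Any, ...]]]


def zip_outputs(outputs: Sequence[Zippable]) -> Zipped:
    if not outputs:
        raise ValueError("At least one output is required")
    if all(isinstance(output, dict) for output in outputs):
        # Dictionaries: each key maps to a tuple of the values from every input
        return {key: tuple(output[key] for output in outputs) for key in outputs[0]}
    if all(isinstance(output, list) for output in outputs):
        # Lists: behaves like the built-in zip()
        return list(zip(*outputs))
    raise TypeError("Outputs must be all lists or all dictionaries")


zipped_dicts = zip_outputs([{"1": 1}, {"1": "one"}])
zipped_lists = zip_outputs([[1, 2], ["a", "b"]])
```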
Core
- alkymi.core.brew(recipe: alkymi.recipe.Recipe[alkymi.recipe.R], *, jobs: int, progress_type: Optional[alkymi.types.ProgressType]) alkymi.recipe.R [source]¶
Evaluate a Recipe and all dependent inputs - this will build the computational graph and execute any needed dependencies to produce the outputs of the input Recipe
- Parameters
recipe – The Recipe to evaluate
jobs – The number of jobs to use for evaluating the recipe in parallel. 1 job corresponds to no parallelism; zero or negative values will cause alkymi to use the system’s default number of jobs
progress_type – The method to use for showing progress. If None, the setting in alkymi’s config is used
- Returns
The outputs of the Recipe (which correspond to the outputs of the bound function)
- alkymi.core.compute_recipe_status(recipe: alkymi.recipe.Recipe[alkymi.recipe.R], graph: networkx.classes.digraph.DiGraph) Dict[alkymi.recipe.Recipe, alkymi.types.Status] [source]¶
Compute the Status for the provided recipe and all dependencies (ingredients or mapped inputs)
- Parameters
recipe – The recipe for which status should be computed
graph – The graph representing the recipe and all its dependencies
- Returns
The status of the provided recipe and all dependencies as a dictionary
- alkymi.core.create_graph(recipe: alkymi.recipe.Recipe[alkymi.recipe.R]) networkx.classes.digraph.DiGraph [source]¶
Create a Directed Acyclic Graph (DAG) based on the provided recipe. Each node in the graph represents a recipe, and has an associated “status” attribute
- Parameters
recipe – The recipe to construct a graph for
- Returns
The constructed graph
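Building a dependency graph for a target and evaluating it in dependency order can be sketched as follows. The signatures above show that alkymi uses a networkx DiGraph; this sketch swaps in the standard library's graphlib (Python 3.9+) instead, and the recipes table and evaluate function are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, Set, Tuple

# name -> (names of dependencies, function taking the dependency outputs in order)
RecipeTable = Dict[str, Tuple[Tuple[str, ...], Callable[..., Any]]]

recipes: RecipeTable = {
    "raw": ((), lambda: [3, 1, 2]),
    "ordered": (("raw",), lambda raw: sorted(raw)),
    "largest": (("ordered",), lambda ordered: ordered[-1]),
    "summary": (("ordered", "largest"),
                lambda ordered, largest: f"{len(ordered)} items, max {largest}"),
}


def evaluate(target: str, table: RecipeTable) -> Dict[str, Any]:
    # Collect only the target and its transitive dependencies
    graph: Dict[str, Set[str]] = {}
    stack = [target]
    while stack:
        name = stack.pop()
        if name not in graph:
            dependencies = table[name][0]
            graph[name] = set(dependencies)
            stack.extend(dependencies)

    # static_order() yields every node after all of its dependencies
    results: Dict[str, Any] = {}
    for name in TopologicalSorter(graph).static_order():
        dependencies, func = table[name]
        results[name] = func(*(results[dep] for dep in dependencies))
    return results


largest_results = evaluate("largest", recipes)
summary_results = evaluate("summary", recipes)
```

Evaluating "largest" never touches "summary", since only nodes reachable from the target are added to the graph.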
- alkymi.core.evaluate_recipe(recipe: alkymi.recipe.Recipe[alkymi.recipe.R], graph: networkx.classes.digraph.DiGraph, statuses: Dict[alkymi.recipe.Recipe, alkymi.types.Status], jobs: int, progress_type: Optional[alkymi.types.ProgressType] = None) Tuple[alkymi.recipe.R, Optional[str]] [source]¶
Evaluate a Recipe, including any dependencies that are not up-to-date
- Parameters
recipe – The recipe to evaluate
graph – The graph to use for evaluation
statuses – The statuses of the recipes contained in the graph - used to skip evaluation if unnecessary
jobs – The number of jobs to use for evaluating the recipe in parallel. 1 job corresponds to no parallelism; zero or negative values will cause alkymi to use the system’s default number of jobs
progress_type – The method to use for showing progress. If None, the setting in alkymi’s config is used
- Returns
The output(s) and checksum(s) of the evaluated recipe
- async alkymi.core.invoke(recipe: alkymi.recipe.Recipe, inputs: Tuple[Any, ...], input_checksums: Tuple[Optional[str], ...], loop: asyncio.events.AbstractEventLoop, executor: Optional[concurrent.futures._base.Executor], progress_callback: Optional[Callable[[alkymi.types.EvaluateProgress, alkymi.recipe.Recipe, int, int], None]] = None) Tuple[alkymi.recipe.R, Optional[str]] [source]¶
Evaluate the Recipe using the provided inputs. This will call the bound function on the inputs.
- Parameters
recipe – The recipe to evaluate given the provided inputs
inputs – The inputs provided by the ingredients (dependencies) of the Recipe
input_checksums – The (possibly new) input checksum
loop – The asyncio event loop to use for scheduling the recipe evaluation
executor – An optional executor to use for evaluating bound functions in parallel
progress_callback – An optional callback to invoke when evaluation progress occurs
- Returns
The output(s) and checksum(s) of the evaluated recipe
- async alkymi.core.invoke_foreach(recipe: alkymi.foreach_recipe.ForeachRecipe, inputs: Tuple[Any, ...], input_checksums: Tuple[Optional[str], ...], loop: asyncio.events.AbstractEventLoop, executor: Optional[concurrent.futures._base.Executor], progress_callback: Optional[Callable[[alkymi.types.EvaluateProgress, alkymi.recipe.Recipe, int, int], None]] = None) Tuple[alkymi.recipe.R, Optional[str]] [source]¶
Evaluate the ForeachRecipe using the provided inputs. This will apply the bound function to each item in the “mapped_inputs”. If the result for any item is already cached, that result will be used instead (the checksum is used to check this). Only items from the immediately previous invoke call will be cached
- Parameters
recipe – The ForeachRecipe to evaluate given the provided inputs
inputs – The inputs provided by the ingredients (dependencies) of the ForeachRecipe
input_checksums – The (possibly new) input checksum to use for checking cleanliness
loop – The asyncio event loop to use for scheduling the recipe evaluation
executor – An optional executor to use for evaluating bound functions in parallel
progress_callback – An optional callback to invoke when evaluation progress occurs
- Returns
The output(s) and checksum(s) of the evaluated recipe
- alkymi.core.is_clean(recipe: alkymi.recipe.Recipe[alkymi.recipe.R], new_input_checksums: Tuple[Optional[str], ...]) alkymi.types.Status [source]¶
Check whether a Recipe is clean (result is cached) based on a set of (potentially new) input checksums
- Parameters
recipe – The Recipe to check for cleanliness
new_input_checksums – The (potentially new) input checksums to use for checking cleanliness
- Returns
Whether the recipe is clean represented by the Status enum
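A cleanliness check based on input checksums might look roughly like the sketch below. SketchStatus and check_clean are hypothetical; the members of alkymi.types.Status are not listed in this document, so the three states used here are assumptions chosen for illustration.

```python
from enum import Enum
from typing import Optional, Tuple

Checksums = Tuple[Optional[str], ...]


class SketchStatus(Enum):
    # Hypothetical states, not the members of alkymi.types.Status
    OK = 0
    NOT_EVALUATED_YET = 1
    INPUTS_CHANGED = 2


def check_clean(stored_input_checksums: Optional[Checksums],
                new_input_checksums: Checksums) -> SketchStatus:
    if stored_input_checksums is None:
        # Nothing has been cached, so the bound function must be run
        return SketchStatus.NOT_EVALUATED_YET
    if stored_input_checksums != new_input_checksums:
        # At least one input differs from the one used for the cached result
        return SketchStatus.INPUTS_CHANGED
    return SketchStatus.OK


never_run = check_clean(None, ("abc",))
clean = check_clean(("abc", "def"), ("abc", "def"))
dirty = check_clean(("abc", "def"), ("abc", "xyz"))
```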
- async alkymi.core.schedule(loop: asyncio.events.AbstractEventLoop, executor: Optional[concurrent.futures._base.Executor], recipe: alkymi.recipe.Recipe, status: alkymi.types.Status, coros_or_tasks: Dict[alkymi.recipe.Recipe, Union[Coroutine, _asyncio.Task]], progress_callback: Optional[Callable[[alkymi.types.EvaluateProgress, alkymi.recipe.Recipe, int, int], None]] = None) Tuple[alkymi.recipe.R, Optional[str]] [source]¶
Helper function used to asynchronously await inputs from dependent recipes, and then retrieve the output of the provided recipe (evaluating it if necessary). Note that inputs will only be awaited if needed (not if cached).
- Parameters
loop – The asyncio event loop to use for scheduling the recipe evaluation
executor – An optional executor to use for evaluating bound functions in parallel
recipe – The recipe to evaluate using the executor
status – The status of the recipe being scheduled - used to skip evaluation if unnecessary
coros_or_tasks – Dictionary containing coroutines for recipes - used to await ingredient inputs
progress_callback – An optional callback to invoke when evaluation progress occurs
- Returns
A future that will eventually return the output(s) and checksum(s) of the recipe
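The scheduling idea (await the results of dependencies, then run the bound function on an executor) can be sketched with asyncio and a thread pool. run_one and brew_all are hypothetical names; the sketch omits statuses, caching and progress callbacks, and assumes the dependency table is acyclic.

```python
import asyncio
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Any, Callable, Dict, Tuple

# name -> (names of dependencies, function taking the dependency outputs in order)
RecipeTable = Dict[str, Tuple[Tuple[str, ...], Callable[..., Any]]]


async def run_one(name: str, recipes: RecipeTable, tasks: Dict[str, "asyncio.Task"],
                  loop: asyncio.AbstractEventLoop, executor: Executor) -> Any:
    dependencies, func = recipes[name]
    # Wait for every dependency to finish before calling the bound function
    inputs = [await tasks[dependency] for dependency in dependencies]
    # Run the (blocking) bound function on the executor so other tasks can proceed
    return await loop.run_in_executor(executor, func, *inputs)


async def brew_all(recipes: RecipeTable, jobs: int) -> Dict[str, Any]:
    loop = asyncio.get_running_loop()
    tasks: Dict[str, "asyncio.Task"] = {}
    with ThreadPoolExecutor(max_workers=jobs) as executor:
        # All tasks are created before any of them starts running, so every
        # dependency can be looked up in `tasks` regardless of creation order
        for name in recipes:
            tasks[name] = loop.create_task(run_one(name, recipes, tasks, loop, executor))
        return {name: await task for name, task in tasks.items()}


recipes: RecipeTable = {
    "product": (("a", "b"), lambda a, b: a * b),
    "a": ((), lambda: 2),
    "b": ((), lambda: 3),
}

results = asyncio.run(brew_all(recipes, jobs=2))
```

With jobs=2, the independent recipes "a" and "b" may run on separate threads, while "product" waits for both.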
AlkymiConfig
- class alkymi.config.AlkymiConfig[source]¶
Global singleton config for alkymi
- property allow_pickling: bool¶
- Returns
Whether to allow pickling for serialization, deserialization and checksumming
- property cache: bool¶
- Returns
Whether to enable alkymi caching globally (see CacheType.Auto)
- property cache_path: Optional[pathlib.Path]¶
- Returns
A user-provided location to place the cache
- property file_checksum_method: alkymi.config.FileChecksumMethod¶
- Returns
The currently used method for calculating file checksums (for Path objects)
- static get() alkymi.config.AlkymiConfig [source]¶
- Returns
The singleton instance of AlkymiConfig
- property progress_type: alkymi.types.ProgressType¶
- Returns
The currently used type of progress indication
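The global-singleton pattern behind a static get() accessor can be sketched in a few lines. SketchConfig is a hypothetical class; its default values and the absence of validation are assumptions, not a description of AlkymiConfig.

```python
from pathlib import Path
from typing import Optional


class SketchConfig:
    """Hypothetical singleton config: every caller shares the same instance."""

    _instance: Optional["SketchConfig"] = None

    def __init__(self) -> None:
        # Assumed defaults, chosen for illustration only
        self.cache: bool = True
        self.cache_path: Optional[Path] = None

    @staticmethod
    def get() -> "SketchConfig":
        # Create the instance lazily on first access, then always return it
        if SketchConfig._instance is None:
            SketchConfig._instance = SketchConfig()
        return SketchConfig._instance


# A change made through one reference is visible through every other one
SketchConfig.get().cache = False
seen_elsewhere = SketchConfig.get().cache
```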