maestro tags are roxygen2 comments for configuring the scheduling and execution of pipelines.
Details
maestro tags follow the format: #' @maestroTagName value
Some tags may not take a value.
maestro tags must be written above the function that is to be included as a pipeline. A typical pipeline with tags could look like this:
#' @maestroFrequency 1 hour
#' @maestroStartTime 12:30:00
#' @maestroLogLevel WARN
my_pipeline <- function() {
  # Pipeline code
}
Below are descriptions of all the tags currently available in maestro along with examples.
maestroFrequency
How often to run the pipeline. This tag takes a time unit indicating how long to wait between subsequent runs of the pipeline. Acceptable values include an integer value followed by one of minute(s), hour(s), day(s), week(s), month(s), and year(s). Note that some combinations of integer + unit may be invalid. Adverbs like 'hourly', 'daily', 'weekly', etc. are also valid.
Default: 1 day
Examples:
#' @maestroFrequency 1 hour
#' @maestroFrequency daily
#' @maestroFrequency 2 weeks
#' @maestroFrequency 3 months
maestroStartTime
Timestamp, date, or time corresponding to the start time of the pipeline. This
also sets the cadence of the pipeline in some cases. For instance, if the start time
is 2025-03-18 03:00:00 and the frequency is daily, the pipeline will run at 03:00
every day. A value in the future prevents the pipeline from running until that time
has been reached.
For pipelines with a frequency lower than daily, partial anchor formats are supported to make it easier to express natural cycle points without choosing a specific date:
HH:MM:SS - time-of-day anchor for minute, hour, or day frequencies. e.g., 08:00:00 runs every day at 8am.
Mon HH:MM:SS (weekday abbreviation + time) - anchor for week or multi-week frequencies. Resolved to that weekday of the current ISO week. e.g., Mon 04:00:00 with @maestroFrequency 1 week runs every Monday at 4am. Valid abbreviations: Mon, Tue, Wed, Thu, Fri, Sat, Sun.
DD HH:MM:SS or DD (month-day + optional time) - anchor for month frequencies. Resolved to that day of the current month. e.g., 15 04:00:00 with @maestroFrequency 1 month runs on the 15th of every month at 4am.
Default: 2024-01-01 00:00:00
Examples:
#' @maestroStartTime 2025-02-05 12:00:00
#' @maestroStartTime 2025-01-01
#' @maestroStartTime 08:00:00
#' @maestroStartTime Mon 04:00:00
#' @maestroStartTime Wed 09:30:00
#' @maestroStartTime 15 04:00:00
#' @maestroStartTime 1
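As a rough illustration of the partial anchor formats, the sketch below pairs a weekday anchor with a weekly frequency and a month-day anchor with a monthly frequency. The function names and bodies are hypothetical placeholders; only the tags come from the descriptions above.

```r
# Hypothetical weekly pipeline anchored to Mondays at 04:00.
#' @maestroFrequency 1 week
#' @maestroStartTime Mon 04:00:00
weekly_backup <- function() {
  message("Running weekly backup")
  invisible(TRUE)
}

# Hypothetical monthly pipeline anchored to the 15th at 04:00.
#' @maestroFrequency 1 month
#' @maestroStartTime 15 04:00:00
monthly_invoice <- function() {
  message("Generating invoices")
  invisible(TRUE)
}
```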
maestroTz
Timezone in which the maestroStartTime is to be considered. Takes any valid timezone
string that can be found in OlsonNames().
Default: UTC
Examples:
#' @maestroTz Europe/Paris
#' @maestroTz America/Halifax
#' @maestroTz US/Pacific
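One way to guard against a mistyped timezone is to look it up in OlsonNames() before writing the tag. This is a minimal sketch using base R only; the pipeline name is a hypothetical placeholder, and the result of the lookup depends on the timezone database installed on the machine.

```r
# Check a candidate timezone string against the names known to this R installation.
candidate_tz <- "America/Halifax"
tz_is_known <- candidate_tz %in% OlsonNames()

# Hypothetical pipeline whose 08:00:00 start time is interpreted in that timezone.
#' @maestroFrequency 1 day
#' @maestroStartTime 08:00:00
#' @maestroTz America/Halifax
morning_summary <- function() {
  message("Building morning summary")
  invisible(TRUE)
}
```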
maestroLogLevel
Minimum logging threshold for messages, warnings, and errors that come from the pipeline.
These levels correspond to those in logger:::log_levels_supported, but the most commonly
used are ERROR, WARN, and INFO. For example, if you use WARN then any messages of lower
urgency (e.g., INFO) will be suppressed, but warnings and errors will still be logged.
Default: INFO
Examples:
#' @maestroLogLevel ERROR
#' @maestroLogLevel WARN
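The sketch below shows how a threshold of WARN might play out inside a pipeline. It assumes that message() maps to the INFO level and warning() to the WARN level, which is an interpretation of the description above rather than something confirmed here; the function name is hypothetical.

```r
# Hypothetical pipeline with a WARN logging threshold.
#' @maestroFrequency 1 day
#' @maestroLogLevel WARN
check_inventory <- function() {
  # Assumed to be INFO-level output, so it would fall below the WARN threshold.
  message("Starting inventory check")
  # Assumed to be WARN-level output, so it would meet the threshold.
  warning("Inventory file may be stale")
  invisible(NULL)
}
```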
maestroHours
Specific hours of the day in which to run the pipeline. This only applies for pipelines that run
at an hourly or minutely frequency. Acceptable values are integers from 0-23 separated
by spaces. If empty, pipeline runs at all hours. This tag uses the timezone specified by
maestroTz (will be UTC if empty).
Default:
Examples:
#' @maestroHours 1 4 7 10
#' @maestroHours 0 5 20
maestroDays
Specific days of week or days of month on which to run the pipeline. This only applies for pipelines that run at a minutely, hourly, or daily frequency. Acceptable values are either integers from 1-31 or day of week strings like Mon, Tue, Wed, etc. These two options cannot be used in combination.
Default:
Examples:
#' @maestroDays 1 11 21 31
#' @maestroDays Mon Tue Wed Thu Fri
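To make the hour and day filters more concrete, here is a sketch of an hourly pipeline restricted to business hours on weekdays. It assumes that @maestroHours and @maestroDays can be combined on the same pipeline; the function name and body are hypothetical.

```r
# Hypothetical hourly pipeline limited to 09:00-17:00, Monday to Friday,
# with the hours interpreted in the timezone given by @maestroTz.
#' @maestroFrequency 1 hour
#' @maestroTz America/Halifax
#' @maestroHours 9 10 11 12 13 14 15 16 17
#' @maestroDays Mon Tue Wed Thu Fri
poll_orders <- function() {
  message("Polling for new orders")
  invisible(TRUE)
}
```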
maestroMonths
Specific months of the year in which to run the pipeline. This only applies to pipelines that run at least once a month. Acceptable values are integers (1-12) corresponding to the month of the year (e.g., 1 = January, 2 = February, etc.).
Default:
Examples:
#' @maestroMonths 3 8 12
#' @maestroMonths 10
maestroInputs
For a DAG style pipeline, the names of pipelines that input into the pipeline.
These names must match the function name defining the inputting pipeline. Multiple
pipelines can be used as inputs and the input value is used in the target pipeline
via the required .input parameter. Note that this tag could be redundant if the
inputting pipeline uses maestroOutputs.
To enable fan-in (collect), wrap one or more input names with collect().
The downstream pipeline receives a named list as .input, where each name
corresponds to an upstream pipeline and each value is that pipeline's return
value. All listed upstream pipelines must have succeeded before the collect
pipeline fires. Works with both static multi-source inputs and dynamic fan-out
followed by fan-in (@maestroMap → collect()): in the latter case .input
is a list of all successful iteration results.
Default:
Examples:
#' @maestroInputs extract verify
#' @maestroInputs collect(letter_a, letter_b)
#' @maestroInputs collect(multiply) (collect all iterations of a @maestroMap upstream)
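The fan-in behaviour described above can be sketched with two upstream pipelines and one collecting pipeline. The pipeline combine_letters is a hypothetical name, and the final lines only imitate by hand the named list that the description says .input would contain; they are not how maestro itself invokes the pipeline.

```r
# Two upstream pipelines, each returning a single value.
#' @maestroFrequency 1 day
letter_a <- function() "a"

#' @maestroFrequency 1 day
letter_b <- function() "b"

# Hypothetical downstream pipeline that collects both upstream results.
#' @maestroInputs collect(letter_a, letter_b)
combine_letters <- function(.input) {
  # .input is described as a named list keyed by upstream pipeline name.
  paste(unlist(.input), collapse = "")
}

# Hand-built stand-in for the collected input, for illustration only.
simulated_input <- list(letter_a = letter_a(), letter_b = letter_b())
combined <- combine_letters(simulated_input)
```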
maestroOutputs
For a DAG style pipeline, the names of pipelines that receive the return value of
this pipeline as input. These names must match the function names defining the
receiving pipelines. Multiple pipelines can be outputted into. The return value
of the pipeline will be passed into each receiving pipeline. Note that this tag
could be redundant if the receiving pipeline uses maestroInputs.
Default:
Examples:
#' @maestroOutputs transform
maestroSkip
Flags a pipeline to never be executed even if it is scheduled to run. This can
be useful when developing or testing a pipeline. This tag takes no value; its
presence alone marks the pipeline as skipped. This tag is ignored when
run_schedule(..., run_all = TRUE) is used or when using invoke().
Default:
Examples:
#' @maestroSkip
maestroRunIf
An R expression that is evaluated at run time to determine whether the pipeline
should execute. The expression must return a single TRUE or FALSE. If the
expression returns FALSE, the pipeline is silently skipped for that run. Resources
passed via run_schedule(..., resources = list(...)) are available inside the
expression, as is .input for DAG pipelines.
Default:
Examples:
#' @maestroRunIf Sys.getenv("RUN_PIPELINE") == "true"
#' @maestroRunIf lubridate::wday(lubridate::now()) == 2
#' @maestroRunIf !is.null(.input)
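Because the condition must return a single TRUE or FALSE, it can help to evaluate the expression by hand before putting it in the tag. The sketch below does that for the environment-variable example; the pipeline name is hypothetical, and setting the variable with Sys.setenv() is only there to make the check reproducible.

```r
# Hypothetical pipeline gated by an environment variable.
#' @maestroFrequency 1 hour
#' @maestroRunIf Sys.getenv("RUN_PIPELINE") == "true"
sync_records <- function() {
  message("Syncing records")
  invisible(TRUE)
}

# Evaluate the same condition manually to confirm it yields one logical value.
Sys.setenv(RUN_PIPELINE = "true")
should_run <- Sys.getenv("RUN_PIPELINE") == "true"
stopifnot(is.logical(should_run), length(should_run) == 1)
```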
maestroMap
Enables dynamic fan-out (scatter): the downstream pipeline executes once
per element of the upstream return value. When the tag value is empty, each
element of the upstream return value is passed directly as .input. When one
or more R expressions referencing .input fields are provided (space-separated),
those fields are scattered over and zipped together pmap-style: each iteration
receives .input with all specified fields replaced by their i-th element.
The full list remains accessible as .input in every branch.
Vectors must all have the same length, unless a vector has length 1, in which case it is recycled across all iterations. Mismatched lengths produce a pipeline error.
Requires @maestroInputs to reference an upstream pipeline.
Default:
Examples:
#' @maestroMap (iterate over each element of the upstream return value)
#' @maestroMap .input$items
#' @maestroMap .input$ids .input$labels
#' @maestroMap .input$model .input$label .input$threshold
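The zipping and recycling rules can be approximated in base R with Map(). The sketch below is only an illustration of the described semantics, not maestro's implementation: the pipeline names are hypothetical, and Map() is more permissive about recycling than the rule stated above.

```r
# Hypothetical upstream pipeline returning a list with two fields.
#' @maestroFrequency 1 day
list_models <- function() {
  list(model = c("lm", "glm", "gam"), threshold = 0.5)
}

# Hypothetical downstream pipeline that fans out over both fields.
#' @maestroInputs list_models
#' @maestroMap .input$model .input$threshold
fit_model <- function(.input) {
  paste0(.input$model, "@", .input$threshold)
}

# Base R approximation of the three iterations: model has length 3,
# threshold has length 1 and is recycled across all iterations.
upstream <- list_models()
results <- Map(
  function(model, threshold) {
    fit_model(list(model = model, threshold = threshold))
  },
  upstream$model,
  upstream$threshold
)
iteration_values <- unlist(results, use.names = FALSE)
```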
maestroPriority
Determines the order in which pipelines that run at the same scheduled instance
are executed. Values are positive integers from 1 to N. Pipelines run from highest
to lowest priority, where 1 indicates the highest priority. Pipelines with the
same priority run in the order in which build_schedule() parses the pipeline
(usually alphabetical according to file name and then line number within file).
By default, all pipelines are given equal priority.
Default:
Examples:
#' @maestroPriority 1
#' @maestroPriority 3
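As a small illustration, the two hypothetical pipelines below share the same hourly schedule, so priority would decide which one executes first when both are due. The function names and bodies are placeholders.

```r
# Runs first when both pipelines are due, because 1 is the highest priority.
#' @maestroFrequency 1 hour
#' @maestroPriority 1
load_reference_data <- function() {
  message("Loading reference data")
  invisible(TRUE)
}

# Runs after load_reference_data within the same scheduled run.
#' @maestroFrequency 1 hour
#' @maestroPriority 2
build_dashboard <- function() {
  message("Building dashboard")
  invisible(TRUE)
}
```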
maestroFlags
Arbitrary strings which are then made accessible via get_flags(). A pipeline
can have multiple flags separated by spaces.
Default:
Examples:
#' @maestroFlags critical etl cloud
#' @maestroFlags aviation
maestroLabel
Key-value pairs used for attributing metadata for a pipeline. Multiple labels
can be assigned to a pipeline using multiple instances of @maestroLabel. These
labels can be extracted as a data.frame. Each label must follow the pattern of
key value.
Default:
Examples:
#' @maestroLabel domain transportation
#' @maestroLabel author will.hipson
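Flags and labels can sit side by side on one pipeline. The sketch below assumes a hypothetical pipeline name and shows several flags on a single @maestroFlags line next to one @maestroLabel line per key-value pair, following the descriptions above.

```r
# Hypothetical pipeline carrying both flags and labels as metadata.
#' @maestroFrequency 1 day
#' @maestroFlags critical etl
#' @maestroLabel domain transportation
#' @maestroLabel author will.hipson
load_flights <- function() {
  message("Loading flight data")
  invisible(TRUE)
}
```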
