Workflow expressions are snippets of code that allow you to express conditions, extract data, and manipulate values within your workflows.
Their purpose is to give your workflows dynamic capabilities by evaluating expressions against the context of the job.
In this article, we will explore how expressions are evaluated, the context in which they are executed, the syntax used, and practical use cases.
Context represents the state against which an expression is evaluated. It includes important information based on which you can build your expressions.
You can see how the context is being used inside the runner by visiting the jobengine crate inside the runner repository.
id
The id field represents the unique identifier of the job. This field is mostly
used by the expression engine helper functions. However, if you are building
your own notification server, you can use the ${{ id }} expression to get the
job ID.
name
The name field represents the name of the scan. Since a job is an instance of
the scan, it doesn't have its own name. You can use the ${{ name }} expression
to get the scan name.
Mostly, this field is used by the expression engine helper functions.
project
Project is an object that contains information about the project this scan belongs to.
It only has one field: id. You can use the ${{ project.id }} expression to
get the project ID.
workflow
Workflow is an object that contains information about the workflow this job belongs to.
It only has one field: id. You can use the ${{ workflow.id }} expression to
get the workflow ID.
revision
Revision is an object that contains information about the workflow revision this job belongs to.
It only has one field: id. You can use the ${{ revision.id }} expression to
get the revision ID.
scans
Scans is an object containing scan names as keys and the lists of jobs belonging to those scans as values.
This is the primary field used by the expression engine helper functions to extract data from the executions of the scans and to make decisions based on that data.
Since this is one of the most important fields added to the context, let's explore it in more detail.
scans.[ID]
The ID is a string representing the name of the scan.
Since each scan name is unique, the name of the scan also serves as its identifier.
Let's say we have a scan named subfinder. You can access the list of successful jobs
belonging to the subfinder scan using ${{ scans.subfinder }}.
Each job object in the list contains metadata about a successful execution of a job that belongs to the scan.
The size of this list is currently set to 10, meaning only the last 10
successful jobs are stored in the context. Therefore, size(scans.subfinder)
for example will always be <= 10.
During scheduling, the last 10 finished jobs are used to determine whether the scan should be scheduled. This distinction exists to avoid re-scheduling failed scans.
When the jobs are used in scans.[ID].steps[],
you only care about the last 10 successful jobs. However, scheduling fields
such as scans.[ID].on.expr
or scans.[ID].if use the last 10 finished jobs
to determine whether the scan should be scheduled or not.
Jobs are sorted from the latest to the oldest, meaning the job at index 0 is
the latest successful job, and the job at index 9 is the oldest successful
job.
In other words, ${{ scans.subfinder[0] }} will give you the latest successful
job metadata.
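The ordering and the cap of 10 described above can be pictured with a small Python sketch. It is not the runner's implementation: the Job class, its fields, and the build_scan_jobs helper are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass

MAX_JOBS = 10  # the context currently keeps the last 10 jobs per scan


@dataclass
class Job:
    id: str
    finished_at: int  # hypothetical field: a larger value means more recent
    succeeded: bool


def build_scan_jobs(jobs, successful_only=True):
    """Return up to MAX_JOBS jobs, newest first.

    successful_only=True models the list seen by steps;
    successful_only=False models the finished jobs used during scheduling.
    """
    if successful_only:
        jobs = [job for job in jobs if job.succeeded]
    newest_first = sorted(jobs, key=lambda job: job.finished_at, reverse=True)
    return newest_first[:MAX_JOBS]


# Thirty hypothetical runs; every third one failed.
history = [Job(id=f"job-{n}", finished_at=n, succeeded=(n % 3 != 0)) for n in range(1, 31)]
subfinder = build_scan_jobs(history)
print(len(subfinder), subfinder[0].id)
```

In this model, index 0 is the most recent successful run, which mirrors what ${{ scans.subfinder[0] }} is described as returning.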
If you need context to contain more jobs, please reach out to us through our Discord server or use any of the links provided on the support page.
scans.[ID][index].id
The id field inside the job object represents the unique identifier of the
job. Each job has its own unique ID, and it is the primary way to reference the
job.
You can use the ${{ scans.subfinder[0].id }} expression to get the ID of the
latest successful job belonging to the subfinder scan.
When fetching the output of the job, bh uses the -j flag to specify the job
ID.
This command fetches the artifact named subfinder.zip from the latest
successful job belonging to the subfinder scan.
scans.[ID].[index].artifacts[]
The artifacts field inside the job object metadata contains a list of
artifacts produced by the job.
You can use the ${{ scans.subfinder[0].artifacts }} expression to get the list
of artifacts produced by the latest successful job belonging to the subfinder
scan.
scans.[ID].[index].artifacts[index].name
The name field inside the artifact object represents the name of the artifact
produced by the job.
scans.[ID].[index].artifacts[index].checksum
The checksum field inside the artifact object represents the checksum of the
artifact produced by the job. This field is of type string, and is used to
determine whether the artifact has changed or not.
vars
Vars is an object containing user-defined variables that can be used to store custom data. It represents the Project Variables defined in the project settings.
This field is not checked during workflow validation, so you can add or remove variables as you wish. However, make sure to use valid variable names.
If an expression uses a variable that doesn't exist, the job will not be scheduled.
vars.[ID]
The ID is the name of the Project Variable. You can use the
${{ vars.MY_VAR }} expression to get the value of the MY_VAR Project
Variable.
secrets
Secrets is an object containing user-defined secrets that can be used to store sensitive data. It represents the Project Secrets defined in the project settings.
This field is not checked during workflow validation, so you can add or remove secrets as you wish. However, make sure to use valid secret names.
If an expression uses a secret that doesn't exist, the job will not be scheduled.
secrets.[ID]
The ID is the name of the Project Secret. You can use the
${{ secrets.MY_SECRET }} expression to get the value of the MY_SECRET
Project Secret.
inputs
Inputs is an object containing user-defined inputs that can be used to store custom data. It represents the Workflow Inputs defined in the workflow definition.
Inputs can only be specified by a dispatch scan.
Otherwise, the size of this object will be zero.
inputs.[ID]
The ID is the name of the Workflow Input. You can use the
${{ inputs.MY_INPUT }} expression to get the value of the MY_INPUT Workflow
Input.
The type evaluated by the expression engine is the type you specified in the dispatch input types.
ok
The ok field is a boolean value representing whether the job is going to fail
or not.
The ok field is set to false as soon as a step outcome is set to failure.
It is mostly used internally to skip steps using the ok && (<if expression>)
pattern.
This field is updated after each step.
always
The always field is a boolean value that is always set to true.
This field is useful when you want a step to run regardless of the job outcome.
Keep in mind that, since the ok && <if expression> pattern is used internally,
steps that should always run need to use the always field to override the
internal check.
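The interaction between ok and always can be pictured with the following Python sketch. It only models the behavior described above; how the engine actually detects and applies the always override is not documented here, so the uses_always flag and the run_steps helper are assumptions made for illustration.

```python
def run_steps(steps):
    """Simulate step skipping.

    Each step is a tuple: (name, uses_always, succeeds).
    """
    ok = True  # flips to False once any step fails
    executed = []
    for name, uses_always, succeeds in steps:
        # Assumed model: the internal guard is `ok && (<if expression>)`,
        # and a step that uses `always` bypasses the `ok` part of the guard.
        should_run = True if uses_always else ok
        if not should_run:
            continue
        executed.append(name)
        if not succeeds:
            ok = False  # ok is updated after each step
    return ok, executed


ok, executed = run_steps([
    ("scan", False, False),    # fails, so ok becomes False
    ("upload", False, True),   # skipped because ok is False
    ("cleanup", True, True),   # runs because it uses always
])
print(ok, executed)
```

In this model, a cleanup or notification step keeps running after a failure, while ordinary steps are skipped.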
Expressions are snippets of code that are evaluated using the context of the job.
The language used for expressions is called CEL.
You can find the CEL expression parser used by the platform on GitHub.
Since I created the language parser, I can quickly fix any issues you might find or add new features you might need.
Since covering the CEL language itself is out of scope for this article, I will cover the helper functions provided by the expression engine that are useful when building workflows.
The expressions are rather straightforward to use. However, if you need help building expressions, feel free to reach out to us through our Discord server or use any of the links provided on the support page.
Keep in mind that functions are called in function style (for example, size(list)) rather than method style (for example, list.size()).
Functions are separated into built-in functions defined by CEL itself and helper functions provided by the expression engine.
size()
Size returns an integer representing the size of the list or string passed as an argument.
Inputs:
Workflow example:
type()
Type returns a string representing the type of the value passed as an argument.
This is rarely needed in your workflows, but can be useful when debugging complex expressions.
Type accepts a value of any type as input and returns the name of that type as a string.
has()
Has returns a boolean value representing whether the map contains the specified key.
The only input type used by this function is map.
Workflow example:
all()
All returns a boolean value that is true if the condition holds for all elements in the list.
All accepts 3 parameters:
exists()
Exists returns a boolean value representing whether there is at least one element in the list that evaluates to true.
Exists accepts 3 parameters:
Example where subfinder can have conditional artifacts. This step will run if subfinder produced at least one artifact.
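The semantics of exists, together with its siblings all and exists_one, can be modelled in Python as shown below. This is a sketch of the behavior only: the cel_* helper names and the sample artifact data are hypothetical, and Python callables stand in for the expression that the real functions take as a parameter.

```python
def cel_all(items, predicate):
    """True when the predicate holds for every element (True for an empty list)."""
    return all(predicate(item) for item in items)


def cel_exists(items, predicate):
    """True when the predicate holds for at least one element."""
    return any(predicate(item) for item in items)


def cel_exists_one(items, predicate):
    """True when the predicate holds for exactly one element."""
    return sum(1 for item in items if predicate(item)) == 1


# Hypothetical artifact list of the latest successful subfinder job.
artifacts = [
    {"name": "subfinder.zip", "checksum": "abc123"},
    {"name": "report.txt", "checksum": "def456"},
]

print(cel_exists(artifacts, lambda a: a["name"] == "subfinder.zip"))
print(cel_all(artifacts, lambda a: a["checksum"] != ""))
print(cel_exists_one(artifacts, lambda a: a["name"].endswith(".zip")))
```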
exists_one()
Exists_one returns a boolean value representing whether there is exactly one element in the list that evaluates to true.
Exists_one accepts 3 parameters:
map()
Map returns a list containing the results of applying the specified expression to each element in the input list or map.
It can either accept 3 parameters or 4 parameters.
When accepting 3 parameters:
When accepting 4 parameters:
filter()
Filter returns a list containing the elements from the input list or map that satisfy the specified condition.
Filter accepts 3 parameters:
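In Python terms, map and filter behave roughly like list comprehensions. The sketch below assumes that the longer form of map follows standard CEL, where the extra argument is a condition applied before the transformation, and that iterating over a map visits its keys. Both points are assumptions, and the cel_* helper names are hypothetical.

```python
def cel_filter(items, predicate):
    """Keep only the elements that satisfy the predicate."""
    return [item for item in items if predicate(item)]


def cel_map(items, transform, predicate=None):
    """Apply transform to each element, optionally filtering first."""
    if predicate is not None:
        items = cel_filter(items, predicate)
    return [transform(item) for item in items]


# Hypothetical artifact list of a job.
artifacts = [
    {"name": "subfinder.zip", "checksum": "abc123"},
    {"name": "report.txt", "checksum": "def456"},
]

print(cel_map(artifacts, lambda a: a["name"]))
print(cel_filter(artifacts, lambda a: a["name"].endswith(".zip")))
print(cel_map([1, 2, 3, 4], lambda x: x * 10, lambda x: x % 2 == 0))
```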
contains()
Contains returns a boolean value representing whether the specified substring is present within the given string.
Inputs:
Workflow example where we want to run a step only if the SCOPE variable contains
bountyhub.org.
For instance, bountyhub.org might have special rules that allow us to run additional scans.
Or the target might go out of scope periodically, and we want to avoid running certain scans when that happens. We can use this function to check whether the target is still in scope.
startsWith()
The startsWith function returns a boolean value representing whether the given string
starts with the specified prefix.
Inputs:
endsWith()
The endsWith function returns a boolean value representing whether the given string
ends with the specified suffix.
Inputs:
matches()
Matches tests whether a string matches a regular expression. The supported regular expression
syntax is described in the regex
crate documentation.
Inputs:
Workflow example where we want to run a step only if the SCOPE variable matches a
specific regular expression pattern.
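The four string functions map closely onto Python string operations, which makes it easy to prototype a condition before putting it into a workflow. The sketch below is only an analogy: the helper names and the SCOPE value are hypothetical, matches is assumed to perform an unanchored search (as the regex crate does by default), and Python's re syntax is not identical to the regex crate's syntax.

```python
import re


def contains(text, substring):
    return substring in text


def starts_with(text, prefix):
    return text.startswith(prefix)


def ends_with(text, suffix):
    return text.endswith(suffix)


def matches(text, pattern):
    # Assumption: an unanchored search, so anchor with ^ and $ when needed.
    return re.search(pattern, text) is not None


scope = "api.bountyhub.org"  # hypothetical value of the SCOPE variable

print(contains(scope, "bountyhub.org"))
print(starts_with(scope, "api."))
print(ends_with(scope, ".com"))
print(matches(scope, r"^[a-z]+\.bountyhub\.org$"))
```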
uint()
The uint function converts the input value to an unsigned integer.
It only accepts one argument of type:
int()
The int function converts the input value to an integer.
It only accepts one argument of type:
string()
The string function converts the input value to a string.
It only accepts one argument of type:
timestamp()
The timestamp function converts the input value to a timestamp.
It only accepts one argument of type:
The string should be in RFC3339 format.
duration()
The duration function converts the input value to a duration.
It only accepts one argument of type:
The duration is represented using numbers and suffixes like "s" for seconds, "m" for minutes, "h" for hours, etc.
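To get a feel for the two input formats, the following Python sketch parses an RFC3339 timestamp and a duration string made of numbers and the s, m, and h suffixes. It is an illustration, not the engine's parser: the helper names are hypothetical, and only the three suffixes named above are handled.

```python
import re
from datetime import datetime, timedelta

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def parse_timestamp(text):
    """Parse an RFC3339 timestamp such as 2024-01-01T00:00:00Z."""
    # Older Python versions do not accept a trailing Z, so normalise it.
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def parse_duration(text):
    """Parse durations such as 90s, 15m, or 1h30m."""
    parts = re.findall(r"(\d+)([hms])", text)
    # Reject input that contains anything besides number+suffix pairs.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


started = parse_timestamp("2024-01-01T00:00:00Z")
deadline = started + parse_duration("1h30m")
print(deadline.isoformat())
```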
is_available()
This function relies on the specific context of the scan to determine whether the scan produced any results.
The scan is available if at least one successful job belonging to that scan finished after the latest finished job belonging to the current scan.
Let's walk through an example to illustrate how this works.
Let's say that we have job A, which executes on a cron schedule, and
job B, which should be scheduled when job A is available.
Let's say something triggers the job B evaluation. At this point, job A is not
available, so job B is not scheduled.
Let's say that job A executes successfully, producing some results. Now, all
expression scans are evaluated, and since job A contains a run and job B
does not, job B is scheduled.
Let's say job B executes successfully. We don't evaluate job B during
this execution, to avoid infinite recursion.
Let's say something triggers the expression scans again. Now, job B has an
execution after the latest successful execution of job A, so job B is not
scheduled.
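The walkthrough above can be replayed with a small Python model. It is a sketch of the described rule only: the is_available helper here takes plain lists of finish times, which is a simplification chosen for illustration and not the signature of the real function.

```python
def is_available(target_successes, current_finished):
    """Model of the rule: the target scan (A) is available to the current
    scan (B) when A has a successful job that finished after B's latest
    finished job. Both arguments are lists of finish times."""
    if not target_successes:
        return False  # A has never succeeded
    if not current_finished:
        return True   # A has a run and B has none
    return max(target_successes) > max(current_finished)


a_successes, b_finished = [], []
timeline = []

timeline.append(is_available(a_successes, b_finished))  # A has no run yet
a_successes.append(1)                                   # A succeeds at t=1
timeline.append(is_available(a_successes, b_finished))
b_finished.append(2)                                    # B runs at t=2
timeline.append(is_available(a_successes, b_finished))
a_successes.append(3)                                   # A succeeds again at t=3
timeline.append(is_available(a_successes, b_finished))

print(timeline)
```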
has_diff()
This function relies on the specific context of the scan to determine whether
the scan produced different results compared to the previous successful
execution, and whether that scan is_available().
The has_diff function accepts the list of jobs as the first argument, and the
artifact name as the second argument.
If only one execution exists, the function evaluates to true regardless of the
checksum value. If there are multiple executions, the latest one must have a
different checksum for the specified artifact than the one prior to it to evaluate to true.
Let's walk through an example to illustrate how this works. In this example,
has_diff always means has diff on the subfinder.zip artifact.
Let's say that we have job A which executes on the cron schedule, and we have
job B which should be scheduled when job A has different results compared to
the previous execution.
Let's say something triggers the job B evaluation. At this point, job A is not
available, so it doesn't have a diff. Therefore, job B is not
scheduled.
Let's say that job A executes successfully, producing some results. Now, all
expression scans are evaluated, and job A contains a run while job B
does not. Job A has a diff, since a first run is by default different from
nothing. Therefore, job B is scheduled.
Let's say job B executes successfully. It is ignored during evaluation.
Something else triggers the job B evaluation again. Now, job A does not have
an execution after the latest execution of job B, so even though job A
produced some results, job B is not scheduled.
Let's also say that job A executes again, producing the same results as
before. Now, job B is not scheduled, since job A is available, but it
doesn't have a diff compared to the previous execution.
Let's finally say that job A executes again, producing different results than
before. Now, job B is scheduled, since job A is available, and it has a diff
compared to the previous execution.
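The same sequence can be expressed as a short Python model. This is a sketch under assumptions: availability is passed in as a boolean instead of being derived from the context, each job is reduced to a dictionary of artifact name to checksum, and treating a missing artifact as None is a choice made for this example only.

```python
def has_diff(jobs, artifact_name, available):
    """jobs is a newest-first list of {artifact name: checksum} dictionaries."""
    if not available or not jobs:
        return False
    if len(jobs) == 1:
        return True  # a first run always differs from nothing
    # Assumption of this sketch: a missing artifact compares as None.
    return jobs[0].get(artifact_name) != jobs[1].get(artifact_name)


name = "subfinder.zip"
first = {name: "aaa"}
same = {name: "aaa"}
changed = {name: "bbb"}

results = [
    has_diff([], name, available=False),                    # A has not run
    has_diff([first], name, available=True),                # first run of A
    has_diff([first], name, available=False),               # B already ran after A
    has_diff([same, first], name, available=True),          # A reran, same output
    has_diff([changed, same, first], name, available=True), # A reran, new output
]
print(results)
```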
Certain fields in the workflow syntax support expressions. Some support
expressions using the ${{ }} construct, while others directly evaluate
expressions without the ${{ }} syntax.
Note that the ${{ <expr> }} syntax only denotes the part of the string
that should be replaced with the result of the
expression.
In other words, echo '${{ name }}' will be replaced with echo 'subfinder' if
the scan name is subfinder.
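The substitution can be imitated in a few lines of Python. This sketch only resolves dotted lookups such as name or project.id against a dictionary; the real engine evaluates full CEL expressions, and the render and lookup helpers as well as the context values are hypothetical.

```python
import re

# Matches ${{ <expr> }} and captures the expression without surrounding spaces.
_TEMPLATE = re.compile(r"\$\{\{\s*(.*?)\s*\}\}")


def lookup(context, path):
    """Resolve a dotted path such as project.id against nested dictionaries."""
    value = context
    for part in path.split("."):
        value = value[part]
    return value


def render(template, context):
    """Replace every ${{ <expr> }} occurrence with the looked-up value."""
    return _TEMPLATE.sub(lambda m: str(lookup(context, m.group(1))), template)


context = {"name": "subfinder", "project": {"id": "p-123"}}  # hypothetical values
print(render("echo '${{ name }}'", context))
print(render("project=${{ project.id }} scan=${{name}}", context))
```

Text outside the ${{ }} markers is left untouched, which matches the description of the template syntax as a partial string replacement.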
Other fields don't use the template syntax at all, since they are always evaluated as expressions.
Here is the list of fields supporting expressions, along with their syntax:
scans.[ID].on.expr: Does not use the ${{ }} syntax, since the on.expr field is always evaluated as an expression.
scans.[ID].if: Does not use the ${{ }} syntax, since the if field is always evaluated as an expression.
scans.[ID].env: Using the ${{ }} syntax, you can dynamically generate environment variable values.
scans.[ID].steps[].run: Using the ${{ }} syntax, you can dynamically generate the command to be executed by the step.
scans.[ID].steps[].if: Does not use the ${{ }} syntax, since the if field is always evaluated as an expression.