# What are Workflow Expressions?

Workflow expressions are snippets of code that allow you to express conditions, extract data, and manipulate values within your workflows.

Their purpose is to make your workflows dynamic by evaluating an expression against the context of the job.

In this article, we will explore how expressions are evaluated, the context in which they are executed, the syntax used, and practical use cases.

# Context

Context represents the state against which an expression is evaluated. It holds the information you can reference when building your expressions.

You can see how the context is being used inside the runner by visiting the jobengine crate inside the runner repository.

# id

The id field represents the unique identifier of the job. This field is mostly used by the expression engine helper functions. However, if you are building your own notification server, you can use the ${{ id }} expression to get the job ID.

# name

The name field represents the name of the scan. Since a job is an instance of a scan, it doesn't have its own name. You can use the ${{ name }} expression to get the scan name.

Mostly, this field is used by the expression engine helper functions.

# project

Project is an object that contains information about the project this scan belongs to.

It only has one field: id. You can use the ${{ project.id }} expression to get the project ID.

# workflow

Workflow is an object that contains information about the workflow this job belongs to.

It only has one field: id. You can use the ${{ workflow.id }} expression to get the workflow ID.

# revision

Revision is an object that contains information about the workflow revision this job belongs to.

It only has one field: id. You can use the ${{ revision.id }} expression to get the revision ID.

# scans

Scans is an object containing scan names as keys and the lists of jobs belonging to those scans as values.

This is the primary field used by the expression engine helper functions to extract data from the executions of the scans and to make decisions based on that data.

Since this is one of the most important fields added to the context, let's explore it in more detail.

# scans.[ID]

The ID inside scans is a string representing the name of the scan.

Since each scan name is unique, the name of the scan is an identifier of the scan.

Let's say we have a scan named subfinder. Accessing the list of successful jobs belonging to the subfinder scan is done using ${{ scans.subfinder }}.

Each job object in the list contains metadata related to the successful execution of a job that belongs to the scan.

The size of this list is currently set to 10, meaning only the last 10 successful jobs are stored in the context. Therefore, size(scans.subfinder) for example will always be <= 10.

During scheduling, the last 10 finished jobs are used to determine whether the scan should be scheduled. The distinction exists to avoid re-scheduling failed scans.

When the jobs are used in scans.[ID].steps[], you only care about the last 10 successful jobs. However, when scheduling fields such as scans.[ID].on.expr or scans.[ID].if are evaluated, the last 10 finished jobs are used to determine whether the scan should be scheduled.

Jobs are sorted from the latest to the oldest, meaning the job at index 0 is the latest successful job, and the job at index 9 is the oldest successful job.

In other words, ${{ scans.subfinder[0] }} will give you the latest successful job metadata.

If you need context to contain more jobs, please reach out to us through our Discord server or use any of the links provided on the support page.

# scans.[ID][index].id

The id field inside the job object represents the unique identifier of the job. Each job has its own unique ID, and it is the primary way to reference the job.

You can use the ${{ scans.subfinder[0].id }} expression to get the ID of the latest successful job belonging to the subfinder scan.

When fetching the output of the job, bh uses the -j flag to specify the job ID.


This command fetches the artifact named subfinder.zip from the latest successful job belonging to the subfinder scan.

# scans.[ID][index].artifacts[]

The artifacts field inside the job object metadata contains a list of artifacts produced by the job.

You can use the ${{ scans.subfinder[0].artifacts }} expression to get the list of artifacts produced by the latest successful job belonging to the subfinder scan.

# scans.[ID][index].artifacts[index].name

The name field inside the artifact object represents the name of the artifact produced by the job.

# scans.[ID][index].artifacts[index].checksum

The checksum field inside the artifact object represents the checksum of the artifact produced by the job. This field is of type string, and is used to determine whether the artifact has changed or not.
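
To make the shape of this field concrete, here is a rough sketch of what the scans portion of the context could look like, rendered as YAML. Only the field names (id, artifacts, name, checksum) and the subfinder.zip artifact name come from the descriptions above. The job IDs and checksum values are made-up placeholders, and the real context may carry more data than shown.

```yaml
# Illustrative shape only: every value below is a hypothetical placeholder.
scans:
  subfinder:                      # the scan name is the key
    - id: "job-id-latest"         # scans.subfinder[0].id (latest successful job)
      artifacts:
        - name: "subfinder.zip"   # scans.subfinder[0].artifacts[0].name
          checksum: "checksum-a"  # scans.subfinder[0].artifacts[0].checksum
    - id: "job-id-previous"       # scans.subfinder[1].id (one job older)
      artifacts:
        - name: "subfinder.zip"
          checksum: "checksum-b"
```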

# vars

Vars is an object containing user-defined variables that can be used to store custom data. Vars represents the Project Variables defined in the project settings.

This field is not checked during workflow validation. You can add or remove variables as you wish. However, make sure to use valid variable names.

If the variable doesn't exist, and it is used inside an expression, the job will not be scheduled.

# vars.[ID]

The ID is the name of the Project Variable. You can use the ${{ vars.MY_VAR }} expression to get the value of the MY_VAR Project Variable.

# secrets

Secrets is an object containing user-defined secrets that can be used to store sensitive data. Secrets represents the Project Secrets defined in the project settings.

This field is not checked during workflow validation. You can add or remove secrets as you wish. However, make sure to use valid secret names.

If the secret doesn't exist, and it is used inside an expression, the job will not be scheduled.

# secrets.[ID]

The ID is the name of the Project Secret. You can use the ${{ secrets.MY_SECRET }} expression to get the value of the MY_SECRET Project Secret.
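
As a minimal sketch, this is how vars and secrets could be passed into a scan through the env field. It assumes that env is a map of variable names to values and that those values are visible to the shell in run steps. The names TARGET and API_TOKEN are hypothetical.

```yaml
# Sketch only: MY_VAR and MY_SECRET must exist in the project settings,
# otherwise the job will not be scheduled.
scans:
  subfinder:
    env:
      TARGET: ${{ vars.MY_VAR }}
      API_TOKEN: ${{ secrets.MY_SECRET }}
    steps:
      - run: echo "scanning $TARGET"
```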

# inputs

Inputs is an object containing user-defined inputs that can be used to store custom data. Inputs represents the Workflow Inputs defined in the workflow definition.

Inputs can only be specified by a dispatch scan.

Otherwise, the size of this object will be zero.

# inputs.[ID]

The ID is the name of the Workflow Input. You can use the ${{ inputs.MY_INPUT }} expression to get the value of the MY_INPUT Workflow Input.

The type evaluated by the expression engine is the type you specified in the dispatch input types.

# ok

The ok field is a boolean value representing whether the job is still on track to succeed.

It is set to false as soon as a step outcome is set to failure.

It is mostly used internally to skip steps using the ok && (<if expression>) pattern.

This field is updated after each step.

# always

The always field is a boolean value that is always set to true.

This field is useful when you always want to run a step, regardless of the job outcome.

Keep in mind that, since the ok && (<if expression>) pattern is used internally, steps that should always run need to use the always field to override the internal check.
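
A minimal sketch of the difference between a regular conditional step and a step that uses always. The scan name and the echo commands are placeholders chosen for illustration.

```yaml
scans:
  subfinder:
    steps:
      # Effectively evaluated as: ok && (size(scans.subfinder) > 0)
      - if: size(scans.subfinder) > 0
        run: echo 'runs only while no earlier step has failed'
      # References always, so it is meant to run even after a failure
      - if: always
        run: echo 'cleanup'
```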


# Expressions

Expressions are snippets of code that are evaluated using the context of the job.

The language used for expressions is called CEL.

You can find the CEL expression parser used by the platform on GitHub.

Since I have created the language parser, I can quickly fix any issues you might find, or add new features you might need.

Since going through the CEL language is out of scope for this article, I will cover the functions that are useful when building workflows.

The expressions are rather straightforward to use. However, if you need help building expressions, feel free to reach out to us through our Discord server or use any of the links provided on the support page.

Keep in mind the following:

  • map is effectively an object with key-value pairs.
  • list is effectively an array of values.
  • string is a sequence of characters.
  • bytes is a sequence of bytes.
  • methods or functions can be used interchangeably.
  • you can call every function as a method on the object itself (e.g., list.size()).
  • you can call functions using the function syntax (e.g., size(list)).

Functions are separated into built-in functions defined by CEL itself, and helper functions provided by the expression engine.

# Built-in Functions

# size()

Size returns an integer representing the size of the list, map, string, or bytes value passed as an argument.

Inputs:

  • list: Returns the number of elements in the list.
  • map: Returns the number of key-value pairs in the map.
  • string: Returns the number of characters in the string.
  • bytes: Returns the number of bytes in the byte array.

Workflow example:
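
A minimal sketch, assuming a hypothetical httpx scan that should only do its work when at least one successful subfinder job is stored in the context:

```yaml
scans:
  httpx:
    steps:
      # Function form; the method form scans.subfinder.size() is equivalent.
      - if: size(scans.subfinder) > 0
        run: echo 'latest subfinder job is ${{ scans.subfinder[0].id }}'
```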


# type()

Type returns a string representing the type of the value passed as an argument.

This is mostly useless in your workflows, but can be useful when debugging complex expressions.

Type accepts all types as an input, and returns the string name of the type.

# has()

Has returns a boolean value representing whether the map contains the specified key.

The only input type used by this function is map.

Workflow example:
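
A minimal sketch, assuming the standard CEL macro form has(map.key). It guards a step on the presence of a workflow input, which is absent unless the job comes from a dispatch scan. MY_INPUT is the input name used earlier in this article.

```yaml
scans:
  subfinder:
    steps:
      # True only when the MY_INPUT key is present in the inputs map.
      - if: has(inputs.MY_INPUT)
        run: echo 'received input ${{ inputs.MY_INPUT }}'
```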


# all()

All returns a boolean value that is true if the expression evaluates to true for every element in the list.

All accepts 3 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. An expression that evaluates to a boolean value.

# exists()

Exists returns a boolean value representing whether there is at least one element in the list for which the expression evaluates to true.

Exists accepts 3 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. An expression that evaluates to a boolean value.

This example covers the case where subfinder can have conditional artifacts. The step runs only if subfinder produced at least one artifact.
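
One way to write such a condition, as a sketch: the list is the receiver, job is the iteration variable, and the last argument is the boolean expression. The httpx scan name is hypothetical.

```yaml
scans:
  httpx:
    steps:
      # True if at least one stored subfinder job has a non-empty artifacts list.
      - if: scans.subfinder.exists(job, size(job.artifacts) > 0)
        run: echo 'subfinder produced artifacts'
```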


# exists_one()

Exists_one returns a boolean value representing whether there is exactly one element in the list for which the expression evaluates to true.

Exists_one accepts 3 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. An expression that evaluates to a boolean value.

# map()

Map returns a list containing the results of applying the specified expression to each element in the input list or map.

It can either accept 3 parameters or 4 parameters.

When accepting 3 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. An expression that produces the value to be included in the resulting list.

When accepting 4 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. Expression that filters the elements to be included in the resulting list.
  4. An expression that produces the value to be included in the resulting list.

# filter()

Filter returns a list containing the elements from the input list or map that satisfy the specified condition.

Filter accepts 3 parameters:

  1. A list, or a map.
  2. A variable name representing the current element in the iteration.
  3. An expression that evaluates to a boolean value.
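
A minimal sketch of filter inside a step condition. It keeps only the subfinder jobs that produced artifacts and then counts them. The httpx scan name and the threshold of two are arbitrary choices for illustration.

```yaml
scans:
  httpx:
    steps:
      # Keep only the jobs that produced artifacts, then count them.
      - if: size(scans.subfinder.filter(job, size(job.artifacts) > 0)) >= 2
        run: echo 'at least two subfinder jobs produced artifacts'
```

The same selection combined with a transformation could be written with the four-parameter form of map, for example scans.subfinder.map(job, size(job.artifacts) > 0, job.id) to collect the IDs of those jobs.
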

# contains()

Contains returns a boolean value representing whether the specified substring is present within the given string.

Inputs:

  • string: The string to search within.
  • substring: The substring to search for.

Workflow example when we want to run a step only if the SCOPE variable contains bountyhub.org.

Maybe, bountyhub.org has special rules allowing us to run additional scans.

Maybe the target goes out-of-scope periodically, and we want to avoid running certain scans when that happens. We can use this function to check whether the target is still in scope.
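
A minimal sketch of that check, assuming SCOPE is defined as a Project Variable holding a string:

```yaml
scans:
  subfinder:
    steps:
      # Skipped whenever bountyhub.org is missing from the SCOPE variable.
      - if: vars.SCOPE.contains('bountyhub.org')
        run: echo 'bountyhub.org is in scope'
```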


# startsWith()

startsWith returns a boolean value representing whether the given string starts with the specified prefix.

Inputs:

  • string: The string to check.
  • prefix: The prefix to look for.

# endsWith()

endsWith returns a boolean value representing whether the given string ends with the specified suffix.

Inputs:

  • string: The string to check.
  • suffix: The suffix to look for.
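
A minimal sketch combining startsWith and endsWith. TARGET is a hypothetical Project Variable assumed to hold a URL-like string such as https://api.bountyhub.org.

```yaml
scans:
  httpx:
    steps:
      # Both checks must hold for the step to run.
      - if: vars.TARGET.startsWith('https://') && vars.TARGET.endsWith('.bountyhub.org')
        run: echo 'target is an HTTPS host under bountyhub.org'
```
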

# matches()

Matches tests whether a string matches a regular expression. The regular expression syntax supported by this implementation is documented in the regex crate documentation.

Inputs:

  • string: The string to check.
  • regex: String representing the regular expression pattern.

Workflow example when we want to run a step only if the SCOPE variable matches a specific regular expression pattern.
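
A minimal sketch, assuming SCOPE holds a single host name. The pattern is anchored explicitly with ^ and $ so it does not depend on whether matching is partial or full, and it uses [.] instead of a backslash escape to sidestep string-escaping rules.

```yaml
scans:
  subfinder:
    steps:
      # Matches bountyhub.org and any of its subdomains.
      - if: vars.SCOPE.matches('^([a-z0-9-]+[.])*bountyhub[.]org$')
        run: echo 'scope is bountyhub.org or one of its subdomains'
```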


# uint()

The uint function converts the input value to an unsigned integer.

It only accepts one argument of type:

  • int: returns a uint
  • uint: returns a uint
  • double: returns the integer part of the double as a uint
  • string: parses the string and returns the uint value
  • timestamp: returns the number of seconds since epoch as a uint

# int()

The int function converts the input value to an integer.

It only accepts one argument of type:

  • int: returns an int
  • uint: returns an int
  • double: returns the integer part of the double as an int
  • string: parses the string and returns the int value
  • timestamp: returns the number of seconds since epoch as an int
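
A minimal sketch of a conversion inside a condition. It assumes that Project Variables are exposed as strings, so the value has to be parsed before a numeric comparison. MAX_HOSTS is a hypothetical variable name.

```yaml
scans:
  httpx:
    steps:
      # int() parses the string value before it is compared with a number.
      - if: int(vars.MAX_HOSTS) > 100
        run: echo 'large scope, running the extended scan'
```
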

# string()

The string function converts the input value to a string.

It only accepts one argument of type:

  • int: returns the string representation of the int
  • uint: returns the string representation of the uint
  • double: returns the string representation of the double
  • string: returns the string itself
  • bool: returns "true" or "false" based on the boolean value
  • bytes: returns the string representation of the byte array
  • timestamp: returns the string representation of the timestamp
  • null: returns "null"
  • duration: returns the string representation of the duration in seconds.

# timestamp()

The timestamp function converts the input value to a timestamp.

It only accepts one argument of type:

  • string: parses the string and returns the timestamp value
  • timestamp: returns the timestamp itself

The string should be in RFC3339 format.

# duration()

The duration function converts the input value to a duration.

It only accepts one argument of type:

  • string: parses the string and returns the duration value
  • duration: returns the duration itself

The duration is represented using numbers and suffixes like "s" for seconds, "m" for minutes, "h" for hours, etc.

# Specialized functions

# is_available()

This function relies on the specific context of the scan to determine whether the scan produced any results.

The scan is available if there is at least one job belonging to that scan that executed successfully after the latest finished job belonging to the current scan.
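
A minimal sketch of how this could look in a workflow. A and B are placeholder scan names, the method-call form is an assumption (the function form is_available(scans.A) should be equivalent, since functions and methods are interchangeable), and the trigger for scan A is left out.

```yaml
scans:
  B:
    on:
      # Schedule B once A has a successful job newer than B's latest finished job.
      expr: scans.A.is_available()
    steps:
      - run: echo 'processing results of job ${{ scans.A[0].id }}'
```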


Let's walk through an example to illustrate how this works.

Let's say that we have job A which executes on the cron schedule, and we have job B which should be scheduled when job A is available.

Let's say something triggers the job B evaluation. At this point, job A is not available, so job B is not scheduled.

Let's say that job A executes successfully, producing some results. Now, all expression scans are evaluated, and since job A contains a run and job B does not, job B is scheduled.

Let's say job B executes successfully. We don't evaluate job B during this execution, to avoid indefinite recursive calls.

Let's say something triggers the expression scans again. Now, job B has an execution after the latest successful execution of job A, so job B is not scheduled.

# has_diff()

This function relies on the specific context of the scan to determine whether the scan is_available() and whether it produced different results compared to the previous successful execution.

The has_diff function accepts the list of jobs as the first argument, and the artifact name as the second argument.

If the list contains only one job, has_diff evaluates to true regardless of the artifact's value. If the list contains multiple jobs, the latest job must have a different checksum for the specified artifact than the job prior to it for has_diff to evaluate to true.
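
A minimal sketch of how this could look in a workflow. A and B are placeholder scan names, and the method-call form is an assumption: the list of jobs is the receiver (the first argument) and the artifact name is the second argument.

```yaml
scans:
  B:
    on:
      # Schedule B only when A is available and its latest subfinder.zip
      # differs from the one produced by the previous successful job.
      expr: scans.A.has_diff('subfinder.zip')
    steps:
      - run: echo 'subfinder.zip changed in job ${{ scans.A[0].id }}'
```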


Let's walk through an example to illustrate how this works. In this example, has_diff always means has diff on the subfinder.zip artifact.

Let's say that we have job A which executes on the cron schedule, and we have job B which should be scheduled when job A has different results compared to the previous execution.

Let's say something triggers the job B evaluation. At this point, job A is not available, so it doesn't have a diff, and job B is not scheduled.

Let's say that job A executes successfully, producing some results. Now, all expression scans are evaluated. Job A contains a run and job B does not, and job A contains a diff, since the first run is by default different from nothing. Therefore, job B is scheduled.

Let's say job B executes successfully. It is ignored during evaluation.

Something else triggers the job B evaluation again. Now, job A does not have an execution after the latest execution of job B, so even though job A produced some results, job B is not scheduled.

Let's also say that job A executes again, producing the same results as before. Now, job B is not scheduled, since job A is available, but it doesn't have a diff compared to the previous execution.

Let's finally say that job A executes again, producing different results than before. Now, job B is scheduled, since job A is available, and it has a diff compared to the previous execution.

# Fields supporting expressions

Certain fields in the workflow syntax support expressions. Some support expressions using the ${{ }} construct, while others directly evaluate expressions without the ${{ }} syntax.

To be explicit, the ${{ <expr> }} syntax is only used to denote the part of the string that should be replaced with the result of the expression.

In other words, echo '${{ name }}' will be replaced with echo 'subfinder' if the scan name is subfinder.

Other fields don't use the template syntax at all, since they are by default expressions.

Here is the list of fields supporting expressions, along with their syntax:

  • scans.[ID].on.expr: Does not use the ${{ }} syntax, since on.expr field is always evaluated as an expression.
  • scans.[ID].if: Does not use the ${{ }} syntax, since if field is always evaluated as an expression.
  • scans.[ID].env: Using the ${{ }} syntax, you can dynamically generate environment variable values.
  • scans.[ID].steps[].run: Using the ${{ }} syntax, you can dynamically generate the command to be executed by the step.
  • scans.[ID].steps[].if: Does not use the ${{ }} syntax, since if field is always evaluated as an expression.
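
Putting the list above together, here is a rough sketch of a single scan that uses all five fields. The scan names and the SUBFINDER_JOB variable are hypothetical, and the sketch assumes that env is a map whose values are visible to the shell in run steps.

```yaml
scans:
  httpx:
    on:
      # Bare expression, no template syntax
      expr: scans.subfinder.is_available()
    # Bare expression, no template syntax
    if: size(scans.subfinder) > 0
    env:
      # Template syntax: replaced with the result of the expression
      SUBFINDER_JOB: ${{ scans.subfinder[0].id }}
    steps:
      # if is a bare expression, run uses the template syntax
      - if: always
        run: echo "scan ${{ name }} uses job $SUBFINDER_JOB"
```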

# Next steps