Running a job

This section documents how the runner is configured, which environment variables are available, and how the workflow is actually executed from the runner's perspective.

How does the runner work behind the scenes?

When the runner is started, it first tries to read its configuration. Once the configuration is read, it builds its environment from the environment variables currently available to it, along with the .env file.

So if you want to expose API keys to the tools the runner executes, without setting them globally, you can either:

  1. Expose environment variables by running export YOUR_ENV=some_value, and then start the runner.
  2. Create the .env file next to the runner binary. The .env file contains a list of keys and values that will be sourced by the runner. Here is an example:
# Content of the .env file
YOUR_SECRET_API_KEY=some_api_key
BOUNTYHUB_TOKEN=your_bountyhub_token
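
Once a variable is exposed in either way, your steps can read it like any ordinary environment variable. A minimal sketch (the variable name comes from the .env example above, and the URL is just a placeholder):

run: |
  # The runner sources the exported variables and the .env file,
  # so the script can reference them directly.
  curl -H "Authorization: Bearer ${YOUR_SECRET_API_KEY}" https://api.example.com/scan
shell: bash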

Once everything is ready, the runner will start periodically asking the server for a new job. When a job becomes available, the runner will try to lock the job for itself. Then, it will start executing the job one step at a time.

The important thing to note is that the runner can only run a single job at a time. This is intentional, both to guard against triggering safety mechanisms (your server can be blacklisted, rate-limited, etc.) and to provide protection against an accidental DoS attack on your target.

The runner received a job, what is next?

The runner will try to keep the job alive by periodically pinging the server. On a separate thread, the runner will:

  1. Prepare the working directory by creating a unique subdirectory within your work folder
  2. Execute steps
  3. Upload the results if there are any
  4. Clean up the working directory.

Let's dive into each step.

1. Preparing the working directory.

The runner will check if the working directory you configured (_work by default) exists. If it doesn't, it will create one.

Within that directory, it will create a unique folder for each job. This ensures 2 things:

  1. You always start with an empty, fresh directory
  2. A previous job cannot interfere with the new one that landed on the runner. Even if the previous cleanup failed, the runner will still do its best to run the new job.

If this step fails, the runner will not execute the job, and the job will be marked as failed.

On the other hand, if this step is successful, then the runner starts executing your workflow.
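
Since every job starts in its own fresh directory (and the temporary step files are written there as well), relative paths in your steps are safe to use. A small sketch, assuming the runner starts your steps inside that unique directory:

run: |
  # This runs inside the unique per-job directory under _work,
  # so files created here cannot collide with files from other jobs.
  mkdir recon
  assetfinder bountyhub.org > recon/subdomains.txt
shell: bash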

2. Executing steps

The most important part of each workflow is the sequence of steps that should be taken to execute your desired action.

The two most important fields of each step are:

  • run
  • shell

The run field contains the contents of the script, or the desired action that should be taken. The contents of the run field are written to a temporary file under the unique per-job directory inside your work directory.

The shell field is the program that will execute that file. Let's use an example to better illustrate this behavior.

run: |
  echo "Running the assetfinder"
  assetfinder bountyhub.org > output.txt
  line_count=$(wc -l output.txt | cut -d ' ' -f 1)
  echo "Found ${line_count} subdomains"
shell: bash

What the runner does is take the contents of run and write them to a uniquely named file. Once the file is written, the runner runs which on the value of the shell field to resolve the interpreter. Let's say that which finds bash at /usr/bin/bash, and that the file name is ddcd57b0-143d-013d-a414-04421ad31b68. The command that will be executed is: /usr/bin/bash ddcd57b0-143d-013d-a414-04421ad31b68. That is all there is to it.
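
In other words, the execution is roughly equivalent to the following shell commands (the file name is the illustrative one from above, and the heredoc simply stands in for the runner writing the script to disk):

# Write the contents of the run field to a uniquely named file
cat > ddcd57b0-143d-013d-a414-04421ad31b68 <<'EOF'
echo "Running the assetfinder"
assetfinder bountyhub.org > output.txt
line_count=$(wc -l output.txt | cut -d ' ' -f 1)
echo "Found ${line_count} subdomains"
EOF

# Resolve the shell with which, then execute the file with it
/usr/bin/bash ddcd57b0-143d-013d-a414-04421ad31b68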

The runner listens to stdout and stderr. Once the command finishes, the runner inspects the exit code. If the exit code is not 0, it reports the step as failed. Both stdout and stderr are propagated to the back-end for further inspection.
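
Keep in mind that with bash, the script's exit code is the exit code of its last command, so earlier failures can go unnoticed. If you want a step to fail as soon as any command fails, a common pattern is to make the script strict yourself:

run: |
  # Fail the step on the first failing command, undefined variable, or broken pipe
  set -euo pipefail
  assetfinder bountyhub.org > output.txt
  wc -l output.txt
shell: bash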

3. Uploading results

Results are uploaded only if the scan specifies the uploads field. Otherwise, the job will have no result. The downside of not having a result is that you cannot store it on the server and retrieve it from any machine.

If you rely solely on stdout/stderr, keep in mind that these outputs expire after a month, so you cannot copy them after that period. Another restriction is that a combined request can be at most 2MiB, meaning that the contents of stdout and stderr combined cannot exceed 2MiB.
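
A practical way to stay within that limit is to write large outputs to files and keep stdout to a short summary, then upload the files as results (see below). A sketch:

run: |
  # Keep the bulky output in a file and only print a summary,
  # so stdout stays well under the 2MiB request limit.
  assetfinder bountyhub.org > subdomains.txt
  echo "Found $(wc -l < subdomains.txt) subdomains"
shell: bash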

After all steps are done, the runner will try to build a result zip file. You should only upload results that are essential to the job. If the files or directories you specified are not present, the upload step will fail, and with it, the overall job.

If the uploads field is not specified, this step will not be taken.
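
As a sketch of what specifying uploads might look like, assuming it accepts a list of file or directory paths relative to the job's working directory (the exact syntax may differ, so check the workflow reference):

# Hypothetical uploads list; paths are relative to the job's working directory
uploads:
  - subdomains.txt
  - recon/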

4. Cleanup

After each job, the runner will try to clean up the working directory. The reason for cleanup is to minimize the storage used on your servers. Currently, there is no caching mechanism, so you need to download dependencies on each run.
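
Because nothing is cached, any tools your steps depend on have to be fetched on every run. For example, a first step could install the tool used later, assuming Go is available on the runner machine:

run: |
  # Install assetfinder on every run since there is no cache between jobs
  go install github.com/tomnomnom/assetfinder@latest
shell: bash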

If cleanup fails, the overall job does not fail. The reason is that cleanup may fail for something like a permission issue, while your job still found its result, and you should be able to inspect it. If you notice cleanup steps failing, manually clean up those directories or fix the workflow so that cleanup can succeed. You can of course choose to ignore it, but after a while you can end up with 100% disk usage, which is not a good situation to be in.
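
If a tool leaves behind files the runner cannot remove (for example, read-only files or directories), one option is to add a final step that restores permissions so cleanup can succeed. A sketch:

run: |
  # Make everything under the job directory writable again so cleanup can remove it
  chmod -R u+rwX .
shell: bash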