[V2] Workflow configuration file

This is the description of the JSON configuration file for a V2 Workflow data operation.

A Workflow data operation specifies one or more data operations that must all run successfully before a target data operation is triggered.

👁️‍🗨️ Example

{
    "$schema": "http://jsonschema.tailer.ai/schema/workflow-veditor",
    "configuration_type": "workflow",
    "configuration_id": "workflow-Load_Configuration",
    "version":"2",
    "environment": "DEV",
    "short_description": "This is a short description",
    "doc_md": "readme.md",
    "account": "000099",
    "activated": true,
    "archived": false,
    "gcp_project_id": "fd-io-tailer-demo-dlk",
    "schedule_interval_reset": "59 23 * * *",
    "authorized_job_ids": [
        "gbq-to-gbq|000099_assert_of_configuration_DEV"
    ],
    "target_dag": {
        "configuration_type":"table-to-table",
        "configuration_id":"Load_Configuration_DEV"
    }
}
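Before deploying, a configuration like the one above can be sanity-checked locally. The following is a minimal Python sketch, not part of Tailer Platform: the function name `check_workflow_config` is hypothetical, and the rules it checks are only those stated in the Parameters section of this page (mandatory keys, allowed environment values, expected `configuration_type` and `version`).

```python
import json

# Mandatory keys as listed in the Parameters section of this page.
MANDATORY_KEYS = (
    "configuration_type",
    "configuration_id",
    "version",
    "environment",
    "account",
    "gcp_project_id",
    "authorized_job_ids",
    "target_dag",
)
ALLOWED_ENVIRONMENTS = {"PROD", "PREPROD", "STAGING", "DEV"}


def check_workflow_config(config: dict) -> list[str]:
    """Hypothetical helper: return a list of problems, empty if none was found."""
    problems = [f"missing mandatory key: {key}" for key in MANDATORY_KEYS if key not in config]
    if "configuration_type" in config and config["configuration_type"] != "workflow":
        problems.append('configuration_type must be "workflow"')
    if "version" in config and config["version"] != "2":
        problems.append('version must be "2"')
    if "environment" in config and config["environment"] not in ALLOWED_ENVIRONMENTS:
        problems.append("environment must be one of PROD, PREPROD, STAGING, DEV")
    target = config.get("target_dag")
    if isinstance(target, dict):
        for key in ("configuration_type", "configuration_id"):
            if key not in target:
                problems.append(f"target_dag is missing {key}")
    return problems


# Values copied from the example above (optional keys omitted).
example = json.loads("""
{
    "configuration_type": "workflow",
    "configuration_id": "workflow-Load_Configuration",
    "version": "2",
    "environment": "DEV",
    "account": "000099",
    "gcp_project_id": "fd-io-tailer-demo-dlk",
    "authorized_job_ids": ["gbq-to-gbq|000099_assert_of_configuration_DEV"],
    "target_dag": {
        "configuration_type": "table-to-table",
        "configuration_id": "Load_Configuration_DEV"
    }
}
""")
print(check_workflow_config(example))  # [] means no problem was found
```

This check is deliberately shallow. The JSON schema referenced by `$schema` remains the authoritative definition of what a valid configuration is.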

⚙️ Parameters

Parameter / Description

$schema

type: string

optional

The URL of the JSON schema that defines the properties your configuration must satisfy. Most code editors can use it to validate your configuration, display help boxes and highlight issues.

configuration_type

type: string

mandatory

Type of configuration file. In this case, the value has to be "workflow".

configuration_id

type: string

mandatory

ID of the workflow.

You can pick any name you want, but it has to be unique for this type of configuration file.

Note that in case of conflict, the newly deployed data operation will overwrite the previous one.

version

type: string

mandatory

Version of the configuration. Must be "2" in order to use the latest features.

Default: "1", for backward compatibility. Only version "2" supports the latest features, and version 1 is deprecated.

environment

type: string

mandatory

Deployment context.

Values: PROD, PREPROD, STAGING, DEV.

short_description

type: string

optional

Short description of the context of the configuration file.

doc_md

type: string

optional

Path to a file containing a detailed description of the data operation. The file must be in Markdown format.

account

type: string

mandatory

Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.

activated

type: boolean

optional

Flag used to enable/disable the execution of the workflow.

If not specified, the default value will be "true".

archived

type: boolean

optional

Flag used to enable/disable the visibility of the workflow's configuration and runs in Tailer Studio.

If not specified, the default value will be "false".

gcp_project_id

type: string

mandatory

ID of the Google Cloud project containing the BigQuery instance to be triggered.

schedule_interval_reset

type: string

optional

You can reset your workflow on a regular schedule by specifying a Cron expression here (see, for example, crontab.guru). When a workflow is reset, all its triggering conditions are set to false, so all previous runs are forgotten.

Example:

For a daily job, you may want to reset the workflow every day at 23:59, so that runs from previous days are not taken into account for the current day. To do so, set it as follows:

"schedule_interval_reset": "59 23 * * *"

Default: None
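The triggering and reset behaviour described above can be pictured with a small Python model. This is only an illustrative sketch, not Tailer Platform's implementation: the class `WorkflowState`, its method names and the two job IDs are hypothetical. It keeps one boolean condition per authorized job ID, and a reset sets every condition back to false.

```python
class WorkflowState:
    """Illustrative model: one triggering condition per authorized job ID."""

    def __init__(self, authorized_job_ids):
        self.conditions = {job_id: False for job_id in authorized_job_ids}

    def record_success(self, job_id):
        # Only jobs listed in authorized_job_ids count towards the trigger.
        if job_id in self.conditions:
            self.conditions[job_id] = True

    def ready_to_trigger(self):
        # The target is triggered only once every listed job has succeeded.
        return bool(self.conditions) and all(self.conditions.values())

    def reset(self):
        # Effect of schedule_interval_reset: every condition goes back to false.
        for job_id in self.conditions:
            self.conditions[job_id] = False


# Hypothetical job IDs, for illustration only.
state = WorkflowState([
    "gbq-to-gbq|000099_hypothetical_job_a_DEV",
    "gbq-to-gbq|000099_hypothetical_job_b_DEV",
])
state.record_success("gbq-to-gbq|000099_hypothetical_job_a_DEV")
print(state.ready_to_trigger())  # False: job_b has not succeeded yet
state.record_success("gbq-to-gbq|000099_hypothetical_job_b_DEV")
print(state.ready_to_trigger())  # True: every condition is met
state.reset()
print(state.ready_to_trigger())  # False: previous runs are forgotten
```

With a daily reset at 23:59, a success recorded yesterday cannot combine with a success recorded today to trigger the target.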

authorized_job_ids

type: string array

mandatory

Data operations that must have run successfully for the current workflow to be triggered. Each job ID can be retrieved in Tailer Studio, from the Runs, in Run Details > Job Id. To wait for several data operations, list their job IDs separated by commas. Example: "gbq-to-gbq
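Judging from the example configuration at the top of this page, a job ID appears to be made of a data operation type and an identifier joined by a `|` character. The Python sketch below splits a job ID on that assumption. The helper `split_job_id` is hypothetical and the format is inferred from that single example, not from a specification.

```python
def split_job_id(job_id: str) -> tuple[str, str]:
    """Hypothetical helper: split '<type>|<identifier>' into its two parts."""
    job_type, separator, identifier = job_id.partition("|")
    if not separator or not job_type or not identifier:
        raise ValueError(f"unexpected job ID format: {job_id!r}")
    return job_type, identifier


# Job ID taken from the example configuration on this page.
job_type, identifier = split_job_id("gbq-to-gbq|000099_assert_of_configuration_DEV")
print(job_type)    # gbq-to-gbq
print(identifier)  # 000099_assert_of_configuration_DEV
```

A check like this can catch a job ID that was pasted incompletely before the configuration is deployed.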

target_dag

type: dict

mandatory

Data operation to trigger. target_dag contains two key-value pairs:

configuration_type (string): type of the data operation to trigger. It must be one of the following values:

- "storage-to-storage"
- "storage-to-table"
- "table-to-table"
- "table-to-storage"
- "vm-launcher"

configuration_id (string): configuration_id of the data operation to trigger, as it appears in the last part of its job_id. If you deployed the target job without a context, it is the configuration_id concatenated with the environment (for example: yourConfId_PROD). If you deployed the target job with a context, it is the context ID, starting with the account ID, concatenated with the configuration ID (for example: 000099_yourContextId_yourConfId).
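The two naming rules for the target configuration_id can be written as a short Python sketch. The function `target_configuration_id` and its parameters are hypothetical. It assumes the parts are joined with underscores exactly as in the two examples given above, and it treats the account ID and the context ID as separate inputs.

```python
from typing import Optional


def target_configuration_id(
    configuration_id: str,
    environment: Optional[str] = None,
    account: Optional[str] = None,
    context_id: Optional[str] = None,
) -> str:
    """Hypothetical helper: build the value expected in target_dag.configuration_id."""
    if context_id is not None:
        # Deployed with a context: account ID, context ID, then configuration ID.
        if account is None:
            raise ValueError("account is required when a context is used")
        return f"{account}_{context_id}_{configuration_id}"
    # Deployed without a context: configuration ID followed by the environment.
    if environment is None:
        raise ValueError("environment is required when no context is used")
    return f"{configuration_id}_{environment}"


print(target_configuration_id("yourConfId", environment="PROD"))
# yourConfId_PROD
print(target_configuration_id("yourConfId", account="000099", context_id="yourContextId"))
# 000099_yourContextId_yourConfId
```

When in doubt, the value shown as the last part of the target's Job Id in Tailer Studio is the one to use.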
