VM Launcher configuration file for code processing
This is the description of the JSON configuration file for a VM Launcher code processing data operation.
The configuration file is in JSON format. It contains the following sections:
Global parameters: General information about the data operation.
Script parameters: Information about the script location and instructions to execute on the VM.
VM parameters: Information related to the VM where to execute the script.
Here is an example of VM Launcher configuration file for code processing:
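The following is a minimal sketch of such a file rather than an official sample. It assumes that all parameters sit at the top level of the JSON object and that the configuration_type value is "vm-launcher". The ID, account, project, bucket, directory and command values are hypothetical. The mandatory gcp_credentials_secret dict is left empty as a placeholder, since its encrypted content is specific to your account.

```json
{
  "configuration_type": "vm-launcher",
  "configuration_id": "000099-my-bucket-my-directory",
  "version": "2",
  "environment": "DEV",
  "account": "000099",
  "activated": true,
  "archived": false,
  "start_date": "2023, 11, 23",
  "schedule_interval": "0 7 * * *",
  "gcp_project_id": "my-gcp-project",
  "gcs_bucket": "my-bucket",
  "gcs_working_directory": "some/sub/dir",
  "gcp_credentials_secret": {},
  "script_to_execute": [
    "python3 my_script.py"
  ],
  "vm_delete": true,
  "vm_core_number": "2",
  "vm_memory_amount": "4",
  "vm_disk_size": "20",
  "vm_compute_zone": "europe-west1-b"
}
```

The VM values shown are the documented defaults, so they could be omitted without changing the behavior.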
Global parameters
General information about the data operation.
$schema
type: string
optional
The URL of the JSON schema that defines the properties your configuration must conform to. Most code editors can use it to validate your configuration, display help boxes and highlight issues.
configuration_type
type: string
mandatory
Type of data operation.
For a VM Launcher data operation, the value is always "vm-launcher".
configuration_id
type: string
mandatory
ID of the data operation.
You can pick any name you want, but it has to be unique for this data operation type.
Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:
your account ID,
the source bucket name,
and the source directory name.
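As an illustration of this naming practice, the fragment below joins the three parts with hyphens. The separator is an assumption, and the account ID, bucket name and directory name are hypothetical values.

```json
{
  "configuration_id": "000099-my-source-bucket-my-source-directory"
}
```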
version
type: string
optional
Version of the configuration. Must be "2" to use the latest features.
Default value: "1". Only version "2" supports start_date and schedule_interval. Version "1" is deprecated.
environment
type: string
mandatory
Deployment context.
Values: PROD, PREPROD, STAGING, DEV.
account
type: string
mandatory
Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.
doc_md
type: string
optional
Path to a file containing a detailed description of the data operation. The file must be in Markdown format.
start_date
type: string
optional
Only available for version "2" and later.
Mandatory if you want to specify a schedule_interval.
Start date of the data operation.
The format must be: "YYYY, MM, DD"
Where:
YYYY >= 1970
MM = [1, 12]
DD = [1, 31]
schedule_interval
type: string
optional
Only available for version "2" and later.
A VM Launcher can be launched in two different ways:
If you want the data operation to start at regular intervals, you can define this in the schedule_interval parameter with a Cron expression.
Example:
For the VM Launcher to start every day at 7:00, you need to set it as follows:
"schedule_interval": "0 7 * * *",
activated
type: boolean
optional
Flag used to enable/disable the execution of the data operation.
If not specified, the default value will be "true".
archived
type: boolean
optional
Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio.
If not specified, the default value will be "false".
Script parameters
Information about the script location and instructions to execute it.
gcp_project_id
type: string
mandatory
Google Cloud Platform project ID for the bucket containing the script.
gcs_bucket
type: string
mandatory
Name of the GCS bucket containing the script.
gcs_working_directory
type: string
mandatory
Path in the GCS bucket containing the script, e.g. "some/sub/dir".
gcp_credentials_secret
type: dict
mandatory
Encrypted credentials needed to read/move data from the source bucket.
script_to_execute
type: array
mandatory
List of Unix commands to be executed (similar to a Bash script) on the VM.
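A sketch of how the script parameters might look together, with gcp_credentials_secret omitted for brevity. The project, bucket and script names are hypothetical, and the fragment assumes that python3 is available on the VM and that the files from the working directory can be reached from the commands.

```json
{
  "gcp_project_id": "my-gcp-project",
  "gcs_bucket": "my-bucket",
  "gcs_working_directory": "some/sub/dir",
  "script_to_execute": [
    "echo 'Starting processing'",
    "python3 my_script.py"
  ]
}
```

Each array element is one command, listed in the order in which it should run.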
VM parameters
Information related to the Google Cloud Compute Engine VM where the script will be executed.
vm_delete
type: boolean
optional
If set to "true", this parameter forces the deletion of the VM at the end of the data operation. Running Compute Engine VMs incur extra costs, so it is recommended to leave this parameter set to "true".
Default value: true
vm_core_number
type: string
optional
Virtual CPU (vCPU) count. It is recommended to leave the default parameter, as this should allow sufficient performance to run a standard script.
Default value: 2
vm_memory_amount
type: string
optional
System memory size (in GB).
It is recommended to leave the default parameter, as this should allow sufficient performance to run a standard script.
Default value: 4
vm_disk_size
type: string
optional
Persistent disk size (in GB).
It is recommended to leave the default parameter, as this should provide enough space to store the data to process.
Default value: 20
vm_compute_zone
type: string
optional
Zone where the VM executes its jobs.
Default value: europe-west1-b
vm_custom_os_image_family
type: string
optional
Image family of the custom image.
Note that for the time being, custom OS images MUST be based on Ubuntu 20.04 LTS.
Default value: ubuntu-2004-lts
vm_custom_os_image_project
type: string
optional
GCP Project hosting the custom image.
Note that this parameter is mandatory if vm_custom_os_image_family is set.
Default value: ubuntu-os-cloud
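For reference, the fragment below spells out the VM parameters with the default values listed above. It assumes the parameters sit at the top level of the configuration file and that numeric values are written as strings, in line with the declared types.

```json
{
  "vm_delete": true,
  "vm_core_number": "2",
  "vm_memory_amount": "4",
  "vm_disk_size": "20",
  "vm_compute_zone": "europe-west1-b",
  "vm_custom_os_image_family": "ubuntu-2004-lts",
  "vm_custom_os_image_project": "ubuntu-os-cloud"
}
```

When using a custom image, vm_custom_os_image_family and vm_custom_os_image_project would both be replaced with your own values, since the project becomes mandatory once the family is set.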
If schedule_interval is set to "None", the data operation is not scheduled and must be started by an external trigger when a given condition is met.
Note: You need to define a start_date to schedule a data operation, otherwise the schedule_interval is ignored.
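Since scheduling depends on the version, the start date and the Cron expression together, a scheduled configuration would typically carry all three. A minimal fragment with a hypothetical start date, other parameters omitted:

```json
{
  "version": "2",
  "start_date": "2023, 11, 23",
  "schedule_interval": "0 7 * * *"
}
```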
Online tools can help you write and check your Cron expression.
Note on gcp_credentials_secret: you should already have generated these credentials. They must be encrypted before being added to the configuration file.