# VM Launcher configuration file for code processing

The configuration file is in JSON format. It contains the following sections:

* Global parameters: General information about the data operation.
* Script parameters: Information about the script location and instructions to execute on the VM.
* VM parameters: Information about the VM on which the script is executed.

## :eye\_in\_speech\_bubble: Example

Here is an example of a VM Launcher configuration file for code processing:

```json
{
    "$schema": "http://jsonschema.tailer.ai/schema/vm-launcher-veditor",
    "configuration_type": "vm-launcher",
    "configuration_id": "000010_my-vm-job",
    "version": "2",
    "environment": "DEV",
    "account": "000099",
    "doc_md": "readme.md",
    "start_date": "2022, 11, 16",
    "schedule_interval": "0 7 * * *",
    "activated": false,
    "archived": false,
    "gcp_project_id": "my-project",
    "gcs_bucket": "my-bucket",
    "gcs_working_directory": "/",
    "credentials": {
        "gcp-credentials.json": {
            "content": {
                "cipher_aes": "xxx",
                "tag": "xxx",
                "ciphertext": "xxx",
                "enc_session_key": "xxx"
            }
        }
    },
    "script_to_execute": [
        "mkdir -p input_DEV",
        "cd ./input_DEV && python3 my-python-script.py"
    ],
    "vm_delete": true,
    "vm_core_number": "2",
    "vm_memory_amount": "4",
    "vm_disk_size": "20",
    "vm_compute_zone": "europe-west1-b",
    "vm_custom_os_image_family": "ubuntu-2004-lts",
    "vm_custom_os_image_project": "ubuntu-os-cloud"
}
```

## :globe\_with\_meridians: Global parameters

| Parameter                                                                                                                                                                          | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><strong>$schema</strong></p><p>type: string</p><p>optional</p> | The URL of the JSON schema that defines the properties your configuration must satisfy. Most code editors can use it to validate your configuration, display help boxes, and highlight issues. |
| <p><strong>configuration\_type</strong></p><p>type: string</p><p>mandatory</p> | <p>Type of data operation.</p><p>For a VM Launcher data operation, the value is always "vm-launcher".</p> |
| <p><strong>configuration\_id</strong></p><p>type: string</p><p>mandatory</p> | <p>ID of the data operation.</p><p>You can pick any name you want, but it has to be <strong>unique</strong> for this data operation type.</p><p>Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:</p><ul><li>your account ID,</li><li>the source bucket name,</li><li>and the source directory name.</li></ul> |
| <p><strong>version</strong><br>type: string<br>optional</p> | <p>Version of the configuration. Must be "2" in order to use the latest features.</p><p><em>Default: "1"</em>, but only version "2" supports start\_date and schedule\_interval. Version 1 is deprecated.</p> |
| <p><strong>environment</strong></p><p>type: string</p><p>mandatory</p>                                                                                                             | <p>Deployment context.</p><p>Values: PROD, PREPROD, STAGING, DEV.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| <p><strong>account</strong></p><p>type: string</p><p>mandatory</p>                                                                                                                 | Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| <p><strong>doc\_md</strong></p><p>type: string</p><p>optional</p>                                                                                                                  | Path to a file containing a detailed description of the data operation. The file must be in Markdown format.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| <p><strong>start\_date</strong></p><p>type: string</p><p>optional</p><p>only available for version "2" and later</p><p>mandatory if you want to specify a schedule\_interval</p> | <p>Start date of the data operation.</p><p>The format must be: "YYYY, MM, DD"</p><p>Where:</p><ul><li>YYYY >= 1970</li><li>MM = \[1, 12]</li><li>DD = \[1, 31]</li></ul> |
| <p><strong>schedule\_interval</strong></p><p>type: string</p><p>optional</p><p>only available for version "2" and later</p> | <p>A VM Launcher can be launched in two different ways:</p><ul><li>If <strong>schedule\_interval</strong> is set to "None", the data operation will need to be started with a <a href="../../orchestrate-processings-with-workflow">Workflow</a>, when a given condition is met.</li><li>If you want the data operation to start at regular intervals, you can define this in the <strong>schedule\_interval</strong> parameter with a Cron expression.</li></ul><p><strong>Note:</strong><br><span data-gb-custom-inline data-tag="emoji" data-code="26a0">⚠️</span> You need to define a start\_date to schedule a data operation, otherwise the schedule\_interval is ignored.</p><p><strong>Example:</strong></p><p>For the VM Launcher to start every day at 7:00, you need to set it as follows:</p><p><code>"schedule\_interval": "0 7 \* \* \*",</code></p><p>You can find online tools to help you edit your Cron expression (for example, <a href="https://crontab.guru">crontab.guru</a>).</p> |
| <p><strong>activated</strong></p><p>type: boolean</p><p>optional</p>                                                                                                               | <p>Flag used to enable/disable the execution of the data operation.</p><p>If not specified, the default value will be "true".</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| <p><strong>archived</strong></p><p>type: boolean</p><p>optional</p>                                                                                                                | <p>Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio.</p><p>If not specified, the default value will be "false".</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
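
The scheduling rules in the table above can be checked locally before deploying. The following is a minimal sketch, not part of Tailer Platform: the helper names `parse_start_date` and `check_scheduling` are hypothetical, and it assumes that exactly one space follows each comma in `start_date` and that one- or two-digit months and days are both accepted.

```python
import re
from datetime import date

# Assumption: the documented "YYYY, MM, DD" layout uses one space after each comma.
_START_DATE_RE = re.compile(r"(\d{4}), (\d{1,2}), (\d{1,2})")


def parse_start_date(value):
    """Parse a start_date string and enforce the documented bounds."""
    match = _START_DATE_RE.fullmatch(value)
    if match is None:
        raise ValueError('start_date must look like "YYYY, MM, DD"')
    year, month, day = (int(part) for part in match.groups())
    if year < 1970:
        raise ValueError("start_date year must be >= 1970")
    # date() rejects months outside 1..12 and days that do not exist in that
    # month, which is slightly stricter than the documented DD = [1, 31].
    return date(year, month, day)


def check_scheduling(config):
    """Return a list of problems found in the scheduling parameters."""
    problems = []
    version = config.get("version", "1")  # documented default
    start = config.get("start_date")
    interval = config.get("schedule_interval")
    if version == "1" and (start is not None or interval is not None):
        problems.append('start_date and schedule_interval require version "2"')
    if start is not None:
        try:
            parse_start_date(start)
        except ValueError as error:
            problems.append(str(error))
    if interval not in (None, "None") and start is None:
        problems.append("schedule_interval is ignored unless start_date is set")
    return problems
```

Called on the example configuration from this page, `check_scheduling` returns an empty list. The sketch does not validate the Cron expression itself.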

## :writing\_hand: Script parameters

Information about the script location and instructions to execute it.

| Parameter                                                                          | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| ---------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| <p><strong>gcp\_project\_id</strong></p><p>type: string</p><p>mandatory</p>        | Google Cloud Platform project ID for the bucket containing the script.                                                                                                                                                                                                                                                                                                                                                                                                   |
| <p><strong>gcs\_bucket</strong></p><p>type: string</p><p>mandatory</p>             | Name of the GCS bucket containing the script.                                                                                                                                                                                                                                                                                                                                                                                                                            |
| <p><strong>gcs\_working\_directory</strong></p><p>type: string</p><p>mandatory</p> | Path in the GCS bucket containing the script, e.g. "some/sub/dir".                                                                                                                                                                                                                                                                                                                                                                                                       |
| <p><strong>gcp\_credentials\_secret</strong></p><p>type: dict</p><p>mandatory</p>  | <p>Encrypted credentials needed to read/move data from the source bucket.</p><p>You should have generated credentials when <a href="https://app.gitbook.com/s/-MIIsP_DvP2J-c1szWrQ/data-pipeline-operations/getting-started/set-up-google-cloud-platform.md">setting up GCP</a>. To learn how to encrypt them, refer to <a href="https://app.gitbook.com/s/-MIIsP_DvP2J-c1szWrQ/data-pipeline-operations/getting-started/encrypt-your-credentials.md">this page</a>.</p> |
| <p><strong>script\_to\_execute</strong></p><p>type: array</p><p>mandatory</p>      | List of Unix commands to be executed (similar to a Bash script) on the VM.                                                                                                                                                                                                                                                                                                                                                                                               |
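
As a quick local sanity check of the script parameters, something like the following sketch could be run on a loaded configuration. The helper name `check_script_parameters` is hypothetical and not a Tailer Platform function. The credentials entry is deliberately left out of the check, since the example on this page stores it under the `credentials` key.

```python
# Mandatory script parameters from the table above (credentials entry excluded).
SCRIPT_MANDATORY = (
    "gcp_project_id",
    "gcs_bucket",
    "gcs_working_directory",
    "script_to_execute",
)


def check_script_parameters(config):
    """Return a list of problems found in the script parameters."""
    problems = [
        f"missing mandatory parameter: {key}"
        for key in SCRIPT_MANDATORY
        if key not in config
    ]
    commands = config.get("script_to_execute")
    if commands is not None:
        # The parameter is documented as an array of Unix commands.
        is_list_of_strings = isinstance(commands, list) and all(
            isinstance(command, str) for command in commands
        )
        if not is_list_of_strings:
            problems.append("script_to_execute must be an array of command strings")
    return problems
```

A configuration file would typically be loaded with `json.load` before being passed to this function.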

## :computer: VM parameters

Information related to the Google Cloud Compute Engine VM where the script will be executed.

| Parameter                                                                                | Description                                                                                                                                                                                                                                               |
| ---------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><strong>vm\_delete</strong></p><p>type: boolean</p><p>optional</p>                    | <p>If set to "true", this parameter will force the deletion of the VM at the end of the data operation. Running Compute Engine VMs will incur extra costs, so it is recommended to leave this parameter on "true".</p><p><em>Default value: true</em></p> |
| <p><strong>vm\_core\_number</strong></p><p>type: string</p><p>optional</p>               | <p>Virtual CPU (vCPU) count.<br>It is recommended to leave the default parameter, as this should allow sufficient performance to run a standard script.</p><p><em>Default value: 2</em></p>                                                               |
| <p><strong>vm\_memory\_amount</strong></p><p>type: string</p><p>optional</p>             | <p>System memory size (in GB).</p><p>It is recommended to leave the default parameter, as this should allow sufficient performance to run a standard script.</p><p><em>Default value: 4</em></p>                                                          |
| <p><strong>vm\_disk\_size</strong></p><p>type: string</p><p>optional</p>                 | <p>Persistent disk size (in GB).</p><p>It is recommended to leave the default parameter, as this should provide enough space to store the data to process.</p><p><em>Default value: 20</em></p>                                                           |
| <p><strong>vm\_compute\_zone</strong><br>type: string<br>optional</p> | <p>Compute Engine zone in which the VM runs.<br><em>Default value: europe-west1-b</em></p> |
| <p><strong>vm\_custom\_os\_image\_family</strong></p><p>type: string</p><p>optional</p> | <p>Image family of the custom image.</p><p>Note that for the time being, custom OS images MUST be based on Ubuntu 20.04 LTS.</p><p><em>Default value: ubuntu-2004-lts</em></p> |
| <p><strong>vm\_custom\_os\_image\_project</strong></p><p>type: string</p><p>optional</p> | <p>GCP Project hosting the custom image.</p><p>Note that this parameter is mandatory if <strong>vm\_custom\_os\_image\_family</strong> is set.</p><p><em>Default value: ubuntu-os-cloud</em></p>                                                          |
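
To see which VM settings a configuration effectively ends up with, the documented defaults can be merged with the values present in the file. This is only an illustrative sketch: `VM_DEFAULTS` and `resolve_vm_parameters` are hypothetical names, and how the platform actually applies defaults is not described on this page.

```python
# Default values as listed in the VM parameters table.
VM_DEFAULTS = {
    "vm_delete": True,
    "vm_core_number": "2",
    "vm_memory_amount": "4",
    "vm_disk_size": "20",
    "vm_compute_zone": "europe-west1-b",
    "vm_custom_os_image_family": "ubuntu-2004-lts",
    "vm_custom_os_image_project": "ubuntu-os-cloud",
}


def resolve_vm_parameters(config):
    """Return the VM parameters with documented defaults filled in."""
    # The table states that the image project is mandatory once a family is set.
    if (
        "vm_custom_os_image_family" in config
        and "vm_custom_os_image_project" not in config
    ):
        raise ValueError(
            "vm_custom_os_image_project is mandatory when "
            "vm_custom_os_image_family is set"
        )
    resolved = dict(VM_DEFAULTS)
    # Only VM parameters are copied; other configuration keys are ignored.
    resolved.update(
        {key: value for key, value in config.items() if key in VM_DEFAULTS}
    )
    return resolved
```

For instance, `resolve_vm_parameters({"vm_core_number": "4"})` keeps every default except the vCPU count.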
