# \[V3] Table to Storage configuration file

The configuration file is in JSON format. It contains the following sections:

* Global parameters: General information about the data operation. Here you can also specify default values that apply to all tasks, unless a task overrides the parameter in its own description.
* Table copy parameters: Optionally, you can add a creation step for a table that will contain the result of the extraction.

## :eye\_in\_speech\_bubble: Example

Here is an example of a TTS configuration file:

```json
{
    "$schema": "http://jsonschema.tailer.ai/schema/table-to-storage-v3editor",
    "configuration_type": "table-to-storage",
    "configuration_id": "tts-some-id-example",
    "short_description": "Short description of the job",
    "environment": "DEV",
    "account": "000099",    
    "version": "3",
    "activated": true,
    "archived": false,
    "doc_md": "readme.md",
    "start_date" : "2023, 2, 10",
    "schedule_interval" : "5 1 * * *",
    "print_header": true,
    "destination_format": "CSV",
    "gcs_dest_bucket": "fd-io-test-bucket-out",
    "gcs_dest_prefix": "tts_exemple/",
    "gcp_project_id": "fd-tailer-demo",
    "field_delimiter": ",",
    "compression": "None",
    "sql_query_template": "TEMPLATE_CURRENT_DATE",
    "bq_data_location": "EU",
    "generate_top_file": false,
    "delete_dest_bucket_content": false,
    "tasks": [
        {
            "task_id": "Export_with_default_values",
            "sql_file" : "the_tts_SQL_file.sql",
            "output_filename" : "THE_FILE_NAME_{{FD_DATE}}.csv",
            "copy_table": true,
            "dest_gcp_project_id": "fd-tailer-demo",
            "dest_gbq_dataset": "dlk_exemple_tts",
            "dest_gbq_table": "to_exemple_tts",
            "dest_gbq_table_suffix": "dag_execution_date",
            "bq_data_location": "US"
        },
        {
            "task_id": "Export_with_specific_values",
            "gcs_dest_bucket": "different-bucket-out",
            "gcs_dest_prefix": "tts_number_2/",
            "gcp_project_id": "fd-tailer-destination",
            "field_delimiter": "|",
            "compression": "GZIP",
            "sql_file": "my_other_SQL_file.sql",
            "output_filename": "A_DIFFERENT_FILE_NAME_{{FD_DATE}}.csv",
            "destination_format": "NEWLINE_DELIMITED_JSON",
            "sql_query_template": "TEMPLATE_CURRENT_DATE",
            "generate_top_file": true,
            "delete_dest_bucket_content": false,
            "copy_table": true,
            "dest_gcp_project_id": "fd-tailer-demo-destination",
            "dest_gbq_dataset": "my_destination_dataset",
            "dest_gbq_table": "my_other_extraction",
            "dest_gbq_table_suffix": "dag_execution_date"
        }
    ]
}
```
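
As a complement to the full example, the following sketch keeps only the parameters that the Global parameters table marks as mandatory, plus the `$schema` line reused from the example. All values (IDs, bucket, prefix, project, file names) are illustrative placeholders. The task fields simply reuse those of the first task above; task-level requirements are not covered in this section, so treat the `tasks` entry as an assumption.

```json
{
    "$schema": "http://jsonschema.tailer.ai/schema/table-to-storage-v3editor",
    "configuration_type": "table-to-storage",
    "configuration_id": "000099-extract-example-sales",
    "environment": "DEV",
    "account": "000099",
    "version": "3",
    "gcs_dest_bucket": "example-bucket-out",
    "gcs_dest_prefix": "example/sales/",
    "gcp_project_id": "example-project",
    "tasks": [
        {
            "task_id": "export_sales",
            "sql_file": "export_sales.sql",
            "output_filename": "SALES_{{FD_DATE}}.csv"
        }
    ]
}
```

Optional parameters omitted here would fall back to the defaults listed in the table, for example `destination_format` "CSV" and `compression` "None".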

## :globe\_with\_meridians: Global parameters

General information about the data operation.

You can also specify default values here. They apply to all tasks, unless a task overrides the parameter in its own description.
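
The override rule can be illustrated with a fragment like the one below. It is a sketch, not a complete configuration: mandatory parameters are omitted, and the task IDs and file names are hypothetical. Assuming the rule described above, the first task would inherit the global `field_delimiter` and `compression`, while the second would use its own values.

```json
{
    "field_delimiter": ",",
    "compression": "None",
    "tasks": [
        {
            "task_id": "uses_global_defaults",
            "sql_file": "first_query.sql",
            "output_filename": "FIRST_{{FD_DATE}}.csv"
        },
        {
            "task_id": "overrides_defaults",
            "sql_file": "second_query.sql",
            "output_filename": "SECOND_{{FD_DATE}}.csv",
            "field_delimiter": "|",
            "compression": "GZIP"
        }
    ]
}
```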

| Parameter                                                                                                                                                                              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><strong>$schema</strong></p><p>type: string</p><p>optional</p> | URL of the JSON Schema describing the properties that your configuration must satisfy. Most code editors can use it to validate your configuration, display help boxes, and highlight issues. |
| <p><strong>configuration\_type</strong></p><p>type: string</p><p>mandatory</p>                                                                                                         | <p>Type of data operation.</p><p>For a TTS data operation, the value is always "table-to-storage".</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| <p><strong>configuration\_id</strong></p><p>type: string</p><p>mandatory</p> | <p>ID of the data operation.</p><p>You can pick any name you want, but it has to be <strong>unique</strong> for this data operation type.</p><p>Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:</p><ul><li>your account ID,</li><li>the word "extract",</li><li>and a description of the data to extract.</li></ul> |
| <p><strong>short\_description</strong></p><p>type: string</p><p>optional</p>                                                                                                           | Short description of the table to storage data operation.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| <p><strong>environment</strong></p><p>type: string</p><p>mandatory</p>                                                                                                                 | <p>Deployment context.</p><p>Values: PROD, PREPROD, STAGING, DEV.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| <p><strong>account</strong></p><p>type: string</p><p>mandatory</p>                                                                                                                     | Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| <p><strong>version</strong><br>type: string<br>mandatory, otherwise default is 1 and in that case refers to the deprecated <a href="table-to-storage-configuration-file-1">V1</a>.</p> | <p>Version of the configuration. Must be "3" in order to use the latest features.</p><p><em>Default: "1", for backward compatibility, but only version "3" supports the latest features. Version 1 is deprecated.</em></p> |
| <p><strong>activated</strong></p><p>type: boolean</p><p>optional</p>                                                                                                                   | <p>Flag used to enable/disable the execution of the data operation.</p><p><em>Default value: true</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| <p><strong>archived</strong></p><p>type: boolean</p><p>optional</p> | <p>Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio.</p><p><em>Default value: false</em></p> |
| <p><strong>doc\_md</strong></p><p>type: string</p><p>optional</p>                                                                                                                      | Path to a file containing a detailed description of the data operation. The file must be in Markdown format.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| <p><strong>start\_date</strong></p><p>type: string</p><p>optional</p>                                                                                                                  | <p>Start date of the data operation.</p><p>The format must be:</p><p>"YYYY, MM, DD"</p><p>Where:</p><ul><li>YYYY >= 1970</li><li>MM = \[1, 12]</li><li>DD = \[1, 31]</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| <p><strong>schedule\_interval</strong></p><p>type: string</p><p>optional</p> | <p>A Table to Storage data operation can be launched in two different ways:</p><ul><li>If <strong>schedule\_interval</strong> is set to "None", the data operation will need to be started with a <a href="../orchestrate-processings-with-workflow">Workflow</a>, when a given condition is met. (This solution is recommended.)</li><li>If you want the data operation to start at regular intervals, you can define this in the <strong>schedule\_interval</strong> parameter with a Cron expression.</li></ul><p><strong>Note:</strong><br><span data-gb-custom-inline data-tag="emoji" data-code="26a0">⚠️</span> You need to define a start\_date to schedule a data operation, otherwise the schedule\_interval is ignored.</p><p><strong>Example</strong></p><p>For the data operation to start every day at 7:00, you need to set it as follows:</p><p><code>"schedule\_interval": "0 7 \* \* \*",</code></p><p>You can find online tools to help you edit your Cron expression (for example, <a href="https://crontab.guru">crontab.guru</a>).</p> |
| <p><strong>print\_header</strong><br>type: boolean<br>optional</p>                                                                                                                     | <p>Print a header row in the exported data.</p><p><em>Default value: true</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| <p><strong>destination\_format</strong></p><p>type: string</p><p>optional</p> | <p>Defines the format of the output file.</p><p>Possible values: "CSV", "NEWLINE\_DELIMITED\_JSON" (JSON file), "AVRO", "PARQUET".</p><p>Note that if you specify "NEWLINE\_DELIMITED\_JSON", the field\_delimiter parameter is not taken into account.<br><em>Default value: "CSV"</em></p> |
| <p><strong>gcs\_dest\_bucket</strong></p><p>type: string</p><p>mandatory</p>                                                                                                           | <p>Google Cloud Storage destination bucket.</p><p>This is the bucket where the data is going to be extracted.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| <p><strong>gcs\_dest\_prefix</strong></p><p>type: string</p><p>mandatory</p>                                                                                                           | <p>Path in the GCS bucket where the files will be extracted, e.g. "some/sub/dir".<br><br>Note that you can use {{FD\_DATE}} inside the path to include the current ISO date.<br>e.g. "some/sub/dir/{{FD\_DATE}}"</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| <p><strong>gcp\_project\_id</strong></p><p>type: string</p><p>mandatory</p>                                                                                                            | ID of the Google Cloud project containing the BigQuery instance.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| <p><strong>field\_delimiter</strong></p><p>type: string</p><p>optional</p>                                                                                                             | <p>Separator for fields in the CSV output file, e.g. ";".</p><p><strong>Note</strong>: For Tab separator, set to "\t".</p><p><em>Default value: "</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| <p><strong>compression</strong></p><p>type: string</p><p>optional</p>                                                                                                                  | <p>Compression mode for the output file.</p><p>Possible values: "None", "GZIP", "SNAPPY".<br></p><p>Note that if you specify "GZIP", a ".gz" extension will be added at the end of the filename.<br><em>Default value: "None"</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| <p><strong>sql\_query\_template</strong><br>type: string<br>optional</p> | <p>If you want to use variables in your SQL query or script, you need to set this parameter to "TEMPLATE\_CURRENT\_DATE" (only supported value). This variable will be set to the execution date of the data operation (and not today's date).</p><p>For example, if you want to retrieve data corresponding to the execution date, you can use the following instruction:</p><p><code>WHERE sale\_date = DATE('{{TEMPLATE\_CURRENT\_DATE}}')</code></p> |
| <p><strong>bq\_data\_location</strong><br>type: string<br>optional</p> | <p>BigQuery location used by default in all tasks.<br>If not specified, the value "EU" is used.<br><br>The list of available values can be found here: <a href="https://cloud.google.com/bigquery/docs/locations">https://cloud.google.com/bigquery/docs/locations</a></p> |
| <p><strong>generate\_top\_file</strong></p><p>type: boolean</p><p>optional</p> | <p>If true, generates an empty file when the data export is complete.</p><p>The top file name is derived from the file name template, with ".top" appended. For example, if the file name template is "{{FD\_DATE}}-my\_data\_extraction.csv", the top file generated on 2022-01-01 will be named 20220101-my\_data\_extraction.csv.top.</p><p><em>Default value: false</em></p> |
| <p><strong>delete\_dest\_bucket\_content</strong></p><p>type: boolean</p><p>optional</p>                                                                                               | <p>If set to true, this parameter will trigger the preliminary deletion of any items present in the destination directory.</p><p>This can prevent an issue when a new run of the same operation is needed after a fix. If the first run had generated file-0.csv and file-1.csv, and then the 2nd run only returns and erases file-0.csv, then you need to delete the destination bucket at the begining of the 2nd run, or you will end up with a file-0.csv from the 2nd run and a file-1.csv from the first run.</p><p><span data-gb-custom-inline data-tag="emoji" data-code="26a0">⚠️</span> If several table-to-storage operations write in the same directory at the same time, and if this parameter is true, then some extracted files can be deleted by mistake. The best practice is to have a dedicated subdirectory for each operation.</p><p><em>Default value: false</em></p>                                                                                                                                                                |
| <p><strong>tasks</strong><br>type: array of maps<br>mandatory</p>                                                                                                                      | <p>List of tasks the data operations will execute.</p><p>Check the section below for detailed information on their parameters.</p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |

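As an illustration of the best practice described for **delete\_dest\_bucket\_content**, here is a partial sketch of the global parameters of one operation that writes to its own subdirectory. It assumes that the destination directory is the one designated by "gcs\_dest\_bucket" and "gcs\_dest\_prefix"; the values are reused from the example above, and the other mandatory parameters are omitted for brevity.

```json
{
    "configuration_id": "tts-some-id-example",
    "gcs_dest_bucket": "fd-io-test-bucket-out",
    "gcs_dest_prefix": "tts_exemple/",
    "delete_dest_bucket_content": true,
    "generate_top_file": true
}
```

Under that assumption, another operation writing to the same bucket would use a different "gcs\_dest\_prefix", so that its files are not removed by the preliminary deletion.
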
## :envelope\_with\_arrow: Tasks Parameters

With the latest version, it is now possible to export data to different locations in one configuration, thanks to the "tasks" parameter.

Every task specifies an export. The tasks will use the parameters defined in the global configuration by default. If a parameter is specified in a task and in the global parameters, then the parameter in the task will overwrite the default parameter.
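
The following partial sketch illustrates this behavior, using values taken from the example above (the other mandatory global parameters are omitted for brevity). The first task does not redefine anything, so it should export to "fd-io-test-bucket-out" with the "EU" location. The second task redefines "gcs\_dest\_bucket" and "bq\_data\_location", so those two values apply to that task only.

```json
{
    "gcs_dest_bucket": "fd-io-test-bucket-out",
    "bq_data_location": "EU",
    "tasks": [
        {
            "task_id": "Export_with_default_values",
            "sql_file": "the_tts_SQL_file.sql",
            "output_filename": "THE_FILE_NAME_{{FD_DATE}}.csv"
        },
        {
            "task_id": "Export_with_specific_values",
            "sql_file": "the_tts_SQL_file.sql",
            "output_filename": "THE_FILE_NAME_{{FD_DATE}}.csv",
            "gcs_dest_bucket": "different-bucket-out",
            "bq_data_location": "US"
        }
    ]
}
```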

| Parameter                                                                                                                     | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| ----------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><strong>task\_id</strong><br>type: string<br>mandatory</p>                                                                 | The unique ID of your task.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| <p><strong>sql\_file</strong></p><p>type: string</p><p>mandatory</p>                                                          | Path to the file containing the extraction query.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| <p><strong>output\_filename</strong></p><p>type: string</p><p>mandatory</p>                                                   | <p>Template for the output filename.</p><p>You can use the following placeholders inside the name:</p><ul><li>{{FD\_DATE}}: The date format will be YYYYMMDD</li><li>{{FD\_TIME}}: The time format will be hhmmss<br></li></ul><p><span data-gb-custom-inline data-tag="emoji" data-code="26a0">⚠️</span> BigQuery splits the content into several numbered files if you export more than 1 GB of data. A number starting at 0 and left-padded to 12 digits is added before the extension and after a "-". To ensure consistent behavior, this number is always added, even if you export less than 1 GB.<br><br>For example, an operation with the output\_filename "{{FD\_DATE}}-{{FD\_TIME}}\_my\_data\_extraction.csv" executed on 2022-01-01 at 06:32:16 will generate a file: 20220101-063216\_my\_data\_extraction-000000000000.csv</p> |
| <p><strong>copy\_table</strong></p><p>type: boolean</p><p>optional</p>                                                        | <p>Parameter used to enable a copy of the output data in a BigQuery table.</p><p><em>Default value: false</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| <p><strong>dest\_gcp\_project\_id</strong></p><p>mandatory if <strong>copy\_table</strong> is set to "true"</p>               | ID of the GCP project that will contain the table copy.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| <p><strong>dest\_gbq\_dataset</strong></p><p>mandatory if <strong>copy\_table</strong> is set to "true"</p>                   | Name of the BigQuery dataset that will contain the table copy.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| <p><strong>dest\_gbq\_table\_suffix</strong></p><p>optional, to use only if <strong>copy\_table</strong> is set to "true"</p> | <p>The only supported value for this parameter is "dag\_execution\_date".</p><p>This will add "\_yyyymmdd" at the end of the table name to enable ingestion time partitioning.<br><em>Default value: None</em></p>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| <p><strong>bq\_data\_location</strong><br>type: string<br>optional</p>                                                        | <p>BigQuery location used in this specific task.<br>If not specified, the value used will be the global "bq\_data\_location" set at the configuration root.<br><br>The list of available values can be found here: <a href="https://cloud.google.com/bigquery/docs/locations"><https://cloud.google.com/bigquery/docs/locations></a></p> |
| All the global parameters can also be overwritten here                                                                        | If a parameter is specified in a task and in the global parameters, then the parameter in the task will overwrite the default parameter. |

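As a sketch of how the table copy parameters fit together, here is a single task entry with **copy\_table** enabled. The task\_id "Export\_with\_table\_copy" is a hypothetical name chosen for this illustration; the other values come from the examples on this page. Note that "dest\_gbq\_table" appears in the example configuration above but is not described in the table.

```json
{
    "task_id": "Export_with_table_copy",
    "sql_file": "the_tts_SQL_file.sql",
    "output_filename": "{{FD_DATE}}-{{FD_TIME}}_my_data_extraction.csv",
    "copy_table": true,
    "dest_gcp_project_id": "fd-tailer-demo",
    "dest_gbq_dataset": "dlk_exemple_tts",
    "dest_gbq_table": "to_exemple_tts",
    "dest_gbq_table_suffix": "dag_execution_date"
}
```

Based on the parameter descriptions above, a run on 2022-01-01 at 06:32:16 would export a file named 20220101-063216\_my\_data\_extraction-000000000000.csv, and the table copy would presumably be named to\_exemple\_tts\_20220101 because of the "dag\_execution\_date" suffix.
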