Storage to Tables configuration file

This page describes the configuration file of a Storage to Tables (STT) data operation.

The configuration file is in JSON format. It contains the following sections:

  • Global parameters: General information about the data operation.

  • Source parameters: Information related to the data source provider.

  • Destination parameters: Information about input file templates and destination tables. The "destinations" section will refer to DDL files, which contain the schema of the destination tables.

👁️‍🗨️ Example

Here is an example of an STT configuration file for a GCS to BigQuery transfer:

{
  "$schema": "",
  "configuration_type": "storage-to-tables",
  "configuration_id": "Load_sales_files_from_it",
  "version": "2",
  "environment": "DEV",
  "account": "000099",
  "activated": true,
  "archived": false,
  "max_active_runs": 5,
  "short_description": "This Job load sales files into the Persistent Staging Area",
  "doc_md": "",
  "source": {
    "type": "gcs",
    "gcp_project_id": "dlk_demo",
    "gcs_source_bucket": "mirror-fd-io-exc-demo-wbd--n-in",
    "gcs_source_prefix": "testjul",
    "gcs_archive_prefix": "archive",
    "gcp_credentials_secret": {
      "cipher_aes": "223xxx",
      "tag": "8ddxxx",
      "ciphertext": "4c7xxx",
      "enc_session_key": "830xxx"
    }
  },
  "destinations": [
    {
      "type": "bigquery",
      "gcp_project_id": "my-project",
      "gbq_dataset": "dlk_demo_wbd_psa",
      "source_format": "CSV",
      "create_disposition": "CREATE_IF_NEEDED",
      "write_disposition": "WRITE_TRUNCATE",
      "bq_load_job_ignore_unknown_values": true,
      "skip_leading_rows": 1,
      "field_delimiter": "|",
      "add_tailer_metadata": true,
      "gcp_credentials_secret": {
        "cipher_aes": "223xxx",
        "tag": "8ddxxx",
        "ciphertext": "4c7xxx",
        "enc_session_key": "830xxx"
      },
      "tables": [
        {
          "table_name": "sales_details_test",
          "short_description": "Daily sales with tickets and tickets lines",
          "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-sales_details.csv",
          "ddl_mode": "file",
          "ddl_file": "ddl/sales_details.json",
          "doc_md": "ddl/",
          "add_tailer_metadata": true,
          "skip_leading_rows": 0,
          "write_disposition": "WRITE_APPEND"
        },
        {
          "table_name": "stores",
          "short_description": "Full Stores referential",
          "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-stores.csv",
          "ddl_mode": "file",
          "ddl_file": "ddl/stores.json",
          "doc_md": "ddl/",
          "add_tailer_metadata": false
        }
      ]
    }
  ]
}

🌐 Global parameters

General information about the data operation



$schema type: string


The URL of the JSON schema that defines the properties your configuration must satisfy. Most code editors can use it to validate your configuration, display help boxes and highlight issues.


configuration_type type: string


Type of data operation. For an STT data operation, the value is always "storage-to-tables".


configuration_id type: string


ID of the data operation.

You can pick any name you want, but it has to be unique for this data operation type.

Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:

  • your account ID,

  • the source bucket name,

  • and the source directory name.
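As an illustration of this naming convention, the fragment below builds a configuration_id from the values used in the example at the top of this page. The underscore separator and the use of gcs_source_prefix as the source directory name are assumptions made for this sketch; nothing here implies that the platform requires a particular separator.

```json
{
  "configuration_id": "000099_mirror-fd-io-exc-demo-wbd--n-in_testjul",
  "account": "000099",
  "source": {
    "type": "gcs",
    "gcs_source_bucket": "mirror-fd-io-exc-demo-wbd--n-in",
    "gcs_source_prefix": "testjul"
  }
}
```

Because the ID embeds the account, bucket and directory, two data operations reading from different locations should not overwrite each other on deployment.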

version type: string mandatory

Use only version 2. Version 1 is deprecated.


environment type: string


Deployment context. Values: PROD, PREPROD, STAGING, DEV.


account type: string


Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.


activated type: boolean


Flag used to enable/disable the execution of the data operation. If not specified, the default value will be "true".


archived type: boolean


Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio. If not specified, the default value will be "false".


max_active_runs type: integer


This parameter limits the number of concurrent runs for this data operation.

If not set, the default value is 1.


short_description type: string


Short description of the context of the configuration.


doc_md type: string


Path to a file containing a detailed description. The file must be in Markdown format.

⬇️ Source parameters (GCS)

The source section contains all the information related to the data source provider.

"source": {
  "type": "gcs",
  "gcp_project_id": "my-project",
  "gcs_source_bucket": "mirror-fd-io-exc-demo-wbd--n-in",
  "gcs_source_prefix": "testjul",
  "gcs_archive_prefix": "archive",
  "gcp_credentials_secret": {
    "cipher_aes": "223xxx",
    "tag": "8ddxxx",
    "ciphertext": "4c7xxx",
    "enc_session_key": "830xxx"
  }
}


type type: string


Source type.

The only supported source type for now is "gcs".


gcp_project_id type: string


Google Cloud Platform project in which to deploy the data operation and its associated cloud functions.

If not set, the user will be prompted to choose a project.


gcs_source_bucket type: string



gcs_source_prefix type: string



gcs_archive_prefix type: string optional

Path where the source files will be archived.

If present and populated, the STT data operation will archive the source files in the location specified, in the GCS source bucket.

If not present or empty, there will be no archiving.


gcp_credentials_secret type: dict


Encrypted credentials needed to read/move data from the source bucket.

You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page.

⬆️ Destination parameters (BigQuery)

The destination section contains all the information related to the data destinations.

The destinations parameter is an array of maps. Each map specifies a destination type and one or more "tables" that are the final destinations of the data.


"destinations": [
  {
    "type": "bigquery",
    "gcp_project_id": "my-project",
    "gbq_dataset": "dlk_demo_wbd_psa",
    "source_format": "CSV",
    "create_disposition": "CREATE_IF_NEEDED",
    "write_disposition": "WRITE_TRUNCATE",
    "skip_leading_rows": 1,
    "field_delimiter": "|",
    "add_tailer_metadata": true,
    "gcp_credentials_secret": {
      "cipher_aes": "223008256918c292ff3ec1axxx",
      "tag": "8dd3db4c71bfb963d475d1bbd1xxx",
      "ciphertext": "4c74df268d5a7541f8264c2e7a282fxxx",
      "enc_session_key": "830606cbc4f9401a29c7977d364398xxx"
    },
    "tables": [
      {
        "table_name": "sales_details_test",
        "short_description": "Daily detailed Sales with tickets and tickets lines",
        "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-ORS-ventes.csv",
        "ddl_mode": "file",
        "ddl_file": "ddl/sales_details.json",
        "doc_md": "ddl/",
        "add_tailer_metadata": true,
        "skip_leading_rows": 0,
        "write_disposition": "WRITE_APPEND"
      },
      {
        "table_name": "stores",
        "short_description": "Full Stores referential",
        "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-ORS-magasins.csv",
        "ddl_mode": "file",
        "ddl_file": "ddl/stores.json",
        "doc_md": "ddl/",
        "add_tailer_metadata": false
      }
    ]
  }
]

Global destination parameters



type type: string


Type of destination.

The only supported destination type for now is "bigquery".


gcp_project_id type: string


Default GCP Project ID.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.


gbq_dataset type: string


Default BigQuery Dataset.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.
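The override behavior described above can be sketched as follows. The first table inherits the dataset set at destination level, while the second sets its own value. The dataset name "dlk_demo_wbd_ref" is hypothetical, and other parameters (credentials, DDL settings) are omitted to keep the fragment short.

```json
{
  "destinations": [
    {
      "type": "bigquery",
      "gcp_project_id": "my-project",
      "gbq_dataset": "dlk_demo_wbd_psa",
      "tables": [
        {
          "table_name": "sales_details_test",
          "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-sales_details.csv"
        },
        {
          "table_name": "stores",
          "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-stores.csv",
          "gbq_dataset": "dlk_demo_wbd_ref"
        }
      ]
    }
  ]
}
```

With this layout, shared settings are written once at destination level and only the exceptions are repeated in the table sub-objects.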


gcp_credentials_secret type: object


Encrypted credentials needed to interact with Storage and BigQuery.

You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page.


source_format type: string


Default source format for input files.

Possible values (case sensitive):

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.


create_disposition type: string


Specifies behavior for creating tables (see Google BigQuery documentation).

Possible values:

  • "CREATE_IF_NEEDED" (default): If the table does not exist, BigQuery creates the table.

  • "CREATE_NEVER": The table must already exist.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.


write_disposition type: string


Action that occurs if the destination table already exists (see Google BigQuery documentation).

Possible values:

  • "WRITE_TRUNCATE" (default): The run will write table data from the beginning. If the table already contained lines, they will all be deleted and replaced by the new lines. This option is used most of the time for daily runs to avoid duplicates.

  • "WRITE_APPEND": The run will append new lines to the table. When using this option, make sure not to run the data operation several times.

  • "WRITE_EMPTY": This option only allows adding data to an empty table. If the table already contains data, it returns an error. It is hardly ever used, as data operations are usually run periodically, so the table will always contain data after the first run.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.


skip_leading_rows type: integer


Number of rows to skip when reading data, CSV only.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: 1


field_delimiter type: string


Separator for fields in a CSV file, e.g. ";".

Note: For Tab separator, set to "\t". This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value:
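As a minimal sketch of the tab separator note above, the fragment below configures a destination for tab-separated files. In JSON, "\t" is the escape sequence for a tab character. The parameter names are the ones used in the example at the top of this page; the other destination parameters are omitted.

```json
{
  "destinations": [
    {
      "type": "bigquery",
      "source_format": "CSV",
      "field_delimiter": "\t",
      "skip_leading_rows": 1
    }
  ]
}
```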


type: string


Character used to quote data sections, CSV only (see Google BigQuery documentation).

Note: For single and double quotes, set to "'" and "\"" respectively.

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: ""


type: string


Represents a null value, CSV only (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: ""


bq_load_job_ignore_unknown_values type: boolean


Ignore extra values not represented in the table schema (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: false


type: integer


Number of invalid rows to ignore (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: 0


type: array


Specifies updates to the destination table schema to allow as a side effect of the load job (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: []


type: boolean


Allows quoted data containing newline characters, CSV only (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: false


type: boolean


Allows missing trailing optional columns, CSV only (see Google BigQuery documentation).

This parameter can be set for each table sub-object, and will be overridden by that value if it is different. Default value: false


add_tailer_metadata type: boolean


[NEW] Enables the automatic metadata feature, which adds specific columns related to the input source during the ingestion process.

The added columns are:

  • tlr_ingestion_timestamp_utc (TIMESTAMP)

  • tlr_input_file_source_type (STRING)

  • tlr_input_file_name (STRING)

  • tlr_input_file_full_resource_name (STRING)

This parameter can be set for each table sub-object, and will be overridden by that value if it is different.

Default value: false

Table sub-object parameters

The "table" object contains the definition of expected input files and their BigQuery target.




table_name type: string


Name of the destination BigQuery table.


short_description type: string


Short description of the destination BigQuery table.


filename_template type: string


Template for the files to be processed. The following placeholders are currently supported:

  • "FD_DATE" looks for an 8-digit date (e.g. "20191015").

  • "FD_DATE_YEAR_4" looks for a 4-digit year (e.g. "2021").

  • "FD_DATE_YEAR_2" looks for a 2-digit year (e.g. "21").

  • "FD_DATE_MONTH" looks for a 2-digit month (e.g. "05").

  • "FD_DATE_DAY" looks for a 2-digit day (e.g. "12").

  • "FD_TIME" looks for a 6-digit time (e.g. "124213").

  • "FD_BLOB_n", where "n" is a non-zero positive integer, looks for a string of "n" characters.

  • "FD_TABLE_NAME": This is a special placeholder used when you have to process a large number of different files sharing the same destination schema. It has to be used in conjunction with the table_name parameter.


The date placeholders are combined as follows:

  • If "FD_DATE" is specified, it takes priority over "FD_DATE_YEAR_X".

  • If "FD_DATE_YEAR_4" or "FD_DATE_YEAR_2" is specified, the final date is built by concatenating the year with "FD_DATE_MONTH" and "FD_DATE_DAY".

  • If "FD_DATE_YEAR_2" is specified, it will be prefixed by "20".

  • If only "FD_DATE_YEAR_4" or "FD_DATE_YEAR_2" is specified, "FD_DATE_MONTH" and "FD_DATE_DAY" will be set to "01".
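To make the date rules above concrete, here is a hedged sketch of two table entries. The table names and file names are hypothetical. Read against the rules above, a file named "export_210512_124213.csv" should match the first template and yield the date 20210512 (the 2-digit year prefixed by "20"), and a file named "budget_2021.csv" should match the second and yield 20210101 (month and day set to "01").

```json
{
  "tables": [
    {
      "table_name": "daily_export",
      "filename_template": "export_{{FD_DATE_YEAR_2}}{{FD_DATE_MONTH}}{{FD_DATE_DAY}}_{{FD_TIME}}.csv"
    },
    {
      "table_name": "yearly_budget",
      "filename_template": "budget_{{FD_DATE_YEAR_4}}.csv"
    }
  ]
}
```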

Example 1

This template:


will allow you to process this type of files:


Example 2

This template:


will allow you to process this type of files:


Example 3

If table_name is set to: "table_{{FD_TABLE_NAME}}"

and filename_template to: "{{FD_DATE}}_{{FD_TIME}}_fixedvalue_{{FD_TABLE_NAME}}.csv"

A file named "20201116_124523_fixedvalue_stores.csv" will be loaded into a table named: "table_stores_20201116"

A file named "20190212_063412_fixedvalue_visits.csv" will be loaded into a table named: "table_visits_20190212"


ddl_mode type: string


This parameter allows you to specify how the schema of the table will be obtained.

Possible values:

  • "file": Legacy mode. The table schema is described in the DDL file specified in the ddl_file parameter.

  • "file_template": The table schema is described in a DDL file provided in the source directory together with the source file. It must have the same filename as the source file, with the ".ddl.json" suffix.

  • "header" (CSV file only): The columns of the CSV file first line are automatically used as columns for the database table. All the columns are given the STRING type. No DDL file needs to be provided.

  • "autodetect" (not recommended): Google’s default mode. The schema is automatically detected from the source file. This mode doesn’t work well with CSV files, but gives good results with structured formats such as JSON. (see Google BigQuery documentation).

Default value: file
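As an illustration of the modes listed above, the sketch below shows one table using the "file" mode with an explicit DDL file and one using the "header" mode, where no ddl_file is given and all columns are created as STRING. The "stores_raw" table and its filename template are hypothetical.

```json
{
  "tables": [
    {
      "table_name": "stores",
      "filename_template": "input_{{FD_DATE}}-{{FD_TIME}}-stores.csv",
      "ddl_mode": "file",
      "ddl_file": "ddl/stores.json"
    },
    {
      "table_name": "stores_raw",
      "filename_template": "raw_{{FD_DATE}}-{{FD_TIME}}-stores.csv",
      "ddl_mode": "header"
    }
  ]
}
```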


ddl_file type: string

mandatory if ddl_mode is set to "file"

Path to the DDL file where the destination schema is described.

doc_md type: string optional

Path to the Markdown file containing detailed information about the destination table.


add_tailer_metadata type: boolean


[NEW] Enables the automatic metadata feature, which adds specific columns related to the input source during the ingestion process.

The added columns are:

  • tlr_ingestion_timestamp_utc (TIMESTAMP)

  • tlr_input_file_source_type (STRING)

  • tlr_input_file_name (STRING)

  • tlr_input_file_full_resource_name (STRING)

Default value: false
