API To Storage configuration file

The configuration file is in JSON format. It contains the following sections:

  • Global parameters: General information about the data operation.

  • Source parameters: Information related to the data source provider.

  • Destination parameters: One or several destination blocks, containing information about the data destinations.

👁️‍🗨️ Example

Here is an example of an ATS configuration file that exposes a Pub/Sub endpoint and writes its output to GCS:

{
  "$schema": "http://jsonschema.tailer.ai/schema/api-to-storage-veditor",
  "configuration_type": "api-to-storage",
  "configuration_id": "000099-ats-example-products",
  "environment": "DEV",
  "account": "000099",
  "activated": true,
  "archived": false,
  "short_description": "This API will receive PRODUCTS and store them to GCS blobs.",
  "doc_md": "000099-ats-example-products.md",
  "source": {
    "type": "pubsub",
    "gcp_project_id": "fd-io-jarvis-demo-dlk",
    "pubsub_topic_suffix": "example-products",
    "protocol_buffers_file": "000099-ats-example-products.proto"
  },
  "destinations": [
    {
      "type": "gcs",
      "gcs_destination_bucket": "fd-io-demo-n-in",
      "gcs_destination_prefix": "ats-repository/products/input",
      "gcs_filename_template": "products.json",
      "gcp_credentials_secret": {
        "cipher_aes": "473a2d0cb3",
        "tag": "6f72f",
        "ciphertext": "ba9c04ebe99e7c83cdd1fde63e63a8485472906",
        "enc_session_key": "1bd17b6fd0ac286161d6"
      },
      "description": "This is a short description of the GCS output."
    }
  ]
}

🌐 Global parameters

General information about the data operation

Parameter | Description

$schema

type: string

optional

URL of the JSON Schema that your configuration must conform to. Most code editors can use it to validate your configuration, display help tooltips, and highlight issues.

configuration_type

type: string

mandatory

Type of data operation. For an ATS data operation, the value is always "api-to-storage".

configuration_id

type: string

mandatory

ID of the data operation.

You can pick any name you want, but it has to be unique for this data operation type.

Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:

  • your account ID,

  • the source bucket name,

  • and the source directory name.

environment

type: string

mandatory

Deployment context. Values: PROD, PREPROD, STAGING, DEV.

account

type: string

mandatory

Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.

activated

type: boolean

optional

Flag used to enable/disable the execution of the data operation. If not specified, the default value will be "true".

archived

type: boolean

optional

Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio. If not specified, the default value will be "false".

max_active_runs

type: integer

optional

This parameter limits the number of concurrent runs for this data operation.

If not set, the default value is 1.

short_description

type: string

optional

Short description of the context of the configuration.

doc_md

type: string

optional

Path to a file containing a detailed description. The file must be in Markdown format.
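The rules in the table above can be checked locally before deploying. The following is a minimal sketch, not part of the Tailer tooling: the helper name `check_global_parameters` is hypothetical, and it only encodes the mandatory keys, allowed values, and defaults listed in this section.

```python
import re

# Mandatory global parameters, as listed in the table above.
MANDATORY_KEYS = ("configuration_type", "configuration_id", "environment", "account")
ALLOWED_ENVIRONMENTS = {"PROD", "PREPROD", "STAGING", "DEV"}
# Documented defaults for the optional parameters.
DEFAULTS = {"activated": True, "archived": False, "max_active_runs": 1}


def check_global_parameters(config: dict) -> dict:
    """Hypothetical helper: return a copy of config with defaults applied.

    Raises ValueError when a documented rule is not met.
    """
    missing = [key for key in MANDATORY_KEYS if key not in config]
    if missing:
        raise ValueError(f"missing mandatory parameters: {missing}")
    if config["configuration_type"] != "api-to-storage":
        raise ValueError("configuration_type must be 'api-to-storage'")
    if config["environment"] not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {config['environment']}")
    if not re.fullmatch(r"\d{6}", str(config["account"])):
        raise ValueError("account must be a 6-digit number")
    # Values present in the configuration take precedence over the defaults.
    return {**DEFAULTS, **config}
```

Calling the helper on the result of `json.load` for your configuration file returns the effective global parameters, which makes a missing `max_active_runs` or `activated` value explicit.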

Source parameters (Pub/Sub)

The source section contains all information related to the data source provider.

"source": {
    "type": "pubsub",
    "gcp_project_id": "fd-io-jarvis-demo-dlk",
    "pubsub_topic_suffix": "example-products",
    "protocol_buffers_file": "000099-ats-example-products.proto"
  }
Parameter | Description

type

type: string

mandatory

Source type.

The only supported source type for now is "pubsub".

gcp_project_id

type: string

mandatory

Google Cloud Platform project in which to deploy the data operation and its associated Cloud Functions.

If not set, the user will be prompted to choose a project.

pubsub_topic_suffix

type: string

mandatory

Name of the Pub/Sub topic that will be created.

protocol_buffers_file

type: string

optional

Name of a Protocol Buffers (proto2) file.

If specified, all incoming data streamed through the topic will be checked against the definition included in the Protocol Buffers file.

Protocol Buffers File syntax

Pub/Sub can validate an incoming payload against a Protocol Buffers definition.

Protocol Buffers (proto2) language guide: https://developers.google.com/protocol-buffers/docs/proto?hl=fr

The definition should follow the structure shown below. Note that the "Item" message is where you customize your payload schema.

Consider the following example payload, in which the "new_item" attribute is optional:

{
    "input_data": [
        {
            "product_id": "123456789",
            "label": "Some label ABC",
            "description": "A specific description for product 123456789"
        },
        {
            "product_id": "987654321",
            "label": "Some label YUI",
            "description": "A specific description for product 987654321"
        },
        {
            "label": "Some label XYZ",
            "product_id": "66668888",
            "description": "A specific description for product 66668888"
        },
        {
            "product_id": "66668888",
            "label": "Some label XYZ",
            "description": "A specific description for product 66668888",
            "new_item": "some new data"
        }
    ]
}

The corresponding Protocol Buffers definition should look like this:

syntax = "proto2";

message GlobalMessage {

  message Item {
    required string product_id = 1;
    required string label = 2;
    required string description = 3;
    optional string new_item = 4;
  }

  repeated Item input_data = 1;
}
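To catch malformed payloads before they reach the topic, the same constraints can be mirrored in a small local check. This is only a sketch under stated assumptions: it is plain Python, not the validation that Pub/Sub itself performs, the function name `check_payload` is hypothetical, and it only covers the four string fields of the "Item" message above.

```python
# Field lists mirror the "Item" message of the proto2 definition above.
REQUIRED_FIELDS = ("product_id", "label", "description")
OPTIONAL_FIELDS = ("new_item",)


def check_payload(payload: dict) -> list:
    """Hypothetical pre-check: return a list of problems (empty when none)."""
    problems = []
    # "input_data" is a repeated field, so an absent or empty list is accepted.
    items = payload.get("input_data", [])
    if not isinstance(items, list):
        return ["input_data must be a list"]
    for index, item in enumerate(items):
        for field in REQUIRED_FIELDS:
            if field not in item:
                problems.append(
                    f"input_data[{index}]: missing required field '{field}'"
                )
        # Every declared field is a string in the definition.
        for field in REQUIRED_FIELDS + OPTIONAL_FIELDS:
            if field in item and not isinstance(item[field], str):
                problems.append(f"input_data[{index}]: '{field}' must be a string")
    return problems
```

Running `check_payload` on the example payload above returns an empty list, since every item carries the three required fields and "new_item" is optional.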

Destination parameters

These parameters allow you to specify a list of destinations. You can add as many destination sub-objects as you want; they will all be processed.

Google Cloud Storage destination

Example:

"destinations": [
    {
      "type": "gcs",
      "gcs_destination_bucket": "fd-io-demo-n-in",
      "gcs_destination_prefix": "ats-repository/products/input",
      "gcs_filename_template": "products.json",
      "gcp_credentials_secret": {
        "cipher_aes": "473a2d0cb3",
        "tag": "6f72f",
        "ciphertext": "c9e7c83cdd1fde63e63a8485472906",
        "enc_session_key": "1bd17b6fd0ac286161d621bf07c20593f14ea"
      },
      "description": "This is a short description of the GCS output."
    }
  ]
Parameter | Description

type

type: string

mandatory

Type of destination.

In this case: "gcs".

gcs_destination_bucket

type: string

mandatory

Google Cloud Storage destination bucket.

gcs_destination_prefix

type: string

mandatory

Google Cloud Storage destination path, e.g. "/subdir/subdir_2" to send the files to "gs://BUCKET/subdir/subdir_2/source_file.ext"

gcp_credentials_secret

type: dict

mandatory

Encrypted credentials needed to read, write, and move data in the destination bucket.

You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page.

gcs_filename_template

type: string

mandatory

Filename template used to write incoming data to GCS.

description

type: string

optional

Short description of the destination.
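To see where a destination block will write, the bucket, prefix, and filename can be combined the way the "gcs_destination_prefix" description suggests. The sketch below is illustrative only: `gcs_target_uri` is a hypothetical helper, and it assumes the value of "gcs_filename_template" is used literally as the file name, whereas the platform may expand placeholders that are not documented here.

```python
# Mandatory keys of a GCS destination block, as listed in the table above.
MANDATORY_DESTINATION_KEYS = (
    "type",
    "gcs_destination_bucket",
    "gcs_destination_prefix",
    "gcs_filename_template",
    "gcp_credentials_secret",
)


def gcs_target_uri(destination: dict) -> str:
    """Hypothetical helper: build the gs:// URI a destination block points to."""
    missing = [key for key in MANDATORY_DESTINATION_KEYS if key not in destination]
    if missing:
        raise ValueError(f"missing mandatory parameters: {missing}")
    if destination["type"] != "gcs":
        raise ValueError("only the 'gcs' destination type is handled here")
    # Leading and trailing slashes in the prefix are dropped so that
    # "/subdir/subdir_2" and "subdir/subdir_2" give the same object path.
    prefix = destination["gcs_destination_prefix"].strip("/")
    # Assumption: the template is treated as a literal file name.
    filename = destination["gcs_filename_template"]
    parts = [destination["gcs_destination_bucket"], prefix, filename]
    return "gs://" + "/".join(part for part in parts if part)
```

With the example block above, this yields "gs://fd-io-demo-n-in/ats-repository/products/input/products.json".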
