Storage to Storage configuration file
This page describes the JSON configuration file of a Storage to Storage (STS) data operation.
The configuration file is in JSON format. It contains the following sections:
Global parameters: General information about the data operation.
Source parameters: One source block, containing information about the data source.
Destination parameters: One or several destination blocks, containing information about the data destinations.
👁️🗨️ Example
Here is an example of STS configuration file for a GCS to SFTP transfer:
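The sketch below illustrates what such a file might look like. Bucket names, IDs, and the encrypted-secret payloads are illustrative placeholders; the `{{FD_DATE}}`/`{{FD_TIME}}` placeholders and the top-level `source`/`destinations` key names are assumptions derived from the parameter tables below, so check them against your actual deployment.

```json
{
  "configuration_type": "storage-to-storage",
  "configuration_id": "000099_sts_gcs_to_sftp_stores",
  "environment": "DEV",
  "account": "000099",
  "activated": true,
  "archived": false,
  "filename_templates": [
    {
      "filename_template": "stores_{{FD_DATE}}_{{FD_TIME}}.txt",
      "file_description": "Daily store export"
    }
  ],
  "source": {
    "type": "gcs",
    "gcp_project_id": "my-gcp-project",
    "gcs_source_bucket": "my-source-bucket",
    "gcs_source_prefix": "input/stores",
    "archive_prefix": "archive/stores",
    "gcp_credentials_secret": { "...": "..." }
  },
  "destinations": [
    {
      "type": "sftp",
      "sftp_destination_dir": "/upload/stores",
      "sftp_host": "sftp.example.com",
      "sftp_port": 22,
      "sftp_userid": "john_doe",
      "sftp_authentication_method": "USERNAME_PASSWORD",
      "sftp_password_secret": { "...": "..." }
    }
  ]
}
```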
🌐 Global parameters
Parameter | Description |
---|---|
$schema type: string optional | The URL of the JSON schema that your configuration must conform to. Most code editors can use it to validate your configuration, display help boxes, and highlight issues. |
configuration_type type: string mandatory | Type of data operation. For an STS data operation, the value is always "storage-to-storage". |
configuration_id type: string mandatory | ID of the data operation. You can pick any name you want, but it has to be unique for this data operation type. Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:
|
environment type: string mandatory | Deployment context. Values: PROD, PREPROD, STAGING, DEV. |
account type: string mandatory | Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator. |
version type: string optional | Version of the configuration in order to use new features. |
filename_templates type: string mandatory | List of filename templates that will be processed. You can set the value to "*" to copy all files. However, this is not recommended, as unnecessary or sensitive files might be included by mistake. Besides, the date value specified in filename_template is used to sort files in the archive folder; if no date value is specified, all files are stored together under one folder named /ALL. The best practice is to specify one or more filename templates with the filename_template and file_description parameters, as described in the next paragraph. |
activated type: boolean optional | Flag used to enable/disable the execution of the data operation. If not specified, the default value will be "true". |
archived type: boolean optional | Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio. If not specified, the default value will be "false". |
max_active_runs type: integer optional | This parameter limits the number of concurrent runs for this data operation. If not set, the default value is 5. |
empty_file_policy type: string optional | Tells Tailer how to behave when an empty file (0 bytes) is read.
The default value is "NONE". |
short_description type: string optional | Short description of the data operation. |
doc_md type: string optional | Path to a file containing a detailed description. The file must be in Markdown format. |
"Filename Templates" sub-object parameters
The "filename_templates" object contains the definition of expected source files to copy to the destinations.
Parameter | Description |
---|---|
filename_template type: string mandatory | Template for the files to be processed. The following placeholders are currently supported:
Information:
Example 1. This template:
will allow you to process files such as "stores_20201116_124213.txt". Example 2. This template:
will allow you to process files such as "20201116_12397_fixedvalue_12312378934.gz". |
file_description type: string optional | Short description of the files that will match the filename template. |
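A sketch of a "filename_templates" entry matching the first example above. The `{{FD_DATE}}` and `{{FD_TIME}}` placeholder names are assumptions, since the list of supported placeholders is not reproduced here; substitute the placeholders your platform actually supports.

```json
"filename_templates": [
  {
    "filename_template": "stores_{{FD_DATE}}_{{FD_TIME}}.txt",
    "file_description": "Daily store export in TXT format"
  }
]
```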
⬇️ Source parameters
There can only be one source block, as STS data operations can only process one source at a time.
Google Cloud Storage source
Example:
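A minimal sketch of a GCS source block, assembled from the parameters in the table below. Project and bucket names are illustrative, and the encrypted-secret payload is elided.

```json
"source": {
  "type": "gcs",
  "gcp_project_id": "my-gcp-project",
  "gcs_source_bucket": "my-source-bucket",
  "gcs_source_prefix": "some/sub/dir",
  "archive_prefix": "archive",
  "gcp_credentials_secret": { "...": "..." }
}
```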
Parameter | Description |
---|---|
type type: string mandatory | Type of source. In this case: "gcs". |
gcp_project_id type: string mandatory | GCP project where the source configuration and associated Cloud Functions will be deployed. If not set, the user will be prompted to choose a profile to deploy the configuration with. |
gcs_source_bucket type: string mandatory | Name of the source bucket. |
gcs_source_prefix type: string mandatory | Path where the files will be found, e.g. "some/sub/dir". |
archive_prefix type: string optional | Path where the source files will be archived. If present and populated, the STS data operation will archive the source files in the location specified, in the GCS source bucket. If not present or empty, there will be no archiving. |
gcp_credentials_secret type: dict mandatory | Encrypted credentials needed to read/move data from the source bucket. You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page. |
Amazon S3 source
Example:
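A minimal sketch of an S3 source block, assembled from the parameters in the table below. The bucket name and access key are illustrative, and the encrypted-secret payload is elided.

```json
"source": {
  "type": "s3",
  "s3_source_bucket": "my-source-bucket",
  "s3_source_prefix": "some/sub/dir",
  "archive_prefix": "archive",
  "aws_access_key": "AKIA...",
  "aws_access_key_secret": { "...": "..." }
}
```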
Parameter | Description |
---|---|
type type: string mandatory | Type of source. In this case: "s3". |
s3_source_bucket type: string mandatory | Name of the source S3 bucket. |
s3_source_prefix type: string mandatory | Path where the files will be found, e.g. "some/sub/dir". |
archive_prefix type: string optional | Path where the source files will be archived. If present and populated, the STS data operation will archive the source files in the location specified, in the S3 source bucket. If not present or empty, there will be no archiving. |
aws_access_key type: string mandatory | Amazon S3 access key ID. |
aws_access_key_secret type: dict mandatory | Encrypted Amazon S3 access private key. This is needed to read/move data from the source bucket. To learn how to encrypt the private key value, refer to this page. |
Azure source
Example:
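A minimal sketch of an Azure source block, assembled from the parameters in the table below. The storage name is illustrative, and the encrypted connection string is elided.

```json
"source": {
  "type": "azure",
  "azure_source_storage": "mysourcestorage",
  "azure_source_prefix": "some/sub/dir",
  "archive_prefix": "archive",
  "azure_connection_string_secret": { "...": "..." }
}
```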
Parameter | Description |
---|---|
type type: string mandatory | Type of source. In this case: "azure". |
azure_source_storage type: string mandatory | Name of the source Azure storage. |
azure_source_prefix type: string mandatory | Path where the files will be found, e.g. "some/sub/dir". |
archive_prefix type: string optional | Path where the source files will be archived. If present and populated, the STS data operation will archive the source files in the location specified, in the Azure source storage. If not present or empty, there will be no archiving. |
azure_connection_string_secret type: dict mandatory | Encrypted Azure access private key. This is needed to read/move data from the source bucket. To learn how to encrypt the private key value, refer to this page. |
SFTP source
Example:
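A minimal sketch of an SFTP source block, assembled from the parameters in the table below. Host, directory, and user ID are illustrative, and the encrypted password payload is elided.

```json
"source": {
  "type": "sftp",
  "sftp_source_directory": "outgoing/exports",
  "sftp_source_filename": "stores_20201116_124213.txt",
  "sftp_host": "sftp.example.com",
  "sftp_port": 22,
  "sftp_userid": "john_doe",
  "sftp_authentication_method": "USERNAME_PASSWORD",
  "sftp_password_secret": { "...": "..." }
}
```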
Parameter | Description |
---|---|
type type: string mandatory | Type of source. In this case: "sftp". |
sftp_source_directory type: string mandatory | Sub-path to switch to before downloading the file. |
sftp_source_filename type: string mandatory | File to retrieve. |
archive_prefix type: string optional | Path where the source files will be archived. If present and populated, the STS data operation will archive the source files in the location specified on the source SFTP server. If not present or empty, there will be no archiving. |
sftp_host type: string mandatory | SFTP host, e.g. "sftp.something.com". |
sftp_port type: integer mandatory | SFTP port, e.g. "22". |
sftp_userid type: string mandatory | SFTP user ID, e.g. "john_doe". |
sftp_authentication_method type: string optional | Authentication method used to connect to the SFTP server. The following methods are supported: USERNAME_PASSWORD and PRIVATE_KEY.
Default: USERNAME_PASSWORD |
sftp_password_secret type: dict optional | Encrypted SFTP password for the user ID. This is needed to read/move data from the source SFTP. This attribute MUST be set if sftp_authentication_method is set to USERNAME_PASSWORD. To learn how to encrypt the password, refer to this page. |
sftp_private_key_secret type: dict optional | Encrypted SFTP private key. This attribute MUST be set if sftp_authentication_method is set to PRIVATE_KEY. To learn how to encrypt the password, refer to this page. |
sftp_private_key_passphrase_secret type: dict optional | Encrypted passphrase of the SFTP private key, if the key is protected by one. This attribute MUST be set if sftp_authentication_method is set to PRIVATE_KEY. To learn how to encrypt the passphrase, refer to this page. |
⬆️ Destination parameters
These parameters allow you to specify a list of destinations. You can add as many "destination" sub-objects as you want; they will all be processed.
Google Cloud Storage destination
Example:
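A minimal sketch of a GCS destination block, assembled from the parameters in the table below. The bucket name is illustrative, and the encrypted-secret payload is elided.

```json
{
  "type": "gcs",
  "gcs_destination_bucket": "my-destination-bucket",
  "gcs_destination_prefix": "/subdir/subdir_2",
  "gcp_credentials_secret": { "...": "..." }
}
```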
Parameter | Description |
---|---|
type type: string mandatory | Type of destination. In this case: "gcs". |
gcs_destination_bucket type: string mandatory | Google Cloud Storage destination bucket. |
gcs_destination_prefix type: string mandatory | Google Cloud Storage destination path, e.g. "/subdir/subdir_2" to send the files to "gs://BUCKET/subdir/subdir_2/source_file.ext" |
gcp_credentials_secret type: dict mandatory | Encrypted credentials needed to read/write/move data from the destination bucket. You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page. |
Amazon S3 destination
Example:
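A minimal sketch of an S3 destination block, assembled from the parameters in the table below. The bucket name and access key are illustrative, and the encrypted-secret payload is elided.

```json
{
  "type": "s3",
  "s3_bucket": "my-destination-bucket",
  "s3_destination_prefix": "subdir_A/subdir_B",
  "aws_access_key": "AKIA...",
  "aws_access_key_secret": { "...": "..." }
}
```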
Parameter | Description |
---|---|
type type: string mandatory | Type of destination. In this case: "s3". |
s3_bucket type: string mandatory | Amazon S3 bucket name. |
s3_destination_prefix type: string mandatory | Amazon S3 destination path, e.g. "subdir_A/subdir_B" to send the files to "s3://bucket/subdir_A/subdir_B/source_file.ext". |
aws_access_key type: string mandatory | Amazon S3 access key ID. |
aws_access_key_secret type: dict mandatory | Encrypted Amazon S3 access private key. This is needed to read/write/move data from the destination bucket. To learn how to encrypt the private key value, refer to this page. |
Azure destination
Example:
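A minimal sketch of an Azure destination block, assembled from the parameters in the table below. The storage name is illustrative, and the encrypted connection string is elided.

```json
{
  "type": "azure",
  "azure_destination_prefix": "mydestinationstorage/subdir",
  "azure_connection_string_secret": { "...": "..." }
}
```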
Parameter | Description |
---|---|
type type: string mandatory | Type of destination. In this case: "azure". |
azure_destination_prefix type: string mandatory | Complete Azure destination path, i.e. storage name and subdirectory if needed. |
azure_connection_string_secret type: dict mandatory | Encrypted Azure access private key. This is needed to write data to the destination storage. To learn how to encrypt the private key value, refer to this page. |
SFTP destination
Example:
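A minimal sketch of an SFTP destination block, assembled from the parameters in the table below. Host, directory, and user ID are illustrative, and the encrypted-secret payloads are elided.

```json
{
  "type": "sftp",
  "sftp_destination_dir": "/upload/incoming",
  "sftp_destination_dir_create": "true",
  "sftp_host": "sftp.example.com",
  "sftp_port": "22",
  "sftp_userid": "john_doe",
  "sftp_authentication_method": "PRIVATE_KEY",
  "sftp_private_key_secret": { "...": "..." }
}
```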
Parameter | Description |
---|---|
type type: string mandatory | Type of destination. In this case: "sftp". |
generate_top_file type: string optional | This flag, if set, will generate a TOP file along with the copied file. Possible values are:
|
sftp_destination_dir type: string mandatory | Path to switch to before uploading the file. |
sftp_destination_dir_create type: string mandatory | If set, the data operation will try to create the subdirectory specified in sftp_destination_dir on the SFTP filesystem before switching to it and copying the files. |
sftp_host type: string mandatory | SFTP host, e.g. "sftp.something.com". |
sftp_port type: integer mandatory | SFTP port, e.g. "22". |
sftp_userid type: string mandatory | SFTP user ID, e.g. "john_doe". |
sftp_authentication_method type: string optional | Authentication method used to connect to the SFTP server. The following methods are supported: USERNAME_PASSWORD and PRIVATE_KEY.
Default: USERNAME_PASSWORD |
sftp_password_secret type: dict optional | Encrypted SFTP password for the user ID. This is needed to write data to the destination SFTP server. This attribute MUST be set if sftp_authentication_method is set to USERNAME_PASSWORD. To learn how to encrypt the password, refer to this page. |
sftp_private_key_secret type: dict optional | Encrypted SFTP private key. This attribute MUST be set if sftp_authentication_method is set to PRIVATE_KEY. To learn how to encrypt the password, refer to this page. |
sftp_private_key_passphrase_secret type: dict optional | Encrypted SFTP private key passphrase if provided. This attribute MUST be set if sftp_authentication_method is set to PRIVATE_KEY. To learn how to encrypt the password, refer to this page. |