Convert XML to CSV configuration file

This page describes the configuration file for a Convert XML to CSV data operation.
The configuration file is in JSON format and contains the following sections:

👁️‍🗨️
Example

Here is an example of a Convert XML to CSV configuration file:
{
  "$schema": "http://jsonschema.tailer.ai/schema/xml-conversion-veditor",
  "configuration_type": "xml-conversion",
  "configuration_id": "000099-test-xml-conversion",
  "environment": "DEV",
  "account": "000099",
  "activated": true,
  "archived": false,
  "doc_md": "readme.md",
  "gcp_project_id": "fd-io-jarvis-demo-dlk",
  "gcs_bucket": "fd-io-demo-ds",
  "gcs_working_directory": "test_xml_conversion",
  "credentials": {
    "gcp-credentials.json": {
      "content": {
        "cipher_aes": "dd34e56f4...",
        "tag": "3d968340...",
        "ciphertext": "046ffe41f9c00448ea0f816119...",
        "enc_session_key": "36a0f8ffe1b1f0..."
      }
    }
  },
  "filename_templates": [
    {
      "filename_template": "coupon_{{FD_DATE}}.xml",
      "file_description": "This is a description.",
      "xsd_schema_file": "coupon.xsd",
      "output_suffix_filters": [
        "advantage.tsv",
        "barcodeType.tsv"
      ]
    }
  ]
}

🌐
Global parameters

General information about the data operation.
Parameter
Description
$schema
type: string
optional
The URL of the JSON schema that describes the properties your configuration must satisfy. Most code editors can use it to validate your configuration, display help boxes and highlight issues.
configuration_type
type: string
mandatory
Type of data operation.
For a Convert XML to CSV data operation, the value is always "xml-conversion".
configuration_id
type: string
mandatory
ID of the data operation.
You can pick any name you want, but it has to be unique for this data operation type.
Note that in case of conflict, the newly deployed data operation will overwrite the previous one. To guarantee its uniqueness, the best practice is to name your data operation by concatenating:
  • your account ID,
  • "xml-conversion",
  • and a description of the data to convert.
environment
type: string
mandatory
Deployment context.
Values: PROD, PREPROD, STAGING, DEV.
account
type: string
mandatory
Your account ID is a 6-digit number assigned to you by your Tailer Platform administrator.
activated
type: boolean
optional
Flag used to enable/disable the execution of the data operation.
If not specified, the default value will be "true".
archived
type: boolean
optional
Flag used to enable/disable the visibility of the data operation's configuration and runs in Tailer Studio.
If not specified, the default value will be "false".
doc_md
type: string
optional
Path to a file containing a detailed description of the data operation. The file must be in Markdown format.
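The constraints listed in this table can be checked locally before a configuration is deployed. The following is a minimal sketch, not part of the Tailer tooling: the function name check_global_parameters is hypothetical, and it only covers the rules stated on this page (the mandatory parameters, the fixed configuration_type value, the four environment values, the 6-digit account ID and the two optional boolean flags).

```python
import re
from typing import List

# Values taken from the parameter descriptions on this page.
ALLOWED_ENVIRONMENTS = ("PROD", "PREPROD", "STAGING", "DEV")
MANDATORY_KEYS = ("configuration_type", "configuration_id", "environment", "account")
OPTIONAL_BOOLEANS = ("activated", "archived")


def check_global_parameters(config: dict) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    # Every mandatory parameter must be present.
    for key in MANDATORY_KEYS:
        if key not in config:
            problems.append(f"missing mandatory parameter: {key}")

    # For this data operation the type is always "xml-conversion".
    if "configuration_type" in config and config["configuration_type"] != "xml-conversion":
        problems.append('configuration_type must be "xml-conversion"')

    # The deployment context must be one of the four documented values.
    if "environment" in config and config["environment"] not in ALLOWED_ENVIRONMENTS:
        problems.append("environment must be one of: " + ", ".join(ALLOWED_ENVIRONMENTS))

    # The account ID is a 6-digit number stored as a string.
    if "account" in config:
        account = config["account"]
        if not (isinstance(account, str) and re.fullmatch(r"[0-9]{6}", account)):
            problems.append("account must be a string of exactly 6 digits")

    # The optional flags, when present, must be JSON booleans.
    for key in OPTIONAL_BOOLEANS:
        if key in config and not isinstance(config[key], bool):
            problems.append(f"{key} must be a boolean")

    return problems
```

A typical use would be to load the file with json.load and print whatever check_global_parameters returns. The check is deliberately narrow: it does not replace validation against the JSON schema referenced by $schema.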

💼
Working folder parameters

Information related to the input/output working directory in Google Cloud Storage.
Parameter
Description
gcp_project_id
type: string
mandatory
Google Cloud Platform project ID for the bucket where the data is going to be converted.
gcs_bucket
type: string
mandatory
Name of the GCS bucket where the data is going to be converted.
gcs_working_directory
type: string
mandatory
Path in the GCS bucket where the input files will be placed, and the output files generated, e.g. "some/sub/dir".
gcp_credentials_secret
type: dict
mandatory
Encrypted credentials needed to read and write data in the GCS bucket.
You should have generated credentials when setting up GCP. To learn how to encrypt them, refer to this page.

💱
Conversion parameters

Information about the input file to process and the output files generated.
Parameter
Description
filename_templates
type: array
mandatory
Array containing one or several filename_template parameters (see below).
filename_template
type: string
mandatory
When a file with a name matching the template set here is added to the specified GCS folder, the conversion will be launched automatically.
The following placeholders are currently supported:
  • "FD_DATE" looks for an 8-digit date (e.g. "20191015").
  • "FD_TIME" looks for a 6-digit time (e.g. "124213").
  • "FD_BLOB_XYZ", where XYZ is a positive integer, looks for a string of exactly XYZ characters.
Example 1
This template:
"stores_{{FD_DATE}}_{{FD_TIME}}.txt"
will allow you to process files such as:
"stores_20201116_124213.txt"
Example 2
This template:
"{{FD_DATE}}_{{FD_BLOB_5}}_fixedvalue_{{FD_BLOB_11}}.gz"
will allow you to process files such as:
"20201116_12397_fixedvalue_12312378934.gz"
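To test a template against sample file names before deploying, the placeholder rules can be approximated with a regular expression. This is only an illustrative sketch: the platform's actual matching logic is not documented here, the helper names template_to_regex and matches are hypothetical, and the sketch assumes that the whole file name must match the template and that an FD_BLOB placeholder accepts any characters.

```python
import re

# Recognizes {{FD_DATE}}, {{FD_TIME}} and {{FD_BLOB_n}} with n a positive integer.
_PLACEHOLDER = re.compile(r"\{\{(FD_DATE|FD_TIME|FD_BLOB_([1-9][0-9]*))\}\}")


def template_to_regex(template: str) -> "re.Pattern":
    """Translate a filename template into a compiled regular expression."""
    parts = []
    position = 0
    for match in _PLACEHOLDER.finditer(template):
        # Literal text between placeholders is escaped so "." stays a dot.
        parts.append(re.escape(template[position:match.start()]))
        name = match.group(1)
        if name == "FD_DATE":
            parts.append(r"[0-9]{8}")  # 8-digit date, e.g. 20191015
        elif name == "FD_TIME":
            parts.append(r"[0-9]{6}")  # 6-digit time, e.g. 124213
        else:
            # FD_BLOB_n: exactly n characters (assumed: any character).
            parts.append(".{%s}" % match.group(2))
        position = match.end()
    parts.append(re.escape(template[position:]))
    return re.compile("".join(parts))


def matches(template: str, filename: str) -> bool:
    """Return True when the whole file name fits the template."""
    return template_to_regex(template).fullmatch(filename) is not None


if __name__ == "__main__":
    print(matches("coupon_{{FD_DATE}}.xml", "coupon_20210404.xml"))
    print(matches("stores_{{FD_DATE}}_{{FD_TIME}}.txt", "stores_20201116_124213.txt"))
    print(matches("coupon_{{FD_DATE}}.xml", "coupon_2021.xml"))
```

Anything that is not a recognized placeholder, including a malformed one such as {{FD_BLOB_0}}, is treated as literal text in this sketch.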
file_description
type: string
mandatory
A short description of the file template entry.
xsd_schema_file
type: string
mandatory
Name of the XSD file that will be used to validate the XML file before the conversion.
In the current version, only one XSD can be used per XML entry. The XSD file name must be identical to the corresponding XML file name, excluding suffixes. For example, "coupon.xsd" can be used to validate "coupon_20210404.xml".
output_suffix_filters
type: array of strings
optional
Names of the output files to be kept after the conversion.
If the XML file contains many child entities, the conversion will create a lot of CSV files (one for each entity). This filter lets you avoid uploading unnecessary files to the output bucket.
It works by looking for an occurrence of each filter string in the file name. For example, if a file named "coupon_20210404_advantage.tsv" is generated and the filter "advantage.tsv" was added, then this file will be kept.
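The filtering rule can be sketched in a few lines. This is an approximation of the behavior described above, not the platform's code: the function name keep_output_file is hypothetical, and the sketch assumes that every generated file is kept when no filter is configured, since the parameter is optional.

```python
from typing import List, Optional


def keep_output_file(filename: str, output_suffix_filters: Optional[List[str]]) -> bool:
    """Return True if the generated file should be kept."""
    # Assumption: with no filters configured, nothing is discarded.
    if not output_suffix_filters:
        return True
    # A file is kept as soon as one filter string occurs in its name.
    return any(suffix in filename for suffix in output_suffix_filters)


if __name__ == "__main__":
    filters = ["advantage.tsv", "barcodeType.tsv"]
    generated = [
        "coupon_20210404_advantage.tsv",
        "coupon_20210404_barcodeType.tsv",
        "coupon_20210404_store.tsv",  # hypothetical extra entity, for illustration
    ]
    print([name for name in generated if keep_output_file(name, filters)])
```

Because the rule is a plain substring test, a short filter can keep more files than intended, so including the file extension in each filter, as in the example configuration, narrows the match.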