Table to Storage: SQL file


To run a Table to Storage data operation, you first need to prepare a SQL query that will extract the data to export.

The SQL file must contain a BigQuery standard SQL query. You can write it directly in the query editor and then save it into a .sql file.

This query will be executed when the data operation is launched, and the result will be stored in the JSON file specified in your configuration.
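Each row of the query result becomes one line of that file. Assuming a newline-delimited JSON export (the sample values below are invented for illustration), a line could look like this:

{"timestamp": "2023-05-04 12:00:00 UTC", "firestore_path": "my-collection|my-document", "status": "SUCCESS"}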

For a GBQ to Firestore data pipeline, the query must select at least a firestore_path column (see the pattern below).

📋 Global SQL pattern

In order to use the Python script to load the data into Firestore, the SQL that extracts it must follow a specific pattern.

• timestamp (type: timestamp, optional): For your different use cases, it can be useful to have the last calculation date of your dataset in Firestore. This column is optional but recommended.

• firestore_path (type: string, mandatory): Variable read by the Python code to build the target path of documents and collections in Firestore. Each category and sub-category must be separated by a pipe "|". ⚠ Remember to remove any "|" that could appear in the variables used as path elements, or it would be interpreted as a category separator! ⚠ The Firestore path is an alternating succession of collections and documents, and it must end on a document inside a collection: the defined path must therefore contain an even number of categories and sub-categories. See the first example below.

• other variables (type: string, optional): The other variables are the ones you want to display in your Firestore document. You can define as many variables as you like, as described in the first example below. The second example shows how to create sub-categories.
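To make this convention concrete, here is a minimal Python sketch (with invented sample values; this is not the actual Tailer loading script) showing how a pipe-separated firestore_path maps onto alternating collections and documents:

# Hypothetical illustration of the firestore_path convention.
# Sample values are invented; this is not the actual Tailer loading script.
path = "app-data|000000|product-details|SS23"
segments = path.split("|")

# The path must end on a document, i.e. contain an even number of elements.
assert len(segments) % 2 == 0, "firestore_path must contain an even number of elements"

for index, segment in enumerate(segments):
    kind = "collection" if index % 2 == 0 else "document"
    print(f"{kind}: {segment}")
# Output:
# collection: app-data
# document: 000000
# collection: product-details
# document: SS23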

First SQL example

SELECT
  CURRENT_TIMESTAMP() AS timestamp,
  CONCAT(
    "tailer-activities-runs", 
    "|", REPLACE(account, "|", "_"), -- do not forget to replace any potential pipes!
    "|", REPLACE(configuration_type, "|", "_"),  
    "|", REPLACE(configuration_id, "|", "_"),  
    "|", "freshness", 
    "|", "job_id", 
    "|", REPLACE(job_id, "|", "_"), 
    "|", "next_execution"
  ) AS firestore_path,
-- here starts the "other variables"  
  account,
  configuration_type,
  configuration_id,
  job_id,
  CONCAT("freshness_", job_id) AS collection_groupe_name,
  last_execution_datetime,
  next_execution_datetime,
  frequence,
  status
FROM
  `tailer-demo.dlk_tailer_bda_freshness.metrics`
WHERE
  (1 = 1) -- you could add filters here

You will get a BigQuery result like this, with one row per target Firestore document:

After loading it into Firestore (see the next pages for the following steps), the collections and documents specified in the firestore_path column are created, and the other columns become the fields of each document:
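To illustrate that loading step, here is a hedged Python sketch using the google-cloud-firestore client (row values are invented; the actual loading logic belongs to the Tailer script described on the "File-to-firestore python file" page):

# Hypothetical sketch of writing one result row to Firestore;
# not the actual Tailer file-to-firestore script, and values are invented.
from google.cloud import firestore

db = firestore.Client()

row = {
    "account": "acme",
    "configuration_type": "tables-to-tables",
    "configuration_id": "load_sales",
    "job_id": "job_42",
    "status": "SUCCESS",
    "firestore_path": "tailer-activities-runs|acme|tables-to-tables|load_sales"
                      "|freshness|job_id|job_42|next_execution",
}

# Firestore document paths use "/" as separator: collection/document/collection/document...
segments = row.pop("firestore_path").split("|")
db.document("/".join(segments)).set(row)  # the remaining columns become the document's fields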

Example 2: create a list in the document

You can also create a list in the document using a data column, built with BigQuery's ARRAY_AGG(STRUCT(...)) function.

• data (type: struct of strings, or struct of structs of strings, mandatory): Allows you to create a list of N sub-elements attached to the element (for example, for a product, you can create the list of its sales sorted by date).

The SQL is more complex. Here is an example:

WITH
  tmp AS (
    SELECT
      "app-data" || "|" || "000000" || "|" || "product-details" || "|" || season_code || "|" || "references" || "|" || reference_color_id AS firestore_path,
      CURRENT_TIMESTAMP() AS extraction_timestamp,
      "000000" AS account,
      season_code,
      reference_color_id,
      ARRAY_AGG(
        STRUCT(
          date,
          discount_value_,
          sales_
        )
        ORDER BY date ASC
      ) AS data
    FROM
      `dlk_bda_pa_demo.product_metrics_details`
    WHERE
      (1 = 1) -- you could add filters here
    GROUP BY
      firestore_path,
      extraction_timestamp,
      account,
      season_code,
      reference_color_id
  )
SELECT
  extraction_timestamp,
  firestore_path,
  account,
  season_code,
  reference_color_id,
  STRUCT(data AS data) AS details
FROM
  tmp

The result looks like this in Firestore:
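As a textual stand-in for the screenshot, here is a hedged Python sketch of reading such a document back ("SS23" and "REF_001" are invented sample values for season_code and reference_color_id):

# Hypothetical read-back of a document produced by the query above;
# "SS23" and "REF_001" are invented sample identifiers.
from google.cloud import firestore

db = firestore.Client()
snapshot = db.document("app-data/000000/product-details/SS23/references/REF_001").get()

document = snapshot.to_dict()
for entry in document["details"]["data"]:  # the list built by ARRAY_AGG(STRUCT(...))
    print(entry["date"], entry["discount_value_"], entry["sales_"])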

