Table to Storage: SQL file

To run a Table to Storage data operation, you first need to prepare a SQL query that will extract the data to export.

The SQL file must contain a BigQuery standard SQL query. You can write it directly in the BigQuery query editor and then save it into a .sql file.

This query will be executed when the data operation is launched, and the result will be stored in the JSON file specified in your configuration.

For a GBQ to Firestore data pipeline, your query must select at least a firestore_path column.

📋 Global SQL pattern

In order to use the Python script to load the data into Firestore, the SQL that extracts it must follow a specific pattern.

| Column | Type | Required | Description |
| --- | --- | --- | --- |
| timestamp | timestamp | optional | For many use cases, it can be useful to store the last calculation date of your dataset in Firestore. This column is optional but recommended. |
| firestore_path | string | mandatory | Read by the Python code to build the target path of documents and collections in Firestore. Each category and sub-category must be separated by a pipe character (`\|`). ⚠ Remember to remove any pipe characters from the variables used as path names, or they will be interpreted as category separators! ⚠ The Firestore path is a succession of collections and documents, and must target a document inside a collection of documents: the defined path must therefore contain an even number of categories and sub-categories. See a screenshot of an example below. |
| other variables | string | optional | Any other variables you want to display in your Firestore document. |

You can define as many variables as you like, as described in the first example below.

The second example shows how to create sub-categories.
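The path rules above can be sketched in Python. This is a hypothetical `validate_firestore_path` helper for illustration only, not part of the actual Tailer loader:

```python
def validate_firestore_path(path: str) -> bool:
    """Validate a pipe-separated Firestore path (illustrative helper).

    Segments alternate collection / document names. To target a
    document, the path needs an even, non-zero segment count,
    and no segment may be empty.
    """
    segments = path.split("|")
    return (
        len(segments) % 2 == 0
        and all(seg.strip() for seg in segments)
    )

# Paths shaped like the first SQL example below:
good = "tailer-activities-runs|acc|runs|cfg|freshness|job_id|123|next_execution"
bad = "tailer-activities-runs|acc|runs"  # odd number of segments
```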

👁️‍🗨️ First SQL example

SELECT
  CURRENT_TIMESTAMP() AS timestamp,
  CONCAT(
    "tailer-activities-runs", 
    "|", REPLACE(account, "|", "_"), -- do not forget to replace any potential pipes!
    "|", REPLACE(configuration_type, "|", "_"),  
    "|", REPLACE(configuration_id, "|", "_"),  
    "|", "freshness", 
    "|", "job_id", 
    "|", REPLACE(job_id, "|", "_"), 
    "|", "next_execution"
  ) AS firestore_path,
-- here starts the "other variables"  
  account,
  configuration_type,
  configuration_id,
  job_id,
  CONCAT("freshness_", job_id) AS collection_groupe_name,
  last_execution_datetime,
  next_execution_datetime,
  frequence,
  status
FROM
  `tailer-demo.dlk_tailer_bda_freshness.metrics`
WHERE
  (1 = 1) -- you could add filters here

You will get a BigQuery result like this:

After loading it into Firestore (see the next pages for the next steps), the collections and documents specified in the firestore_path column are created, and the data appears in Firestore like this:
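As a simplified sketch of this loading step (assumed behavior for illustration, not the actual Tailer Python code), each result row can be pictured as a Firestore path plus a dictionary of document fields:

```python
def row_to_document(row: dict):
    """Split a BigQuery result row into a Firestore path and a document body.

    Returns the path as (collection, document) pairs plus the field dict
    made of all remaining columns (the "other variables").
    """
    segments = row["firestore_path"].split("|")
    # Pair up collection/document names: [c1, d1, c2, d2, ...]
    path_pairs = list(zip(segments[0::2], segments[1::2]))
    fields = {k: v for k, v in row.items() if k != "firestore_path"}
    return path_pairs, fields

# Hypothetical row, shortened for readability:
row = {
    "firestore_path": "tailer-activities-runs|acme|freshness|job-42",
    "timestamp": "2023-01-01T00:00:00Z",
    "status": "SUCCESS",
}
pairs, fields = row_to_document(row)
# pairs  -> [("tailer-activities-runs", "acme"), ("freshness", "job-42")]
# fields -> {"timestamp": "2023-01-01T00:00:00Z", "status": "SUCCESS"}
```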

👁️‍🗨️ Example 2: create a list in the document

You can also create a list in the document using a "data" column built with the BigQuery ARRAY_AGG(STRUCT()) function.

| Variable | Type | Required | Description |
| --- | --- | --- | --- |
| data | array of struct of strings (or of nested structs) | mandatory | Lets you create a list of sub-elements that occur N times for each element (for example, for a product, you can create a list of sales sorted by date). |

The SQL is more complex. Here is an example:

WITH
  tmp AS (
  SELECT
    "app-data"|| "|" || "000000" || "|" || "product-details" || "|" ||season_code || "|" ||"references"|| "|" ||reference_color_id AS firestore_path,
    CURRENT_TIMESTAMP() AS extraction_timestamp,
    "000000" AS account,
    season_code,
    reference_color_id,
    ARRAY_AGG(STRUCT(
        date,
        discount_value_,
        sales_)
      ORDER BY date ASC
    ) AS data
  FROM
    `dlk_bda_pa_demo.product_metrics_details`
  WHERE
    (1=1) -- you could add filters here
  GROUP BY
    firestore_path,
    extraction_timestamp,
    account,
    season_code,
    reference_color_id)
SELECT
  extraction_timestamp,
  firestore_path,
  account,
  season_code,
  reference_color_id,
  STRUCT(data AS data) AS details
FROM
  tmp

The result looks like this in Firestore:
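To make the resulting structure concrete, here is a plain-Python sketch (hypothetical names, not the actual loader code) of how flat rows grouped by ARRAY_AGG(STRUCT()) become one document whose `details.data` field holds a date-sorted list:

```python
from collections import defaultdict

def build_detail_documents(rows):
    """Group flat metric rows into nested documents keyed by firestore_path.

    Mirrors what ARRAY_AGG(STRUCT(...) ORDER BY date) does in the SQL:
    each group of rows becomes one document whose details.data field
    is a date-sorted list of sub-elements.
    """
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["firestore_path"]].append(
            {"date": r["date"], "sales": r["sales"]}
        )
    return {
        path: {"details": {"data": sorted(items, key=lambda d: d["date"])}}
        for path, items in grouped.items()
    }

# Two hypothetical rows for the same product reference:
rows = [
    {"firestore_path": "app-data|000000|product-details|S23|references|R1",
     "date": "2023-02-01", "sales": 5},
    {"firestore_path": "app-data|000000|product-details|S23|references|R1",
     "date": "2023-01-15", "sales": 3},
]
docs = build_detail_documents(rows)
```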
