Export data
The fifth and final data operation of this tutorial exports our data back to a Cloud Storage bucket.
🗺️ Overview
During this step, we will take the aggregated store data located in a BigQuery table and export it to a CSV file in a Google Cloud Storage bucket, so it can later be used with other tools, such as a warehouse management system.
🗂️ Create a bucket and a folder
For the detailed procedure on how to create GCS buckets (manually or using gsutil), refer to this page.
- Create a bucket in the project of your choice. Bucket names must be globally unique, so pick any name that is still available, and select whatever settings suit you. 
- In this bucket, create a folder named store_clustering_export that will contain our future export file. 
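If you prefer the command line, creating the bucket with gsutil looks roughly like this (a sketch; the bucket name and location below are placeholders of my own, so substitute your own values):

```bash
# Example only: bucket names must be globally unique, so pick your own name
# and preferred location.
gsutil mb -l EU gs://my-tailer-demo-export-bucket

# GCS "folders" are just object name prefixes: create store_clustering_export
# from the Cloud Console, or let it appear when the first object is written
# under that prefix.
```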
📄 Create your configuration files
Create the JSON file that configures the data pipeline operation
- Access your tailer-demo folder. 
- Inside, create a folder named 5-Export-data for this new step. 
- In this folder, create a JSON file named 000099-tailer-demo-export-data.json for your data operation. 
- Copy the following contents into your file:

```json
{
  "$schema": "http://jsonschema.tailer.ai/schema/table-to-storage-veditor",
  "configuration_type": "table-to-storage",
  "configuration_id": "000099-tailer-demo-export_YOUR_NAME",
  "short_description": "Short description of the job",
  "environment": "DEV",
  "account": "000099",
  "version": "3",
  "activated": true,
  "archived": false,
  "start_date": "2023, 2, 10",
  "schedule_interval": "None",
  "dest_gcp_project_id": "my-gcp-project",
  "gcs_dest_bucket": "my-gcs-bucket",
  "gcs_dest_prefix": "output_YOUR_NAME",
  "print_header": true,
  "destination_format": "CSV",
  "field_delimiter": ",",
  "copy_table": true,
  "dest_gcp_project_id": "my-gcp-project",
  "dest_gbq_dataset": "my_dataset",
  "dest_gbq_table_suffix": "dag_execution_date",
  "tasks": [
    {
      "task_id": "demo_export",
      "sql_file": "my_SQL_file.sql",
      "output_filename": "demo_export_YOUR_NAME_{{FD_DATE}}.csv",
      "dest_gbq_table": "demo_export_YOUR_NAME"
    }
  ]
}
```
- Edit the following values:
  ◾ Replace my-gcp-project with the ID of the GCP project containing the source table. This is where the SQL query will be run.
  ◾ Replace my_dataset with the name of the dataset where you want to create a copy of the table generated by the SQL query.
  ◾ Replace my-gcs-bucket with the name of the bucket you've just created, where the export file will be generated.
  ◾ Replace my_SQL_file.sql with export_data.sql, the SQL file you will create in the next step.
  ◾ If you share the project with others, don't forget to personalize your outputs so you won't erase your teammates' work: search for "_YOUR_NAME" and replace all occurrences. 
- Save your file. 
Create a SQL file
- Inside the 5-Export-data folder, create a file named export_data.sql. 
- Copy the following contents into the export_data.sql file:

```sql
SELECT * FROM `my-gbq-dataset.store_clustering`
```
- Replace my-gbq-dataset with the name of the working dataset you used in the previous step. 
- Save your file. 
Create the JSON file that will trigger the workflow
- Inside the 5-Export-data folder, create a file named workflow.json. 
- Copy the following contents into your file:

```json
{
  "$schema": "http://jsonschema.tailer.ai/schema/workflow-veditor",
  "configuration_type": "workflow",
  "version": "2",
  "configuration_id": "000099-tailer-demo-export-data_YOUR_NAME",
  "short_description": "Launch the Tailer demo data load",
  "account": "000099",
  "environment": "DEV",
  "activated": true,
  "archived": false,
  "gcp_project_id": "my-project",
  "authorized_job_ids": [
    "gbq-to-gbq|000099-load_my_gbq_dataset_my_table_DEV"
  ],
  "target_dag": {
    "configuration_type": "table-to-storage",
    "configuration_id": "000099-tailer-demo-export_YOUR_NAME_DEV"
  }
}
```
- Save your file. 
▶️ Deploy the data operation
Once your files are ready, you can deploy the data operation:
- Access your working folder by running the following command:

```
cd "[path to your tailer folder]\tailer-demo\5-Export-data"
```

- To deploy the data operation, run the following command:

```
tailer deploy configuration 000099-tailer-demo-export-data.json
```
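The path above uses Windows-style backslashes; on macOS or Linux the same two steps look like this (a sketch, with the same placeholder path to adapt):

```bash
# macOS / Linux equivalent of the two commands above; replace the placeholder
# with the actual path to your tailer folder.
cd "[path to your tailer folder]/tailer-demo/5-Export-data"
tailer deploy configuration 000099-tailer-demo-export-data.json
```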
🖐️ Run your workflow manually
- Access Tailer Studio. 
- In the left navigation menu, select Table-to-storage. 
- In the Configurations tab, search for your data operation, 000099-tailer-demo-export-data. 
- Click the data operation ID to display its details. 
- In the upper right corner, click on Launch. 
🗳️ Check the result in GCS
Access the GCS folder in the bucket you've just created. Your data should now appear there as a CSV file that you can load into a different system.
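If you prefer the command line, you can also list the exported file with gsutil (a sketch; my-gcs-bucket and output_YOUR_NAME are the placeholder values from the configuration file, so substitute your own):

```bash
# List the exported CSV file(s) under the destination defined by
# gcs_dest_bucket and gcs_dest_prefix in the configuration file.
gsutil ls gs://my-gcs-bucket/output_YOUR_NAME/
```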
You can now add more files into the input folder from the first step of this tutorial to see the whole pipeline play out!
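For example, pushing another source file into the step-1 input folder with gsutil would look roughly like this (the bucket, folder, and file names are hypothetical; use the ones you set up in the first step):

```bash
# Hypothetical names: replace with the input bucket and folder from step 1
# and with a real source file to re-trigger the whole pipeline.
gsutil cp stores_new_batch.csv gs://my-input-bucket/input/
```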
🚀 Further steps
You can check the full Tables to Storage documentation and try other parameters:
- Add some tasks to perform different extractions 
- Create a JSON extract or compress the output using GZIP (see the sketch after this list) 
- Send the data file to a partner using a Storage to Storage operation, or ingest it into Firestore using a specific VM Launcher operation. 
- Insert the run date in your query using the "sql_query_template" parameter 
- Insert environment variables in your SQL using a Context configuration. 
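As an illustration of the JSON/GZIP bullet above, the output format is controlled by the same part of the configuration file that currently sets "destination_format": "CSV". A minimal sketch, assuming the parameter names mirror BigQuery's extract options as described in the Tables to Storage documentation (check that page for the exact names and accepted values):

```json
{
  "destination_format": "NEWLINE_DELIMITED_JSON",
  "compression": "GZIP"
}
```

With values like these, each task would produce a gzipped newline-delimited JSON file instead of a plain CSV; in that case the field_delimiter parameter no longer applies, and you would adjust output_filename accordingly.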