Prepare your data (BETA)

Once you have imported some data from your datasources in Toucan Toco, you may need to add some cleaning, pre-aggregations, transformations or computations, or to combine several datasets together.

Instead of repeating those operations in every query of your stories, you can prepare and load new datasets ready to be consumed in your stories. This improves both efficiency and performance by avoiding repetitive, resource-consuming operations that you would otherwise perform on the fly in every query of your stories.

Let’s see how this works.

Warning

Toucan Toco is a data storytelling platform, not a data preparation platform. What we offer is a tool to help you save time and improve performance when manipulating the datasets you need for storytelling purposes, which should involve relatively limited amounts of data (a few hundred thousand rows at most). At this stage, the feature is in BETA and we do not guarantee a robust experience for datasets larger than a few hundred thousand rows. We plan to improve the experience of manipulating large datasets in the future.

Please note in particular that, at this stage, the join feature that lets you combine 2 datasets will struggle if you try to join a dataset with more than a few thousand rows. Again, we are working to overcome those limits. In the meantime, we are glad to make this BETA available so that you can test the feature. Do not hesitate to send us your feedback!

Create a new prepared dataset

To create a new prepared dataset, go to your Data Explorer (in the “DATA” tab of your Studio toolbar):

Then click on the “ADD DATA” button in the upper right corner of your Data Explorer, and select “From existing data”:

Then pick an existing dataset to start from. Once that’s done, you can apply your transformations (in the example below, we combine the current dataset with other datasets, and then pre-aggregate the data at country level):

You may have noticed that data transformations are managed with the same tool that you use for queries in your stories (see Visual Query Builder).
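To make the idea of a prepared dataset more concrete, here is a minimal Python sketch of the kind of transformation described above: combining datasets, then pre-aggregating at country level. It is only an illustration of the concept, not how Toucan Toco implements it. The dataset names, columns and values (`sales_2023`, `sales_2024`, `revenue`) and the `prepare_country_revenue` function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical source datasets (illustrative data, not from Toucan Toco).
sales_2023 = [
    {"country": "France", "city": "Paris", "revenue": 120},
    {"country": "France", "city": "Lyon", "revenue": 80},
    {"country": "Spain", "city": "Madrid", "revenue": 100},
]
sales_2024 = [
    {"country": "France", "city": "Paris", "revenue": 150},
    {"country": "Spain", "city": "Madrid", "revenue": 90},
]


def prepare_country_revenue(*datasets):
    """Append the datasets, then pre-aggregate revenue at country level."""
    totals = defaultdict(int)
    for dataset in datasets:
        for row in dataset:
            totals[row["country"]] += row["revenue"]
    # One row per country, sorted by name for a stable order.
    return [{"country": c, "revenue": totals[c]} for c in sorted(totals)]


# Computed once, then reused by every query that needs country-level revenue.
prepared = prepare_country_revenue(sales_2023, sales_2024)
print(prepared)
```

With this sample data, `prepared` holds two rows: France with a revenue of 350 and Spain with a revenue of 190. Each story query can then read these two rows instead of repeating the append and the aggregation over the five source rows.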

Once you are happy with your transformations, you can save your new dataset via the button in the bottom right corner. You will then be asked to give your dataset a name:

Load / refresh a prepared dataset

After you have created a new prepared dataset and returned to your Data Explorer, you will see that the new dataset appears in orange, with a message indicating that it needs to be processed before it can be loaded and used in your stories:

To process your dataset, you have 2 options:

  1. Process only your dataset. This is the preferred option if it’s the only dataset that you need to refresh. When you do so, this dataset is refreshed, along with all the datasets that depend on it and all the datasets it depends on.
  2. Process all your datasets.

Edit or delete a prepared dataset

Of course, you can easily edit a prepared dataset and update your data transformations, or delete it:

Dependencies between prepared datasets

Several rules to keep in mind regarding dependencies between prepared datasets:

  • When you refresh a dataset, its parent datasets and its dependent datasets are processed as well.
  • You will not be able to delete a dataset if it is referenced in another dataset.
  • You will not be able to append or join another dataset to your current dataset if doing so would create a circular reference. Such a forbidden dataset will appear deactivated, in grey, in the dataset selection dropdown of the append/join step.
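The rules above can be pictured as operations on a dependency graph. The Python sketch below is one possible reading of them, not Toucan Toco's actual implementation. The `parents` map, the dataset names and all function names are hypothetical, and the refresh rule is read literally as "the dataset, everything it depends on, and everything that depends on it".

```python
# Hypothetical dependency map: each dataset lists the datasets it is built from.
parents = {
    "sales_raw": [],
    "countries": [],
    "sales_clean": ["sales_raw"],
    "sales_by_country": ["sales_clean", "countries"],
}


def ancestors(name, parents):
    """All datasets that `name` depends on, directly or indirectly."""
    seen = set()
    stack = list(parents.get(name, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def descendants(name, parents):
    """All datasets that depend on `name`, directly or indirectly."""
    return {other for other in parents if name in ancestors(other, parents)}


def refresh_set(name, parents):
    """Datasets processed when `name` is refreshed (literal reading of the rule)."""
    return {name} | ancestors(name, parents) | descendants(name, parents)


def can_delete(name, parents):
    """A dataset can be deleted only if no other dataset references it."""
    return all(name not in deps for deps in parents.values())


def would_create_cycle(current, candidate, parents):
    """True if appending or joining `candidate` into `current` would be circular."""
    return candidate == current or current in ancestors(candidate, parents)
```

In this sketch, refreshing `sales_clean` processes `sales_raw`, `sales_clean` and `sales_by_country`. `sales_raw` cannot be deleted because `sales_clean` references it. Joining `sales_by_country` into `sales_clean` is rejected, since `sales_by_country` is itself built from `sales_clean`, which is the situation where the dataset would appear greyed out in the dropdown.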