Geography

Add geographical zones to your data

Geographical datasets are datasets in which each row contains the coordinates of a zone that can be displayed on a map.

Geographical data

Geographical coordinates are data like any other, except that they are stored not in CSV or Excel files but in dedicated file types such as GeoJSON or Shapefile. You can experiment with these files in online tools like GeoJSON.io and Mapshaper.

A typical GeoJSON file describing zones consists of an array of “Features”, wrapped in a “FeatureCollection”. Each feature has:

  • a geometry which contains the list of coordinates necessary to draw the zone,
  • a set of properties describing the zone, a bit like a data row.

The properties are essential: they are what lets us associate each zone with the corresponding data.
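To make this structure concrete, here is a minimal sketch in Python of a FeatureCollection holding a single feature. The coordinates are a made-up rough outline, and the property names (country_code, name) are only illustrative; the helper properties_table is hypothetical and simply mimics the table view of GeoJSON.io.

```python
import json

# Hypothetical hand-drawn zone: a rough outline, not real borders.
feature_collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                # One closed ring of [longitude, latitude] pairs:
                # the first and last points are identical.
                "coordinates": [[
                    [-1.5, 43.5], [3.0, 42.5], [7.5, 44.0],
                    [8.0, 49.0], [2.5, 51.0], [-4.5, 48.5],
                    [-1.5, 43.5],
                ]],
            },
            # Properties describe the zone, a bit like a data row.
            "properties": {"country_code": "FR", "name": "France"},
        }
    ],
}


def properties_table(collection):
    """Return one dict of properties per feature, like rows of a table."""
    return [feature["properties"] for feature in collection["features"]]


rows = properties_table(feature_collection)
print(json.dumps(rows))
```

Listing only the properties, as this helper does, is usually enough to check which column can serve as a join key.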

If we draw the well-known hexagon of France in GeoJSON.io, we can observe the corresponding JSON on the right:

Very simple zone for France in GeoJSON.io

Here is a very simplified view of large western European countries created in this tool, as an example. Note the table view, which makes it easy to see the properties without scrolling through a long JSON full of coordinates.

Very simple western europe in GeoJSON.io

Note

In this doc, we use “zones” and “(geographical) features” (often abbreviated “geofeatures”) interchangeably.

Import geographical zones into your small app

Files like GeoJSON generally contain many zones, forming what we call a “mesh”.

These files can be imported in the “Data sources” interface, within the “Update Basemap” section.

Upload mesh section

Let’s add a new mesh by hitting “Add basemap”. It has 2 main properties:

  • mesh is the name of the file (it corresponds to the file attribute for data sources)
  • basemap is the id of this mesh, to differentiate it from potential others (equivalent to the domain attribute for data sources)

Creating a new basemap

From there, we can drop our GeoJSON file in the newly created block, then make the app read and save it by starting the “Populate basemaps” operation.

Populate basemaps

During this operation, the server reads the file and inserts each zone in the database so we can use them later. It also adds the basemap value to the properties of each zone of the mesh.

Note

Each time you drop a new mesh or modify the configuration of one, don’t forget to update the zones database by launching “Populate basemaps”.
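The tagging step of “Populate basemaps” can be pictured with the short Python sketch below. It is only an illustration of the idea described above, not the server's actual code: the function name tag_with_basemap is hypothetical, and storing the zones in the database is left out.

```python
import copy


def tag_with_basemap(feature_collection, basemap):
    """Return a copy of each feature with the basemap id added to its properties."""
    zones = []
    for feature in feature_collection["features"]:
        zone = copy.deepcopy(feature)  # leave the uploaded mesh untouched
        properties = zone.get("properties") or {}  # GeoJSON allows null properties
        properties["basemap"] = basemap
        zone["properties"] = properties
        zones.append(zone)
    return zones


# Geometries are set to None here only to keep the example short.
mesh = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"country_code": "FR"}},
        {"type": "Feature", "geometry": None, "properties": {"country_code": "DE"}},
    ],
}

zones = tag_with_basemap(mesh, "simple-western-europe")
for zone in zones:
    print(zone["properties"])
```

Because every zone carries its basemap id, a later request can select a whole mesh with a single filter on properties.basemap.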

Example shape files suitable for a mapchart are available here.

Join geographical zones to data rows

Now that we have geographical zones available in our database, it’s time to join them with a dataset.

Given the previously hand-drawn zones and this data:

code GDP
FR 2.583
UK 2.622
DE 3.677
IT 1.935
ES 1.311

Sample countries data

We can join the two with the option geofeatures in a data request:

query:
  domain: "western-europe-gdp-2017"
geofeatures:
  query:
    "properties.basemap": "simple-western-europe"
  join:
    on:
      features: "country_code"
      data: "code"

In geofeatures:

  • query filters the zones we want among all the available ones, usually based on properties,
  • join.on specifies the columns that must match in the features (zones) and in data.

All zone properties are made available in the dataset as columns.

A simple geodataframe

Note

Note the yellow map icon that appears next to every row that contains a zone. Clicking on this icon opens the zone in GeoJSON.io.

Advanced joins

Different joins are available, which can be specified in the join property:

  • how: "data" (default) keeps data rows even if no feature matches, and discards features that have no matching data row
  • how: "features" keeps features even if no data row matches, and discards data rows that have no matching feature
  • how: "outer" keeps every feature and every data row
  • how: "inner" discards every feature or data row that has no match on the other side

Example:

query:
  domain: "western-europe-gdp-2017"
geofeatures:
  query:
    "properties.basemap": "simple-western-europe"
  join:
    on:
      features: "country_code"
      data: "code"
    how: "features"

Tip

While you prepare your dataset, it’s a good idea to use "outer", so if the join fails, you’ll be able to see why by exploring both columns of data and features.
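The four join modes described above can be compared side by side with a small pure-Python sketch. It only mimics the documented behaviour on toy inputs and is not the product's implementation: the function join_geofeatures is hypothetical, it assumes join keys are unique on each side, and it pretends that the mesh contains FR, UK and DE while the data contains FR, UK and IT.

```python
def join_geofeatures(data_rows, features, data_key, feature_key, how="data"):
    """Join data rows with zone properties, mimicking the four `how` modes."""
    features_by_key = {f["properties"][feature_key]: f for f in features}
    data_keys = {row[data_key] for row in data_rows}
    joined = []
    for row in data_rows:
        feature = features_by_key.get(row[data_key])
        if feature is not None:
            # Matched: zone properties become columns (data wins on name clashes).
            joined.append({**feature["properties"], **row})
        elif how in ("data", "outer"):
            # Unmatched data rows survive only in "data" and "outer" modes.
            joined.append(dict(row))
    if how in ("features", "outer"):
        # Unmatched features survive only in "features" and "outer" modes.
        for key, feature in features_by_key.items():
            if key not in data_keys:
                joined.append(dict(feature["properties"]))
    return joined


data_rows = [
    {"code": "FR", "GDP": 2.583},
    {"code": "UK", "GDP": 2.622},
    {"code": "IT", "GDP": 1.935},
]
features = [
    {"properties": {"country_code": "FR"}, "geometry": None},
    {"properties": {"country_code": "UK"}, "geometry": None},
    {"properties": {"country_code": "DE"}, "geometry": None},
]

results = {
    how: join_geofeatures(data_rows, features, "code", "country_code", how)
    for how in ("data", "features", "outer", "inner")
}
for how, rows in results.items():
    print(how, [row.get("code") or row.get("country_code") for row in rows])
```

Running the same inputs through "outer" first, as the tip suggests, shows at a glance which keys exist on only one side.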

Hierarchy

As described in the previous section, it’s possible to express parent/child relationships between data rows.

If a hierarchy applies correctly to a dataset…

Hierarchical dataset of continents and countries

…you can join geofeatures to it just as you would without any hierarchy.

Hierarchical dataset joined with geographical zones

Note

The hierarchy is applied to a dataset after it has been joined with geofeatures, so the hierarchy can also be configured on geofeature properties.

Aggregate zones for higher hierarchical levels

It’s very common for the mesh to contain only the lowest-level features. Nevertheless, you may need higher-level zones to associate data with.

In the “Data sources” interface, under the “Upload basemaps” section, the hierarchy parameter lists the columns used to aggregate zones together, ordered from the top level to the lowest one.

During the “Populate basemaps” operation, small zones are aggregated into bigger ones, and some information is added to identify each zone and its level. Each zone gains the following properties:

  • current_type contains the name of the hierarchy column corresponding to the level of this zone
  • current_id contains the value of this column for this zone
  • parent_type and parent_id identify the parent zone

Example with world continents, subregions and countries

In this mesh of administrative countries (open in GeoJSON.io), each zone represents a country (identified by its name property), which belongs to a subregion and a continent.

With this configuration, each subregion and continent is created by aggregating the countries that belong to it:

Basemap configuration with hierarchy

A tabular recap of this process would be:

Zones in the mesh:

continent  subregion        name
Europe     Western Europe   France
Europe     Western Europe   Germany
Europe     Western Europe   Netherlands
Europe     Southern Europe  Italy
Europe     Southern Europe  Spain
Europe     Northern Europe  United Kingdom
Europe     Northern Europe  Denmark

After “Populate basemaps”:

continent  subregion        name            basemap  current_type  current_id       parent_type  parent_id
Europe                                      world    continent     Europe
Europe     Western Europe                   world    subregion     Western Europe   continent    Europe
Europe     Southern Europe                  world    subregion     Southern Europe  continent    Europe
Europe     Northern Europe                  world    subregion     Northern Europe  continent    Europe
Europe     Western Europe   France          world    name          France           subregion    Western Europe
Europe     Western Europe   Germany         world    name          Germany          subregion    Western Europe
Europe     Western Europe   Netherlands     world    name          Netherlands      subregion    Western Europe
Europe     Southern Europe  Italy           world    name          Italy            subregion    Southern Europe
Europe     Southern Europe  Spain           world    name          Spain            subregion    Southern Europe
Europe     Northern Europe  United Kingdom  world    name          United Kingdom   subregion    Northern Europe
Europe     Northern Europe  Denmark         world    name          Denmark          subregion    Northern Europe
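The rows of the table above can be derived from the mesh with a few lines of Python. This is only a sketch of the bookkeeping part, under the assumption that parents are emitted before their children; the function build_hierarchy_rows is hypothetical, and the merging of geometries that the real operation performs is left out entirely.

```python
def build_hierarchy_rows(zones, hierarchy, basemap):
    """Derive one row per zone at every hierarchy level, top level first."""
    rows = []
    seen = set()
    for depth, level in enumerate(hierarchy):
        for zone in zones:
            # Keep only the hierarchy columns down to the current level.
            row = {column: zone[column] for column in hierarchy[: depth + 1]}
            identity = tuple(row.values())
            if identity in seen:
                continue  # this aggregated zone was already created
            seen.add(identity)
            row["basemap"] = basemap
            row["current_type"] = level
            row["current_id"] = zone[level]
            if depth > 0:
                parent = hierarchy[depth - 1]
                row["parent_type"] = parent
                row["parent_id"] = zone[parent]
            rows.append(row)
    return rows


mesh_zones = [
    {"continent": "Europe", "subregion": "Western Europe", "name": "France"},
    {"continent": "Europe", "subregion": "Western Europe", "name": "Germany"},
    {"continent": "Europe", "subregion": "Western Europe", "name": "Netherlands"},
    {"continent": "Europe", "subregion": "Southern Europe", "name": "Italy"},
    {"continent": "Europe", "subregion": "Southern Europe", "name": "Spain"},
    {"continent": "Europe", "subregion": "Northern Europe", "name": "United Kingdom"},
    {"continent": "Europe", "subregion": "Northern Europe", "name": "Denmark"},
]

hierarchy_rows = build_hierarchy_rows(
    mesh_zones, ["continent", "subregion", "name"], "world"
)
for row in hierarchy_rows:
    print(row["current_type"], row["current_id"], row.get("parent_id"))
```

The order of the hierarchy list matters: it must go from the top level to the lowest one, exactly as in the basemap configuration.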

These can be requested and visualized independently:

  • for continents: Continents dataset Continents zones previewed
  • for sub-regions: Sub-regions dataset Sub-regions zones previewed
  • for countries: Countries dataset Countries zones previewed

The columns added by the “Populate basemaps” operation (current_type, current_id, parent_type, parent_id) are ready to use in the hierarchy property:

query: [...]
geofeatures:
  query:
    "properties.basemap": "world"
  join:
    on:
      features: "current_id"
      data: "entity_name"
    how: "features"
hierarchy:
  parent: ["parent_type", "parent_id"]
  id: ["current_type", "current_id"]

Hierarchical dataset of world continents, subregions and countries

Drilled view on western europe

Lazy

With many rows and levels, fetching everything at once can be costly and slow. That’s why it’s possible to load only the top drill level, then the subsequent ones only when required. See the docs about laziness in hierarchical datasets.

How can we lazy-load our geographical zones the same way we do the data rows?

By default, when the lazy mode is activated, the hierarchy root, id and parent columns also apply to geofeatures. For this to work, geofeatures must have the same columns in their properties.

With common hierarchy columns

Here is a very simple example of data and geofeatures that share the same hierarchy columns (location and parent_location):

parent_location  location       value
                 France         65000000
France           Île-de-France  12000000
Île-de-France    Paris          3000000

[
  {
    "properties": {
      "basemap": "france_zones",
      "location": "France"
    },
    "geometry": [...]
  },
  {
    "properties": {
      "basemap": "france_zones",
      "parent_location": "France",
      "location": "Île-de-France"
    },
    "geometry": [...]
  },
  {
    "properties": {
      "basemap": "france_zones",
      "parent_location": "Île-de-France",
      "location": "Paris"
    },
    "geometry": [...]
  }
]

In this case, the hierarchy configuration applies to both data and geofeatures:

query:
  domain: "france_data"

geofeatures:
  query:
    "properties.basemap": "france_zones"
  join:
    on:
      data: "location"
      features: "location"

hierarchy:
  id: ["location"]
  parent: ["parent_location"]

lazy: true

With different hierarchy columns

When data and features use different column names to represent the hierarchy, it’s possible to make them correspond by declaring one hierarchy property for the data and another one inside geofeatures. Both use the same keys as the usual hierarchy parameter (id, parent and root).

Here is a simple example:

parent_location  location       value
                 France         65000000
France           Île-de-France  12000000
Île-de-France    Paris          3000000

[
  {
    "properties": {
      "basemap": "france_zones",
      "zone_id": "France"
    },
    "geometry": [...]
  },
  {
    "properties": {
      "basemap": "france_zones",
      "parent_zone_id": "France",
      "zone_id": "Île-de-France"
    },
    "geometry": [...]
  },
  {
    "properties": {
      "basemap": "france_zones",
      "parent_zone_id": "Île-de-France",
      "zone_id": "Paris"
    },
    "geometry": [...]
  }
]

In this case, the hierarchy configuration must be different for data and geofeatures:

query:
  domain: "france_data"

hierarchy:
  id: ["location"]
  parent: ["parent_location"]

geofeatures:
  query:
    "properties.basemap": "france_zones"
  join:
    on:
      data: "location"
      features: "zone_id"
  hierarchy:
    id: ["zone_id"]
    parent: ["parent_zone_id"]

lazy: true
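To picture what lazy loading does with two separate hierarchy configurations, here is a rough Python sketch. It is an illustration of the principle only, not the server's implementation: the helpers children_of and load_level are hypothetical, and geometries are omitted so that only zone properties are shown.

```python
def children_of(records, parent_column, parent_value):
    """Return records whose parent column equals parent_value (None = top level)."""
    return [r for r in records if r.get(parent_column) == parent_value]


data_rows = [
    {"location": "France", "value": 65000000},
    {"parent_location": "France", "location": "Île-de-France", "value": 12000000},
    {"parent_location": "Île-de-France", "location": "Paris", "value": 3000000},
]
zone_properties = [
    {"basemap": "france_zones", "zone_id": "France"},
    {"basemap": "france_zones", "parent_zone_id": "France", "zone_id": "Île-de-France"},
    {"basemap": "france_zones", "parent_zone_id": "Île-de-France", "zone_id": "Paris"},
]


def load_level(parent_value):
    """Fetch one drill level: filter data and zones separately, then join them."""
    # Each side is filtered with its own parent column...
    rows = children_of(data_rows, "parent_location", parent_value)
    zones = children_of(zone_properties, "parent_zone_id", parent_value)
    # ...then joined on data "location" = features "zone_id".
    zones_by_id = {zone["zone_id"]: zone for zone in zones}
    return [{**zones_by_id.get(row["location"], {}), **row} for row in rows]


top_level = load_level(None)
second_level = load_level("France")
print(top_level)
print(second_level)
```

Only the rows and zones of the requested level are touched on each call, which is the point of lazy mode when the mesh is large.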

Use it in charts!

The mapchart will now display the zones associated with its dataset, without any extra configuration.

See some examples in the mapchart docs.