Alexander Dunkel, TU Dresden; Institute of Cartography
TL;DR This notebook illustrates the process of rendering tag map shapefiles in Mapnik. It can be run with the jupyter docker for cartography (docker tag jupyterlab:mapnik), which contains the stable Mapnik installation and Python bindings; these must be accessed through the system Python installation (/usr/bin/python3), not the conda environment.
The source files for this notebook are available at https://gitlab.vgiscience.de/ad/tagmaps-mapnik-jupyter.
Import dependencies
import sys
import math
import fiona
import rasterio
import numpy as np
import contextily as cx
import mapclassify as mc
import libpysal as lps
from IPython import display
from contextily import Place
from typing import Tuple, Optional
from rasterio.plot import show as rioshow
from pathlib import Path
from esda.getisord import G_Local
from shapely.geometry import shape
We are using a number of helper functions from py/modules/tools.py
to classify values.
module_path = str(Path.cwd().parents[0] / "py")
if module_path not in sys.path:
sys.path.append(module_path)
from modules import tools
%load_ext autoreload
%autoreload 2
Test Mapnik bindings
!/usr/bin/python3 -c "import mapnik;print(mapnik.__file__)"
Global Parameters
Some parameters are used throughout this notebook. Adjust to your needs.
INPUT: Path = Path.cwd().parents[0] / "input" # define path to input and working directory (shapefiles, stylesheets etc.)
OUTPUT: Path = Path.cwd().parents[0] / "output" # define path to output directory (map, notebook html etc.)
TMP: Path = INPUT / "intermediate"
MAP_X: int = 2500 # x dimensions of the final map, in pixels
MAP_Y: int = 1400 # y dimensions of the final map, in pixels
for folder in [OUTPUT, TMP]:
if not folder.exists():
folder.mkdir()
The following files are used below:
- input/shapefiles, created using TagMaps (allTagCluster.shp and allLocationCluster.shp)
- input/tagmap_style.xml, containing the style information for how the data should be rendered
- input/tagmap_xml.py, using the Mapnik Python bindings to process data and generate the map

The Mapnik renderer can be accessed through Python bindings, available in the system Python installation. Below, we use a Python script that provides specific instructions to render a map in Mapnik.
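For orientation, a minimal Mapnik rendering script typically follows the pattern sketched below. This is only an illustrative sketch; the actual input/tagmap_xml.py may differ in its details (paths, projection, extent).
# Sketch only: basic workflow with the Mapnik Python bindings (run with /usr/bin/python3)
import mapnik

m = mapnik.Map(2500, 1400)                  # canvas size in pixels
mapnik.load_map(m, "tagmap_style.xml")      # stylesheet defines layers and data sources
m.zoom_all()                                # zoom to the combined extent of all layers
mapnik.render_to_file(m, "output/tagmap_style.png", "png")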
%%time
!cd {INPUT} && /usr/bin/python3 tagmap_xml.py
Load the generated image into Jupyter
display.Image(f"{OUTPUT}/tagmap_style.png")
Contextily allows plotting existing basemaps in Python.
berlin = Place("Berlin", source=cx.providers.Esri.WorldImagery)
ax = berlin.plot()
It also provides options to store results as GeoTIFF; see the docs.
filename = f"{OUTPUT}/berlin_carto.tif"
img, ext = cx.bounds2raster(
berlin.w, berlin.s, berlin.e, berlin.n,
filename, source=cx.providers.CartoDB.PositronNoLabels,
ll=True)
with rasterio.open(filename) as r:
rioshow(r)
Get the basemap for the TUD Campus area of our tag map. Also add the zoom parameter to get a higher level of detail.
zoom = 17
bg_name = f"{INPUT}/bg/tudcampus_carto_{zoom}.tif"
if not Path(bg_name).exists():
img, ext = cx.bounds2raster(
13.71216, 51.0218707395, 13.749046, 51.0340579,
bg_name, source=cx.providers.CartoDB.PositronNoLabels,
ll=True, zoom=zoom)
Below, mapnik-cli is used to generate the map with parameters.
stylesheet = "tagmap_bg_style.xml"
output_name = "tagmap_bg_style.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
TagMaps outputs two shapefiles: one for Emoji+Tag clusters (allTagCluster.shp) and one for Location clusters (allLocationCluster.shp), which can be used to visualize overall frequentation patterns, regardless of the tags/emoji used.
The steps to visualize these clusters in ESRI ArcMap are explained in the docs. The process requires additional tools, such as Incremental Spatial Autocorrelation and the Getis-Ord GI* Statistic.
Below, the same approach is followed using Open Source Software in Python.
stylesheet = "tagmap_points_style.xml"
output_name = "tagmap_points_style.png"
For the actual point, we need to create an svg.
Create the svg first. The r attribute defines how 'smooth' the svg curve is (how many vertices are used); adapt it to your final output map dimensions.
point_svg = \
"""<?xml version="1.0" standalone="yes"?>
<svg height="128" width="128" version="2"
xmlns="http://www.w3.org/2000/svg">
<circle cx="0" cy="0" r="100" stroke="white" stroke-width="10" fill="#849EB9" />
</svg>
"""
Write svg to file:
with open(TMP / "circle.svg", "w") as svg_file:
svg_file.write(point_svg)
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
The Mapnik PointSymbolizer offers little control over point size. Directly mapping the [Join_Count] or [Weights] attribute from the Location cluster shapefile provides no useful output: the largest cluster dominates the map above.
We can read the shapefile using Fiona and add a field whose values are used as the svg scale factor for points in Mapnik's PointSymbolizer. This approach provides more flexibility to influence the style of the location cluster layer. The steps below mainly follow the docs.
Read allLocationCluster.shp
data_src = Path(INPUT / "shapefiles" / "allLocationCluster.shp")
locations = fiona.open(data_src, encoding='UTF-8', mode="r")
Join_Count refers to the number of users that took photos or posted from a particular cluster. Weights is a normalized version of Join_Count, scaled to the value range 1..1000.
locations.schema
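To get a feel for these attributes, we can peek at the first record (an illustrative snippet; the file is re-opened here so the locations collection above stays untouched):
# print user count and normalized weight of the first cluster feature
with fiona.open(data_src, encoding='UTF-8', mode="r") as src:
    first = next(iter(src))
    print(first["properties"]["Join_Count"], first["properties"]["Weights"])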
Copy the source schema and add a new property, for point size.
updated_schema = locations.schema
updated_schema["properties"]["point_size"] = "float"
Write a new shapefile with the same format and coordinate reference system as the source, with the new field.
The formula used for point size below:
4+(math.sqrt(feature["properties"]["Join_Count"])*8)
The square root compresses the largest values most, which helps a bit in reducing the dominance of the largest clusters on the map.
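For illustration, a few hypothetical Join_Count values and the resulting point sizes:
# illustrative only: without the square root, the largest clusters would dwarf everything else
for join_count in [1, 10, 100, 1000]:
    print(join_count, round(4 + (math.sqrt(join_count) * 8), 1))
# -> 12.0, 29.3, 84.0, 257.0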
with fiona.open(
Path(TMP / "allLocationCluster_Size.shp"), "w",
crs=locations.crs,
driver=locations.driver,
schema=updated_schema,
) as shapefile:
for feature in locations:
_size = 4+(math.sqrt(feature["properties"]["Join_Count"])*8)
feature["properties"].update(point_size=_size)
shapefile.write(feature)
Create map
stylesheet = "tagmap_pointssize_style.xml"
output_name = "tagmap_pointssize_style.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Create a map with all labels
stylesheet = "tagmap_pointssize_style_v2.xml"
output_name = "tagmap_pointssize_style_v2.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
The process below is based on the PySal Notebook Exploratory Analysis of Spatial Data: Spatial Autocorrelation and the API docs for esda.G_Local.
Loading location clusters and preparing the point data set for spatial analysis. We are using the Join_Count field as the weight for clusters in our spatial autocorrelation analysis.
points = []
weights = []
with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
for feature in shapefile:
point = (feature["geometry"]['coordinates'][0], feature["geometry"]['coordinates'][1])
points.append(point)
weights.append(float(feature["properties"]["Join_Count"]))
The points are already projected to a UTM coordinate system (33N
), meaning that the values are given in Meters.
points[:5]
weights[:5]
Below, libpysal
is used to create a weights object from points.
The threshold value refers to the distance threshold and is specified in meters (based on the UTM Zone 33N coordinate system). Smaller values include fewer points in the distance matrix of the statistical calculation, larger values include more. The result is either a smooth transition between colors (larger values) or a more spatially fine-grained color classification (smaller values).
distance_threshold = 350
w = lps.weights.DistanceBand(points, threshold=distance_threshold, binary=False)
Preparing the weights variable
y = np.array(weights)
Applying Getis and Ord local G* test using a binary weights object
There are a number of other parameter examples in the pysal/esda docs. We are using the Getis-Ord Gi*, where the star means that the focal observation is included in the calculation.
%%time
lg_star = G_Local(y, w, transform='B', star=True)
lg_star.Zs
p-value based on standard normal approximation from permutations. Smaller p-values mean higher significance.
round(lg_star.p_sim[0], 3)
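As a quick illustrative check (not part of the original workflow), we can count how many clusters are significant at the 5% level:
# share of clusters with a pseudo p-value below 0.05
significant = lg_star.p_sim < 0.05
print(f"{significant.sum()} of {len(significant)} clusters significant at p < 0.05")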
PySal mapclassify provides classification schemes such as Quantiles, HeadTailBreaks, or NaturalBreaks.
print(lg_star.Zs.min())
print(lg_star.Zs.max())
scheme_breaks = tools.classify_data(
values=lg_star.Zs, scheme="Quantiles")
scheme_breaks
cat_nr = scheme_breaks.find_bin(
lg_star.Zs)
cat_nr
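For reference, the same kind of classification can be obtained by calling mapclassify directly. The snippet below is a sketch: k=7 is an assumption chosen to match the seven colors defined later; tools.classify_data may use different defaults.
# direct quantile classification of the z-scores into 7 bins (0..6)
quantiles = mc.Quantiles(lg_star.Zs, k=7)
cat_nr_direct = quantiles.find_bin(lg_star.Zs)
quantiles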
Update schema and write to shapefile
updated_schema["properties"]["cat_nr"] = "int"
with fiona.open(
Path(TMP / "allLocationCluster_SizeClass.shp"), "w",
crs=locations.crs,
driver=locations.driver,
schema=updated_schema,
) as shapefile:
for ix, feature in enumerate(locations):
_size = 4+(math.sqrt(feature["properties"]["Join_Count"])*8)
feature["properties"].update(point_size=_size)
feature["properties"].update(cat_nr=int(cat_nr[ix]))
shapefile.write(feature)
The last remaining step is to assign colors to classes. Since the Mapnik stylesheet provides no parameter to change the color values of an svg point symbol, everything must be defined in the svg itself. We will create a separate svg for each of the 7 classes, each with a different fill color.
Define list of colors (hot and cold spots):
hex_list = ['#4576b5', '#849eba', '#c0ccbe', '#ffffc0', '#fab984', '#ed7551', '#d62f27']
tools.display_hex_colors(hex_list)
def get_svg_circle(color: str) -> str:
"""Returns a circle as svg with the given color"""
return \
f"""<?xml version="1.0" standalone="yes"?>
<svg height="128" width="128" version="2"
xmlns="http://www.w3.org/2000/svg">
<circle cx="0" cy="0" r="100" stroke="white" stroke-width="10" fill="{color}" />
</svg>
"""
for ix, hex_color in enumerate(hex_list):
with open(TMP / f"circle_{ix}.svg", "w") as svg_file:
point_svg = get_svg_circle(color=hex_color)
svg_file.write(point_svg)
stylesheet = "tagmap_production.xml"
output_name = "tagmap_production.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x {MAP_X} \
--map_dimensiony_y {MAP_Y} \
--input_path {INPUT} \
--output_path {OUTPUT}
display.Image(f'{OUTPUT}/{output_name}')
Commonly, these maps would be interactively explored using a tile server with multiple zoom levels. In order to zoom in to certain clusters, we can also create static close-ups of focus regions.
To locate elements, select a hashtag that is written in bold (e.g. BISMARCKTURM). Bold tells us that this is the most important cluster for the hashtag BISMARCKTURM. In the data table, this entry is flagged with HImpTag=1. We can use this information to get the latitude and longitude of the element and tell Mapnik to zoom in to this location.
data_src = Path(INPUT / "shapefiles" / "allTagCluster.shp")
focus_tag = 'BISMARCKTURM'
def add_buffer_bbox(
bbox: Tuple[float,float,float,float], buffer: int) -> Tuple[float,float,float,float]:
"""Add buffer to bbox tuple (Meters)"""
return (bbox[0]-buffer, bbox[1]-buffer, bbox[2]+buffer, bbox[3]+buffer)
def find_feature(data_src: Path, feature_name: str, add_buffer: Optional[int] = None) -> Optional[str]:
"""Returns the bounding box of a feature as a comma-separated string (minx,miny,maxx,maxy), if found"""
with fiona.open(data_src, encoding='UTF-8', mode="r") as shapefile:
for feature in shapefile:
properties = feature["properties"]
if properties["HImpTag"] == 1 and properties["ImpTag"] == feature_name.lower():
bounds = shape(feature["geometry"]).bounds
if add_buffer:
bounds = add_buffer_bbox(bounds, buffer = add_buffer)
return str(bounds).lstrip("(").rstrip(")").replace(" ","")
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag)
focus_bbox
Create map and zoom in to bounding box
output_name = f"tagmap_production_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 500 \
--map_dimensiony_y 250 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Sometimes the bounding box returned for a cluster is small. We can add a buffer to crop the map to a larger area around the cluster under investigation.
focus_tag = 'BOCKBIERANSTICH'
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag, add_buffer=100)
output_name = f"tagmap_production_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 500 \
--map_dimensiony_y 250 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Now it is time to experiment, for instance by changing the background map to aerial imagery and filtering for emoji only.
focus_tag = 'TUDRESDEN'
zoom = 17
bg_name = f"{INPUT}/bg/tudcampus_aerial_{zoom}.tif"
if not Path(bg_name).exists():
img, ext = cx.bounds2raster(
13.71216, 51.0218707395, 13.749046, 51.0340579,
bg_name, source=cx.providers.Esri.WorldImagery,
ll=True, zoom=zoom)
Create a white svg circle
with open(TMP / f"circle_white.svg", "w") as svg_file:
point_svg = get_svg_circle(color='#FFFFFF')
svg_file.write(point_svg)
Zoom to Campus
focus_bbox = find_feature(data_src=data_src, feature_name=focus_tag, add_buffer=450)
Visualize
stylesheet = "tagmap_aerial_emoji.xml"
output_name = f"tagmap_production_aerial_{focus_tag.lower()}.png"
%%time
!/usr/bin/python3 -m mapnik_cli \
--stylesheet_name {stylesheet} \
--output_name {output_name} \
--map_dimensiony_x 1600 \
--map_dimensiony_y 900 \
--input_path {INPUT} \
--output_path {OUTPUT} \
--bbox {focus_bbox}
display.Image(f'{OUTPUT}/{output_name}')
Experiment further, e.g. replace the Emoji font Segoe UI Symbol Regular in the input/*.xml stylesheets with another Emoji font, such as Twitter Color Emoji Regular, Segoe UI Emoji Regular, Noto Emoji Regular, Symbola Regular, etc.
Convert the notebook to HTML:
!jupyter nbconvert --to html_toc \
--output-dir=../output/html ./01_mapnik-tagmaps.ipynb \
--template=../nbconvert.tpl \
--ExtractOutputPreprocessor.enabled=False >&- 2>&- # create single output file and suppress output
Copy HTML file to resource folder
!cp ../output/html/01_mapnik-tagmaps.html ../resources/html/