Actvity Comparison ATKIS-Intersection LBSM-DE¶

In this notebook, we'll explore ways to visualize activity rankings across different types of land use. Land use data is derived from ATKIS Basis-DLM (Selected categories) and intersected with geolocated Social Media Posts (Flickr, Instagram Twitter). From Originally 35 Million Social Media Posts, about 8 Million are found in the subset of chosen categories. This data is the base for the analysis in this notebook. The process for intersecting ATKIS and LBSM-Data is shown here.

Imports and logging¶

First, we start with our imports and get logging established:

# imports needed and set up logging
# import gensim 
import os
import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
import holoviews as hv
import re
from collections import defaultdict
from collections import namedtuple
import csv
from pathlib import Path
import numpy as np
import pandas as pd
hv.extension('bokeh')

Dataset preview¶

The Dataset(s) we will be loading have already been intersected with land-use data - therefore, we can dive straight into analysis, without prior classification. Have a look first:

Post = namedtuple('Post','origin_id post_guid user_guid post_body post_title hashtags emoji, post_time')
data_file = '03_Output_LBSM/Germany_LBSM_weinbau.csv'

def get_post(post_line):
    """Concatenate topic info from post columns"""
    origin_guid = post_line.get('origin_id')
    post_guid = post_line.get('post_guid')
    user_guid = post_line.get('user_guid')
    post_title = post_line.get('post_title')
    post_body = post_line.get('post_body')
    hashtags = post_line.get('tags').split(';')
    emoji = post_line.get('emoji').split(';')
    post_time_hr = post_line.get('post_time')[:10]
    return Post(origin_guid, post_guid, user_guid, post_body, post_title, hashtags, emoji, post_time_hr)

with open(data_file, 'r', encoding="utf-8") as file_handle:
    post_reader = csv.DictReader(
                file_handle,
                delimiter=',',
                quotechar='"',
                quoting=csv.QUOTE_MINIMAL)
    for ix, post in enumerate(post_reader):
        print(f'{post}\n')
        lbsn_post = get_post(post)
        print(f'{lbsn_post}')
        break

OrderedDict([('origin_id', '2'), ('post_guid', 'd1defaaee172f757e6791926db8c910a'), ('user_guid', '1ee8a2d889cda05b776f4bbd976b266a'), ('origin_dist', '321'), ('atkis_cat', 'weinbau'), ('gemeinde_typ', 'Größere Kleinstadt'), ('post_time', '2010-04-11 17:43:42'), ('post_thumbnail_url', ''), ('post_views_count', '20'), ('post_like_count', ''), ('post_url', ''), ('tags', 'carygreisch;deu;fellerich;geo:lat=4968855300;geo:lon=650608200;geotagged;germany;rheinlandpfalz'), ('emoji', ''), ('post_title', 'Fellerich'), ('post_body', 'Fellerich, Rheinland-Pfalz, Deutschland'), ('post_geoaccuracy', 'latlng'), ('post_comment_count', ''), ('post_type', 'image'), ('post_filter', ''), ('place_guid', 'b88b0ff28ebfb4b1c15bb3e16d54c9f0'), ('place_name', '')])

Post(origin_id='2', post_guid='d1defaaee172f757e6791926db8c910a', user_guid='1ee8a2d889cda05b776f4bbd976b266a', post_body='Fellerich, Rheinland-Pfalz, Deutschland', post_title='Fellerich', hashtags=['carygreisch', 'deu', 'fellerich', 'geo:lat=4968855300', 'geo:lon=650608200', 'geotagged', 'germany', 'rheinlandpfalz'], emoji=[''], post_time='2010-04-11')

Read files into a list¶

Now that we've had a sneak peak of our dataset, we can read it into a list so that we can pass this on to the Ranking model. We'll stream files and only process post by post to reduce memory burden.

def scan_local_files():
    """Read Local Files according to config parameters"""
    pathname = Path.cwd()
    input_path = pathname / '03_Output_LBSM'
    filelist = list(input_path.glob(
        f'*.csv'))
    return filelist
def read_input_file(input_file):
    """Read Input file lines and convert to post"""
    logging.info(f"Reading file {os.path.basename(input_file)}..")
    with open(input_file, 'r', encoding="utf-8") as file_handle:
        post_reader = csv.DictReader(
                file_handle,
                delimiter=',',
                quotechar='"',
                quoting=csv.QUOTE_MINIMAL)
        for ix, post_line in enumerate(post_reader):
            lbsn_post = get_post(post_line)
            if (ix % 100000 == 0):
                logging.info (f"read {ix} posts")
            # do some pre-processing and return a list of words for each review text
            yield lbsn_post

Topic Selection and Ranks¶

First, we'll define our topics. A topic is defined as a list of terms. Note that an "activity" can be defined specific or broad:

hiking would be a specific activity, which can be described by a longer walk, slow pace, perhaps done in groups and on planned occation (e.g. a day trip). Someone would not describe a 5 minute walk as hiking.
sports on the other hand is group of activities; sports usually has a defined purpose regarding fitness or health; there are many sports such as jogging, hiking, playing football etc. which all have their individual benefits for certain health aspects
friends can be described as another activity group with the main purose of socializing: one meets with freinds to talk, interact and communicate. Some specific activities are more suitable for socializing than others, e.g. game or group-activities that are usually done together

As a conclusion, we can say that we want to define our topics as diverse as possible. Some activities or groups of activities might overlap, while other might describe opposide ends of a continuum of possible activity-groups. The goal here is not to be holistic, but to get a cross-section of a selected list of relevant green space activities.

Furthermore:

related terms can be queried using relatedwords.org
for an overview of important activity categories, see wikipedia
future goal: replace by topic vector

topics = dict()  
topics['hiking'] = ('hike', 'hiking', 'wandern', 'wanderung', 'wanderer', 'wanderweg', 'wanderroute', '🥾') # optional: 🚶 (person walking)
# biking, this is a very specific atcivity
topics['biking'] = ('bike', 'biking', 'bicycle', 'cycling', 'fahrrad', 'velo', '🚲', '🚴')
# just plain walking
topics['walking'] = ('walk', 'walking', 'spazieren', 'stroll', 'fußweg', 'spazierweg', 'spaziergang') # optional: 🚶 (person walking)
# broad category with a bias towards jogging
topics['sport']  = ('sport', 'jogging', 'running', 'exercise', 'run', 'workout', 'rennen', 'dauerlauf', '🏃')
topics['relaxing'] = ('relaxing', 'sitting', 'relaxation', 'entspannen', 'innehalten', 'erholen', 'ausruhen', 'recreation')
# meeting with friends, this can encompass a group of activities
# note that we use 'meeting'; in green-space land use, this likely hints to meeting with friends, not within work environment
topics['friends'] = ('friends', 'friends', 'meeting', 'socialize', 'freunde', 'treffen', 'hang around', 'abhängen')
# anything related to family and kinder/kids
topics['family'] = ('family', 'familie', 'kinder', 'baby', 'familienausflug', 'familytrip', '👪')
# tourist/sighseeing group
topics['tourist'] = ('tourist', 'sighseeing', 'sehenswürdigkeit', 'excursion', 'exkursion', 'sight-seeing', 'tour', 'travel', 'reise', '🌇')
# very general: spielen/playing
topics['playing'] = ('spielen', 'playing', 'play', 'spiel', 'game', '🎲', '🎮')
# lets add some specific activities: picknick-grillen, soccer ..
topics['picnic']  = ('picnic', 'barbecure', 'picknick', 'picknickkorb', 'grillen', 'grill')
topics['soccer'] = ('soccer', 'fussball', 'fußball', 'football', '⚽')

Counting Posts/ Userdays based on matching topics¶

For selecting posts and counting userdays based on topic-terms, we define the following rules:

search terms in title, post_body, tags and emoji because some people might not use tags at all, others might only provide titles and yet other mainly communicate using emoji
when searching terms in post_body or title, only match full words - e.g. "walkman" is ignored when searching for "walk"; this allows for more specific semantic disambiguation; we use the regex module for full word matching
emoji sometimes accurately encapsulate specific activity-meaning, therefore we can make an exception to full-word-search when using emoji - e.g. it is quite clear that 🚴 is usually used in the context of biking
we ignore case of characters when searching (upper or lower)
since tags might repeat often in posts of a single user, we user UserDays as the most appropiate metric to measure activities: this will count repititive behaviour only once, e.g. someone might upload 500 pictures with the same tags of their single sunday picknick, this will only count as one; however, if parks are visited by the same person at multiple days, each day will be counted once: the allows us to include typical patterns of behaviuour for green-land-use such as parks beeing visited often during a month for recreation (but shopping is perhaps done less frequently).

%%time
from IPython.display import clear_output

def word_in_text(word, text_value):
    """Checks whether full word is in string"""
    if re.search(r'\b' + word + r'\b', text_value, re.IGNORECASE):
        return True

def check_topic(topic, lbsn_post):
    """Checks whether topic is in post"""
    for term in topic:
        if \
        term in lbsn_post.hashtags or \
        term in lbsn_post.emoji or \
        word_in_text(term, lbsn_post.post_title) or \
        word_in_text(term, lbsn_post.post_body):
            return True
def get_userday(post):
    return f'{post.user_guid}{post.post_time}'

# init count structures
total_counts = dict()
total_userdaycounts = defaultdict(set)
cnt_dict = dict()
userday_cnt_dict = dict()

# init dicts for each topic
for activity_name in topics.keys():
    # use default dict to init int:
    cnt_dict[activity_name] = defaultdict(int)
    # use set for counting userdays:
    userday_cnt_dict[activity_name] = defaultdict(set)
    
# perform topic matching
for file_name  in scan_local_files():
    # get land use type from filename
    f_name = os.path.basename(file_name)
    if f_name == 'all_intersected_guids.csv':
        # skip
        continue
    # strip leading 'Germany_LBSM_'
    type_text = f_name[13:].rstrip('.csv')
    total_counts[type_text] = 0
    # loop posts
    for lbsn_post in read_input_file(file_name):
        # count post
        total_counts[type_text] += 1
        # count userday
        userday = get_userday(lbsn_post)
        total_userdaycounts[type_text].add(userday)
        for activity_name, topic_terms in topics.items():
            if check_topic(topic_terms, lbsn_post):
                # count post
                cnt_dict[activity_name][type_text] += 1
                # count userday
                userday_cnt_dict[activity_name][type_text].add(userday)
    # count distinct userdays
    for activity_name in topics.keys():
        userday_cnt_dict[activity_name][type_text] = len(userday_cnt_dict[activity_name].get(type_text))


clear_output(wait=True)
selected_cnt = sum([sum(x.values()) for x in cnt_dict.values()])
total_cnt = sum(total_counts.values())
perc_cnt = selected_cnt/(total_cnt/100)
print(
    f'Done. Found topic matches in {selected_cnt} posts '
    f'of {total_cnt} total posts ({perc_cnt:.2f}%)')

for land_use in total_userdaycounts.keys():
        total_userdaycounts[land_use] = len(total_userdaycounts[land_use])
selected_userdays = sum([sum(x.values()) for x in userday_cnt_dict.values()])
total_userdays  = sum(total_userdaycounts.values())
perc_userdays  = selected_userdays /(total_userdays/100)
print(
    f'Done. Found topic matches in {selected_userdays} userdays '
    f'of {total_userdays} total userdays ({perc_userdays:.2f}%)')

Done. Found topic matches in 1124397 posts of 7948315 total posts (14.15%)
Done. Found topic matches in 777017 userdays of 4559082 total userdays (17.04%)
Wall time: 1h 17min 40s

Convert dict to pandas dataframe for easier handling. We can choose to analyse absolute post counts here (prone to error but fast) or userdays (less prone but slow to calculate)

#df = pd.DataFrame.from_dict(cnt_dict)
df = pd.DataFrame.from_dict(userday_cnt_dict)
# get preview
df.style.background_gradient(cmap='viridis')

# post counts:
#df_total = pd.DataFrame.from_dict(
#    total_counts.items())
# user days:
df_total = pd.DataFrame.from_dict(
    total_userdaycounts.items())

Optional: store intermediate results (pandas dataframe pickle)

# write:
#df.to_pickle("activity_intermediate_userdays.pkl")
#df_total.to_pickle("activity_total_userdays.pkl")
# load:
df = pd.read_pickle("activity_intermediate_userdays.pkl")
df_total = pd.read_pickle("activity_total_userdays.pkl")

Compare to total post counts:

print('Post count per topic')
df_postcount = pd.DataFrame.from_dict(cnt_dict)
df_postcount.style.background_gradient(cmap='viridis')

Post count per topic

These are absolute values with little meaning because some land use types appear more often, similarly, some activities have typica a higher frequency of matches. To normalize these values, we'll therefore calculate absolute percentages first for each landuse category. Afterwards, we can nornmalize (i.e. stretch) results to 0-100 range.

replace index names for display

name_ref = {
'gruenland':'Gruenland',
'ackerland':'Ackerland',
'laubholz':'Laubholz',
'nadelholz':'Nadelholz',
'gehoelz':'Gehoelz',
'mischholz':'Mischholz',
'sportfreizeiterholung':'sonst. Sport-, Freizeit-, Erholungsfl.',
'streuobst':'Streuobst',
'parkgruenanlage':'Park, Gruenanlage',
'friedhof':'Friedhof',
'kleingarten':'Kleingarten',
'moor':'Moor',
'weinbau':'Weinbau',
'obstbau':'Obstbau',
'sonstlandwirt':'sonst. Landwirtschaftsfl.',
'sumpf':'Sumpf',
'wochenendferienhau':'Wochenend-, Ferienhaussiedl.',
'gartenland':'Gartenland',
'heide':'Heide',
'sonstsiedlungsfreifl':'sonstige Siedlungsfreifl.',
'golfplatz':'Golfplatz',
}
df.rename(index=name_ref, inplace=True)
df.index
#for dict_key, value_count in total_counts.items():
#    total_counts[name_ref.get(dict_key)] = value_count
#    total_counts.pop(dict_key)

Index(['Ackerland', 'Friedhof', 'Gartenland', 'Gehoelz', 'Golfplatz',
       'Gruenland', 'Heide', 'Kleingarten', 'Laubholz', 'Mischholz', 'Moor',
       'Nadelholz', 'Obstbau', 'Park, Gruenanlage',
       'sonst. Landwirtschaftsfl.', 'sonstige Siedlungsfreifl.',
       'sonst. Sport-, Freizeit-, Erholungsfl.', 'Streuobst', 'Sumpf',
       'Weinbau', 'Wochenend-, Ferienhaussiedl.'],
      dtype='object')

df_total.columns = ['Land use', 'Post Count']
df_total = df_total.set_index(['Land use'])
df_total.rename(index=name_ref, inplace=True)
df_total['Percentage'] = df_total['Post Count']/(total_cnt/100)
df_total.style.background_gradient(cmap='summer')

# transpose
df_perc = df.T
# normalize using total counts for each land use cat
for type_text, total_count in total_userdaycounts.items():
    type_text = name_ref.get(type_text)
    df_perc[type_text] = df_perc[type_text]/(total_count/100)
# transpose again
df_perc = df_perc.T

# show percentages
df_perc.style.background_gradient(cmap='summer')
#df.index
#df.columns
#df.shape

We'll use Holoviews Heatmap to display data

from holoviews import opts
hv.HeatMap({'x': df_perc.columns, 'y': df_perc.index, 'z': df_perc}, ['x', 'y'], 'z'
          ).opts(opts.HeatMap(tools=['hover'], colorbar=True, width=700, height=400,cmap='greens'))

To improve legebility and colorization, we stretch values for each actitity to 1-100 range. Furthermore, we use log-values to reduce peaks and highlight information in the long tail.

def normalize(df):
    result = df.copy()
    for feature_name in df.columns:
        max_value = df[feature_name].max()
        min_value = df[feature_name].min()
        result[feature_name] = (df[feature_name] - min_value) / (max_value - min_value)
    return result
# log scale (reduce peaks) and normalize (0-1 range)
df_norm = normalize(np.log(df_perc))

Calculate Alpha Values (transparency of cells) from total available posts (= accuracy)

# log-scale and normalize between 0.5 and 1 (= final transparency)
df_total['Log-Norm. Percentage'] = np.interp(
    np.log(df_total['Percentage']), np.log((df_total['Percentage'].min(), df_total['Percentage'].max())), (0.5, 1))
df_alpha = df_total['Log-Norm. Percentage']
np_alpha = df_alpha.values
np_alpha = np.tile(np_alpha, (len(topics), 1)).transpose()
df_alpha = pd.DataFrame(np_alpha)
df_alpha.index = df.index

from holoviews import dim, opts 
from bokeh.models import HoverTool

def hook(plot, element):
    # remove axis for plot
    plot.handles['xaxis'].visible = False
    plot.handles['yaxis'].visible = False
    plot.outline_line_color = None
    plot.border_fill_color = None
    plot.background_fill_color = None
    plot.outline_line_width = 0
    plot.outline_line_alpha = 0
    #plot.axis.visible = False

# explicitly declare hover tool so we can add "%" sign
TOOLTIPS = [
    ('Activity (LBSM)', '@x'),
    ('Land Use (ATKIS)', '@y'),
    ('Relative importance (Log & Norm 0-1)', '@z{1.1111}'),
    ('Percentage of userdays (abs)', '@z2{1.11}%'),
    ('Total userdays (abs)', '@z3'),
]
hover = HoverTool(tooltips=TOOLTIPS)

hv.HeatMap({'x': df.columns, 'y': df.index, 'z': df_norm, 'z2': df_perc, 'z3': df, 'z4': df_alpha}, 
           kdims=[('x', 'Activity (LBSM)'), ('y', 'Land Use (ATKIS)')], 
           vdims=['z', 'z2', 'z3', 'z4'], 
    ).opts(
           opts.HeatMap(
           title_format="Heatmap for selected ATKIS categories and LBSM activities",
           tools=[hover], 
           colorbar=True, 
           width=720, 
           height=520,
           cmap='greens'
           #alpha='z4' # dim cells based on total available posts (=accuracy)
           )
        )
# use http://tools.zenverse.net/word-wrap/ for word wrap

Measuring similarity betwen different activities / betwen land uses¶

Dot Product (Skalarprodukt) can be used to compute vectors between all values of two columns (i.e. activities) or two rows (i.e. land uses). This can be used to compare patterns based on cosine similarity. A cosine similarity of 1 means identical, where 0 means completely different.

from scipy.spatial.distance import cosine
from pandas import DataFrame

print(f'hiking/biking: {1 - cosine(df["hiking"], df["biking"])}')
print(f'walking/soccer: {1 - cosine(df["walking"], df["soccer"])}')
print(f'sport/soccer: {1 - cosine(df["sport"], df["soccer"])}')
      
print(f'Park, Gruenanlage/Friedhof: {1 - cosine(df.loc["Park, Gruenanlage"], df.loc["Friedhof"])}')
print(f'Golfplatz/Nadelholz: {1 - cosine(df.loc["Golfplatz"], df.loc["Nadelholz"])}')

hiking/biking: 0.6152442856715981
walking/soccer: 0.3112305944068454
sport/soccer: 0.8156649130861259
Park, Gruenanlage/Friedhof: 0.9435236031345471
Golfplatz/Nadelholz: 0.5082803251293814

Clustered Heatmap¶

We can use these similarity scores to re-order the heatmap based on similarity measures. Seaborn, for example, offers clustermap, which allows specifiying different cluster and metrics methods. There are many other ways to created clustered heatmaps (see links below).

import seaborn as sns
heatmap_sns = sns.clustermap(df_norm, metric="correlation", standard_scale=1, method="ward", cmap="Greens")

heatmap_sns.savefig("clusterheatmap_userdays_greens.png")
heatmap_sns.savefig("clusterheatmap_userdays_greens.svg",format="svg")

Reorder columns/rows¶

We can access the reordered rows/columns from the seaborn plot using this suggestion. Afterwards, the original dataframes can be updated using the new ordering.

print(f'rows: {heatmap_sns.dendrogram_row.reordered_ind}')
print(f'columns: {heatmap_sns.dendrogram_col.reordered_ind}')

rows: [3, 9, 11, 5, 8, 6, 10, 15, 19, 2, 4, 16, 1, 18, 20, 13, 0, 7, 14, 12, 17]
columns: [1, 0, 2, 7, 4, 6, 9, 10, 5, 3, 8]

#columnsTitles = ['onething', 'secondthing', 'otherthing']

# get col and row names by ID
colname_list = [df.columns[col_id] for col_id in heatmap_sns.dendrogram_col.reordered_ind]
rowname_list = [df.index[row_id] for row_id in heatmap_sns.dendrogram_row.reordered_ind]
# change row/col order 
df_ro = df.reindex(rowname_list)
df_ro = df_ro[colname_list]
df_norm_ro = df_norm.reindex(rowname_list)
df_norm_ro = df_norm_ro[colname_list]
df_perc_ro = df_perc.reindex(rowname_list)
df_perc_ro = df_perc_ro[colname_list]

print(rowname_list)
print(colname_list)

['Gehoelz', 'Mischholz', 'Nadelholz', 'Gruenland', 'Laubholz', 'Heide', 'Moor', 'sonstige Siedlungsfreifl.', 'Weinbau', 'Gartenland', 'Golfplatz', 'sonst. Sport-, Freizeit-, Erholungsfl.', 'Friedhof', 'Sumpf', 'Wochenend-, Ferienhaussiedl.', 'Park, Gruenanlage', 'Ackerland', 'Kleingarten', 'sonst. Landwirtschaftsfl.', 'Obstbau', 'Streuobst']
['biking', 'hiking', 'walking', 'tourist', 'relaxing', 'family', 'picnic', 'soccer', 'friends', 'sport', 'playing']

Compose final output and store to html¶

%%output filename="meingruen_activities_userdays" # uncomment for output to file 
heatm = hv.HeatMap({'x': df_ro.columns, 'y': df_ro.index, 'z': df_norm_ro, 'z2': df_perc_ro, 'z3': df_ro, 'z4': df_alpha}, 
           kdims=[('x', 'Activity (LBSM)'), ('y', 'Land Use (ATKIS)')], 
           vdims=['z', 'z2', 'z3', 'z4'], 
    ).opts(
           opts.HeatMap(
           title_format="Heatmap for selected ATKIS categories and LBSM activities",
           tools=[hover], 
           colorbar=True, 
           width=720, 
           height=520,
           cmap='greens'
           #alpha='z4' # dim cells based on total available posts (=accuracy)
           )
        ) 
heatm + \
hv.Text(x=0.01, y=0.5, 
        text='Geotagged Social Media posts (Twitter,\n'
             'Instagram, Flickr) have first been\n'
             'intersected with ATKIS geometries for\n'
             'Germany. This heatmap shows the correlation\n'
             'between selected activities expressed in\n'
             'intersected Social Media posts and the bias\n'
             'for certain land use types (ATKIS).\n'
             'Dark-green colors mean high correlation (1),\n'
             'whereas lighter colors mean low correlation\n'
             '(0) between land use and activity. Columns\n'
             '(activities) and rows (land use) have been\n'
             'ordered using 2-D cosine-similarity\n'
             'clustering, with the goal to group cells of\n'
             'similar patterns. The base measure here is\n'
             'Userdays (see Wood, Guerry, Silver, & Lacayo,\n'
             '2013. Each user is counted once per day and\n'
             'activity.'
       ).opts(
    height=450, show_frame=False, hooks=[hook], text_align='left', text_font_size='13px')

Export to svg

from bokeh.io import export_svgs
p =  hv.render(heatm, backend='bokeh')
p.output_backend = "svg"
export_svgs(p, filename="heatmap_userdays_greens.svg")

['heatmap_userdays.svg']

	hiking	biking	walking	sport	relaxing	friends	family	tourist	playing	picnic	soccer
ackerland	4576	4921	6277	8210	1540	9960	10573	11451	2441	1076	1698
friedhof	272	261	834	426	141	825	1004	1072	189	47	77
gartenland	13	34	45	87	20	56	59	137	18	8	14
gehoelz	4427	1144	1599	1732	295	2097	1729	4104	439	181	252
golfplatz	75	43	111	544	68	306	207	225	335	20	31
gruenland	11482	5184	7308	7777	1647	9049	9707	14524	2019	1067	1190
heide	393	108	302	149	48	101	146	265	37	13	10
kleingarten	153	426	640	896	170	1052	1034	782	278	224	209
laubholz	6747	2446	4192	3952	644	3158	3691	6042	769	406	430
mischholz	9737	2389	3664	3361	632	2857	3011	8208	649	277	345
moor	205	62	193	90	36	95	124	174	21	5	7
nadelholz	15100	2371	3743	3615	757	3221	3439	11012	616	317	265
obstbau	54	36	45	42	15	73	85	70	28	15	17
parkgruenanlage	3906	12931	30943	32676	6031	32426	22242	51202	6672	7091	2508
sonstlandwirt	43	50	42	38	4	45	30	41	15	5	6
sonstsiedlungsfreifl	110	65	131	88	13	119	117	216	25	13	17
sportfreizeiterholung	5853	8619	9129	44879	5517	41079	35400	33764	26670	2320	63960
streuobst	222	127	246	190	45	229	213	236	76	53	31
sumpf	125	104	131	90	33	153	266	260	32	11	16
weinbau	684	171	471	183	64	273	277	497	66	42	38
wochenendferienhau	328	185	345	404	245	498	771	1044	127	59	34

	hiking	biking	walking	sport	relaxing	friends	family	tourist	playing	picnic	soccer
ackerland	7172	5967	8082	10801	1627	11612	12627	14027	2761	1231	1986
friedhof	362	457	2434	632	184	1254	2070	1560	203	56	162
gartenland	20	55	59	370	25	58	65	278	24	8	14
gehoelz	6631	1789	2004	5358	312	2290	1982	5951	910	208	321
golfplatz	119	48	127	1112	68	319	265	863	530	21	32
gruenland	19661	8169	9325	15254	1945	11597	11958	19743	2451	1327	2914
heide	648	128	419	175	53	127	181	324	40	14	25
kleingarten	200	542	834	980	171	1475	1205	892	349	258	982
laubholz	12612	3614	5694	5478	936	4027	4611	9055	1078	445	765
mischholz	16294	3171	5305	5068	715	3396	3781	11173	1073	343	440
moor	382	72	260	93	82	96	134	289	22	6	7
nadelholz	23047	3298	5401	5476	888	3539	4403	14222	1185	371	322
obstbau	83	49	60	50	15	79	97	97	34	16	19
parkgruenanlage	5250	16721	39408	39553	6819	39425	27206	69110	7788	9329	3761
sonstlandwirt	61	54	51	39	4	47	41	46	16	6	7
sonstsiedlungsfreifl	150	80	170	110	14	127	135	310	31	14	24
sportfreizeiterholung	6978	13663	15229	92245	6507	53643	43391	47869	42556	2713	120049
streuobst	428	175	395	438	47	274	323	360	127	87	37
sumpf	154	118	167	2426	35	159	286	343	32	11	16
weinbau	1424	778	792	436	65	598	315	792	68	100	44
wochenendferienhau	366	228	369	433	278	546	855	1327	137	87	34

	Post Count	Percentage
Land use
Ackerland	438550	5.51752
Friedhof	47857	0.602102
Gartenland	4507	0.0567038
Gehoelz	112909	1.42054
Golfplatz	12116	0.152435
Gruenland	440041	5.53628
Heide	9163	0.115282
Kleingarten	49571	0.623667
Laubholz	182892	2.30102
Mischholz	172552	2.17093
Moor	6650	0.0836655
Nadelholz	185159	2.32954
Obstbau	4414	0.0555338
Park, Gruenanlage	1392178	17.5154
sonst. Landwirtschaftsfl.	2264	0.028484
sonstige Siedlungsfreifl.	7201	0.0905978
sonst. Sport-, Freizeit-, Erholungsfl.	1428244	17.9691
Streuobst	11849	0.149076
Sumpf	8221	0.103431
Weinbau	17709	0.222802
Wochenend-, Ferienhaussiedl.	25035	0.314972

	hiking	biking	walking	sport	relaxing	friends	family	tourist	playing	picnic	soccer
Ackerland	1.04344	1.12211	1.43131	1.87208	0.351157	2.27112	2.4109	2.6111	0.556607	0.245354	0.387185
Friedhof	0.56836	0.545375	1.74269	0.890152	0.294628	1.72389	2.09792	2.24001	0.394927	0.0982092	0.160896
Gartenland	0.28844	0.754382	0.998447	1.93033	0.443754	1.24251	1.30907	3.03972	0.399379	0.177502	0.310628
Gehoelz	3.92086	1.01321	1.41618	1.53398	0.261272	1.85725	1.53132	3.63479	0.388809	0.160306	0.223189
Golfplatz	0.619016	0.354903	0.916144	4.48993	0.561241	2.52559	1.70848	1.85705	2.76494	0.165071	0.25586
Gruenland	2.6093	1.17807	1.66075	1.76734	0.374283	2.0564	2.20593	3.3006	0.458821	0.242477	0.270429
Heide	4.28899	1.17865	3.29586	1.6261	0.523846	1.10226	1.59336	2.89207	0.403798	0.141875	0.109135
Kleingarten	0.308648	0.859373	1.29108	1.80751	0.342942	2.12221	2.0859	1.57754	0.560812	0.451877	0.421617
Laubholz	3.68906	1.3374	2.29206	2.16084	0.35212	1.7267	2.01813	3.30359	0.420467	0.221989	0.235111
Mischholz	5.64294	1.38451	2.12342	1.94782	0.366266	1.65573	1.74498	4.75683	0.376119	0.160531	0.19994
Moor	3.08271	0.932331	2.90226	1.35338	0.541353	1.42857	1.86466	2.61654	0.315789	0.075188	0.105263
Nadelholz	8.15515	1.28052	2.02151	1.95238	0.408838	1.73959	1.85732	5.94732	0.332687	0.171204	0.14312
Obstbau	1.22338	0.815587	1.01948	0.951518	0.339828	1.65383	1.92569	1.58586	0.634345	0.339828	0.385138
Park, Gruenanlage	0.280568	0.928832	2.22263	2.34711	0.433206	2.32916	1.59764	3.67783	0.479249	0.509346	0.180149
sonst. Landwirtschaftsfl.	1.89929	2.20848	1.85512	1.67845	0.176678	1.98763	1.32509	1.81095	0.662544	0.220848	0.265018
sonstige Siedlungsfreifl.	1.52757	0.902652	1.81919	1.22205	0.18053	1.65255	1.62477	2.99958	0.347174	0.18053	0.236078
sonst. Sport-, Freizeit-, Erholungsfl.	0.409804	0.603468	0.639176	3.14225	0.386279	2.87619	2.47857	2.36402	1.86733	0.162437	4.47823
Streuobst	1.87358	1.07182	2.07612	1.60351	0.379779	1.93265	1.79762	1.99173	0.641404	0.447295	0.261625
Sumpf	1.5205	1.26505	1.59348	1.09476	0.401411	1.86109	3.23562	3.16263	0.389247	0.133804	0.194624
Weinbau	3.86244	0.965611	2.65966	1.03337	0.361398	1.54159	1.56418	2.80648	0.372692	0.237168	0.21458
Wochenend-, Ferienhaussiedl.	1.31017	0.738965	1.37807	1.61374	0.97863	1.98922	3.07969	4.17016	0.50729	0.23567	0.13581