Matthias Budde, Julio De Melo Borges, Stefan Tomov, Till Riedel, and Michael Beigl of TECO/Pervasive Computing Systems, Karlsruhe Institute of Technology (KIT), presented their paper on spatio-temporal clustering for urban infrastructure monitoring at the Urb-IoT 2014 conference, held in Rome on October 27-28, 2014. The paper, an innovative application of data mining techniques to urban infrastructure monitoring, deservedly received the best paper award.
The study presents and evaluates a novel framework for clustering spatio-temporal data: the data is first transformed into a graph structure, and the graph is then clustered. The researchers compare the framework's performance with that of modern spatio-temporal clustering methods for the use case of duplicate detection.
What kind of framework do the researchers present? It consists of two steps: data modeling, in which spatio-temporally equivalent reports are detected and the ST-Graph is formed, and clustering of the ST-Graph, in which graph clustering algorithms are applied to extract densely connected subgroups.
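The two steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the thresholds `max_dist_m` and `max_dt_s`, the `(lat, lon, timestamp)` report layout, and all function names are assumptions made for illustration, and plain connected components stand in for whichever graph clustering algorithm the paper actually applies.

```python
import math
from itertools import combinations

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def build_st_graph(reports, max_dist_m, max_dt_s):
    """Step 1 (data modeling): connect reports that are close in space AND time.

    reports: list of (lat, lon, timestamp_in_seconds) tuples (assumed layout).
    Returns an adjacency dict mapping report index -> set of neighbor indices.
    """
    adj = {i: set() for i in range(len(reports))}
    for i, j in combinations(range(len(reports)), 2):
        lat1, lon1, t1 = reports[i]
        lat2, lon2, t2 = reports[j]
        close_in_time = abs(t1 - t2) <= max_dt_s
        if close_in_time and haversine_m(lat1, lon1, lat2, lon2) <= max_dist_m:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def connected_components(adj):
    """Step 2 (graph clustering), simplified: each connected component is a cluster."""
    seen, clusters = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters

# Hypothetical reports: two near-simultaneous reports at almost the same spot,
# one at the same spot months later, and one several kilometers away.
reports = [
    (49.0000, 8.4000, 0),
    (49.0001, 8.4001, 3600),
    (49.0000, 8.4000, 10_000_000),
    (49.1000, 8.5000, 0),
]
graph = build_st_graph(reports, max_dist_m=50, max_dt_s=86400)
clusters = connected_components(graph)
```

With these made-up inputs, only the first two reports fall within both thresholds, so they form one candidate duplicate group while the other two reports remain singletons. The pairwise comparison is quadratic in the number of reports, which is acceptable for a sketch but would need spatial indexing for larger datasets.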
Data collection was likewise twofold and drew on two issue tracking platforms: KA-Feedback (KAF) and SeeClickFix (SCF).
The core of the article is dedicated to evaluating the framework. Two spatio-temporal clustering algorithms, ST-GRID and ST-DBSCAN, were chosen for comparison, and the SCF dataset was used as ground truth. The parameterization experiments tested how the thresholds used in the first step of the algorithm affect precision and recall, that is, how well the step groups duplicate reports. The researchers went further and tested another density-based approach that relied solely on spatial information. It achieved a much lower F1-measure than the spatio-temporal one.
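For readers unfamiliar with these metrics, the following Python sketch shows one common way to score duplicate detection against a ground truth. Counting pairs of reports placed in the same cluster is an assumption made here for illustration; the summary does not state how the paper counts matches, and the function names and example clusters are hypothetical.

```python
from itertools import combinations

def duplicate_pairs(clusters):
    """Return the set of unordered report pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        for a, b in combinations(sorted(cluster), 2):
            pairs.add((a, b))
    return pairs

def pairwise_prf(predicted, truth):
    """Pairwise precision, recall, and F1 of predicted clusters vs. ground truth."""
    pred_pairs = duplicate_pairs(predicted)
    true_pairs = duplicate_pairs(truth)
    true_positives = len(pred_pairs & true_pairs)
    precision = true_positives / len(pred_pairs) if pred_pairs else 0.0
    recall = true_positives / len(true_pairs) if true_pairs else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Made-up example: the prediction merges reports 0, 1, 2 into one cluster,
# while the ground truth says 0 and 1 are duplicates and 2 and 3 are duplicates.
predicted = [[0, 1, 2], [3]]
truth = [[0, 1], [2, 3]]
precision, recall, f1 = pairwise_prf(predicted, truth)
```

In this example one of the three predicted pairs is correct (precision 1/3) and one of the two true pairs is found (recall 1/2), giving an F1 of 0.4. F1 is the harmonic mean of precision and recall, so a method scores well only if it does well on both.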
As for runtime, the researchers improved the algorithm's performance by using current Big Data technology.
The evaluation ends on a positive note: the quantitative performance of the presented method shows potential to speed up manual processing in the underlying application case. The approach outperformed both comparison algorithms and delivered excellent results. A significant advantage of the proposed approach is that it uses fewer parameters, which makes it easy to use for non-experts (e.g. civil servants).
Interested in the framework? Read the full paper here