Publishing and Interlinking Linked Geospatial Data
Tutorial in Conjunction with the 12th Extended Semantic Web Conference
June 1, 2015
Portoroz, Slovenia


Short Description

In this tutorial we present the life cycle of linked geospatial data and we focus on two important steps: the publication of geospatial data as RDF graphs and their interlinking. Given the proliferation of geospatial information on the Web, many kinds of geospatial data are now becoming available as linked datasets (e.g., Google and Bing maps, user-generated geospatial content, public sector information published as open data etc.). The topic of the tutorial is related to all core research areas of the Semantic Web (e.g., semantic information extraction, transformation of data into RDF graphs, interlinking linked data etc.) since there is often a need to reconsider existing core techniques when we deal with geospatial information. Thus, it is timely to train Semantic Web researchers, especially those in the early stages of their careers, on the state of the art of this area and invite them to contribute to it.

In this tutorial we give a comprehensive background on data models, query languages, and implemented systems for linked geospatial data, and we discuss recent approaches to publishing and interlinking geospatial data. The tutorial is complemented by a hands-on session that will familiarize the audience with state-of-the-art tools for publishing and interlinking geospatial information.

Tutorial Description

We have recently witnessed a proliferation of geospatial data on the Web. In addition to professionally produced material being offered for free (e.g., Google or Bing maps), the public has also been encouraged to make geospatial content, including their geographical location, available online (e.g., OpenStreetMap). There is also a substantial amount of public sector information now becoming available as open geospatial data. The volume of such geospatial Web content is already large and constantly growing.

Semantic Web researchers and practitioners have also started to make geospatial data available as linked data (e.g., Ordnance Survey, Great Britain's national mapping agency, makes much of its geospatial data available as linked data at http://data.ordnancesurvey.co.uk/, and the portal LinkedGeoData makes OpenStreetMap data available as RDF at http://linkedgeodata.org). Since a lot of data useful to the wider public is geospatial (e.g., open government data), we expect this trend to continue in the near future.

In this tutorial we will present the life cycle of linked geospatial data and we will concentrate on two important steps: the publication of geospatial data as RDF graphs and their interlinking. To set the stage for the tutorial, we give comprehensive coverage of work on representing and querying geospatial information in the Semantic Web.

Intended audience – Prerequisite knowledge

The tutorial is targeted towards Semantic Web researchers in the early stages of their career. The prerequisite is good knowledge of RDF and SPARQL and some knowledge of other Semantic Web technologies (OWL, RDF stores, Linked Data). Knowledge of geospatial technologies is not a prerequisite and will be covered in some depth.

Programme

Time Description PDF/PPT
9:00-9:15 Introduction pdf
9:15-10:30 Background in geospatial data modeling, representing geospatial information in the Semantic Web, and querying linked geospatial data.
pdf pdf
 

  • GIS concepts and vocabulary
  • Geographic space modeling and representation (vector space representation, half-space representation)
  • Co-ordinate systems
  • Relevant OGC standards (Well Known Text, Geography Markup Language)
  • Representing geospatial data in RDF
  • Geospatial ontologies and rules
  • Examples of publicly available linked geospatial data
  • The query languages stSPARQL and GeoSPARQL (detailed introduction to the features of the languages, semantics of query evaluation, comparison).
The material will come from our previous tutorials:
  • M. Koubarakis, K. Kyzirakos and M. Karpathiotakis. Data models, Query Languages, Implemented Systems and Applications of Linked Geospatial Data, 9th Extended Semantic Web Conference (ESWC), May 27 - 31 2012, Heraclion, Crete, Greece.
  • M. Koubarakis, M. Karpathiotakis, K. Kyzirakos, C. Nikolaou, and M. Sioutis. Data Models and Query Languages for Linked Geospatial Data. Invited tutorial at the 8th Reasoning Web Summer School 2012 (RW 2012). September 3-8, 2012. Vienna, Austria. In: Eiter, T., Krennwallner, T. (eds.) Reasoning Web. Semantic Technologies for Advanced Query Answering. Lecture Notes in Computer Science, vol. 7487, pp. 290-328. Springer.
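
As a small illustration of two of the topics listed above (Well Known Text and representing geospatial data in RDF), the following Python sketch builds and splits the lexical form of a GeoSPARQL geo:wktLiteral, which may start with the IRI of a coordinate reference system and otherwise defaults to CRS84 (longitude, latitude order). The function names are hypothetical and only two-dimensional points are parsed; this is a sketch under those assumptions and is not taken from the tutorial slides.

```python
import re

# Default CRS assumed by GeoSPARQL when a wktLiteral carries no CRS IRI.
CRS84 = "http://www.opengis.net/def/crs/OGC/1.3/CRS84"

# Optional "<crs-iri>" prefix followed by the WKT text itself.
_LITERAL_RE = re.compile(r"^\s*(?:<([^>]+)>\s*)?(.+?)\s*$", re.DOTALL)
# A deliberately narrow pattern: two-dimensional points with decimal coordinates.
_POINT_RE = re.compile(
    r"^POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)$",
    re.IGNORECASE,
)


def make_wkt_literal(wkt, crs_iri=None):
    """Return the lexical form of a geo:wktLiteral, optionally prefixed by a CRS IRI."""
    return f"<{crs_iri}> {wkt}" if crs_iri else wkt


def split_wkt_literal(lexical):
    """Split a wktLiteral lexical form into (crs_iri, wkt), defaulting to CRS84."""
    match = _LITERAL_RE.match(lexical)
    if match is None:
        raise ValueError("empty wktLiteral")
    return (match.group(1) or CRS84), match.group(2)


def parse_point(wkt):
    """Parse a two-dimensional WKT POINT into a tuple of two floats."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError(f"not a 2D WKT point: {wkt!r}")
    return float(match.group(1)), float(match.group(2))
```

Note that the order of the two coordinates depends on the coordinate reference system: CRS84 uses longitude then latitude, whereas EPSG:4326 uses latitude then longitude, so the CRS returned by split_wkt_literal should be checked before interpreting the numbers.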

10:30-11:00 coffee break
11:00-12:00 Publishing geospatial information as RDF graphs
pdf
 

The material will come mostly from the following papers and W3C recommendations:

  • M. Arenas, E. Prud'hommeaux, J. Sequeda. A Direct Mapping of Relational Data to RDF, W3C Recommendation, 2012.
  • S. Das, S. Sundara, R. Cyganiak. R2RML: RDB to RDF Mapping Language, W3C Recommendation, 2012.
  • S. Auer, S. Dietzold, J. Lehmann, S. Hellmann, and D. Aumueller. Triplify: Light-weight Linked Data Publication from Relational Databases. International Conference on World Wide Web 2009.
  • C. Bizer and A. Seaborne. D2RQ: treating non-RDF databases as virtual RDF graphs. International Semantic Web Conference 2004.
  • A. d. Leon, V. Saquicela, L. M. Vilches, B. Villazón-Terrazas, F. Priyatna, O. Corcho, Geographical Linked Data: a Spanish Use Case, in: I-SEMANTICS, ACM, 2010.
  • K. Kyzirakos, I. Vlachopoulos, D. Savva, S. Manegold, M. Koubarakis. GeoTriples: a Tool for Publishing Geospatial Data as RDF Graphs Using R2RML Mappings. Terra Cognita 2014, 6th International Workshop on the Foundations, Technologies and Applications of the Geospatial Web, in conjunction with ISWC 2014.

12:00-12:30 Discovering Spatial and Temporal Links among RDF graphs
pdf
 

The material will come mostly from the following papers:

  • O. Hassanzadeh, A. Kementsietsidis, L. Lim, R. J. Miller, and M. Wang. A framework for semantic link discovery over relational data. Information and Knowledge Management, 2009.
  • A.-C. N. Ngomo and S. Auer. Limes: A time-efficient approach for large-scale link discovery on the web of data. International Joint Conference on Artificial Intelligence, 2011.
  • R. Isele and C. Bizer. Active learning of expressive linkage rules using genetic programming. Web Semantics: Science, Services and Agents on the World Wide Web, 2013.
  • Axel-Cyrille Ngonga Ngomo. Orchid: reduction-ratio-optimal computation of geospatial distances for link discovery. International Semantic Web Conference, 2013.
  • Vivek Sehgal, Lise Getoor, and Peter D Viechnicki. Entity resolution in geospatial data integration. In Proceedings of the 14th annual ACM international symposium on Advances in geographic information systems, pages 83–90. ACM, 2006.
  • Luis M Vilches-Blázquez, Víctor Saquicela, and Oscar Corcho. Interlinking geospatial information in the web of data. In Bridging the Geographic Information Sciences, pages 119–139. Springer, 2012.
  • P. Smeros and M. Koubarakis. Discovering Spatial and Temporal Links among RDF data. Unpublished paper, to appear.

12:30-14:00 Lunch break
14:00-15:00 Hands-on session: Publishing geospatial information as RDF graphs
pdf
 

The tutorial attendees will be given access to a collection of geospatial data sets that have already been discussed in the morning sessions (e.g., OpenStreetMap). Then, they will be asked to automatically generate R2RML mappings using GeoTriples (https://github.com/LinkedEOData/GeoTriples), customize the R2RML mappings to follow the vocabulary of their liking (e.g., GeoSPARQL), generate an RDF graph using GeoTriples, and store the resulting graph in the geospatial RDF store Strabon (http://strabon.di.uoa.gr/). The attendees will be provided with a virtual machine to deploy on their laptops, where all necessary software and data will have been installed in advance.
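
To give a feel for what such a transformation produces, the following Python sketch turns one tabular record into N-Triples that follow the GeoSPARQL feature/geometry pattern (geo:hasGeometry and geo:asWKT). It is a hand-written analogue of what a mapping-driven tool generates, not GeoTriples' actual interface: the http://example.org/ namespace, the column names id, name and wkt, and the function names are all assumptions made for illustration.

```python
from urllib.parse import quote

GEO = "http://www.opengis.net/ont/geosparql#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/"  # hypothetical namespace, chosen for this sketch


def escape_literal(text):
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row):
    """Map one record (assumed keys: id, name, wkt) to a list of N-Triples lines."""
    key = quote(str(row["id"]), safe="")  # keep the generated IRIs well-formed
    feature = f"<{BASE}feature/{key}>"
    geometry = f"<{BASE}geometry/{key}>"
    name = escape_literal(row["name"])
    wkt = escape_literal(row["wkt"])
    return [
        f"{feature} <{RDF_TYPE}> <{GEO}Feature> .",
        f'{feature} <{RDFS_LABEL}> "{name}" .',
        f"{feature} <{GEO}hasGeometry> {geometry} .",
        f"{geometry} <{RDF_TYPE}> <{GEO}Geometry> .",
        f'{geometry} <{GEO}asWKT> "{wkt}"^^<{GEO}wktLiteral> .',
    ]


if __name__ == "__main__":
    sample = {"id": 1, "name": "Sample point of interest", "wkt": "POINT (23.72 37.97)"}
    print("\n".join(row_to_ntriples(sample)))
```

In an R2RML-based workflow the same choices (IRI templates, predicates, and the datatype of the geometry literal) are expressed declaratively in the mapping document, which is why customizing the mapping is enough to switch vocabularies.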

15:00-15:30 Hands-on session: Discovering Spatial and Temporal Links among RDF graphs
pdf
 

The tutorial attendees will be given access to a collection of geospatial linked data that have already been transformed to RDF during the previous session. In this session they will be asked to write link specifications and provide them as input to Silk (https://github.com/silk-framework/silk) in order to discover spatial and temporal relations among these datasets. The attendees will be provided with a virtual machine to deploy on their laptops, where all necessary software and data will have been installed in advance.
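
The idea behind spatial link discovery can be sketched in a few lines of Python. The example below compares every point of one dataset with every axis-aligned rectangle of another and emits a geo:sfWithin link whenever the point lies strictly inside the rectangle. It is a naive nested-loop illustration with hypothetical IRIs and function names; it does not use Silk's link specification language, and real link discovery tools avoid the exhaustive comparison with blocking or indexing.

```python
GEO_WITHIN = "http://www.opengis.net/ont/geosparql#sfWithin"


def strictly_inside(point, box):
    """True if point (x, y) lies in the interior of box (min_x, min_y, max_x, max_y).

    Strict inequalities are used because a point on the boundary of a
    region is not 'within' it under the Simple Features definition.
    """
    x, y = point
    min_x, min_y, max_x, max_y = box
    return min_x < x < max_x and min_y < y < max_y


def discover_within_links(points, boxes):
    """Return N-Triples lines linking each point IRI to every box IRI containing it."""
    links = []
    for point_iri, point in sorted(points.items()):
        for box_iri, box in sorted(boxes.items()):
            if strictly_inside(point, box):
                links.append(f"<{point_iri}> <{GEO_WITHIN}> <{box_iri}> .")
    return links


if __name__ == "__main__":
    # Hypothetical source and target datasets, keyed by resource IRI.
    pois = {
        "http://example.org/poi/1": (2.0, 3.0),
        "http://example.org/poi/2": (10.0, 10.0),
    }
    regions = {"http://example.org/region/A": (0.0, 0.0, 5.0, 5.0)}
    for line in discover_within_links(pois, regions):
        print(line)
```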

15:30-16:00 coffee break
16:00-16:30 Hands-on session: Discovering Spatial and Temporal Links among RDF graphs (cont'd)
 
16:30-17:00 Summary and conclusions of the tutorial  

Presenter CVs

Manolis Koubarakis is a Professor in the Dept. of Informatics and Telecommunications, National and Kapodistrian University of Athens. He has published more than 150 papers that have been widely cited in the areas of Artificial Intelligence (especially Knowledge Representation), Databases, Semantic Web and Linked Data. His research has been financially supported by the European Commission (projects CHOROCHRONOS, DIET, BRIDGEMAP, Evergrow, OntoGrid, SemsorGrid4Env, TELEIOS, Optique, LEO and MELODIES), the Greek General Secretariat for Research and Technology (more recently through a Research Excellence Grant), the European Space Agency (project Prod-Trees) and industry sources (Microsoft Research and British Telecommunications). He is currently co-ordinating project LEO (http://www.linkedeodata.eu/) which develops tools for linked Earth Observation data and linked geospatial data, and applies them to the development of a precision farming application. Manolis’ team develops the linked data infrastructure to be used in project MELODIES (http://www.melodiesproject.eu/) which studies how to exploit linked open data in a variety of environmental applications. He also participates in Optique (http://www.optique-project.eu/), a recent European effort in the area of Big Data with application scenarios from the energy sector (industrial partners Statoil and Siemens). He recently co-chaired the European Data Forum 2014 (http://2014.data-forum.eu/), the top European event aiming towards the development of a strong data economy in Europe. Manolis has 18 years of teaching experience in academic institutions in Greece and the United Kingdom, and has given many talks at international conferences and workshops (some of them invited). He served as Tutorial chair for ESWC 2011.

Kostis Kyzirakos is a post-doctoral researcher in the Database Architectures group of Centrum Wiskunde en Informatica in The Netherlands. He has a Diploma in Engineering from the School of Electrical and Computer Engineering, NTUA, Athens, and a PhD in Computer Science from the Department of Informatics and Telecommunications, University of Athens. He has participated in projects funded by the European Commission (Ontogrid, SemsorGrid4Env, TELEIOS, LEO) and the Greek General Secretariat for Research and Technology (P2P Techniques for Semantic Web Services). He is one of the main developers of the open-source semantic geospatial DBMS Strabon and one of the main developers of the open-source publishing tool GeoTriples that automates the publication of geospatial data as RDF graphs. During his PhD, he studied and proposed how to represent and query geospatial data in the Semantic Web, published various geospatial datasets as linked geospatial data and implemented applications combining these data with previously published linked geospatial data. His current research focuses on modeling and querying semantic spatio-temporal information on top of traditional DBMS. He has given a tutorial on building semantic sensor webs and applications at ESWC 2011, a tutorial on Data models, Query Languages, Implemented Systems and Applications of Linked Geospatial Data at ESWC 2012 and the 8th Reasoning Web 2012 Summer School, and a tutorial on Linked Geospatial Data at ICTAI 2012.

Panayiotis Smeros is a researcher in the Department of Informatics and Telecommunications, University of Athens. He received his Bachelor's degree and his Master of Science from the Department of Informatics and Telecommunications of the University of Athens. He has participated in projects funded by the European Commission (TELEIOS, LEO, MELODIES) and he is one of the main developers of the semantic geospatial DBMS Strabon and the main developer of the geospatial and temporal extensions of Silk that were developed in the context of these projects. In the same context, he published various geospatial datasets as linked geospatial data and implemented applications combining these data with previously published linked geospatial data. His current research focuses on the overlapping areas of the Geospatial Semantic Web and Linked Data.

Dimitrianos Savva is a researcher in the Department of Informatics and Telecommunications, University of Athens. He received his Bachelor's degree and his Master of Science from the Department of Informatics and Telecommunications of the University of Athens. He has participated in projects funded by the European Commission (LEO, MELODIES) and he is one of the main developers of the open-source publishing tool GeoTriples that automates the publication of geospatial data as RDF graphs. In the same context, he published various geospatial datasets as linked geospatial data and implemented applications combining these data with previously published linked geospatial data. His current research focuses on the overlapping areas of the Geospatial Semantic Web and Linked Data.

Organization/Sponsorship

This tutorial is organized by the European projects LEO and MELODIES.

  • LEO (Linked Open Earth Observation Data for Precision Farming) is a recent European project that studies techniques and software for the whole life cycle of reuse of linked open geospatial data (with a particular emphasis on open Earth Observation data), and develops a precision farming application that is heavily based on such data.
  • MELODIES is a recent European project that develops eight new services which combine Earth Observation data with other open data sources to produce new information for the benefit of scientists, industry, government decision-makers, public service providers and citizens.