Datasets

TourPedia contains two main datasets, both in the tourism domain: Places and Reviews.

License

Tourpedia is released under the Creative Commons CCZero license.

Places

The Places dataset contains accommodations, restaurants, attractions and points of interest. Places were retrieved from the following social media: Facebook, Foursquare, Google Places and Booking. They cover the following locations: Amsterdam, Tuscany, Barcelona, Berlin, Dubai, London, Paris and Rome.

The following table describes the fields of each place.

Field                       Description
id                          the unique identifier of the place
name                        name of the place (e.g. a hotel name)
address                     address of the place
category                    one of accommodation, attraction, restaurant, poi (point of interest)
location                    one of Rome, Amsterdam, London, Paris, Berlin, Dubai, Barcelona, Tuscany
lat                         latitude
lng                         longitude
services                    the list of services provided by the place; set only if the place is an accommodation
phone_number                national phone number associated with the place
international_phone_number  international phone number associated with the place
website                     URL of the website associated with the place
Icon                        picture associated with the place
description                 description of the place in the six languages of the OpeNER project
external_urls               external URLs associated with the place: Foursquare, Facebook, GooglePlaces and Booking (the last is present only if the place is an accommodation)
statistics                  statistics associated with the place, retrieved from Foursquare and Facebook
subCategory                 the category provided by the source; more specific than the category field
polarity                    the opinion about the place
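
As a rough illustration of the schema above, the following Python sketch reads place rows from a CSV file and checks the category and location values against the lists in the table. It assumes the CSV header uses the field names exactly as listed (id, name, category, location, lat, lng); the actual column names and order in the downloadable files may differ. The read_places helper and the sample row are made up for the example.

```python
import csv
import io

# Allowed values taken from the field table above.
CATEGORIES = {"accommodation", "attraction", "restaurant", "poi"}
LOCATIONS = {"Rome", "Amsterdam", "London", "Paris",
             "Berlin", "Dubai", "Barcelona", "Tuscany"}


def read_places(handle):
    """Yield place rows as dicts, with lat/lng converted to float.

    Assumes a header row whose column names match the field table;
    this is an assumption, not a documented guarantee.
    """
    for row in csv.DictReader(handle):
        if row["category"] not in CATEGORIES:
            raise ValueError(f"unexpected category: {row['category']!r}")
        if row["location"] not in LOCATIONS:
            raise ValueError(f"unexpected location: {row['location']!r}")
        row["lat"] = float(row["lat"])
        row["lng"] = float(row["lng"])
        yield row


# Made-up sample data, for illustration only.
SAMPLE = (
    "id,name,address,category,location,lat,lng\n"
    "1,Example Hotel,1 Example Street,accommodation,Rome,41.9,12.5\n"
)

places = list(read_places(io.StringIO(SAMPLE)))
```

With a real file, the same function can be called on an open file handle instead of the in-memory sample.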

Reviews

The Reviews collection contains reviews of the places described above. The following table describes the fields of each review.

Field           Description
Id              the unique identifier of the review
Text            the text of the review
language        the language of the review
source          one of GooglePlaces, Foursquare, Facebook
rating          rating expressed by the user, from 1 to 5
Time            date of the review
wordsCount      number of words in the text
analysis.kaf    the result of the OpeNER pipeline in KAF
analysis.json   the result of the OpeNER pipeline in KAF-JSON
polarity        the polarity of the review, extracted by the Polarity tagger module
place.id        id of the place associated with the review
place.name      name of the place associated with the review
place.location  location of the place associated with the review
place.category  category of the place associated with the review
authorName      the name of the review author
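
The dotted field names (place.id, analysis.kaf) suggest nested records. The following is a minimal sketch, assuming reviews arrive as nested dictionaries (for example, parsed JSON), that flattens them into the dotted names used in the table and checks the documented rating range and source values. The flatten and check_review helpers and the sample record are hypothetical, not part of any TourPedia tooling.

```python
# Allowed values taken from the "source" row of the table above.
SOURCES = {"GooglePlaces", "Foursquare", "Facebook"}


def flatten(record, prefix=""):
    """Flatten nested dicts into dotted keys: {'place': {'id': 1}} -> {'place.id': 1}."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def check_review(flat):
    """Return a list of problems found in a flattened review (empty if none)."""
    problems = []
    rating = flat.get("rating")
    # The table documents ratings between 1 and 5; a missing rating is tolerated here.
    if rating is not None and not 1 <= rating <= 5:
        problems.append(f"rating out of range: {rating}")
    if flat.get("source") not in SOURCES:
        problems.append(f"unknown source: {flat.get('source')!r}")
    return problems


# Made-up record, for illustration only.
review = {
    "rating": 4,
    "source": "Foursquare",
    "place": {"id": 1, "name": "Example Hotel", "location": "Rome"},
}
flat_review = flatten(review)
problems = check_review(flat_review)
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a record at once.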

Download datasets

A complete RDF dump of Tourpedia is available here.

Datasets are divided by category and location.

Places

Location   Accommodation  Restaurant  POI  Attraction
Amsterdam  CSV            CSV         CSV  CSV
Barcelona  CSV            CSV         CSV  CSV
Berlin     CSV            CSV         CSV  CSV
Dubai      CSV            CSV         CSV  CSV
London     CSV            CSV         CSV  CSV
Paris      CSV            CSV         CSV  CSV
Rome       CSV            CSV         CSV  CSV
Tuscany    CSV            CSV         CSV  CSV
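
Because the place files are split by location and category, a loader usually iterates over both. The sketch below enumerates the 32 combinations from the table above. The "location-category.csv" naming pattern is a hypothetical convention for locally saved copies, not the file names used on the download page.

```python
from itertools import product

# Rows and columns of the download table above.
LOCATIONS = ["Amsterdam", "Barcelona", "Berlin", "Dubai",
             "London", "Paris", "Rome", "Tuscany"]
CATEGORIES = ["accommodation", "restaurant", "poi", "attraction"]


def local_filenames():
    """Return one hypothetical local file name per (location, category) pair."""
    return [f"{loc.lower()}-{cat}.csv"
            for loc, cat in product(LOCATIONS, CATEGORIES)]


names = local_filenames()
```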