
Linked Open Data



'Linked Open Data' (LOD) make it possible to access and navigate data in an open format, based on Semantic Web technologies and standards. Publishing data as LOD relies on the definition of ontologies, that is, formal, shared and explicit conceptualizations of the domain of interest. The first set of 'Linked Open Data' published by Istat consists of data from the 2011 Census of Population and Housing. In particular, the following ontologies were defined:


  1. Territorial Ontology

  2. Census Data Ontology





Territorial Ontology



The territorial ontology formalizes and describes the Italian territory from both an administrative and a statistical point of view. From an administrative point of view, the territory is divided into:


  • State: political entity that governs and exercises sovereign power over a given territory and the subjects belonging to it

  • Geographical division: north-west, north-east, south, center and islands

  • Region: local authority defined in the Italian legal system

  • Province: local authority having jurisdiction in a group of municipalities, not necessarily contiguous

  • Municipality: administrative body defined by specific territorial boundaries, within which a portion of the population resides

  • Sub-municipal area: boroughs, districts, etc. present in the 34 largest municipalities by population, each with more than 100,000 inhabitants



From a statistical point of view, the territory is partitioned as follows:


  • Census sections: portions of territory on which Istat surveys are conducted. The average population is about 170 individuals

  • Census areas: groups of adjacent census sections, intermediate in size between census sections and inhabited-center localities, belonging to the main centers

  • Locality: portion of territory, usually known by its own name, on which one or more grouped or scattered houses are located



The ontology also characterizes special areas, such as (i) geomorphological entities (ponds, fishing valleys, rivers, lagoons, etc.), (ii) administrative islands and (iii) disputed areas, as well as special nuclei, such as (i) hospitals, (ii) abbeys and (iii) shelters, for a total of 45 special nuclei. All of these entities are linked to one another both by hierarchical relations and by other relations, such as 'assegnata a' (assigned to), which links a disputed area to the municipality to which it has been allocated, and 'contesa da' (claimed by), which links a disputed area to the municipality claiming it. Finally, connections were made with the equivalent concepts in the international GeoNames ontology. For example, rivers, defined as a special area, were explicitly declared equivalent to the GeoNames code H.STM.
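As a rough illustration of how such hierarchical and non-hierarchical relations can be navigated, the following minimal sketch stores them as subject-predicate-object triples in plain Python. Every identifier in it (the entity names, and the relation names partOf, assignedTo, claimedBy and equivalentTo) is a hypothetical placeholder chosen for readability, not an actual URI or property of the Istat ontology; only the GeoNames code H.STM comes from the text above.

```python
from collections import defaultdict


class TerritoryGraph:
    """A tiny in-memory triple store: (subject, predicate, object)."""

    def __init__(self):
        self._by_sp = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._by_sp[(subject, predicate)].add(obj)

    def objects(self, subject, predicate):
        """All objects linked to `subject` through `predicate`, sorted."""
        return sorted(self._by_sp.get((subject, predicate), ()))

    def ancestors(self, entity):
        """Follow the hypothetical 'partOf' relation up the hierarchy."""
        chain = []
        current = entity
        while True:
            parents = self.objects(current, "partOf")
            # Stop at the top of the hierarchy, or if a cycle is detected.
            if not parents or parents[0] in chain:
                return chain
            current = parents[0]
            chain.append(current)


graph = TerritoryGraph()

# Hierarchical relations (placeholder entities, not real Istat resources).
graph.add("municipality/A", "partOf", "province/P")
graph.add("province/P", "partOf", "region/R")
graph.add("region/R", "partOf", "division/D")
graph.add("division/D", "partOf", "state/Italy")

# A disputed area: allocated to one municipality, claimed by another.
graph.add("disputed-area/X", "assignedTo", "municipality/A")
graph.add("disputed-area/X", "claimedBy", "municipality/B")

# Equivalence between a special area type and a GeoNames feature code.
graph.add("special-area/River", "equivalentTo", "geonames:H.STM")

print(graph.ancestors("municipality/A"))
print(graph.objects("disputed-area/X", "claimedBy"))
```

A real client would more likely query the published RDF with a dedicated library or a SPARQL endpoint; the triple structure is the only point this sketch is meant to convey.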




Census Data Ontology



The census data ontology formalizes and describes the metadata of the census variables. Because the data are statistical, they were formalized using the Data Cube Vocabulary, a meta-ontology created specifically for representing multi-dimensional data.
The census variables can be divided into eight groups: (i) population, (ii) foreign population, (iii) families, (iv) educational level, (v) employment status, (vi) commuting, (vii) housing, and (viii) buildings. For each of these groups, the following measures and dimensions are formalized in the ontology:


  1. Population
    • Measures: resident population

    • Dimensions: year, territory, sex, age, marital status.

  2. Foreign population
    • Measures: foreigners and stateless population

    • Dimensions: age, country of origin, year, territory, sex.

  3. Families
    • Measures: number of families, number of family members

    • Dimensions: number of family members, condition of housing usage, territory, year.

  4. Educational level
    • Measures:

    • Dimensions: territory, year, level of education, age, sex.

  5. Professional condition
    • Measures: resident population

    • Dimensions: labor force, employment status, age, year, region, sex.

  6. Commuting
    • Measures: resident population

    • Dimensions: commuting, year, territory.

  7. Housing
    • Measures: number of accommodations, area

    • Dimensions: occupancy status, type of housing, territory, year.

  8. Buildings
    • Measures: number of buildings, number of apartments in a building

    • Dimensions: territory, construction materials, construction period, state of preservation, year, number of floors, number of apartments, type of building usage.
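To make the split between measures and dimensions concrete, here is a minimal sketch in plain Python of how one group from the list above (population) might be described and how a single observation could be checked against it. The class names, the field names and the observation values are all invented for illustration; they are not Istat's actual data or URIs, and the check simply mirrors the general Data Cube expectation that each observation carries a value for every dimension.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CubeStructure:
    """Measures and dimensions of one group of census variables."""
    name: str
    measures: tuple
    dimensions: tuple


@dataclass
class Observation:
    """One cell of the cube: dimension values plus measured values."""
    structure: CubeStructure
    dimension_values: dict
    measure_values: dict

    def validate(self):
        """Return a list of problems; an empty list means the cell fits."""
        problems = []
        for dim in self.structure.dimensions:
            if dim not in self.dimension_values:
                problems.append(f"missing dimension: {dim}")
        for key in self.dimension_values:
            if key not in self.structure.dimensions:
                problems.append(f"unknown dimension: {key}")
        for key in self.measure_values:
            if key not in self.structure.measures:
                problems.append(f"unknown measure: {key}")
        return problems


# Group 1 of the list above.
population = CubeStructure(
    name="population",
    measures=("resident population",),
    dimensions=("year", "territory", "sex", "age", "marital status"),
)

# Invented values, for illustration only.
complete = Observation(
    structure=population,
    dimension_values={
        "year": 2011,
        "territory": "census-section/placeholder",
        "sex": "female",
        "age": "30-34",
        "marital status": "married",
    },
    measure_values={"resident population": 12},
)

incomplete = Observation(
    structure=population,
    dimension_values={"year": 2011, "territory": "census-section/placeholder"},
    measure_values={"resident population": 12},
)

print(complete.validate())
print(incomplete.validate())
```

The same CubeStructure could be instantiated for each of the eight groups by copying its measures and dimensions from the list.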





Certification of Origin of Data



While opening up to the new frontier of Linked Open Data, Istat does not underestimate the importance of data quality. Istat therefore certifies the origin of its data using the PROV-O ontology, which allows the origin of the data to be described in detail. More specifically, the published data are accompanied by the following set of metadata, which describe and certify the origin as well as the quality of the data:


  • Entities, activities and agents participating in the process of data generation

  • Party responsible for the data

  • Holder of the rights to the data

  • Publisher of the data

  • Date of last data modification

  • Title of the published data

  • Data reference period

  • Publication license

  • Description of the data

  • Spatial reference of the data
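The metadata items above can be pictured as a checklist attached to each published dataset. The sketch below, in plain Python, pairs each item with a term from the PROV-O and Dublin Core vocabularies and reports which items a record still lacks. The pairing is an assumption made for illustration: the terms themselves exist in those vocabularies, but the text does not say which properties Istat actually uses, and the field names and sample record are hypothetical placeholders.

```python
# Assumed mapping from the metadata items listed above to vocabulary terms.
# The choice of terms is illustrative, not confirmed by the source.
METADATA_TERMS = {
    "generation_process": "prov:wasGeneratedBy",   # entities, activities, agents
    "responsible": "prov:wasAttributedTo",
    "rights_holder": "dcterms:rightsHolder",
    "publisher": "dcterms:publisher",
    "modified": "dcterms:modified",
    "title": "dcterms:title",
    "reference_period": "dcterms:temporal",
    "license": "dcterms:license",
    "description": "dcterms:description",
    "spatial": "dcterms:spatial",
}


def missing_fields(record):
    """Names of the metadata items that are absent or empty in `record`."""
    return [name for name in METADATA_TERMS if not record.get(name)]


def to_statements(subject, record):
    """Pair every filled-in item with its assumed vocabulary term."""
    return [
        (subject, METADATA_TERMS[name], record[name])
        for name in METADATA_TERMS
        if record.get(name)
    ]


# Placeholder record: every value here is invented.
record = {
    "title": "Example dataset",
    "publisher": "Example publisher",
    "license": "example-license",
    "modified": "2014-01-01",
}

print(missing_fields(record))
print(to_statements("dataset/example", record))
```

A record is complete, in the sense of the list above, when missing_fields returns an empty list.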





Further Information



For more details on the published data, please read the document (in Italian) 'Descrizione dei dati geografici e delle variabili censuarie per sezione di censimento Anni 1991, 2001, 2011' (download the document). For more information on the technological aspects of publishing data in LOD format, see the article 'Publishing the 15th Italian Population and Housing Census in Linked Open Data', SemStats 2014 (download the document). For the guidelines on the enhancement of public information, please read the guidelines published by the Agency for Digital Italy (download the document).