ARTICLE
TITLE

Analysis of quality metadata in the GEOSS Clearinghouse

Paula Díaz Redondo    
Joan Masó    
Eva Sevillano    
Miquel Ninyerola    
Alaitz Zabala    
Ivette Serral    
Xavier Pons    

Abstract

The Global Earth Observation System of Systems (GEOSS) Clearinghouse is part of the GEOSS Common Infrastructure (GCI) and supports the discovery of data made available by the members of the Group on Earth Observations (GEO) and by organizations participating in GEOSS. It also acts as a unified metadata catalogue that stores complete metadata records, not only for datasets but also for other kinds of components and services. Users explore these records to find data that are fit for use. Quality indicators and provenance are included in the metadata and are potentially useful variables that allow users to make an informed decision without having to download and assess the data themselves. However, no previous study has examined the completeness and correctness of the metadata records in the Clearinghouse. The objective of this paper is to analyze the data quality information distributed by the GEOSS Clearinghouse. The aim is to quantify its completeness and to provide clues on how the current status of the Clearinghouse could be improved and how useful quality-aware tools could be. The methodology consists of first harvesting the Clearinghouse and then quantifying the quality information found in 97,203 metadata records using a semi-automatic approach. The results reveal that the inclusion of quality information in metadata records is not rare: 19.66% of the metadata records contain some quality element. However, this practice is not yet general enough, and several aspects could be improved. For instance, 77.78% of quantitative measures lack measurement units. When quality indicators are not sufficient, lineage metadata could be used to mitigate this situation by analysing the process steps and sources used to create a dataset. However, even though lineage is reported in 15.55% of the records, only 1.27% of the cases return a complete list of process steps with sources.
This paper also indicates what is lacking in the current producer metadata model and detects a gap in usage or user feedback metadata in GEOSS. Moreover, information extracted from GeoViQua interviews with users indicates that they value informal comments and user feedback on datasets as a complement to the more formal, producer-oriented metadata description of the data. Although the scientific community and the Quality Assurance Framework for Earth Observation (QA4EO) group have invested considerable effort in describing how to parameterize data quality and uncertainty, we conclude that additional work can still be done to provide complete quality information in metadata catalogues. In brief, since the GEOSS Clearinghouse references data from the most important agencies and research organizations, the results presented in this paper provide a perspective on how well quality is disseminated in the Earth observation community in general.
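
The quantification step described in the abstract (counting how many harvested records carry quality or lineage information) might look roughly like the sketch below. It is not the authors' tooling. It assumes, purely for illustration, that each harvested record is an ISO 19139 XML string using the `gmd` namespace, and that a "quality element" means a `gmd:report` entry while lineage means a `gmd:LI_Lineage` entry. The function names `has_element` and `quality_summary` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: records follow ISO 19139 and use the standard gmd namespace.
GMD = "http://www.isotc211.org/2005/gmd"


def has_element(record_xml, local_name):
    """Return True if the record contains at least one gmd:<local_name> element."""
    root = ET.fromstring(record_xml)
    return next(root.iter(f"{{{GMD}}}{local_name}"), None) is not None


def quality_summary(records):
    """Compute the share of records with quality reports and with lineage.

    `records` is a list of XML strings (hypothetical harvest output).
    Percentages are rounded to two decimals.
    """
    total = len(records)
    with_report = sum(has_element(r, "report") for r in records)
    with_lineage = sum(has_element(r, "LI_Lineage") for r in records)

    def pct(count):
        return round(100.0 * count / total, 2) if total else 0.0

    return {
        "total": total,
        "with_quality_report_pct": pct(with_report),
        "with_lineage_pct": pct(with_lineage),
    }
```

Calling `quality_summary` on a list of harvested XML strings returns the record count and two percentages. The figures reported in the abstract come from the paper's own semi-automatic analysis and are not reproduced by this sketch.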

PAGES
pp. 352 - 377
SUBJECTS
INFRASTRUCTURE