Meta Check: A Comprehensive Framework for Profiling and Repairing Metadata Quality in Open Data Repositories

Authors

  • Riaz Ahmed, Bachelor’s Student, Faculty of Higher IT School, National Research Tomsk State University, Tomsk, Russian Federation.
  • Shakil Ahmed, Bachelor’s Student, Department of Petroleum Engineering, Chattogram University of Engineering and Technology, Chattogram, Bangladesh.
  • Md Ruhul Ibna Khan Jesun, Bachelor’s Student, TISP Molecular Engineering, National Research Tomsk State University, Russian Federation.
  • Shaik Abul Zaid, Bachelor’s Student, Faculty of Higher IT School, National Research Tomsk State University, Tomsk, Russian Federation.

DOI:

https://doi.org/10.63468/jpsa.4.2.06

Keywords:

Metadata Quality, Open Data, Data Discovery, FAIR Principles, Data Profiling, Petroleum Engineering, Mining Engineering

Abstract

Pervasive deficiencies in metadata quality undermine the utility of open data repositories, which are a key component of modern research and open-government endeavours. These shortcomings are highly detrimental to data discovery, interoperability, and reuse, and in effect create data silos within a supposedly interconnected ecosystem. This paper presents a new, holistic framework, named Meta Check, for systematically profiling, assessing, and repairing metadata quality across diverse open data repositories. The methodology is a four-tier process: (1) scalable harvesting of metadata from a wide variety of portals, both general-purpose (e.g., Data.gov) and domain-specific (e.g., the EDX of the National Energy Technology Laboratory); (2) multi-dimensional profiling of metadata quality against quantifiable metrics of completeness, consistency, timeliness, and standards compliance (DCAT and FAIR); (3) application of a suite of lightweight yet powerful automated repair methods, including ontology-based term alignment; and (4) evaluation of discoverability through federated search simulations. Findings from a large-scale examination of more than 50,000 datasets reveal widespread metadata deficiencies across both general-purpose and domain-specific repositories. Our automated repair procedures yielded substantial quality improvements, raising completeness scores by a mean of 35 per cent and consistency by more than 50 per cent. Federated search simulations showed a meaningful improvement in discoverability, with average recall improving by 28 per cent and average precision by 22 per cent. A detailed case study of petroleum engineering data found that domain-specific fixes, such as matching terms to the NASA GCMD thesaurus, significantly improved the accuracy of complex technical queries involving concepts such as "gamma ray log" and "3D seismic survey".
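The paper does not publish its scoring code; as a minimal sketch of the kind of completeness metric described above, one could score the fraction of expected DCAT-style fields that a harvested record actually populates. The field list and the sample record here are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch of a DCAT-style completeness score.
# DCAT_FIELDS is an assumed subset of dcat:Dataset properties.
DCAT_FIELDS = ["title", "description", "keyword", "publisher",
               "issued", "modified", "license", "theme"]

def completeness(record: dict) -> float:
    """Fraction of expected fields that are present and non-empty."""
    present = sum(1 for field in DCAT_FIELDS if record.get(field))
    return present / len(DCAT_FIELDS)

# Hypothetical harvested record: 3 of 8 fields populated
# (an empty license string does not count as populated).
record = {"title": "Well Logs 2021",
          "description": "Gamma ray logs from exploratory wells",
          "keyword": ["gamma ray log"],
          "license": ""}
print(completeness(record))  # → 0.375
```

Averaging such per-record scores over a repository gives the kind of repository-level completeness figure the abstract reports before and after repair.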
We show that systematic, automated metadata remediation is not only technically feasible but central to building an open data ecosystem that is genuinely integrated, functional, and trustworthy. This matters most in data-intensive, high-stakes domains such as energy and the geosciences, where the cost of failing to discover, or failing to correctly interpret, the data is extremely high.
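The recall and precision gains reported above come from comparing query results against relevance judgments. A minimal sketch of that evaluation step, with an invented query's result sets and relevance set (not data from the study):

```python
# Illustrative sketch: precision/recall for one simulated federated query.
# All dataset identifiers and relevance judgments below are made up.
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Standard set-based precision and recall for a single query."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

relevant = {"ds1", "ds2", "ds3", "ds4", "ds5"}   # judged relevant
before = {"ds1", "ds8", "ds9"}                   # search over raw metadata
after = {"ds1", "ds2", "ds3", "ds9"}             # search after repair

print(precision_recall(before, relevant))
print(precision_recall(after, relevant))
```

Averaging these per-query deltas over a query workload is how aggregate recall/precision improvements like the 28 and 22 per cent figures would be obtained.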

Downloads

Download data is not yet available.

Published

2026-03-20

How to Cite

Ahmed, R., Ahmed, S., Jesun, M. R. I. K., & Zaid, S. A. (2026). Meta Check: A Comprehensive Framework for Profiling and Repairing Metadata Quality in Open Data Repositories. Journal of Political Stability Archive, 4(2), 84-96. https://doi.org/10.63468/jpsa.4.2.06
