5. Data Governance
5.1 Purpose
This chapter describes how we organise and steer data management activities in order to ensure that:
- The guidelines described above are implemented throughout the organisation;
- Our data management practices are in line with and contribute to the institute’s strategic aims;
- Our data management regime is subject to review, analysis and revision in a timely manner.
These higher-level aspects of data management are often referred to as data governance. A useful definition is:
"Data governance … is the overall management of the availability, usability, integrity and security of data used in an enterprise. A sound data governance program includes a governing body or council, a defined set of procedures and a plan to execute those procedures."
In this chapter we address many aspects of this definition, but a full description of data governance touches on management structures that are beyond the scope of this handbook.
5.2 Data life cycle management
Data life cycle management is steered by documentation describing how data generated or used in an activity will be handled throughout the lifetime of the activity and after the activity has been completed. This is living documentation that follows the activity and specifies what kind of data will be generated or acquired, how the data will be described, where the data will be stored, whether and how the data can be shared, and how the data will be retired (archived or deleted). The purpose of life cycle management is to safeguard the data, not just during their “active” period but also for future reuse, and to facilitate cost-effective data handling.
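As an illustration only, the questions that such documentation answers can be captured in a small structured record. The Python sketch below assumes hypothetical field names (`data_kind`, `description_standard`, `storage_location`, `sharing`, `retirement`) and a made-up example dataset; none of these is prescribed by this handbook.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Retirement(Enum):
    """How a dataset is retired at the end of its life cycle."""
    ARCHIVE = "archive"
    DELETE = "delete"


@dataclass
class LifeCycleRecord:
    """Hypothetical record of the life cycle decisions for one dataset."""
    data_kind: str             # what kind of data is generated or acquired
    description_standard: str  # how the data are described
    storage_location: str      # where the data are stored
    sharing: str               # whether and how the data can be shared
    retirement: Retirement     # archived or deleted when retired

    def missing_fields(self) -> list[str]:
        """Return the names of text fields that are still empty."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]


# Made-up example: the description standard has not been decided yet.
record = LifeCycleRecord(
    data_kind="field observations",
    description_standard="",
    storage_location="institute file store",
    sharing="open after embargo",
    retirement=Retirement.ARCHIVE,
)
print(record.missing_fields())
```

Because the documentation is living, a record of this kind would be updated as decisions are made; listing the empty fields is one simple way to see what remains open.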
This DMH recommends that the institution implement the following life cycle management concepts:
- An institution specific Data Management Handbook (DMH) based on a common general template;
- Extended discovery metadata for data in internal production chains (these are metadata elements that provide the necessary information for life cycle management just described); and
- A Data Management Plan (DMP) document (a DMP is expected for datasets produced in external projects, but may also be useful for internal datasets, as a supplement to the extended discovery metadata).
The goal is that life cycle management information shall be readily available for every dataset managed by the institute. How these concepts are implemented is described in the subsections below.
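One way to monitor this goal is a simple coverage check over a dataset inventory. The sketch below is an assumption-laden illustration: the inventory format, the keys `id`, `extended_metadata` and `dmp`, and the dataset identifiers are all hypothetical and not part of this handbook.

```python
def datasets_without_lifecycle_info(datasets):
    """Return ids of datasets that have neither extended discovery
    metadata nor a DMP (hypothetical inventory format)."""
    return [
        d["id"]
        for d in datasets
        if not d.get("extended_metadata") and not d.get("dmp")
    ]


# Made-up inventory for illustration only.
inventory = [
    {"id": "ds-001", "extended_metadata": True, "dmp": False},
    {"id": "ds-002", "extended_metadata": False, "dmp": True},
    {"id": "ds-003", "extended_metadata": False, "dmp": False},
]
print(datasets_without_lifecycle_info(inventory))
```

A dataset counts as covered here if either source of life cycle information exists, reflecting that a DMP may supplement the extended discovery metadata.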
5.2.1 Data Management Plan
A Data Management Plan (DMP) is a document that describes textually how the data life cycle management will be carried out for datasets used and produced in specific projects. Generally, these are externally financed projects for which such documentation is required by funding agencies. However, larger internal projects covering many datasets may also find it beneficial to create a specific document of this type.
Currently, agencies funding R&D (such as NFR and the EU) do not strictly require a DMP from the start of every project. However, for projects in the geosciences, data management must be addressed, and the agencies strongly recommend a DMP. For example, NFR publishes guidelines for the contents of a DMP, including links to tools (templates and online services). These guidelines are recommended for any data management project or activity and, according to NFR, will in time become a requirement.
5.3 Data governance at NINA
5.3.1 Current implementation
5.3.1.1 Organisational Roles
5.3.1.2 Status DMH
5.3.1.3 Status Discovery metadata
5.3.1.4 Status DMP
5.3.2 Planned developments in the near-term (< 2 years)
Revise DMH annually or when needed.
5.3.3 Expected evolution in the longer term (> 2 years)
Revise DMH annually or when needed.