Data Curation Service
The Data Curation Service curates data by monitoring it and by arranging its transfer, processing and provisioning. The subsystem is composed of a Data Curation Service object.
The Data Curation Service provides only one computational function, which is the monitoring of data. When needed, it consumes other subsystems to perform the actual curation. It provides the following interfaces:
- Monitor data (server): this is a public interface that allows data to be monitored.
- Transfer data (client): this is an interface for transferring data from e.g. data provisioning to data processing.
- Process data (client): the interface for processing the monitored data. This may include the calculation of checksums, or the transformation to new data formats.
- Create data (client): the interface for the creation of new data. This may include the creation of new metadata.
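The split between the single server interface and the three client interfaces can be sketched as follows. This is a minimal illustration in Python, not part of the reference architecture: the class name `DataCurationService`, the use of plain callables for the client interfaces, and the checksum-based processing step are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DataCurationService:
    """Hypothetical sketch: one server interface, three client interfaces."""

    # Client interfaces: the service consumes other subsystems through these.
    transfer_data: Callable[[bytes], bytes]
    process_data: Callable[[bytes], str]
    create_data: Callable[[str], dict]
    log: List[dict] = field(default_factory=list)

    def monitor_data(self, data: bytes) -> dict:
        """Server interface: the only function the service itself provides."""
        moved = self.transfer_data(data)        # e.g. provisioning to processing
        checksum = self.process_data(moved)     # e.g. checksum calculation
        metadata = self.create_data(checksum)   # e.g. a new metadata record
        self.log.append(metadata)
        return metadata


# Stand-ins for the consumed subsystems (assumptions for this example).
service = DataCurationService(
    transfer_data=lambda data: data,
    process_data=lambda data: hashlib.sha256(data).hexdigest(),
    create_data=lambda checksum: {"checksum": checksum},
)
record = service.monitor_data(b"observation")
```

Passing the client interfaces in as callables keeps the curation service independent of any concrete transfer, processing or creation subsystem, which is one possible reading of "consumes other subsystems".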
Provenance Data Subsystem
The Provenance Data Subsystem is a specialised form of the Data Curation Service which deals with the capture and maintenance of provenance data about the creation, processing, and dissemination of data within a service. For its interfaces, see the Data Curation Service above.
User Metadata Subsystem
The User Metadata Subsystem is a specialised form of the Data Curation Service which deals with the capture and maintenance of user metadata within a service, for example user and group privileges within an instance of a VRE service. For its interfaces, see the Data Curation Service above.
Data Provision Subsystem [1]
The Data Provision Subsystem provides data sharing, data discovery and data access functions and is composed of a data provision service object, a data inventory service and a data storage controller.
The Data Provision Service is a proxy service for the Data Provision Subsystem. It supports the following interactions:
- Share data (server): is a public interface that allows data to be shared.
- Discover data (server): is a public interface for searching provided data.
- Access data (server): is a public interface for accessing provided data.
The Data Storage Controller is a controller object that persists the published data. It supports the following interactions:
- Post data (server): is an interface for persisting data to the data storage controller.
- Retrieve data (server): is an interface for getting data from the storage controller.
The Data Inventory is a computational object that allows data to be discovered. It supports the following interactions:
- Index data (server): is an interface for registering/indexing data in the inventory.
- Query inventory (server): is an interface for querying the inventory for registered/indexed data.
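The way the proxy service delegates to the storage controller and the inventory can be sketched as follows. This is a minimal, in-memory illustration only: the method signatures, the keyword-based index and the identifier `survey-2020` are assumptions for the example and are not prescribed by the reference architecture.

```python
from typing import Dict, List


class DataStorageController:
    """Persists published data (Post data / Retrieve data)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def post_data(self, identifier: str, data: bytes) -> None:
        self._store[identifier] = data

    def retrieve_data(self, identifier: str) -> bytes:
        return self._store[identifier]


class DataInventory:
    """Makes data discoverable (Index data / Query inventory)."""

    def __init__(self) -> None:
        self._index: Dict[str, List[str]] = {}

    def index_data(self, identifier: str, keywords: List[str]) -> None:
        # Keywords are stored lower-cased so queries are case-insensitive.
        self._index[identifier] = [k.lower() for k in keywords]

    def query_inventory(self, keyword: str) -> List[str]:
        wanted = keyword.lower()
        return sorted(i for i, kws in self._index.items() if wanted in kws)


class DataProvisionService:
    """Proxy exposing the public Share / Discover / Access interfaces."""

    def __init__(self, storage: DataStorageController, inventory: DataInventory) -> None:
        self._storage = storage
        self._inventory = inventory

    def share_data(self, identifier: str, data: bytes, keywords: List[str]) -> None:
        # Sharing both persists the data and registers it for discovery.
        self._storage.post_data(identifier, data)
        self._inventory.index_data(identifier, keywords)

    def discover_data(self, keyword: str) -> List[str]:
        return self._inventory.query_inventory(keyword)

    def access_data(self, identifier: str) -> bytes:
        return self._storage.retrieve_data(identifier)


provision = DataProvisionService(DataStorageController(), DataInventory())
provision.share_data("survey-2020", b"a,b\n1,2\n", ["survey", "CSV"])
found = provision.discover_data("csv")
```

In this sketch only the proxy is meant to be called by external clients; the storage controller and the inventory are reached through it.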
Data Creation Subsystem [2]
The Data Creation Subsystem creates data by recording observations. The subsystem is composed of a data creation service object, a creation instrument workbench object and a creation instrument controller object.
The Data Creation Service is a proxy object for managing the instruments and the resulting data. It supports the following interactions:
- Deploy instrument (server): is a public interface for the deployment of a new tool.
- Configure tool (server): is a public interface for controlling the tool.
- Provide data (server): is a public interface for retrieving the created data from the tool.
The Creation Instrument Workbench is a computational object that instantiates creation tool controller objects. It supports the following interactions:
- Create tool controller (server): is an interface for requesting a new instrument controller.
- New creation tool controller (instantiation): an instantiation of a new tool controller object by the creation instrument workbench.
The Creation Tool Controller is a controller object that records observations. It supports the following interactions:
- Configure tool (server): is an interface for configuration of the tool controller.
- Get data (server): is an interface for requesting the data created by the controller.
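The relationship between the proxy service, the workbench and the controllers it instantiates can be sketched as follows. This is an illustrative sketch under stated assumptions: the method signatures, the instrument name `thermometer` and the helper `simulate_observation` (which stands in for a real instrument producing a reading) are hypothetical and not part of the reference architecture.

```python
from typing import Dict, List


class CreationToolController:
    """Records observations (Configure tool / Get data)."""

    def __init__(self) -> None:
        self.settings: Dict[str, str] = {}
        self._observations: List[str] = []

    def configure_tool(self, **settings: str) -> None:
        self.settings.update(settings)

    def record(self, observation: str) -> None:
        # Hypothetical hook called when the instrument produces a reading.
        self._observations.append(observation)

    def get_data(self) -> List[str]:
        return list(self._observations)


class CreationInstrumentWorkbench:
    """Instantiates controller objects on request (Create tool controller)."""

    def create_tool_controller(self) -> CreationToolController:
        # Corresponds to the "New creation tool controller" instantiation.
        return CreationToolController()


class DataCreationService:
    """Proxy exposing Deploy instrument / Configure tool / Provide data."""

    def __init__(self, workbench: CreationInstrumentWorkbench) -> None:
        self._workbench = workbench
        self._controllers: Dict[str, CreationToolController] = {}

    def deploy_instrument(self, name: str) -> None:
        self._controllers[name] = self._workbench.create_tool_controller()

    def configure_tool(self, name: str, **settings: str) -> None:
        self._controllers[name].configure_tool(**settings)

    def provide_data(self, name: str) -> List[str]:
        return self._controllers[name].get_data()

    def simulate_observation(self, name: str, observation: str) -> None:
        # Illustration-only helper, not an interface from the architecture.
        self._controllers[name].record(observation)


creation = DataCreationService(CreationInstrumentWorkbench())
creation.deploy_instrument("thermometer")
creation.configure_tool("thermometer", unit="celsius")
creation.simulate_observation("thermometer", "21.5")
readings = creation.provide_data("thermometer")
```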
Data Processing Subsystem [3]
The Data Processing Subsystem provides data processing functions and is composed of a data processing service object, a data process workbench object and a data process controller object.
The Data Processing Service is a proxy object for managing the processing controllers and the data that is (to be) processed. It supports the following interactions:
- Deploy process (server): is a public interface for requesting a new process.
- Get data (client): is a public interface for retrieving the processed data.
- Post data (client): is a public interface for providing data to the process.
The Data Process Workbench is a computational object that instantiates data process controller objects. It supports the following interactions:
- Create data process controller (server): is an interface for requesting a process controller object.
- New data process controller (instantiation): is the instantiation of the new data process controller object.
The Data Process Controller is a controller object that processes data. It supports the following interactions:
- Configure process (server): the interface for configuring the process.
- Get data (server): is the interface for retrieving processed data from the data process controller object.
- Post data (server): is the interface for providing data to the data process controller object.
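One possible reading of these interactions is that the service acts as a client of the controller's Post data and Get data interfaces once a process has been deployed. The sketch below illustrates that reading only; the method signatures and the representation of a process as a Python callable are assumptions made for the example.

```python
from typing import Callable, List, Optional


class DataProcessController:
    """Processes data (Configure process / Post data / Get data)."""

    def __init__(self) -> None:
        self._operation: Optional[Callable[[str], str]] = None
        self._results: List[str] = []

    def configure_process(self, operation: Callable[[str], str]) -> None:
        self._operation = operation

    def post_data(self, item: str) -> None:
        if self._operation is None:
            raise RuntimeError("process has not been configured")
        self._results.append(self._operation(item))

    def get_data(self) -> List[str]:
        return list(self._results)


class DataProcessWorkbench:
    """Instantiates data process controller objects on request."""

    def create_data_process_controller(self) -> DataProcessController:
        # Corresponds to the "New data process controller" instantiation.
        return DataProcessController()


class DataProcessingService:
    """Proxy exposing Deploy process; client of the controller's data interfaces."""

    def __init__(self, workbench: DataProcessWorkbench) -> None:
        self._workbench = workbench

    def deploy_process(self, operation: Callable[[str], str], items: List[str]) -> List[str]:
        controller = self._workbench.create_data_process_controller()
        controller.configure_process(operation)
        for item in items:
            controller.post_data(item)   # Post data (client side)
        return controller.get_data()     # Get data (client side)


processing = DataProcessingService(DataProcessWorkbench())
processed = processing.deploy_process(str.upper, ["alpha", "beta"])
```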
Data Transfer Subsystem [4]
The Data Transfer Subsystem provides transfer services within the service and the research infrastructure (RI). The subsystem is composed of a Data Transfer Service and an Abstract Data Transfer Object, which may be instantiated as e.g. a Data Importer Object or a Data Exporter Object.
The Data Transfer Service is a proxy object for providing data transfers. It supports the following interface:
- Deploy transfer (server): is a public interface for requesting a data transfer.
The Abstract Data Transfer Object is an abstract object that provides the following interfaces:
- Fetch data (client): is a client interface for retrieving data from a source.
- Post data (client): is a client interface for providing the data to a target.
The Data Importer, for example, implements the Abstract Data Transfer Object for a specific environment. It may specify how the data is fetched from a remote source, whether specific processing needs to take place (via a Data Processing Subsystem) and how the outcome should be provided to the target system.
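The abstract object and one concrete importer can be sketched as follows. This is a minimal illustration, not a definitive design: dictionaries stand in for the remote source and the target system, the optional `process` callable stands in for a call to a Data Processing Subsystem, and all names and signatures are assumptions made for the example.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Optional


class AbstractDataTransfer(ABC):
    """Abstract object with the two client interfaces Fetch data and Post data."""

    @abstractmethod
    def fetch_data(self) -> bytes:
        """Retrieve data from a source."""

    @abstractmethod
    def post_data(self, data: bytes) -> None:
        """Provide the data to a target."""

    def run(self) -> None:
        self.post_data(self.fetch_data())


class DataImporter(AbstractDataTransfer):
    """Hypothetical importer: source and target are in-memory dictionaries."""

    def __init__(
        self,
        source: Dict[str, bytes],
        key: str,
        target: Dict[str, bytes],
        process: Optional[Callable[[bytes], bytes]] = None,
    ) -> None:
        self._source = source
        self._key = key
        self._target = target
        self._process = process

    def fetch_data(self) -> bytes:
        return self._source[self._key]

    def post_data(self, data: bytes) -> None:
        self._target[self._key] = data

    def run(self) -> None:
        data = self.fetch_data()
        if self._process is not None:
            data = self._process(data)  # optional processing step
        self.post_data(data)


class DataTransferService:
    """Proxy exposing the public Deploy transfer interface."""

    def deploy_transfer(self, transfer: AbstractDataTransfer) -> None:
        transfer.run()


remote = {"report.txt": b"  raw text  "}
local: Dict[str, bytes] = {}
DataTransferService().deploy_transfer(
    DataImporter(remote, "report.txt", local, process=bytes.strip)
)
```

A Data Exporter could be written the same way by swapping which side is treated as remote; the transfer service only depends on the abstract object.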
CV Back End Objects
Back End Objects encompass the systems and resources provided for preserving, publishing, and processing research data through user-accessible services.
Storage System
The Storage System is a system that manages and stores data and metadata of the research infrastructure.
The File Management System manages the storage and retrieval of data as files in a computer system. The Database Management System manages the storage and retrieval of data and metadata in logically structured repository systems.
Service Registry
The service registry is an information system for registering services within the research infrastructure.
The Service Registry is a proxy object that encapsulates all actions needed to register, update and request service information. It supports the following interactions:
- Register/update service (server): is a public interface for registering a service and any maintenance events.
- Request service attributes (server): is a public interface for requesting provided attributes for a registered service.
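The two registry interactions can be sketched as follows. This is an in-memory illustration only: the method names, the free-form string attributes, the service name `data-transfer` and the `example.org` endpoint are assumptions for the example and do not come from the reference architecture.

```python
from typing import Dict


class ServiceRegistry:
    """Proxy for registering, updating and requesting service information."""

    def __init__(self) -> None:
        self._services: Dict[str, Dict[str, str]] = {}

    def register_or_update_service(self, name: str, **attributes: str) -> None:
        # A first call registers the service; later calls merge in updates,
        # for example a maintenance event.
        self._services.setdefault(name, {}).update(attributes)

    def request_service_attributes(self, name: str) -> Dict[str, str]:
        if name not in self._services:
            raise KeyError(f"service not registered: {name}")
        # Return a copy so callers cannot modify the registry's own record.
        return dict(self._services[name])


registry = ServiceRegistry()
registry.register_or_update_service(
    "data-transfer", endpoint="https://example.org/transfer", status="active"
)
registry.register_or_update_service("data-transfer", status="maintenance")
attributes = registry.request_service_attributes("data-transfer")
```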
[1] From the Reference Architecture for a SSH Infrastructure: https://sites.google.com/a/dans.knaw.nl/reference-model-for-ssh-data-infrastructure/part-2/computational-viewpoint/data-provision-subsystem
[2] From the Reference Architecture for a SSH Infrastructure: https://sites.google.com/a/dans.knaw.nl/reference-model-for-ssh-data-infrastructure/part-2/computational-viewpoint/data-creation-subsystem
[3] From the Reference Architecture for a SSH Infrastructure: https://sites.google.com/a/dans.knaw.nl/reference-model-for-ssh-data-infrastructure/part-2/computational-viewpoint/data-processing-subsystem
[4] From the Reference Architecture for a SSH Infrastructure: https://sites.google.com/a/dans.knaw.nl/reference-model-for-ssh-data-infrastructure/part-2/computational-viewpoint/data-transfer-subsystem