Monitoring the physiological states of a microbial community and exploring the interactions between its members requires a data management system that can share information and services at an advanced level and allows tight integration of wet-lab and dry-lab approaches. The basic functionality of such a system covers the (i) collection, (ii) integration and (iii) delivery of data and metadata.
Maintaining a high degree of data interoperability is key. It requires automatic integration of laboratory process execution (LIMS) data, collected (omics) assay data and the associated experimental metadata in a Findable, Accessible, Interoperable and Reusable (FAIR) format. Applying these four foundational principles allows researchers to extract maximum benefit from the research investments made.
Platform Technical details
UNLOCK Knowledge management consists of four parts:
- An Integrated Rule-Oriented Data System (iRODS) takes care of the collected (raw) assay data, transformed data and metadata.
- In the UNLOCK iRODS implementation, data files and folders are organized hierarchically according to the Investigation/Study/Assay (ISA) format, an open general-purpose framework to collect and communicate complex metadata. In this set-up an ‘Investigation’ is a collection of experiments revolving around a set of common research questions. The Investigation folder thus forms the root of a set of hierarchically organized folders and files containing the data and metadata derived from experiments related to those research questions.
- Experimental design metadata is used to: (i) automatically create the appropriate ISA folder structure at the start of the Investigation and (ii) automatically start data processing when raw data is obtained.
- Standardized workflows and container technology are used to transform the raw data into information (see Figure below).
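As an illustration of how experimental design metadata could drive the ISA folder layout described above, the following is a minimal sketch only. It does not reproduce the actual UNLOCK implementation: the function name `isa_paths`, the root path `/unlock/home`, the folder names and the shape of the `design` dictionary are all hypothetical. It only computes the hierarchical paths; creating them as iRODS collections is left out.

```python
from pathlib import PurePosixPath


def isa_paths(root, design):
    """Return the folder paths of an Investigation/Study/Assay hierarchy.

    `design` is a hypothetical representation of experimental design
    metadata: {investigation: {study: [assay, ...]}}.
    """
    paths = []
    for investigation, studies in design.items():
        # The Investigation folder is the root of the hierarchy.
        inv_path = PurePosixPath(root) / investigation
        paths.append(inv_path)
        for study, assays in studies.items():
            study_path = inv_path / study
            paths.append(study_path)
            for assay in assays:
                # Raw and transformed assay data would live below this level.
                paths.append(study_path / assay)
    return [str(p) for p in paths]


# Hypothetical design metadata: one Investigation, two Studies, three Assays.
design = {
    "INV_example": {
        "STU_reactor_A": ["ASY_amplicon_01", "ASY_metagenome_01"],
        "STU_reactor_B": ["ASY_amplicon_02"],
    }
}

paths = isa_paths("/unlock/home", design)
for path in paths:
    print(path)
```

Because every path is derived from the design metadata alone, the same description can be reused later to decide which workflow applies when raw data arrives in a given Assay folder.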
Maintenance of the UNLOCK iRODS infrastructure and long-term preservation of the data generated within it are outsourced to SURFsara.
Most used application pipelines. Examples and technical details
- Amplicon analysis
- Metagenomics analysis