Abstract
Background:
Technological advances in biological signal recording, together with the growing range of data storage and sharing facilities, have made it much easier for researchers to access extensive biological data for their studies. Today, data recorded in one study can be reused repeatedly by other researchers through shared databases. On the one hand, access to biosignal pools can save considerable effort and reduce costs by preventing duplicate studies. On the other hand, it improves opportunities for meta-analysis and in-depth studies that use diverse datasets with greater statistical power, which provides more reliable results as well as new insights into biological questions. However, the lack of agreed-upon data standards and consistency across the research community creates barriers to data reuse. Data from different studies often have different formats and structures, which may require extensive reformatting before meta-analysis and comparative studies. Moreover, there is no standardized structure for organizing the information associated with biosignals (e.g. subject demographics or technical recording information), so information needed for subsequent data reporting and analysis may be missing.
Objectives:
In this article, we briefly report on the efforts made by the Iranian Brain Mapping Biobank (IBMB) to develop a standardized format and structure for recording and archiving electroencephalography (EEG) signals.
Methods:
In developing a new EEG data structure at IBMB, we focused on three main questions. (1) What information should be combined with EEG signals as metadata? There is still no agreement on the content of EEG metadata; as a result, information needed for subsequent EEG signal analysis is often not recorded. By reviewing international guidelines on performing and reporting EEG (e.g. [1, 2]) and consulting experts from various fields, we proposed a structured template for recording EEG metadata. (2) Which file format is best suited for storing EEG data? Many different formats for storing EEG have been introduced to date (e.g. EDF, GDF, and TXT). These formats differ in data type, embedded metadata content, storage structure, and storage requirements (for an overview see [3]). Although some of these formats are widely accepted, no single format meets all requirements. The format considered by IBMB addresses the need for a basic format that is compact, stores numerical data with high precision, can be used easily in popular processing applications, and can accommodate the proposed EEG metadata. (3) How should EEG datasets be organized structurally? The brain imaging data structure (BIDS) project [4] and the EEG study schema (ESS) [5] are among the few important recent efforts to create an infrastructure for structured EEG storage. In line with these efforts, we developed a new hierarchical data structure for storing EEG data, which can facilitate EEG data retrieval and sharing.
Findings:
IBMB has developed a customized EEG header consisting of 60 elements, including subject demographics and technical information about the data, which covers the syntactic, semantic, and pragmatic aspects of the data. This metadata, combined with the EEG signals, is organized into three main hierarchical levels (Study, Session, and Task) that correspond to the same levels of physical EEG storage.
Conclusion:
We introduced a new hierarchical EEG data structure and file content with embedded header information. This data structure encompasses the information needed for reporting and analyzing EEG; thus, it can facilitate EEG data reuse as well as large-scale analysis. We propose this approach for archiving EEG datasets in research-oriented EEG repositories.
For references, please refer to the PDF file.
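As an illustration only, the three-level organization reported in the Findings (Study, Session, Task) might be sketched in Python as follows. The abstract does not list the 60 header elements, so every field name and value below (`recording_site`, `subject_age`, `sampling_rate_hz`), the directory naming, and the rule that lower levels override higher ones when metadata is merged are assumptions made for this sketch, not the actual IBMB header or storage layout.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath
from typing import Dict, List


@dataclass
class Task:
    """Lowest level: one recorded task with its own metadata."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Session:
    """Middle level: one recording session containing tasks."""
    label: str
    metadata: Dict[str, str] = field(default_factory=dict)
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Study:
    """Top level: one study containing sessions."""
    title: str
    metadata: Dict[str, str] = field(default_factory=dict)
    sessions: List[Session] = field(default_factory=list)


def storage_paths(study: Study) -> List[str]:
    """Return one Study/Session/Task path per task, mirroring the hierarchy."""
    return [
        str(PurePosixPath(study.title, session.label, task.name))
        for session in study.sessions
        for task in session.tasks
    ]


def merged_header(study: Study, session: Session, task: Task) -> Dict[str, str]:
    """Combine metadata from all three levels into one header.

    Assumption of this sketch: on a key clash, the lower level wins.
    """
    return {**study.metadata, **session.metadata, **task.metadata}


# Hypothetical example values, for illustration only.
task = Task("resting", {"sampling_rate_hz": "500"})
session = Session("session01", {"subject_age": "30"}, [task])
study = Study("study01", {"recording_site": "site_a"}, [session])

paths = storage_paths(study)
header = merged_header(study, session, task)
```

Keeping the metadata at the level where it is defined, and merging it only when a header is needed, avoids repeating study-wide information in every task record while still letting each stored file carry a complete header.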