Management of scientific data

A major technical challenge in publishing biomolecular simulation data is the lack of suitable file formats for many data types. Only molecular configurations and sequences of such configurations (trajectories) are well supported by today’s software tools. Other important information, such as molecular system definitions (including force fields and their parameters), normal modes, or models used in trajectory analysis, is difficult to archive or exchange, and is therefore not published at all. We are developing a modular and extensible data model and file formats for all aspects of molecular simulation. Current projects in this field are the MOSAIC data model and the digital scientific notation Leibniz.

Publications:

  • Hinsen K.
    Computational science: shifting the focus from tools to models.
    F1000Research 2014;3:101.
    doi:10.12688/f1000research.3978.2

  • Hinsen K.
    MOSAIC: a data model and file formats for molecular simulations.
    J Chem Inf Model. 2014;54(1):131–7.
    doi:10.1021/ci400599y

  • Hinsen K.
    Caring for Your Data.
    Comput Sci Eng. 2012;14(6):70–4.
    doi:10.1109/MCSE.2012.108

Software: